• Posts
  • RSS
  • ◂◂RSS
  • Contact

  • Determining What People Want to See

    January 4th, 2012
    tech
    One of the major open problems of the modern internet is how to figure out what someone is interested in seeing. What news stories will I enjoy? What comments are worth reading? The general problem is incredibly broad: you want to know about email, web sites, articles, comments, tweets, status updates, website changes, site private messages, mailing lists, blogs, and others. You don't have time to read even a tiny fraction of everything, and different sources vary hugely in how much you want to read them, so you want some sort of filtering system.

    The traditional answer was to hire editors. Publication was expensive and only practical on a large scale, so the extra cost of people to decide what should be published was unavoidable and relatively small. With the internet the constraints have changed: publication is much cheaper, the overhead for personalization is far lower, and you can get much better feedback from readers.

    One new solution that couldn't have surfaced without the internet is voting. Sites that use this explicitly include Reddit, Digg, and Hacker News. Users vote links and comments up or down, votes 'decay' with time, and that controls what everyone else sees. This gets you a hybrid of popularity and recency. It's vulnerable to people creating fake accounts and vote rings, but the people running the sites seem to do a good job with automatic countermeasures.
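
    As a rough illustration of how votes that 'decay' with time blend popularity and recency, here is a minimal sketch. It is not the actual formula used by Reddit, Digg, or Hacker News (those aren't given here); the exponential half-life rule, the `Item` fields, and the sample numbers are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    upvotes: int
    downvotes: int
    age_hours: float


def decayed_score(item: Item, half_life_hours: float = 12.0) -> float:
    """Net votes, halved for every `half_life_hours` of age (assumed decay rule)."""
    net = item.upvotes - item.downvotes
    return net * 0.5 ** (item.age_hours / half_life_hours)


def rank(items, half_life_hours: float = 12.0):
    """Order items from highest to lowest decayed score."""
    return sorted(
        items,
        key=lambda it: decayed_score(it, half_life_hours),
        reverse=True,
    )


# Hypothetical sample data: a heavily upvoted old item, a modestly
# upvoted fresh item, and an old item almost nobody liked.
items = [
    Item("old hit", upvotes=200, downvotes=10, age_hours=48),
    Item("fresh post", upvotes=30, downvotes=2, age_hours=2),
    Item("stale dud", upvotes=5, downvotes=4, age_hours=30),
]
front_page = [it.title for it in rank(items)]
```

    With these made-up numbers the fresh post outranks the older one even though it has far fewer votes, which is the hybrid of popularity and recency described above.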

    Voting is also vulnerable to "bad judgment": people voting up things that I'm not interested in. Reddit dealt with this somewhat by creating subreddits: sections of Reddit devoted to specific topics, where you can control which ones show up for you. So I see posts on Boston, giving, music, and don't see posts from pics, funny, politics, etc. This works reasonably well, but I've found myself going to Reddit less and less over time as I've found fewer things that interest me.

    The Hacker News approach to "bad judgment" is more explicit: there is one site, some content guidelines, and users have a strong culture of flagging things that they see as bringing down the quality of the site. Either this works better or it just aligns more with my interests; in any case, I read it a lot more than Reddit nowadays.

    Another problem is that people mostly don't vote, and when they do vote it's usually on things that have already been recognized as good. While sites do have sections for new posts, most readers don't see things until they get to the front page, which means the decision about what gets to the front page is made by the small number of people watching the 'new queue'.

    Social networks can take a different approach because they have more information: they know some people whose interests you're likely to share. Facebook shows me posts by people I've friended (taking into account how much we interact), and then it does something similar to the voting sites by treating 'likes' and other interactions as upvotes. This is how you get the newish default "highlighted stories first" algorithm.

    If you switch to Facebook's recency option, it still does some filtering; I'm pretty sure the new upper-right pane is the full, unfiltered, real-time feed for all your Facebook friends. The full feed isn't very interesting, so you can see why they're putting work into filtering.
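
    The general idea of weighting a friend's post by how much we interact, and then counting likes and comments as upvotes, could be sketched like this. It is a guess at the shape of such a ranking, not Facebook's algorithm, which isn't public; the `affinity` weights, the scoring formula, and the sample posts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Post:
    author: str
    text: str
    likes: int
    comments: int


def highlight_score(post: Post, affinity: dict) -> float:
    """Engagement scaled by how close the reader is to the author.

    `affinity` maps a friend's name to an assumed 0..1 weight for how
    much the reader interacts with them; unknown authors score zero.
    """
    closeness = affinity.get(post.author, 0.0)
    # Assumption: a comment counts as twice the signal of a like.
    engagement = 1 + post.likes + 2 * post.comments
    return closeness * engagement


def highlighted_first(posts, affinity):
    """Order posts from highest to lowest highlight score."""
    return sorted(posts, key=lambda p: highlight_score(p, affinity), reverse=True)


# Hypothetical friends and posts.
affinity = {"alice": 0.9, "bob": 0.2, "carol": 0.5}
posts = [
    Post("bob", "shared a link", likes=10, comments=2),
    Post("carol", "lunch photo", likes=0, comments=0),
    Post("alice", "we moved!", likes=3, comments=1),
]
feed_authors = [p.author for p in highlighted_first(posts, affinity)]
```

    Here a close friend's lightly liked post ranks above a distant friend's more popular one, which is the difference between this and a plain voting site.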

    A very different approach is the RSS reader and mailing list approach: you pick some set of sources for things you're interested in, and then it shows you every new item ordered by recency. I like this a lot, because there are a lot of people (mostly friends, some not) where I want to read everything they have to say. I do need to be careful to add and remove sources based on how interested I am in their content, though, which is more work than other options require.
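
    The subscribe-and-sort model is simple enough to sketch in a few lines. This assumes the feeds have already been fetched and parsed into entries; the `Entry` type, the source names, and the sample items are hypothetical, and a real reader would also track which items have been read.

```python
from datetime import datetime
from typing import NamedTuple


class Entry(NamedTuple):
    source: str
    title: str
    published: datetime


def reading_list(feeds: dict, subscriptions: list) -> list:
    """Every entry from the subscribed sources, newest first, with no filtering."""
    entries = [e for name in subscriptions for e in feeds.get(name, [])]
    return sorted(entries, key=lambda e: e.published, reverse=True)


# Hypothetical, already-parsed feeds keyed by source name.
feeds = {
    "friend-blog": [
        Entry("friend-blog", "Trip report", datetime(2012, 1, 3, 9)),
        Entry("friend-blog", "New year", datetime(2012, 1, 1, 8)),
    ],
    "mailing-list": [
        Entry("mailing-list", "Meeting notes", datetime(2012, 1, 2, 18)),
    ],
    "news-site": [
        Entry("news-site", "Headline", datetime(2012, 1, 4, 7)),
    ],
}

# The only "algorithm" is the reader's own choice of sources.
subscriptions = ["friend-blog", "mailing-list"]
titles = [e.title for e in reading_list(feeds, subscriptions)]
```

    All of the curation lives in the `subscriptions` list, which is why this approach takes more upkeep from the reader than the voting or social ones.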

    The main thing I find frustrating in trying to understand the current best solutions to this problem is that everyone running a system has a strong incentive to keep its workings secret so people have a harder time gaming or copying it.



