::  Posts  ::  RSS  ::  ◂◂RSS  ::  Contact

Determining What People Want to See

January 4th, 2012
tech  [html]

One of the major open problems of the modern internet is how to figure out what someone is interested in seeing. What news stories will I enjoy? What comments are worth reading? The general problem is incredibly broad: you want to know about email, web sites, articles, comments, tweets, status updates, website changes, site private messages, mailing lists, blogs, and others. You don't have time to read even a tiny fraction of everything, and different sources vary hugely in how much you want to read them, so you want some sort of filtering system.

The traditional answer was to hire editors. Publication was expensive and only practical on a large scale, so the extra cost of people to decide what should be published was unavoidable and relatively small. With the internet the constraints have changed: publication is much cheaper, the overhead for personalization is far lower, and you can get much better feedback from readers.

One new solution that couldn't have surfaced without the internet is voting. Sites that use this explicitly include Reddit, Digg, and Hacker News. Users vote links and comments up or down, votes 'decay' with time, and that controls what everyone else sees. This gets you a hybrid of popularity and recency. It's vulnerable to people creating fake accounts and vote rings, but the people running the sites seem to do a good job with automatic countermeasures.
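To make the "votes decay with time" idea concrete, here is a minimal sketch. It is not how Reddit, Digg, or Hacker News actually rank things; the exponential decay, the 12-hour half-life, and the names `Vote` and `decayed_score` are all assumptions for illustration.

```python
from dataclasses import dataclass

# Assumption: a vote loses half its weight every 12 hours.
HALF_LIFE_HOURS = 12.0


@dataclass
class Vote:
    value: int        # +1 for an upvote, -1 for a downvote
    age_hours: float  # how long ago the vote was cast


def decayed_score(votes, half_life=HALF_LIFE_HOURS):
    """Sum the votes, weighting each one by 0.5 ** (age / half_life)."""
    return sum(v.value * 0.5 ** (v.age_hours / half_life) for v in votes)


# Ten upvotes from two days ago versus three upvotes from an hour ago.
old_favorite = [Vote(+1, 48.0)] * 10
fresh_post = [Vote(+1, 1.0)] * 3

ranked = sorted(
    [("old_favorite", old_favorite), ("fresh_post", fresh_post)],
    key=lambda pair: decayed_score(pair[1]),
    reverse=True,
)
```

With these made-up numbers the fresh post ranks first even though it has fewer votes, which is the mix of popularity and recency described above.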

Voting is also vulnerable to "bad judgment": people voting up things that I'm not interested in. Reddit dealt with this somewhat by creating subreddits: sections of Reddit devoted to specific topics, where you can control which ones show up for you. So I see posts on Boston, giving, music, and don't see posts from pics, funny, politics, etc. This works reasonably well, but I've found myself going to Reddit less and less over time as I've found fewer things that interest me.

The Hacker News approach to "bad judgment" is more explicit: there is one site, some content guidelines, and users have a strong culture of flagging things that they see as bringing down the quality of the site. Either this works better or it just aligns more with my interests; in any case, I read it a lot more than Reddit nowadays.

Another problem is that people mostly don't vote, and when they do vote it's usually on things that have already been recognized as good. While sites do have sections for new posts, most readers don't see things until they get to the front page, which means the decision about what gets to the front page is made by the small number of people watching the 'new queue'.

Social networks can take a different approach because they have more information: they know some people whose interests are likely to be similar to yours. Facebook shows me posts by people I've friended (taking into account how much we interact), and then it does something similar to the voting sites by treating likes and other interaction as upvotes. This is how you get the newish default "highlighted stories first" algorithm.

If you switch to Facebook's recency option, it still does some filtering; I'm pretty sure the new upper-right pane is the full, unfiltered, real-time feed for all your Facebook friends. The full feed isn't so interesting, so you can see why they're putting work into filtering.
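As a rough illustration of ranking friends' posts by how much you interact with the author and how much engagement the post has drawn, here is a small sketch. Facebook's real algorithm isn't public; the `affinity` weights, the engagement formula, and the names `Post`, `highlight_score`, and `highlighted_first` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Post:
    author: str
    likes: int
    comments: int


def highlight_score(post, affinity):
    """Score a post as (how close I am to the author) * (engagement).

    `affinity` maps a friend's name to a weight in [0, 1]; both the
    weights and the engagement formula are assumptions for illustration.
    """
    engagement = 1 + post.likes + 2 * post.comments
    return affinity.get(post.author, 0.0) * engagement


def highlighted_first(posts, affinity):
    """Return posts ordered from highest to lowest score."""
    return sorted(posts, key=lambda p: highlight_score(p, affinity), reverse=True)


affinity = {"alice": 0.9, "bob": 0.2}
posts = [
    Post("bob", likes=10, comments=0),
    Post("alice", likes=3, comments=1),
]
ranked_posts = highlighted_first(posts, affinity)
```

Here the post from the closer friend comes first despite having fewer likes, and posts from people not in `affinity` score zero.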

A very different approach is the RSS reader and mailing list approach: you pick some set of sources for things you're interested in, and then it shows you every new item ordered by recency. I like this a lot, because there are a lot of people (mostly friends, some not) where I want to read everything they have to say. I do need to be careful to add/remove things based on how interested I am in their content, though, which is more work than other options require.
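The subscription model is simple enough to sketch in a few lines: keep every item from the sources you picked, drop everything else, and sort newest first. The tuple layout and the `reading_list` name are assumptions for illustration, not any particular reader's API.

```python
from datetime import datetime


def reading_list(items, subscriptions):
    """Keep items from subscribed sources, newest first.

    Each item is a (source, published, title) tuple; this layout is
    an assumption made for the example.
    """
    kept = [item for item in items if item[0] in subscriptions]
    return sorted(kept, key=lambda item: item[1], reverse=True)


items = [
    ("friend-blog", datetime(2012, 1, 2), "Older post"),
    ("news-site", datetime(2012, 1, 4), "Breaking story"),
    ("friend-blog", datetime(2012, 1, 3), "Newer post"),
]
feed = reading_list(items, subscriptions={"friend-blog"})
```

All the filtering lives in the `subscriptions` set, which is why keeping that set up to date is the manual part.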

The main thing I find frustrating in trying to understand the current best solutions to this problem is that everyone running a system has a strong incentive to keep its workings secret so people have a harder time gaming or copying it.


