
Design Testing

November 19th, 2011
work, experiment, tech
One of the things I really like about working on websites is that we can run real experiments. If we have a change we're considering making, we can have half our users see the new version while the other half see the old version, and we can see which one performs better. [1] These are randomized, controlled, double blind trials, with no publication bias issues, and a successful result means a better version of our site that we can start using immediately. After years in school where running a proper experiment meant weeks of careful experiment design, laborious data collection, compromises in experimental procedure for the sake of practicality, insufficient sample sizes, poor generalization, and unclear usefulness, this is really satisfying.
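The split described above can be sketched in a few lines. This is a minimal illustration, not our actual code: the function name and experiment label are made up, but the core idea of hashing each user id so every user deterministically and consistently lands in one bucket is the standard approach.

```python
import hashlib

def assign_variant(user_id: str, experiment: str = "email-redesign") -> str:
    """Deterministically assign a user to the 'old' or 'new' version.

    Hashing the experiment name together with the user id means each
    user always sees the same version across visits, and different
    experiments split users independently. (Illustrative sketch only.)
    """
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    return "new" if int(digest, 16) % 2 == 0 else "old"
```

Because the assignment is a pure function of the id, no per-user state needs to be stored to keep the experience consistent.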

One humbling aspect is that I've realized I'm not very good at predicting whether a change will help. None of us are. When we test new designs, sometimes they work well and other times they don't. [2] For an example of this, consider two redesigns from the early days of our daily deals website. The first is an email design, the second is a site design:

[Image: old email design]
[Image: new email design]
[Image: old site design]
[Image: new site design]
One of these was a 14% improvement, the other a 27% degradation. Can you tell which was which?


[1] Not all websites have an obvious metric for "performs better". For example, how does Wikipedia know whether a change improves things for its users? (More edits? Better edits? More time reading? Less time?) We're generally trying to sell things, however, so we can mostly just look at the fraction of users who advance to the next step in the sales process.
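Comparing that fraction between the two halves is a standard two-proportion comparison. A hypothetical sketch (the function names are mine, and this is the textbook z-test rather than whatever tooling we actually used) of how you'd check that an observed difference isn't just noise:

```python
from math import sqrt, erf

def conversion_rate(advanced: int, shown: int) -> float:
    """Fraction of users who advanced to the next step of the sale."""
    return advanced / shown

def z_test(adv_a: int, n_a: int, adv_b: int, n_b: int):
    """Two-proportion z-test: returns (z, two-sided p-value).

    adv_* = users who advanced, n_* = users shown each version.
    """
    p_a, p_b = adv_a / n_a, adv_b / n_b
    pooled = (adv_a + adv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF via erf gives the two-sided p-value.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value
```

With, say, 100/1000 conversions on the old version and 140/1000 on the new one, the test reports a significant improvement; with small samples the same percentage gap would not be distinguishable from chance.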

[2] This really shows the value of testing: if we simply made every change we thought was good, we wouldn't improve anywhere near as much as we do by adopting only the changes that measurably help.


