Design Testing

November 19th, 2011
experiment, tech, work
One of the things I really like about working on websites is that we can run real experiments. If we're considering a change, we can show half our users the new version while the other half see the old one, and measure which performs better. [1] These are randomized, controlled, double-blind trials with no publication bias issues, and a successful result means a better version of our site that we can start using immediately. After years in school where running a proper experiment meant weeks of careful experiment design, laborious data collection, compromises in experimental procedure for the sake of practicality, insufficient sample sizes, poor generalization, and unclear usefulness, this is really satisfying.
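A minimal sketch of what such a test could look like, assuming users have stable string IDs and that "performs better" means a higher fraction of users advancing to the next step. The function names, the hash-based split, and the two-proportion z-score are illustrative assumptions, not a description of any particular site's setup.

```python
import hashlib
import math


def assign_variant(user_id: str, experiment: str) -> str:
    """Deterministically place a user in 'old' or 'new', roughly 50/50.

    Hashing the experiment name together with the user ID (an assumed
    scheme) keeps a user in the same group on every visit.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    return "new" if digest[0] % 2 == 0 else "old"


def relative_change(rate_old: float, rate_new: float) -> float:
    """Relative improvement of the new rate over the old one (0.14 means +14%)."""
    return (rate_new - rate_old) / rate_old


def two_proportion_z(conv_old: int, n_old: int, conv_new: int, n_new: int) -> float:
    """z-score for the difference between two conversion rates.

    Uses the pooled standard error; |z| above about 2 is a common,
    if rough, threshold for treating the difference as real.
    """
    rate_old = conv_old / n_old
    rate_new = conv_new / n_new
    pooled = (conv_old + conv_new) / (n_old + n_new)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_old + 1 / n_new))
    return (rate_new - rate_old) / std_err


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    print(assign_variant("user-123", "email-redesign"))
    print(f"{relative_change(100 / 1000, 140 / 1000):+.0%}")
    print(round(two_proportion_z(100, 1000, 140, 1000), 2))
```

Assigning by hash rather than by a fresh coin flip on each page load means nobody sees the design switch back and forth between visits.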

One humbling aspect is that I've realized I'm not very good at predicting whether a change will help. None of us are. When we test new designs, sometimes they work well and other times they don't. [2] For an example of this, consider two redesigns from the early days of our daily deals website. The first is an email design, the second is a site design:

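Email design, old: [image not preserved]
Email design, new: [image not preserved]
Site design, old: [image not preserved]
Site design, new: [image not preserved]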
One of these was a 14% improvement, the other a 27% degradation. Can you tell which was which?


[1] Not all websites have an obvious metric for "performs better". For example, how does Wikipedia know if a site change improves things for its users? (More edits? Better edits? More time reading? Less time?) We're generally trying to sell things, however, so we can mostly just look at the fraction of users who advance to the next step in the sales process.

[2] This really shows the value of testing: if we just made every change we thought was good, we wouldn't improve anywhere near as much as we do by adopting only the changes that actually help.
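To put rough numbers on this, here is a small sketch using the two results from the post. It assumes, purely for illustration, that both changes act on the same overall metric and that their effects compound multiplicatively; neither assumption comes from the experiments themselves.

```python
from math import prod


def cumulative_effect(relative_changes):
    """Combined multiplier from a series of relative changes.

    Assumes effects compound multiplicatively (an illustrative
    simplification): +14% then -27% gives 1.14 * 0.73.
    """
    return prod(1 + change for change in relative_changes)


# The two measured results from the post: one +14%, one -27%.
changes = [0.14, -0.27]

ship_everything = cumulative_effect(changes)
ship_only_winners = cumulative_effect(c for c in changes if c > 0)

print(f"ship everything:   {ship_everything - 1:+.1%}")    # -16.8%
print(f"ship only winners: {ship_only_winners - 1:+.1%}")  # +14.0%
```

Under these assumptions, shipping both redesigns untested would leave the metric about 17% worse than where it started, while testing first and keeping only the winner leaves it 14% better.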
