Jeff Kaufman  ::  Blog Posts  ::  RSS Feed  ::  Contact

Hit frequencies

When load testing server-side software there are two straightforward options:

  1. Stress all endpoints evenly
  2. Stress one endpoint in isolation

For example, if you're testing a dynamic website with siege you could (1) give it a list of all the urls on your site and have it rotate among them, or you could (2) pick a representative url and just hit that one repeatedly. Like any synthetic test this isn't completely realistic, but in this case you're missing something pretty important: if you want to test something that uses caching, you need to approximate the distribution of a real workload.

Why? How well caching works depends on your hit rate, which depends on what the distribution of your requests looks like. Situation (1) is a caching worst case, (2) is a caching best case, and your real situation is somewhere in the middle. For example, from my access logs over the past couple years, here's the distribution of requests I've seen:

1275358 /news.rss
562369 /
437312 /favicon.ico
406583 /index
168296 /robots.txt
162728 /p/
116353 /simple_piano_recordings/piano_chords.svg
94048 /icdiff
92680 /p/mercury-spill
63728 /wsgi/json-comments-cached/gp/i6WrmjSHXLg
[snip ~1k entries]
1179 /news/2011-11-02
1178 /wsgi/json-comments/gp/YTnbXQoRPRN
1178 /fiddle-clip-on/pictures/11-top.jpg
[snip ~10k entries]
113 /news/back_from_564.rss
113 /news/2012-07-24
113 /news/2012-07-09.html
[snip ~100k entries]
3 /nextbus/omnitrans/82/7387/
3 /nextbus/art/1135/759763/
3 /nextbus/mbta/39/6460/
[snip ~1M entries]
1 /ngx_pagespeed_beacon?ets=load:906&rload=1605...
1 /nextbus/jtafla/17/1433/next/
1 /news/all/trillion-dollar-platinum-coin

This is kind of a mess, but the main idea is that we have a few "hot" endpoints, and then a long tail of less popular entries. I can use this distribution in load testing, to get something between uniform sampling (option 1) and singleton sampling (option 2) that better represents what real load on the site would look like.
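
To make the worst-case / best-case claim concrete, here is a rough simulation sketch. It assumes an LRU cache and a made-up Zipf-like popularity curve (weight 1/rank); neither comes from my real logs or server, and the names lru_hit_rate and make_streams are just for illustration. It compares the hit rate for rotating through every url, hitting one url, and sampling from a skewed distribution.

```python
import random
from collections import OrderedDict


def lru_hit_rate(requests, capacity):
    """Replay a request stream through an LRU cache and return the hit rate."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for key in requests:
        total += 1
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits / total if total else 0.0


def make_streams(n_urls, n_requests, seed=0):
    """Build three synthetic request streams over integer url ids."""
    rng = random.Random(seed)
    urls = list(range(n_urls))
    # Assumption: Zipf-like popularity, not a measured distribution.
    weights = [1.0 / (rank + 1) for rank in range(n_urls)]
    return {
        "rotate all urls": [urls[i % n_urls] for i in range(n_requests)],
        "single url": [urls[0]] * n_requests,
        "skewed sample": rng.choices(urls, weights=weights, k=n_requests),
    }


if __name__ == "__main__":
    streams = make_streams(n_urls=1000, n_requests=20000)
    for name, requests in streams.items():
        print(name, round(lru_hit_rate(requests, capacity=100), 3))
```

With a cache smaller than the url set, rotating through everything never hits, the single url almost always hits, and the skewed stream lands in between, which is the range a real workload falls in.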

In case this is useful to other people, here's the frequency list (hits-frequency.txt.gz) and a short script to sample from it (generate_urls.py).
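
For a sense of what sampling from a list like this involves, here is a minimal sketch. It is not the contents of generate_urls.py; it assumes each line is a hit count followed by a path (with or without a space between them), skips anything else such as the "[snip ...]" markers, and uses a placeholder base address. The function names and the base url are assumptions for illustration.

```python
import random
import re

# A hit count, optional whitespace, then a path starting with "/".
LINE_RE = re.compile(r"^\s*(\d+)\s*(/.*)$")


def parse_frequencies(lines):
    """Return (paths, counts) from lines shaped like '<count> <path>'."""
    paths, counts = [], []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # skip blank lines and '[snip ...]' markers
        counts.append(int(match.group(1)))
        paths.append(match.group(2))
    return paths, counts


def sample_urls(paths, counts, n, base="http://localhost:8080", seed=None):
    """Draw n urls, each path chosen in proportion to its hit count."""
    rng = random.Random(seed)
    chosen = rng.choices(paths, weights=counts, k=n)
    return [base + path for path in chosen]


if __name__ == "__main__":
    demo_lines = [
        "1275358 /news.rss",
        "562369 /",
        "[snip ~1k entries]",
        "3 /nextbus/mbta/39/6460/",
    ]
    paths, counts = parse_frequencies(demo_lines)
    for url in sample_urls(paths, counts, n=5, seed=0):
        print(url)
```

The sampled urls could be written one per line to a file and handed to whatever load testing tool you're using, so the tool replays something closer to the real mix of hot and cold pages.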

full post...

Work Desktop Setup

I spend most of my day standing in front of my work desktop, doing some combination of programming, code review, documentation, and email. Over the years I've figured out a relatively productive setup:

  • browser on one monitor
  • tile the rest of the space with tall terminals
  • put some extra terminals for starting jobs etc off on the side

This looks like: more...

Making things mobile friendly II

When I got a phone and started caring about how things looked on mobile I only fixed some of the pages on my site: home page, blog post template, etc. I have various pages scattered around, however, that I never got around to fixing. Most of why I had been putting this off is that it seemed like a lot of finicky work, like updating the EA forum display was. Except all of these pages are just simple html with no css: more...

Making things mobile friendly

The EA Forum is based on the Reddit codebase, forked long before Reddit added a good mobile version. While it would still be good to have a proper mobile version, here's an idea for how to get it mostly mobile friendly with a relatively small amount of work:

  1. The page was designed for desktop browsers that can do pages at least 1024px wide, so set a viewport tag with width=1024.
  2. The page was designed to make generous use of horizontal space (sidebar, indented posts), but on mobile that's less available, so hide the sidebar and stop indenting posts.
  3. Bump up the font sizes of various things.

It looks like this: more...

Caller Locations

Luke Donforth is collecting responses for a caller directory. You can submit your info here if you'd like to be included. I was interested in looking at the data to see patterns, but so far the only aspect with enough responses to be interesting was zip code:

Here's an interactive version: map.

full post...

Festival Stats 2016

Another year, another set of gig stats: more...


