• Posts
  • RSS
  • ◂◂RSS
  • Contact

  • Looking at RSS User-Agents

    February 4th, 2021
    meta, rs, tech  [html]
    An RSS reader periodically requests a feed to check for new posts. Each request includes a User-Agent header identifying the fetcher that sent it:
    Feedbin feed-id:1242010 - 38 subscribers
    
    This fetcher is helpfully passing along statistics, saying how many subscribers it is fetching for.
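
    As a rough sketch of how that number could be pulled out of a User-Agent string: the helper name subscriber_count is made up for illustration, and it assumes the count is written as a number directly followed by "subscriber(s)" or "reader(s)", as in the Feedbin example above. Other fetchers may format it differently.

```shell
# Hypothetical helper (not from the post): print the number that directly
# precedes "subscriber(s)" or "reader(s)" in a User-Agent string.
# Prints nothing when the string carries no such count.
subscriber_count() {
  printf '%s\n' "$1" | awk '
    match($0, /[0-9]+ (subscriber|reader)/) {
      # The matched text starts with the digits, so adding 0 keeps only them.
      print substr($0, RSTART, RLENGTH) + 0
    }'
}

subscriber_count 'Feedbin feed-id:1242010 - 38 subscribers'   # prints 38
```

    The feed-id number is skipped because it is not directly followed by one of the two keywords.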

    I took one day of logs, with 5,962 requests for my RSS feed:

    $ sudo grep '"GET /news.rss ' \
        /var/log/nginx/access.log.1 \
      | awk -F'"' '{print $6}' \
      | wc -l
    5962
    
    There were 162 unique User-Agents:
    $ sudo grep '"GET /news.rss ' \
        /var/log/nginx/access.log.1 \
      | awk -F'"' '{print $6}' \
      | sort \
      | uniq \
      | wc -l
    162
    
    Of the 5,962 requests, 932 (16%) included subscriber counts:
    $ sudo grep '"GET /news.rss ' \
        /var/log/nginx/access.log.1 \
      | awk -F'"' '{print $6}' \
      | grep 'subscriber\|reader' \
      | wc -l
    932
    
    Those requests came from 21 distinct User-Agents:
    $ sudo grep '"GET /news.rss ' \
        /var/log/nginx/access.log.1 \
      | awk -F'"' '{print $6}' \
      | grep 'subscriber\|reader' \
      | sort \
      | uniq \
      | wc -l
    21
    
    Some services sent requests under several User-Agents, each with a different feed ID and subscriber count:
    Feedbin feed-id:1242010 - 38 subscribers
    Feedbin feed-id:372940 - 11 subscribers
    Feedbin feed-id:382 - 1 subscribers
    
    I suspect this comes from people subscribing through old URLs that now redirect to my current one. The feed is now at https://www.jefftk.com/news.rss, but it used to be at http://www.jefftk.com/news.rss, and even longer ago it was at an sccs.swarthmore.edu address. Summing subscriber counts by service, I see:
    • Feedly: 573
    • inoreader.com: 87
    • NewsBlur: 62
    • Feedbin: 50
    • theoldreader.com: 34
    • Dreamwidth Studios: 7
    • BazQux: 5
    • Bloglovin: 2
    • Feed Wrangler: 2
    • pine.blog: 1
    This only tells us about people subscribed to my blog, and only through services that report counts, but Feedly is the biggest player here by a lot: 573 of the 823 subscribers listed above, or about 70%.
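
    The per-service totals can be sketched roughly as follows. This is not the command used for the post. The function name sum_subscribers is made up, and it guesses the service name as whatever comes before the first space or slash in the User-Agent, which would file "Dreamwidth Studios" under "Dreamwidth". Identical User-Agents are counted once, so a feed whose count changed during the day would be counted twice.

```shell
# Hypothetical helper: read User-Agent strings on stdin, one per line, and
# print "<total><TAB><service>" sorted with the largest total first.
# Assumption: the service name is the text before the first space or slash.
sum_subscribers() {
  sort -u | awk '
    match($0, /[0-9]+ (subscriber|reader)/) {
      count = substr($0, RSTART, RLENGTH) + 0   # leading digits of the match
      split($0, parts, "[ /]")
      total[parts[1]] += count
    }
    END { for (service in total) printf "%d\t%s\n", total[service], service }
  ' | sort -rn
}

# The three Feedbin User-Agents quoted in the post add up to 50:
printf '%s\n' \
  'Feedbin feed-id:1242010 - 38 subscribers' \
  'Feedbin feed-id:372940 - 11 subscribers' \
  'Feedbin feed-id:382 - 1 subscribers' \
  | sum_subscribers
```

    On a real log, the input would be the output of the grep 'subscriber\|reader' pipeline shown earlier, piped into sum_subscribers.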

    Different services fetched at different intervals. Taking the shortest gap between consecutive requests for each distinct User-Agent:

    • Feedly: 7min
    • Feedbin: 15min
    • Bloglovin: 30min
    • Dreamwidth Studios: 30min
    • Feed Wrangler: 30min
    • NewsBlur: 30min
    • BazQux: 40min
    • inoreader.com: 1hr
    • theoldreader.com: 2hr
    • pine.blog: 24hr
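
    A rough sketch of how shortest intervals like these could be computed from the log. It is an illustration, not the method used for the post. It assumes nginx's default "combined" log format (the same layout the awk -F'"' '{print $6}' commands above rely on), lines in chronological order, and the same timezone offset on every line. The names min_intervals and to_seconds and the sample User-Agents are made up.

```shell
# Hypothetical helper: read nginx "combined" log lines on stdin and print
# "<shortest gap in seconds><TAB><User-Agent>" for every User-Agent that
# requested /news.rss at least twice.
min_intervals() {
  awk -F'"' '
    # Convert a time_local stamp such as 04/Feb/2021:00:07:00 +0000 into
    # seconds on a continuous scale. The offset is ignored, which is fine
    # for differences as long as every line uses the same offset.
    function to_seconds(ts,    p, y, m, days) {
      split(ts, p, "[/:]")   # day, month name, year, hour, minute, second
      m = (index("JanFebMarAprMayJunJulAugSepOctNovDec", p[2]) + 2) / 3
      y = p[3] + 0
      if (m <= 2) { y--; m += 12 }   # count months from March
      days = 365 * y + int(y / 4) - int(y / 100) + int(y / 400)
      days += int((153 * (m - 3) + 2) / 5) + p[1]
      return days * 86400 + p[4] * 3600 + p[5] * 60 + p[6]
    }
    index($2, "GET /news.rss ") == 1 {
      ua = $6
      t = to_seconds(substr($1, index($1, "[") + 1))
      if (ua in last) {
        gap = t - last[ua]
        if (!(ua in best) || gap < best[ua]) { best[ua] = gap }
      }
      last[ua] = t
    }
    END { for (ua in best) printf "%d\t%s\n", best[ua], ua }
  ' | sort -n
}

# Made-up sample: three fetches at 22:00, 22:30 and 22:37 give 420 seconds.
printf '%s\n' \
  '192.0.2.1 - - [28/Feb/2021:22:00:00 +0000] "GET /news.rss HTTP/1.1" 200 512 "-" "ExampleReader/1.0"' \
  '192.0.2.1 - - [28/Feb/2021:22:30:00 +0000] "GET /news.rss HTTP/1.1" 200 512 "-" "ExampleReader/1.0"' \
  '192.0.2.1 - - [28/Feb/2021:22:37:00 +0000] "GET /news.rss HTTP/1.1" 200 512 "-" "ExampleReader/1.0"' \
  | min_intervals
```

    On the real log this would be run as sudo cat /var/log/nginx/access.log.1 | min_intervals. The output is per User-Agent, so a service that fetches under several feed IDs shows up once per feed ID.
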
    Looking through the requests that don't list subscribers, several do seem to be services. I'll try reaching out to them to see if they're interested in adding subscriber counts to their User-Agents.

    Comment via: facebook, lesswrong


