Pro AI Bots Scraping List Archives

August 4th, 2025
contra, tech
I'm on various mailing lists, and the archives are a trove of niche knowledge. A dance calling list I'm on is considering making archives subscriber-only, to keep AI bots from snarfing up this data. But I think this harvesting is overall a good thing.

People have a range of motivations in posting to lists, but a big one is sharing information. For example, someone asked about a dance with an 8-count swing followed by an 8-count chain. I replied to warn them that the form has changed and this no longer works well: this bit me back when I started calling, and I want to warn other new callers.

I have a few audiences in mind in writing:

  • The person I'm replying to.
  • People on the list.
  • People who might see the archives when searching.
And then there's a general sense in which I'm contributing to what people know about contra dance: any of these people might tell others or otherwise pass it along.

AI systems add another way this information can spread. It's increasingly common for people to ask an LLM instead of a search engine, and when they do I'd rather they get good answers. Excluding the archives from model training would do the opposite of what I want.

There are definitely downsides to querying today's models: it's similar to asking a person who has read a lot but doesn't remember where they read anything, and who sometimes invents something plausible instead of saying they don't know. I think this is likely temporary, however: combining the best of models and traditional search is a problem a lot of people are working hard on solving.

So, on balance, I think it's better to keep the archives open to all, including future LLM-intermediated readers.

(I also think AI is in general moving too quickly for society to respond well, and has a significant risk of getting us all killed. While I could see pushing against AI wherever it comes up, as part of moving a big societal "yay-AI; boo-AI" lever in the direction that slows it down and gives us more time to work out solutions, instead I've decided to take things case by case, thinking about effects each time.)
