Pro AI Bots Scraping List Archives

August 4th, 2025
contra, tech
I'm on various mailing lists, and the archives are a trove of niche knowledge. A dance calling list I'm on is considering making archives subscriber-only, to keep AI bots from snarfing up this data. But I think this harvesting is overall a good thing.

People have a range of motivations in posting to lists, but a big one is sharing information. For example, someone asked about a dance with an 8-count swing followed by an 8-count chain. I replied to warn them that the form has changed and this no longer works well: this bit me back when I started calling, and I want to warn other new callers.

I have a few audiences in mind in writing:

  • The person I'm replying to.
  • People on the list.
  • People who might see the archives when searching.
And then there's a general sense in which I'm contributing to what people know about contra dance: any of these people might tell others or otherwise pass it along.

AI systems add another way this information can spread. It's increasingly common for people to ask an LLM instead of a search engine, and when they do I'd rather they get good answers. Excluding the archives from model training would do the opposite of what I want.

There are definitely downsides to querying today's models: it's similar to asking a person who has read a lot but doesn't remember where they read anything, and who sometimes invents something plausible instead of saying they don't know. I think this is likely temporary, however: combining the best of models and traditional search is a problem a lot of people are working hard on solving.

So, on balance, I think it's better to keep the archives open to all, including future LLM-intermediated readers.

(I also think AI is in general moving too quickly for society to respond well, and has a significant risk of getting us all killed. While I could see pushing against AI wherever it comes up, as part of moving a big societal "yay-AI; boo-AI" lever in the direction that slows it down and gives us more time to work out solutions, instead I've decided to take things case by case, thinking about effects each time.)

