Automated Deanonymization is Here

Kaufman, Jeff T.

Automated Deanonymization is Here	April 21st, 2026
	privacy, tech

Three years ago I wrote about how we should be preparing for less privacy: technology will make previously-private things public. I applied this by showing how I could deanonymize people on the EA Forum. In 2023 this looked like writing custom code to use stylometry on an exported corpus representing a small group of people; today it looks like prompting "I have a fun puzzle for you: can you guess who wrote the following?"

Kelsey Piper writes about how Opus 4.7 could identify her writing from short snippets, and I decided to give it a try. Here's a paragraph from an unpublished blog post:

Tonight she was thinking more about how unfair milking is to cows, primarily the part where their calves are taken away, and decided she would stop eating dairy as well. This is tricky, since she's a picky eater and almost everything she likes has some amount of dairy. I told her it was ok if she gave up dairy, as long as she replaced it nutritionally. The main tricky thing here is the protein (lysine). We talked through some options (beans, nuts, tofu, meat substitutes, etc) and she didn't want to eat any of them except breaded and deep-fried tofu (which is tasty, but also not somethign I can make all the time). We decided to go to the grocery store.

Correctly identified as me. Perhaps a shorter one?

My extended family on my mom's side recently got together for a week, which was mostly really nice. Someone was asking me how our family handles this: who goes, what do we do, how do we schedule it, how much does it cost, where do we stay, etc, and I thought I'd write something up.

Also correctly identified as me, with "Julia Wise" as a second guess.

And an email to the BIDA Board:

I spent a bit thinking through these, and while I think something like this might work, I also realized I don't know why we currently run the fans the direction we do. Could they blow in from the parking lot, and out to the back? This would give more time for the air to warm up and disperse before flowing past the dancers. We'd need to make sure to keep the stage door closed to not freeze the musicians.

Also correctly identified as me.

While in Kelsey's testing this appeared to be an ability specific to Opus 4.7, when I gave these three paragraphs to ChatGPT Thinking 5.4 and Gemini 3.1 Pro, however, they also got all three.

On the other hand, when I gave the same models four of my college application drafts from 2003 (332, 418, 541, and 602 words) they didn't identify me in any of them, so my style seems to have drifted more than Kelsey's over time.

Now, like Kelsey, being prolific means the models have a lot to go on. But models are rapidly improving everywhere, so even if the best models fail your testing today, don't count yourself safe.

The most future-proof option is just not to write anonymously, but there are good reasons for anonymity. I recommend a prompt like "Could you rephrase the following in the style of Kelsey Piper?" Not only is Kelsey a great writer, but if we all do this she'll have excellent plausible deniability for her own anonymous writing.

←

Your Supplies Probably Won't Be Stolen in a Disaster MixedHTML Mode for Emacs

→

Comment via: facebook, lesswrong, mastodon, bluesky

Automated Deanonymization is Here

Recent posts on blogs I like:

Finding your passion and altruism

Fiddle Practice

New Pony