The Right to be Forgotten

June 20th, 2012
policy, publicy, tech
People forget. If you make a mistake, after a while no one remembers it. This forgetting has never been perfect, with written records and small-town gossip extending the collective memory, but computers extend that memory much further. If you said something stupid in the midst of a heated argument on Usenet in 1983, it's probably archived and can show up in searches.

When Facebook released Timeline, making older posts and images much more accessible, many people were upset. They had thought of all that old stuff as forgotten, but Facebook remembered it. One friend spent hours going through and systematically removing anything having to do with their old boyfriend so their new boyfriend wouldn't be jealous. Our culture hasn't adjusted to this level of persistence.

One solution, which the EU is proposing to make law, would be to give people an option to remove things they find embarrassing. In many cases this is unobjectionable: I write a comment, change my mind, and go back and delete it. But what if someone else has responded quoting me, in a comment or in their own top-level post? What if you take a picture of me and I want it taken down? Perhaps because I'm picking my nose on the subway, or I'm John Pike and want the picture of me pepper spraying protesters taken off Louise Macabitas' Facebook wall?

We do have some precedent for this kind of mandated forgetting. You can get some information about you removed from your criminal record, and credit bureaus are required to drop most data after seven years. So would the EU's proposal to expand people's rights over data about them be a problem?

I think it would be. The problem is a disconnect between what people are used to and what is a good fit for the technology. Criminal records and credit reports are very limited databases. They're highly structured, centrally managed, and used for very specific purposes. This lets the rules for what should be allowed or excluded be very clear. Most of the web isn't like that: it's a distributed mess of conflicting information used for a very wide range of purposes. The rules for exclusion would be very difficult to get right because matching people's expectations to the technology is difficult.

But what are people's expectations? To understand better, I looked through comments on news stories discussing the EU proposal and chose some representative comments by people in favor of creating a right to be forgotten. For each I want to answer two questions: is what they want technologically feasible, and would it be a good idea?

One has the right to not have their name show up in Google or other search engine queries. A person also has the right to control what and how any information shows up about them in any search engine query. (source)
This person would like to be able to not show up in a search engine, and perhaps block specific results. If everyone had unique names, this wouldn't be too difficult, but it's very difficult to identify just the results that correspond to a particular person with a given name without removing ones for other same-named people. For blocking specific results you would similarly have the problem of ensuring that people were only requesting removal of information about themselves and not about their same-named rivals.
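
To make the name-collision problem concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `Result` type, the `person_id` field (ground truth that a real search engine would not have), the names, and the example URLs are all assumptions made for illustration, not a description of how any search engine works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    url: str
    text: str
    person_id: str  # hypothetical ground truth; a real index would not know this


def remove_by_name(results, name):
    """Naive removal: drop every result whose text mentions the name."""
    return [r for r in results if name.lower() not in r.text.lower()]


# Two different people who happen to share a name, plus an unrelated page.
results = [
    Result("https://example.com/a", "Pat Smith in heated 1983 argument", "pat-1"),
    Result("https://example.com/b", "Pat Smith wins bakery award", "pat-2"),
    Result("https://example.com/c", "Local weather report", "none"),
]

# pat-1 asks to be removed; matching on the name alone also hides pat-2.
remaining = remove_by_name(results, "Pat Smith")
over_removed = [
    r for r in results if r not in remaining and r.person_id != "pat-1"
]
```

Under these assumptions, `over_removed` holds the bakery-award page: matching on the name alone cannot tell the two people apart, so honoring one person's request silently removes results about the other.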

If we did figure this out, however, I don't think it would be good for society. You already have a right to demand that sites remove libel. But for truthful information, I see a much greater potential for harm via over-removal than benefit from allowing people to hide past embarrassments.

Free speech does NOT mean once I say something I want to say it for all eternity. Free speech means you control your words, you say something, then change your mind completely and say something else. The internet captures your words and associates them with your name forever, taking all the freedom from you, the speaker. (source)
This is similar to the previous concern, except limited to things you wrote instead of also including other people's writing about you. From a technical perspective, if your writing was on an account you still control then it wouldn't be that hard to require delete functionality. If you abandoned the account, connecting it back to you would be harder, but there could be some process. If your words are incorporated into writing by other people (for example, this post), it's much harder. From a policy perspective, I don't see much harm from the first two cases, though I do think we need to continue to allow the third case in order to have debate.

When did we hold the vote to decide that everyone's personal information should be freely available to anyone able to use a search engine?

This applies to so-called "public records" as well - at least with the social media sites people voluntarily supplied their information, but services which scrape government public records make available all sorts of data about individuals without their consent or knowledge: address, employment, marital status, age, even pictures of houses. This kind of information used to require the employment of a private detective - now anyone with access to a web browser can get the same information with a button click.

... aggregation of that information in Internet-accessible search engines was never part of the original concept of public records. An opt-out needs to be adopted. (source)

This proposes keeping information available to anyone willing to spend time or money looking it up, while limiting easy access. This seems difficult to me, but you might be able to make a law saying that anyone who reads public records and offers access to them must charge a minimum of time or money per record, and that redistribution is illegal. It sounds like a bad idea to me: if this information is harmful to make known, shouldn't we keep it from the public entirely? What's some information that's harmful if it's easily accessed but not if it requires paying a service to get?

Europe has had a lot of experience of what can happen when there is no privacy and the government falls off a moral cliff. Its why privacy laws are stronger in the EU than the US. This is a fundamental difference in that the US has never been run by a dictatorship in the recognized sense of the word. The onerous Patriot Act gives government the absolute right to get whatever information it wants from service providers in the US, including the servers where your digital life is stored be it Google, Facebook or whatever, no matter if you are a US citizen or not; companies have to comply. Of course law abiding citizens have nothing to fear, just as the 1930's Jewish population of Germany, who were law abiding had nothing to fear... The right to be forgotten should be an inalienable human right when it comes to commercial services; a 5 to 7 year data storage rule could work. (source)
Setting a maximum of five or seven years for data storage would be technically easy. Any time you acquire some data you must save a timestamp, then you run a delete process every so often that flushes old data. Backups make this a little tricky, but you could segment the data in them by year or something. For most users on some sites this would be completely fine: most Facebook users probably wouldn't care if this applied to most data. On other sites, for example the question-and-answer site Stack Overflow, there could easily be questions answered a decade ago that remain useful to others and that the question-answerers were expecting to stay up indefinitely. Even for Facebook, some old photos might be really important to people while others might be embarrassing. And I would hate to have the comments on my posts automatically disappear, because some of them were really good [1].
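
The timestamp-and-flush idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Record` type, the function names, and the seven-year window (approximated as 7 × 365 days, ignoring leap days) are all made up here, and a real system would delete from a database and from backup storage rather than filter in-memory lists.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumed retention limit: seven years, approximated without leap days.
RETENTION = timedelta(days=7 * 365)


@dataclass
class Record:
    owner: str
    payload: str
    acquired_at: datetime  # timestamp saved when the data was acquired


def purge_expired(records, now, retention=RETENTION):
    """Periodic delete pass: keep only records inside the retention window."""
    cutoff = now - retention
    return [r for r in records if r.acquired_at >= cutoff]


def expired_backup_years(backup_years, now, retention=RETENTION):
    """For backups segmented by year, list the years whose segment can be
    dropped whole because even its newest possible record is past the cutoff."""
    cutoff = now - retention
    return [
        year
        for year in backup_years
        if datetime(year + 1, 1, 1, tzinfo=timezone.utc) <= cutoff
    ]
```

Segmenting backups by year means a whole segment can be discarded at once, at the cost of keeping some records up to a year past the limit until their segment fully expires. The sketch also makes the objection in the paragraph above visible: the purge looks only at age, so a still-useful decade-old answer and an embarrassing old photo are treated identically.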

As for what to do about data storage enabling dictatorships, I think the best we can do is to make sure that wide availability of data goes both ways. Lots of government transparency, clear laws allowing recording of police officers and other government officials, and no "right to be forgotten".

The strongest argument I see for a right to restrict data is where someone reveals a lot of data about themselves, intentionally or not, and then begins to worry about stalkers. It seems hard to say that the benefit to others of knowing their address outweighs the harm from the stalker getting the same information. If someone could figure out a practical way to implement this that restricted the information removal to just what would do the most harm in the hands of the stalker, could hide the information before the stalker could get it, and didn't have a huge chilling effect, I could support that. But I'm not optimistic.


[1] In fact, part of my motivation for pulling those two comments into a post was a fear that Facebook wouldn't keep old comments around forever. (Though if Allison or Ben wanted me to remove them I would.)
