June 1st, 2022
Reading back over the talk, with the benefit of seven years of hindsight, I'm not happy with my treatment of existential risk. Here's that section of the talk, with thoughts interspersed:
There's a bit of a continuum. At one end we have risks like an asteroid hitting the earth. Cataloging asteroids and comets that might hit the earth at some point is something that people are working on, and actually is reasonably well understood. Because we have a pretty good understanding, and governments have a lot of sensible people, risks like this are reasonably well funded. So this end of the continuum is probably not high impact.
I'm far more skeptical of the "governments have this covered" position than I was in 2015. Some of this is for theoretical reasons (e.g., preventing catastrophe benefits people beyond your country) and some of it is from observing governments more (e.g., pandemic response).
At the other end we have risks like the development of an artificial intelligence that destroys us through its indifference. Very few people are working on this, there's low funding, and we don't have much understanding of the problem. Neglectedness is a strong heuristic for finding causes where your contribution can go far, and this does seem relatively neglected. The main question for me, though, is how do you know if you're making progress?
First, a brief digression into feedback loops. [...] Back to AI risk.
The problem is we really really don't know how to make good feedback loops here. We can theorize that an AI needs certain properties not to just kill us all, and that in order to have those properties it would be useful to have certain theorems proved, and go work on those theorems. And maybe we have some success at this, and the mathematical community thinks highly of us instead of dismissing our work. But if our reasoning about what math would be useful is off there's no way for us to find out. Everything will still seem like it's going well.
This was primarily a criticism of MIRI's approach, and was about a year before "Concrete Problems in AI Safety" came out. That paper had an enormous impact on the field, and when I interviewed one of the co-authors a year later I really liked the emphasis on grounding work in empirical feedback loops.
With existential risk we have a continuum from well understood risks that don't need our marginal contribution to poorly understood risks where we don't have a way to find out if our contribution is reducing the risk. Maybe there's a sweet spot in between, where we can make progress and existing funding bodies are blind to the need? Future generations don't get to vote, so it wouldn't surprise me if governments systematically discount their interest. I'm not aware of any good candidates here, but if you'd like to find me after the talk I'd be interested in hearing about any.
I think this was mostly wrong. In the talk I divided work into what Owen Cotton-Barratt calls "Phase 1" and "Phase 2". First you have indirectly valuable work, such as exploring what things might be good to do, evaluating specific options, or building capacity, and then you have work that more concretely makes the world better, such as distributing bednets, detecting outbreaks, or preventing illicit synthesis of hazardous DNA.
While this Phase 1/2 division is still a good one, and I continue to find the arguments against over-investing in "Phase 1" convincing, I didn't apply it well. What I missed was that fixing our lack of knowledge about underinvested areas within existential risk was itself a strong candidate for "Phase 1" work, and that through that work we would likely be able to identify "Phase 2" options that were even more important, tractable, and neglected than our best options within global poverty.
Fortunately, others did invest in improving our knowledge here, and in 2022 we have a much better understanding of work that would be directly useful for reducing existential risk. For several examples see Concrete Biosecurity Projects (some of which could be big), and that's just within one area.
So does this mean I think people should, on the margin, switch from funding "Phase 2" work within global poverty to funding it within existential risk? Weirdly, not really! With the changing funding situation, as far as I can tell, the solid "Phase 2" longtermist projects aren't limited by available funding but by people and the speed at which they can scale. How to donate in this more funding-abundant environment is a complicated question for another post. But I do think it means that if you're looking to move from earning to give into something directly valuable then, depending on personal fit, work on existential risk would likely be higher impact than work on global poverty.