Superintelligence Risk Project Update

July 10th, 2017
airisk, ea
I've now been working on my project of assessing risk from superintelligence for a little over a week, though I was traveling for the end of last week. To keep me motivated, and let other people understand how I'm approaching this, here's what I've done so far:

I currently see three main views:

  • AGI is too far away for us to tell what it will be like, and don't think we can make progress now. The approach laid out in Concrete Problems (pdf) is good ML, but it's not additionally valuable from a superintelligence risk perspective. —I think this is most ML researchers (ex) and a high fraction of ML practitioners.

  • AGI may happen soon with systems similar to current ones, so we should improve their alignment, transparency, and robustness. Or AGI is farther off but what we learn on current systems is likely to be pretty transferable. —This seems especially common among AGI researchers (ex), less common among general ML researchers, and uncommon among ML practitioners.

  • Making AGI safe requires a solid understanding of what intelligence is, how to make decisions, how to handle logical uncertainty, and other questions. We need to build a theoretical foundation for provably aligned AGI. —This view is primarily associated with MIRI and is relatively popular within EA.

The three places where I see people disagreeing the most are:

  • Will AGI look like what we have now? The more similar you think it will be to what we have now, the more likely work on it today is to transfer. This seems to be the main difference between the "too soon to work on it" and "work on making current systems safer" groups.

  • Does progress require applicability? Can we advance our understanding with a theory-only approach that we only apply much later, or do we need to be constantly testing ideas in real systems? This seems like the main reason ML people are skeptical of theoretical-foundation style approaches.

  • Does safety require proof? Can we make a system we trust where we only have observational evidence that it's doing the sort of things we want it to do? This seems like the main reason theoretical-foundation people are skeptical of Concrete Problems style approaches.

Comparing this to my list when I was getting started:
  • How likely is it that current approaches are all we need for AGI with relatively straightforward extensions and a lot of scaling? Pretty much the same question as #1 above.

  • How valuable is it to work on solving problems that are probably not the right ones? ... I think some of the disagreement may be EAs being more comfortable valuing work on things that they're pretty sure won't be valuable but will be very valuable if they do. Not at the Pascal's Wager level, but at levels like 15%. But it's also just an open question, and people have pretty different senses of how much work now is likely to transfer.

  • How useful is it to have a strong theoretical foundation, vs just understanding the technology enough from an engineering perspective that we can make it do things for us? Pretty much my #3 above.

  • How similar is this to normal engineering? How much should we expect companies' desires that their AI systems do what they want them to do to work out? Talking to Dario convinced me this wasn't the right way to be thinking about it: "Dario's response was that transparency and safety are difficult research areas and, while they do pay off in the short run, they pay off more in the long run, so will tend to be underinvested in. There are also many more promising research directions than researchers right now, so what ends up getting explored is highly dependent on what researchers are interested in."

  • As we get closer to AGI, how likely is the ML community to take superintelligence risk seriously? I no longer think this is a cause of disagreement. People in the ML community who think we're close generally do take it seriously and many work on it.

Referenced in:

Comment via: google plus, facebook

Recent posts on blogs I like:

Weird Nerd Moralism

Let's talk about Nietzsche without having read him!

via Thing of Things November 29, 2024

Developing the middle ground on polarized topics

Avoiding false dichotomies The post Developing the middle ground on polarized topics appeared first on Otherwise.

via Otherwise November 25, 2024

How to eat vegan on Icon of the Seas

Royal Caribbean has a new giant cruise ship, Icon of the Seas, which has a large selection of food options.

via Home November 21, 2024

more     (via openring)