{"items": [{"author": "Dario", "source_link": "https://www.facebook.com/jefftk/posts/887465634632?comment_id=887483873082", "anchor": "fb-887483873082", "service": "fb", "text": "There\u2019s an implicit assumption, or perhaps just a way of phrasing things, that\u2019s been bothering me a bit about these posts (mostly not your fault, probably more a failure to communicate on my part).  The posts ask whether work like Concrete Problems/Human Preferences is or isn\u2019t good ML work and separately whether it is or isn\u2019t helpful with AGI, as if these are two separate and totally binary questions.  My view is different \u2014 I see a spectrum of more and more capable AI systems, and a subfield of research about how to keep their actions in line with what humans would want, even as they make more and more autonomous decisions.  Consider DQN \u2014 the first system to play a wide range of Atari games using a fairly general algorithm.  No one claims it is \u201crelevant\u201d to general intelligence or \u201cwill transfer\u201d to general intelligence or a \u201ccritical component\u201d.  But it contains insights and capabilities that have opened up new directions \u2014 it and ideas inspired by it are the root of an intellectual tree that will probably touch on many capabilities key to building general intelligence.  If you\u2019re trying to work towards general intelligence in 2013, it\u2019s hard to do better than DQN as a stepping stone.  It would be odd to say it\u2019s \u201cgood ML work but doesn\u2019t tell us anything about intelligence\u201d (some people still do say this but it\u2019s becoming less common).<br><br>I don\u2019t particularly want my work presented as \u201cattempting to be helpful with AGI safety\u201d so much as \u201ctaking the first small steps towards a long-term vision where powerful AI does what humans want it to do\u201d.  The small steps are very concrete and simple, operate on systems we understand, and make no claim of being directly relevant to the final goal, any more than DQN claims to be a blueprint for general intelligence.  The long term vision motivates the short term research but involves claims that by their nature can\u2019t be known for sure in advance. <br><br>I think this is actually a good template for research: take small, concrete steps that you are able to strongly defend empirically, and that are useful on their own, while also having a long-term vision that can be vague and based on hunches, but that is expansive and helps to guide your belief about why some problems are particularly important to work on.  People can and should objectively judge the short-term contribution (does it work? is it impressive? is it honest? is it extensible?), but the long term vision is mostly a matter of differing intuitions, and only time will tell who is right.  Given that it takes so long to tell which visions are the right ones, I consider it healthy for many different researchers to pursue different visions, while all being held to the standard of producing useful, understandable work whose motivations and results are narrowly justifiable.  
All I really ask from other researchers is that they consider disagreements over safety/alignment to be in this same vein of healthy disagreements over long-term vision; in my experience they are mostly already doing this.", "timestamp": "1500129066"}, {"author": "Raymond", "source_link": "https://www.facebook.com/jefftk/posts/887465634632?comment_id=887483873082&reply_comment_id=887593852682", "anchor": "fb-887483873082_887593852682", "service": "fb", "text": "&rarr;&nbsp;&gt; Given that it takes so long to tell which visions are the right ones, I consider it healthy for many different researchers to pursue different visions, while all being held to the standard of producing useful, understandable work whose motivations and results are narrowly justifiable. All I really ask from other researchers is that they consider disagreements over safety/alignment to be in this same vein of healthy disagreements over long-term vision; in my experience they are mostly already doing this.<br><br>This really resonates with me. My own personal expounding:<br><br>When humanity acquired nuclear weapons, AFAICT we barely survived - there were multiple close calls, we were not prepared, it is possible that the main reason we're alive is anthropics, and regardless, it seems like we were very *underprepared* for the political and technical challenges ahead.<br><br>With AI (and other powerful technologies) I want to be extremely *overprepared*. If we can all see the problem coming, we should not be waiting till the last minute to cram for the final exam. We should be doing everything in our power to ensure we pass with flying colors and that it's not even *in doubt* whether we'll pass with flying colors.<br><br>Do we know exactly what to do? No. But we have lots of ideas that seem *worth having fully explored*. I think it's worth holding researchers to high standards, but with so many *really* wasteful ventures (and science-for-the-sake-of-science) going on in the world, putting this on hold because it seems maybe-wasteful seems like it's holding AI research to a higher standard than most exploratory math and science.", "timestamp": "1500157212"}, {"author": "Jesse", "source_link": "https://www.facebook.com/jefftk/posts/887465634632?comment_id=887483873082&reply_comment_id=887934969082", "anchor": "fb-887483873082_887934969082", "service": "fb", "text": "&rarr;&nbsp;&gt; Consider DQN \u2014 the first system to play a wide range of Atari games using a fairly general algorithm. No one claims it is \u201crelevant\u201d to general intelligence or \u201cwill transfer\u201d to general intelligence or a \u201ccritical component\u201d<br><br>Just as a data point, my strong impression as an outside observer has been that many people interested in AI and AI safety *do* believe all of these things.  If lots of people share your more nuanced perspective, I wish it would be made clearer.", "timestamp": "1500256875"}, {"author": "Dario", "source_link": "https://www.facebook.com/jefftk/posts/887465634632?comment_id=887486303212", "anchor": "fb-887486303212", "service": "fb", "text": "To take one example from your post, the approaches I'm ultimately interested in using to make powerful AI safe are unlikely to rely on learning crude reward/utility functions; that's an overly literal interpretation of the direction we're taking our research.  I think AI/ML systems will ultimately have to learn detailed models of how humans work, and then use those models in some complex way to make decisions humans would approve of.  
But right now even crude models of human preferences haven't been much explored in ML, so we have to start there.  I entirely agree these early approaches can't be \"applied\" or \"transferred\" to AGI.  The goal isn't to build AGI components, but rather to start a line of research that leads to more and more sophisticated ways of aligning with human preferences.", "timestamp": "1500130037"}, {"author": "James", "source_link": "https://www.facebook.com/jefftk/posts/887465634632?comment_id=887531387862", "anchor": "fb-887531387862", "service": "fb", "text": "Imagine that in 1900 physicists learned of the possibility of hydrogen bombs, and theorized that by 1970 a total war between powers armed with these weapons could destroy civilization.  Would there have been any value to people speculating about a future containing these super-weapons, or would it have been a more productive use of time to wait until the eve of these weapons\u2019 development before considering how they would remake warfare?  Thinking about the possibility of mutually assured destruction, establishing taboos against using atomic weapons in anger, and designing an international system to stop terrorists and rogue states from acquiring nuclear weapons would, I believe, have been a worthwhile endeavor for scholars living in my hypothetical 1900 world.", "timestamp": "1500142794"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/887465634632?comment_id=887531387862&reply_comment_id=887535968682", "anchor": "fb-887531387862_887535968682", "service": "fb", "text": "&rarr;&nbsp;It's much simpler to reason about how to handle a \"weapon that's much more destructive than anything we have now\", and the specifics don't matter much.  For example, https://en.wikipedia.org/wiki/Solution_Unsatisfactory would be almost unchanged if you switched it from radiation-based weapons to bomb-based ones.<br><br>There are also ways that specifics matter a lot: what protects you against them, how targeted they can be, etc.  For example, how valuable is it to promote dispersion?", "timestamp": "1500144565"}, {"author": "James", "source_link": "https://www.facebook.com/jefftk/posts/887465634632?comment_id=887531387862&reply_comment_id=887540359882", "anchor": "fb-887531387862_887540359882", "service": "fb", "text": "&rarr;&nbsp;Jeff&nbsp;Kaufman Agreed.  This is why Steve Omohundro's Basic AI Drives approach is so valuable: it shows that for a huge set of possible computer superintelligences, the AIs would likely have similar intermediate goals.  Alas, the intermediate goals would have a high chance of causing mankind's extinction.  Taking this into account, the sane approach would be to slow down the development of any computer superintelligence.", "timestamp": "1500145605"}, {"author": "Alexey", "source_link": "https://www.facebook.com/jefftk/posts/887465634632?comment_id=887531387862&reply_comment_id=887882479272", "anchor": "fb-887531387862_887882479272", "service": "fb", "text": "&rarr;&nbsp;I just wrote an article arguing that one such drive would be a trend towards the militarisation of AI, as it would need to take over the world.", "timestamp": "1500243056"}, {"author": "Will", "source_link": "https://www.facebook.com/jefftk/posts/887465634632?comment_id=887718018852", "anchor": "fb-887718018852", "service": "fb", "text": "I think AI safety might be valuable for AGI. 
However, I think other things would be more valuable.<br><br>It would be interesting to ask the question: What percentage of EA funding should go to each type of AI safety activity? Possible activities are:<br><br>1) AI safety with current ML<br>2) Theory of Reasoning<br>3) AGI safety strategy/policy<br>4) Trying to make the breakthroughs required for AGI so that AI safety people can be the first to think about it and spread safety thinking when talking about AGI.<br>5) AI safety outreach<br><br>I personally lean towards the last 3. I like the first 2 and it would be great if all 5 could share a community.", "timestamp": "1500193696"}]}