|July 14th, 2017|
|airisk, giving [amp]|
Before our conversation he had looked briefly at Concrete Problems in AI Safety (pdf) and Deep Reinforcement Learning from Human Preferences (pdf). His view on both was that they were good work from the perspective of advancing ML but very unlikely to be relevant to making AGI safer: the systems that get us to AGI will look very different from the ones we have now.
One reason is that he saw a lot of learning from humans as being mediated by learning utility functions, but he sees utility functions as a very limited model. Economists and others use utility functions when talking about people because that's mathematically tractable, but it's a bad description of how humans actually behave. Trying to come up with utility functions that best explain human preferences or behavior probably solves some problems nicely and is helpful, but Bryce thought it was very unlikely to get us to AGI, though he wouldn't completely rule it out.
We tried to get more into why he thinks implementations of AGI will look vastly different from what we have today, and couldn't make progress there. Bryce thinks there are deep questions about what intelligence really is that we don't understand yet, and that as we make progress on those questions we'll develop very different sorts of ML systems. If something like today's deep learning is still a part of what we eventually end up with, it's more likely to be something that solves specific problems than to be a critical component.
(This has been a common theme in my discussions with people recently: very different intuitions on the distance to AGI in terms of technical work required, and also on whether work we're doing today is likely to transfer.)