This is a perfectly nice essay on not being a Yudkowskian “FOOMer”. But a worldview defined in opposition to the holes in that scenario has completely failed both at imagining other scenarios and at keeping abreast of the actual contours of the risks as the shape of this technology has started to materialize. Relegating those risks to a throwaway sentence about how it “might very well be catastrophic” is willful blindness that completely misses the point.
If a hard takeoff happens, it doesn’t matter whether it happens over 10 hours or 10 years. It doesn’t matter if intelligence isn’t the ultimate bottleneck, if we’re still dwarfed by machine capabilities. It doesn’t matter that the exact course of a river is computationally irreducible, to those of us living in the flood plain.
I agree that alignment is likely to ultimately be a philosophical and political problem rather than a technical one. I fail to see how that’s any justification for continuing down this path whatsoever.
Some thoughts:
On "Techno-Optimism": Like you, I believe technology has been strongly net-positive for humanity. At the same time, (almost) every new technology can cause damage that clearly outweighs its potential benefits: a knife can cut bread or kill a person. The reason knives (and technology in general) are still net-positive is that we KNOW and CARE about their risks: we keep them out of the hands of children, don't allow them aboard planes, etc. With powerful AI, I think most decision makers either don't know or don't care sufficiently about the risks.
On the limits of intelligence: To become uncontrollable and destroy our future, an AI would have to be neither all-knowing nor almighty. All it needs to be is a superhuman manipulator that can seize power the same way a human dictator would, only 100x more effectively. It could then make sure that we won't get in the way of whatever goal it pursues, which is very likely not a goal we would want it to pursue. It will likely not destroy us on purpose, just as we don't destroy rhinos on purpose. We just change the world in a way that suits our goals, not theirs.
On alignment: Regardless of how easy or hard problems 2 and 3 may be to solve, as long as problem 1 is unsolved, we must not risk creating something that could get out of control. This does not mean we can't continue developing powerful AI - there are many safe ways to use narrow superhuman tool-AI or sub-human general-purpose AI that, in combination, I believe can help us achieve almost anything a superintelligent AGI could do for us.
I find this post poorly argued.
I think this whole picture involves not thinking about what the world looks like after superintelligence. It doesn't sound like you are imagining:
- billions or trillions of robots doing all physical labour
- all economically relevant cognitive tasks performed by humans are now performed by AIs
- all of this is happening so quickly that humans are not able to keep up with the pace of discovery
You argue for limits on intelligence, which might be correct. But the specific limits you talk about -- like "predicting the locations of all the lakes and rivers on earth in 50 years" -- are much, much higher bars than the capabilities listed above. And you obviously don't need an insane level of prescience of the type you describe to take over the world.
For example, the takeover story in AI 2027 didn't rely on any vastly superhuman capabilities; it relied only on basically normal politicking, industrial buildouts, and bioweapons. More generally, AI takeover / extinction arguments don't rely on anything crazy like that level of prescience.
So the real claim you are (implicitly) arguing for is that AI capabilities will hit a wall before being able to automate the whole economy. I agree that AI takeover/extinction seems unlikely if the AIs cannot do all economically relevant tasks (including, e.g., extremely complicated ones like building new EUV machines), largely because the AIs wouldn't be able to be self-sufficient afterwards. But humans somehow manage to build new EUV machines (and operate the rest of the economy). So this is a vastly lower capability bar than the examples you argue against in the post.
I think you should try more directly to think about exactly how far AI capabilities will go. Start with the automation of all economically relevant tasks, and make specific claims about which tasks you think AIs won't be able to do. Then, once AIs cross the supposed walls, I hope you'll update towards my position.
Hi Dean,
Thanks for the new piece; I believe most of your subs/audience know how much you've got going on right now. Hope the little one is doing well!
ps. Agree or disagree, I always learn something or you send me into some rabbit hole of new things to research. That's the reason I subscribe, not for the blistering hot-takes. There's always Twitter/X for that... :0)
How can one not be an optimist? These technologies are changing so many things for the better! Living with the pessimists can be the real challenge: it's hard to tell apart those who will never get it from those who have yet to get it (but are open to it). Invest in the latter; let the former live in their lonely, isolated bubble. Ick.
Paul Christiano's prediction of gradual disempowerment seems much more likely. The debate between Yudkowsky and him was interesting. Computational irreducibility doesn't mean an AI couldn't build the future it wants. The best way to predict the future is by molding it.
This is terrific. I've been asked to write a job description for an "AI Czar" at a flagship university (not my own) and your piece gets at why most job descriptions fail: not distinguishing between intelligence and knowledge, not addressing the distributed knowledge problem, and not seeing that alignment is a philosophical and social issue. So, thank you, I'll be citing you and sending this around.