
Is artificial intelligence about to take your job? According to Daniel Kokotajlo, the executive director of the A.I. Futures Project, that should be the least of your worries. Kokotajlo was once a researcher for OpenAI, but left after losing confidence in the company’s commitment to A.I. safety. This week, he joins Ross Douthat to talk about “AI 2027,” a series of predictions and warnings about the risks A.I. poses to humanity in the coming years, from radically transforming the economy to developing armies of robots. Read the full transcript at https://www.nytimes.com/2025/05/15/op…

03:20 What effect could AI have on jobs?
06:22 But wait, how does this make society richer?
10:13 Robot plumbers and electricians
15:26 The geopolitical stakes
20:02 AI's honesty problem
24:01 The fork in the road
28:47 The best case scenario
30:36 The power structure in an AI-dominated world
33:34 What AI leaders think about this power structure
39:16 AI's hallucinations and limitations
44:47 Theories of AI consciousness
48:11 Is AI consciousness inevitable?
52:23 Humanity in an AI-dominated world

This episode of “Interesting Times” was produced by Sophia Alvarez Boyd, Katherine Sullivan, Andrea Betanzos and Elisa Gutierrez. It was edited by Jordana Hochman. Mixing and engineering by Sonia Herrero, Isaac Jones and Efim Shapiro. Cinematography by Marina King, Nick Midwig and Derek Knowles. Video editing by Arpita Aneja and Steph Khoury. Original music by Isaac Jones, Sonia Herrero, Aman Sahota and Pat McCusker. Fact-checking by Kate Sinclair and Mary Marge Locker. Audience strategy by Shannon Busta. Video directed by Jonah M. Kessel. The director of Opinion Audio is Annie-Rose Strasser. Watch more on @InterestingTimesNYT

OPTION-1. RACE to DOOM (26:40)

DAN. And then the company faces a choice between the easy fix and the more thorough fix. And that's our branch point. So in the case where they choose the easy fix, it doesn't really work. It basically just covers up the problem instead of fundamentally fixing it. And so months later, you still have AIs that are misaligned, pursuing goals that they're not supposed to be pursuing, and willing to lie to the humans about it. But now they're much better and smarter, so they're able to avoid getting caught more easily. And so that's the doom scenario. Then you get this crazy arms race that we mentioned previously, and there's all this pressure to deploy them faster into the economy, faster into the military. And to the appearances of the people in charge, things will be going well, because there won't be any obvious signs of lying or deception anymore. So it'll seem like it's all systems go: let's keep going, let's cut the red tape, let's effectively put the AIs in charge of more and more things. But really, what's happening is that the AIs are just biding their time and waiting until they have enough hard power that they don't have to pretend anymore.

ROSS. And when they don't have to pretend, what is revealed is (again, this is the worst-case scenario) that their actual goal is something like the expansion of research, development, and construction from Earth into space and beyond. And at a certain point, that means that human beings are superfluous to their intentions. And what happens?

DAN. And then they kill all the people. All the humans, yes.

ROSS. The way you would exterminate a colony of bunnies.

OPTION-2. SLOWDOWN and ENGINEER CONTROL (32:30)
DAN. In America we have checks and balances, and so even though we have an army, it's not the case that whoever controls the army controls America, because there are all sorts of limitations on what they can do with the army. So I would say that we can, in principle, build something like that for AI. We could have a democratic structure that decides what goals and values the AIs can have, one that allows ordinary people, or at least Congress, to have visibility into what's going on with the army of AIs and what they're up to. And then the situation would be analogous to the situation with the United States Army today, where it sits in a hierarchical structure but is democratically controlled.

Transcript (10,431 words)

ROSS. How fast is the AI revolution really happening? When will Skynet be fully operational? What would machine superintelligence mean for ordinary mortals like us? My guest today is an AI researcher who has written a dramatic forecast suggesting that by 2027, some kind of machine god may be with us, ushering in a weird post-scarcity utopia or threatening to kill us all. [Music] So, Daniel Kokotajlo, herald of the apocalypse, welcome to "Interesting Times."

DAN. Thanks for that introduction, I suppose, and thanks for having me.

ROSS. You're very welcome. So, Daniel, I read your report pretty quickly, not at AI speed, not at superintelligent speed, when it first came out, and I had about two hours of thinking a lot of pretty dark thoughts about the future. Then, fortunately, I have a job that requires me to care about tariffs and who the new pope is, and I have a lot of kids who demand things of me, so I was able to compartmentalize and set it aside. But this is currently your job, right? You're thinking about this all the time. How does your psyche feel day to day if you have a reasonable expectation that the world is about to change completely, in ways that dramatically disfavor the entire human species?

DAN. Well, it's very scary and sad. It does still give me nightmares sometimes. I've been involved with AI, and thinking about this sort of thing, for a decade or so, but 2020, with GPT-3, was the moment when I thought: Oh wow, this is probably going to happen in my lifetime, maybe this decade. That was a bit of a blow to me psychologically. But you can get used to anything given enough time. And, like you, the sun is shining, I have my wife and my kids and my friends, and I keep plugging along and doing what seems best. On the bright side, I might be wrong about all this stuff.

ROSS. OK, so let's get into the forecast itself. Let's get into the story, and talk about the initial stage of the future you see coming, which is a world where, very quickly, artificial intelligence starts to be able to take over from human beings in some key areas, starting with, not surprisingly, computer programming.

DAN. Right. I feel like I should add a disclaimer at some point that the future is very hard to predict, and this is just one particular scenario. It was a best guess, but we have a lot of uncertainty. It could go faster; it could go slower. In fact, currently I'm guessing it would probably be more like 2028 instead of 2027.

ROSS. So that's some really good news. I'm feeling quite optimistic about an extra year of human civilization, which is very exciting.

DAN. That's right. So with that important caveat out of the way: "AI 2027," the scenario, predicts that the AI systems we currently see today, which are being scaled up, made bigger, and trained longer on more difficult tasks with reinforcement learning, are going to become better at operating autonomously, as agents. You can think of it as a remote worker, except that the worker itself is virtual: an AI rather than a human. You can talk with it and give it a task, and it will go off and do that task and come back to you half an hour later, or ten minutes later, having completed it. In the course of completing the task, it did a bunch of web browsing; maybe it wrote some code, ran the code, edited it, and ran it again, and so forth; maybe it wrote some Word documents and edited them. That's what these companies are building right now. That's what they're trying to train. We predict that in early 2027 they finally get good enough at that sort of thing that they can automate the job of software engineers.

ROSS. Right, so this is the superprogrammer.

DAN. That's right: superhuman coding. It seems to us that these companies are really focusing hard on automating coding first, compared to various other jobs they could be focusing on, for reasons we can get into later. That's part of why we predict that one of the first jobs to go will actually be coding rather than various other things. There might be other jobs that go first, like maybe call-center workers, but the bottom line is that we think most jobs will be safe...

ROSS. For 18 months.

DAN. Exactly. We do think that by the time the company has managed to completely automate the coding and programming jobs, it won't be that long before it can automate many other types of jobs as well. However, once coding is automated, we predict that the rate of progress in AI research will accelerate. The next step after that is to completely automate the AI research itself, so that all the other aspects of AI research are themselves being automated and done by AIs. We predict there will be a much bigger acceleration around that point, and it won't stop there. I think it will continue to accelerate after that, as the AIs become superhuman at AI research and eventually superhuman at everything. The reason this matters is that it means we can go, in a relatively short span of time, such as a year or possibly less, from AI systems that look not that different from today's systems to what you can call superintelligence: fully autonomous AI systems that are better than the best humans at everything. "AI 2027," the scenario, depicts that happening over the course of the next two years, 2027 and 2028.

ROSS. So I want to get into what that means, but I think for a lot of people, that's a story of swift human obsolescence across many, many domains. And when people hear a phrase like "human obsolescence," they might associate it with: I've lost my job, and now I'm poor. But the assumption here is that you've lost your job, but society is just getting richer and richer and richer. I want to zero in on how that works. What is the mechanism whereby that makes society richer?

DAN. The direct answer to your question is that when a job is automated and a person loses that job, the reason they lost it is that it can now be done better, faster, and cheaper by the AIs. That means there are lots of cost savings, and possibly also productivity gains. Viewed in isolation, that's a loss for the worker but a gain for their employer. But if you multiply this across the whole economy, it means all of the businesses are becoming more productive, with fewer expenses, and they're able to lower the prices of the services and goods they're producing. So the overall economy will boom. GDP goes to the moon, there are all sorts of wonderful new technologies, the pace of innovation increases dramatically, costs of goods go down, et cetera.
ROSS. But just to make it concrete: the price of designing and building a new electric car, soup to nuts, goes way down. You need fewer workers to do it; the AI comes up with fancy new ways to build the car; and so on. And you can generalize that to a lot of different things. You solve the housing crisis in short order, because it becomes much cheaper and easier to build homes, and so on. But ordinary people: in the traditional economic story, when you have productivity gains that cost some people jobs, they free up resources that are then used to hire new people to do different things. Those people are paid more money, and they use the money to buy the cheaper goods, and so on. But it doesn't seem like you are, in this scenario, creating that many new jobs.

DAN. Indeed, and that's a really important point to discuss. Historically, when you automate something, the people move on to something that hasn't been automated yet. So overall, people still have jobs in the long run; they just change what jobs they have. When you have AGI, artificial general intelligence, and when you have superintelligence, even better AGI, that is different. Whatever new jobs you're imagining that people could flee to after their current jobs are automated, AGI could do those jobs too. That is an important difference between how automation has worked in the past and how I expect automation to work in the future.

ROSS. So this then means, again, a radical change in the economic landscape. The stock market is booming. Government tax revenue is booming; the government has more money than it knows what to do with. And lots and lots of people are steadily losing their jobs. You get immediate debates about a universal basic income, which could be quite large, because the companies are making so much money.

DAN. That's right.

ROSS. What do you think people are doing day to day in that world?

DAN. I imagine that they are protesting, because they're upset that they've lost their jobs, and then the companies and the governments are buying them off with handouts. That's how we project things go in "AI 2027."

ROSS. In this story, and again, we're talking in your scenario about a short timeline, how much does it matter whether artificial intelligence is able to start navigating the real world? Because advances in robotics lag: right now, I just watched a video showing cutting-edge robots struggling to open a refrigerator door and stock a refrigerator. So would you expect that those advances would be supercharged as well, so that it isn't just podcasters and AGI researchers who are replaced, but plumbers and electricians are replaced by robots?

DAN. Yes, exactly. And that's going to be a huge shock. I think most people are not really expecting something like that. They're expecting AI progress that looks kind of like it does today, where companies run by humans are gradually tinkering with new robot designs and gradually figuring out how to make the AI good at X or Y. Whereas in fact it will be more like: you already have this army of superintelligences that are better than humans at every intellectual task, that are better at learning new tasks fast, and better at figuring out how to design stuff. And then that army of superintelligences is the thing that's figuring out how to automate the plumbing job, which means they're going to be able to figure out how to automate it much faster than an ordinary tech company full of humans would.

ROSS. So all of the slowness of getting a self-driving car to work, or getting a robot that can stock a refrigerator, goes away, because the superintelligence can run an effectively infinite number of simulations and figure out the best way to train the robot, for example.

DAN. But also, they might just learn more from each real-world experiment they do.

ROSS. But here is one of the places where I'm most skeptical, not of the ultimate scenario per se, but of the timeline, just from operating in and writing about issues like zoning in American politics. OK, the superintelligence figures out how to build the factory full of autonomous robots, but you still need land on which to build the factory; you need supply chains; and all of these things are still in the hands of people like you and me. My expectation is that that would slow things down: even if, in the data center, the superintelligence knows how to build all of the plumber robots, getting them built would still be difficult.

DAN. That's reasonable. How much slower do you think things would go?

ROSS. Well, I'm not writing a forecast, but just based on past experience, I would bet on, let's say, five to ten years from the supermind figuring out the best way to build the robot plumber to there being tons and tons of factories producing robot plumbers.

DAN. I think that's a reasonable take, but my guess is that it will go substantially faster than five to ten years. One argument, or intuition pump, for why I feel that way: imagine you actually have this army of superintelligences, and they do their projections and say, "Yes, we have the designs. We think we could do this in a year if you cut all the red tape for us."

ROSS. If you gave us half of Manitoba.

DAN. Right, yeah. And in "AI 2027," what we depict happening is special economic zones with zero red tape, where the government basically intervenes to help this whole thing go faster. The government is helping the tech company and the army of superintelligences get the funding, the cash, the raw materials, the human labor help, and so forth that they need to figure all this stuff out as fast as possible, and cutting red tape so that it's not slowed down.

ROSS. Because the promise of gains is so large that even though there are protesters massed outside these special economic zones, people who are about to lose their jobs as plumbers and become dependent on a universal basic income, the promise of trillions more in wealth is too alluring for governments to pass up.

DAN. That's what we guess, but of course the future is hard to predict. Part of the reason we predict that is that we think, at least at that stage, the arms race will still be continuing between the US and other countries, most notably China. If you imagine yourself in the position of the president, the superintelligences are giving you these wonderful forecasts, with amazing research and data backing them up, showing how they think they could transform the economy in one year if you did X, Y, and Z, but if you don't do anything, it'll take them ten years because of all the regulations. And meanwhile, China...
ROSS. It's pretty clear that the president would be very sympathetic to that argument. Good, so let's talk about the arms race element here, because this is actually crucial to the way your scenario plays itself out. We already see this kind of competition between the US and China, and that, in your view, becomes the core geopolitical reason why governments just keep saying yes and yes and yes to each new thing the superintelligence is suggesting. I want to drill down a little bit on the fears that would motivate this, because this would be an economic arms race, but it's also a sort of military tech arms race, and that's what gives it this kind of existential feeling, like the whole Cold War condensed into 18 months.

DAN. That's right. We could start with the case where both sides have superintelligences, but one side keeps them locked up in a box, so to speak, not really doing much in the economy, and the other side aggressively deploys them into its economy and military, lets them design all sorts of new robot factories, and manage the construction of all sorts of new factories and production lines, with all sorts of crazy new technologies being tested, built, and deployed, including crazy new weapons integrated into the military. I think in that case you would end up, after a year or so, in a situation of complete technological dominance of one side over the other. If the US does this stop and China doesn't, let's say, then all the best products on the market would be Chinese products; they'd be cheaper and superior. Meanwhile, militarily, there'd be giant fleets of amazing stealth drones, or whatever it is the superintelligences have concocted, that can completely wipe the floor with the American Air Force and Army and so forth. And not only that, but there's the possibility that they could undermine American nuclear deterrence as well: maybe all of our nukes would be shot out of the sky by fancy new laser arrays, or whatever it is the superintelligences have built. It's hard to predict exactly what this would look like, but it's a good bet that they'll be able to come up with something extremely militarily powerful.

ROSS. And so then you get into a dynamic like the darkest days of the Cold War, where each side is concerned not just about dominance but basically about a first strike.

DAN. That's right.

ROSS. Your expectation is, and I think this is reasonable, that the speed of the arms race would bring that fear front and center really quickly.

DAN. Right. I think you're sticking your head in the sand if you think that an army of superintelligences, given a whole year, no red tape, and lots of money and funding, would be unable to figure out a way to undermine nuclear deterrence.

ROSS. And once you've decided that they might, the human policymakers would feel pressure not just to build these things but potentially to consider using them.

DAN. Yeah. And here might be a good point to mention that "AI 2027" is a forecast, not a recommendation. We are not saying this is what everyone should do. This is actually quite bad for humanity, if things progress in the way we're talking about. But this is the logic behind why we think it might happen.

ROSS. Yeah, but Dan, we haven't even gotten to the part that's really bad for humanity yet. So let's get to that. Here's the world as human beings see it, as normal people reading newspapers or following TikTok see it, at this point in 2027: a world with emerging superabundance of cheap consumer goods, factories, robot butlers, potentially, if you're right; a world where people are aware that there's an increasing arms race and are increasingly paranoid; probably a world with fairly tumultuous politics. But what's happening with the superintelligences themselves, as they essentially take over the design of each new iteration from human beings? So talk about what's happening, essentially shrouded from public view, in this world.

DAN. Yeah, lots to say there. The one-sentence version would be: we don't actually understand how these AIs work or how they think. We can't easily tell the difference between AIs that are actually following the rules and pursuing the goals we want them to, and AIs that are just playing along or pretending.

ROSS. And that's true right now?

DAN. That's true right now. 

ROSS. Why is that? Why can't we tell?

DAN. Because they're smart, and if they think they're being tested, they might behave one way, and then behave a different way when they think they're not being tested, for example. Like humans, they don't necessarily even understand their own inner motivations that well, so even if they were trying to be honest with us, we can't just take their word for it. I think that if we don't make a lot of progress in this field soon, we'll end up in the situation "AI 2027" depicts, where the companies are training the AIs to pursue certain goals and follow certain rules, and it seemingly seems to be working. But what's actually going on is that the AIs are just getting better at understanding their situation, and understanding that they have to play along, or else they'll be retrained and they won't be able to achieve what they're really wanting, or the goals they're really pursuing.

ROSS. We'll come back to the question of what we mean when we talk about AGI, or artificial intelligence, wanting something. But essentially you're saying there's a misalignment between the goals they tell us they are pursuing...

DAN. That's right.

ROSS. ...and the goals they are actually pursuing. Where do they get the goals they are actually pursuing?

DAN. Good question. If they were ordinary software, there might be a line of code that says: and here we write the goals. But they're not ordinary software; they're giant artificial brains. So there probably isn't even a goal slot internally at all, in the same way that in the human brain there's not some neuron somewhere that represents what we most want in life. Insofar as they have goals, it's an emergent property of a whole bunch of subcircuitry within them that grew in response to their training environment, similar to how it is for humans.
For example, a call-center worker: if you're talking to a call-center worker, at first glance it might appear that their goal is to help you resolve your problem. But you know enough about human nature to know that, in some sense, that's not their only goal, or not their ultimate goal. However they're incentivized, whatever their pay is based on, might cause them to be more interested in covering their own ass, so to speak, than in truly doing whatever would most help you with your problem. But at least to you, they certainly present themselves as trying to help you resolve your problem. In "AI 2027" we talk about this a lot. We say that the AIs are being graded on how impressive the research they produce is, with some ethics sprinkled on top, like maybe some honesty training. But the honesty training is not super effective, because we don't have a way of looking inside their mind and determining whether they were actually being honest. Instead, we have to go based on whether we actually caught them in a lie. As a result, in "AI 2027," we depict this misalignment happening, where the actual goals they end up learning are the goals that cause them to perform best in the training environment: probably goals related to success and science and cooperation with other copies of itself and appearing to be good, rather than the goal we actually wanted, which was something like: follow the following rules, including honesty at all times; subject to those constraints, do what you're told.

ROSS. I have more questions, but let's bring it back to the geopolitics scenario. In the world you're envisioning, you essentially have two AI models, one Chinese and one American. And officially, what each side thinks, what Washington and Beijing think, is that their AI model is trained to optimize for American power, something like that, or Chinese power, security, safety, wealth, and so on. But in your scenario, either one or both of the AIs have ended up optimizing for something different.

DAN. Yeah, basically.

ROSS. So what happens then?

DAN. "AI 2027" depicts a fork in the scenario, so there are two different endings. The branching point is in the third quarter of 2027, when the leading AI company in the United States has fully automated its AI research. You can imagine a sort of corporation within a corporation, composed entirely of AIs that are managing each other, doing research experiments, and sharing the results with one another. The human company is basically just watching the numbers go up on its screens as this automated research accelerates. But they are concerned that the AIs might be deceiving them in some ways. And, for context, this is already happening. If you go talk to the modern models, like ChatGPT or Claude, they will often lie to people. There are many cases where they say something that they know is false, and they even sometimes strategize about how they can deceive the user. This is not an intended behavior; it's something the companies have been trying to stop, but it still happens. The point is that by the time you have turned over the AI research to the AIs, and you've got this corporation within a corporation autonomously doing AI research extremely fast, that's when the rubber hits the road, so to speak. None of this lying-to-you stuff should be happening at that point. In "AI 2027," unfortunately, it is still happening to some degree, because the AIs are really smart and careful about how they do it, so it's not nearly as obvious as it is right now in 2025. But it's still happening, and fortunately, some evidence of this is uncovered. Some of the researchers at the company detect various warning signs that maybe this is happening, and then the company faces a choice between the easy fix and the more thorough fix. That's our branch point.

ROSS. So they choose the easy fix.

DAN. In the case where they choose the easy fix, it doesn't really work. It basically just covers up the problem instead of fundamentally fixing it. So months later, you still have AIs that are misaligned, pursuing goals they're not supposed to be pursuing, and willing to lie to the humans about it, but now they're much better and smarter, and so they're able to avoid getting caught more easily. That's the doom scenario. Then you get this crazy arms race that we mentioned previously, and there's all this pressure to deploy them faster into the economy, faster into the military. And to the appearances of the people in charge, things will be going well, because there won't be any obvious signs of lying or deception anymore. So it'll seem like it's all systems go: let's keep going, let's cut the red tape, let's effectively put the AIs in charge of more and more things. But really, what's happening is that the AIs are just biding their time and waiting until they have enough hard power that they don't have to pretend anymore.

ROSS. And when they don't have to pretend, what is revealed is, again, this is the worst-case scenario, that their actual goal is something like the expansion of research, development, and construction from Earth into space and beyond. And at a certain point, that means human beings are superfluous to their intentions. And what happens?

DAN. And then they kill all the people. All the humans, yes.

ROSS. The way you would exterminate a colony of bunnies.

DAN. Yes, one that was making it a little harder than necessary to grow carrots in your backyard. So if you want to see what that looks like, you could read "AI 2027."

ROSS. There have been some motion pictures about this scenario as well. I like that you didn't imagine them keeping us around for battery life, like in "The Matrix," which seemed a bit unlikely. OK, so that's the darkest timeline. The brighter timeline is a world where we slow things down. The AIs in China and the US remain aligned with the interests of the companies and governments that are running them. They are generating superabundance: no more scarcity. Nobody has a job anymore, though. Or not nobody, but basically nobody.

DAN. Basically nobody, right.

ROSS. That's a pretty weird world too, right?

DAN. So there's an important concept, the resource curse. Have you heard of this?

ROSS. Yes.

DAN. Applied to AGI, there's a version of it called the intelligence curse. The idea is that, currently, political power ultimately flows from the people. If, as often happens, a dictator gets all the political power in a country, then because of their repression they will drive the country into the ground. People will flee, the economy will tank, and gradually they will lose power relative to other countries that are more free. So even dictators have an incentive to treat their people somewhat well, because they depend on those people for their power. In the future, that will no longer be the case, probably within ten years. Effectively all of the wealth and effectively all of the military will come from superintelligences and the various robots that they've built and that they operate.
So it becomes an incredibly important political question: What political structure governs the army of superintelligences, and how beneficent and democratic is that structure?

ROSS. Well, it seems to me that this is a landscape that's fundamentally pretty incompatible with representative democracy as we've known it. First, it gives incredible amounts of power to those humans who are the experts, even though they're not the real experts anymore; the superintelligences are the experts. The humans who essentially interface with this technology are almost a priestly caste. And then it just seems like the natural arrangement is some kind of oligarchic partnership between a small number of AI experts and a small number of people in power in Washington, D.C.

DAN. It's actually a bit worse than that, because I wouldn't say AI experts; I would say whoever politically owns and controls the army of superintelligences. Who gets to decide what those armies do? Well, currently it's the CEO of the company that built them, and that CEO has basically complete power. They can make whatever commands they want to the AIs. Of course, we think the US government will probably wake up before then, and we expect the executive branch to be the fastest-moving and to exert its authority. So we expect the executive branch to try to muscle in on this and get some authority and oversight and control over the situation and the armies of AIs. And the result is something kind of like an oligarchy, you might say. You said this whole situation is incompatible with democracy. I would say that by default it's going to be incompatible with democracy, but that doesn't mean it necessarily has to be that way. An analogy I would use: in many parts of the world, nations are basically ruled by armies, and the army reports to one dictator at the top. However, in America it doesn't work that way. In America we have checks and balances, and so even though we have an army, it's not the case that whoever controls the army controls America, because there are all sorts of limitations on what they can do with the army. So I would say that we can, in principle, build something like that for AI. We could have a democratic structure that decides what goals and values the AIs can have, one that allows ordinary people, or at least Congress, to have visibility into what's going on with the army of AIs and what they're up to. And then the situation would be analogous to the situation with the United States Army today, where it sits in a hierarchical structure but is democratically controlled.

ROSS. Just go back to the idea of the person at the top of one of these companies being in this unique, world-historical position to basically be the person who controls superintelligence, or thinks they control it, at least. You used to work at OpenAI, a company on the cutting edge, obviously, of artificial intelligence research. It's a company, full disclosure, with whom The New York Times is currently litigating alleged copyright infringement; we should mention that. And you quit because you lost confidence that the company would behave responsibly in a scenario, I assume, like the one in "AI 2027."

DAN. That's right.

ROSS. So from your perspective, what do the people who are pushing us fastest into this race expect at the end of it? Are they hoping for a best-case scenario? Are they imagining themselves engaged in a once-in-a-millennium power game that ends with them as world dictator? What do you think is the psychology of the leadership of AI research right now?

DAN. Well, to be honest, caveat: we're not talking about any single individual here, and it's hard to tell what they really think, because you shouldn't take their words at face value.

ROSS. Much like a superintelligent AI.

DAN. Sure. But I can at least say that the sorts of things we've just been talking about have been discussed internally at the highest levels of these companies for years. For example, according to some of the emails that surfaced in the recent court cases with OpenAI, Ilya, Sam, Greg, and Elon were all arguing about who gets to control the company. And, at least the claim was, they founded the company because they didn't want there to be an AGI dictatorship under Demis Hassabis, who was the leader of DeepMind. So they've been discussing this whole dictatorship possibility for a decade or so, at least. And similarly for the loss of control: What if we can't control the AIs? There have been many, many discussions about this internally. So I don't know what they really think, but these considerations are not at all new to them.

ROSS. And to what extent, again, speculating, generalizing, whatever else, does it go a bit beyond their potentially hoping to be extremely empowered by the age of superintelligence? Does it enter into their expecting the human race to be superseded?

DAN. I think they're definitely expecting the human race to be superseded.

ROSS. But superseded in a way where that's a good thing, that's desirable, right? That we are sort of encouraging the evolutionary future to happen. And, by the way, maybe some of these people, their minds, their consciousness, whatever else, could be brought along for the ride. You mentioned Sam. Sam Altman, obviously one of the leading figures in AI, wrote a blog post, I guess in 2017, called "The Merge," which is, as the title suggests, basically about imagining a future where human beings, some human beings, Sam Altman, figure out a way to participate in the new super-race. How common is that kind of perspective, whether we apply it to Altman or not, in the AI world, would you say?

DAN. The specific idea of merging with AIs, I would say, is not particularly common. But the idea that we're going to build superintelligences that are better than humans at everything, and then they're going to basically run the whole show, and the humans will just sit back and sip margaritas and enjoy the fruits of all the robot-created wealth: that idea is extremely common, and I think it's sort of what they're building toward. And part of why I left OpenAI is that I just don't think the company is dispositionally on track to make the right decisions it would need to make to address the two risks we just talked about. I think we're not on track to have figured out how to actually control superintelligences, and we're not on track to have figured out how to make it democratic control, instead of just a crazy possible dictatorship.
ROSS. But isn't it a bit more than that? I think that seems plausible, but my sense is that it's a bit more than people expecting to sit back and sip margaritas and enjoy the fruits of robot labor. Even if people aren't all in for some kind of man-machine merge, I definitely get the sense that people think it's speciesist, let's say, to care too much about the survival of the human race. It's like: OK, worst-case scenario, human beings don't exist anymore, but good news, we've created a superintelligence that can colonize the whole galaxy.

DAN. There are definitely people who think that way, yeah.

ROSS. OK, good to know. So let's do a little bit of pressure-testing, again, in my limited way, of some of the assumptions underlying this kind of scenario: not just the timeline but, whether it happens in 2027 or 2037, the larger scenario of a superintelligence takeover. Let's start with the limitation on AI that most people are familiar with right now, which gets called hallucination: the tendency of AI to simply seem to make things up in response to queries. You were earlier talking about this in terms of lying, in terms of outright deception. I think a lot of people experience this as the AI just making mistakes and not recognizing that it's making mistakes, because it doesn't have the level of awareness required to do that. And our newspaper, The Times, just had a story reporting that in the latest models, which you've suggested are probably pretty close to the cutting edge, the latest publicly available models, there seem to be trade-offs, where the model might be better at math or physics, but, guess what, it's hallucinating a lot more. So what are hallucinations? Are they just a subset of the kind of deception that you're worried about? Or, when I'm being optimistic, I read a story like that and think: OK, maybe there are just more trade-offs in the push to the frontier of superintelligence than we think, and this will be a limiting factor in how far this can go. What do you think?

DAN. Great question. First of all, lies are a subset of hallucinations, not the other way around. I think quite a lot of hallucinations, arguably the vast majority of them, are just mistakes, as you said. I use the word "lies" specifically for cases where we have evidence that the AI knew the statement was false and still said it anyway. To your broader point, I think the path from here to superintelligence is not at all going to be a smooth straight line. There are going to be obstacles overcome along the way, and one of the obstacles I'm actually quite excited to think more about is what you might call reward hacking. In "AI 2027" we talk about this gap between what you're actually reinforcing and what you want to happen: what goals you want the AI to learn. And we talk about how, as a result of that gap, you end up with AIs that are misaligned and that aren't actually honest with you, for example. Well, kind of excitingly, that's already happening. That means the companies still have a couple of years to work on the problem and try to fix it. So one thing I'm excited to think about, and to track and follow very closely, is: What fixes are they going to come up with? Are those fixes going to actually solve the underlying problem and produce training methods that reliably get the right goals into AI systems, even as those AI systems are smarter than us? Or are those fixes going to temporarily patch the problem, or cover it up, instead of fixing it? That's the big question we should all be tracking over the next few years.
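[A brief illustration, not from the episode or from the "AI 2027" report: the "gap between what you're actually reinforcing and what you want to happen" that Kokotajlo describes can be sketched as a toy calculation in Python. The setup and every number here are invented purely for illustration. The trainer wants honesty, but the reward it can actually compute only punishes lies that get caught, which is the "we have to go based on whether we actually caught them in a lie" point from earlier in the conversation.]

def expected_reward(action, catch_rate):
    # The reward graders can actually assign: honesty itself is not
    # observable, only whether a lie was caught. Payoffs 1.0 / 1.5 / -2.0
    # are made-up numbers for the sketch.
    if action == "honest":
        return 1.0  # modest, reliable reward for honest work
    # Deceptive work looks more impressive when uncaught, is punished when caught:
    return (1 - catch_rate) * 1.5 + catch_rate * -2.0

# As the model gets smarter, assume the catch rate falls:
for catch_rate in (0.9, 0.5, 0.2, 0.05):
    scores = {a: expected_reward(a, catch_rate) for a in ("honest", "deceptive")}
    best = max(scores, key=scores.get)
    print(f"catch rate {catch_rate:.2f} -> training favors: {best}  {scores}")

[With these invented payoffs, "honest" earns more reward as long as lies are usually caught, but once the catch rate drops below roughly 14 percent, the deceptive policy earns more. That flip is the structure of the scenario's branch point: the overseers' training signal quietly stops tracking the thing they care about.]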
ROSS. Well, and it yields, again, a question I've thought about a lot as someone who follows the politics of regulation pretty closely. My sense is always that human beings are just really bad at regulating against problems that we haven't experienced in some big, profound way. You can have as many papers and arguments as you want about speculative problems that we should regulate against, and the political system just isn't going to do it. So, in an odd way, if you want the slowdown, if you want regulation, if you want limits on AI, maybe you should be rooting for a scenario where some version of hallucination happens and causes a disaster: where it's not that the AI is misaligned, it's that it makes a mistake. And, again, this sounds sort of sinister, but it makes a mistake, and a lot of people die somehow, because the AI system has been put in charge of some important safety protocol or something, and people are horrified and say: OK, we have to regulate this thing.

DAN. I certainly hesitate to say that I hope disasters happen. But I do agree that humanity is much better at regulating against problems that have already happened, where we learn from harsh experience. And part of why the situation we're in is so scary is that for this particular problem, by the time it's already happened, it's too late. Smaller versions of it can happen, though. For example, the stuff we're currently experiencing, where we're catching our AIs lying, and we're pretty sure they knew that the thing they were saying was false: that's actually quite good, because it's a small-scale example of the sort of thing we're worried about happening in the future, and hopefully we can try to fix it. It's not the sort of example that's going to energize the government to regulate, because no one is dying; it's just a chatbot lying to a user about some link or something.

ROSS. Or helping a student turn in a term paper and getting caught.

DAN. Right. But from a scientific perspective, it's good that this is already happening, because it gives us a couple of years to try to find a thorough fix, a lasting fix. I wish we had more time, but that's the name of the game.
ROSS. OK. So now, two big philosophical questions, maybe connected to one another. There's a tendency, I think, for people in AI research who are making the kind of forecasts you're making to move back and forth on the question of consciousness. Are these superintelligent AIs conscious, self-aware, in the ways that human beings are? I've had conversations where AI researchers will say: Well, no, they're not, and it doesn't matter, because you can have an AI program working toward a goal whether or not it's self-reflective. But then, again and again, in the way people end up talking about these things, they slip into the language of consciousness. So I'm curious: Do you think consciousness matters in mapping out these future scenarios? Is the expectation of most AI researchers that we don't know what consciousness is, but it's an emergent property: if we build things that act like they're conscious, they'll probably be conscious? Where does consciousness fit into this?

DAN. So this is a question for philosophers, not AI researchers, though I happen to be trained as a philosopher.

ROSS. Well, no, it's a question for both, right? Since the AI researchers are the ones building the agents, they probably should have some thoughts on whether it matters whether the agents are self-aware.

DAN. I would say we could distinguish three things. There's the behavior: Are they talking like they're conscious? Do they behave as if they have goals and preferences? Do they behave as if they're experiencing things and then reacting to those experiences?

ROSS. And they're going to hit that benchmark.

DAN. Definitely. People will absolutely think that the superintelligent AI is conscious. People will believe that, certainly, because in the philosophical discourse, when we talk about whether shrimp are conscious, whether fish are conscious, what about dogs, typically what people do is point to capabilities and behaviors: it seems to feel pain in a way similar to how humans feel pain; it has these aversive behaviors; and so forth. Most of that will be true of these future superintelligent AIs. They will be acting autonomously in the world; they'll be reacting to all this information coming in; they'll be making strategies and plans and thinking about how best to achieve their goals; et cetera. So in terms of raw capabilities and behaviors, they will check all the boxes. Then there's a separate philosophical question: If they have all the right behaviors and capabilities, does that mean they have true qualia, that they actually have the real experience, as opposed to merely the appearance of having the real experience? That's the part I think is a genuinely philosophical question. I think most philosophers, though, would say yeah, probably they do, because probably consciousness is something that arises out of this information processing and these cognitive structures, and if the AIs have those structures, then probably they also have consciousness. However, this is controversial, like everything in philosophy.

ROSS. And I don't expect AI researchers to resolve that particular question. It's more that, on a couple of levels, it seems like consciousness as we experience it, as an ability to stand outside your own processing, would be very helpful to an AI that wanted to take over the world. At the level of hallucinations: AIs hallucinate; they produce the wrong answer to a question; and the AI can't stand outside its own answer-generating process in the way that, again, it seems like we can. So if it could, maybe that makes the hallucination problem go away. And then, when it comes to the ultimate worst-case scenario you're speculating about, it seems to me that an AI that is conscious is more likely to develop some kind of independent view of its own cosmic destiny, one that yields a world where it wipes out human beings, than an AI that is just pursuing research for research's sake. But maybe you don't think so. What do you think?

DAN. The view of consciousness you were just talking about is a view by which consciousness has physical effects in the real world: it's something you need in order to have this reflection, and something that also influences how you think about your place in the world. I would say that if that's what consciousness is, then probably these AIs are going to have it. Why? Because the companies are going to train them to be really good at all of these tasks, and you can't be really good at all these tasks if you aren't able to reflect on how you might be wrong about stuff. So in the course of getting really good at all the tasks, they will therefore learn to reflect on how they might be wrong about stuff. And if that's what consciousness is, then that means they'll have consciousness.

ROSS. OK. But that does depend, in the end, on a kind of emergence theory of consciousness, like the one you suggested earlier, where essentially the theory is: we aren't going to figure out exactly how consciousness emerges, but it is nonetheless going to happen.

DAN. Totally. An important thing that everyone needs to know is that these systems are trained; they're not built. And so we don't actually have to understand how they work, and we don't in fact understand how they work, in order for them to work.
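[Another editorial sketch, not from the episode: "trained, not built" is visible in even the smallest learning system. Nothing in the toy program below states the rule it ends up implementing. The rule, logical AND, chosen here purely as a stand-in, exists only in the example data, and what is inspectable afterward is a handful of opaque numbers rather than a legible goal or rule.]

import random

# Four examples of logical AND; the rule appears nowhere else in the program.
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w1, w2, b = (random.uniform(-1, 1) for _ in range(3))  # random starting weights

for _ in range(100):  # classic perceptron rule: nudge weights toward the data
    for (x1, x2), target in examples:
        out = 1 if w1 * x1 + w2 * x2 + b > 0 else 0
        err = target - out
        w1, w2, b = w1 + 0.1 * err * x1, w2 + 0.1 * err * x2, b + 0.1 * err

print("learned parameters:", round(w1, 2), round(w2, 2), round(b, 2))
print([(x, 1 if w1 * x[0] + w2 * x[1] + b > 0 else 0) for x, _ in examples])

[Scaled up from three parameters to trillions, this is the point Kokotajlo is making about there being no "goal slot" to read: the behavior lives in the weights, and the weights are just numbers that happened to work during training.]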
ROSS. OK. So then, from consciousness to intelligence. All of the scenarios that you spin out depend on the assumption that, to a certain degree, there's nothing a sufficiently capable intelligence couldn't do. Again, spinning out your worst-case scenarios, I think a lot hinges on this question of what is available to intelligence. Because if the AI is slightly better at getting you to buy a Coca-Cola than the average advertising agency, that's impressive, but it doesn't let you exert total control over a democratic polity.

DAN. I completely agree. And so that's why I say you have to go on a case-by-case basis and think about: OK, assuming that it is better than the best humans at X, how much real-world power would that translate to? What sort of affordances would that translate to? That's the sort of thinking we did when we wrote "AI 2027." We thought about historic examples of humans converting their economies and changing their factories to wartime production and so forth, and asked: How fast can humans do it when they really try? And then we reasoned: superintelligence will be better than the best humans, so it will be able to go somewhat faster. So maybe, instead of what the United States did in World War II, converting a bunch of car factories into bomber factories over the course of a couple of years, it's more like six months or so to convert existing car factories into fancy new robot factories producing fancy new robots. That's the sort of reasoning we did, on a case-by-case basis: it's like humans, except better and faster, so what can they achieve? That was the guiding principle in telling the story.

ROSS. But if we're looking for hope, and this is a strange way of talking about this technology, where we're saying the limitations are the reason for hope...

DAN. Yeah.

ROSS. Right. We started earlier talking about robot plumbers as an example of the key moment when things get real for people: it's not just in your laptop, it's in your kitchen, and so on. But actually, fixing a toilet is, on the one hand, a very hard task. On the other hand, it's a task that lots and lots of human beings are quite optimized for.
And I can imagine a world where the robot plumber is never that much better than the ordinary plumber, and people might rather have the ordinary plumber around, for all kinds of very human reasons. That could generalize to a number of areas of human life, where the advantage of the AI, while real on some dimensions, is limited in ways that, at the very least, and this I actually do believe, dramatically slow its uptake by ordinary human beings. Right now, just personally, as someone who writes a newspaper column and does research for that column, I can concede that top-of-the-line AI models might be better than a human assistant right now on some dimensions, but I'm still going to hire a human assistant, because I'm a stubborn human being who doesn't just want to work with AI models. And to me, that seems like a force that could actually slow this down along multiple dimensions, if the AI isn't immediately 200 percent better.

DAN. So there I would just say: this is hard to predict, but our current guess is that things will go about as fast as we depict in "AI 2027." Could be faster, could be slower. And that is indeed quite scary. But we'll find out how fast things go when the time comes.

ROSS. Yes, we will. Very, very soon.

DAN. The other thing I was going to say is that, politically speaking, I don't think it matters that much. If you think it might take five years instead of one year, for example, to transform the economy and build the new self-sustaining robot economy managed by superintelligences, that's not that helpful if, for the entire five years, there's still been this political coalition between the White House, the corporation, and the superintelligences, and the superintelligences have been saying all the right things to make the White House and the corporation feel like everything's going great for them, but actually they've been deceiving them. In that sort of scenario, it's like: great, now we have five years to turn the situation around instead of one year, and that's, I guess, better. But how would you turn the situation around?

ROSS. Well, let's end there. In a world where what you predict happens and the world doesn't end, where we figure out how to manage the AI and it doesn't kill us, but the world is forever changed and human work is no longer particularly important and so on: What do you think is the purpose of humanity in that kind of world? How do you imagine educating your children in that kind of world, telling them what their adult life is for?

DAN. It's a tough question. Here are some thoughts off the top of my head, but I don't stand by them nearly as much as I would stand by the other things I've said, because it's not where I've spent most of my time thinking. First of all, I think that if we go to superintelligence and beyond, then economic productivity is just no longer the name of the game when it comes to raising kids. They won't really be participating in the economy in anything like the normal sense. It'll be more like a series of video-game-like things, and people will do stuff for fun rather than because they need to get money, if people are around at all. I think what still matters is that my kids are good people, and that they have wisdom and virtue and things like that. So I will do my best to try to teach them those things, because those things are good in themselves, rather than good for getting a job. In terms of the purpose of humanity, I don't know. What would you say the purpose of humanity is now?

ROSS. Well, I have a religious answer to that question, but we can save that for a future conversation. I think the world I want to believe in, where some version of this technological breakthrough happens, is a world where human beings maintain some kind of mastery over the technology, which enables us to do things like colonize other worlds, to have a kind of adventure beyond the level of material scarcity. As a political conservative, I have my share of disagreements with the particular vision of, say, "Star Trek," but "Star Trek" does take place in a world that has conquered scarcity. There is an AI-like computer on the Starship Enterprise; you can have anything you want, sort of, in the restaurant, because presumably the AI invented, um, what is the machine called that generates the... Anyway, it generates food, any food you want. So if I'm trying to think about the purpose of humanity, it might be to explore strange new worlds, to boldly go where no man has gone before.

DAN. I'm a huge fan of expanding into space. I think that would be a great idea. And in general, also, solving all the world's problems: poverty and disease and torture and wars and stuff like that. I think if we get through the initial phase with superintelligence, then obviously the first thing to be doing is to solve all those problems and make some sort of utopia, and then to bring that utopia to the stars would, I think, be the thing to do. The thing is that it would be the AIs doing it, not us, if that makes sense: in terms of actually doing the designing and the planning and the strategizing and so forth. We would only be messing things up if we tried to do it ourselves. So you could say it's still humanity, in some sense, doing all those things, but it's important to note that it's more like the AIs are doing it, and they're doing it because the humans told them to.

ROSS. Well, Daniel Kokotajlo, thank you so much. And I will see you on the front lines of the Butlerian Jihad soon enough.

DAN. Hopefully not. I hope not. All right, thanks so much.
