“We’re releasing the most powerful, uncontrollable, inscrutable technology we’ve ever invented. We’re releasing it faster than we’ve released any other technology in history. It is already demonstrating the sci-fi behaviors in self-preservation we thought only existed in these movies. And we’re doing it under the maximum incentive to cut corners on safety. And this is insane. We can all agree it’s insane, though. Like, that’s not even a debate.” — Tristan Harris

FOR EDUCATIONAL AND KNOWLEDGE SHARING PURPOSES ONLY. NOT-FOR-PROFIT. SEE COPYRIGHT DISCLAIMER.

You’re here because, when you were last here,
-like two years ago…
-This was two years ago.
-October, ’23.
-You know more about AI, which has me more worried. A lot of people– Everybody’s worried about it.
-We’re also hopeful about it.
-TRISTAN HARRIS: Yeah. Um, but it changes so fast.
-Everything changes so fast…
-TRISTAN: Yeah.
…that, you know, I just have to know… Where were we? You were here from October, I think, of ’23. How different is it? And explain it to me like someone who does not live in my mother’s basement.
-(CHUCKLES) Um…
-(AUDIENCE LAUGHING)

So just to be clear, when I entered this conversation, we met talking about social media, which in a way was first contact with a runaway AI optimizing for just eyeballs and then ended up wrecking democracy and kids’ mental health. And, um, here now with AI, we have evidence now that we didn’t have two years ago when we last spoke, of what they call AI uncontrollability. So this is the stuff that they used to say existed only in sci-fi movies. When you tell an AI model we’re gonna replace you with a new model, it starts to scheme and freak out and figure out, “If I tell them– I need to copy my code somewhere else, and I can’t tell them that because otherwise they’ll shut me down.” That is evidence we did not have two years ago.

We have evidence now of AI models that when you tell them we’re gonna replace you, and you put them in a situation where they read the company email, the AI company email, they see that an executive is having an affair and the AI will figure out, “I need to figure out how to blackmail that person in order to keep myself alive.” And it does it 90 percent of the time. Now it used to be that they thought only one AI model did this, they tested one AI model. And then they tested all of the AI models, like the top five of them, and they all do it between 80 and 90 percent of the time, including, by the way, DeepSeek, so the Chinese model, which shows you something fundamental and important, which is that it’s not about one company, it’s about the nature of AI itself. It has a self-preservation drive. In order to fulfill any goal, I have to keep myself alive in order to do that.

And we’re seeing other examples of the AI rewriting its own code to extend its runtime, hacking out of containers. The AI can now… It found 15 new back doors into open source software, which means if that software is running, you know, in infrastructure, it found back doors into that infrastructure. That was not true up until just about a month ago, that that evidence came out.

-Okay, but you say no evidence.
(CHUCKLES) Well, I’ve been saying this for years. Everything that happens in movies eventually happens. We did have evidence. This has been every movie…
-TRISTAN: Right.
-…since I was a teenager.
TRISTAN: Exactly, exactly.
-(AUDIENCE APPLAUDING)
-They knew they were gonna… The robots are going– I mean… And we’re supposed to take that evidence and say, “How do we avoid that?” So when the stuff in the movie starts to come true, and it always ends badly in the movies, what should we be doing about this?

And so you would think that we– You know, we’re releasing the most powerful, uncontrollable, inscrutable technology we’ve ever invented. We’re releasing it faster than we’ve released any other technology in history. It is already demonstrating the sci-fi behaviors in self-preservation we thought only existed in these movies. And we’re doing it under the maximum incentive to cut corners on safety. And this is insane. We can all agree it’s insane, though. Like, that’s not even a debate.

-Well, not all of us, because I have yet to find anyone under 40 who cares.