A very important read from extremely respected and knowledgable scientists!

FOR EDUCATIONAL AND KNOWLEDGE SHARING PURPOSES ONLY. NOT-FOR-PROFIT. SEE COPYRIGHT DISCLAIMER.

SCIENCE. Managing extreme AI risks amid rapid progress

Preparation requires technical research and development, as well as adaptive, proactive governance.

20 May 2024

YOSHUA BENGIO, GEOFFREY HINTON, ANDREW YAO, DAWN SONG, PIETER ABBEEL, TREVOR DARRELL, YUVAL NOAH HARARI, YA-QIN ZHANG, LAN XUE, SHAI SHALEV-SHWARTZ, GILLIAN HADFIELD, JEFF CLUNE, TEGAN MAHARAJ, FRANK HUTTER, ATILIM GÜNEŞ BAYDIN, SHEILA MCILRAITH, QIQI GAO, ASHWIN ACHARYA, DAVID KRUEGER, ANCA DRAGAN, PHILIP TORR, STUART RUSSELL, DANIEL KAHNEMAN, JAN BRAUNER , AND SÖREN MINDERMANN

Artificial intelligence (AI) is progressing rapidly, and companies are shifting their focus to developing generalist AI systems that can autonomously act and pursue goals. Increases in capabilities and autonomy may soon massively amplify AI’s impact, with risks that include large-scale social harms, malicious uses, and an irreversible loss of human control over autonomous AI systems. Although researchers have warned of extreme risks from AI (1), there is a lack of consensus about how to manage them. Society’s response, despite promising first steps, is incommensurate with the possibility of rapid, transformative progress that is expected by many experts. AI safety research is lagging. Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness and barely address autonomous systems. Drawing on lessons learned from other safety-critical technologies, we outline a comprehensive plan that combines technical research and development (R&D) with proactive, adaptive governance mechanisms for a more commensurate preparation.

Regulating advanced artificial agents – Science

Michael K. Cohen, Noam Kolt, Yoshua Bengio, Gillian K. Hadfield, Stuart Russell

Governance frameworks should address the prospect of AI systems that cannot be safely tested

Technical experts and policy-makers have increasingly emphasized the need to address extinction risk from artificial intelligence (AI) systems that might circumvent safeguards and thwart attempts to control them (1). Reinforcement learning (RL) agents that plan over a long time horizon far more effectively than humans present particular risks. Giving an advanced AI system the objective to maximize its reward and, at some point, withholding reward from it, strongly incentivizes the AI system to take humans out of the loop, if it has the opportunity. The incentive to deceive humans and thwart human control arises not only for RL agents but for long-term planning agents (LTPAs) more generally. Because empirical testing of sufficiently capable LTPAs is unlikely to uncover these dangerous tendencies, our core regulatory proposal is simple: Developers should not be permitted to build sufficiently capable LTPAs, and the resources required to build them should be subject to stringent controls.

FOR EDUCATIONAL AND KNOWLEDGE SHARING PURPOSES ONLY. NOT-FOR-PROFIT. SEE COPYRIGHT DISCLAIMER.