
Consensus Statement on AI Safety as a Global Public Good | IDAIS-Venice | Venice, Italy | September 5th-8th, 2024

Rapid advances in artificial intelligence (AI) systems' capabilities are pushing humanity closer to a world where AI meets and surpasses human intelligence. Experts agree these AI systems are likely to be developed in the coming decades, and many expect them to arrive imminently. Loss of human control over, or malicious use of, these AI systems could lead to catastrophic outcomes for all of humanity. Unfortunately, we have not yet developed the necessary science to control and safeguard the use of such advanced intelligence. The global nature of these risks from AI makes it necessary to recognize AI safety as a global public good and to work towards global governance of these risks. Collectively, we must prepare to avert the attendant catastrophic risks that could arrive at any time.

Promising initial steps by the international community show cooperation on AI safety and governance is achievable despite geopolitical tensions. At two intergovernmental summits, states and AI developers around the world committed to foundational principles to foster responsible development of AI and minimize risks. Building on these summits, states established AI Safety Institutes or similar institutions to advance testing, research and standards-setting.

These efforts are laudable and must continue. States must sufficiently resource AI Safety Institutes, continue to convene summits and support other global governance efforts. However, states must go further than they do today. As an initial step, states should develop authorities to detect and respond to AI incidents and catastrophic risks within their jurisdictions. These domestic authorities should coordinate to develop a global contingency plan to respond to severe AI incidents and catastrophic risks. In the longer term, states should develop an international governance regime to prevent the development of models that could pose global catastrophic risks.

Deep and foundational research needs to be conducted to guarantee the safety of advanced AI systems. This work must begin swiftly to ensure that safety methods are developed and validated prior to the advent of advanced AIs. To enable this, we call on states to carve out AI safety as a cooperative area of academic and technical activity, distinct from broader geostrategic competition on the development of AI capabilities.

The international community should consider setting up three clear processes to prepare for a world where advanced AI systems pose catastrophic risks:

Emergency Preparedness Agreements and Institutions, through which domestic AI safety authorities convene, collaborate on, and commit to implement model registration and disclosures, incident reporting, tripwires, and contingency plans.

A Safety Assurance Framework, requiring developers to make a high-confidence safety case prior to deploying models whose capabilities exceed specified thresholds. Post-deployment monitoring should also be a key component of assurance for highly capable AI systems as they become more widely adopted. These safety assurances should be subject to independent audits.

Independent Global AI Safety and Verification Research, developing techniques that would allow states to rigorously verify that AI safety-related claims made by developers, and potentially other states, are true and valid. To ensure the independence of this research it should be conducted globally and funded by a wide range of governments and philanthropists.

Emergency Preparedness Agreements and Institutions

States should agree on technical and institutional measures required to prepare for advanced AI systems, regardless of their development timescale. To facilitate these agreements, we need an international body to bring together AI safety authorities, fostering dialogue and collaboration in the development and auditing of AI safety regulations across different jurisdictions. This body would ensure states adopt and implement a minimal set of effective safety preparedness measures, including model registration, disclosure, and tripwires.

Over time, this body could also set standards for and commit to using verification methods to enforce domestic implementations of the Safety Assurance Framework. These methods can be mutually enforced through incentives and penalty mechanisms, such as conditioning access to markets on compliance with global standards. Experts and safety authorities should establish incident reporting and contingency plans, and regularly update the list of verified practices to reflect current scientific understanding. This body will be a critical initial coordination mechanism. In the long run, however, states will need to go further to ensure truly global governance of risks from advanced AI.

Safety Assurance Framework

Frontier AI developers must demonstrate to domestic authorities that the systems they develop or deploy will not cross red lines such as those defined in the IDAIS-Beijing consensus statement.

To implement this, we need to build further scientific consensus on risks and red lines. Additionally, we should set early-warning thresholds: levels of model capabilities indicating that a model may cross or come close to crossing a red line. This approach builds on and harmonizes the existing patchwork of voluntary commitments such as responsible scaling policies. Models whose capabilities fall below early-warning thresholds require only limited testing and evaluation, while more rigorous assurance mechanisms are needed for advanced AI systems exceeding these early-warning thresholds.
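
To make the tiering concrete, here is a minimal illustrative sketch in Python; the capability domains, score scale and threshold values are hypothetical placeholders, not figures drawn from this statement or from any existing scaling policy.

```python
# Hypothetical early-warning thresholds per capability domain, on an assumed
# 0-1 evaluation score scale. Real thresholds would be set by safety authorities.
EARLY_WARNING_THRESHOLDS = {
    "autonomous_replication": 0.20,
    "cyber_offense": 0.35,
    "biological_uplift": 0.25,
}

def required_assurance(eval_scores: dict) -> str:
    """Map a model's evaluation scores to the assurance regime it triggers."""
    exceeded = [
        domain
        for domain, threshold in EARLY_WARNING_THRESHOLDS.items()
        if eval_scores.get(domain, 0.0) >= threshold
    ]
    if not exceeded:
        # Below every early-warning threshold: limited testing and evaluation.
        return "limited testing and evaluation"
    # Crossing any threshold triggers the more rigorous regime, e.g. a
    # high-confidence safety case reviewed by independent experts.
    return "safety case + independent audit (triggered by: " + ", ".join(exceeded) + ")"

print(required_assurance({"cyber_offense": 0.40, "biological_uplift": 0.10}))
```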

Although testing can alert us to risks, it only gives us a coarse-grained understanding of a model. This is insufficient to provide safety guarantees for advanced AI systems. Developers should submit a high-confidence safety case, i.e., a quantitative analysis that would convince the scientific community that their system design is safe, as is common practice in other safety-critical engineering disciplines. Additionally, safety cases for sufficiently advanced systems should discuss organizational processes, including incentives and accountability structures, that favor safety.

Pre-deployment testing, evaluation and assurance are not sufficient. Advanced AI systems may increasingly engage in complex multi-agent interactions with other AI systems and users. This interaction may lead to emergent risks that are difficult to predict. Post-deployment monitoring is a critical part of an overall assurance framework, and could include continuous automated assessment of model behavior, centralized AI incident tracking databases, and reporting of the integration of AI in critical systems. Further assurance should be provided by automated run-time checks, such as by verifying that the assumptions of a safety case continue to hold and safely shutting down a model if operated in an out-of-scope environment.
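
A minimal sketch of such a run-time check, assuming the deployment-scope assumptions of a safety case (context length, tool access, approved domains) are recorded in machine-readable form and that the serving infrastructure supplies a shutdown hook; all names here are hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SafetyCaseAssumptions:
    """Deployment-scope assumptions under which the safety case was argued."""
    max_context_tokens: int
    tool_access_allowed: bool
    approved_domains: frozenset

def within_scope(assumptions: SafetyCaseAssumptions, request: dict) -> bool:
    """Check that a request stays inside the environment the safety case covers."""
    if request["context_tokens"] > assumptions.max_context_tokens:
        return False
    if request["uses_tools"] and not assumptions.tool_access_allowed:
        return False
    return request["domain"] in assumptions.approved_domains

def guarded_serve(assumptions, request, model_call, shutdown):
    """Serve a request only while the safety case's assumptions continue to hold."""
    if not within_scope(assumptions, request):
        # Out-of-scope operation: refuse the request and halt the deployment.
        shutdown("safety case assumption violated")
        return None
    return model_call(request)

# Usage example with stand-in callables.
assumptions = SafetyCaseAssumptions(8192, False, frozenset({"customer_support"}))
print(guarded_serve(assumptions,
                    {"context_tokens": 4096, "uses_tools": True, "domain": "customer_support"},
                    model_call=lambda r: "response",
                    shutdown=lambda reason: print("SHUTDOWN:", reason)))
```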

States have a key role to play in ensuring safety assurance happens. States should mandate that developers conduct regular testing for concerning capabilities, with transparency provided through independent pre-deployment audits by third parties granted sufficient access to developers' staff, systems and records necessary to verify the developer's claims. Additionally, for models exceeding early-warning thresholds, states could require that independent experts approve a developer's safety case prior to further training or deployment. Moreover, states can help institute ethical norms for AI engineering, for example by stipulating that engineers have an individual duty to protect the public interest, similar to that held by medical or legal professionals. Finally, states will also need to build governance processes to ensure adequate post-deployment monitoring.

While there may be variations in Safety Assurance Frameworks required nationally, states should collaborate to achieve mutual recognition and commensurability of frameworks.

Independent Global AI Safety and Verification Research

Independent research into AI safety and verification is critical to develop techniques to ensure the safety of advanced AI systems. States, philanthropists, corporations and experts should enable global independent AI safety and verification research through a series of Global AI Safety and Verification Funds. These funds should scale to a significant fraction of global AI research and development expenditures to adequately support and grow independent research capacity.

In addition to foundational AI safety research, these funds would focus on developing privacy-preserving and secure verification methods, which act as enablers for domestic governance and international cooperation. These methods would allow states to credibly check an AI developer’s evaluation results, and whether mitigations specified in their safety case are in place. In the future, these methods may also allow states to verify safety-related claims made by other states, including compliance with the Safety Assurance Frameworks and declarations of significant training runs.

Eventually, comprehensive verification could take place through several methods, including third party governance (e.g., independent audits), software (e.g., audit trails) and hardware (e.g., hardware-enabled mechanisms on AI chips). To ensure global trust, it will be important to have international collaborations developing and stress-testing verification methods.
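
As one illustrative software mechanism, the sketch below shows a hash-chained audit trail of evaluation records: each entry commits to the hash of the previous one, so a third party can recompute every link and detect tampering with or deletion of earlier records. The record fields are hypothetical.

```python
import hashlib
import json

def append_record(chain: list, record: dict) -> None:
    """Append an evaluation record, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"record": record, "prev_hash": prev_hash}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append({**body, "hash": digest})

def verify_chain(chain: list) -> bool:
    """Recompute every link; any edit to an earlier record breaks verification."""
    prev_hash = "0" * 64
    for entry in chain:
        body = {"record": entry["record"], "prev_hash": entry["prev_hash"]}
        expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True

# Usage example with hypothetical evaluation records.
chain = []
append_record(chain, {"model": "example-model-v1", "eval": "cyber_offense", "score": 0.12})
append_record(chain, {"model": "example-model-v1", "eval": "biological_uplift", "score": 0.05})
print(verify_chain(chain))  # True; altering the first record would make this False
```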

Critically, despite broader geopolitical tensions, globally trusted verification methods have allowed, and could allow again, states to commit to specific international agreements.

Signatories

Yoshua Bengio

Professor at the Université de Montréal
Founder and Scientific Director of Mila - Quebec AI Institute
Chair of the International Scientific Report on the Safety of Advanced AI
Turing Award Winner

Andrew Yao

Dean of the Institute for Interdisciplinary Information Sciences and Dean of the College of Artificial Intelligence at Tsinghua University
Turing Award Winner

Geoffrey Hinton

Chief Scientific Advisor
University of Toronto Vector Institute
Turing Award Winner

Zhang Ya-Qin 张亚勤

Director of the Tsinghua Institute for AI Industry Research (AIR)
Former President of Baidu

Stuart Russell

Professor and Smith-Zadeh Chair in Engineering at the University of California, Berkeley
Founder of the Center for Human-Compatible Artificial Intelligence (CHAI) at the University of California, Berkeley

Gillian Hadfield

Incoming Professor at the School of Government and Policy and School of Engineering of the Johns Hopkins University
Professor of Law and Strategic Management at the University of Toronto

Mary Robinson

Former President of Ireland, Chair of the Elders

Xue Lan 薛澜

Dean of Schwarzman College at Tsinghua University, Director at the Institute for AI International Governance

Mariano-Florentino (Tino) Cuéllar

Former California Supreme Court Justice and Member, National Academy of Sciences Committee on the Ethics and Governance of Computing Research and Its Applications

Fu Ying 傅莹

Zeng Yi 曾毅

Director of the International Research Center for AI Ethics and Governance and Deputy Director of the Research Center for Brain-inspired Intelligence, Institute of Automation, Chinese Academy of Sciences (CAS)
Member of the United Nations High-level Advisory Body on AI
Member of the UNESCO High-level Expert Group on Implementation of AI Ethics

He Tianxing 贺天行

Incoming Assistant Professor at Tsinghua University

Lu Chaochao 陆超超

Kwok Yan Lam

Associate Vice President (Strategy and Partnerships) at Nanyang Technological University (NTU), Singapore
Executive Director of the Digital Trust Centre (DTC), designated as Singapore's AI Safety Institute
Professor, School of Computer Science and Engineering, Nanyang Technological University (NTU), Singapore

Tang Jie 唐杰

Chief Scientist of Zhipu
Professor of Computer Science at Tsinghua University

Dawn Nakagawa

President of the Berggruen Institute

Benjamin Prud’homme

Vice-President of Policy, Safety and Global Affairs at Mila - Québec AI Institute

Robert Trager

Co-Director of the Oxford Martin AI Governance Initiative
International Governance Lead at the Centre for the Governance of AI

Yang Yaodong 杨耀东

Assistant Professor at the Institute for AI, Peking University
Director of the Center for Large Model Safety, Beijing Academy of AI
Head of the PKU Alignment and Interaction Research Lab (PAIR)

Yang Chao 杨超

Zhang HongJiang 张宏江

Founding Chairman of the Beijing Academy of Artificial Intelligence (BAAI)

Wang Zhongyuan

Director, Beijing Academy of Artificial Intelligence (BAAI)

Sam Bowman

Member of Technical Staff and Co-Director for Alignment Science, Anthropic
Associate Professor of Data Science, Linguistics, and Computer Science, New York University

Dan Baer

Sebastian Hallensleben

Chair of CEN-CENELEC JTC 21, where European AI standards to underpin EU regulation are being developed
Head of Digitalisation and Artificial Intelligence at the VDE Association for Electrical, Electronic and Information Technologies
Member of the Expert Advisory Board of the EU

Ong Chen Hui

Assistant Chief Executive of the Business and Technology Group at the Infocomm Media Development Authority (IMDA), Singapore

Fynn Heide

Executive Director, Safe AI Forum

Conor McGurk

Managing Director, Safe AI Forum

Saad Siddiqui

Safe AI Forum

Isabella Duan

Safe AI Forum

Adam Gleave

Founder and CEO, FAR AI

Xin Chen

PhD Student, ETH Zurich
