An excellent summary of the RISKS of AGI from leading scientist Professor Yoshua Bengio the “Godfather of AI”

FOR EDUCATIONAL AND KNOWLEDGE SHARING PURPOSES ONLY. NOT-FOR-PROFIT. SEE COPYRIGHT DISCLAIMER.

Recognized worldwide as one of the leading experts in artificial intelligence, Yoshua Bengio is most known for his pioneering work in deep learning, earning him the 2018 A.M. Turing Award, “the Nobel Prize of Computing,” with Geoffrey Hinton and Yann LeCun.

He is Full Professor at Université de Montréal, and the Founder and Scientific Director of Mila – Quebec AI Institute. He co-directs the CIFAR Learning in Machines & Brains program as Senior Fellow and acts as Scientific Director of IVADO.

In 2019, he was awarded the prestigious Killam Prize and in 2022, became the computer scientist with the highest impact in the world as measured by h-index. He is a Fellow of both the Royal Society of London and Canada, Knight of the Legion of Honor of France, Officer of the Order of Canada, Member of the UN’s Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology since 2023 and a Canada CIFAR AI Chair.

Concerned about the social impact of AI, he actively contributed to the Montreal Declaration for the Responsible Development of Artificial Intelligence

Recognized worldwide as one of the leading experts in artificial intelligence, Yoshua Bengio is most known for his pioneering work in deep learning, earning him the 2018 A.M. Turing Award, “the Nobel Prize of Computing,” with Geoffrey Hinton and Yann LeCun.

He is Full Professor at Université de Montréal, and the Founder and Scientific Director of Mila – Quebec AI Institute. He co-directs the CIFAR Learning in Machines & Brains program as Senior Fellow and acts as Scientific Director of IVADO.

In 2019, he was awarded the prestigious Killam Prize and for several years has been the computer scientist with the greatest impact in terms of citations, as measured by the h-index.  He is a Fellow of both the Royal Society of London and Canada, Knight of the Legion of Honor of France, Officer of the Order of Canada, Member of the UN’s Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology since 2023and a Canada CIFAR AI Chair.

Concerned about the social impact of AI, he actively contributed to the Montreal Declaration for the Responsible Development of Artificial Intelligence and currently chairs the International Scientific Report on the Safety of Advanced AI.

Implications of Artificial General Intelligence on National and International Security

This paper was initially published by the Aspen Strategy Group (ASG), a policy program of the Aspen Institute. It was released as part of a collection of papers titled Intelligent Defense: Navigating National Security in the Age of AI. To explore the rest of the collection, please visit the publication here.

As highlighted in the International Scientific Report on the Safety of Advanced AI, the capabilities of general-purpose AI systems have been steadily increasing over the last decade, with a pronounced acceleration in the last few years. If these trends continue, and as per the declared goals of the leading AI companies, we are likely to achieve human-level capabilities across a broad spectrum of cognitive skills, what is commonly called Artificial General Intelligence (AGI). It is remarkable that we have already achieved human-level competence in natural language, i.e., systems that can read and understand texts, and fluently respond or generate new textual, visual, audio or video content. And while scientific advances are impossible to predict precisely, many leading researchers now estimate the timeline to AGI could be as short as a few years or a decade. On Metaculus (a rigorous and recognized prediction market) over 20% predict AGI before 2027. This is consistent with the steady advances of the last decade driven by algorithmic progress and scaling up the amount of computing resources used, and by the exponential increase in global AI R&D investments well into the trillions of dollars. While the lack of internal deliberation abilities, i.e., thinking, has long been considered one of the main weaknesses of current AI, a recent advance based on a new form of AI with internal deliberation suggests we might potentiallybe on the brink of bridging the gap to human-level reasoning. It could also be the case that research obstacles will delay this by many years or even decades, but the precautionary principle demands that we consider what is plausible and with a catastrophic potential.

Moreover, frontier AI companies are seeking to develop AI with a specific skill that could very well unlock all others and turbocharge advances: AIs with the ability to advance research in AI. An AI system that would be as capable at AI research as the topmost handful of researchers in an AI Lab would multiply the advanced research workforce by orders of magnitude. Although it takes tens of thousands of GPUs to train the AI, once trained it can be deployed at inference-time in parallel, yielding the equivalent of hundreds of thousands of automated AI workers. Such scaling up could greatly accelerate the path towards superhuman AI systems. The materialization of this scenario could lead to a fast transition from AGI to Artificial Super-Intelligence (ASI), ranging from a few months to a few years according to some experts. Imagining such possibilities can be challenging, and we have no guarantee that they will materialize, as the pace and direction of future AI development are largely dependent on the political decisions and scientific advances in months and years ahead. However, given the consequences of some of the scenarios among those outlined by experts as plausible, we now need to seriously consider how to mitigate them.

If ASI arises, what could be the consequences? It is plausible that the potential benefits would be tremendous and could enable both significant economic growth and great improvements in the well-being of societies, through advances in medicine, education, agriculture, fighting climate change, and more. However, such superior intelligence could also provide unequaled strategic advantages on a global scale and tip the balance in favor of a few (companies, countries or individuals), while causing great harm to many others. This is particularly true in the current geopolitical and corporate contexts whereby control of these technologies is extraordinarily concentrated. Societies would have to address a number of questions: Who will control this great power and to what ends? Could such concentrated power threaten democracy? Beyond the danger of malicious use, do we even have the knowledge and capacity to control machines that are smarter than humans? ASI would open a Pandora’s box, enabling both beneficial and destructive outcomes, possibly at the scale of the currently known existential risks. A significant fraction of AI researchers acknowledge the possibility of such risks: A recent survey of nearly 3000 authors of machine learning papers at recognized scientific venues shows that “between 37.8% and 51.4% of respondents gave at least a 10% chance to advanced AI leading to outcomes as bad as human extinction.”

There are also scientific reasons for these concerns (see also the above-cited report for more references). First, one has to remember the basics of AI: the ability to correctly answer questions and achieve goals. Hence, with an ASI, whoever dictates those questions and goals could exploit that intellectual power to effectively have stronger scientific, psychological and planning abilities than other human organizations, and could use that power to enhance their own strength, potentially at the expense of the greater collective. Malicious use of AI could gradually enable extreme concentrations of power, including dominance in economic, political or military terms, if no counter-acting power is in place to prevent any ASI and those who control it from acquiring a decisive strategic advantage. Second, the ASI could be the controlling entity, if it has as a goal its own preservation. In this case, it would most probably execute subgoals to increase its probability of survival. An ASI with a primary self-preservation goal could notably scatter offspring in insecure computing systems globally, and speak fluently and extremely persuasively in all major languages. If this were to happen before we figure out a way to ensure ASI is either aligned with human interests, or subject to effective human oversight and control, this could lead to catastrophic outcomes and major threats to our collective security. Keeping in mind that some humans would (rightly) want to turn off such a machine, precisely to avoid harm, it would be in the advantage of the AI to (1) try to make sure it is difficult for humans to turn it off, e.g., by copying itself in many places across the internet, (2) try to influence humans in its favor, e.g., via cyberattacks, persuasion, threats, and (3) once it has reduced its dependence on humans (e.g., via robotic manufacturing), aim to eliminate humans altogether, e.g., using a new species-killing virus.

There are many trajectories that could lead to the emergence of an AI with a self-preservation goal. A human operator could specify this goal explicitly (just like we type queries in ChatGPT), for example to advance an ideology (some groups have the stated objective of seeing ASI replace humanity as the dominant entities). But there are also mathematical reasons why self-preservation may emerge unintentionally. By definition, AGI would achieve at least human-level autonomy, if merely given access to a command line on its own servers (see recent advances allowing an AI to control a computer). In order for an autonomous agent to ensure the highest possible chance of achieving almost any long-term goal, including goals given by human operators, it will need to ensure its preservation. If we are not careful, that implicit self-preservation goal could lead to actions against the well-being of societies, and there are currently no highly reliable techniques to design AI that is guaranteed to be safe. It is also worth noting that given the immense commercial and military value of enabling strong agency in frontier AI systems, (i.e., allowing the AI to not only answer questions but also to plan and act autonomously) there are powerful incentivesto invest significant R&D efforts in this pursuit. The current state-of-the-art method seeking to evolve AIs into agents through reinforcement learning techniques involves creating systems that seek maximum positive rewards – thus opening up the possibility of the AI eventually being capable of overtaking the reward system itself.

Major AGI and ASI National Security Challenges

If and when AI systems are able to operate at or above human-level intelligence and autonomy, there would be an unprecedented level of risk for national and international security. Moving towards action to start mitigating these threats is urgent, both because of the unknown timeline for AGI and ASI, and the plausible speed-gap between implementing guardrails, countermeasures and international agreements versus deploying the next frontier AI systems, especially in the current regulatory environment, which imposes little to no restrictions.

It is useful to categorize the different kinds of threats because their mitigations may differ, and we need to find solutions to each that do not worsen the others. In general, there will be many unforeseen effects and the potential for catastrophic outcomes, all calling for caution.

  1. National security threats from adversaries using AGI/ASI: Even before the possible emergence of AGI, malicious actors could use future advanced AI to facilitate mass destruction, with threats ranging from CBRN (chemical, biological, radiological, nuclear) to cyber attacks. The recently revealed OpenAI o1 model (September 2024) is the first model to cross the company’s own boundary from “low risk” to “medium risk” for CBRN capabilities – the maximum level of risk before OpenAI’s policies would preclude releasing a model. Such threats will only increase as AI capabilities continue to rise. Advances in autonomy and agency should be monitored closely, and it should be expected that o1’s progress in reasoning abilities and simple solvable planning problems (35.5% to 97.8% on Blocksworld) could soon open the door to better long-term planning and thus improved AI agency (for which o1 has not yet been trained). This could yield great economic and geopolitical value but would also pose significant threats to people and infrastructures, unless political and technical guardrails and countermeasures are put in place to prevent AGI systems from falling into the wrong hands.
  1. Threats to democratic processes from current and future AI: Deepfakes are already used in political campaigns and to promote dangerous conspiracy theories. The negative impact of future advances in AI could be of a much larger magnitude, and AGI could significantly disrupt societal and power equilibria. In the short term, we are not far from the use of generative AI to design personalized persuasion campaigns. A recent study compared GPT-4 with humans in their ability to change the opinion of a subject (who doesn’t know whether they are interacting with a human or a machine) through a text-based dialogue. When GPT-4 has access to the subject’s Facebook page, that personalization enables substantially greater persuasion than that from humans. We are only one small step away from greatly increasing such a persuasion ability by improving the persuasion skills of generative models with additional specialized training (called fine-tuning). A state-of-the-art open-source model such as Llama-3.1 could likely be used for such a purpose by a nefarious actor. Such a threat will likely grow as persuasion abilities of advanced AI increase, and we may soon have to face superhuman persuasion abilities given large language models’ high competency in languages (already above the human average). If this is combined with advances in planning and agency (to achieve surgical and personalized goals in opinion-shaping), the effect could be highly destabilizing for democratic processes in favor of a rise in totalitarian regimes.
  1. Threats to the effective rule of law: Whoever controls the first ASI technology may gain enough power — through cyberattacks, political influence or enhanced military force — to inhibit other players with the same ASI goal. Such a centralization of power could happen either within or across national territories, and would be a major threat to states’ sovereignty. The temptation to use ASI to increase one’s power will be strong for some, and may be rationalized by fear of adversaries (including political or business opponents) doing it first. Modern democracy emerged from early information and communication technologies (from postal systems and newspapers to fax machines) where no single human could easily defeat a majority of other humans if they could communicate and coordinate in a group. But this fundamental equilibrium could be toppled with the emergence of the first ASI.
  1. Threats to humanity from loss of human control to rogue AIs: Rogue AIs could emerge anywhere (domestically or internationally), either because of carelessness (impelled by pressure from a military arms race, or a commercial equivalent) or intentionally (because of ideological motivations). If their intelligence and ability to act in the world is sufficient, they could gradually control more of their environment, first to immediately protect themselves and then to make sure humans could never turn them off. Note that this threat could be increased by the emergence of more totalitarian regimes which lack good self-correcting mechanisms and might unintentionally allow the emergence of a rogue ASI.

Government Interventions for Risk Mitigation

The above risks must be mitigated together: Addressing one but not the others could prove to be a monumental mistake. In particular, racing to AGI to land there before adversaries could greatly increase the democratic (b,c) and rogue AI (d) risks. Navigating the required balancing act sufficiently well will be challenging, and the ideas below should be taken as the beginning of a much-needed global effort, one that will require our best minds and substantial investments and innovations, in both science and governance.

  1. Major and urgent R&D investments are needed to develop AI with safety guarantees that would continue to be effective if current AIs are scaled to the AGI and ASI levels, with the objective to deliver solutions before AI labs reach AGI. Some of that safety-first frontier AI R&D can be stimulated with regulatory incentives – such as clarifying liability for developers for systems that could cause major harm – but may ultimately be better served through non-profit organizations with a public protection mission and robust governance mechanisms designed to avoid the conflicts of interest that can emerge with commercial objectives. This is particularly important to address challenges (a), (b) and especially (d), for which no satisfactory solution currently exists. Note that some parts of that R&D, if shared globally, would help mitigate risks (see a,b,c,d), e.g., identifying general means to evaluate risks and how to put technical guardrails, while other parts of that R&D, if shared globally, would increase all those risks, e.g., by increasing capabilities (since improving the prediction of future harm, which is useful to build technical guardrails, requires advances in capabilities).
  1. Governments must have substantial visibility and control over these technological advances to avoid the above scenarios, for example to reduce the chances that an ASI developed by an AI lab be deployed without the necessary protections or accessed by a malicious actor. Since frontier AI development is currently fully in private hands, a transition will be required to ensure AGI is developed with national security and other public goods as more central goals, beyond economic returns. Regulation can provide necessary controls, but stronger options may be needed. Outright nationalization, at least in the current environment, is unlikely but promoted by some. Another possibility is that of private-public partnerships, with frontier efforts mostly led under a secured governmental umbrella, with safe commercial spin-offs. This would help in addressing all four challenges (see a,b,c,d) above, especially by securing the most advanced models and imposing appropriate guardrails.
  1. This unprecedented context calls for an innovation in the checks-and-balances and multilateral governance mechanisms and treaties for AGI development. To make sure that no single individual, corporation or government could unethically use the power of ASI for their own benefit, up to and including a self-coup, will require institutional and legal changes that could take years to develop, negotiate and implement. Although governments can oversee and regulate their domestic AI labs, international agreements could reduce the possibility that one country creates a rogue AI (see (d) in the above section) or use ASI against populations and infrastructure of another country (a). Such a multilateral governance and international treaties could reassure authoritarian regimes with nuclear weapons, afraid to lose the ASI race, and willing to do anything to lose their power. In this context, what kind of organization with multilateral governance should be developing AGI and ASI? Organizations that are government-funded but at arm’s length could play a critical role in balancing interests, countering conflicts of interest and ensuring public good. For example, CERN is a non-governmental entity created by international convention, subject to regulations of its host countries, which are in turn subject to governance obligations to the IAEA, a separate non-governmental entity created by a separate international treaty. A strong design of governance mechanisms (both in terms of rules and of technological support) is crucial, both to avoid abuse of an ASI’s power (see c), as well as safety compromises arising from race dynamics (between companies or between countries), which could lead to a rogue AI (see d). Furthermore, creating a network of such non-profit multilaterally governed labslocated in different countries could further decentralize power and protect global stability.
  1. International treaty compliance verification technology will be required sooner than later, but take time to develop and deploy. Past treaties on nuclear weapons became possible thanks to treaty compliance verification procedures and techniques. An AGI treaty would only be effective to prevent a dangerous arms race (or be signed in the first place) if we first develop procedures and technology to verify AI treaty compliance. There currently exists no such reliable verification mechanism, which means that the compliance to any international agreement would be impossible to assess. Software may at first seem hard to govern in a verifiable way. However, hardware-enabled governance mechanisms, ideally sufficiently flexible to adapt to changes in governance rules and technology, and the existence of major bottlenecks in the AI hardware supply chain, could enable technological solutions to AI treaty compliance verification.

Given the magnitude of the risks and the potentially catastrophic unknown unknowns, reason should dictate caution and significant efforts to better understand and mitigate those risks, even if we think that the likelihood of catastrophes is small. It will be tempting to accelerate to win the AGI race, but this is a race where everyone could lose. Let us instead use our agency while we still can and deploy our best minds to move forward with sufficient understanding and management of the risks with robust multilateral governance to avoid its perils and thus reap the benefits of AGI for all of humanity.

“There are many trajectories that could lead to the emergence of an AI with a self-preservation goal.” — Yoshua Bengio

FOR EDUCATIONAL AND KNOWLEDGE SHARING PURPOSES ONLY. NOT-FOR-PROFIT. SEE COPYRIGHT DISCLAIMER.