Unaligned AI
Toby Ord estimates a one in ten chance that unaligned AI will cause human extinction or a permanent and drastic curtailment of humanity’s potential within the next hundred years. How could that happen?
What is AI?
Artificial Intelligence (AI) refers to a class of hardware and software systems that can be said to be ‘intelligent’ in a broad sense of the word. Sometimes this refers to a system’s ability to take actions to achieve predetermined goals, but it can also refer to particular abilities linked to intelligence, such as understanding human speech. AI is already all around us, from the algorithms that power automatic translation services to the way digital media providers learn your preferences in order to show you the content most relevant to your interests. Despite incredible advances in AI technology in the past decade 1, current AI capabilities are likely to be just the tip of the iceberg compared to what could be possible in the future.
AGI and Superintelligence
One of the main contributors to current estimates of the existential risk from AI is the prospect of incredibly powerful AI systems that may be developed in the future. These differ from current AI systems, sometimes called ‘Narrow AI’. Researchers have used many different terms when discussing such systems, each with its own subtle definition. These include: Artificial General Intelligence (AGI), an AI system that is proficient in all aspects of human intelligence; Human-Level Artificial Intelligence (HLAI), an AI that can at least match human capabilities in all aspects of intelligence; and Artificial Superintelligence (ASI), an AI system that is greatly more capable than humans in all areas of intelligence 2.
The Alignment Problem
One fundamental problem with both current and future AI systems is the alignment problem. Outlined in detail in Stuart Russell’s recent book ‘Human Compatible’, the alignment problem is simply the issue of how to ensure that the goals of an AI system are aligned with those of humanity. While on the face of it this may seem like a simple task (‘We’re building the AI, so we can just build it in such a way that it’s aligned by design!’), many technical and philosophical issues arise when trying to build aligned AI. For a start, what exactly do we mean by human values? Humans do not share many of their values with each other, so will it be possible to isolate values that are fundamental to, and shared by, all humans? And even if we manage to identify such values, will it be possible to represent them in a way that an AI can understand and take into account? 3
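To make the problem concrete, here is a minimal Python sketch of how optimising a misspecified proxy objective can come apart from what we actually value. It is written for this page rather than taken from Russell’s book; the cleaning-robot scenario and the functions `proxy_reward` and `intended_value` are illustrative assumptions.

```python
# A toy 'cleaning robot' whose reward is a proxy for what we actually want.
# All names and numbers here are illustrative assumptions, not an established model.

def proxy_reward(action):
    """What we told the agent to maximise: the amount of mess it removes."""
    return action["mess_removed"]

def intended_value(mess_after):
    """What we actually care about: how clean the room ends up."""
    return -mess_after

initial_mess = 10
actions = [
    {"name": "tidy existing mess", "mess_created": 0, "mess_removed": 3},
    {"name": "knock things over, then tidy them", "mess_created": 5, "mess_removed": 5},
]

for action in actions:
    mess_after = initial_mess + action["mess_created"] - action["mess_removed"]
    print(f'{action["name"]}: proxy reward = {proxy_reward(action)}, '
          f'intended value = {intended_value(mess_after)}')

# The second action earns more proxy reward (5 vs 3) even though it leaves the
# room dirtier (mess 10 vs 7): the objective we wrote down is not aligned with
# the outcome we wanted.
```

Writing an objective precise enough that ‘scores highly’ and ‘does what we meant’ always coincide is, in miniature, the alignment problem.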
Intelligence Explosion
One factor that may further inhibit our ability to influence future AI systems is if an intelligence explosion takes place. This possibility was first suggested by the mathematician I.J. Good in the mid-1960s. He wrote:
‘Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion’, and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make.’
The basic argument is that, once an AI reaches a certain level (or quality) of intelligence, it will be better at designing AI than humans are. It will then be able either to improve its own capabilities or to build other AI systems that are more intelligent than itself. The resulting AI will then be even better at designing AI, and so will be able to build an AI system that is more intelligent still. This argument continues recursively, with AI continually self-improving and eventually becoming far more intelligent than humans, without further input from a human designer.
There are still many unknowns regarding the nature, and even the possibility, of such a scenario 4. One area of debate concerns the rate of such self-improvement. Currently it is very difficult to predict how quickly a sufficiently intelligent AI would be able to improve upon its own capabilities. If a fast takeoff scenario occurs, whereby the AI is rapidly able to self-improve and becomes many times more intelligent than humans in a matter of weeks or months, there will clearly be less time to ensure that such an AI is aligned with our values. In that case it would be important to have ensured alignment long before the AI reached the point of being able to self-improve.
On the other hand, it may turn out that the difficulty of designing ever more intelligent AIs increases faster than the level of intelligence attained at each iteration. In this case it is hard to see how an intelligence explosion could occur, and the intelligence of the resulting AI system(s) would either converge to a constant value or continue increasing, but only at a steady rate.
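The contrast between these two scenarios can be illustrated with a toy numerical model. This is a minimal sketch written for this page, not a model from the literature; the function `run` and the specific improvement rules are assumptions chosen only to show the two qualitative outcomes.

```python
# Toy model of recursive self-improvement: at each generation an AI of
# capability c builds a successor of capability c + improvement(c).
# The improvement rules below are illustrative assumptions only.

def run(improvement, steps=20, c=1.0):
    """Iterate c -> c + improvement(c) and return the final capability."""
    for _ in range(steps):
        c = c + improvement(c)
    return c

# Explosive case: each generation's gain is proportional to its own
# capability, so capability grows exponentially (a 'fast takeoff').
explosive = run(lambda c: 0.5 * c)

# Diminishing-returns case: the gain shrinks as the remaining 'headroom'
# below some ceiling is used up, so capability converges to a constant.
bounded = run(lambda c: 0.3 * max(0.0, 10.0 - c))

print(f"explosive growth after 20 steps: {explosive:.1f}")   # ~3325
print(f"diminishing returns after 20 steps: {bounded:.1f}")  # ~10
```

Which of these regimes better describes real AI development is precisely the open question discussed above.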
Risks from Narrow AI
Even before AGI or ASI are within reach 5, there are still ways in which AI could pose an existential risk, mainly by acting as a risk factor 6.
One example of how this could happen is if nations become involved in an AI arms race, with political tensions rising as a result of international competition to develop advanced AI systems 7. This could lead to an ‘AI cold war’, which would increase the chances of global conflict. Alternatively, AI systems could become powerful enough 8 to undermine current cybersecurity and encryption methods, with drastic effects on the global technological infrastructure that underlies much of modern society.
Thus, even for those who are sceptical about whether AGI is possible, there are still causes for concern relating to current, or narrow, AI systems.
1. Particularly in an area known as ‘Deep Reinforcement Learning’.
2. Others include Transformative Artificial Intelligence (TAI) and Prepotent Artificial Intelligence.
3. For a further discussion of this, see this news post.
4. Despite Good’s apparent adamance.
5. Assuming they are physically possible.
6. See the section on Risk Factors on this page.
7. Note that this is not explicitly linked to the usual notion of an ‘arms race’ in terms of weapons.
8. Without reaching AGI status.