FOR EDUCATIONAL AND KNOWLEDGE SHARING PURPOSES ONLY. NOT-FOR-PROFIT. SEE COPYRIGHT DISCLAIMER.

About p(doom)… a very good read from a respected source!

REPORT. Thinking Inside the Box: Controlling and Using an Oracle AI.

The creation of super-human artificial intelligence may turn out to be potentially survivable.

Stuart Armstrong. Anders Sandberg. Nick Bostrom.

(2012) Minds and Machines.

Abstract

There is no strong reason to believe that human-level intelligence represents an upper limit of the capacity of artificial intelligence, should it be realized. This poses serious safety issues, since a superintelligent system would have great power to direct the future according to its possibly flawed motivation system. Solving this issue in general has proven to be considerably harder than expected. This paper looks at one particular approach, Oracle AI. An Oracle AI is an AI that does not act in the world except by answering questions. Even this narrow approach presents considerable challenges. In this paper, we analyse and critique various methods of controlling the AI. In general an Oracle AI might be safer than unrestricted AI, but still remains potentially dangerous.

Keywords: Artificial Intelligence, Superintelligence, Security, Risks, Motivational control, Capability control

6 Conclusions

Analysing the different putative solutions to the OAI-control problem has been a generally discouraging exercise. The physical methods of control, which should be implemented in all cases, are not enough to ensure safe OAI. The other methods of control have been variously insufficient, problematic, or even dangerous.

But these methods are still in their infancy. Control methods used in the real world have been the subject of extensive theoretical analysis or long practical refinement. The lack of intensive study in AI safety leaves methods in this field very underdeveloped. But this is an opportunity: much progress can be expected at relatively little effort. For instance, there is no reason that a few good ideas would not be enough to put the concepts of space and time restrictions on a sufficiently firm basis for rigorous coding.

But the conclusion is not simply that more study is needed. This paper has made some progress in analysing the contours of the problem, and identifying those areas most amenable to useful study, what is important and what is dispensable, and some of the dangers and pitfalls to avoid. The danger of naively relying on confining the OAI to a virtual sub-world should be clear, while sensible boxing methods should be

universally applicable. Motivational control appears potentially promising, but it requires more understanding of AI motivation systems before it can be used.

Even the negative results are of use, insofar as they inoculate us against false confidence: the problem of AI control is genuinely hard, and it is important to recognise this. A list of approaches to avoid is valuable as it can help narrow the search.

On the other hand, there are reasons to believe the oracle AI approach is safer than the general AI approach. The accuracy and containment problems are strictly simpler than the general AI safety problem, and many more tools are available to us: physical and epistemic capability control mainly rely on having the AI boxed, while many motivational control methods are enhanced by this fact. Hence there are grounds to direct high-intelligence AI research to explore the oracle AI model.

The creation of super-human artificial intelligence may turn out to be potentially survivable.

FOR EDUCATIONAL AND KNOWLEDGE SHARING PURPOSES ONLY. NOT-FOR-PROFIT. SEE COPYRIGHT DISCLAIMER.