The creation of Artificial Intelligence (AI) holds great promise, but with it also comes existential risk. How can we know AI will be safe? How can we know it will not destroy us? How can we know that its values will be aligned with ours? Because of this risk, an entire field has sprung up around AI Safety and Security. But this is an unsolvable problem: AI can never be fully controlled, writes Roman V. Yampolskiy.
The invention of Artificial Intelligence will shift the trajectory of human civilization. But to reap the benefits of such powerful technology – and to avoid the dangers – we must be able to control it. Currently we have no idea whether such control is even possible. My view is that Artificial Intelligence (AI) – and its more advanced version, Artificial Super Intelligence (ASI) – could never be fully controlled.
Solving an unsolvable problem
The unprecedented progress in Artificial Intelligence (AI) over the last decade has not been smooth. Multiple AI failures [1, 2] and cases of dual use (when AI is used for purposes beyond its maker’s intentions) [3] have shown that it is not sufficient to create highly capable machines; those machines must also be beneficial [4] for humanity. This concern gave rise to a new sub-field of research, ‘AI Safety and Security’ [5], with hundreds of papers published annually. But all of this research assumes that controlling highly capable intelligent machines is possible, an assumption which has not been established by any rigorous means.
It is standard practice in computer science to show that a problem does not belong to a class of unsolvable problems [6, 7] before investing resources into trying to solve it. No mathematical proof – or even a rigorous argument! – has been published to demonstrate that the AI control problem might be solvable in principle, let alone in practice.
The Hard Problem of AI Safety
The AI Control Problem is the definitive challenge and the hard problem of AI Safety and Security. Methods to control superintelligence fall into two camps: Capability Control and Motivational Control [8]. Capability control limits potential harm from an ASI system by restricting its environment [9-12] or by adding shut-off mechanisms [13, 14] or trip wires [12]. Motivational control designs ASI systems to have no desire to cause harm in the first place. Capability control methods are considered temporary measures at best, and certainly not long-term solutions for ASI control [8].
Motivational control is a more promising route, and it would need to be designed into ASI systems. But there are different types of control, which we can see easily in the example of a “smart” self-driving car. If a human issues a direct command – “Please stop the car!” – the controlled AI could respond in four ways:
We can retain human control or cede power to controlling AI but neither option provides both control and safety.
Looking at these options, we realize two things. First, humans are fallible and therefore fundamentally unsafe (we crash our cars all the time), so keeping humans in control will produce unsafe AI actions (such as stopping the car in the middle of a busy road). Second, transferring decision-making power to AI leaves us subjugated to AI’s whims.