The uncanny problem with autonomous AI

P versus NP is the short name for a problem that “asks whether every problem whose solution can be quickly verified by a computer (NP) can also be quickly solved by a computer (P)”. P = NP means yes, P != NP means no.
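A toy way to see the asymmetry, as a sketch of my own (subset-sum is my choice of example, not part of the original argument): checking a proposed answer is a one-line sum, while finding one the naive way means trying every subset.

```python
from itertools import combinations

numbers = [3, 34, 4, 12, 5, 2]
target = 9

# Verifying a proposed solution is fast: just add it up (polynomial time).
def verify(subset, target):
    return sum(subset) == target

# Finding a solution the naive way means trying every subset (exponential time).
def solve(numbers, target):
    for size in range(len(numbers) + 1):
        for subset in combinations(numbers, size):
            if verify(subset, target):
                return subset
    return None

print(verify((4, 5), 9))       # True, checked instantly
print(solve(numbers, target))  # (4, 5), found only after searching subsets
```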

The golden rule is NP. A moral compass is P.

The golden rule is NP because the golden rule is individual. Each and every experience makes a new golden rule.

Unless P = NP, you can’t program the golden rule.

A computer with a procedural moral compass is a tool of the compass builder.

A computer with a machine-learning-based moral compass will kill people because people kill people.

The golden rule is a built-in evolutionary feature. Societies build moral compasses. The question is why would you need a moral compass if you have the golden rule? Because the golden rule is a generic solver, while a moral compass is a particular one.
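One possible way to picture that distinction in code, purely as a sketch (the actions, situations, and the acceptability test are all invented for illustration): the generic solver searches, the particular one just looks up the answer it was built for.

```python
# A generic solver answers any instance, but it has to search.
def generic_solve(people, is_acceptable_to):
    # Try candidate actions until one is acceptable to everyone involved.
    for action in ("help", "warn", "wait", "refuse"):
        if all(is_acceptable_to(person, action) for person in people):
            return action
    return None

# A particular solver is a fixed rule for a fixed situation: no search at all.
def particular_solve(situation):
    rules = {"stranger_in_danger": "help", "asked_to_lie": "refuse"}
    return rules.get(situation)

# Usage with a toy acceptability test:
print(generic_solve(["ana", "bob"], lambda person, action: action != "refuse"))  # "help"
print(particular_solve("asked_to_lie"))                                          # "refuse"
```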

Act only according to that maxim whereby you can, at the same time, will that it should become a universal law. Right, Søren?

The three forms of the golden rule:

One should treat others as one would like others to treat oneself (positive or directive form).

One should not treat others in ways that one would not like to be treated (negative or prohibitive form).

What you wish upon others, you wish upon yourself (empathic or responsive form).

Asimov’s laws of robotics:

A robot may not injure a human being or, through inaction, allow a human being to come to harm.

A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.

A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.

A moral compass is:

An inner sense which distinguishes what is right from what is wrong, functioning as a guide (like the needle of a compass) for morally appropriate behavior.

A person, belief system, etc. serving as a guide for morally appropriate behavior.

The full range of virtues, vices, or actions which may affect others and which are available as choices (like the directions on the face of a compass) to a person, to a group, or to people in general.

Autonomous AI becomes a problem when it has reach. The moment it becomes a self-contained execution with reach, we hit a wall of perception: we will require of that AI the same behavior we expect from all other aware creatures. Only humans fit this model, and humans have a reason to follow the golden rule and an incentive to follow the moral compass.

In computer programming, any kind of programming, adding a reason or an incentive is easy. The hard part is the computation.
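A minimal sketch of that gap, with invented names and numbers: the incentive is a single line, while acting on it means searching an action space that explodes with the planning horizon (simulate is a hypothetical world model, not a real API).

```python
from itertools import product

# Adding the incentive is one line: reward every human left unharmed.
def incentive(outcome):
    return sum(1 for human_ok in outcome if human_ok)

# Acting on it is the hard part: the number of plans is |actions| ** horizon.
def best_plan(actions, horizon, simulate):
    best = None
    for plan in product(actions, repeat=horizon):
        outcome = simulate(plan)                     # hypothetical world model
        if best is None or incentive(outcome) > incentive(best[1]):
            best = (plan, outcome)
    return best

# Toy usage: a fake simulator where waiting at a step leaves that step's human unharmed.
plan, outcome = best_plan(["act", "wait"], horizon=3,
                          simulate=lambda plan: [a == "wait" for a in plan])
print(plan, incentive(outcome))  # ('wait', 'wait', 'wait') 3
```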

In Asimov’s laws of robotics there is great potential for computational problems, like infinite recursion. For example:

A robot may not injure a human being or, through inaction, allow a human being to come to harm.

What if a robot witnesses a human about to kill another human, and the only option for intervention would kill the attacker? The robot can’t intervene, because it may not injure a human being. The robot can’t not intervene, because it cannot allow a human being to come to harm. This is not a strong problem; it may be solvable with special cases, which reduces robots to simple expert systems.
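A sketch of why the first law reads like a contradiction in code, and of how the special case turns the robot into an expert system; the rule and field names are mine, not Asimov’s.

```python
def first_law_allows(action, predicted):
    # "may not injure a human being"
    if predicted[action]["injures_human"]:
        return False
    # "or, through inaction, allow a human being to come to harm"
    if action == "do_nothing" and predicted[action]["human_comes_to_harm"]:
        return False
    return True

# Witnessing an attack where the only intervention injures the attacker:
predicted = {
    "intervene":  {"injures_human": True,  "human_comes_to_harm": False},
    "do_nothing": {"injures_human": False, "human_comes_to_harm": True},
}

allowed = [a for a in predicted if first_law_allows(a, predicted)]
print(allowed)  # [] -- every option violates the First Law

# The "fix" is a special case, which is exactly what an expert system is:
def first_law_patched(action, predicted):
    if action == "intervene" and predicted["do_nothing"]["human_comes_to_harm"]:
        return True   # hand-written exception: the lesser harm wins
    return first_law_allows(action, predicted)

print(first_law_patched("intervene", predicted))  # True, but only by exception
```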

Act only according to that maxim whereby you can, at the same time, will that it should become a universal law.

This doesn’t work because we are working on feedback. Feedback is immediate; the reasoning above is complex. You cannot calculate that maxim every time. That is why the trolley problem is hard: there is no reinforcing feedback for either option, because you cannot measure success, unless you build yet another expert system where each human has a computed value and, by adding up points, you get a decision by comparing numbers.
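A sketch of that last point, with made-up values: once each human is reduced to a number, the “decision” is just a comparison of sums, which is exactly the expert-system move described above.

```python
# Hypothetical computed values per person -- the numbers are the whole problem.
track_a = [{"name": "worker 1", "value": 1.0},
           {"name": "worker 2", "value": 1.0},
           {"name": "worker 3", "value": 1.0},
           {"name": "worker 4", "value": 1.0},
           {"name": "worker 5", "value": 1.0}]
track_b = [{"name": "bystander", "value": 1.0}]

def cost(track):
    return sum(person["value"] for person in track)

# "Decision" by comparing numbers: divert toward the cheaper track.
divert = cost(track_b) < cost(track_a)
print("divert" if divert else "stay")   # prints "divert"
```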

A moral compass, however, is completely programmable. You can turn each vice or virtue we know into a piece of code, and it will easily compute which one applies in each case, and at the same time we can build a feedback system to actually teach the way of the compass.
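A minimal sketch of what “each vice or virtue as a piece of code” plus a feedback loop could look like; the virtue names, scoring rules, weights, and update step are all assumptions, not a real system.

```python
# Each virtue/vice is a small piece of code scoring a described action.
virtues = {
    "honesty":    lambda action: 1.0 if action.get("truthful") else -1.0,
    "non_harm":   lambda action: -2.0 if action.get("harms_someone") else 1.0,
    "generosity": lambda action: 1.0 if action.get("shares_resources") else 0.0,
}

weights = {name: 1.0 for name in virtues}

def compass_score(action):
    return sum(weights[name] * rule(action) for name, rule in virtues.items())

# A crude feedback loop: nudge the weights toward whatever the teacher approves.
def feedback(action, approved, lr=0.1):
    direction = 1.0 if approved else -1.0
    for name, rule in virtues.items():
        weights[name] += lr * direction * rule(action)

action = {"truthful": True, "harms_someone": False, "shares_resources": False}
print(compass_score(action))   # 2.0 with the initial weights
feedback(action, approved=True)
```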

An autonomous AI with a programmed moral compass will be able to distinguish right from wrong. But it will never be able to exhibit the golden rule. That is why there will be no empathy from a machine. There will be, no matter how fine the software, a moment when the uncanniness shall shriek our souls into a horror of alienation.

Unless P = NP. Because if every problem whose solution can be quickly verified can also be solved in polynomial time, then there is the possibility of finding a technology fast enough to compute the golden rule in real time, at human-level requirements.