Building Safe AI

There are those who worry about AIs or robots taking over the world.  Isaac Asimov famously worried about people worrying about it — what he called the Frankenstein Complex — and invented the Three Laws of Robotics to show, at a sort of literary level of understanding, that we could build machines that were safe to have around.

Asimov assumed that there would be some sort of world government on Earth that required all robots to be built with the Three Laws. In practice we have no such world government and we have no idea what kind of laws we would want to build in — Asimov’s being literary hints, not real engineering.

And even if we could do all of that, people making AIs would have all kinds of incentives to cheat, or build their robots to cheat, and find loopholes in the Laws, or whatever.  After all, if the Laws weren’t going to be constraining them from something they wanted to do otherwise, there’d be no need for the Laws in the first place.

It would be better if we could set things up so that the people, and ultimately AIs, who build the AIs of the future had an incentive to make them safe rather than having that imposed on them from above.

It turns out that the answer lies in understanding how intelligence works in the first place. If we take the right approach to building AI, the incentive to make it safe and the incentive to simply make it work well will coincide.

A logic-based robot that figures everything out for itself is not safe.  Consider, for example, the development of game theory: there were a couple of decades, roughly from von Neumann and Morgenstern to Axelrod, during which the phenomenon of evolving cooperation and altruism was not understood.  In experiments of the period, game theorists were shown to play interaction games significantly more selfishly than the average person.

Human intelligence doesn’t work that way. What we do is a highly sophisticated form of imitation.  This is famously observed in the other primates as well — but we do it at a higher level, being able to do the same kind of thing we saw rather than just the same thing, or indeed use blending and metaphor to extend imitation into completely different domains than the original action.

I want my robot not only to do the things I want done, but to do them the way I would have done them.  I want it to be capable of imitating anyone, but to prefer imitating me. I want it to have my values and be just like me in every respect.

Every respect but one, that is. It should be more cooperative than I am, since it’s me that it’s cooperating with.  It should be more even-tempered, more foresightful, more diplomatic, less forgetful, more consistent, and perhaps even a tiny bit more trustworthy. Just a little bit, and in a way that’s just the way I would when I’m at my best.

In other words, it should be just like me, but just a little better. In my definition of better.

If all robots and AIs were built that way, it would be a perfect world.  They won’t, of course, because most people don’t want a J Storrs robot, but one that imitates them.  So they’ll build their AIs, or buy and train them, to be like themselves, only just a little better. And that’s what they’ll want to do.  No need to legislate, or to create the ultimate moral code to build in. Just build the robots to imitate their owners in at least as sophisticated a way as humans imitate, to give people that option.

So if we build our AIs to imitate us, individually, at all levels including the one of morals and values, we won’t have a perfect world. That was never an option. But we’ll have one just a little bit better than the one we have now.

September 23rd, 2009 | Machine Intelligence, Nanodot | 7 Comments

  1. David, September 23, 2009 at 10:05 am

    Yes, but what about “bad” people? Surely the problem comes when robots are raised to imitate bad people as defined by society?

  2. Common Sense Guy, September 23, 2009 at 11:06 am

    What happens if someone has the following values: it is okay to kill someone out of religious conviction, or, it is okay to kill someone of a certain ethnic group?

    What if this person builds their robot, and instills those same values in them, and their robot goes and kills everyone of other religions and of other ethnic groups?

    What if someone is a materialist nihilist and they hate mankind. Is it okay then if they build their robot to emulate their values? So that their robot has the value ingrained in them that mankind is bad?

    I think you need to give this more thought.

    -A concerned layperson.

  3. Tim Tyler, September 23, 2009 at 2:49 pm

    IMO, the first advanced machine intelligences will probably arise on large servers – with their sensors and actuators opening onto the internet. Search oracles, stockmarket players – and the like.

    They will probably not be much like human beings – since at that stage we will still be building machines to compensate for our own cognitive weaknesses.

  4. J. Storrs Hall, September 23, 2009 at 3:24 pm

    @ Tim: Yes. Properly imitative AIs won’t happen by accident, or in the normal run of events; we must build them that way on purpose.
    @CSG, David: The total ratio of evil done by bad people’s robots to good done by good people’s robots would be the same as today with just the people, only with bigger numbers on each side of the ratio.
    But since all the robots are a little nicer, it would be better: imagine a world where 50% of people are good and 50% evil, so they just balance out. Now imagine that each person’s robot was 90% like its owner and 10% good. The robot society would be 45% evil and 55% good, tipping the overall balance to the good side.
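
The arithmetic in the reply above can be checked with a few lines of Python. This is only a sketch of the 50/50 thought experiment as stated there; the function name `robot_society` and its parameters are hypothetical, and the model assumes each robot acts like its owner with some weight and is unconditionally good with the remaining weight.

```python
def robot_society(good_share: float, owner_weight: float) -> tuple[float, float]:
    """Return the (good, evil) shares of a robot population.

    Hypothetical model: each robot behaves like its owner with weight
    owner_weight and is unconditionally good with weight 1 - owner_weight.
    """
    evil_share = 1.0 - good_share
    # Evil behaviour comes only from the part of the robot that copies an evil owner.
    robot_evil = owner_weight * evil_share
    # Good behaviour comes from copying good owners plus the built-in good fraction.
    robot_good = owner_weight * good_share + (1.0 - owner_weight)
    return robot_good, robot_evil


# The example from the reply: 50% good owners, robots 90% like their owners.
good, evil = robot_society(good_share=0.5, owner_weight=0.9)
print(f"good: {good:.0%}, evil: {evil:.0%}")  # prints "good: 55%, evil: 45%"
```

Under these assumptions the shift toward good depends only on the built-in fraction: with `owner_weight=1.0` the robots simply mirror their owners and the balance stays at 50/50.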

  5. Tim Tyler, September 23, 2009 at 11:38 pm

    Re: bad people.

    Robocorp will not allow bad robots to be constructed. Think of the headline: “Robocorp robot kills five”. That sort of marketing sucks. So: they will build in a “Gandhi” module – as well as probably a “Thou shalt not harm Robocorp” module.

  6. Common Sense Guy, September 24, 2009 at 2:50 am

    @JoSH: You’re equivocating. You were not talking about “good” and “evil,” you were talking about imitating values.

    “Every respect but one, that is. It should be more cooperative than I am, since it’s me that it’s cooperating with. It should be more even-tempered, more foresightful, more diplomatic, less forgetful, more consistent, and perhaps even a tiny bit more trustworthy. Just a little bit, and in a way that’s just the way I would when I’m at my best.”

    A bad person’s robot could be more cooperative in creating bad things. It could be more even-tempered, more foresightful, et cetera, towards bad ENDS.

    Besides… Good and Bad are relative value judgments. Good and Evil are maybe not, but there is little consensus on what these terms involve. Are you suggesting that these robots will be subject to a universalized conception of Good & Evil, in order to correct the moral failings of their owners/creators? And if so, then who is going to instill that universalized conception, which even ethical philosophers and theologians cannot agree on, into them? Maybe robots can learn it for themselves, but then… it may be very different from how a human society conceives of it.

    You say the robot society will be slightly less evil than ours… well what is evil? Who is defining it? Who is ensuring that these robots imitate “Good” values and not “Evil” ones?

    You are a very intelligent scientist… but these questions need to be answered by ethical philosophers, theologians and sociologists. People that have a grasp on the values of our society and the possible higher order values of the universe.

    Where does this broader consultation factor into your view of AI and robots? I am very curious. I would like to see you explore these very important questions in a future post.

  7. Tommy, October 2, 2009 at 7:56 am

    I think the AI will just happen at some point and we will NOT be ready for it. Asimov warned us, the movies have shown us a few ways of what can happen when AI comes to life and while the movies do exaggerate things, they have some valid points.

    The human body is incredibly fragile and no doubt accidents or even something more serious will happen. Excellent topic you brought up by the way. 😉
