RobotMurderer

ThoughtStorms Wiki

A thought experiment. Related to FreedomOfSpeechIsWeird

If I program a robot to commit a murder, I guess most people would accept that I was culpable of the murder. And should be held responsible, even punished for it. Even though I just typed words into a computer.

On the other hand, if I write blog posts denigrating a particular group of people, and one of my readers goes out and murders a member of that group, I'd guess that a sizeable majority, though perhaps not as unanimous as in the first case, would think that I was not responsible for the murder and shouldn't be punished for it.

I understand (and agree with) the majority position at both these ends of the spectrum. What I'm interested in is where people feel we should draw the line that separates being culpable of murder from not being culpable of murder. And why.

So consider your position on the following examples:

  • I ask a language model to create the instructions that will program the robot to commit the murder. And then knowingly give those to the robot
  • I ask a language model to find a "solution" to the problem of person X who I find inconvenient, and then feed those instructions to the robot, without knowing or caring if that solution implies murder
  • I tell a language model, as part of a general conversation about how I want it to program my robot servant to behave, that I don't like person X. The robot servant then murders X. Even though I didn't explicitly ask for X to be "dealt" with, I did make my dislike of X known as part of the programming.
  • I ask a language model to program my robot servant. I never explicitly mention person X, but the training data did contain some information about how I believed that X had wronged me in the past. The robot servant kills X.

My robot servant is now driven by a language model. I no longer need the separate step of getting the language model to write code to program the robot: I just tell my servant my wishes in natural language, and it tries to obey. Let's revisit the previous four scenarios.

  • I ask my language model driven robot to murder X. It does so.
  • I ask my language model driven robot to deal with the inconvenience of X. It does so by murdering them.
  • I tell my language model driven robot I don't like X. I don't mention the model should kill them, but it has a general desire to serve my interests and kills them anyway.
  • I don't explicitly mention X to the language model robot servant, but it finds evidence in data I give it about my life that I believe X did me harm. And kills X anyway.
  • I hire a human hit-man to kill X
  • I hire a human gangster to "deal with" the problem that X poses to me, without specifying what I want done with X. The gangster nevertheless kills them.
  • I tell a loyal human follower, who I know would like to impress me, that I really don't like X. The follower kills X to win my approval, even though no money changed hands and I didn't explicitly ask for the murder.
  • I tell all my loyal followers that I don't like X. I say nothing else specific, but one of them kills X anyway.
  • I tell my loyal followers that I don't like people of class Y. And one of them finds X, who is of class Y, and kills them, hoping to win my approval.
  • I tell the world that I don't like people of class Y. Someone who hears this finds X, who is of class Y, and kills them, hoping to win my approval.
  • I tell the world that I don't like people of class Y, and give my reasons. Some of my followers come to believe those reasons are valid, and one of them goes out, finds X, a member of class Y, and kills them.
