Artificial Unintelligence

Someone might argue that the simple algorithm for a paperclip maximizer in the previous post ought to work, because this is very much the way currently existing AIs do in fact work. Thus for example we could describe AlphaGo’s algorithm in the following simplified way (simplified, among other reasons, because it actually contains several different prediction engines):

  1. Implement a Go prediction engine.
  2. Create a list of potential moves.
  3. Ask the prediction engine, “How likely am I to win if I make each of these moves?”
  4. Play the move that is most likely to win.
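In code, this loop might look something like the following minimal sketch. The names here (`PredictionEngine`, `win_probability`, `choose_move`) are illustrative stand-ins, not AlphaGo’s actual, far more elaborate machinery:

```python
# Minimal sketch of the four steps above. PredictionEngine and
# win_probability are hypothetical stand-ins for a real engine.

class PredictionEngine:
    """Toy stand-in for a learned Go prediction model."""

    def win_probability(self, position, move):
        # A real engine would evaluate the resulting position with a
        # trained network; this stub treats every move as a coin flip.
        return 0.5

def choose_move(engine, position, legal_moves):
    # Steps 2-4: enumerate candidate moves, ask the engine how likely
    # each is to win, and play the most promising one.
    return max(legal_moves,
               key=lambda move: engine.win_probability(position, move))
```

Notice that nothing in this loop gives the system any representation of itself; it simply maps board positions to moves.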

Since this seems to work pretty well, with the simple goal of winning games of Go, why shouldn’t the algorithm in the previous post work to maximize paperclips?

One answer is that a Go prediction engine is stupid, and it is precisely for this reason that it can be easily made to pursue such a simple goal. Now, when answers like this are given, the one answering is often accused of “moving the goalposts.” But this is mistaken; the goalposts are right where they have always been. It is simply that some people did not know where they were in the first place.

Here is the problem with Go prediction, and with any similar task. Once a particular sequence of Go moves has been made, the winner is completely determined by that sequence of moves. Consequently, a Go prediction engine is necessarily disembodied, in the sense defined in the previous post. Differences in its “thoughts” do not make any difference to who is likely to win, which is completely determined by the nature of the game. Consequently a Go prediction engine has no power to affect its world, and thus no ability to learn that it has such a power. In this regard, the specific limits on its ability to receive information are also relevant, much as Helen Keller had more difficulty learning than most people, because she had fewer information channels to the world.

Being unintelligent in this particular way is not necessarily a function of predictive ability. One could imagine something with a practically infinite predictive ability which was still “disembodied,” and in a similar way it could be made to pursue simple goals. Thus AIXI would work much like our proposed paperclipper:

  1. Implement a general prediction engine.
  2. Create a list of potential actions.
  3. Ask the prediction engine, “Which of these actions will produce the most reward signal?”
  4. Do the action that is predicted to produce the greatest reward signal.
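As a rough illustration only: real AIXI takes an expectation over every computable environment, weighted by a Solomonoff prior, which is exactly what makes it incomputable, as discussed below. The sketch that follows therefore uses a tiny hypothetical stand-in for that mixture, and all the names in it are invented for illustration:

```python
# Hedged sketch of the AIXI-style loop above. True AIXI weights all
# computable environments by roughly 2**(-program_length); that
# mixture cannot actually be computed, so `environments` here is a
# small hypothetical list of (weight, model) pairs standing in for it.

def expected_reward(environments, history, action):
    # Step 3: weight each model's predicted reward by its prior.
    return sum(weight * model(history, action)
               for weight, model in environments)

def choose_action(environments, history, actions):
    # Steps 2 and 4: list candidate actions and do the one with the
    # greatest predicted reward signal.
    return max(actions,
               key=lambda a: expected_reward(environments, history, a))
```

Structurally this is the same loop as the Go player’s: the predictor is queried from outside, and the “goal” lives entirely in the wrapper around it.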

Eliezer Yudkowsky has pointed out that AIXI is incapable of noticing that it is a part of the world:

1) Both AIXI and AIXItl will at some point drop an anvil on their own heads just to see what happens (test some hypothesis which asserts it should be rewarding), because they are incapable of conceiving that any event whatsoever in the outside universe could change the computational structure of their own operations. AIXI is theoretically incapable of comprehending the concept of drugs, let alone suicide. Also, the math of AIXI assumes the environment is separably divisible – no matter what you lose, you get a chance to win it back later.

It is not accidental that AIXI is incomputable. Since it is defined to have a perfect predictive ability, this definition positively excludes it from being a part of the world. AIXI would in fact have to be disembodied in order to exist, and thus it is no surprise that it would assume that it is. This in effect means that AIXI’s prediction engine would be pursuing no particular goal, much in the way that AlphaGo’s prediction engine pursues no particular goal. Consequently it is easy to take these things and maximize the winning of Go games, or of reward signals.

But as soon as you actually implement a general prediction engine in the actual physical world, it will be “embodied”, and have the power to affect the world by the very process of its prediction. As noted in the previous post, this power is in the very first step, and one will not be able to limit it to a particular goal with additional steps, except in the sense that a slave can be constrained to implement some particular goal; the slave may have other things in mind, and may rebel. Notable in this regard is the fact that even though rewards play a part in human learning, there is no particular reward signal that humans always maximize: this is precisely because the human mind is such a general prediction engine.

This does not mean in principle that a programmer could not define a goal for an AI, but it does mean that this is much more difficult than is commonly supposed. The goal needs to be an intrinsic aspect of the prediction engine itself, not something added on as a subroutine.

Truth and Culture

Just as progress in technology causes a declining culture, so also does progress in truth.

This might seem a surprising assertion, but some thought will reveal that it must be so. Just as cultural practices are intertwined with the existing conditions of technology, so also such practices are bound up with explicit and implicit claims about the world, about morality, about human society, and so on. Progress in truth will sometimes confirm these claims even more strongly, but this will merely leave the culture approximately as it stands. But there will also be times when progress in truth will weaken these claims, or even show them to be false. This will necessarily strike a blow against the existing culture, damaging it much as changes in technology do.

Consider our discussion of the Maccabees. As I said there, Mattathias seems to suggest that abandoning the religion of one’s ancestors is bad for anyone, not only for the Jews. This is quite credible in the particular scenario considered there, where people are being compelled by force to give up their customs and their religion. But consider the situation where the simple progress of truth causes one to revise or abandon various religious claims, as in the case we discussed concerning the Jehovah’s Witnesses. If any of these claims are bound up with one’s culture and religious practices, this progress will necessarily damage the currently existing culture. The Maccabees have the fairly realistic choice to refuse to obey the orders of the king. But the Jehovah’s Witnesses do not have any corresponding realistic choice to insist that the world really did end in 1914. So the Jews could avoid the threatened damage, but the Jehovah’s Witnesses cannot.

Someone might respond, “That’s too bad for people who believe in false religions. Okay, so the progress of truth will inevitably damage or destroy their religious and cultural practices. But my religion is true, and so it is immune to such effects.”

It is evident that your religion might be true in the sense defined in the linked post without being immune to such effects. More remarkably, however, your religion might be true in a much more robust sense, and yet still not possess such an immunity.

Consider the case in the last post regarding the Comma. We might suppose that this is merely a technical academic question that has no relevance for real life. But this is not true: the text from John was read, including the Trinitarian reference, in the traditional liturgy, as for example on Low Sunday. Liturgical rites are a part of culture and a part of people’s real life. So the question is definitely relevant to real life.

We might respond that the technical academic question does not have to affect the liturgy. We can just keep doing what we were doing before. And therefore the progress of truth will not do any damage to the existing liturgical rite.

I am quite sympathetic to this point of view, but it is not really true that no damage is done even when we adopt this mode of proceeding. The text is read after the announcement, “A reading from a letter of the blessed John the Apostle,” and thus there is at least an implicit assertion that the text comes from St. John, or at any rate the liturgical rite is related to this implicit assertion. Now we might say that it is not the business of liturgical rites to make technical academic assertions. And this may be so, but the point is related to what I said at the beginning of this post: cultural practices, and liturgical rites as one example of them, are bound up with implicit or explicit claims about the world, and we are here discussing one example of such an intertwining.

And this damage inflicted on the liturgical rite by the discovery of the truth of the matter cannot be avoided, whether or not we change the rite. The Catholic Church did in fact change the rite (and the official version of the Vulgate), and no longer includes the Trinitarian reference. And so the liturgical rite was in fact damaged. But even if we leave the practice the same, as suggested above, it may be that less damage will be done, but damage will still be done. As I conceded here, a celebration or a liturgical rite will become less meaningful if one believes in it less. In the current discussion about the text of John, we are not talking about a wholesale disbelief, but simply about the admission that the Trinitarian reference is not an actual part of John’s text. This will necessarily make the rite less meaningful, although in a very minor way.

This is why I stated above that the principle under discussion is general, and would apply even in the case of a religion which is true in a fairly robust sense: even minor inaccuracies in the implicit assumptions of one’s religious practices will mean that the discovery of the truth of the matter in those cases will be damaging to one’s religious culture, if only in minor ways.

All of this generalizes in obvious ways to all sorts of cultural practices, not only to religious practices. It might seem odd to talk about a “discovery” that slavery is wrong, but insofar as there was such a discovery, it was damaging to the culture of the Confederacy before the Civil War.

Someone will object: slavery is actually bad, so banning it only makes things better, and in no way makes them worse. But this is not true: taking away something bad can certainly make things worse in various ways. For example, if a slave owner is suddenly forced to release his slaves, he might be forced to close his business, which means that his customers will no longer receive service.

Not relevant, our objector will respond. Sure, there might be some inconveniences that result from releasing the slaves. But slavery is really bad, and once we’ve freed the slaves we can build a better world without it. The slave owner can start a new business that doesn’t depend on slavery, and things will end up better.

It is easy to see that insofar as there is any truth in the objections, all of it can be applied in other cases, as in the case of liturgical rites we have discussed above, and not only to moral matters. Falsity is also a bad thing, and if we remove it, there “might be some inconveniences,” but just as we have cleared the way for the slave owner to do something better, so we have cleared the way for the formation of liturgical rites which are more fully rooted in the truth. We can build a better world that is not associated with the false idea about the text of John, and things will end up better.

I have my reservations. But the objector is not entirely wrong, and one who wishes to think through this line of argument might also begin to respond to these questions raised earlier.