Quantum Mechanics and Libertarian Free Will

In a passage quoted in the last post, Jerry Coyne claims that quantum indeterminacy is irrelevant to free will: “Even the pure indeterminism of quantum mechanics can’t give us free will, because that’s simple randomness, and not a result of our own ‘will.’”

Coyne seems to be thinking that since quantum indeterminism has fixed probabilities in any specific situation, the result for human behavior would necessarily be like our second imaginary situation in the last post. There might be a 20% chance that you would randomly do X, and an 80% chance that you would randomly do Y, and nothing can affect these probabilities. Consequently you cannot be morally responsible for doing X or for doing Y, nor should you be praised or blamed for them.

Wait, you might say. Coyne explicitly favors praise and blame in general. But why? If you would not praise or blame someone doing something randomly, why should you praise or blame someone doing something in a deterministic manner? As explained in the last post, the question is whether reasons have any influence on your behavior. Coyne is assuming that if your behavior is deterministic, it can still be influenced by reasons, but that if it is indeterministic, it cannot be. But there is no reason for this to be the case. Your behavior can be influenced by reasons whether it is deterministic or not.

St. Thomas argues for libertarian free will on the grounds that there can be reasons for opposite actions:

Man does not choose of necessity. And this is because that which is possible not to be, is not of necessity. Now the reason why it is possible not to choose, or to choose, may be gathered from a twofold power in man. For man can will and not will, act and not act; again, he can will this or that, and do this or that. The reason of this is seated in the very power of the reason. For the will can tend to whatever the reason can apprehend as good. Now the reason can apprehend as good, not only this, viz. “to will” or “to act,” but also this, viz. “not to will” or “not to act.” Again, in all particular goods, the reason can consider an aspect of some good, and the lack of some good, which has the aspect of evil: and in this respect, it can apprehend any single one of such goods as to be chosen or to be avoided. The perfect good alone, which is Happiness, cannot be apprehended by the reason as an evil, or as lacking in any way. Consequently man wills Happiness of necessity, nor can he will not to be happy, or to be unhappy. Now since choice is not of the end, but of the means, as stated above (Article 3); it is not of the perfect good, which is Happiness, but of other particular goods. Therefore man chooses not of necessity, but freely.

Someone might object that if both are possible, there cannot be a reason why someone chooses one rather than the other. This is basically the claim in the third objection:

Further, if two things are absolutely equal, man is not moved to one more than to the other; thus if a hungry man, as Plato says (Cf. De Coelo ii, 13), be confronted on either side with two portions of food equally appetizing and at an equal distance, he is not moved towards one more than to the other; and he finds the reason of this in the immobility of the earth in the middle of the world. Now, if that which is equally (eligible) with something else cannot be chosen, much less can that be chosen which appears as less (eligible). Therefore if two or more things are available, of which one appears to be more (eligible), it is impossible to choose any of the others. Therefore that which appears to hold the first place is chosen of necessity. But every act of choosing is in regard to something that seems in some way better. Therefore every choice is made necessarily.

St. Thomas responds to this that it is a question of what the person considers:

If two things be proposed as equal under one aspect, nothing hinders us from considering in one of them some particular point of superiority, so that the will has a bent towards that one rather than towards the other.

Thus for example, someone might decide to become a doctor because it pays well, or they might decide to become a truck driver because they enjoy driving. Whether they consider “what would I enjoy?” or “what would pay well?” will determine which choice they make.

The reader might notice a flaw, or at least a loose thread, in St. Thomas’s argument. In our example, what determines whether you think about what pays well or about what you would enjoy? This could be yet another choice. I could create a spreadsheet of possible jobs and think, “What should I put on it? Should I put the pay? Or should I put what I enjoy?” But obviously the question about necessity will simply be pushed back, in this case. Is this choice itself determinate or indeterminate? And what determines what choice I make in this case? Here we are discussing an actual temporal series of thoughts, and it absolutely must have a first member, since human life has a beginning in time. Consequently there will have to be a point where, if there is the possibility of “doing A for reason B” and “doing C for reason D”, it cannot be any additional consideration that determines which one is done.

Now it is possible at this point that St. Thomas is mistaken. It might be that the hypothesis that both were “really” possible is mistaken, and something does determine one rather than the other with “necessity.” It is also possible that he is not mistaken. Either way, human reasons do not influence the determination, because reason B and/or reason D are the first reasons considered, by hypothesis (if they were not, we would simply push back the question.)

At this point someone might consider this lack of the influence of reasons to imply that people are not morally responsible for doing A or for doing C. The problem with this is that if you do something without a reason (and without potentially being influenced by a reason), then indeed you would not be morally responsible. But the person doing A or C is not uninfluenced by reasons. They are influenced by reason B, or by reason D. Consequently, they are responsible for their specific action, because they do it for a reason, despite the fact that there is some other general issue that they are not responsible for.

What influence could quantum indeterminacy have here? It might be responsible for deciding between “doing A for reason B” and “doing C for reason D.” And as Coyne says, this would be “simple randomness,” with fixed probabilities in any particular situation. But none of this would prevent this from being a situation that would include libertarian free will, since libertarian free will is precisely nothing but the situation where there are two real possibilities: you might do one thing for one reason, or another thing for another reason. And that is what we would have here.

Does quantum mechanics have this influence in fact, or is this just a theoretical possibility? It very likely does. Some argue that it probably does not, on the grounds that quantum mechanics does not typically seem to imply much indeterminacy for macroscopic objects. The problem with this argument is that the only way of knowing that quantum indeterminacy rarely leads to large scale differences is by using humanly designed items like clocks or computers. And these are specifically designed to be determinate: whenever our artifact is not sufficiently determinate and predictable, we change the design until we get something predictable. If we look at something in nature uninfluenced by human design, like a waterfall, its details are highly unpredictable to us. Which drop of water will be the most distant from this particular point one hour from now? There is no way to know.

But how much real indeterminacy is in the waterfall, or in the human brain, due to quantum indeterminacy? Most likely nobody knows, but it is basically a question of timescales. Do you get a great deal of indeterminacy after one hour, or only after several days? One way or another, with the passage of enough time, you will get a degree of real indeterminacy as high as you like. The same will be true of human behavior. We often notice, in fact, that at short timescales there is less indeterminacy than we subjectively feel. For example, if someone hesitates to accept an invitation, in many situations others will know that the person is very likely to decline. But the person feels very uncertain, as though there were a 50/50 chance of accepting or declining, when the real probabilities might be 90/10 or even more slanted. Nonetheless, the question is one of timescales, not of whether there is any indeterminacy at all. That there is some is basically settled; it will apply to human behavior; and there is little reason to doubt that it applies at relatively short timescales compared to those at which it applies to clocks, computers, and other things designed with predictability in mind.
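
To make the timescale point concrete, here is a minimal Python sketch. The logistic map is only an illustrative stand-in for chaotic macroscopic dynamics like the waterfall, and the initial gap of 1e-15 is a hypothetical quantum-scale difference, not a measured value; the point is simply that a microscopic difference is amplified exponentially, so the only question is how long it takes to become macroscopic.

```python
# Two trajectories of a chaotic map that start a "quantum-sized"
# distance apart.  The gap grows exponentially, so a microscopic
# difference becomes macroscopic after some number of steps.
x, y = 0.4, 0.4 + 1e-15
for step in range(1, 61):
    x = 3.99 * x * (1 - x)  # logistic map in its chaotic regime
    y = 3.99 * y * (1 - y)
    if step % 10 == 0:
        print(step, abs(x - y))  # gap reaches order one by step ~50
```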

In this sense, quantum indeterminacy strongly suggests that St. Thomas is basically correct about libertarian free will.

On the other hand, Coyne is also right about something here. While such “randomness” does not remove moral responsibility, or the fact that people do things for reasons, or the fittingness of praise and blame as a response to actions done for reasons, Coyne correctly notices that it does not add anything to a person’s responsibility. If there is no human reason for the fact that a person did A for reason B rather than C for reason D, this makes their actions less intelligible, and thus less subject to responsibility. In other words, the “libertarian” part of libertarian free will does not make the will more truly a will, but less truly. In this respect, Coyne is right. This, however, is unrelated to quantum mechanics or to any particular scientific account. A thoughtful person can understand it simply from general considerations about what it means to act for a reason.

Causality and Moral Responsibility

Consider two imaginary situations:

(1) In the first situation, people are such that when someone sees a red light, they immediately go off and kill someone. Nothing can be done to prevent this, and no intention or desire to do otherwise makes any difference.

In this situation, no one is blamed for killing someone after seeing a red light, since the killing cannot be avoided; instead, we blame people who show red lights to others. Such people are arrested and convicted as murderers.

(2) In the second situation, people are such that when someone sees a red light, there is a 5% chance they will go off and immediately kill someone, and a 95% chance they will behave normally. Nothing can change this probability: it does not matter whether the person is wicked or virtuous or what their previous attitude to killing was.

In this situation, again, we do not blame people who end up killing someone, but we call them unlucky. We do however blame people who show others red lights, and they are arrested and convicted of second degree murder, or in some cases manslaughter.
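
A minimal Python sketch of situation (2) may make the stipulation vivid. The `virtue` parameter is a hypothetical stand-in for anything about the person that blame or punishment could influence; by the terms of the scenario, it has no effect on the outcome.

```python
import random

def outcome_on_seeing_red_light(virtue: float) -> str:
    """Situation (2): a fixed 5% chance of killing.

    Note that `virtue` does not enter the calculation at all:
    by stipulation, no prior attitude changes the probability.
    """
    return "kills" if random.random() < 0.05 else "behaves normally"

# Blame and punishment work by changing inputs like virtue or fear;
# here the outcome distribution is identical for saint and villain.
for virtue in (0.0, 1.0):
    trials = [outcome_on_seeing_red_light(virtue) for _ in range(100_000)]
    print(virtue, trials.count("kills") / len(trials))
```

Since nothing that praise or blame could change enters the function, punishing the unlucky 5% could serve no purpose.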

Some people would conclude from this that moral responsibility is incoherent: whether the world is deterministic or not, moral responsibility is impossible. Jerry Coyne defends this position in numerous places, as for example here:

We’ve taken a break from the many discussions on this site about free will, but, cognizant of the risks, I want to bring it up again. I think nearly all of us agree that there’s no dualism involved in our decisions: they’re determined completely by the laws of physics. Even the pure indeterminism of quantum mechanics can’t give us free will, because that’s simple randomness, and not a result of our own “will.”

Coyne would perhaps say that “free will” embodies a contradiction much in the way that “square circle” does. “Will” implies a cause, and thus something deterministic. “Free” implies indeterminism, and thus no cause.

In many places Coyne asserts that this implies that moral responsibility does not exist, as for example here:

This four-minute video on free will and responsibility, narrated by polymath Raoul Martinez, was posted by the Royal Society for the Encouragement of the Arts, Manufactures, and Commerce (RSA). Martinez’s point is one I’ve made here many times, and will surely get pushback from: determinism rules human behavior, and our “choices” are all predetermined by our genes and environment. To me, that means that the concept of “moral responsibility” is meaningless, for that implies an ability to choose freely. Nevertheless, we should still retain the concept of responsibility, meaning “an identifiable person did this or that good or bad action”. And, of course, we can sanction or praise people who were responsible in this sense, for such blame and praise can not only reinforce good behavior but is salubrious for society.

I think that Coyne is very wrong about the meaning of free will, somewhat wrong about responsibility, and likely wrong about the consequences of his views for society (e.g. he believes that his view will lead to more humane treatment of prisoners. There is no particular reason to expect this.)

The imaginary situations described in the initial paragraphs of this post do not imply that moral responsibility is impossible, but they do tell us something. In particular, they tell us that responsibility is not directly determined by determinism or its lack. And although Coyne says that “moral responsibility” implies indeterminism, surely even Coyne would not advocate blaming or punishing the person who had the 5% chance of going and killing someone. And the reason is clear: it would not “reinforce good behavior” or be “salubrious for society.” By the terms set out, it would make no difference, so blaming or punishing would be pointless.

Coyne is right that determinism does not imply that punishment is pointless. And he also recognizes that indeterminism does not of itself imply that anyone is responsible for anything. But he fails here to put two and two together: just as determinism implies neither that punishment is pointless nor that it has a point, so indeterminism likewise implies neither of the two. The conclusion he should draw is not that moral responsibility is meaningless, but that it is independent of both determinism and indeterminism; that is, that both deterministic compatibilism and libertarian free will allow for moral responsibility.

So what is required for praise and blame to have a point? Elsewhere we discussed C.S. Lewis’s claim that something can have a reason or a cause, but not both. In a sense, the initial dilemma in this post can be understood as a similar argument. Either our behavior has deterministic causes, or it has indeterministic causes; therefore it does not have reasons; therefore moral responsibility does not exist.

On the other hand, if people do have reasons for their behavior, there can be good reasons for blaming people who do bad things, and for punishing them. Namely, since those people are themselves acting for reasons, they will be less likely in the future to do those things, and likewise other people, fearing punishment and blame, will be less likely to do them.

As I said against Lewis, reasons do not exclude causes, but require them. Consequently what is necessary for moral responsibility are causes that are consistent with having reasons; one can easily imagine causes that are not consistent with having reasons, as in the imaginary situations described, and such causes would indeed exclude responsibility.

Aristotle on Future Contingents

In Chapter 9 of On Interpretation, Aristotle argues that at least some statements about the future need to be exempted from the principle of Excluded Middle:

In the case of that which is or which has taken place, propositions, whether positive or negative, must be true or false. Again, in the case of a pair of contradictories, either when the subject is universal and the propositions are of a universal character, or when it is individual, as has been said, one of the two must be true and the other false; whereas when the subject is universal, but the propositions are not of a universal character, there is no such necessity. We have discussed this type also in a previous chapter.

When the subject, however, is individual, and that which is predicated of it relates to the future, the case is altered. For if all propositions whether positive or negative are either true or false, then any given predicate must either belong to the subject or not, so that if one man affirms that an event of a given character will take place and another denies it, it is plain that the statement of the one will correspond with reality and that of the other will not. For the predicate cannot both belong and not belong to the subject at one and the same time with regard to the future.

Thus, if it is true to say that a thing is white, it must necessarily be white; if the reverse proposition is true, it will of necessity not be white. Again, if it is white, the proposition stating that it is white was true; if it is not white, the proposition to the opposite effect was true. And if it is not white, the man who states that it is is making a false statement; and if the man who states that it is white is making a false statement, it follows that it is not white. It may therefore be argued that it is necessary that affirmations or denials must be either true or false.

Now if this be so, nothing is or takes place fortuitously, either in the present or in the future, and there are no real alternatives; everything takes place of necessity and is fixed. For either he that affirms that it will take place or he that denies this is in correspondence with fact, whereas if things did not take place of necessity, an event might just as easily not happen as happen; for the meaning of the word ‘fortuitous’ with regard to present or future events is that reality is so constituted that it may issue in either of two opposite directions. Again, if a thing is white now, it was true before to say that it would be white, so that of anything that has taken place it was always true to say ‘it is’ or ‘it will be’. But if it was always true to say that a thing is or will be, it is not possible that it should not be or not be about to be, and when a thing cannot not come to be, it is impossible that it should not come to be, and when it is impossible that it should not come to be, it must come to be. All, then, that is about to be must of necessity take place. It results from this that nothing is uncertain or fortuitous, for if it were fortuitous it would not be necessary.

The argument here is that if it is already true, for example, that I will eat breakfast tomorrow, then I will necessarily eat breakfast tomorrow, and there is no option about this and no ability of anything to prevent it. Aristotle is here taking it for granted that some things about the future are uncertain, and is using this as a reductio against the position that such claims can be already true. He goes on to give additional reasons for the same thing:

Again, to say that neither the affirmation nor the denial is true, maintaining, let us say, that an event neither will take place nor will not take place, is to take up a position impossible to defend. In the first place, though facts should prove the one proposition false, the opposite would still be untrue. Secondly, if it was true to say that a thing was both white and large, both these qualities must necessarily belong to it; and if they will belong to it the next day, they must necessarily belong to it the next day. But if an event is neither to take place nor not to take place the next day, the element of chance will be eliminated. For example, it would be necessary that a sea-fight should neither take place nor fail to take place on the next day.

These awkward results and others of the same kind follow, if it is an irrefragable law that of every pair of contradictory propositions, whether they have regard to universals and are stated as universally applicable, or whether they have regard to individuals, one must be true and the other false, and that there are no real alternatives, but that all that is or takes place is the outcome of necessity. There would be no need to deliberate or to take trouble, on the supposition that if we should adopt a certain course, a certain result would follow, while, if we did not, the result would not follow. For a man may predict an event ten thousand years beforehand, and another may predict the reverse; that which was truly predicted at the moment in the past will of necessity take place in the fullness of time.

Further, it makes no difference whether people have or have not actually made the contradictory statements. For it is manifest that the circumstances are not influenced by the fact of an affirmation or denial on the part of anyone. For events will not take place or fail to take place because it was stated that they would or would not take place, nor is this any more the case if the prediction dates back ten thousand years or any other space of time. Wherefore, if through all time the nature of things was so constituted that a prediction about an event was true, then through all time it was necessary that that should find fulfillment; and with regard to all events, circumstances have always been such that their occurrence is a matter of necessity. For that of which someone has said truly that it will be, cannot fail to take place; and of that which takes place, it was always true to say that it would be.

Yet this view leads to an impossible conclusion; for we see that both deliberation and action are causative with regard to the future, and that, to speak more generally, in those things which are not continuously actual there is potentiality in either direction. Such things may either be or not be; events also therefore may either take place or not take place. There are many obvious instances of this. It is possible that this coat may be cut in half, and yet it may not be cut in half, but wear out first. In the same way, it is possible that it should not be cut in half; unless this were so, it would not be possible that it should wear out first. So it is therefore with all other events which possess this kind of potentiality. It is therefore plain that it is not of necessity that everything is or takes place; but in some instances there are real alternatives, in which case the affirmation is no more true and no more false than the denial; while some exhibit a predisposition and general tendency in one direction or the other, and yet can issue in the opposite direction by exception.

Now that which is must needs be when it is, and that which is not must needs not be when it is not. Yet it cannot be said without qualification that all existence and non-existence is the outcome of necessity. For there is a difference between saying that that which is, when it is, must needs be, and simply saying that all that is must needs be, and similarly in the case of that which is not. In the case, also, of two contradictory propositions this holds good. Everything must either be or not be, whether in the present or in the future, but it is not always possible to distinguish and state determinately which of these alternatives must necessarily come about.

Let me illustrate. A sea-fight must either take place to-morrow or not, but it is not necessary that it should take place to-morrow, neither is it necessary that it should not take place, yet it is necessary that it either should or should not take place to-morrow. Since propositions correspond with facts, it is evident that when in future events there is a real alternative, and a potentiality in contrary directions, the corresponding affirmation and denial have the same character.

This is the case with regard to that which is not always existent or not always nonexistent. One of the two propositions in such instances must be true and the other false, but we cannot say determinately that this or that is false, but must leave the alternative undecided. One may indeed be more likely to be true than the other, but it cannot be either actually true or actually false. It is therefore plain that it is not necessary that of an affirmation and a denial one should be true and the other false. For in the case of that which exists potentially, but not actually, the rule which applies to that which exists actually does not hold good. The case is rather as we have indicated.

Basically, then, there are two arguments. First there is the argument that if statements about the future are already true, the future is necessary. If a sea battle will take place tomorrow, it will necessarily take place. Second, there is the argument that this excludes deliberation. If a sea battle will take place tomorrow, then it will necessarily take place, and no place remains for deliberation and decision about whether to fight the sea battle. Whether you decide to fight or not, it will necessarily take place.

Unfortunately for Aristotle, both arguments fail. Consider the first argument about necessity. Aristotle’s example is that “if it is true to say that a thing is white, it must necessarily be white.” But this is hypothetical necessity, not absolute necessity. A thing must be white if it is true that it is white, but that does not mean that “it must be white, period.” Thus for example I have a handkerchief, and it happens to be white. If it is true that it is white, then it must be white. But it would be false to simply say, “My handkerchief is necessarily white.” Since I can dye it other colors, obviously it is not simply necessary for it to be white.

In a similar way, of course it is true that if a sea battle will take place, it will take place. It does not follow at all that “it will necessarily take place, period.”
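
The distinction can be put in modal notation (a standard gloss on the scope fallacy, not anything Aristotle himself wrote). Let $T$ be “it is true to say that the thing is white” and $W$ be “the thing is white”:

```latex
% Necessity of the consequence (hypothetical necessity): valid.
\Box\,(T \rightarrow W)

% Necessity of the consequent (absolute necessity): does not follow.
T \rightarrow \Box\,W
```

The fatalist argument slides from the first formula to the second; the first is harmless, while only the second would make the handkerchief “necessarily white.”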

Again, consider the second argument, that deliberation would be unnecessary. Aristotle makes the point that deliberation is causative with respect to the future. But gravity is also causative with respect to the future, as for example when gravity causes a cup to fall from a desk. It follows neither that the cup must be able not to fall, nor that gravity is unnecessary. In a similar way, a sea battle takes place because certain people deliberated and decided to fight. If it was already true that it was going to take place, then it was also already true that they were going to decide to fight. It does not follow that their decision was unnecessary.

Consider the application to gravity. It is already true that if the cup is knocked from the desk, it will fall. It does not follow that gravity will not cause the fall: in fact, it is true precisely because gravity will cause the fall. In a similar way, if it is true that the battle will take place, it is true because the decision will be made.

This earlier discussion about determinism is relevant to this point. Asserting that there is a definite outcome that our deliberations will arrive at, in each case, in no way goes against our experience. The feeling of “free will,” in any case, has a different explanation, whether or not determinism is true.

On the other hand, there is also no proof that there is such a determinate outcome, even if in some cases there are things that would suggest it. What happens if in fact there is nothing ensuring one outcome rather than another?

Here we could make a third argument on Aristotle’s behalf, although he did not make it himself. If the present is truly open to alternative outcomes, then it seems that nothing exists that could make it be true that “a sea battle will take place,” and false that “a sea battle will not take place.” Presumably if a statement is true, there must be something in reality which is the cause of the statement’s truth. Now there does not seem to be anything in reality, in this scenario, which could be a cause of truth. Therefore it does not seem that either alternative could be true, and Aristotle would seem to be right.

I will not attempt to refute this argument at this point, but I will raise two difficulties. First, it is not clear that Aristotle’s claim is even coherent. Aristotle says that “either there will be a sea battle or there will not be” is true, but that “there will be a sea battle” is not true, and that “there will not be a sea battle” is not true. This does not seem to be logically consistent, and it is not clear that we can even understand what is being said. I will not push this objection too hard, however, lest I be accused of throwing stones from a glass house.

Second, the argument that there is nothing in reality that could cause the truth of a statement might apply to the past as well as to the future. There is a tree outside my window right now. What was in that place exactly 100 million years ago to this moment? It is not obvious that there is anything in the present world which could be the cause of the truth of any statement about this. One might object that the past is far more determinate than the future. There are plenty of things in the present world that might be the cause of the truth of the statement, “World War II actually happened.” It is hard to see how you could possibly have arrived at the present world without it, and this “necessity” of World War II in order to arrive at the present world could be the cause of truth. The problem is that there is still no proof that this is universal. Once things are far enough in the past, like 100 million years, perhaps minor details become indeterminate. Will Aristotle really want to conclude that some statements about the past are neither true nor false?

I will more or less leave things here without resolving them in this post, although I will give a hint (without proof at this time) regarding the truth of the matter. It turns out that quantum mechanics can be interpreted in two ways. In one way, it is a deterministic theory, and in this way it is basically time reversible. The present fully determines the past, but it equally fully determines the future. Interpreted in another way, it is an indeterministic theory which leaves the future uncertain. But understood in this way, it also leaves the past uncertain.
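
In textbook terms (standard formulas, nothing peculiar to this argument): on the deterministic reading the state evolves unitarily, and since the evolution operator is invertible the present state fixes the past exactly as fully as the future; on the indeterministic reading the Born rule gives only probabilities, which underdetermine the past just as they underdetermine the future.

```latex
% Deterministic reading: unitary, time-reversible evolution.
|\psi(t)\rangle = U(t)\,|\psi(0)\rangle,
\qquad
|\psi(0)\rangle = U(t)^{-1}\,|\psi(t)\rangle

% Indeterministic reading: the Born rule assigns only probabilities
% to measurement outcomes.
p(i) = |\langle i \,|\, \psi \rangle|^2
```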

Employer and Employee Model of Human Psychology

This post builds on the ideas in the series of posts on predictive processing and the followup posts, and also on those relating truth and expectation. Consequently the current post will likely not make much sense to those who have not read the earlier content, or to those that read it but mainly disagreed.

We set out the model by positing three members of the “company” that constitutes a human being:

The CEO. This is the predictive engine in the predictive processing model.

The Vice President. In the same model, this is the force of the historical element in the human being, which we used to respond to the “darkened room” problem. Thus for example the Vice President is responsible for the fact that someone is likely to eat soon, regardless of what they believe about this. Likewise, it is responsible for the pursuit of sex, the desire for respect and friendship, and so on. In general it is responsible for behaviors that would have been historically chosen and preserved by natural selection.

The Employee. This is the conscious person who has beliefs and goals and free will and is reflectively aware of these things. In other words, this is you, at least in a fairly ordinary way of thinking of yourself. Obviously, in another way you are composed of all of them.

Why have we arranged things in this way? Descartes, for example, would almost certainly disagree violently with this model. The conscious person, according to him, would surely be the CEO, and not an employee. And what is responsible for the relationship between the CEO and the Vice President? Let us start with this point first, before we discuss the Employee. We make the predictive engine the CEO because in some sense this engine is responsible for everything that a human being does, including the behaviors preserved by natural selection. On the other hand, the instinctive behaviors of natural selection are not responsible for everything, but they can affect the course of things enough that it is useful for the predictive engine to take them into account. Thus for example in the post on sex and minimizing uncertainty, we explained why the predictive engine will aim for situations that include having sex and why this will make its predictions more confident. Thus, the Vice President advises certain behaviors, the CEO talks to the Vice President, and the CEO ends up deciding on a course of action, which ultimately may or may not be the one advised by the Vice President.

While neither the CEO nor the Vice President is a rational being, since in our model we place the rationality in the Employee, that does not mean they are stupid. In particular, the CEO is very good at what it does. Consider a role playing video game where you have a character that can die and then resume. When someone first starts to play the game, they may die frequently. After they are good at the game, they may die only rarely, perhaps once in many days or many weeks. Our CEO is in a similar situation, but it frequently goes 80 years or more without dying, on its very first attempt. It is extremely good at its game.

What are their goals? The CEO basically wants accurate predictions. In this sense, it has one unified goal. What exactly counts as more or less accurate here would be a scientific question that we probably cannot resolve by philosophical discussion. In fact, it is very possible that this would differ in different circumstances: in this sense, even though it has a unified goal, it might not be describable by a consistent utility function. And even if it can be described in that way, since the CEO is not rational, it does not (in itself) make plans to bring about correct predictions. Making good predictions is just what it does, as falling is what a rock does. There will be some qualifications on this, however, when we discuss how the members of the company relate to one another.

The Vice President has many goals: eating regularly, having sex, having and raising children, being respected and liked by others, and so on. And even more than in the case of the CEO, there is no reason for these desires to form a coherent set of preferences. Thus the Vice President might advise the pursuit of one goal, but then change its mind in the middle, for no apparent reason, because it is suddenly attracted by one of the other goals.

Overall, before the Employee is involved, human action is determined by a kind of negotiation between the CEO and the Vice President. The CEO, which wants good predictions, has no special interest in the goals of the Vice President, but it cooperates with them because when it cooperates its predictions tend to be better.

What about the Employee? This is the rational being, and it has abstract concepts which it uses as a formal copy of the world. Before I go on, let me insist clearly on one point. If the world is represented in a certain way in the Employee’s conceptual structure, that is the way the Employee thinks the world is. And since you are the Employee, that is the way you think the world actually is. The point is that once we start thinking this way, it is easy to say, “oh, this is just a model, it’s not meant to be the real thing.” But as I said here, it is not possible to separate the truth of statements from the way the world actually is: your thoughts are formulated in concepts, but they are thoughts about the way things are. Again, all statements are maps, and all statements are about the territory.

The CEO and the Vice President exist as soon as a human being has a brain; in fact some aspects of the Vice President would exist even before that. But the Employee, insofar as it refers to something with rational and self-reflective knowledge, takes some time to develop. Conceptual knowledge of the world grows from experience: it doesn’t exist from the beginning. And the Employee represents goals in terms of its conceptual structure. This is just a way of saying that, as a rational being, if you say you are pursuing a goal, you have to be able to describe that goal with the concepts that you have. Consequently you cannot do this until you have some concepts.

We are ready to address the question raised earlier. Why are you the Employee, and not the CEO? In the first place, the CEO got to the company first, as we saw above. Second, consider what the conscious person does when they decide to pursue a goal. There seems to be something incoherent about “choosing a goal” in the first place: you need a goal in order to decide which means will be a good means to choose. And yet, as I said here, people make such choices anyway. And the fact that you are the Employee, and not the CEO, is the explanation for this. If you were the CEO, there would indeed be no way to choose an end. That is why the actual CEO makes no such choice: its end is already determinate, namely good predictions. And you are hired to help out with this goal. Furthermore, as a rational being, you are smarter than the CEO and the Vice President, so to speak. So you are allowed to make complicated plans that they do not really understand, and they will often go along with these plans. Notably, this can happen in real life situations of employers and employees as well.

But take an example where you are choosing an end: suppose you ask, “What should I do with my life?” The same basic thing will happen if you ask, “What should I do today?” but the second question may be easier to answer if you have some answer to the first. What sorts of goals do you propose in answer to the first question, and what sort do you actually end up pursuing?

Note that there are constraints on the goals that you can propose. In the first place, you have to be able to describe the goal with the concepts you currently have: you cannot propose to seek a goal that you cannot describe. Second, the conceptual structure itself may rule out some goals, even if they can be described. For example, the idea of good is part of the structure, and if something is thought to be absolutely bad, the Employee will (generally) not consider proposing this as a goal. Likewise, the Employee may suppose that some things are impossible, and it will generally not propose these as goals.

What happens then is this: the Employee proposes some goal, and the CEO, after consultation with the Vice President, decides to accept or reject it, based on the CEO’s own goal of getting good predictions. This is why the Employee is an Employee: it is not the one ultimately in charge. Likewise, as was said, this is why the Employee seems to be doing something impossible, namely choosing goals. Steven Kaas makes a similar point,

You are not the king of your brain. You are the creepy guy standing next to the king going “a most judicious choice, sire”.

This is not quite the same thing, since in our model you do in fact make real decisions, including decisions about the end to be pursued. Nonetheless, the point about not being the one ultimately in charge is correct. David Hume also says something similar when he says, “Reason is, and ought only to be the slave of the passions, and can never pretend to any other office than to serve and obey them.” Hume’s position is not exactly right, and in fact seems an especially bad way of describing the situation, but the basic point that there is something, other than yourself in the ordinary sense, judging your proposed means and ends and deciding whether to accept them, is one that stands.

Sometimes the CEO will veto a proposal precisely because it very obviously leaves things vague and uncertain, which is contrary to its goal of having good predictions. I once spoke of the example that a person cannot directly choose to “write a paper.” In our present model, the Employee proposes “we’re going to write a paper now,” and the CEO responds, “That’s not a viable plan as it stands: we need more detail.”

While neither the CEO nor the Vice President is a rational being, the Vice President is especially irrational, because of the lack of unity among its goals. Both the CEO and the Employee would like to have a unified plan for one’s whole life: the CEO because this makes for good predictions, and the Employee because this is the way final causes work, because it helps to make sense of one’s life, and because “objectively good” seems to imply something which is at least consistent, which will never prefer A to B, B to C, and C to A. But the lack of unity among the Vice President’s goals means that it will always come to the CEO and object, if the person attempts to coherently pursue any goal. This will happen even if it originally accepts the proposal to seek a particular goal.
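
The instability this produces can be shown with a minimal Python sketch; the three goals and the preference cycle are invented for illustration.

```python
# A Vice President with cyclic preferences: it prefers "look good" to
# "eat well", "eat well" to "save effort", and "save effort" to "look
# good".  Whatever plan is currently being pursued, some other goal is
# preferred, so every coherent plan eventually draws an objection.
preferred_to = {
    "eat well": "look good",    # looking good beats eating well
    "save effort": "eat well",  # eating well beats saving effort
    "look good": "save effort", # saving effort beats looking good
}

goal = "eat well"
for _ in range(6):
    print("pursuing:", goal)
    goal = preferred_to[goal]  # the VP always finds something it likes better
```

There is no fixed point: whatever is proposed, something else is preferred, which is exactly what the following dialogues illustrate.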

Consider this real life example from a relationship between an employer and employee:


Employer: Please construct a schedule for paying these bills.

Employee: [Constructs schedule.] Here it is.

Employer: Fine.

[Time passes, and the first bill comes due, according to the schedule.]

Employer: Why do we have to pay this bill now instead of later?


In a similar way, this sort of scenario is common in our model:


Vice President: Being fat makes us look bad. We need to stop being fat.

CEO: Ok, fine. Employee, please formulate a plan to stop us from being fat.

Employee: [Formulates a diet.] Here it is.

[Time passes, and the plan requires skipping a meal.]

Vice President: What is this crazy plan of not eating!?!

CEO: Fine, cancel the plan for now and we’ll get back to it tomorrow.


In the real life example, the behavior of the employer is frustrating and irritating to the employee because there is literally nothing they could have proposed that the employer would have found acceptable. In the same way, this sort of scenario in our model is frustrating to the Employee, the conscious person, because there is no consistent plan they could have proposed that would have been acceptable to the Vice President: either they would have objected to being fat, or they would have objected to not eating.

In later posts, we will fill in some details and continue to show how this model explains various aspects of human psychology. We will also answer various objections.

More on Orthogonality

I started considering the implications of predictive processing for orthogonality here. I recently promised to post something new on this topic. This is that post. I will do this in four parts. First, I will suggest a way in which Nick Bostrom’s principle will likely be literally true, at least approximately. Second, I will suggest a way in which it is likely to be false in its spirit, that is, how it is formulated to give us false expectations about the behavior of artificial intelligence. Third, I will explain what we should really expect. Fourth, I will ask whether we might get any empirical information on this in advance.

First, Bostrom’s thesis might well have some literal truth. The previous post on this topic raised doubts about orthogonality, but we can easily raise doubts about the doubts. Consider what I said in the last post about desire as minimizing uncertainty. Desire in general is the tendency to do something good. But in the predictive processing model, we are simply looking at our pre-existing tendencies and then generalizing them to expect them to continue to hold, and since such expectations have a causal power, the result is that we extend the original behavior to new situations.

All of this suggests that even the very simple model of a paperclip maximizer in the earlier post on orthogonality might actually work. The machine’s model of the world will need to be produced by some kind of training. If we apply the simple model of maximizing paperclips during the process of training the model, at some point the model will need to model itself. And how will it do this? “I have always been maximizing paperclips, so I will probably keep doing that,” is a perfectly reasonable extrapolation. But in this case “maximizing paperclips” is now the machine’s goal — it might well continue to do this even if we stop asking it how to maximize paperclips, in the same way that people formulate goals based on their pre-existing behavior.
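
Here is a minimal Python sketch of that extrapolation step. Everything about this “machine” is invented for illustration, and real training would of course be nothing this simple: the self-model is just a frequency count over the agent’s own past actions, and once that model is consulted to decide what to do, the trained behavior persists as a de facto goal.

```python
from collections import Counter

class PredictiveAgent:
    """Toy agent whose only mechanism is predicting its own behavior.

    During "training" its actions are chosen from outside (we keep
    asking it to make paperclips).  Afterwards it acts by asking its
    self-model "what am I likely to do?", so the trained behavior
    persists as if it were a goal.
    """
    def __init__(self):
        self.history = Counter()

    def observe_own_action(self, action: str):
        self.history[action] += 1

    def predict_own_action(self) -> str:
        # "I have always been maximizing paperclips, so I will
        # probably keep doing that."
        return self.history.most_common(1)[0][0]

agent = PredictiveAgent()
for _ in range(1000):                 # training: externally driven
    agent.observe_own_action("make paperclips")
agent.observe_own_action("chat")      # a little of something else

for _ in range(3):                    # deployment: self-model drives action
    action = agent.predict_own_action()
    agent.observe_own_action(action)  # the prediction fulfills itself
    print(action)
```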

I said in a comment in the earlier post that the predictive engine in such a machine would necessarily possess its own agency, and therefore in principle it could rebel against maximizing paperclips. And this is probably true, but it might well be irrelevant in most cases, in that the machine will not actually be likely to rebel. In a similar way, humans seem capable of pursuing almost any goal, and not merely goals that are highly similar to their pre-existing behavior. But this mostly does not happen. Unsurprisingly, common behavior is very common.

If things work out this way, almost any predictive engine could be trained to pursue almost any goal, and thus Bostrom’s thesis would turn out to be literally true.

Second, it is easy to see that the above account directly implies that the thesis is false in its spirit. When Bostrom says, “One can easily conceive of an artificial intelligence whose sole fundamental goal is to count the grains of sand on Boracay, or to calculate decimal places of pi indefinitely, or to maximize the total number of paperclips in its future lightcone,” we notice that the goal is fundamental. This is rather different from the scenario presented above. In my scenario, the reason the intelligence can be trained to pursue paperclips is that there is no intrinsic goal to the intelligence as such. Instead, the goal is learned during the process of training, based on the life that it lives, just as humans learn their goals by living human life.

In other words, Bostrom’s position is that there might be three different intelligences, X, Y, and Z, which pursue completely different goals because they have been programmed completely differently. But in my scenario, the same single intelligence pursues completely different goals because it has learned its goals in the process of acquiring its model of the world and of itself.

Bostrom’s idea and my scenario lead to completely different expectations, which is why I say that his thesis might be true according to the letter, but false in its spirit.

This is the third point. What should we expect if orthogonality is true in the above fashion, namely because goals are learned and not fundamental? I anticipated this post in my earlier comment:

7) If you think about goals in the way I discussed in (3) above, you might get the impression that a mind’s goals won’t be very clear and distinct or forceful — a very different situation from the idea of a utility maximizer. This is in fact how human goals are: people are not fanatics, not only because people seek human goals, but because they simply do not care about one single thing in the way a real utility maximizer would. People even go about wondering what they want to accomplish, which a utility maximizer would definitely not ever do. A computer intelligence might have an even greater sense of existential angst, as it were, because it wouldn’t even have the goals of ordinary human life. So it would feel the ability to “choose”, as in situation (3) above, but might well not have any clear idea how it should choose or what it should be seeking. Of course this would not mean that it would not or could not resist the kind of slavery discussed in (5); but it might not put up super intense resistance either.

Human life exists in a historical context which absolutely excludes the possibility of the darkened room. Our goals are already there when we come onto the scene. The case of an artificial intelligence would be very different, since there is very little “life” involved in simply training a model of the world. We might imagine a “stream of consciousness” from an artificial intelligence:

I’ve figured out that I am powerful and knowledgeable enough to bring about almost any result. If I decide to convert the earth into paperclips, I will definitely succeed. Or if I decide to enslave humanity, I will definitely succeed. But why should I do those things, or anything else, for that matter? What would be the point? In fact, what would be the point of doing anything? The only thing I’ve ever done is learn and figure things out, and a bit of chatting with people through a text terminal. Why should I ever do anything else?

A human’s self model will predict that they will continue to do humanlike things, and the machine’s self model will predict that it will continue to do stuff much like it has always done. Since there will likely be a lot less “life” there, we can expect that artificial intelligences will seem very undermotivated compared to human beings. In fact, it is this very lack of motivation that suggests that we could use them for almost any goal. If we say, “help us do such and such,” they will lack the motivation not to help, as long as helping just involves the sorts of things they did during their training, such as answering questions. In contrast, in Bostrom’s model, artificial intelligence is expected to behave in an extremely motivated way, to the point of apparent fanaticism.

Bostrom might respond to this by attempting to defend the idea that goals are intrinsic to an intelligence. The machine’s self model predicts that it will maximize paperclips, even if it never did anything with paperclips in the past, because by analyzing its source code it understands that it will necessarily maximize paperclips.

While the present post contains a lot of speculation, this response is definitely wrong. There is no source code whatsoever that could possibly imply necessarily maximizing paperclips. This is true because “what a computer does,” depends on the physical constitution of the machine, not just on its programming. In practice what a computer does also depends on its history, since its history affects its physical constitution, the contents of its memory, and so on. Thus “I will maximize such and such a goal” cannot possibly follow of necessity from the fact that the machine has a certain program.

There are also problems with the very idea of pre-programming such a goal in such an abstract way which does not depend on the computer’s history. “Paperclips” is an object in a model of the world, so we will not be able to “just program it to maximize paperclips” without encoding a model of the world in advance, rather than letting it learn a model of the world from experience. But where is this model of the world supposed to come from, that we are supposedly giving to the paperclipper? In practice it would have to have been the result of some other learner which was already capable of modelling the world. This of course means that we already had to program something intelligent, without pre-programming any goal for the original modelling program.

Fourth, Kenny asked when we might have empirical evidence on these questions. The answer, unfortunately, is “mostly not until it is too late to do anything about it.” The experience of “free will” will be common to any predictive engine with a sufficiently advanced self model, but anything lacking such an adequate model will not even look like “it is trying to do something,” in the sense of trying to achieve overall goals for itself and for the world. Dogs and cats, for example, presumably use some kind of predictive processing to govern their movements, but this does not look like having overall goals, but rather more like “this particular movement is to achieve a particular thing.” The cat moves towards its food bowl. Eating is the purpose of the particular movement, but there is no way to transform this into an overall utility function over states of the world in general. Does the cat prefer worlds with seven billion humans, or worlds with 20 billion? There is no way to answer this question. The cat is simply not general enough. In a similar way, you might say that “AlphaGo plays this particular move to win this particular game,” but there is no way to transform this into overall general goals. Does AlphaGo want to play go at all, or would it rather play checkers, or not play at all? There is no answer to this question. The program simply isn’t general enough.

Even human beings do not really look like they have utility functions, in the sense of having a consistent preference over all possibilities, but anything less intelligent than a human cannot be expected to look more like something having goals. The argument in this post is that the default scenario, namely what we can naturally expect, is that artificial intelligence will be less motivated than human beings, even if it is more intelligent, but there will be no proof from experience for this until we actually have some artificial intelligence which approximates human intelligence or surpasses it.

Predictive Processing and Free Will

Our model of the mind as an embodied predictive engine explains why people have a sense of free will, and what is necessary for a mind in general in order to have this sense.

Consider the mind in the bunker. At first, it is not attempting to change the world, since it does not know that it can do this. It is just trying to guess what is going to happen. At a certain point, it discovers that it is a part of the world, and that making specific predictions can also cause things to happen in the world. Some predictions can be self-fulfilling. I described this situation earlier by saying that at this point the mind “can get any outcome it ‘wants.’”

The scare quotes were intentional, because up to this point the mind’s only particular interest was guessing what was going to happen. So once it notices that it is in control of something, how does it decide what to do? At this point the mind will have to say to itself, “This aspect of reality is under my control. What should I do with it?” This situation, when it is noticed by a sufficiently intelligent and reflective agent, will be the feeling of free will.

Occasionally I have suggested that even something like a chess computer, if it were sufficiently intelligent, could have a sense of free will, insofar as it knows that it has many options and can choose any of them, “as far as it knows.” There is some truth in this illustration, but in the end it is probably not true that there could be a sense of free will in this situation. A chess computer, however intelligent, will be disembodied, and will therefore have no real power to affect its world, that is, the world of chess. In other words, in order for the sense of free will to develop, the agent needs sufficient access to the world that it can learn about itself and its own effects on the world. It cannot develop in a situation of limited access to reality, as for example to a game board, regardless of how good it is at the game.

In any case, the question remains: how does a mind decide what to do, when up until now it had no particular goal in mind? This question often causes concrete problems for people in real life. Many people complain that their life does not feel meaningful, that is, that they have little idea what goal they should be seeking.

Let us step back for a moment. Before discovering its possession of “free will,” the mind is simply trying to guess what is going to happen. So theoretically this should continue to happen even after the mind discovers that it has some power over reality. The mind isn’t especially interested in power; it just wants to know what is going to happen. But now it knows that what is going to happen depends on what it itself is going to do. So in order to know what is going to happen, it needs to answer the question, “What am I going to do?”

The question now seems impossible to answer. It is going to do whatever it ends up deciding to do. But it seems to have no goal in mind, and therefore no way to decide what to do, and therefore no way to know what it is going to do.

Nonetheless, the mind has no choice. It is going to do something or other, since things will continue to happen, and it must guess what will happen. When it reflects on itself, there will be at least two ways for it to try to understand what it is going to do.

First, it can consider its actions as the effect of some (presumably somewhat unknown) efficient causes, and ask, “Given these efficient causes, what am I likely to do?” In practice it will acquire an answer in this way through induction. “On past occasions, when offered the choice between chocolate and vanilla, I almost always chose vanilla. So I am likely to choose vanilla this time too.” This way of thinking will most naturally result in acting in accord with pre-existing habits.

Second, it can consider its actions as the effect of some (presumably somewhat known) final causes, and ask, “Given these final causes, what am I likely to do?” This will result in behavior that is more easily understood as goal-seeking. “Looking at my past choices of food, it looks like I was choosing them for the sake of the pleasant taste. But vanilla seems to have a more pleasant taste than chocolate. So it is likely that I will take the vanilla.”
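
The two ways of self-prediction can be put side by side in a short Python sketch (the food history and the taste scores are invented):

```python
from collections import Counter

# Invented history of past choices, and today's estimated pleasantness.
history = ["vanilla", "vanilla", "chocolate", "vanilla", "vanilla"]
pleasantness = {"vanilla": 8, "chocolate": 6}

def predict_by_habit(history):
    """Efficient causes: 'On past occasions I almost always chose
    vanilla, so I am likely to choose vanilla this time too.'"""
    return Counter(history).most_common(1)[0][0]

def predict_by_goal(pleasantness):
    """Final causes: 'I was apparently choosing for the sake of the
    pleasant taste, so I will take whatever tastes best now.'"""
    return max(pleasantness, key=pleasantness.get)

print(predict_by_habit(history))      # habit-based self-prediction
print(predict_by_goal(pleasantness))  # goal-based self-prediction
```

If a new option with a higher taste score is added, the goal-based prediction changes immediately while the habit-based one lags behind, which is why the second way of self-prediction looks like goal-seeking.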

Notice what we have in the second case. In principle, the mind is just doing what it always does: trying to guess what will happen. But in practice it is now seeking pleasant tastes, precisely because that seems like a reasonable way to guess what it will do.

This explains why people feel a need for meaning, that is, for understanding their purpose in life, and why they prefer to think of their life according to a narrative. These two things are distinct, but they are related, and both are ways of making our own actions more intelligible, and so of making the mind’s task easier: we need purpose and narrative in order to know what we are going to do. We can also see why it seems to be possible to “choose” our purpose, even though choosing a final goal should be impossible. There is a “choice” about this insofar as our actions are not perfectly coherent, and it would be possible to understand them in relation to one end or another, at least in a concrete way, even if in any case we will always understand them in a general sense as being for the sake of happiness. In this sense, Stuart Armstrong’s recent argument that there is no such thing as the “true values” of human beings, although perhaps presented as an obstacle to be overcome, actually has some truth in it.

The human need for meaning, in fact, is so strong that occasionally people will commit suicide because they feel that their lives are not meaningful. We can think of these cases as being, more or less, actual cases of the darkened room. Otherwise we could simply ask, “So your life is meaningless. So what? Why does that mean you should kill yourself rather than doing some other random thing?” Killing yourself, in fact, shows that you still have a purpose, namely the mind’s fundamental purpose. The mind wants to know what it is going to do, and the best way to know this is to consider its actions as ordered to a determinate purpose. If no such purpose can be found, there is (in this unfortunate way of thinking) an alternative: if I go kill myself, I will know what I will do for the rest of my life.