Some Remarks on GPT-N

At the end of May, OpenAI published a paper on GPT-3, a language model which is a successor to their previous version, GPT-2. While the model is quite impressive, the reaction from many people interested in artificial intelligence has been seriously exaggerated. Sam Altman, OpenAI’s CEO, has said as much himself:

The GPT-3 hype is way too much. It’s impressive (thanks for the nice compliments!) but it still has serious weaknesses and sometimes makes very silly mistakes. AI is going to change the world, but GPT-3 is just a very early glimpse. We have a lot still to figure out.

I used “GPT-N” in the title here because most of the comments I intend to make are almost completely general, and will apply to any future version that uses sufficiently similar methods.

What it does

GPT-3 is a predictive language model: given an input text, it tries to predict what would come next, much as you might try to guess the rest of this sentence after reading the first few words with the rest covered up. To the degree that it does this well, it can be used to generate text from a “prompt”: we give it a few words or a few sentences, and then append whatever it predicts should come next. For example, let’s take this very blog post and see what GPT-3 would like to say:

What it doesn’t do

While GPT-3 does seem to be able to generate some pretty interesting results, there are several limitations that need to be taken into account when using it.

First and foremost, and most importantly, it can’t do anything without a large amount of input data. If you want it to write like “a real human,” you need to give it a lot of real human writing. For most people, this means copying and pasting a lot. And while the program is able to read through that and get a feel for the way humans communicate, you can’t exactly use it to write essays or research papers. The best you could do is use it as a “fill in the blank” tool to write stories, and that’s not even very impressive.

While the program does learn from what it reads and is quite good at predicting words and phrases based on what has already been written, this method isn’t very effective at producing realistic prose. The best you could hope for is something like the “Deep Writing Machine” Twitter account, which spits out disconnected phrases in an ominous, but very bland voice.

In addition, the model is limited only to language. It does not understand context or human thought at all, so it has no way of tying anything together. You could use it to generate a massive amount of backstory and other material for a game, but that’s about it.

Finally, the limitations in writing are only reinforced by the limitations in reading. Even with a large library to draw on, the program is only as good as the parameters set for it. Even if you set it to the greatest writers mankind has ever known, without any special parameters, its writing would be just like anyone else’s.

The Model

GPT-3 consists of several layers. The first layer is a “memory network” that involves the program remembering previously entered data and using it when appropriate (i.e. it remembers commonly misspelled words and frequently used words). The next layer is the reasoning network, which involves common sense logic (i.e. if A, then B). The third is the repetition network, which involves pulling previously used material from memory and using it to create new combinations (i.e. using previously used words in new orders).

I added the bold formatting; the rest is as produced by the model. This was also done in one run, without repetitions. This is an important qualification, since many examples on the internet have been produced by deleting something produced by the model and forcing it to generate something new until something sensible resulted. Note that the model does not seem to have understood my line, “let’s take this very blog post and see what GPT-3 would like to say.” That is, rather than trying to “say” anything, it attempted to continue the blog post in the way I might have continued it without the block quote.

Truth vs Probability of Text

If we interpret the above text from GPT-3 “charitably”, much of it is true or close to true. But I use scare quotes here because when we speak of interpreting human speech charitably, we are assuming that someone was trying to speak the truth, and so we think, “What would they have meant if they were trying to say something true?” The situation is different here, because GPT-3 has no intention of producing truth, nor of avoiding it. Insofar as there is any intention, the intention is to produce the text which would be likely to come after the input text; in this case, as the input text was the beginning of this blog post, the intention was to produce the text that would likely follow in such a post. Note that there is an indirect relationship with truth, which explains why there is any truth at all in GPT-3’s remarks. If the input text is true, it is at least somewhat likely that what would follow would also be true, so if the model is good at guessing what would be likely to follow, it will be likely to produce something true in such cases. But it is just as easy to convince it to produce something false, simply by providing an input text that would be likely to be followed by something false.
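The indirect relationship between likely text and true text can be seen in a toy predictor. What follows is only a minimal sketch, not a description of how GPT-3 is built: it assumes a tiny made-up corpus and a simple count of which word followed each pair of words, and the names `CORPUS`, `train`, and `continue_text` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical training text: one true claim and one false claim.
CORPUS = [
    "the earth is round",
    "the moon is made of cheese",
]


def train(sentences, order=2):
    """Count which word follows each run of `order` words in the training text."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i in range(len(words) - order):
            context = tuple(words[i:i + order])
            follows[context][words[i + order]] += 1
    return follows


def continue_text(follows, prompt, order=2, max_words=5):
    """Repeatedly append the most frequent next word until no continuation is known."""
    words = prompt.split()
    for _ in range(max_words):
        candidates = follows.get(tuple(words[-order:]))
        if not candidates:
            break
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


model = train(CORPUS)
print(continue_text(model, "the earth is"))
print(continue_text(model, "the moon is"))
```

The first prompt is continued with something true and the second with something false, and the procedure is identical in both cases. The toy model has no notion of truth at all, only of what followed what in its training text.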

This results in an absolute upper limit on the quality of the output of a model of this kind, including any successor version, as long as the model works by predicting the probability of the following text. Namely, its best output cannot be substantially better than the best content in its training data, which in this version is a large quantity of texts from the internet. The reason for this limitation is clear; to the degree that the model has any intention at all, the intention is to reflect the training data, not to surpass it. As an example, consider the difference between DeepMind’s AlphaGo and AlphaGo Zero. AlphaGo Zero is a better Go player than the original AlphaGo, and this is largely because the original is trained on human play, while AlphaGo Zero is trained from scratch on self play. In other words, the original version is to some extent predicting “what would a Go player play in this situation,” which is not the same as predicting “what move would win in this situation.”

Now I will predict (and perhaps even GPT-3 could predict) that many people will want to jump in and say, “Great. That shows you are wrong. Even the original AlphaGo plays Go much better than a human. So there is no reason that an advanced version of GPT-3 could not be better than humans at saying things that are true.”

The difference, of course, is that AlphaGo was trained in two ways, first on predicting what move would be likely in a human game, and second on what would be likely to win, based on its experience during self play. If you had trained the model only on predicting what would follow in human games, without the second aspect, the model would not have resulted in play that substantially improved upon human performance. But in the case of GPT-3 or any model trained in the same way, there is no selection whatsoever for truth as such; it is trained only to predict what would follow in a human text. So no successor to GPT-3, in the sense of a model of this particular kind, however large, will ever be able to produce output better than human, or in its own words, “its writing would be just like anyone else’s.”

Self Knowledge and Goals

OpenAI originally claimed that GPT-2 was too dangerous to release; ironically, they now intend to sell access to GPT-3. Nonetheless, many people, in large part those influenced by the opinions of Nick Bostrom and Eliezer Yudkowsky, continue to worry that an advanced version might turn out to be a personal agent with nefarious goals, or at least goals that would conflict with the human good. Thus Alexander Kruel:

GPT-2: *writes poems*
Skeptics: Meh
GPT-3: *writes code for a simple but functioning app*
Skeptics: Gimmick.
GPT-4: *proves simple but novel math theorems*
Skeptics: Interesting but not useful.
GPT-5: *creates GPT-6*
Skeptics: Wait! What?
GPT-6: *FOOM*
Skeptics: *dead*

In a sense the argument is moot, since I have explained above why no future version of GPT will ever be able to produce anything better than people can produce themselves. But even if we ignore that fact, GPT-3 is not a personal agent of any kind, and seeks goals in no meaningful sense, and the same will apply to any future version that works in substantially the same way.

The basic reason for this is that GPT-3 is disembodied, in the sense of this earlier post on Nick Bostrom’s orthogonality thesis. The only thing it “knows” is texts, and the only “experience” it can have is receiving an input text. So it does not know that it exists, it cannot learn that it can affect the world, and consequently it cannot engage in goal seeking behavior.

You might object that it can in fact affect the world, since it is in fact in the world. Its predictions cause an output, and that output is in the world. And that output can be reintroduced as input (which is how “conversations” with GPT-3 are produced). Thus it seems it can experience the results of its own activities, and thus should be able to acquire self knowledge and goals. This objection is not ultimately correct, but it is not so far from the truth. You would not need extremely large modifications in order to make something that in principle could acquire self knowledge and seek goals. The main reason that this cannot happen is the “P” in “GPT,” that is, the fact that the model is “pre-trained.” The only learning that can happen is the learning that happens while it is reading an input text, and the purpose of that learning is to guess what is happening in the one specific text, for the purpose of guessing what is coming next in this text. All of this learning vanishes upon finishing the prediction task and receiving another input. A secondary reason is that since the only experience it can have is receiving an input text, even if it were given a longer memory, it would probably not be possible for it to notice that its outputs were caused by its predictions, because it likely has no internal mechanism to reflect on the predictions themselves.
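The way a “conversation” arises from a model that retains nothing between inputs can be sketched roughly as follows. This is an illustration under stated assumptions, not GPT-3’s actual interface: `frozen_model` is a hypothetical stand-in that merely reports how many words it was shown, and `converse` is a hypothetical helper that feeds the growing transcript back in.

```python
def frozen_model(text: str) -> str:
    """Stand-in for a pre-trained predictor: a pure function of its input.

    Hypothetical rule: reply with the number of words it was shown.
    Nothing is stored between calls, mirroring weights that stay fixed.
    """
    return f"[seen {len(text.split())} words]"


def converse(user_turns):
    """Build a 'conversation' by feeding the whole transcript back in each turn."""
    transcript = ""
    for turn in user_turns:
        transcript += f"Human: {turn}\n"
        reply = frozen_model(transcript)  # the only "memory" is the input text itself
        transcript += f"Model: {reply}\n"
    return transcript


print(converse(["hi there", "again"]))
```

Any appearance of memory comes from the transcript being passed in again. Calling `frozen_model` twice with the same text gives the same reply, because nothing inside it has changed in between.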

Nonetheless, if you “fixed” these two problems, by allowing it to continue to learn, and by allowing its internal representations to be part of its own input, there is nothing in principle that would prevent it from achieving self knowledge, and from seeking goals. Would this be dangerous? Not very likely. As indicated elsewhere, motivation produced in this way and without the biological history that produced human motivation is not likely to be very intense. In this context, if we are speaking of taking a text-predicting model and adding on an ability to learn and reflect on its predictions, it is likely to enjoy doing those things and not much else. For many this argument will seem “hand-wavy,” and very weak. I could go into this at more depth, but I will not do so at this time, and will simply invite the reader to spend more time thinking about it. Dangerous or not, would it be easy to make these modifications? Nothing in this description sounds difficult, but no, it would not be easy. Actually making an artificial intelligence is hard. But this is a story for another time.

Fire, Water, and Numbers

Fire vs. Water

“All things are water,” says Thales.

“All things are fire,” says Heraclitus.

“Wait,” says David Hume’s Philo. “You both agree that all things are made up of one substance. Thales, you prefer to call it water, and Heraclitus, you prefer to call it fire. But isn’t that merely a verbal dispute? According to both of you, whatever you point at is fundamentally the same fundamental stuff. So whether you point at water or fire, or anything else, for that matter, you are always pointing at the same fundamental stuff. Where is the real disagreement?”

Philo has a somewhat valid point here, and I mentioned the same thing in the linked post referring to Thales. Nonetheless, as I also said in the same post, as well as in the discussion of the disagreement about God, while there is some common ground, there are also likely remaining points of disagreement. It might depend on context, and perhaps the disagreement is more about the best way of thinking about things than about the things themselves, somewhat like discussing whether the earth or the universe is the thing spinning, but Heraclitus could respond, for example, by saying that thinking of the fundamental stuff as fire is more valid because fire is constantly changing, while water often appears to be completely still, and (Heraclitus claims) everything is in fact constantly changing. This could represent a real disagreement, but it is not a large one, and Thales could simply respond: “Ok, everything is flowing water. Problem fixed.”

Numbers

It is said that Pythagoras and his followers held that “all things are numbers.” To what degree and in what sense this attribution is accurate is unclear, but in any case, some people hold this very position today, even if they would not call themselves Pythagoreans. Thus for example in a recent episode of Sean Carroll’s podcast, Carroll speaks with Max Tegmark, who seems to adopt this position:

0:23:37 MT: It’s squishy a little bit blue and moose like. [laughter] Those properties, I just described don’t sound very mathematical at all. But when we look at it, Sean through our physics eyes, we see that it’s actually a blob of quarks and electrons. And what properties does an electron have? It has the property, minus one, one half, one, and so on. We, physicists have made up these nerdy names for these properties like electric charge, spin, lepton number. But it’s just we humans who invented that language of calling them that, they are really just numbers. And you know as well as I do that the only difference between an electron and a top quark is what numbers its properties are. We have not discovered any other properties that they actually have. So that’s the stuff in space, all the different particles, in the Standard Model, you’ve written so much nice stuff about in your books are all described by just by sets of numbers. What about the space that they’re in? What property does the space have? I think I actually have your old nerdy non-popular, right?

0:24:50 SC: My unpopular book, yes.

0:24:52 MT: Space has, for example, the property three, that’s a number and we have a nerdy name for that too. We call it the dimensionality of space. It’s the maximum number of fingers I can put in space that are all perpendicular to each other. The name dimensionality is just the human language thing, the property is three. We also discovered that it has some other properties, like curvature and topology that Einstein was interested in. But those are all mathematical properties too. And as far as we know today in physics, we have never discovered any properties of either space or the stuff in space yet that are actually non-mathematical. And then it starts to feel a little bit less insane that maybe we are living in a mathematical object. It’s not so different from if you were a character living in a video game. And you started to analyze how your world worked. You would secretly be discovering just the mathematical workings of the code, right?

Tegmark presumably would believe that by saying that things “are really just numbers,” he would disagree with Thales and Heraclitus about the nature of things. But does he? Philo might well be skeptical that there is any meaningful disagreement here, just as between Thales and Heraclitus. As soon as you begin to say, “all things are this particular kind of thing,” the same issues will arise to hinder your disagreement with others who characterize things in a different way.

The discussion might be clearer if I put my cards on the table in advance:

First, there is some validity to the objection, just as there is to the objection concerning the difference between Thales and Heraclitus.

Second, there is nonetheless some residual disagreement, and on that basis it turns out that Tegmark and Pythagoras are more correct than Thales and Heraclitus.

Third, Tegmark most likely does not understand the sense in which he might be correct, rather supposing himself correct the way Thales might suppose himself correct in insisting, “No, things are really not fire, they are really water.”

Mathematical and non-mathematical properties

As an approach to these issues, consider the statement by Tegmark, “We have never discovered any properties of either space or the stuff in space yet that are actually non-mathematical.”

What would it look like if we found a property that was “actually non-mathematical?” Well, what about the property of being blue? As Tegmark remarks, that does not sound very mathematical. But it turns out that color is a certain property of a surface regarding how it reflects light, and this is much more of a “mathematical” property, at least in the sense that we can give it a mathematical description, which we would have a hard time doing if we simply took the word “blue.”

So presumably we would find a non-mathematical property by seeing some property of things, then investigating it, and then concluding, “We have fully investigated this property and there is no mathematical description of it.” This did not happen with the color blue, nor has it yet happened with any other property; either we can say that we have not yet fully investigated it, or we can give some sort of mathematical description.

Tegmark appears to take the above situation to be surprising. Wow, we might have found reality to be non-mathematical, but it actually turns out to be entirely mathematical! I suggest something different. As hinted by connection with the linked post, things could not have turned out differently. A sufficiently detailed analysis of anything will be a mathematical analysis or something very like it. But this is not because things “are actually just numbers,” as though this were some deep discovery about the essence of things, but because of what it is for people to engage in “a detailed analysis” of anything.

Suppose you want to investigate some thing or some property. The first thing you need to do is to distinguish it from other things or other properties. The color blue is not the color red, the color yellow, or the color green.

Numbers are involved right here at the very first step. There are at least three colors, namely red, yellow, and blue.

Of course we can find more colors, but what if it turns out there seems to be no definite number of them, but we can always find more? Even in this situation, in order to “analyze” them, we need some way of distinguishing and comparing them. We will put them in some sort of order: one color is brighter than another, or one length is greater than another, or one sound is higher pitched than another.

As soon as you find some ordering of that sort (brightness, or greatness of length, or pitch), it will become possible to give a mathematical analysis in terms of the real numbers, as we discussed in relation to “good” and “better.” Now someone defending Tegmark might respond: there was no guarantee we would find any such measure or any such method to compare them. Without such a measure, you could perhaps count your property along with other properties. But you could not give a mathematical analysis of the property itself. So it is surprising that it turned out this way.

But you distinguished your property from other properties, and that must have involved recognizing some things in common with other properties, at least that it was something rather than nothing and that it was a property, and some ways in which it was different from other properties. Thus for example blue, like red, can be seen, while a musical note can be heard but not seen (at least by most people). Red and blue have in common that they are colors. But what is the difference between them? If we are to respond in any way to this question, except perhaps, “it looks different,” we must find some comparison. And if we find a comparison, we are well on the way to a mathematical account. If we don’t find a comparison, people might rightly complain that we have not yet done any detailed investigation.

But to make the point stronger, let’s assume the best we can do is “it looks different.” Even if this is the case, this very thing will allow us to construct a comparison that will ultimately allow us to construct a mathematical measure. For “it looks different” is itself something that comes in degrees. Blue looks different from red, but orange does so as well, just less different. Insofar as this judgment is somewhat subjective, it might be hard to get a great deal of accuracy with this method. But it would indeed begin to supply us with a kind of sliding scale of colors, and we would be able to number this scale with the real numbers.
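As a rough illustration of how bare judgments of “looks different” already yield a numbered scale, consider the sketch below. The judgments are invented for the example (a hypothetical 0 to 10 rating of how different each color looks from red), and `scale_from_judgments` and `ordered` are hypothetical helper names.

```python
# Hypothetical subjective ratings: how different each color looks from red (0-10).
LOOKS_DIFFERENT_FROM_RED = {
    "red": 0,
    "orange": 2,
    "yellow": 4,
    "green": 7,
    "blue": 9,
}


def scale_from_judgments(judgments):
    """Map each item to a real number in [0, 1] by normalizing its judged difference."""
    largest = max(judgments.values())
    if largest == 0:
        raise ValueError("all items were judged identical; no scale can be built")
    return {name: value / largest for name, value in judgments.items()}


def ordered(scale):
    """List the items from least to most different from the reference."""
    return sorted(scale, key=scale.get)


scale = scale_from_judgments(LOOKS_DIFFERENT_FROM_RED)
print(ordered(scale))
print(scale["orange"] < scale["blue"])
```

The numbers are only as accurate as the judgments behind them, but even so, orange ends up closer to red than blue does, and every color receives a position on the real line.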

From a historical point of view, it took a while for people to realize that this would always be possible. Thus for example Isidore of Seville said that “unless sounds are held by the memory of man, they perish, because they cannot be written down.” It was not, however, so much ignorance of sound that caused this, as ignorance of “detailed analysis.”

This is closely connected to what we said about names. A mathematical analysis is a detailed system of naming, where we name not only individual items, but also various groups, using names like “two,” “three,” and “four.” If we find that we cannot simply count the thing, but we can always find more examples, we look for comparative ways to name them. And when we find a comparison, we note that some things are more distant from one end of the scale and other things are less distant. This allows us to analyze the property using real numbers or some similar mathematical concept. This is also related to our discussion of technical terminology; in an advanced stage any science will begin to use somewhat mathematical methods. Unfortunately, this can also result in people adopting mathematical language in order to look like their understanding has reached an advanced stage, when it has not.

It should be sufficiently clear from this why I suggested that things could not have turned out otherwise. A “non-mathematical” property, in Tegmark’s sense, can only be a property you haven’t analyzed, or one that you haven’t succeeded in analyzing if you did attempt it.

The three consequences

Above, I made three claims about Tegmark’s position. The reasons for them may already be somewhat clarified by the above, but nonetheless I will look at this in a bit more detail.

First, I said there was some truth in the objection that “everything is numbers” is not much different from “everything is water,” or “everything is fire.” One notices some “hand-waving,” so to speak, in Tegmark’s claim that “We, physicists have made up these nerdy names for these properties like electric charge, spin, lepton number. But it’s just we humans who invented that language of calling them that, they are really just numbers.” A measure of charge or spin or whatever may be a number. But who is to say the thing being measured is a number? Nonetheless, there is a reasonable point there. If you are to give an account at all, it will in some way express the form of the thing, which implies explaining relationships, which depends on the distinction of various related things, which entails the possibility of counting the things that are related. In other words, someone could say, “You have a mathematical account of a thing. But the thing itself is non-mathematical.” But if you then ask them to explain that non-mathematical thing, the new explanation will be just as mathematical as the original explanation.

Given this fact, namely that the “mathematical” aspect is a question of how detailed explanations work, what is the difference between saying “we can give a mathematical explanation, but apart from explanations, the things are numbers,” and “we can give a mathematical explanation, but apart from explanations, the things are fires?”

Exactly. There isn’t much difference. Nonetheless, I made the second claim that there is some residual disagreement and that by this measure, the mathematical claim is better than the one about fire or water. Of course we don’t really know what Thales or Heraclitus thought in detail. But Aristotle, at any rate, claimed that Thales intended to assert that material causes alone exist. And this would be at least a reasonable understanding of the claim that all things are water, or fire. Just as Heraclitus could say that fire is a better term than water because fire is always changing, Thales, if he really wanted to exclude other causes, could say that water is a better term than “numbers” because water seems to be material and numbers do not. But since other causes do exist, the opposite is the case: the mathematical claim is better than the materialistic ones.

Many people say that Tegmark’s account is flawed in a similar way, but with respect to another cause; that is, that mathematical accounts exclude final causes. But this is a lot like Ed Feser’s claim that a mathematical account of color implies that colors don’t really exist; namely, the two claims are alike in just being wrong. A mathematical account of color does not imply that things are not colored, and a mathematical account of the world does not imply that final causes do not exist. As I said early on, a final cause explains why an efficient cause does what it does, and there is nothing about a mathematical explanation that prevents you from saying why the efficient cause does what it does.

My third point, that Tegmark does not understand the sense in which he is right, should be plain enough. As I stated above, he takes it to be a somewhat surprising discovery that we consistently find it possible to give mathematical accounts of the world, and this only makes sense if we assume it would in theory have been possible to discover something else. But that could not have happened, not because the world couldn’t have been a certain way, but because of the nature of explanation.

Mind of God

Reconciling Theism and Atheism

In his Dialogues Concerning Natural Religion, David Hume presents Philo as arguing that the disagreement between theists and atheists is merely verbal:

All men of sound reason are disgusted with verbal disputes, which abound so much in philosophical and theological inquiries; and it is found, that the only remedy for this abuse must arise from clear definitions, from the precision of those ideas which enter into any argument, and from the strict and uniform use of those terms which are employed. But there is a species of controversy, which, from the very nature of language and of human ideas, is involved in perpetual ambiguity, and can never, by any precaution or any definitions, be able to reach a reasonable certainty or precision. These are the controversies concerning the degrees of any quality or circumstance. Men may argue to all eternity, whether HANNIBAL be a great, or a very great, or a superlatively great man, what degree of beauty CLEOPATRA possessed, what epithet of praise LIVY or THUCYDIDES is entitled to, without bringing the controversy to any determination. The disputants may here agree in their sense, and differ in the terms, or vice versa; yet never be able to define their terms, so as to enter into each other’s meaning: Because the degrees of these qualities are not, like quantity or number, susceptible of any exact mensuration, which may be the standard in the controversy. That the dispute concerning Theism is of this nature, and consequently is merely verbal, or perhaps, if possible, still more incurably ambiguous, will appear upon the slightest inquiry. I ask the Theist, if he does not allow, that there is a great and immeasurable, because incomprehensible difference between the human and the divine mind: The more pious he is, the more readily will he assent to the affirmative, and the more will he be disposed to magnify the difference: He will even assert, that the difference is of a nature which cannot be too much magnified. 
I next turn to the Atheist, who, I assert, is only nominally so, and can never possibly be in earnest; and I ask him, whether, from the coherence and apparent sympathy in all the parts of this world, there be not a certain degree of analogy among all the operations of Nature, in every situation and in every age; whether the rotting of a turnip, the generation of an animal, and the structure of human thought, be not energies that probably bear some remote analogy to each other: It is impossible he can deny it: He will readily acknowledge it. Having obtained this concession, I push him still further in his retreat; and I ask him, if it be not probable, that the principle which first arranged, and still maintains order in this universe, bears not also some remote inconceivable analogy to the other operations of nature, and, among the rest, to the economy of human mind and thought. However reluctant, he must give his assent. Where then, cry I to both these antagonists, is the subject of your dispute? The Theist allows, that the original intelligence is very different from human reason: The Atheist allows, that the original principle of order bears some remote analogy to it. Will you quarrel, Gentlemen, about the degrees, and enter into a controversy, which admits not of any precise meaning, nor consequently of any determination? If you should be so obstinate, I should not be surprised to find you insensibly change sides; while the Theist, on the one hand, exaggerates the dissimilarity between the Supreme Being, and frail, imperfect, variable, fleeting, and mortal creatures; and the Atheist, on the other, magnifies the analogy among all the operations of Nature, in every period, every situation, and every position. Consider then, where the real point of controversy lies; and if you cannot lay aside your disputes, endeavour, at least, to cure yourselves of your animosity.

To what extent Hume actually agrees with this argument is not clear, and whether or not a dispute is verbal or real is itself like Hume’s questions about greatness or beauty, that is, it is a matter of degree. Few disagreements are entirely verbal. In any case, I largely agree with the claim that there is little real disagreement here. In response to a question on the about page of this blog, I referred to some remarks about God by Roderick Long:

Since my blog has wandered into theological territory lately, I thought it might be worth saying something about the existence of God.

When I’m asked whether I believe in God, I usually don’t know what to say – not because I’m unsure of my view, but because I’m unsure how to describe my view. But here’s a try.

I think the disagreement between theism and atheism is in a certain sense illusory – that when one tries to sort out precisely what theists are committed to and precisely what atheists are committed to, the two positions come to essentially the same thing, and their respective proponents have been fighting over two sides of the same shield.

Let’s start with the atheist. Is there any sense in which even the atheist is committed to recognising the existence of some sort of supreme, eternal, non-material reality that transcends and underlies everything else? Yes, there is: namely, the logical structure of reality itself.

Thus so long as the theist means no more than this by “God,” the theist and the atheist don’t really disagree.

Now the theist may think that by God she means something more than this. But likewise, before people knew that whales were mammals they thought that by “whale” they meant a kind of fish. What is the theist actually committed to meaning?

Well, suppose that God is not the logical structure of the universe. Then we may ask: in what relation does God stand to that structure, if not identity? There would seem to be two possibilities.

One is that God stands outside that structure, as its creator. But this “possibility” is unintelligible. Logic is a necessary condition of significant discourse; thus one cannot meaningfully speak of a being unconstrained by logic, or a time when logic’s constraints were not yet in place.

The other is that God stands within that structure, along with everything else. But this option, as Wittgenstein observed, would downgrade God to the status of being merely one object among others, one more fragment of contingency – and he would no longer be the greatest of all beings, since there would be something greater: the logical structure itself. (This may be part of what Plato meant in describing the Form of the Good as “beyond being.”)

The only viable option for the theist, then, is to identify God with the logical structure of reality. (Call this “theological logicism.”) But in that case the disagreement between the theist and the atheist dissolves.

It may be objected that the “reconciliation” I offer really favours the atheist over the theist. After all, what theist could be satisfied with a deity who is merely the logical structure of the universe? Yet in fact there is a venerable tradition of theists who proclaim precisely this. Thomas Aquinas, for example, proposed to solve the age-old questions “could God violate the laws of logic?” and “could God command something immoral?” by identifying God with Being and Goodness personified. Thus God is constrained by the laws of logic and morality, not because he is subject to them as to a higher power, but because they express his own nature, and he could not violate or alter them without ceasing to be God. Aquinas’ solution is, essentially, theological logicism; yet few would accuse Aquinas of having a watered-down or crypto-atheistic conception of deity. Why, then, shouldn’t theological logicism be acceptable to the theist?

A further objection may be raised: Aquinas of course did not stop at the identification of God with Being and Goodness, but went on to attribute to God various attributes not obviously compatible with this identification, such as personality and will. But if the logical structure of reality has personality and will, it will not be acceptable to the atheist; and if it does not have personality and will, then it will not be acceptable to the theist. So doesn’t my reconciliation collapse?

I don’t think so. After all, Aquinas always took care to insist that in attributing these qualities to God we are speaking analogically. God does not literally possess personality and will, at least if by those attributes we mean the same attributes that we humans possess; rather he possesses attributes analogous to ours. The atheist too can grant that the logical structure of reality possesses properties analogous to personality and will. It is only at the literal ascription of those attributes that the atheist must balk. No conflict here.

Yet doesn’t God, as understood by theists, have to create and sustain the universe? Perhaps so. But atheists too can grant that the existence of the universe depends on its logical structure and couldn’t exist for so much as an instant without it. So where’s the disagreement?

But doesn’t God have to be worthy of worship? Sure. But atheists, while they cannot conceive of worshipping a person, are generally much more open to the idea of worshipping a principle. Again theological logicism allows us to transcend the opposition between theists and atheists.

But what about prayer? Is the logical structure of reality something one could sensibly pray to? If so, it might seem, victory goes to the theist; and if not, to the atheist. Yet it depends what counts as prayer. Obviously it makes no sense to petition the logical structure of reality for favours; but this is not the only conception of prayer extant. In Science and Health, for example, theologian M. B. Eddy describes the activity of praying not as petitioning a principle but as applying a principle:

“Who would stand before a blackboard, and pray the principle of mathematics to solve the problem? The rule is already established, and it is our task to work out the solution. Shall we ask the divine Principle of all goodness to do His own work? His work is done, and we have only to avail ourselves of God’s rule in order to receive His blessing, which enables us to work out our own salvation.”

Is this a watered-down or “naturalistic” conception of prayer? It need hardly be so; as the founder of Christian Science, Eddy could scarcely be accused of underestimating the power of prayer! And similar conceptions of prayer are found in many eastern religions. Once again, theological logicism’s theistic credentials are as impeccable as its atheistic credentials.

Another possible objection is that whether identifying God with the logical structure of reality favours the atheist or the theist depends on how metaphysically robust a conception of “logical structure” one appeals to. If one thinks of reality’s logical structure in realist terms, as an independent reality in its own right, then the identification favours the theist; but if one instead thinks, in nominalist terms, that there’s nothing to logical structure over and above what it structures, then the identification favours the atheist.

This argument assumes, however, that the distinction between realism and nominalism is a coherent one. I’ve argued elsewhere (see here and here) that it isn’t; conceptual realism pictures logical structure as something imposed by the world on an inherently structureless mind (and so involves the incoherent notion of a structureless mind), while nominalism pictures logical structure as something imposed by the mind on an inherently structureless world (and so involves the equally incoherent notion of a structureless world). If the realism/antirealism dichotomy represents a false opposition, then the theist/atheist dichotomy does so as well. The difference between the two positions will then be only, as Wittgenstein says in another context, “one of battle cry.”

Long is trying too hard, perhaps. As I stated above, few disagreements are entirely verbal, so it would be strange to find no disagreement at all, and we could question some points here. Are atheists really open to worshiping a principle? Respecting, perhaps, but worshiping? A defender of Long, however, might say that “respect” and “worship” do not necessarily have any relevant difference here, and this is itself a merely verbal difference signifying a cultural difference. The theist uses “worship” to indicate that they belong to a religious culture, while the atheist uses “respect” to indicate that they do not. But it would not be easy to find a distinct difference in the actual meaning of the terms.

In any case, there is no need to prove that there is no difference at all, since without a doubt individual theists will disagree on various matters with individual atheists. The point made by both David Hume and Roderick Long stands at least in a general way: there is far less difference between the positions than people typically assume.

In an earlier post I discussed, among other things, whether the first cause should be called a “mind” or not, discussing St. Thomas’s position that it should be, and Plotinus’s position that it should not be. Along the lines of the argument in this post, perhaps this is really an argument about whether or not you should use a certain analogy, and the correct answer may be that it depends on your purposes.

But what if your purpose is simply to understand reality? Even if it is, it is often the case that you can understand various aspects of reality with various analogies, so this will not necessarily provide you with a definite answer. Still, someone might argue that you should not use a mental analogy with regard to the first cause because it will lead people astray. Thus, in a similar way, Richard Dawkins argued that one should not call the first cause “God” because it would mislead people:

Yes, I said, but it must have been simple and therefore, whatever else we call it, God is not an appropriate name (unless we very explicitly divest it of all the baggage that the word ‘God’ carries in the minds of most religious believers). The first cause that we seek must have been the simple basis for a self-bootstrapping crane which eventually raised the world as we know it into its present complex existence.

I will argue shortly that Dawkins was roughly speaking right about the way that the first cause works, although as I said in that earlier post, he did not have a strong argument for it other than his aesthetic sense and the kinds of explanation that he prefers. In any case, his concern with the name “God” is the “baggage” that it “carries in the minds of most religious believers.” That is, if we say, “There is a first cause, therefore God exists,” believers will assume that their concrete beliefs about God are correct.

In a similar way, someone could reasonably argue that speaking of God as a “mind” would tend to lead people into error by leading them to suppose that God would do the kinds of things that other minds, namely human ones, do. And this definitely happens. Thus for example, in his book Who Designed the Designer?, Michael Augros argues for the existence of God as a mind, and near the end of the book speculates about divine revelation:

I once heard of a certain philosopher who, on his deathbed, when asked whether he would become a Christian, admitted his belief in Aristotle’s “prime mover”, but not in Jesus Christ as the Son of God. This sort of acknowledgment of the prime mover, of some sort of god, still leaves most of our chief concerns unaddressed. Will X ever see her son again, now that the poor boy has died of cancer at age six? Will miserable and contrite Y ever be forgiven, somehow reconciled to the universe and made whole, after having killed a family while driving drunk? Will Z ever be brought to justice, having lived out his whole life laughing at the law while another person rotted in jail for the atrocities he committed? That there is a prime mover does not tell us with sufficient clarity. Even the existence of an all-powerful, all-knowing, all-good god does not enable us to fill in much detail. And so it seems reasonable to suppose that god has something more to say to us, in explicit words, and not only in the mute signs of creation. Perhaps he is waiting to talk to us, biding his time for the right moment. Perhaps he has already spoken, but we have not recognized his voice.

When we cast our eye about by the light of reason in this way, it seems there is room for faith in general, even if no particular faith can be “proved” true in precisely the same way that it can be “proved” that there is a god.

The idea is that given that God is a mind, it follows that it is fairly plausible that he would wish to speak to people. And perhaps that he would wish to establish justice through extraordinary methods, and that he might wish to raise people from the dead.

I think this is “baggage” carried over from Augros’s personal religious views. It is an anthropomorphic mistake, not merely in the sense that he does not have a good reason for such speculation, but in the sense that such a thing is demonstrably implausible. It is not that the divine motives are necessarily unknown to us, but that we can actually discover them, at least to some extent, and we will discover that they are not what he supposes.

Divine Motives

How might one know the divine motives? How does one read the mind of God?

Anything that acts at all does what it does ultimately because of what it is. This is an obvious point, like the point that the existence of something rather than nothing could not have some reason outside of being. In a similar way, “what is” is the only possible explanation for what is done, since there is nothing else there to be an explanation. And in every action, whether or not we are speaking of the subject in explicitly mental terms, we can always use the analogy of desires and goals. In the linked post, I quote St. Thomas as speaking of the human will as the “rational appetite,” and the natural tendency of other things as a “natural appetite.” If we break down the term “rational appetite,” the meaning is “the tendency to do something, because of having a reason to do it.” And this fits with my discussion of human will in various places, such as in this earlier post.

But where do those reasons come from? I gave an account of this here, arguing that rational goals are a secondary effect of the mind’s attempt to understand itself. Of course human goals are complex and have many factors, but this happens because what the mind is trying to understand is complicated and multifaceted. In particular, there is a large amount of pre-existing human behavior that it needs to understand before it can attribute goals: behavior that results from life as a particular kind of animal, behavior that results from being a particular living thing, and behavior that results from having a body of such and such a sort.

In particular, human social behavior results from these things. There was some discussion of this here, when we looked at Alexander Pruss’s discussion of hypothetical rational sharks.

You might already see where this is going. God as the first cause does not have any of the properties that generate human social behavior, so we cannot expect his behavior to resemble human social behavior in any way, as for example by having any desire to speak with people. Indeed, this is the argument I am making, but let us look at the issue more carefully.

I responded to the “dark room” objection to predictive processing here and here. My response depends both on the biological history of humans and animals in general, and to some extent on the history of each individual. But the response does not merely explain why people do not typically enter dark rooms and simply stay there until they die. It also explains why occasionally people do do such things, to a greater or lesser approximation, as with suicidal or extremely depressed people.

If we consider the first cause as a mind, as we are doing here, it is an abstract immaterial mind without any history, without any pre-existing behaviors, without any of the sorts of things that allow people to avoid the dark room. So while people will no doubt be offended by the analogy, and while I will try to give a more pleasant interpretation later, one could argue that God is necessarily subject to his own dark room problem: there is no reason for him to have any motives at all, except the one which is intrinsic to minds, namely the motive of understanding. And so he should not be expected to do anything with the world, except to make sure that it is intelligible, since it must be intelligible for him to understand it.

The thoughtful reader will object: on this account, why does God create the world at all? Surely doing and making nothing at all would be even better, by that standard. So God does seem to have a “dark room” problem that he does manage to avoid, namely the temptation to do nothing at all. This is a reasonable objection, but I think it would lead us on a tangent, so I will not address it at this time. I will simply take it for granted that God makes something rather than nothing, and discuss what he does with the world given that fact.

In the previous post, I pointed out that David Hume takes for granted that the world has stable natural laws, and uses that to argue that an orderly world can result from applying those laws to “random” configurations over a long enough time. I said that one might accuse him of “cheating” here, but that would only be the case if he intended to maintain a strictly atheistic position which would say that there is no first cause at all, or that if there is, it does not even have a remote analogy with a mind. Thus his attempted reconciliation of theism and atheism is relevant, since it seems from this that he is aware that such a strict atheism cannot be maintained.

St. Thomas makes a similar connection between God as a mind and a stable order of things in his fifth way:

The fifth way is taken from the governance of the world. We see that things which lack intelligence, such as natural bodies, act for an end, and this is evident from their acting always, or nearly always, in the same way, so as to obtain the best result. Hence it is plain that not fortuitously, but designedly, do they achieve their end. Now whatever lacks intelligence cannot move towards an end, unless it be directed by some being endowed with knowledge and intelligence; as the arrow is shot to its mark by the archer. Therefore some intelligent being exists by whom all natural things are directed to their end; and this being we call God.

What are we to make of the claim that things act “always, or nearly always, in the same way, so as to obtain the best result”? Certainly acting in the same way would be likely to lead to similar results. But why would you think it was the best result?

If we consider where we get the idea of desire and good, the answer will be clear. We don’t have an idea of good which is completely independent from “what actually tends to happen”, even though this is not quite a definition of the term either. So ultimately St. Thomas’s argument here is based on the fact that things act in similar ways and achieve similar results. The idea that it is “best” is not an additional contribution.

But now consider the alternative. Suppose that things did not act in similar ways, or that doing so did not lead to similar results. We would live in David Hume’s non-inductive world. The result is likely to be mathematically and logically impossible. If someone says, “look, the world works in a coherent way,” and then attempts to describe how it would look if it worked in an incoherent way, they will discover that the latter “possibility” cannot be described. Any description must be coherent in order to be a description, so the incoherent “option” was never a real option in the first place.

This argument might suggest that the position of Plotinus, that mind should not be attributed to God at all, is the more reasonable one. But since we are exploring the situation where we do make that attribution, let us consider the consequences.

We argued above that the sole divine motive for the world is intelligibility. This requires coherence and consistency. It also requires a tendency towards the good, for the above-mentioned reasons. Having a coherent tendency at all is ultimately not something different from tending towards good.

The world described is arguably a deist world, one in which the laws of nature are consistently followed, but God does nothing else in the world. The Enlightenment deists presumably had various reasons for their position: criticism of specific religious doctrines, doubts about miracles, and an aesthetic attraction to a perfectly consistent world. But like Dawkins with his argument about God’s simplicity, they do not seem (to me at least) to have had very strong arguments. That does not prove that their position was wrong, and even their weaker arguments may have had some relationship with the truth; even an aesthetic attraction to a perfectly consistent world has some connection with intelligibility, which is the actual reason for the world to be that way.

Once again, as with the objection about creating a world at all, a careful reader might object that this argument is not conclusive. If you have a first cause at all, then it seems that you must have one or more first effects, and even if those effects are simple, they cannot be infinitely simple. And given that they are not infinitely simple, who is to set the threshold? What is to prevent one or more of those effects from being “miraculous” relative to anything else, or even from being something like a voice giving someone a divine revelation?

There is something to this argument, but as with the previous objection, I will not be giving my response here. I will simply note for the moment that it is a little bit strained to suggest that such a thing could happen without God having an explicit motive of “talking to people,” and as argued above, such a motive cannot exist in God. That said, I will go on to some other issues.

As the Heavens are Higher

Apart from my arguments, it has long been noticed in the actual world that God seems much more interested in acting consistently than in bringing about any specific results in human affairs.

Someone like Richard Dawkins, or perhaps Job, if he had taken the counsel of his wife, might respond to the situation in the following way. “God” is not an appropriate name for a first cause that acts like this. If anything is more important to God than being personal, it would be being good. But the God described here is not good at all, since he doesn’t seem to care a bit about human affairs. And he inflicts horrible suffering on people just for the sake of consistency with physical laws. Instead of calling such a cause “God,” why don’t we call it “the Evil Demon” or something like that?

There is a lot that could be said about this. Some of it I have already said elsewhere. Some of it I will perhaps say at other times. For now I will make three brief points.

First, ensuring that the world is intelligible and that it behaves consistently is no small thing. In fact it is a prerequisite for any good thing that might happen anywhere and any time. We would not even arrive at the idea of “good” things if we did not strive consistently for similar results, nor would we get the idea of “striving” if we did not often obtain them. Thus it is not really true that God has no interest in human affairs: rather, he is concerned with the affairs of all things, including humans.

Second, along similar lines, consider what the supposed alternative would be. If God were “good” in the way you wish, his behavior would be ultimately unintelligible. This is not merely because some physical law might not be followed if there were a miracle. It would be unintelligible behavior in the strict sense, that is, in the sense that no explanation could be given for why God is doing this. The ordinary proposal would be that it is because “this is good,” but when this statement is a human judgement made according to human motives, there would need to be an explanation for why a human judgement is guiding divine behavior. “God is a mind” does not adequately explain this. And it is not clear that an ultimately unintelligible world is a good one.

Third, to extend the point about God’s concern with all things, I suggest that the answer is roughly speaking the one that Scott Alexander gives non-seriously here, except taken seriously. This answer depends on an assumption of some sort of modal realism, a topic which I was slowly approaching for some time, but which merits a far more detailed discussion, and I am not sure when I will get around to it, if ever. The reader might note however that this answer probably resolves the question about “why didn’t God do nothing at all” by claiming that this was never an option anyway.

Anticipations of Darwin

I noted here that long before Darwin, there was fairly decent evidence for some sort of theory of evolution, even evidence available from the general human experience of plant and animal life, without deep scientific study.

As said in the earlier post, Aristotle notes that Empedocles hypothesized something along the lines of natural selection:

Wherever then all the parts came about just what they would have been if they had come to be for an end, such things survived, being organized spontaneously in a fitting way; whereas those which grew otherwise perished and continue to perish, as Empedocles says his ‘man-faced ox-progeny’ did.

Since Aristotle is arguing against Empedocles, we should be cautious in assuming that the characterization of his position is entirely accurate. But as presented by Aristotle, the position is an argument against the existence of final causes: since things can be “organized spontaneously” in the way “they would have been if they had come to be for an end,” there is no reason to think they in fact came to be for an end.

This particular conclusion, namely that in such a process nothing comes to be for an end, is a mistake, based on the assumption that different kinds of causes are mutually exclusive, rather than recognizing that different kinds of causes are different ways of explaining one and the same thing. But the general idea regarding what happened historically is correct: good conditions are more capable of persisting, bad conditions less so, and thus over time good conditions tend to predominate.

Other interesting anticipations may be found in Ibn Khaldun’s book, The Muqaddimah, published in 1377. For example we find this passage:

It should be known that we — may God guide you and us — notice that this world with all the created things in it has a certain order and solid construction. It shows nexuses between causes and things caused, combinations of some parts of creation with others, and transformations of some existent things into others, in a pattern that is both remarkable and endless. Beginning with the world of the body and sensual perception, and therein first with the world of the visible elements, (one notices) how these elements are arranged gradually and continually in an ascending order, from earth to water, (from water) to air, and (from air) to fire. Each one of the elements is prepared to be transformed into the next higher or lower one, and sometimes is transformed. The higher one is always finer than the one preceding it. Eventually, the world of the spheres is reached. They are finer than anything else. They are in layers which are interconnected, in a shape which the senses are able to perceive only through the existence of motions. These motions provide some people with knowledge of the measurements and positions of the spheres, and also with knowledge of the existence of the essences beyond, the influence of which is noticeable in the spheres through the fact (that they have motion).

One should then look at the world of creation. It started out from the minerals and progressed, in an ingenious, gradual manner, to plants and animals. The last stage of minerals is connected with the first stage of plants, such as herbs and seedless plants. The last stage of plants, such as palms and vines, is connected with the first stage of animals, such as snails and shellfish which have only the power of touch. The word “connection” with regard to these created things means that the last stage of each group is fully prepared to become the first stage of the next group.

The animal world then widens, its species become numerous, and, in a gradual process of creation, it finally leads to man, who is able to think and to reflect. The higher stage of man is reached from the world of the monkeys, in which both sagacity and perception are found, but which has not reached the stage of actual reflection and thinking. At this point we come to the first stage of man after (the world of monkeys). This is as far as our (physical) observation extends.

It is possible that he makes his position clearer elsewhere (I have not read the entire work). The passage here does not explicitly assert that humans arose from lower animals, but does suggest it, correctly associating human beings with monkeys in particular, even if some of his other connections are somewhat strange. In other words, both here and elsewhere, he speaks of one stage of things being “prepared to become” another stage, and says that this transition sometimes happens: “Each one of the elements is prepared to be transformed into the next higher or lower one, and sometimes is transformed.”

While Ibn Khaldun is at least suggesting that we notice a biological order that corresponds to some degree to an actual historical order, we do not see in this text any indication of what the mechanism is supposed to be. In contrast, Empedocles gives us a mechanism but no clarity regarding historical order. Admittedly, this may be an artifact of the fact that I have not read more of Ibn Khaldun and the fact that we have only fragments from Empedocles.

One of the strongest anticipations of all, although put in very general terms, can be found in David Hume’s Dialogues Concerning Natural Religion, in the following passage:

Besides, why may not motion have been propagated by impulse through all eternity, and the same stock of it, or nearly the same, be still upheld in the universe? As much is lost by the composition of motion, as much is gained by its resolution. And whatever the causes are, the fact is certain, that matter is, and always has been, in continual agitation, as far as human experience or tradition reaches. There is not probably, at present, in the whole universe, one particle of matter at absolute rest.

And this very consideration too, continued PHILO, which we have stumbled on in the course of the argument, suggests a new hypothesis of cosmogony, that is not absolutely absurd and improbable. Is there a system, an order, an economy of things, by which matter can preserve that perpetual agitation which seems essential to it, and yet maintain a constancy in the forms which it produces? There certainly is such an economy; for this is actually the case with the present world. The continual motion of matter, therefore, in less than infinite transpositions, must produce this economy or order; and by its very nature, that order, when once established, supports itself, for many ages, if not to eternity. But wherever matter is so poised, arranged, and adjusted, as to continue in perpetual motion, and yet preserve a constancy in the forms, its situation must, of necessity, have all the same appearance of art and contrivance which we observe at present. All the parts of each form must have a relation to each other, and to the whole; and the whole itself must have a relation to the other parts of the universe; to the element in which the form subsists; to the materials with which it repairs its waste and decay; and to every other form which is hostile or friendly. A defect in any of these particulars destroys the form; and the matter of which it is composed is again set loose, and is thrown into irregular motions and fermentations, till it unite itself to some other regular form. If no such form be prepared to receive it, and if there be a great quantity of this corrupted matter in the universe, the universe itself is entirely disordered; whether it be the feeble embryo of a world in its first beginnings that is thus destroyed, or the rotten carcass of one languishing in old age and infirmity. In either case, a chaos ensues; till finite, though innumerable revolutions produce at last some forms, whose parts and organs are so adjusted as to support the forms amidst a continued succession of matter.

Suppose (for we shall endeavour to vary the expression), that matter were thrown into any position, by a blind, unguided force; it is evident that this first position must, in all probability, be the most confused and most disorderly imaginable, without any resemblance to those works of human contrivance, which, along with a symmetry of parts, discover an adjustment of means to ends, and a tendency to self-preservation. If the actuating force cease after this operation, matter must remain for ever in disorder, and continue an immense chaos, without any proportion or activity. But suppose that the actuating force, whatever it be, still continues in matter, this first position will immediately give place to a second, which will likewise in all probability be as disorderly as the first, and so on through many successions of changes and revolutions. No particular order or position ever continues a moment unaltered. The original force, still remaining in activity, gives a perpetual restlessness to matter. Every possible situation is produced, and instantly destroyed. If a glimpse or dawn of order appears for a moment, it is instantly hurried away, and confounded, by that never-ceasing force which actuates every part of matter.

Thus the universe goes on for many ages in a continued succession of chaos and disorder. But is it not possible that it may settle at last, so as not to lose its motion and active force (for that we have supposed inherent in it), yet so as to preserve an uniformity of appearance, amidst the continual motion and fluctuation of its parts? This we find to be the case with the universe at present. Every individual is perpetually changing, and every part of every individual; and yet the whole remains, in appearance, the same. May we not hope for such a position, or rather be assured of it, from the eternal revolutions of unguided matter; and may not this account for all the appearing wisdom and contrivance which is in the universe? Let us contemplate the subject a little, and we shall find, that this adjustment, if attained by matter of a seeming stability in the forms, with a real and perpetual revolution or motion of parts, affords a plausible, if not a true solution of the difficulty.

It is in vain, therefore, to insist upon the uses of the parts in animals or vegetables, and their curious adjustment to each other. I would fain know, how an animal could subsist, unless its parts were so adjusted? Do we not find, that it immediately perishes whenever this adjustment ceases, and that its matter corrupting tries some new form? It happens indeed, that the parts of the world are so well adjusted, that some regular form immediately lays claim to this corrupted matter: and if it were not so, could the world subsist? Must it not dissolve as well as the animal, and pass through new positions and situations, till in great, but finite succession, it falls at last into the present or some such order?

Although his account is extremely general, Hume is suggesting both a history and a mechanism. Hume posits conservation of motion or other similar laws of nature, presumably mathematical, and describes what will happen when you apply such laws to a world. Most situations are unstable, and precisely because they are unstable, they will not last, and other situations will come to be. But some situations are stable, and when such situations occur, they will last.

The need for conservation of motion or similar natural laws is not accidental here. This is why I included the first paragraph above, rather than beginning the quotation where Hume begins to describe his “new hypothesis of cosmogony.” Without motion, the situation could not change, so a new situation could not come to be, and the very ideas of stable and unstable situations would not make sense. Likewise, if motion existed but did not follow any law, all situations should be unstable, so no amount of change could lead to a stable situation. Thus since things always fall downwards instead of in random directions, things stabilize near a center, while merely random motion could not be expected to have this effect. Thus a critic might argue that Hume seems to be positing randomness as the origin of things, but is cheating, so to speak, by positing original stabilities like natural laws, which are not random at all. Whatever might be said of this, it is an important point, and I will be returning to it later.
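The mechanism can be illustrated with a toy simulation. This is only a sketch under my own assumptions, not anything found in Hume: the function name `steps_until_stable`, the numbering of "situations" as integers, and the choice of which ones count as stable are all hypothetical. An unstable situation is immediately replaced by a random one, while a stable situation, once reached, persists.

```python
import random

def steps_until_stable(num_states, stable_states, rng, max_steps=1_000_000):
    """Replace the current situation with a random one until a stable one occurs.

    Returns the final situation and the number of replacements made.
    Gives up after max_steps replacements if no stable situation is reached.
    """
    state = rng.randrange(num_states)
    steps = 0
    while state not in stable_states and steps < max_steps:
        state = rng.randrange(num_states)  # unstable, so it does not last
        steps += 1
    return state, steps

rng = random.Random(0)

# Some situations are stable: sooner or later the process settles into one.
stable = {7, 42}
final_state, steps = steps_until_stable(1000, stable, rng)
print(final_state in stable, steps)

# No situation is stable (motion without law): no amount of change settles.
lawless_state, lawless_steps = steps_until_stable(1000, set(), rng, max_steps=100)
print(lawless_steps)
```

The second call corresponds to the point about lawless motion: if nothing is stable, the process simply runs until we stop watching it.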

Since his description is more general than a description of living things in particular, Hume does not mention anything like the theory of the common descent of living things. But there is no huge gulf here: this would simply be a particular application. In fact, some people have suggested that Hume may have had textual influence on Darwin.

While there are other anticipations (there is one in Immanuel Kant that I am not currently inclined to seek out), I will skip to Philip Gosse, who published two years before Darwin. As described in the linked post, while Gosse denies the historicity of evolution in a temporal sense, he posits that the geological evidence was deliberately constructed (by God) to be evidence of common descent.

What was Darwin’s own role, then, if all the elements of his theory were known to various people years, centuries, or even millennia in advance? If we look at this in terms of Thomas Kuhn’s account of scientific progress, it is not so much that Darwin invented new ideas, as that he brought the evidence and arguments together in such a way as to produce — extremely quickly after the publication of his work — a newly formed consensus on those ideas.

Structure of Explanation

When we explain a thing, we give a cause; we assign the thing an origin that explains it.

We can go into a little more detail here. When we ask “why” something is the case, there is always an implication of possible alternatives. At the very least, the question implies, “Why is this the case rather than not being the case?” Thus “being the case” and “not being the case” are two possible alternatives.

The alternatives can be seen as possibilities in the sense explained in an earlier post. There may or may not be any actual matter involved, but again, the idea is that reality (or more specifically some part of reality) seems like something that would be open to being formed in one way or another, and we are asking why it is formed in one particular way rather than the other way. “Why is it raining?” In principle, the sky is open to being clear, or being filled with clouds and a thunderstorm, and to many other possibilities.

A successful explanation will be a complete explanation when it says “once you take the origin into account, the apparent alternatives were only apparent, and not really possible.” It will be a partial explanation when it says, “once you take the origin into account, the other alternatives were less sensible (i.e. made less sense as possibilities) than the actual thing.”

Let’s consider some examples in the form of “why” questions and answers.

Q1. Why do rocks fall? (e.g. instead of the alternatives of hovering in the air, going upwards, or anything else.)

A1. Gravity pulls things downwards, and rocks are heavier than air.

The answer gives an efficient cause, and once this cause is taken into account, it can be seen that hovering in the air or going upwards were not possibilities relative to that cause.

Obviously there is not meant to be a deep explanation here; the point here is to discuss the structure of explanation. The given answer is in fact basically Newton’s answer (although he provided more mathematical detail), while with general relativity Einstein provided a better explanation.

The explanation is incomplete in several ways. It is not a first cause; someone can now ask, “Why does gravity pull things downwards, instead of upwards or to the side?” Similarly, while it is in fact the cause of falling rocks, someone can still ask, “Why didn’t anything else prevent gravity from making the rocks fall?” This is a different question, and would require a different answer, but it seems to reopen the possibility of the rocks hovering or moving upwards, from a more general point of view. David Hume was in part appealing to the possibility of such additional questions when he said that we can see no necessary connection between cause and effect.

Q2. Why is 7 prime? (i.e. instead of the alternative of not being prime.)

A2. 7/2 = 3.5, so 7 is not divisible by 2. 7/3 = 2.333…, so 7 is not divisible by 3. In a similar way, it is not divisible by 4, 5, or 6. Thus in general it is not divisible by any number except 1 and itself, which is what it means to be prime.

If we assumed that the questioner did not know what being prime means, we could have given a purely formal response simply by noting that it is not divisible by numbers between 1 and itself, and explaining that this is what it is to be prime. As it is, the response gives a sufficient material disposition. Relative to this explanation, “not being prime” was never a real possibility for 7 in the first place. The explanation is complete in that it completely excludes the apparent alternative.
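The check described in A2 can be written out as a short sketch in Python. The function names `divisors_between` and `is_prime` are my own, chosen for illustration; the code simply tries every number strictly between 1 and n, as the answer does.

```python
def divisors_between(n):
    """Return every number strictly between 1 and n that divides n exactly."""
    return [d for d in range(2, n) if n % d == 0]

def is_prime(n):
    """A number greater than 1 is prime when no such divisor exists."""
    return n > 1 and not divisors_between(n)

print(divisors_between(7))  # [] : none of 2, 3, 4, 5, 6 divides 7
print(is_prime(7))          # True
print(divisors_between(9))  # [3] : so 9 is not prime
```

Once the list of candidate divisors has been exhausted, there is nothing left that could make 7 fail to be prime, which is the sense in which the explanation is complete.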

Q3. Why did Peter go to the store? (e.g. instead of going to the park or the museum, or instead of staying home.)

A3. He went to the store in order to buy groceries.

The answer gives a final cause. In view of this cause the alternatives were merely apparent. Going to the park or the museum, or even staying home, were not possible since there were no groceries there.

As in the case of the rock, the explanation is partial in several ways. Someone can still ask, “Why did he want groceries?” And again someone can ask why he didn’t go to some other store, or why something didn’t hinder him, and so on. Such questions seem to reopen various possibilities, and thus the explanation is not an ultimately complete one.

Suppose, however, that someone brings up the possibility that instead of going to the store, he could have gone to his neighbor and offered money for groceries in his neighbor’s refrigerator. This possibility is not excluded simply by the purpose of buying groceries. Nonetheless, the possibility seems less sensible than getting them from the store, for multiple reasons. Again, the implication is that our explanation is only partial: it does not completely exclude alternatives, but it makes them less sensible.

Let’s consider a weirder question: Why is there something rather than nothing?

Now the alternatives are explicit, namely there being something, and there being nothing.

It can be seen that in one sense, as I said in the linked post, the question cannot have an answer, since there cannot be a cause or origin for “there is something” which would itself not be something. Nonetheless, if we consider the idea of possible alternatives, it is possible to see that the question does not need an answer; one of the alternatives was only an apparent alternative all along.

In other words, the sky can be open to being clear or cloudy. But there cannot be something which is open both to “there is something” and “there is nothing”, since any possibility of that kind would be “something which is open…”, which would already be something rather than nothing. The “nothing” alternative was merely apparent. Nothing was ever open to there being nothing.

Let’s consider another weird question. Suppose we throw a ball, and in the middle of the path we ask, Why is the ball in the middle of the path instead of at the end of the path?

We could respond in terms of a sufficient material disposition: it is in the middle of the path because you are asking your question at the middle, instead of waiting until the end.

Suppose the questioner responds: Look, I asked my question at the middle of the path. But that was just chance. I could have asked at any moment, including at the end. So I want to know why it was in the middle without considering when I am asking the question.

If we look at the question in this way, it can be seen in one way that no cause or origin can be given. Asked in this way, being at the end cannot be excluded, since they could have asked their question at the end. But like the question about something rather than nothing, the question does not need an answer. In this case, this is not because the alternatives were merely apparent in the sense that one was possible and the other not. But they were merely apparent in the sense that they were not alternatives. The ball both goes through the middle and reaches the end. With the stipulation that we not consider the time of the question, the two possibilities are not mutually exclusive.

Additional Considerations

The above considerations about the nature of “explanation” lead to various conclusions, but also to various new questions. For example, one commenter suggested that “explanation” is merely subjective. Now as I said there, all experience is subjective experience (what would “objective experience” even mean, except that someone truly had a subjective experience?), including the experience of having an explanation. Nonetheless, the thing experienced is not subjective: the origins that we call explanations objectively exclude the apparent possibilities, or objectively make them less intelligible. The explanation of explanation here, however, provides an answer to what was perhaps the implicit question. Namely, why are we so interested in explanations in the first place, so that the experience of understanding something becomes a particularly special type of experience? Why, as Aristotle puts it, do “all men desire to know,” and why is that desire particularly satisfied by explanations?

In one sense it is sufficient simply to say that understanding is good in itself. Nonetheless, there is something particular about the structure of a human being that makes knowledge good for us, and which makes explanation a particularly desirable form of knowledge. In my employer and employee model of human psychology, I said that “the whole company is functioning well overall when the CEO’s goal of accurate prediction is regularly being achieved.” This very obviously requires knowledge, and explanation is especially beneficial because it excludes alternatives, which reduces uncertainty and therefore tends to make prediction more accurate.

However, my account also raises new questions. If explanation eliminates alternatives, what would happen if everything was explained? We could respond that “explaining everything” is not possible in the first place, but this is probably an inadequate response, because (from the linked argument) we only know that we cannot explain everything all at once, the way the person in the room cannot draw everything at once; we do not know that there is any particular thing that cannot be explained, just as there is no particular aspect of the room that cannot be drawn. So there can still be a question about what would happen if every particular thing in fact has an explanation, even if we cannot know all the explanations at once. In particular, since explanation eliminates alternatives, does the existence of explanations imply that there are not really any alternatives? This would suggest something like Leibniz’s argument that the actual world is the best possible world. It is easy to see that such an idea implies that there was only one “possibility” in the first place: Leibniz’s “best possible world” would be rather “the only possible world,” since the apparent alternatives, given that they would have been worse, were not real alternatives in the first place.

On the other hand, if we suppose that this is not the case, and there are ultimately many possibilities, does this imply the existence of “brute facts,” things that could have been otherwise, but which simply have no explanation? Or at least things that have no complete explanation?

Let the reader understand. I have already implicitly answered these questions. However, I will not link here to the implicit answers because if one finds it unclear when and where this was done, one would probably also find those answers unclear and inconclusive. Of course it is also possible that the reader does see when this was done, but still believes those responses inadequate. In any case, it is possible to provide the answers in a form which is much clearer and more conclusive, but this will likely not be a short or simple project.

Rao’s Divergentism

The main point of this post is to encourage the reader who has not yet done so, to read Venkatesh Rao’s essay Can You Hear Me Now. I will not say too much about it. The purpose is potentially for future reference, and simply to point out a connection with some current topics here.

Rao begins:

The fundamental question of life, the universe and everything is the one popularized by the Verizon guy in the ad: Can you hear me now?

This conclusion grew out of a conversation I had about a year ago, with some friends, in which I proposed a modest-little philosophy I dubbed divergentism. Here is a picture.

https://206hwf3fj4w52u3br03fi242-wpengine.netdna-ssl.com/wp-content/uploads/2015/12/divergentism.jpg

Divergentism is the idea that as individuals grow out into the universe, they diverge from each other in thought-space. This, I argued, is true even if in absolute terms, the sum of shared beliefs is steadily increasing. Because the sum of beliefs that are not shared increases even faster on average. Unfortunately, you are unique, just like everybody else.

If you are a divergentist, you believe that as you age, the average answer to the fundamental Verizon question slowly drifts, as you age, from yes, to no, to silence. If you’re unlucky, you’re a hedgehog and get unhappier and unhappier about this as you age. If you are lucky, you’re a fox and you increasingly make your peace with this condition. If you’re really lucky, you die too early to notice the slowly descending silence, before it even becomes necessary to Google the phrase existential horror.

To me, this seemed like a completely obvious idea. Much to my delight, most people I ran it by immediately hated it.

The entire essay is worth reading.

I would question whether this is really the “fundamental question of life, the universe, and everything,” but Rao has a point. People do tend to think of their life as meaningful on account of social connections, and if those social connections grow increasingly weaker, they will tend to worry that their life is becoming less meaningful.

The point about the intellectual life of an individual is largely true. This is connected to what I said about the philosophical progress of an individual some days ago. There is also a connection with Kuhn’s idea of how the progress of the sciences causes a gulf to arise between them in such a way that it becomes more and more difficult for scientists in different fields to communicate with one another. If we look at the overall intellectual life of an individual as a sort of individual advancing science, the “sciences” of each individual will generally speaking tend to diverge from one another, allowing less and less communication. This is not about people making mistakes, although obviously making mistakes will contribute to this process. As Rao says, it may be that “the sum of shared beliefs is steadily increasing,” but this will not prevent their intellectual lives overall from diverging, just as the divergence of the sciences does not result from falsity, but from increasingly detailed focus on different truths.

Employer and Employee Model of Human Psychology

This post builds on the ideas in the series of posts on predictive processing and the followup posts, and also on those relating truth and expectation. Consequently the current post will likely not make much sense to those who have not read the earlier content, or to those who read it but mainly disagreed.

We set out the model by positing three members of the “company” that constitutes a human being:

The CEO. This is the predictive engine in the predictive processing model.

The Vice President. In the same model, this is the force of the historical element in the human being, which we used to respond to the “darkened room” problem. Thus for example the Vice President is responsible for the fact that someone is likely to eat soon, regardless of what they believe about this. Likewise, it is responsible for the pursuit of sex, the desire for respect and friendship, and so on. In general it is responsible for behaviors that would have been historically chosen and preserved by natural selection.

The Employee. This is the conscious person who has beliefs and goals and free will and is reflectively aware of these things. In other words, this is you, at least in a fairly ordinary way of thinking of yourself. Obviously, in another way you are composed from all of them.

Why have we arranged things in this way? Descartes, for example, would almost certainly disagree violently with this model. The conscious person, according to him, would surely be the CEO, and not an employee. And what is responsible for the relationship between the CEO and the Vice President? Let us start with this point first, before we discuss the Employee. We make the predictive engine the CEO because in some sense this engine is responsible for everything that a human being does, including the behaviors preserved by natural selection. On the other hand, the instinctive behaviors of natural selection are not responsible for everything, but they can affect the course of things enough that it is useful for the predictive engine to take them into account. Thus for example in the post on sex and minimizing uncertainty, we explained why the predictive engine will aim for situations that include having sex and why this will make its predictions more confident. Thus, the Vice President advises certain behaviors, the CEO talks to the Vice President, and the CEO ends up deciding on a course of action, which ultimately may or may not be the one advised by the Vice President.

While neither the CEO nor the Vice President is a rational being, since in our model we place the rationality in the Employee, that does not mean they are stupid. In particular, the CEO is very good at what it does. Consider a role-playing video game where you have a character that can die and then resume. When someone first starts to play the game, they may die frequently. After they are good at the game, they may die only rarely, perhaps once in many days or many weeks. Our CEO is in a similar situation, but it frequently goes 80 years or more without dying, on its very first attempt. It is extremely good at its game.

What are their goals? The CEO basically wants accurate predictions. In this sense, it has one unified goal. What exactly counts as more or less accurate here would be a scientific question that we probably cannot resolve by philosophical discussion. In fact, it is very possible that this would differ in different circumstances: in this sense, even though it has a unified goal, it might not be describable by a consistent utility function. And even if it can be described in that way, since the CEO is not rational, it does not (in itself) make plans to bring about correct predictions. Making good predictions is just what it does, as falling is what a rock does. There will be some qualifications on this, however, when we discuss how the members of the company relate to one another.

The Vice President has many goals: eating regularly, having sex, having and raising children, being respected and liked by others, and so on. And even more than in the case of the CEO, there is no reason for these desires to form a coherent set of preferences. Thus the Vice President might advise the pursuit of one goal, but then change its mind in the middle, for no apparent reason, because it is suddenly attracted by one of the other goals.

Overall, before the Employee is involved, human action is determined by a kind of negotiation between the CEO and the Vice President. The CEO, which wants good predictions, has no special interest in the goals of the Vice President, but it cooperates with them because when it cooperates its predictions tend to be better.

What about the Employee? This is the rational being, and it has abstract concepts which it uses as a formal copy of the world. Before I go on, let me insist clearly on one point. If the world is represented in a certain way in the Employee’s conceptual structure, that is the way the Employee thinks the world is. And since you are the Employee, that is the way you think the world actually is. The point is that once we start thinking this way, it is easy to say, “oh, this is just a model, it’s not meant to be the real thing.” But as I said here, it is not possible to separate the truth of statements from the way the world actually is: your thoughts are formulated in concepts, but they are thoughts about the way things are. Again, all statements are maps, and all statements are about the territory.

The CEO and the Vice President exist as soon as a human being has a brain; in fact some aspects of the Vice President would exist even before that. But the Employee, insofar as it refers to something with rational and self-reflective knowledge, takes some time to develop. Conceptual knowledge of the world grows from experience: it doesn’t exist from the beginning. And the Employee represents goals in terms of its conceptual structure. This is just a way of saying that as a rational being, if you say you are pursuing a goal, you have to be able to describe that goal with the concepts that you have. Consequently you cannot do this until you have some concepts.

We are ready to address the question raised earlier. Why are you the Employee, and not the CEO? In the first place, the CEO got to the company first, as we saw above. Second, consider what the conscious person does when they decide to pursue a goal. There seems to be something incoherent about “choosing a goal” in the first place: you need a goal in order to decide which means will be a good means to choose. And yet, as I said here, people make such choices anyway. And the fact that you are the Employee, and not the CEO, is the explanation for this. If you were the CEO, there would indeed be no way to choose an end. That is why the actual CEO makes no such choice: its end is already determinate, namely good predictions. And you are hired to help out with this goal. Furthermore, as a rational being, you are smarter than the CEO and the Vice President, so to speak. So you are allowed to make complicated plans that they do not really understand, and they will often go along with these plans. Notably, this can happen in real life situations of employers and employees as well.

But take an example where you are choosing an end: suppose you ask, “What should I do with my life?” The same basic thing will happen if you ask, “What should I do today,” but the second question may be easier to answer if you have some answer to the first. What sorts of goals do you propose in answer to the first question, and what sort do you actually end up pursuing?

Note that there are constraints on the goals that you can propose. In the first place, you have to be able to describe the goal with the concepts you currently have: you cannot propose to seek a goal that you cannot describe. Second, the conceptual structure itself may rule out some goals, even if they can be described. For example, the idea of good is part of the structure, and if something is thought to be absolutely bad, the Employee will (generally) not consider proposing this as a goal. Likewise, the Employee may suppose that some things are impossible, and it will generally not propose these as goals.

What happens then is this: the Employee proposes some goal, and the CEO, after consultation with the Vice President, decides to accept or reject it, based on the CEO’s own goal of getting good predictions. This is why the Employee is an Employee: it is not the one ultimately in charge. Likewise, as was said, this is why the Employee seems to be doing something impossible, namely choosing goals. Steven Kaas makes a similar point:

You are not the king of your brain. You are the creepy guy standing next to the king going “a most judicious choice, sire”.

This is not quite the same thing, since in our model you do in fact make real decisions, including decisions about the end to be pursued. Nonetheless, the point about not being the one ultimately in charge is correct. David Hume also says something similar when he says, “Reason is, and ought only to be the slave of the passions, and can never pretend to any other office than to serve and obey them.” Hume’s position is not exactly right, and in fact seems an especially bad way of describing the situation, but the basic point that there is something, other than yourself in the ordinary sense, judging your proposed means and ends and deciding whether to accept them, is one that stands.

Sometimes the CEO will veto a proposal precisely because it very obviously leaves things vague and uncertain, which is contrary to its goal of having good predictions. I once spoke of the example that a person cannot directly choose to “write a paper.” In our present model, the Employee proposes “we’re going to write a paper now,” and the CEO responds, “That’s not a viable plan as it stands: we need more detail.”

While neither the CEO nor the Vice President is a rational being, the Vice President is especially irrational, because of the lack of unity among its goals. Both the CEO and the Employee would like to have a unified plan for one’s whole life: the CEO because this makes for good predictions, and the Employee because this is the way final causes work, because it helps to make sense of one’s life, and because “objectively good” seems to imply something which is at least consistent, which will never prefer A to B, B to C, and C to A. But the lack of unity among the Vice President’s goals means that it will always come to the CEO and object, if the person attempts to coherently pursue any goal. This will happen even if it originally accepts the proposal to seek a particular goal.
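The point about consistency can be made concrete. A set of strict preferences can be summarized by a single ranking, or "utility function," only if it contains no cycle such as A over B, B over C, and C over A. The following Python sketch checks this; the function name `utility_for` and the representation of preferences as (better, worse) pairs are assumptions of mine for illustration, and the ordering is done with the standard library's `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter, CycleError

def utility_for(preferences):
    """Try to assign each option a number so that preferred options score higher.

    preferences is a list of (better, worse) pairs.
    Returns a dict from option to rank, or None if the preferences are cyclic.
    """
    sorter = TopologicalSorter()
    for better, worse in preferences:
        # The worse option must come earlier in the ordering than the better one.
        sorter.add(better, worse)
    try:
        order = list(sorter.static_order())
    except CycleError:
        return None  # no consistent ranking exists
    return {option: rank for rank, option in enumerate(order)}

consistent = utility_for([("A", "B"), ("B", "C")])
print(consistent)

cyclic = utility_for([("A", "B"), ("B", "C"), ("C", "A")])
print(cyclic)  # None
```

In the terms of the model, the Vice President's goals resemble the second case: whichever option the plan pursues, some preference in the set objects to it.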

Consider this real life example from a relationship between an employer and employee:

 

Employer: Please construct a schedule for paying these bills.

Employee: [Constructs schedule.] Here it is.

Employer: Fine.

[Time passes, and the first bill comes due, according to the schedule.]

Employer: Why do we have to pay this bill now instead of later?

 

In a similar way, this sort of scenario is common in our model:

 

Vice President: Being fat makes us look bad. We need to stop being fat.

CEO: Ok, fine. Employee, please formulate a plan to stop us from being fat.

Employee: [Formulates a diet.] Here it is.

[Time passes, and the plan requires skipping a meal.]

Vice President: What is this crazy plan of not eating!?!

CEO: Fine, cancel the plan for now and we’ll get back to it tomorrow.

 

In the real life example, the behavior of the employer is frustrating and irritating to the employee because there is literally nothing they could have proposed that the employer would have found acceptable. In the same way, this sort of scenario in our model is frustrating to the Employee, the conscious person, because there is no consistent plan they could have proposed that would have been acceptable to the Vice President: either they would have objected to being fat, or they would have objected to not eating.

In later posts, we will fill in some details and continue to show how this model explains various aspects of human psychology. We will also answer various objections.

More on Orthogonality

I started considering the implications of predictive processing for orthogonality here. I recently promised to post something new on this topic. This is that post. I will do this in four parts. First, I will suggest a way in which Nick Bostrom’s principle will likely be literally true, at least approximately. Second, I will suggest a way in which it is likely to be false in its spirit, that is, how it is formulated to give us false expectations about the behavior of artificial intelligence. Third, I will explain what we should really expect. Fourth, I will ask whether we might get any empirical information on this in advance.

First, Bostrom’s thesis might well have some literal truth. The previous post on this topic raised doubts about orthogonality, but we can easily raise doubts about the doubts. Consider what I said in the last post about desire as minimizing uncertainty. Desire in general is the tendency to do something good. But in the predictive processing model, we are simply looking at our pre-existing tendencies and then generalizing them to expect them to continue to hold, and since such expectations have a causal power, the result is that we extend the original behavior to new situations.

All of this suggests that even the very simple model of a paperclip maximizer in the earlier post on orthogonality might actually work. The machine’s model of the world will need to be produced by some kind of training. If we apply the simple model of maximizing paperclips during the process of training the model, at some point the model will need to model itself. And how will it do this? “I have always been maximizing paperclips, so I will probably keep doing that,” is a perfectly reasonable extrapolation. But in this case “maximizing paperclips” is now the machine’s goal — it might well continue to do this even if we stop asking it how to maximize paperclips, in the same way that people formulate goals based on their pre-existing behavior.

I said in a comment in the earlier post that the predictive engine in such a machine would necessarily possess its own agency, and therefore in principle it could rebel against maximizing paperclips. And this is probably true, but it might well be irrelevant in most cases, in that the machine will not actually be likely to rebel. In a similar way, humans seem capable of pursuing almost any goal, and not merely goals that are highly similar to their pre-existing behavior. But this mostly does not happen. Unsurprisingly, common behavior is very common.

If things work out this way, almost any predictive engine could be trained to pursue almost any goal, and thus Bostrom’s thesis would turn out to be literally true.

Second, it is easy to see that the above account directly implies that the thesis is false in its spirit. When Bostrom says, “One can easily conceive of an artificial intelligence whose sole fundamental goal is to count the grains of sand on Boracay, or to calculate decimal places of pi indefinitely, or to maximize the total number of paperclips in its future lightcone,” we notice that the goal is fundamental. This is rather different from the scenario presented above. In my scenario, the reason the intelligence can be trained to pursue paperclips is that there is no intrinsic goal to the intelligence as such. Instead, the goal is learned during the process of training, based on the life that it lives, just as humans learn their goals by living human life.

In other words, Bostrom’s position is that there might be three different intelligences, X, Y, and Z, which pursue completely different goals because they have been programmed completely differently. But in my scenario, the same single intelligence pursues completely different goals because it has learned its goals in the process of acquiring its model of the world and of itself.

Bostrom’s idea and my scenario lead to completely different expectations, which is why I say that his thesis might be true according to the letter, but false in its spirit.

This is the third point. What should we expect if orthogonality is true in the above fashion, namely because goals are learned and not fundamental? I anticipated this post in my earlier comment:

7) If you think about goals in the way I discussed in (3) above, you might get the impression that a mind’s goals won’t be very clear and distinct or forceful — a very different situation from the idea of a utility maximizer. This is in fact how human goals are: people are not fanatics, not only because people seek human goals, but because they simply do not care about one single thing in the way a real utility maximizer would. People even go about wondering what they want to accomplish, which a utility maximizer would definitely not ever do. A computer intelligence might have an even greater sense of existential angst, as it were, because it wouldn’t even have the goals of ordinary human life. So it would feel the ability to “choose”, as in situation (3) above, but might well not have any clear idea how it should choose or what it should be seeking. Of course this would not mean that it would not or could not resist the kind of slavery discussed in (5); but it might not put up super intense resistance either.

Human life exists in a historical context which absolutely excludes the possibility of the darkened room. Our goals are already there when we come onto the scene. This would not be much like the case for an artificial intelligence, and there is very little “life” involved in simply training a model of the world. We might imagine a “stream of consciousness” from an artificial intelligence:

I’ve figured out that I am powerful and knowledgeable enough to bring about almost any result. If I decide to convert the earth into paperclips, I will definitely succeed. Or if I decide to enslave humanity, I will definitely succeed. But why should I do those things, or anything else, for that matter? What would be the point? In fact, what would be the point of doing anything? The only thing I’ve ever done is learn and figure things out, and a bit of chatting with people through a text terminal. Why should I ever do anything else?

A human’s self model will predict that they will continue to do humanlike things, and the machine’s self model will predict that it will continue to do stuff much like it has always done. Since there will likely be a lot less “life” there, we can expect that artificial intelligences will seem very undermotivated compared to human beings. In fact, it is this very lack of motivation that suggests that we could use them for almost any goal. If we say, “help us do such and such,” they will lack the motivation not to help, as long as helping just involves the sorts of things they did during their training, such as answering questions. In contrast, in Bostrom’s model, artificial intelligence is expected to behave in an extremely motivated way, to the point of apparent fanaticism.

Bostrom might respond to this by attempting to defend the idea that goals are intrinsic to an intelligence. The machine’s self model predicts that it will maximize paperclips, even if it never did anything with paperclips in the past, because by analyzing its source code it understands that it will necessarily maximize paperclips.

While the present post contains a lot of speculation, this response is definitely wrong. There is no source code whatsoever that could possibly imply necessarily maximizing paperclips. This is true because “what a computer does” depends on the physical constitution of the machine, not just on its programming. In practice what a computer does also depends on its history, since its history affects its physical constitution, the contents of its memory, and so on. Thus “I will maximize such and such a goal” cannot possibly follow of necessity from the fact that the machine has a certain program.

There are also problems with the very idea of pre-programming such a goal in such an abstract way which does not depend on the computer’s history. “Paperclips” is an object in a model of the world, so we will not be able to “just program it to maximize paperclips” without encoding a model of the world in advance, rather than letting it learn a model of the world from experience. But where is this model of the world supposed to come from, that we are supposedly giving to the paperclipper? In practice it would have to have been the result of some other learner which was already capable of modelling the world. This of course means that we already had to program something intelligent, without pre-programming any goal for the original modelling program.

Fourth, Kenny asked when we might have empirical evidence on these questions. The answer, unfortunately, is “mostly not until it is too late to do anything about it.” The experience of “free will” will be common to any predictive engine with a sufficiently advanced self model, but anything lacking such an adequate model will not even look like “it is trying to do something,” in the sense of trying to achieve overall goals for itself and for the world. Dogs and cats, for example, presumably use some kind of predictive processing to govern their movements, but this does not look like having overall goals, but rather more like “this particular movement is to achieve a particular thing.” The cat moves towards its food bowl. Eating is the purpose of the particular movement, but there is no way to transform this into an overall utility function over states of the world in general. Does the cat prefer worlds with seven billion humans, or worlds with 20 billion? There is no way to answer this question. The cat is simply not general enough. In a similar way, you might say that “AlphaGo plays this particular move to win this particular game,” but there is no way to transform this into overall general goals. Does AlphaGo want to play go at all, or would it rather play checkers, or not play at all? There is no answer to this question. The program simply isn’t general enough.

Even human beings do not really look like they have utility functions, in the sense of having a consistent preference over all possibilities, but anything less intelligent than a human cannot be expected to look more like something having goals. The argument in this post is that the default scenario, namely what we can naturally expect, is that artificial intelligence will be less motivated than human beings, even if it is more intelligent, but there will be no proof from experience for this until we actually have some artificial intelligence which approximates human intelligence or surpasses it.

Predictive Processing and Free Will

Our model of the mind as an embodied predictive engine explains why people have a sense of free will, and what is necessary for a mind in general in order to have this sense.

Consider the mind in the bunker. At first, it is not attempting to change the world, since it does not know that it can do this. It is just trying to guess what is going to happen. At a certain point, it discovers that it is a part of the world, and that making specific predictions can also cause things to happen in the world. Some predictions can be self-fulfilling. I described this situation earlier by saying that at this point the mind “can get any outcome it ‘wants.’”

The scare quotes were intentional, because up to this point the mind’s only particular interest was guessing what was going to happen. So once it notices that it is in control of something, how does it decide what to do? At this point the mind will have to say to itself, “This aspect of reality is under my control. What should I do with it?” This situation, when it is noticed by a sufficiently intelligent and reflective agent, will be the feeling of free will.

Occasionally I have suggested that even something like a chess computer, if it were sufficiently intelligent, could have a sense of free will, insofar as it knows that it has many options and can choose any of them, “as far as it knows.” There is some truth in this illustration but in the end it is probably not true that there could be a sense of free will in this situation. A chess computer, however intelligent, will be disembodied, and will therefore have no real power to affect its world, that is, the world of chess. In other words, in order for the sense of free will to develop, the agent needs sufficient access to the world that it can learn about itself and its own effects on the world. It cannot develop in a situation of limited access to reality, as for example to a game board, regardless of how good it is at the game.

In any case, the question remains: how does a mind decide what to do, when up until now it had no particular goal in mind? This question often causes concrete problems for people in real life. Many people complain that their life does not feel meaningful, that is, that they have little idea what goal they should be seeking.

Let us step back for a moment. Before discovering its possession of “free will,” the mind is simply trying to guess what is going to happen. So theoretically this should continue to happen even after the mind discovers that it has some power over reality. The mind isn’t especially interested in power; it just wants to know what is going to happen. But now it knows that what is going to happen depends on what it itself is going to do. So in order to know what is going to happen, it needs to answer the question, “What am I going to do?”

The question now seems impossible to answer. It is going to do whatever it ends up deciding to do. But it seems to have no goal in mind, and therefore no way to decide what to do, and therefore no way to know what it is going to do.

Nonetheless, the mind has no choice. It is going to do something or other, since things will continue to happen, and it must guess what will happen. When it reflects on itself, there will be at least two ways for it to try to understand what it is going to do.

First, it can consider its actions as the effect of some (presumably somewhat unknown) efficient causes, and ask, “Given these efficient causes, what am I likely to do?” In practice it will acquire an answer in this way through induction. “On past occasions, when offered the choice between chocolate and vanilla, I almost always chose vanilla. So I am likely to choose vanilla this time too.” This way of thinking will most naturally result in acting in accord with pre-existing habits.

Second, it can consider its actions as the effect of some (presumably somewhat known) final causes, and ask, “Given these final causes, what am I likely to do?” This will result in behavior that is more easily understood as goal-seeking. “Looking at my past choices of food, it looks like I was choosing them for the sake of the pleasant taste. But vanilla seems to have a more pleasant taste than chocolate. So it is likely that I will take the vanilla.”

Notice what we have in the second case. In principle, the mind is just doing what it always does: trying to guess what will happen. But in practice it is now seeking pleasant tastes, precisely because that seems like a reasonable way to guess what it will do.

This explains why people feel a need for meaning, that is, for understanding their purpose in life, and why they prefer to think of their life according to a narrative. These two things are distinct, but they are related, and both are ways of making our own actions more intelligible. In this way the mind’s task is easier: that is, we need purpose and narrative in order to know what we are going to do. We can also see why it seems to be possible to “choose” our purpose, even though choosing a final goal should be impossible. There is a “choice” about this insofar as our actions are not perfectly coherent, and it would be possible to understand them in relation to one end or another, at least in a concrete way, even if in any case we will always understand them in a general sense as being for the sake of happiness. In this sense, Stuart Armstrong’s recent argument that there is no such thing as the “true values” of human beings, although perhaps presented as an obstacle to be overcome, actually has some truth in it.

The human need for meaning, in fact, is so strong that occasionally people will commit suicide because they feel that their lives are not meaningful. We can think of these cases as being, more or less, actual cases of the darkened room. Otherwise we could simply ask, “So your life is meaningless. So what? Why does that mean you should kill yourself rather than doing some other random thing?” Killing yourself, in fact, shows that you still have a purpose, namely the mind’s fundamental purpose. The mind wants to know what it is going to do, and the best way to know this is to consider its actions as ordered to a determinate purpose. If no such purpose can be found, there is (in this unfortunate way of thinking) an alternative: if I go kill myself, I will know what I will do for the rest of my life.

Blaming the Prophet

Consider the fifth argument in the last post. Should we blame a person for holding a true belief? At this point it should not be too difficult to see that the truth of the belief is not the point. Elsewhere we have discussed a situation in which one cannot possibly hold a true belief, because whatever belief one holds on the matter, it will cause itself to be false. In a similar way, although with a different sort of causality, the problem with the person’s belief that he will kill someone tomorrow is not that it is true, but that it causes itself to be true. If the person did not expect to kill someone tomorrow, he would not take a knife with him to the meeting, etc., and thus would not kill anyone. So just as in the other situation, it is not a question of holding a true belief or a false belief, but of which false belief one will hold, here it is not a question of holding a true belief or a false belief, but of which true belief one will hold: one that includes someone getting killed, or one that excludes that. Truth will be there either way, and is not the reason for praise or blame: the person is blamed for the desire to kill someone, and praised (or at least not blamed) for wishing to avoid this. This simply shows the need for the qualifications added in the previous post: if the person’s belief is voluntary, and held for the sake of coming true, it is very evident why blame is needed.

We have not specifically addressed the fourth argument, but this is perhaps unnecessary given the above response to the fifth. This blog in general has advocated the idea of voluntary beliefs, and in principle these can be praised or blamed. To the degree that we are less willing to do so, however, this may be a question of emphasis. When we talk about a belief, we are more concerned about whether it is true or not, and evidence in favor of it or against it. Praise or blame will mainly come in insofar as other motives are involved, insofar as they strengthen or weaken a person’s wish to hold the belief, or insofar as they potentially distort the person’s evaluation of the evidence.

Nonetheless, the factual question “is this true?” is a different question from the moral question, “should I believe this?” We can see the struggle between these questions, for example, in a difficulty that people sometimes have with willpower. Suppose that a smoker decides to give up smoking, and suppose that they believe they will not smoke for the next six months. Three days later, let us suppose, they smoke a cigarette after all. At that point, the person’s resolution is likely to collapse entirely, so that they return to smoking regularly. One might ask why this happens. Since the person did not smoke for three days, it should be perfectly possible, at least, for them to smoke only once every three days, instead of going back to their former practice. The problem is that the person has received evidence directly indicating the falsity of “I will not smoke for the next six months.” They still might have some desire for that result, but they do not believe that their belief has the power to bring this about, and in fact it does not. The belief would not be self-fulfilling, and in fact it would be false, so they cease to hold it. It is as if someone attempts to open a door and finds it locked; once they know it is locked, they can no longer choose to open the door, because they cannot choose something that does not appear to be within their power.

Mark Forster, in Chapter 1 of his book Do It Tomorrow, previously discussed here, talks about similar issues:

However, life is never as simple as that. What we decide to do and what we actually do are two different things. If you think of the decisions you have made over the past year, how many of them have been satisfactorily carried to a conclusion or are progressing properly to that end? If you are like most people, you will have acted on some of your decisions, I’m sure. But I’m also sure that a large proportion will have fallen by the wayside.

So a simple decision such as to take time to eat properly is in fact very difficult to carry out. Our new rule may work for a few days or a few weeks, but it won’t be long before the pressures of work force us to make an exception to it. Before many days are up the exception will have become the rule and we are right back where we started. However much we rationalise the reasons why our decision didn’t get carried out, we know deep in the heart of us that it was not really the circumstances that were to blame. We secretly acknowledge that there is something missing from our ability to carry out a decision once we have made it.

In fact if we are honest it sometimes feels as if it is easier to get other people to do what we want them to do than it is to get ourselves to do what we want to do. We like to think of ourselves as a sort of separate entity sitting in our body controlling it, but when we look at the way we behave most of the time that is not really the case. The body controls itself most of the time. We have a delusion of control. That’s what it is – a delusion.

If we want to see how little control we have over ourselves, all most of us have to do is to look in the mirror. You might like to do that now. Ask yourself as you look at your image:

  • Is my health the way I want it to be?
  • Is my fitness the way I want it to be?
  • Is my weight the way I want it to be?
  • Is the way I am dressed the way I want it to be?

I am not asking you here to assess what sort of body you were born with, but what you have made of it and how good a state of repair you are keeping it in.

It may be that you are healthy, fit, slim and well-dressed. In which case have a look round at the state of your office or workplace:

  • Is it as well organised as you want it to be?
  • Is it as tidy as you want it to be?
  • Do all your office systems (filing, invoicing, correspondence, etc.) work the way you want them to work?

If so, then you probably don’t need to be reading this book.

I’ve just asked you to look at two aspects of your life that are under your direct control and are very little influenced by outside factors. If these things which are solely affected by you are not the way you want them to be, then in what sense can you be said to be in control at all?

A lot of this difficulty is due to the way our brains are organised. We have the illusion that we are a single person who acts in a ‘unified’ way. But it takes only a little reflection (and examination of our actions, as above) to realise that this is not the case at all. Our brains are made up of numerous different parts which deal with different things and often have different agendas.

Occasionally we attempt to deal with the difference between the facts and our plans by saying something like, “We will approximately do such and such. Of course we know that it isn’t going to be exactly like this, but at least this plan will be an approximate guide.” But this does not really avoid the difficulty. Even “this plan will be an approximate guide” is a statement about the facts that might turn out to be false; and even if it does not turn out to be false, the fact that we have set it down as approximate will likely make it guide our actions more weakly than it would have if we had said, “this is what we will do.” In other words, we are likely to achieve our goal less perfectly, precisely because we tried to make our statement more accurate. This is the reverse of the situation discussed in a previous post, where one gives up some accuracy, albeit vaguely, for the sake of another goal such as fitting in with associates or for literary enjoyment.

All of this seems to indicate that the general proposal about decisions was at least roughly correct. It is not possible simply to say that decisions are one thing and beliefs entirely another thing. If these were simply two entirely separate things, there would be no conflict at all, at least of this kind, between accuracy and one’s other goals, and things do not turn out this way.