What You Learned Before You Were Born

In Plato’s Meno, Socrates makes the somewhat odd claim that the ability of people to learn things without being directly told them proves that somehow they must have learned them or known them in advance. While we can reasonably assume this is wrong in a literal sense, there is some likeness of the truth here.

The whole of a human life is, generally speaking, a continuous learning process without any sudden jumps. We think of a baby’s learning as different from the learning of a child in school, and the learning of the child as rather different from the learning of an adult. But if you look at that process in itself, there may be sudden jumps in a person’s situation, such as when they graduate from school or when they get married, but there are no sudden jumps from not knowing anything about a topic or an object to suddenly knowing all about it. The learning itself happens gradually. It is the same with the manner in which it takes place; adults do indeed learn in a different manner from that in which children or infants learn. But if you ask how that manner got to be different, it certainly did so gradually, not suddenly.

But in addition to all this, there is a kind of “knowledge” that is not learned at all during one’s life, but is possessed from the beginning. From the beginning people have the ability to interact with the world in such a way that they will survive and go on to learn things. Thus from the beginning they must “know” how to do this. Now one might object that infants have no such knowledge, and that the only reason they survive is that their parents or others keep them alive. But the objection is mistaken: infants know to cry out when they are hungry or in pain, and this is part of what keeps them alive. Similarly, an infant knows to drink the milk from its mother rather than refusing it, and this is part of what keeps it alive. Similarly in regard to learning, if an infant did not know the importance of paying close attention to speech sounds, it would never learn a language.

When was this “knowledge” learned? Not in the form of a separated soul, but through the historical process of natural selection.

Selection and Artificial Intelligence

This has significant bearing on our final points in the last post. Is the learning found in AI in its current forms more like the first kind of learning above, or like the kind found in the process of natural selection?

There may be a little of both, but the vast majority of learning in such systems is very much the second kind, and not the first kind. For example, AlphaGo is trained by self-play, where moves and methods of play that tend to lose are eliminated in much the way that in the process of natural selection, manners of life that do not promote survival are eliminated. Likewise a predictive model like GPT-3 is trained, through a vast number of examples, to avoid predictions that turn out to be less accurate and to make predictions that tend to be more accurate.

Now (whether or not this is done in individual cases) you might take a model of this kind and fine tune it based on incoming data, perhaps even in real time, which is a bit more like the first kind of learning. But in our actual situation, the majority of what is known by our AI systems is based on the second kind of learning.

This state of affairs should not be surprising, because the first kind of learning described above is impossible without being preceded by the second. The truth in Socrates’ claim is that if a system does not already “know” how to learn, of course it will not learn anything.

Intelligence and Universality

Elsewhere I have mentioned the argument, often made in great annoyance, that people who take some new accomplishment in AI or machine learning and proclaim that it is “not real intelligence” or that the algorithm is “still fundamentally stupid”, and other things of that kind, are “moving the goalposts,” especially since in many such cases, there really were people who said that something that could do such a thing would be intelligent.

As I said in the linked post, however, there is no problem of moving goalposts unless you originally had them in the wrong place. And attaching intelligence to any particular accomplishment, such as “playing chess well” or even “producing a sensible sounding text,” or anything else with that sort of particularity, is misplacing the goalposts. As we might remember, what excited Francis Bacon was the thought that there were no clear limits, at all, on what science (namely the working out of intelligence) might accomplish. In fact he seems to have believed that there were no limits at all, which is false. Nonetheless, he was correct that those limits are extremely vague, and that much that many assumed to be impossible would turn out to be possible. In other words, human intelligence does not have very meaningful limits on what it can accomplish, and artificial intelligence will be real intelligence (in the same sense that artificial diamonds can be real diamonds) when artificial intelligence has no meaningful limits on what it can accomplish.

I have no time for playing games with objections like, “but humans can’t multiply two 1000 digit numbers in one second, and no amount of thought will give them that ability.” If you have questions of this kind, please answer them for yourself, and if you can’t, sit still and think about it until you can. I have full confidence in your ability to find the answers, given sufficient thought.

What is needed for “real intelligence,” then, is universality. In a sense everyone knew all along that this was the right place for the goalposts. Even if someone said “if a machine can play chess, it will be intelligent,” they almost certainly meant that their expectation was that a machine that could play chess would have no clear limits on what it could accomplish. If you could have told them for a fact that the future would be different, that a machine would be able to play chess but that it (that particular machine) would never be able to do anything else, they would have conceded that the machine would not be intelligent.

Training and Universality

Current AI systems are not universal, and clearly have no ability whatsoever to become universal, without first undergoing deep changes in those systems, changes that would have to be initiated by human beings. What is missing?

The problem is the training data. The process of evolution produced the general ability to learn by using the world itself as the training data. In contrast, our AI systems take a very small subset of the world (like a large set of Go games or a large set of internet text), and train a learning system on that subset. Why take a subset? Because the world is too large to fit into a computer, especially if that computer is a small part of the world.

This suggests that going from the current situation to “artificial but real” intelligence is not merely a question of making things better and better little by little. There is a more fundamental problem that would have to be overcome, and it won’t be overcome simply by larger training sets, by faster computing, and things of this kind. This does not mean that the problem is impossible, but it may turn out to be much more difficult than people expected. For example, if there is no direct solution, people might try to create Robin Hanson’s “ems”, where one would more or less copy the learning achieved by natural selection. Or even if that is not done directly, a better understanding of what it means to “know how to learn,” might lead to a solution, although probably one that would not depend on training a model on massive amounts of data.

What happens if there is no solution, or no solution is found? At times people will object to the possibility of such a situation along these lines: “this situation is incoherent, since obviously people will be able to keep making better and better machine learning systems, so sooner or later they will be just as good as human intelligence.” But in fact the situation is not incoherent; if it happened, various types of AI system would approach various asymptotes, and this is entirely coherent. We can already see this in the case of GPT-3, where as I noted, there is an absolute bound on its future performance. In general such bounds in their realistic form are more restrictive than their in-principle form; I do not actually expect some successor to GPT-3 to write sensible full length books. Note however that even if this happened (as long as the content itself was not fundamentally better than what humans have done) I would not be “moving the goalposts”; I do not expect that to happen, but its happening would not imply any fundamental difference, since this is still within the “absolute” bounds that we have discussed. In contrast, if a successor to GPT-3 published a cure for cancer, this would prove that I had made some mistake on the level of principle.

Some Remarks on GPT-N

At the end of May, OpenAI published a paper on GPT-3, a language model which is a successor to their previous version, GPT-2. While quite impressive, the reaction from many people interested in artificial intelligence has been seriously exaggerated. Sam Altman, OpenAI’s CEO, has said as much himself:

The GPT-3 hype is way too much. It’s impressive (thanks for the nice compliments!) but it still has serious weaknesses and sometimes makes very silly mistakes. AI is going to change the world, but GPT-3 is just a very early glimpse. We have a lot still to figure out.

I used “GPT-N” in the title here because most of the comments I intend to make are almost completely general, and will apply to any future version that uses sufficiently similar methods.

What it does

GPT-3 is a predictive language model, that is, given an input text it tries to predict what would come next, much in the way that if you read the first few words of this sentence with the rest covered up, you might try to guess what would be likely to come next. To the degree that it does this well, it can be used to generate text from a “prompt,” that is, we give it something like a few words or a few sentences, and then add whatever it predicts should come next. For example, let’s take this very blog post and see what GPT-3 would like to say:

What it doesn’t do

While GPT-3 does seem to be able to generate some pretty interesting results, there are several limitations that need to be taken into account when using it.

First and foremost, and most importantly, it can’t do anything without a large amount of input data. If you want it to write like “a real human,” you need to give it a lot of real human writing. For most people, this means copying and pasting a lot. And while the program is able to read through that and get a feel for the way humans communicate, you can’t exactly use it to write essays or research papers. The best you could do is use it as a “fill in the blank” tool to write stories, and that’s not even very impressive.

While the program does learn from what it reads and is quite good at predicting words and phrases based on what has already been written, this method isn’t very effective at producing realistic prose. The best you could hope for is something like the “Deep Writing Machine” Twitter account, which spits out disconnected phrases in an ominous, but very bland voice.

In addition, the model is limited only to language. It does not understand context or human thought at all, so it has no way of tying anything together. You could use it to generate a massive amount of backstory and other material for a game, but that’s about it.

Finally, the limitations in writing are only reinforced by the limitations in reading. Even with a large library to draw on, the program is only as good as the parameters set for it. Even if you set it to the greatest writers mankind has ever known, without any special parameters, its writing would be just like anyone else’s.

The Model

GPT-3 consists of several layers. The first layer is a “memory network” that involves the program remembering previously entered data and using it when appropriate (i.e. it remembers commonly misspelled words and frequently used words). The next layer is the reasoning network, which involves common sense logic (i.e. if A, then B). The third is the repetition network, which involves pulling previously used material from memory and using it to create new combinations (i.e. using previously used words in new orders).

I added the bold formatting; the rest is as produced by the model. This was also done in one run, without repetitions. This is an important qualification, since many examples on the internet have been produced by deleting something produced by the model and forcing it to generate something new until something sensible resulted. Note that the model does not seem to have understood my line, “let’s take this very blog post and see what GPT-3 would like to say.” That is, rather than trying to “say” anything, it attempted to continue the blog post in the way I might have continued it without the block quote.

Truth vs Probability of Text

If we interpret the above text from GPT-3 “charitably”, much of it is true or close to true. But I use scare quotes here because when we speak of interpreting human speech charitably, we are assuming that someone was trying to speak the truth, and so we think, “What would they have meant if they were trying to say something true?” The situation is different here, because GPT-3 has no intention of producing truth, nor of avoiding it. Insofar as there is any intention, the intention is to produce the text which would be likely to come after the input text; in this case, as the input text was the beginning of this blog post, the intention was to produce the text that would likely follow in such a post. Note that there is an indirect relationship with truth, which explains why there is any truth at all in GPT-3’s remarks. If the input text is true, it is at least somewhat likely that what would follow would also be true, so if the model is good at guessing what would be likely to follow, it will be likely to produce something true in such cases. But it is just as easy to convince it to produce something false, simply by providing an input text that would be likely to be followed by something false.
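The point about likelihood versus truth can be illustrated with a toy predictor. The following is only a minimal sketch, not a description of how GPT-3 works: it assumes a tiny made-up corpus and a lookup table keyed on the whole preceding text, and the names `train`, `continue_text`, and `corpus` are hypothetical.

```python
from collections import Counter, defaultdict

def train(corpus):
    """For every prefix seen in the corpus, count which word came next."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for i in range(1, len(words)):
            follows[tuple(words[:i])][words[i]] += 1
    return follows

def continue_text(follows, prompt, max_words=5):
    """Greedily append the most frequent next word; truth plays no role."""
    words = prompt.split()
    for _ in range(max_words):
        candidates = follows.get(tuple(words))
        if not candidates:
            break  # this prefix never appeared in the corpus
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

# Hypothetical training text: one topic where the common claim is true,
# and one where the common claim is false.
corpus = [
    "the sky is blue",
    "the sky is blue",
    "the sky is green",
    "the moon is cheese",
    "the moon is cheese",
    "the moon is rock",
]

model = train(corpus)
print(continue_text(model, "the sky"))
print(continue_text(model, "the moon"))
```

With this corpus the first prompt is continued with the true statement and the second with the false one, and for the same reason in both cases: each is simply the most frequent continuation in the training text.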

This results in an absolute upper limit on the quality of the output of a model of this kind, including any successor version, as long as the model works by predicting the probability of the following text. Namely, its best output cannot be substantially better than the best content in its training data, which in this version is a large quantity of texts from the internet. The reason for this limitation is clear; to the degree that the model has any intention at all, the intention is to reflect the training data, not to surpass it. As an example, consider the difference between DeepMind’s AlphaGo and AlphaGo Zero. AlphaGo Zero is a better Go player than the original AlphaGo, and this is largely because the original is trained on human play, while AlphaGo Zero is trained from scratch on self play. In other words, the original version is to some extent predicting “what would a Go player play in this situation,” which is not the same as predicting “what move would win in this situation.”

Now I will predict (and perhaps even GPT-3 could predict) that many people will want to jump in and say, “Great. That shows you are wrong. Even the original AlphaGo plays Go much better than a human. So there is no reason that an advanced version of GPT-3 could not be better than humans at saying things that are true.”

The difference, of course, is that AlphaGo was trained in two ways, first on predicting what move would be likely in a human game, and second on what would be likely to win, based on its experience during self play. If you had trained the model only on predicting what would follow in human games, without the second aspect, the model would not have resulted in play that substantially improved upon human performance. But in the case of GPT-3 or any model trained in the same way, there is no selection whatsoever for truth as such; it is trained only to predict what would follow in a human text. So no successor to GPT-3, in the sense of a model of this particular kind, however large, will ever be able to produce output better than human, or in its own words, “its writing would be just like anyone else’s.”
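The difference between the two training signals can be made concrete with a toy example. This is a sketch under loud assumptions: the game records are invented, the two functions are hypothetical, and nothing here resembles the actual AlphaGo training procedure. It shows only that “what would a human play” and “what tends to win” can pick different moves from the same data.

```python
from collections import Counter, defaultdict

# Hypothetical records of (move, won) from human games in a single position.
human_games = [
    ("A", False), ("A", False), ("A", True), ("A", False),
    ("B", True), ("B", True),
]

def imitation_choice(games):
    """Predict human play: pick the move humans chose most often."""
    counts = Counter(move for move, _ in games)
    return counts.most_common(1)[0][0]

def selection_choice(games):
    """Select for winning: pick the move with the best observed win rate."""
    plays = defaultdict(int)
    wins = defaultdict(int)
    for move, won in games:
        plays[move] += 1
        wins[move] += int(won)
    return max(plays, key=lambda move: wins[move] / plays[move])

print(imitation_choice(human_games))  # the popular move
print(selection_choice(human_games))  # the move that wins more often
```

The second function needs the win/loss column in order to work at all. A text predictor trained only on what follows in human text has no corresponding column for truth, which is the sense in which there is no selection for truth as such.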

Self Knowledge and Goals

OpenAI originally claimed that GPT-2 was too dangerous to release; ironically, they now intend to sell access to GPT-3. Nonetheless, many people, in large part those influenced by the opinions of Nick Bostrom and Eliezer Yudkowsky, continue to worry that an advanced version might turn out to be a personal agent with nefarious goals, or at least goals that would conflict with the human good. Thus Alexander Kruel:

GPT-2: *writes poems*
Skeptics: Meh
GPT-3: *writes code for a simple but functioning app*
Skeptics: Gimmick.
GPT-4: *proves simple but novel math theorems*
Skeptics: Interesting but not useful.
GPT-5: *creates GPT-6*
Skeptics: Wait! What?
GPT-6: *FOOM*
Skeptics: *dead*

In a sense the argument is moot, since I have explained above why no future version of GPT will ever be able to produce anything better than people can produce themselves. But even if we ignore that fact, GPT-3 is not a personal agent of any kind, and seeks goals in no meaningful sense, and the same will apply to any future version that works in substantially the same way.

The basic reason for this is that GPT-3 is disembodied, in the sense of this earlier post on Nick Bostrom’s orthogonality thesis. The only thing it “knows” is texts, and the only “experience” it can have is receiving an input text. So it does not know that it exists, it cannot learn that it can affect the world, and consequently it cannot engage in goal seeking behavior.

You might object that it can in fact affect the world, since it is in fact in the world. Its predictions cause an output, and that output is in the world. And that output can be reintroduced as input (which is how “conversations” with GPT-3 are produced). Thus it seems it can experience the results of its own activities, and thus should be able to acquire self knowledge and goals. This objection is not ultimately correct, but it is not so far from the truth. You would not need extremely large modifications in order to make something that in principle could acquire self knowledge and seek goals. The main reason that this cannot happen is the “P” in “GPT,” that is, the fact that the model is “pre-trained.” The only learning that can happen is the learning that happens while it is reading an input text, and the purpose of that learning is to guess what is happening in the one specific text, for the purpose of guessing what is coming next in this text. All of this learning vanishes upon finishing the prediction task and receiving another input. A secondary reason is that since the only experience it can have is receiving an input text, even if it were given a longer memory, it would probably not be possible for it to notice that its outputs were caused by its predictions, because it likely has no internal mechanism to reflect on the predictions themselves.

Nonetheless, if you “fixed” these two problems, by allowing it to continue to learn, and by allowing its internal representations to be part of its own input, there is nothing in principle that would prevent it from achieving self knowledge, and from seeking goals. Would this be dangerous? Not very likely. As indicated elsewhere, motivation produced in this way and without the biological history that produced human motivation is not likely to be very intense. In this context, if we are speaking of taking a text-predicting model and adding on an ability to learn and reflect on its predictions, it is likely to enjoy doing those things and not much else. For many this argument will seem “hand-wavy,” and very weak. I could go into this at more depth, but I will not do so at this time, and will simply invite the reader to spend more time thinking about it. Dangerous or not, would it be easy to make these modifications? Nothing in this description sounds difficult, but no, it would not be easy. Actually making an artificial intelligence is hard. But this is a story for another time.

Fire, Water, and Numbers

Fire vs. Water

“All things are water,” says Thales.

“All things are fire,” says Heraclitus.

“Wait,” says David Hume’s Philo. “You both agree that all things are made up of one substance. Thales, you prefer to call it water, and Heraclitus, you prefer to call it fire. But isn’t that merely a verbal dispute? According to both of you, whatever you point at is fundamentally the same fundamental stuff. So whether you point at water or fire, or anything else, for that matter, you are always pointing at the same fundamental stuff. Where is the real disagreement?”

Philo has a somewhat valid point here, and I mentioned the same thing in the linked post referring to Thales. Nonetheless, as I also said in the same post, as well as in the discussion of the disagreement about God, while there is some common ground, there are also likely remaining points of disagreement. It might depend on context, and perhaps the disagreement is more about the best way of thinking about things than about the things themselves, somewhat like discussing whether the earth or the universe is the thing spinning, but Heraclitus could respond, for example, by saying that thinking of the fundamental stuff as fire is more valid because fire is constantly changing, while water often appears to be completely still, and (Heraclitus claims) everything is in fact constantly changing. This could represent a real disagreement, but it is not a large one, and Thales could simply respond: “Ok, everything is flowing water. Problem fixed.”

Numbers

It is said that Pythagoras and his followers held that “all things are numbers.” To what degree and in what sense this attribution is accurate is unclear, but in any case, some people hold this very position today, even if they would not call themselves Pythagoreans. Thus for example in a recent episode of Sean Carroll’s podcast, Carroll speaks with Max Tegmark, who seems to adopt this position:

0:23:37 MT: It’s squishy a little bit blue and moose like. [laughter] Those properties, I just described don’t sound very mathematical at all. But when we look at it, Sean through our physics eyes, we see that it’s actually a blob of quarks and electrons. And what properties does an electron have? It has the property, minus one, one half, one, and so on. We, physicists have made up these nerdy names for these properties like electric charge, spin, lepton number. But it’s just we humans who invented that language of calling them that, they are really just numbers. And you know as well as I do that the only difference between an electron and a top quark is what numbers its properties are. We have not discovered any other properties that they actually have. So that’s the stuff in space, all the different particles, in the Standard Model, you’ve written so much nice stuff about in your books are all described by just by sets of numbers. What about the space that they’re in? What property does the space have? I think I actually have your old nerdy non-popular, right?

0:24:50 SC: My unpopular book, yes.

0:24:52 MT: Space has, for example, the property three, that’s a number and we have a nerdy name for that too. We call it the dimensionality of space. It’s the maximum number of fingers I can put in space that are all perpendicular to each other. The name dimensionality is just the human language thing, the property is three. We also discovered that it has some other properties, like curvature and topology that Einstein was interested in. But those are all mathematical properties too. And as far as we know today in physics, we have never discovered any properties of either space or the stuff in space yet that are actually non-mathematical. And then it starts to feel a little bit less insane that maybe we are living in a mathematical object. It’s not so different from if you were a character living in a video game. And you started to analyze how your world worked. You would secretly be discovering just the mathematical workings of the code, right?

Tegmark presumably would believe that by saying that things “are really just numbers,” he would disagree with Thales and Heraclitus about the nature of things. But does he? Philo might well be skeptical that there is any meaningful disagreement here, just as between Thales and Heraclitus. As soon as you begin to say, “all things are this particular kind of thing,” the same issues will arise to hinder your disagreement with others who characterize things in a different way.

The discussion might be clearer if I put my cards on the table in advance:

First, there is some validity to the objection, just as there is to the objection concerning the difference between Thales and Heraclitus.

Second, there is nonetheless some residual disagreement, and on that basis it turns out that Tegmark and Pythagoras are more correct than Thales and Heraclitus.

Third, Tegmark most likely does not understand the sense in which he might be correct, rather supposing himself correct the way Thales might suppose himself correct in insisting, “No, things are really not fire, they are really water.”

Mathematical and non-mathematical properties

As an approach to these issues, consider the statement by Tegmark, “We have never discovered any properties of either space or the stuff in space yet that are actually non-mathematical.”

What would it look like if we found a property that was “actually non-mathematical?” Well, what about the property of being blue? As Tegmark remarks, that does not sound very mathematical. But it turns out that color is a certain property of a surface regarding how it reflects light, and this is much more of a “mathematical” property, at least in the sense that we can give it a mathematical description, which we would have a hard time doing if we simply took the word “blue.”

So presumably we would find a non-mathematical property by seeing some property of things, then investigating it, and then concluding, “We have fully investigated this property and there is no mathematical description of it.” This did not happen with the color blue, nor has it yet happened with any other property; either we can say that we have not yet fully investigated it, or we can give some sort of mathematical description.

Tegmark appears to take the above situation to be surprising. Wow, we might have found reality to be non-mathematical, but it actually turns out to be entirely mathematical! I suggest something different. As hinted by connection with the linked post, things could not have turned out differently. A sufficiently detailed analysis of anything will be a mathematical analysis or something very like it. But this is not because things “are actually just numbers,” as though this were some deep discovery about the essence of things, but because of what it is for people to engage in “a detailed analysis” of anything.

Suppose you want to investigate some thing or some property. The first thing you need to do is to distinguish it from other things or other properties. The color blue is not the color red, the color yellow, or the color green.

Numbers are involved right here at the very first step. There are at least three colors, namely red, yellow, and blue.

Of course we can find more colors, but what if it turns out there seems to be no definite number of them, but we can always find more? Even in this situation, in order to “analyze” them, we need some way of distinguishing and comparing them. We will put them in some sort of order: one color is brighter than another, or one length is greater than another, or one sound is higher pitched than another.

As soon as you find some ordering of that sort (brightness, or greatness of length, or pitch), it will become possible to give a mathematical analysis in terms of the real numbers, as we discussed in relation to “good” and “better.” Now someone defending Tegmark might respond: there was no guarantee we would find any such measure or any such method to compare them. Without such a measure, you could perhaps count your property along with other properties. But you could not give a mathematical analysis of the property itself. So it is surprising that it turned out this way.

But you distinguished your property from other properties, and that must have involved recognizing some things in common with other properties, at least that it was something rather than nothing and that it was a property, and some ways in which it was different from other properties. Thus for example blue, like red, can be seen, while a musical note can be heard but not seen (at least by most people). Red and blue have in common that they are colors. But what is the difference between them? If we are to respond in any way to this question, except perhaps, “it looks different,” we must find some comparison. And if we find a comparison, we are well on the way to a mathematical account. If we don’t find a comparison, people might rightly complain that we have not yet done any detailed investigation.

But to make the point stronger, let’s assume the best we can do is “it looks different.” Even if this is the case, this very thing will allow us to construct a comparison that will ultimately allow us to construct a mathematical measure. For “it looks different” is itself something that comes in degrees. Blue looks different from red, but orange does so as well, just less different. Insofar as this judgment is somewhat subjective, it might be hard to get a great deal of accuracy with this method. But it would indeed begin to supply us with a kind of sliding scale of colors, and we would be able to number this scale with the real numbers.
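As a rough illustration of how “it looks different” already yields a numbered scale, here is a minimal sketch. The judgments are invented for the example, and the helper name `to_scale` is hypothetical; the only assumption is that someone can rate how different each color looks from red, however roughly.

```python
# Hypothetical subjective ratings of how different each color looks from red.
# The particular numbers are made up; only their order and spacing matter.
dissimilarity_from_red = {
    "red": 0,
    "orange": 2,
    "yellow": 4,
    "green": 7,
    "blue": 9,
}

def to_scale(judgments):
    """Rescale raw judgments to real numbers between 0 and 1.

    Assumes at least two distinct ratings, so the range is not zero.
    """
    lo = min(judgments.values())
    hi = max(judgments.values())
    return {name: (value - lo) / (hi - lo) for name, value in judgments.items()}

scale = to_scale(dissimilarity_from_red)
ordered = sorted(scale, key=scale.get)

print(ordered)
print(scale["orange"] < scale["blue"])  # orange is "less different" from red
```

The resulting numbers are only as accurate as the judgments behind them, but the colors now sit at positions on the real line, which is all the argument requires.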

From a historical point of view, it took a while for people to realize that this would always be possible. Thus for example Isidore of Seville said that “unless sounds are held by the memory of man, they perish, because they cannot be written down.” It was not, however, so much ignorance of sound that caused this, as ignorance of “detailed analysis.”

This is closely connected to what we said about names. A mathematical analysis is a detailed system of naming, where we name not only individual items, but also various groups, using names like “two,” “three,” and “four.” If we find that we cannot simply count the thing, but we can always find more examples, we look for comparative ways to name them. And when we find a comparison, we note that some things are more distant from one end of the scale and other things are less distant. This allows us to analyze the property using real numbers or some similar mathematical concept. This is also related to our discussion of technical terminology; in an advanced stage any science will begin to use somewhat mathematical methods. Unfortunately, this can also result in people adopting mathematical language in order to look like their understanding has reached an advanced stage, when it has not.

It should be sufficiently clear from this why I suggested that things could not have turned out otherwise. A “non-mathematical” property, in Tegmark’s sense, can only be a property you haven’t analyzed, or one that you haven’t succeeded in analyzing if you did attempt it.

The three consequences

Above, I made three claims about Tegmark’s position. The reasons for them may already be somewhat clarified by the above, but nonetheless I will look at this in a bit more detail.

First, I said there was some truth in the objection that “everything is numbers” is not much different from “everything is water,” or “everything is fire.” One notices some “hand-waving,” so to speak, in Tegmark’s claim that “We, physicists have made up these nerdy names for these properties like electric charge, spin, lepton number. But it’s just we humans who invented that language of calling them that, they are really just numbers.” A measure of charge or spin or whatever may be a number. But who is to say the thing being measured is a number? Nonetheless, there is a reasonable point there. If you are to give an account at all, it will in some way express the form of the thing, which implies explaining relationships, which depends on the distinction of various related things, which entails the possibility of counting the things that are related. In other words, someone could say, “You have a mathematical account of a thing. But the thing itself is non-mathematical.” But if you then ask them to explain that non-mathematical thing, the new explanation will be just as mathematical as the original explanation.

Given this fact, namely that the “mathematical” aspect is a question of how detailed explanations work, what is the difference between saying “we can give a mathematical explanation, but apart from explanations, the things are numbers,” and “we can give a mathematical explanation, but apart from explanations, the things are fires?”

Exactly. There isn’t much difference. Nonetheless, I made the second claim that there is some residual disagreement and that by this measure, the mathematical claim is better than the one about fire or water. Of course we don’t really know what Thales or Heraclitus thought in detail. But Aristotle, at any rate, claimed that Thales intended to assert that material causes alone exist. And this would be at least a reasonable understanding of the claim that all things are water, or fire. Just as Heraclitus could say that fire is a better term than water because fire is always changing, Thales, if he really wanted to exclude other causes, could say that water is a better term than “numbers” because water seems to be material and numbers do not. But since other causes do exist, the opposite is the case: the mathematical claim is better than the materialistic ones.

Many people say that Tegmark’s account is flawed in a similar way, but with respect to another cause; that is, that mathematical accounts exclude final causes. But this is a lot like Ed Feser’s claim that a mathematical account of color implies that colors don’t really exist; namely, the two claims are alike in just being wrong. A mathematical account of color does not imply that things are not colored, and a mathematical account of the world does not imply that final causes do not exist. As I said early on, a final cause explains why an efficient cause does what it does, and there is nothing about a mathematical explanation that prevents you from saying why the efficient cause does what it does.

My third point, that Tegmark does not understand the sense in which he is right, should be plain enough. As I stated above, he takes it to be a somewhat surprising discovery that we consistently find it possible to give mathematical accounts of the world, and this only makes sense if we assume it would in theory have been possible to discover something else. But that could not have happened, not because the world couldn’t have been a certain way, but because of the nature of explanation.

The Power of a Name

Fairy tales and other stories occasionally suggest the idea that a name gives some kind of power over the thing named, or at least that one’s problems concerning a thing may be solved by knowing its name, as in the story of Rumpelstiltskin. There is perhaps a similar suggestion in Revelation 2:17, “Whoever has ears, let them hear what the Spirit says to the churches. To the one who is victorious, I will give some of the hidden manna. I will also give that person a white stone with a new name written on it, known only to the one who receives it.” The secrecy of the new name may indicate (among other things) that others will have no power over that person.

There is more truth in this idea than one might assume without much thought. For example, anonymous authors do not want to be “doxxed” because knowing the name of the author really does give some power in relation to them which is not had without the knowledge of their name. Likewise, as a blogger, occasionally I want to cite something, but cannot remember the name of the author or article where the statement is made. Even if I remember the content fairly clearly, lacking the memory of the name makes finding the content far more difficult, while on the other hand, knowing the name gives me the power of finding the content much more easily.

But let us look a bit more deeply into this. Hilary Lawson, whose position was somewhat discussed here, has a discussion along these lines in Part II of his book, Closure: A Story of Everything. Since he denies that language truly refers to the world at all, as I mentioned in the linked post on his position, it is important to him that language has other effects, and in particular has practical goals. He says in chapter 4:

In order to understand the mechanism of practical linguistic closure consider an example where a proficient speaker of English comes across a new word. Suppose that we are visiting a zoo with a friend. We stand outside a cage and our friend says: ‘An aasvogel.’ …

It might appear at first from this example that nothing has been added by the realisation of linguistic closure. The sound ‘aasvogel’ still sounds the same, the image of the bird still looks the same. So what has changed? The sensory closures on either side may not have changed, but a new closure has been realised. A new closure which is in addition to the prior available closures and which enables intervention which was not possible previously. For example, we now have a means of picking out this particular bird in the zoo because the meaning that has been realised will have identified a something in virtue of which this bird is an aasvogel and which thus enables us to distinguish it from others. As a result there will be many consequences for how we might be able to intervene.

The important point here is simply that naming something, even before taking any additional steps, immediately gives one the ability to do various practical things that one could not previously do. In a passage by Helen Keller, previously quoted here, she says:

Since I had no power of thought, I did not compare one mental state with another. So I was not conscious of any change or process going on in my brain when my teacher began to instruct me. I merely felt keen delight in obtaining more easily what I wanted by means of the finger motions she taught me.

We may have similar experiences as adults learning a foreign language while living abroad. At first one has very little ability to interact with the foreign world, but suddenly everything is possible.

Or consider the situation of a hunter gatherer who may not know how to count. It may be obvious to them that a bigger pile of fruit is better than a smaller one, but if two piles look similar, they may have no way to know which is better. But once they decide to give “one fruit and another” a name like “two,” and “two and one” a name like “three,” and so on, suddenly they obtain a great advantage that they previously did not possess. It is now possible to count piles and to discover that one pile has sixty-four while another has sixty-three. And it turns out that by treating the “sixty-four” as bigger than the other pile, although it does not look bigger, they end up better off.
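To make the point concrete, here is a minimal sketch in Python of what the names buy the hunter gatherer. The function names and the two piles are hypothetical, chosen only for illustration; the only thing assumed is that counting means giving each successive total its own name and that names come in a fixed order.

```python
def count_pile(pile):
    """Count a pile by moving to the next number-name for each item in it."""
    total = 0
    for _ in pile:
        total += 1  # "this many and one more" gets the next name
    return total


def better_pile(pile_a, pile_b):
    """Return 'a' or 'b' for the larger pile, or None if the counts are equal."""
    a, b = count_pile(pile_a), count_pile(pile_b)
    if a == b:
        return None
    return "a" if a > b else "b"


# Two hypothetical piles that look the same size to the eye.
pile_a = ["fruit"] * 64
pile_b = ["fruit"] * 63

choice = better_pile(pile_a, pile_b)
```

Nothing about the piles themselves has changed; the comparison becomes possible only because each pile has been given a precise name that can be put in order with other such names.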

In this sense one could look at the scientific enterprise of looking for mathematical laws of nature as one long process of looking for better names. We can see that some things are faster and some things are slower, but the vague names “fast” and “slow” cannot accomplish much. Once we can name different speeds more precisely, we can put them all in order and accomplish much more, just as the hunter gatherer can accomplish more after learning to count. And this extends to the full power of technology: the men who landed on the moon did so ultimately due to the power of names.

If you take Lawson’s view, that language does not refer to the world at all, all of this is basically casting magic spells. In fact, he spells this out himself, in so many words, in chapter 3:

All material is in this sense magical. It enables intervention that cannot be understood. Ancient magicians were those who had access to closures that others did not know, in the same way that the Pharaohs had access to closures not available to their subjects. This gave them a supernatural character. It is now thought that their magic has been explained, as the knowledge of herbs, metals or the weather. No such thing has taken place. More powerful closures have been realised, more powerful magic that can subsume the feeble closures of those magicians. We have simply lost sight of its magical character. Anthropology has many accounts of tribes who on being observed by a Western scientist believe that the observer has access to some very powerful magic. Magic that produces sound and images from boxes, and makes travel swift. We are inclined to smile patronisingly believing that we merely have knowledge — the technology behind radio and television, and motor vehicles — and not magic. The closures behind the technology do indeed provide us with knowledge and understanding and enable us to handle activity, but they do not explain how the closures enable intervention. How the closures are successful remains incomprehensible and in this sense is our magic.

I don’t think we should dismiss this point of view entirely, but I do think it is more mistaken than otherwise, basically because of the original mistake of thinking that language cannot refer to the world. But the point that names are extremely powerful is correct and important, to the point where even the analogy of technology as “magic that works” does make a certain amount of sense.

Common Sense

I have tended to emphasize common sense as a basic source in attempting to philosophize or otherwise understand reality. Let me explain what I mean by the idea of common sense.

The basic idea is that something is common sense when everyone agrees that something is true. If we start with this vague account, something will be more definitively common sense to the degree that it is truer that “everyone” agrees, that is, the more nearly universal the agreement is, and likewise to the degree that it is truer that everyone “agrees,” that is, the firmer and more genuine the agreement is.

If we consider anything that one might think of as a philosophical view, we will find at least a few people who disagree, at least verbally, with the claim. But we may be able to find some that virtually everyone agrees with. These pertain more to common sense than things that fewer people agree with. Likewise, if we consider everyday claims rather than philosophical ones, we will probably be able to find things that everyone agrees with apart from some very localized contexts. These pertain even more to common sense. Likewise, if everyone has always agreed with something both in the past and present, that pertains more to common sense than something that everyone agrees with in the present, but where some have disagreed in the past.

It will be truer that everyone agrees in various ways: if everyone is very certain of something, that pertains more to common sense than something people are less certain about. If some people express disagreement with a view, but everyone’s revealed preferences or beliefs indicate agreement, that can be said to pertain to common sense to some degree, but not so much as where verbal affirmations and revealed preferences and beliefs are aligned.

Naturally, all of this is a question of vague boundaries: opinions are more or less a matter of common sense. We cannot sort them into two clear categories of “common sense” and “not common sense.” Nonetheless, we would want to base our arguments, as much as possible, on things that are more squarely matters of common sense.

We can raise two questions about this. First, is it even possible? Second, why do it?

One might object that the proposal is impossible. For no one can really reason except from their own opinions. Otherwise, one might be formulating a chain of argument, but it is not one’s own argument or one’s own conclusion. But this objection is easily answered. In the first place, if everyone agrees on something, you probably agree yourself, and so reasoning from common sense will still be reasoning from your own opinions. Second, if you don’t personally agree, since belief is voluntary, you are capable of agreeing if you choose, and you probably should, for reasons which will be explained in answering the second question.

Nonetheless, the objection is a reasonable place to point out one additional qualification. “Everyone agrees with this” is itself a personal point of view that someone holds, and no one is infallible even with respect to this. So you might think that everyone agrees, while in fact they do not. But this simply means that you have no choice but to do the best you can in determining what is or what is not common sense. Of course you can be mistaken about this, as you can about anything.

Why argue from common sense? I will make two points, a practical one and a theoretical one. The practical point is that if your arguments are public, as for example this blog, rather than written down in a private journal, then you presumably want people to read them and to gain from them in some way. The more you begin from common sense, the more profitable your thoughts will be in this respect. More people will be able to gain from your thoughts and arguments if more people agree with the starting points.

There is also a theoretical point. Consider the statement, “The truth of a statement never makes a person more likely to utter it.” If this statement were true, no one could ever utter it on account of its truth, but only for other reasons. So it is not something that a seeker of truth would ever say. On the other hand, there can be no doubt that the falsehood of some statements, on some occasions, makes those statements more likely to be affirmed by some people. Nonetheless, the nature of language demands that people have an overall tendency, most of the time and in most situations, to speak the truth. We would not be able to learn the meaning of a word without it being applied accurately, most of the time, to the thing that it means. In fact, if everyone was always uttering falsehoods, we would simply learn that “is” means “is not,” and that “is not,” means “is,” and the supposed falsehoods would not be false in the language that we would acquire.

It follows that greater agreement that something is true, other things being equal, implies that the thing is more likely to be actually true. Stones have a tendency to fall down: so if we find a great collection of stones, the collection is more likely to be down at the bottom of a cliff rather than perched precisely on the tip of a mountain. Likewise, people have a tendency to utter the truth, so a great collection of agreement suggests something true rather than something false.
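The step from “people tend to utter the truth” to “agreement makes truth more likely” can be sketched as a small calculation. This is only an illustration under strong simplifying assumptions which are mine and not part of the argument above: each person speaks independently of the others, and each asserts the truth with the same fixed probability. The function name and parameters are hypothetical.

```python
def posterior_true(prior, p_truthful, n_agree, n_disagree):
    """Probability that a claim is true, given how many people affirm or deny it.

    Assumes (for illustration only) that each person independently says what
    is true with probability p_truthful, whatever the claim is.
    """
    # Likelihood of this pattern of affirmations and denials if the claim is true.
    like_true = p_truthful ** n_agree * (1 - p_truthful) ** n_disagree
    # Likelihood of the same pattern if the claim is false.
    like_false = (1 - p_truthful) ** n_agree * p_truthful ** n_disagree
    return prior * like_true / (prior * like_true + (1 - prior) * like_false)


# With only a slight tendency toward truth, more agreement still helps.
few = posterior_true(prior=0.5, p_truthful=0.6, n_agree=3, n_disagree=0)
many = posterior_true(prior=0.5, p_truthful=0.6, n_agree=10, n_disagree=0)
```

Under these assumptions, any tendency toward the truth greater than one half makes the posterior rise with each additional person who agrees, and an even split of agreement and disagreement leaves the prior where it was.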

Of course, this argument depends on “other things being equal,” which is not always the case. It is possible that most people agree on something, but you are reasonably convinced that they are mistaken, for other reasons. But if this is the case, your arguments should depend on things that they would agree with even more strongly than they agree with the opposite of your conclusion. In other words, it should be based on things which pertain even more to common sense. Suppose it does not: ultimately the very starting point of your argument is something that everyone else agrees is false. This will probably be an evident insanity from the beginning, but let us suppose that you find it reasonable. In this case, Robin Hanson’s result discussed here implies that you must be convinced that you were created in very special circumstances which would guarantee that you would be right, even though no one else was created in these circumstances. There is of course no basis for such a conviction. And our ability to modify our priors, discussed there, implies that the reasonable behavior is to choose to agree with the priors of common sense, if we find our natural priors departing from them, except in cases where the disagreement is caused by agreement with even stronger priors of common sense. Thus for example in this post I gave reasons for disagreeing with our natural prior on the question, “Is this person lying or otherwise deceived?” in some cases. But this was based on mathematical arguments that are even more convincing than that natural prior.

Consistency and Reality

Consistency and inconsistency, in their logical sense, are relationships between statements or between the parts of a statement. They are not properties of reality as such.

“Wait,” you will say. “If consistency is not a property of reality, then you are implying that reality is not consistent. So reality is inconsistent?”

Not at all. Consistency and inconsistency are contraries, not contradictories, and they are properties of statements. So reality as such is neither consistent nor inconsistent, in the same way that sounds are neither white nor black.

We can however speak of consistency with respect to reality in an extended sense, just as we can speak of truth with respect to reality in an extended sense, even though truth refers first to things that are said or thought. In this way we can say that a thing is true insofar as it is capable of being known, and similarly we might say that reality is consistent, insofar as it is capable of being known by consistent claims, and incapable of being known by inconsistent claims. And reality indeed seems consistent in this way: I might know the weather if I say “it is raining,” or if I say, “it is not raining,” depending on conditions, but to say “it is both raining and not raining in the same way” is not a way of knowing the weather.

Consider the last point more precisely. Why can’t we use such statements to understand the world? The statement about the weather is rather different from statements like, “The normal color of the sky is not blue but rather green.” We know what it would be like for this to be the case. For example, we know what we would expect if it were the case. It cannot be used to understand the world in fact, because these expectations fail. But if they did not, we could use it to understand the world. Now consider instead the statement, “The sky is both blue and not blue in exactly the same way.” There is now no way to describe the expectations we would have if this were the case. It is not that we understand the situation and know that it does not apply, as with the claim about the color of the sky: rather, the situation described cannot be understood. It is literally unintelligible.

This also explains why we should not think of consistency as a property of reality in a primary sense. If it were, it would be like the color blue as a property of the sky. The sky is in fact blue, but we know what it would be like for it to be otherwise. We cannot equally say, “reality is in fact consistent, but we know what it would be like for it to be inconsistent.” Instead, the supposedly inconsistent situation is a situation that cannot be understood in the first place. Reality is thus consistent not in the primary sense but in a secondary sense, namely that it is rightly understood by consistent things.

But this also implies that we cannot push the secondary consistency of reality too far, in several ways and for several reasons.

First, while inconsistency as such does not contribute to our understanding of the world, a concrete inconsistent set of claims can help us understand the world, and in many situations better than any particular consistent set of claims that we might currently come up with. This was discussed in a previous post on consistency.

Second, we might respond to the above by pointing out that it is always possible in principle to formulate a consistent explanation of things which would be better than the inconsistent one. We might not currently be able to arrive at the consistent explanation, but it must exist.

But even this needs to be understood in a somewhat limited way. Any consistent explanation of things will necessarily be incomplete, which means that more complete explanations, whether consistent or inconsistent, will be possible. Consider for example these recent remarks of James Chastek on Gödel’s theorem:

1.) Given any formal system, let proposition (P) be this formula is unprovable in the system

2.) If P is provable, a contradiction occurs.

3.) Therefore, P is known to be unprovable.

4.) If P is known to be unprovable it is known to be true.

5.) Therefore, P is (a) unprovable in a system and (b) known to be true.

In the article linked by Chastek, John Lucas argues that this is a proof that the human mind is not a “mechanism,” since we can know to be true something that the mechanism will not be able to prove.

But consider what happens if we simply take the “formal system” to be you, and “this formula is unprovable in the system” to mean “you cannot prove this statement to be true.” Is it true, or not? And can you prove it?

If you say that it is true but that you cannot prove it, the question is how you know that it is true. If you know by the above reasoning, then you have a syllogistic proof that it is true, and so it is false that you cannot prove it, and so it is false.

If you say that it is false, then you cannot prove it, because false things cannot be proven, and so it is true.

It is evident here that you can give no consistent response that you can know to be true; “it is true but I cannot know it to be true,” may be consistent, but obviously if it is true, you cannot know it to be true, and if it is false, you cannot know it to be true. What is really proven by Gödel’s theorem is not that the mind is not a “mechanism,” whatever that might be, but that any consistent account of arithmetic must be incomplete. And if any consistent account of arithmetic alone is incomplete, much more must any consistent explanation of reality as a whole be incomplete. And among more complete explanations, there will be some inconsistent ones as well as consistent ones. Thus you might well improve any particular inconsistent position by adopting a consistent one, but you might again improve any particular consistent position by adopting an inconsistent one which is more complete.

The above has some relation to our discussion of the Liar Paradox. Someone might be tempted to give the same response to “tonk” and to “true”:

The problem with “tonk” is that it is defined in such a way as to have inconsistent implications. So the right answer is to abolish it. Just do not use that word. In the same way, “true” is defined in such a way that it has inconsistent implications. So the right answer is to abolish it. Just do not use that word.

We can in fact avoid drawing inconsistent conclusions using this method. The problem with the method is obvious, however. The word “tonk” does not actually exist, so there is no problem with abolishing it. It never contributed to our understanding of the world in the first place. But the word “true” does exist, and it contributes to our understanding of the world. To abolish it, then, would remove some inconsistency, but it would also remove part of our understanding of the world. We would be adopting a less complete but more consistent understanding of things.
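As a reminder of why “tonk” has inconsistent implications, here is the usual presentation of its two rules, as in Arthur Prior’s original example. The rule labels are added only for convenience. The connective borrows its introduction rule from “or” and its elimination rule from “and”:

```latex
\[
\frac{A}{A \;\mathrm{tonk}\; B}\ (\text{tonk-introduction})
\qquad
\frac{A \;\mathrm{tonk}\; B}{B}\ (\text{tonk-elimination})
\]
% Chaining the two rules takes us from any statement to any other:
\[
A \;\vdash\; A \;\mathrm{tonk}\; B \;\vdash\; B
\]
```

So anyone who accepts a single true statement and uses “tonk” according to its rules can derive every statement whatsoever, including the denial of the one they started with.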

Hilary Lawson discusses this response in Closure: A Story of Everything:

Russell and Tarski’s solution to self-referential paradox succeeds only by arbitrarily outlawing the paradox and thus provides no solution at all.

Some have claimed to have a formal, logical, solution to the paradoxes of self-reference. Since if these were successful the problems associated with the contemporary predicament and the Great Project could be solved forthwith, it is important to briefly examine them before proceeding further. The argument I shall put forward aims to demonstrate that these theories offer no satisfactory solution to the problem, and that they only appear to do so by obscuring the fact that they have defined their terms in such a way that the paradox is not so much avoided as outlawed.

The problems of self-reference that we have identified are analogous to the ancient liar paradox. The ancient liar paradox stated that ‘All Cretans are liars’ but was itself uttered by a Cretan thus making its meaning undecidable. A modern equivalent of this ancient paradox would be ‘This sentence is not true’, and the more general claim that we have already encountered: ‘there is no truth’. In each case the application of the claim to itself results in paradox.

The supposed solutions, Lawson says, are like the one suggested above: “Just do not use that word.” Thus he remarks on Tarski’s proposal:

Adopting Tarski’s hierarchy of languages one can formulate sentences that have the appearance of being self-referential. For example, a Tarskian version of ‘This sentence is not true’ would be:

(I) The sentence (I) is not true-in-L.

So Tarski’s argument runs, this sentence is both a true sentence of the language meta-L, and false in the language L, because it refers to itself and is therefore, according to the rules of Tarski’s logic and the hierarchy of languages, not properly formed. The hierarchy of languages apparently therefore enables self-referential sentences but avoids paradox.

More careful inspection however shows the manoeuvre to be engaged in a sleight of hand for the sentence as constructed only appears to be self-referential. It is a true sentence of the meta-language that makes an assertion of a sentence in L, but these are two different sentences – although they have superficially the same form. What makes them different is that the meaning of the predicate ‘is not true’ is different in each case. In the meta-language it applies the meta-language predicate ‘true’ to the object language, while in the object language it is not a predicate at all. As a consequence the sentence is not self-referential. Another way of expressing this point would be to consider the sentence in the meta-language. The sentence purports to be a true sentence in the meta-language, and applies the predicate ‘is not true’ to a sentence in L, not to a sentence in meta-L. Yet what is this sentence in L? It cannot be the same sentence for this is expressed in meta-L. The evasion becomes more apparent if we revise the example so that the sentence is more explicitly self-referential:

(I) The sentence (I) is not true-in-this-language.

Tarski’s proposal that no language is allowed to contain its own truth-predicate is precisely designed to make this example impossible. The hierarchy of languages succeeds therefore only by providing an account of truth which makes genuine self-reference impossible. It can hardly be regarded therefore as a solution to the paradox of self-reference, since if all that was required to solve the paradox was to ban it, this could have been done at the outset.

Someone might be tempted to conclude that we should say that reality is inconsistent after all. Since any consistent account of reality is incomplete, it must be that the complete account of reality is inconsistent: and so someone who understood reality completely, would do so by means of an inconsistent theory. And just as we said that reality is consistent, in a secondary sense, insofar as it is understood by consistent things, so in that situation, one would say that reality is inconsistent, in a secondary sense, because it is understood by inconsistent things.

The problem with this is that it falsely assumes that a complete and intelligible account of reality is possible. This is not possible largely for the same reasons that there cannot be a list of all true statements. And although we might understand things through an account which is in fact inconsistent, the inconsistency itself contributes nothing to our understanding, because the inconsistency is in itself unintelligible, just as we said about the statement that the sky is both blue and not blue in the same way.

We might ask whether we can at least give a consistent account superior to an account which includes the inconsistencies resulting from the use of “truth.” This might very well be possible, but it appears to me that no one has actually done so. This is actually one of Lawson’s intentions with his book, but I would assert that his project fails overall, despite potentially making some real contributions. The reader is nonetheless welcome to investigate for themselves.

Being and Unity II

Content warning: very obscure.

This post follows up on an earlier post on this topic, as well as on what was recently said about real distinction. In the latter post, we applied the distinction between the way a thing is and the way it is known in order to better understand distinction itself. We can obtain a better understanding of unity in a similar way.

As was said in the earlier post on unity, to say that something is “one” does not add anything real to the being of the thing, but it adds the denial of the division between distinct things. The single apple is not “an apple and an orange,” which are divided insofar as they are distinct from one another.

But being distinct from divided things is itself a certain way of being distinct, and consequently all that was said about distinction in general will apply to this way of being distinct as well. In particular, since being distinct means not being something, which is a way that things are understood rather than a way that they are (considered precisely as a way of being), the same thing applies to unity. To say that something is one does not add something to the way that it is, but it adds something to the way that it is understood. This way of being understood is founded, we argued, on existing relationships.

We should avoid two errors here, both of which would be expressions of the Kantian error:

First, the argument here does not mean that a thing is not truly one thing, just as the earlier discussion does not imply that it is false that a chair is not a desk. On the contrary, a chair is in fact not a desk, and a chair is in fact one chair. But when we say or think, “a chair is not a desk,” or “a chair is one chair,” we are saying these things in some way of saying, and thinking them in some way of thinking, and these ways of saying and thinking are not ways of being as such. This in no way implies that the statements themselves are false, just as “the apple seems to be red,” does not imply that the apple is not red. Arguing that the fact of a specific way of understanding implies that the thing is falsely understood would be the position described by Ayn Rand as asserting, “man is blind, because he has eyes—deaf, because he has ears—deluded, because he has a mind—and the things he perceives do not exist, because he perceives them.”

Second, the argument does not imply that the way things really are is unknown and inaccessible to us. One might suppose that this follows, since distinction cannot exist apart from someone’s way of understanding, and at the same time no one can understand without making distinctions. Consequently, someone might argue, there must be some “way things really are in themselves,” which does not include distinction or unity, but which cannot be understood. But this is just a different way of falling into the first error above. There is indeed a way things are, and it is generally not inaccessible to us. In fact, as I pointed out earlier, it would be a contradiction to assert the existence of anything entirely unknowable to us.

Our discussion, being in human language and human thought, naturally uses the proper modes of language and thought. And just as in Mary’s room, where her former knowledge of color is a way of knowing and not a way of sensing, so our discussion advances by ways of discussion, not by ways of being as such. This does not prevent the way things are from being an object of discussion, just as color can be an object of knowledge.

Having avoided these errors, someone might say that nothing of consequence follows from this account. But this would be a mistake. It follows from the present account that when we ask questions like, “How many things are here?”, we are not asking a question purely about how things are, but to some extent about how we should understand them. And even when there is a single way that things are, there is usually not only one way to understand them correctly, but many ways.

Consider some particular question of this kind: “How many things are in this room?” People might answer this question in various ways. John Nerst, in a previous discussion on this blog, seemed to suggest that the answer should be found by counting fundamental particles. Alexander Pruss would give a more complicated answer, since he suggests that large objects like humans and animals should be counted as wholes (while also wishing to deny the existence of parts, which would actually eliminate the notion of a whole), while in other cases he might agree to counting particles. Thus a human being and an armchair might be counted, more or less, as 1 + 10^28 things, namely counting the human being as one thing and the chair as a number of particles.

But if we understand that the question is not, and cannot be, purely about how things are, but is also a question about how things should be understood, then both of the above responses seem unreasonable: they are both relatively bad ways of understanding the things in the room, even if they both have some truth as well. And on the other hand, it is easy to see that “it depends on how you count,” is part of the answer. There is not one true answer to the question, but many true answers that touch on different aspects of the reality in the room.
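To make "it depends on how you count" concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than anything taken from Nerst or Pruss: the `Item` type, the three counting functions, and the round figure of 10^28 particles for the human being (the text above gives that figure only for the chair). The only point is that one and the same room yields different, equally well-defined totals under different conventions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    name: str
    is_organism: bool
    particles: int  # rough order of magnitude, assumed for illustration

# The room from the example above: one human being and one armchair.
room = [
    Item("human being", is_organism=True, particles=10**28),
    Item("armchair", is_organism=False, particles=10**28),
]

def count_particles(items):
    """Count only fundamental particles."""
    return sum(item.particles for item in items)

def count_organisms_as_wholes(items):
    """Count organisms as one thing each, everything else as particles."""
    return sum(1 if item.is_organism else item.particles for item in items)

def count_ordinary_objects(items):
    """Count each everyday object as one thing."""
    return len(items)

by_particles = count_particles(room)           # 2 * 10**28
by_wholes = count_organisms_as_wholes(room)    # 1 + 10**28
by_objects = count_ordinary_objects(room)      # 2
```

None of the three functions is mistaken about what is in the room. They differ in how the contents are divided up before counting begins.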

From the discussion with John Nerst, consider this comment:

My central contention is that the rules that define the universe runs by themselves, and must therefore be self-contained, i.e not need any interpretation or operationalization from outside the system. As I think I said in one of the parts of “Erisology of Self and Will” that the universe must be an automaton, or controlled by an automaton, etc. Formal rules at the bottom.

This is isn’t convincing to you I guess but I suppose I rule out fundamental vagueness because vagueness implies complexity and fundamental complexity is a contradiction in terms. If you keep zooming in on a fuzzy picture you must, at some point, come down to sharply delineated pixels.

Among other things, the argument of the present post shows why this cannot be right. “Sharply delineated pixels” includes the distinction of one pixel from another, and therefore includes something which is a way of understanding as such, not a way of being as such. In other words, while intending to find what is really there, apart from any interpretation, Nerst is directly including a human interpretation in his account. And in fact it is perfectly obvious that anything else is impossible, since any account of reality given by us will be a human account and will thus include a human way of understanding. Things are a certain way: but that way cannot be said or thought except by using ways of speaking or thinking.

Truth in Ordinary Language

After the incident with the tall man, I make plans to meet my companion the following day. “Let us meet at sunrise tomorrow,” I say. They ask in response, “How will I know when the sun has risen?”

When is it true to say that the sun will rise, or that the sun has risen? And what would it take for such statements to be false?

Virtually no one finds themselves uncomfortable with this language despite the fact that the sun has no physical motion called “rising,” but rather the earth is rotating, giving the appearance of movement to the sun. I will ignore issues of relativity, precisely because they are evidently irrelevant. It is not just that the sun is not moving, but that we know that the physical motion of the sun one way or another is irrelevant. The rising of the sun has nothing to do with a deep physical or metaphysical account of the sun as such. Instead, it is about that thing that happens every morning. What would it take for it to be false that the sun will rise tomorrow? Well, if the earth is destroyed today, then presumably the sun will not rise tomorrow. Or if tomorrow it is dark at noon and everyone on Twitter is in an uproar about the fact that the sun is visible at the height of the sky at midnight in their part of the world, then it will have been false that the sun was going to rise in the morning. In other words, the only possible thing that could falsify the claim about the sun would be a falsification of our expectations about our experience of the sun.

As in the last post, however, this does not mean that the statement about the sun is about our expectations. It is about the sun. But the only thing it says about the sun is something like, “The sun will be and do whatever it needs to, including in relative terms, in order for our ordinary experience of a sunrise to be as it usually is.” I said something similar here about the truth of attributions of sensible qualities, such as when we say that “the banana is yellow.”

All of this will apply in general to all of our ordinary language about ourselves, our lives, and the world.

Idealized Idealization

On another occasion, I discussed the Aristotelian idea that the act of the mind does not use an organ. In an essay entitled Immaterial Aspects of Thought, James Ross claims that he can establish the truth of this position definitively. He summarizes the argument:

Some thinking (judgment) is determinate in a way no physical process can be. Consequently, such thinking cannot be (wholly) a physical process. If all thinking, all judgment, is determinate in that way, no physical process can be (the whole of) any judgment at all. Furthermore, “functions” among physical states cannot be determinate enough to be such judgments, either. Hence some judgments can be neither wholly physical processes nor wholly functions among physical processes.

Certain thinking, in a single case, is of a definite abstract form (e.g. N x N = N²), and not indeterminate among incompossible forms (see I below). No physical process can be that definite in its form in a single case. Adding cases even to infinity, unless they are all the possible cases, will not exclude incompossible forms. But supplying all possible cases of any pure function is impossible. So, no physical process can exclude incompossible functions from being equally well (or badly) satisfied (see II below). Thus, no physical process can be a case of such thinking. The same holds for functions among physical states (see IV below).

In essence, the argument is that squaring a number and similar things are infinitely precise processes, and no physical process is infinitely precise. Therefore squaring a number and similar things are not physical processes.

The problem is unfortunately with the major premise here. Squaring a number, and similar things, in the way that we in fact do them, are not infinitely precise processes.

Ross argues that they must be:

Can judgments really be of such definite “pure” forms? They have to be; otherwise, they will fail to have the features we attribute to them and upon which the truth of certain judgments about validity, inconsistency, and truth depend; for instance, they have to exclude incompossible forms or they would lack the very features we take to be definitive of their sorts: e.g., conjunction, disjunction, syllogistic, modus ponens, etc. The single case of thinking has to be of an abstract “form” (a “pure” function) that is not indeterminate among incompossible ones. For instance, if I square a number–not just happen in the course of adding to write down a sum that is a square, but if I actually square the number–I think in the form “N x N = N².”

The same point again. I can reason in the form, modus ponens (“If p then q”; “p”; “therefore, q”). Reasoning by modus ponens requires that no incompossible forms also be “realized” (in the same sense) by what I have done. Reasoning in that form is thinking in a way that is truth-preserving for all cases that realize the form. What is done cannot, therefore, be indeterminate among structures, some of which are not truth preserving. That is why valid reasoning cannot be only an approximation of the form, but must be of the form. Otherwise, it will as much fail to be truth-preserving for all relevant cases as it succeeds; and thus the whole point of validity will be lost. Thus, we already know that the evasion, “We do not really conjoin, add, or do modus ponens but only simulate them,” cannot be correct. Still, I shall consider it fully below.

“It will as much fail to be truth-preserving for all relevant cases as it succeeds” is an exaggeration here. If you perform an operation which approximates modus ponens, then that operation will be approximately truth preserving. It will not be equally truth preserving and not truth preserving.

I have noted many times in the past, as for example here, here, here, and especially here, that following the rules of syllogism does not in practice infallibly guarantee that your conclusions are true, even if your premises are in some way true, because of the vagueness of human thought and language. In essence, Ross is making a contrary argument: we know, he is claiming, that our arguments infallibly succeed; therefore our thoughts cannot be vague. But it is empirically false that our arguments infallibly succeed, so the argument is mistaken right from its starting point.

There is also a strawmanning of the opposing position here insofar as Ross describes those who disagree with him as saying that “we do not really conjoin, add, or do modus ponens but only simulate them.” This assumes that unless you are doing these things perfectly, rather than approximating them, then you are not doing them at all. But this does not follow. Consider a triangle drawn on a blackboard. Consider which of the following statements is true:

  1. There is a triangle drawn on the blackboard.
  2. There is no triangle drawn on the blackboard.

Obviously, the first statement is true, and the second false. But in Ross’s way of thinking, we would have to say, “What is on the blackboard is only approximately triangular, not exactly triangular. Therefore there is no triangle on the blackboard.” This of course is wrong, and his description of the opposing position is wrong in the same way.

Naturally, if we take “triangle” as shorthand for “exact rather than approximate triangle” then (2) will be true. And in a similar way, if we take “really conjoin” and so on as shorthand for “really conjoin exactly and not approximately,” then those who disagree will indeed say that we do not do those things. But this is not a problem unless you are assuming from the beginning that our thoughts are infinitely precise, and Ross is attempting to establish that this must be the case, rather than claiming to take it as given. (That is, the summary takes it as given, but Ross attempts throughout the article to establish it.)

One could attempt to defend Ross’s position as follows: we must have infinitely precise thoughts, because we can understand the words “infinitely precise thoughts.” Or in the case of modus ponens, we must have an infinitely precise understanding of it, because we can distinguish between “modus ponens, precisely,” and “approximations of modus ponens.” But the error here is similar to the error of saying that one must have infinite certainty about some things, because otherwise one will not have infinite certainty about the fact that one does not have infinite certainty, as though this were a contradiction. It is no contradiction for all of your thoughts to be fallible, including this one, and it is no contradiction for all of your thoughts to be vague, including your thoughts about precision and approximation.

The title of this post in fact refers to this error, which is probably the fundamental problem in Ross’s argument. Triangles in the real world are not perfectly triangular, but we have an idealized concept of a triangle. In precisely the same way, the process of idealization in the real world is not an infinitely precise process, but we have an idealized concept of idealization. Concluding that our acts of idealization must actually be ideal in themselves, simply because we have an idealized concept of idealization, would be a case of confusing the way of knowing with the way of being. It is a particularly confusing case simply because the way of knowing in this case is also materially the being which is known. But this material identity does not make the mode of knowing into the mode of being.

We should consider also Ross’s minor premise, that a physical process cannot be determinate in the way required:

Whatever the discriminable features of a physical process may be, there will always be a pair of incompatible predicates, each as empirically adequate as the other, to name a function the exhibited data or process “satisfies.” That condition holds for any finite actual “outputs,” no matter how many. That is a feature of physical process itself, of change. There is nothing about a physical process, or any repetitions of it, to block it from being a case of incompossible forms (“functions”), if it could be a case of any pure form at all. That is because the differentiating point, the point where the behavioral outputs diverge to manifest different functions, can lie beyond the actual, even if the actual should be infinite; e.g., it could lie in what the thing would have done, had things been otherwise in certain ways. For instance, if the function is x(*)y = (x + y, if y < 10^40 years, = x + y +1, otherwise), the differentiating output would lie beyond the conjectured life of the universe.

Just as rectangular doors can approximate Euclidean rectangularity, so physical change can simulate pure functions but cannot realize them. For instance, there are no physical features by which an adding machine, whether it is an old mechanical “gear” machine or a hand calculator or a full computer, can exclude its satisfying a function incompatible with addition, say quaddition (cf. Kripke’s definition of the function to show the indeterminacy of the single case: quus, symbolized by the plus sign in a circle, “is defined by: x quus y = x + y, if x, y < 57, =5 otherwise”) modified so that the differentiating outputs (not what constitutes the difference, but what manifests it) lie beyond the lifetime of the machine. The consequence is that a physical process is really indeterminate among incompatible abstract functions.

Extending the list of outputs will not select among incompatible functions whose differentiating “point” lies beyond the lifetime (or performance time) of the machine. That, of course, is not the basis for the indeterminacy; it is just a grue-like illustration. Adding is not a sequence of outputs; it is summing; whereas if the process were quadding, all its outputs would be quadditions, whether or not they differed in quantity from additions (before a differentiating point shows up to make the outputs diverge from sums).

For any outputs to be sums, the machine has to add. But the indeterminacy among incompossible functions is to be found in each single case, and therefore in every case. Thus, the machine never adds.

There is some truth here, and some error here. If we think about a physical process in the particular way that Ross is considering it, it will be true that it will always be able to be interpreted in more than one way. This is why, for example, in my recent discussion with John Nerst, John needed to say that the fundamental cause of things had to be “rules” rather than e.g. fundamental particles. The movement of particles, in itself, could be interpreted in various ways. “Rules,” on the other hand, are presumed to be something which already has a particular interpretation, e.g. adding as opposed to quadding.
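The part that is true can be shown with a small sketch in Python. It uses Kripke's quus as quoted above, with the threshold of 57; the helper `consistent_with` and the table of `observations` are hypothetical names introduced only for this illustration. A finite record of outputs below the threshold fits addition and quaddition equally well, and the two come apart only at inputs the record never reaches.

```python
LIMIT = 57  # Kripke's threshold, as quoted above

def plus(x: int, y: int) -> int:
    """Ordinary addition."""
    return x + y

def quus(x: int, y: int) -> int:
    """x quus y = x + y if x, y < 57, and 5 otherwise."""
    return x + y if x < LIMIT and y < LIMIT else 5

def consistent_with(observations, candidate) -> bool:
    """True if the candidate function reproduces every recorded output."""
    return all(candidate(x, y) == out for (x, y), out in observations.items())

# Everything a (hypothetical) machine has ever output: all inputs below 57.
observations = {(x, y): x + y for x in range(LIMIT) for y in range(LIMIT)}

fits_plus = consistent_with(observations, plus)
fits_quus = consistent_with(observations, quus)

# The functions differ only outside the observed range.
differ_beyond_record = plus(57, 1) != quus(57, 1)
```

Enlarging the record only pushes the problem back, since the threshold can always be set beyond it. That much of Ross's point stands. What does not follow is that the machine therefore "never adds."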

On the other hand, there is also an error here. The prima facie sign of this error is the statement that an adding machine “never adds.” Just as according to common sense we can draw triangles on blackboards, so according to common sense the calculator on my desk can certainly add. This is connected with the problem with the entire argument. Since “the calculator can add” is true in some way, there is no particular reason that “we can add” cannot be true in precisely the same way. Ross wishes to argue that we can add in a way that the calculator cannot because, in essence, we do it infallibly; but this is flatly false. We do not do it infallibly.

Considered metaphysically, the problem here is ignorance of the formal cause. If physical processes were entirely formless, they indeed would have no interpretation, just as a formless human (were that possible) would be a philosophical zombie. But in reality there are forms in both cases. In this sense, Ross’s argument comes close to saying “human thought is a form or formed, but physical processes are formless.” Since in fact neither is formless, there is no reason (at least established by this argument) why thought could not be the form of a physical process.

 

The Self and Disembodied Predictive Processing

While I criticized his claim overall, there is some truth in Scott Alexander’s remark that “the predictive processing model isn’t really a natural match for embodiment theory.” The theory of “embodiment” refers to the idea that a thing’s matter contributes in particular ways to its functioning; it cannot be explained by its form alone. As I said in the previous post, the human mind is certainly embodied in this sense. Nonetheless, the idea of predictive processing can suggest something somewhat disembodied. We can imagine the following picture of Andy Clark’s view:

Imagine the human mind as a person in an underground bunker. There is a bank of labelled computer screens on one wall, which portray incoming sensations. On another computer, the person analyzes the incoming data and records his predictions for what is to come, along with the equations or other things which represent his best guesses about the rules guiding incoming sensations.

As time goes on, his predictions are sometimes correct and sometimes incorrect, and so he refines his equations and his predictions to make them more accurate.

As in the previous post, we have here a “barren landscape.” The person in the bunker originally isn’t trying to control anything or to reach any particular outcome; he is just guessing what is going to appear on the screens. This idea also appears somewhat “disembodied”: what the mind is doing down in its bunker does not seem to have much to do with the body and the processes by which it is obtaining sensations.

At some point, however, the mind notices a particular difference between some of the incoming streams of sensation and the rest. The typical screen works like the one labelled “vision.” And there is a problem here. While the mind is pretty good at predicting what comes next there, things frequently come up which it did not predict. No matter how much it improves its rules and equations, it simply cannot entirely overcome this problem. The stream is just too unpredictable for that.

On the other hand, one stream labelled “proprioception” seems to work a bit differently. At any rate, extreme unpredicted events turn out to be much rarer. Additionally, the mind notices something particularly interesting: small differences to prediction do not seem to make much difference to accuracy. Or in other words, if it takes its best guess, then arbitrarily modifies it, as long as this is by a small amount, it will be just as accurate as its original guess would have been.

And thus if it modifies it repeatedly in this way, it can get any outcome it “wants.” Or in other words, the mind has learned that it is in control of one of the incoming streams, and not merely observing it.
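The difference between the two kinds of stream can be put in a toy sketch in Python. This is not Andy Clark's model or anyone's actual account of the brain. The function names, the integer positions, and the `tolerance` rule are all assumptions made for illustration. The "vision" stream ignores the prediction entirely. The "proprioception" stream makes any prediction come true, provided the prediction is close enough to the current state.

```python
import random

def vision_stream(prediction: int, rng: random.Random) -> int:
    """What appears on the 'vision' screen: the prediction has no effect on it."""
    return rng.randint(-10, 10)

def proprioception_stream(prediction: int, position: int, tolerance: int = 1) -> int:
    """What appears on the 'proprioception' screen.

    Assumption for this sketch: a prediction within `tolerance` of the
    current position comes true; a larger jump is ignored.
    """
    if abs(prediction - position) <= tolerance:
        return prediction
    return position

def steer(start: int, target: int, tolerance: int = 1):
    """Reach `target` by repeatedly nudging the prediction (tolerance must be >= 1)."""
    position, steps = start, 0
    while position != target:
        nudge = max(-tolerance, min(tolerance, target - position))
        position = proprioception_stream(position + nudge, position, tolerance)
        steps += 1
    return position, steps

position = 0
# The best guess ("nothing changes") and a slightly modified guess are equally accurate.
best_guess_error = abs(proprioception_stream(position, position) - position)
nudged_guess = position + 1
nudged_guess_error = abs(proprioception_stream(nudged_guess, position) - nudged_guess)

# Repeated small modifications get any outcome the mind "wants."
final_position, steps = steer(start=0, target=7)
```

In this toy version, being in control means only that small changes to the prediction carry no cost in accuracy on this one stream, so they can be chained together to reach a chosen outcome.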

This seems to suggest something particular. We do not have any innate knowledge that we are things in the world and that we can affect the world; this is something learned. In this sense, the idea of the self is one that we learn from experience, like the ideas of other things. I pointed out elsewhere that Descartes is mistaken to think the knowledge of thinking is primary. In a similar way, knowledge of self is not primary, but reflective.

Helen Keller writes in The World I Live In (XI):

Before my teacher came to me, I did not know that I am. I lived in a world that was a no-world. I cannot hope to describe adequately that unconscious, yet conscious time of nothingness. I did not know that I knew aught, or that I lived or acted or desired. I had neither will nor intellect. I was carried along to objects and acts by a certain blind natural impetus. I had a mind which caused me to feel anger, satisfaction, desire. These two facts led those about me to suppose that I willed and thought. I can remember all this, not because I knew that it was so, but because I have tactual memory.

When I wanted anything I liked, ice cream, for instance, of which I was very fond, I had a delicious taste on my tongue (which, by the way, I never have now), and in my hand I felt the turning of the freezer. I made the sign, and my mother knew I wanted ice-cream. I “thought” and desired in my fingers.

Since I had no power of thought, I did not compare one mental state with another. So I was not conscious of any change or process going on in my brain when my teacher began to instruct me. I merely felt keen delight in obtaining more easily what I wanted by means of the finger motions she taught me. I thought only of objects, and only objects I wanted. It was the turning of the freezer on a larger scale. When I learned the meaning of “I” and “me” and found that I was something, I began to think. Then consciousness first existed for me.

Helen Keller’s experience is related to the idea of language as a kind of technology of thought. But the main point is that she is quite literally correct in saying that she did not know that she existed. This does not mean that she had the thought, “I do not exist,” but rather that she had no conscious thought about the self at all. Of course she speaks of feeling desire, but that is precisely as a feeling. Desire for ice cream is what is there (not “what I feel,” but “what is”) before the taste of ice cream arrives (not “before I taste ice cream.”)