Prayer and Probability

The reader might wonder about the relation between the previous post and my discussion of Arman Razaali. If I could say it is more likely that he was lying than that the thing happened as stated, why shouldn’t they believe the same about my personal account?

In the first place, there is a question of context. I deliberately took Razaali’s account randomly from the internet without knowing anything about him. Similarly, if someone randomly passes through and reads the previous post without having read anything else on this blog, it would not be unreasonable for them to think I might have just made it up. But if someone has read more here they probably have a better estimate of my character. (If you have read more and still think I made it up, well, you are a very poor judge of character and there is not much I can do about that.)

Second, I did not say he was lying. I said it was more likely than the extreme alternative hypothesis that the thing happened exactly as stated and that it happened purely by chance. And given later events (namely his comment here), I do not think he was lying at all.

Third, the probabilities are very different.

“Calculating” the probability

What is the probability of the events I described happening purely by chance? The first thing to determine is what we are counting when we say that something has a chance of 1/X, whatever X is. Out of X cases, the thing should happen about once. In the Razaali case, the thing would be “shuffling a deck of cards for 30 minutes and ending up with the deck in the original order.” That should happen about once if you shuffle and check your deck of cards about 10^67 times.
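The 10^67 figure is just the number of possible orderings of a 52-card deck (52 factorial); the quick check below is mine, not part of the original account:

```python
import math

# The chance of a shuffled 52-card deck landing in one particular order
# is 1 in 52! (the number of possible orderings of 52 cards).
orderings = math.factorial(52)
print(f"{orderings:.2e}")  # about 8.07e+67, i.e. roughly 10^67
```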

It is not so easy to say what you are counting if you are trying to determine the probability of a coincidence. And one factor that makes this feel weirder and less probable is that since a coincidence involves several different things happening, you tend to think about it as though there were an extra difficulty in each and every one of the things needing to happen. But in reality you should take one of them as a fixed fact and simply ask about the probability of the other given the fixed thing. To illustrate this, consider the “birthday problem”: in a group of 23 people, the chance that two of them will have the same birthday is over 50%. This “feels” too high; most people would guess that the chance would be lower. But even without doing the math, one can begin to see why this is so by thinking through a few steps of the problem. 22 days is about 6% of the days in a year; so if we take one person, who has a birthday on some day or other, there will be about a 6% chance that one of the other 22 people has the same birthday. If none of them do, take the second person; the chance one of the remaining 21 people will have the same birthday as them will still be pretty close to 6%, which gets us up to almost 12% (it doesn’t quite add up in exactly this way, but it’s close). And we still have a lot more combinations to check. So you can already start to see how easy it will turn out to be to get up to 50%. In any case, the basic point is that the “coincidence” is not important; each person has a birthday, and we can treat that day as fixed while we compare it to all the others.
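The exact figure is easy to compute directly rather than by the rough step-by-step reasoning above; this short check is mine, not part of the original argument:

```python
from math import prod

def p_shared_birthday(n: int, days: int = 365) -> float:
    """Chance that at least two of n people share a birthday,
    assuming birthdays are uniform and independent."""
    # 1 minus the chance that all n birthdays are distinct
    return 1 - prod((days - i) / days for i in range(n))

print(p_shared_birthday(23))  # about 0.507, i.e. just over 50%
```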

In the same way, if you are asking about the probability that someone prays for a thing, and then that thing happens (by chance), you don’t need to consider the prayer as some extra factor — it is enough to ask how often the thing in question happens, and that will tell you your chance. If someone is looking for a job and prays a novena for this intention, and receives a job offer immediately afterwards, the chance will be something like “how often a person looking for a job receives a job offer.” For example, if it takes five months on average to get a job when you are looking, the probability of receiving an offer on a random day should be about 1/150; so out of 150 people praying novenas for a job while engaged in a job search, about 1 of them should get an offer immediately afterwards.

What would have counted as “the thing happening” in the personal situation described in the last post? There are a number of subjective factors here, and the answer depends on how one looks at it, especially on the detail with which the situation is described. For example, as I said in the last post, it is normal to think of the “answer” to a novena as coming on the last day or the day after — so if a person praying for a job receives an offer on either of those days, they will likely consider it just as much of an answer. This means the estimate of 1/150 is really too low; it should really be 1/75. And given that many people would stretch out the period (in which they would count the result as an answer) to as much as a week, we could make the odds as high as 1/21. Looking loosely at other details could similarly improve the odds; e.g. if receiving an interview invitation that later leads to a job is included, the odds would be even higher.
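Putting the job-search numbers together (these are the rough figures assumed above, not measured ones), a minimal sketch:

```python
# Assumed figure from above: a five-month average job search means an offer
# arrives on about 1 day in 150. Widening the window of days that would
# "count" as an answer raises the odds roughly in proportion.
daily_chance = 1 / 150

for window_days in (1, 2, 7):
    chance = window_days * daily_chance  # good approximation for small chances
    print(f"{window_days}-day window: about 1 in {round(1 / chance)}")
# 1-day window: about 1 in 150
# 2-day window: about 1 in 75
# 7-day window: about 1 in 21
```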

But since we are considering whether the odds might be as bad as 1/10^67, let’s assume we include a fair amount of detail. What are the odds that on a specific day a stranger tells someone that “Our Lady wants you to become a religious and she is afraid that you are going astray,” or words to that effect?

The odds here should be just as objective as the odds with the cards — there should be a real number here — for reasons explained elsewhere, but unfortunately, unlike the cards, we have nowhere near enough experience to get a precise number. Nonetheless it is easy to see that various details about the situation made it actually more likely than it would be for a perfectly random person. Since I had a certain opinion of my friend’s situation, that makes it far more likely than chance that other people aware of the situation would have a similar opinion. And although we are talking about a “stranger” here, that stranger was known to a third party who knew my friend, and we have no way of knowing what, if anything, might have passed through that channel.

If we arbitrarily assume that one in a million people in similar situations (i.e. where other people have similar opinions about them) hear such a thing at some point in their lives, and assume that we need to hit one particular day out of 50 years here, then we can “calculate” the chance: 1 / (365 * 50 * 1,000,000), or about 1 in 18 billion. To put it in counting terms, 1 in 18 billion novenas like this will result in the thing happening by chance.
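Spelled out in code, the assumed numbers combine like this:

```python
# The arbitrary assumptions above: 1 in a million people in similar
# situations ever hear such a thing, and it must land on one specific
# day out of a roughly 50-year span.
lifetime_rate = 1 / 1_000_000
days_in_span = 365 * 50            # 18,250 days

chance = lifetime_rate / days_in_span
print(f"about 1 in {1 / chance:,.0f}")  # about 1 in 18,250,000,000
```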

Now it may be that one in a million persons is too high (although if anything it may also be too low; the true value may be more like 1/100,000, which would make the overall probability about 1 in 1.8 billion). But it is easy to see that there is no reasonable way that you can say this is as unlikely as shuffling a deck of cards and getting it in the original order.

The Alternative Hypothesis

A thing that happens once in 18 billion person-days is not so rare that you would expect such things never to occur (although you would expect them to most likely not happen to you). Nonetheless, you might want to consider whether there is some better explanation than chance.

But a problem arises immediately: it is not clear that the alternative makes it much more likely. After all, I was very surprised by these events when they happened, even though at the time I did give them an explicitly religious explanation. Indeed, Fr. Joseph Bolin argues that you should not expect prayer to increase the chances of any event. But if this is the case, then the odds of it happening will be the same given the religious explanation as given the chance explanation, which means the event would not even be evidence for the religious explanation.

In actual fact, it is evidence for the religious explanation, but only because Fr. Joseph’s account is not necessarily true. It could be true that when one prays for something sufficiently rare, the chance of it happening increases by a factor of 1,000; the cases would still be so rare that people would not be likely to discover this fact.

Nonetheless, the evidence is much weaker than a probability of 1 in 18 billion would suggest, because even on the alternative hypothesis the events remain very unlikely. This is an application of the discussion here, where I argued that “anomalous” evidence should not change your opinion much about anything. This is actually something the debunkers get right, even if they are mistaken about other things.

A Correction Regarding Laplace

A few years ago, I quoted Stanley Jaki on an episode supposedly involving Laplace:

Laplace shouted, “We have had enough such myths,” when his fellow academician Marc-Auguste Pictet urged, in the full hearing of the Académie des Sciences, that attention be given to the report about a huge meteor shower that fell at L’Aigle, near Paris, on April 26, 1803.

I referred to this recently on Twitter. When another user found it surprising that Laplace would have said this, I attempted to track it down, and came to the conclusion that this very account is a “myth” itself, in some sense. Jaki tells the same story in different words in the book Miracles and Physics:

The defense of miracles done with an eye on physics should include a passing reference to meteorites. Characteristic of the stubborn resistance of scientific academies to those strange bits of matter was Laplace’s shouting, “We’ve had enough of such myths,” when Pictet, a fellow academician, urged a reconsideration of the evidence provided by “lay-people” as plain eyewitnesses.

(p. 94)

Jaki provides no reference in God and the Sun at Fatima. The text in Miracles and Physics has a footnote, but it provides generic related information that does not lead back to any such episode.

Did Jaki make it up? People do just make things up, but in this case whatever benefit Jaki might get from it would seem to be outweighed by the potential reputational damage of being discovered in such a lie, so it seems unlikely. More likely he is telling a story from memory, with the belief that the details just don’t matter very much. And since he provides plenty of other sources, I am sure he knows full well that he is omitting any source here, presumably because he does not have one at hand. He may even be trying to cover up this omission, in a sense, by footnoting the passage with information that does not source it. It seems likely that the story is a lecture hall account that has been modified by the passage of time. One reason to suppose such a source is that Jaki is not alone in the claim that Laplace opposed the idea of meteorites as stones from the sky until 1803. E.T. Jaynes, in Probability Theory: The Logic of Science, makes a similar claim:

Note that we can recognize the clear truth of this psychological phenomenon without taking any stand about the truth of the miracle; it is possible that the educated people are wrong. For example, in Laplace’s youth educated persons did not believe in meteorites, but dismissed them as ignorant folklore because they are so rarely observed. For one familiar with the laws of mechanics the notion that “stones fall from the sky” seemed preposterous, while those without any conception of mechanical law saw no difficulty in the idea. But the fall at Laigle in 1803, which left fragments studied by Biot and other French scientists, changed the opinions of the educated — including Laplace himself. In this case the uneducated, avid for the marvelous, happened to be right: c’est la vie.

(p. 505)

Like Jaki, Jaynes provides no source. Still, is that good enough reason to doubt the account? Let us examine a text from the book The History of Meteoritics and Key Meteorite Collections. In the article, “Meteorites in history,” Ursula Marvin remarks:

Early in 1802 the French mathematician Pierre-Simon de Laplace (1749-1827) raised the question at the National Institute of a lunar volcanic origin of fallen stones, and quickly gained support for this idea from two physicist colleagues Jean Baptiste Biot (1774-1862) and Siméon-Denis Poisson (1781-1840). The following September, Laplace (1802, p. 277) discussed it in a letter to von Zach.

The idea won additional followers when Biot (1803a) referred to it as ‘Laplace’s hypothesis’, although Laplace, himself, never published an article on it.

(p. 49)

This has a source for Laplace’s letter of 1802, although I was not able to find it online. It seems very unlikely that Laplace would have speculated on meteorites as coming from lunar volcanoes in 1802, and then called them “myths” in 1803. So where does this story come from? In Cosmic Debris: Meteorites in History, John Burke gives this account:

There is also a problem with respect to the number of French scientists who, after Pictet published a résumé of Howard’s article in the May 1802 issue of the Bibliothèque Britannique, continued to oppose the idea that stones fell from the atmosphere. One can infer from a statement of Lamétherie that there was considerable opposition, for he reported that when Pictet read a memoir to the Institut on the results of Howard’s report “he met with such disfavor that it required a great deal of fortitude for him to finish his reading.” However, Biot’s description of the session varies a good deal. Pictet’s account, he wrote, was received with a “cautious eagerness,” though the “desire to explain everything” caused the phenomenon to be rejected for a long time. There were, in fact, only three scientists who publicly expressed their opposition: the brothers Jean-André and Guillaume-Antoine Deluc of Geneva, and Eugène Patrin, an associate member of the mineralogy section of the Institut and librarian at the École des mines.

When Pictet early in 1801 published a favorable review of Chladni’s treatise, it drew immediate fire from the Deluc brothers. Jean, a strict Calvinist, employed the same explanation of a fall that the Fougeroux committee had used thirty years before: stones did not fall; the event was imagined when lightning struck close to the observer. Just as no fragment of our globe could separate and become lost in space, he wrote, fragments could not be detached from another planet. It was also very unlikely that solid masses had been wandering in space since the creation, because they would have long since fallen into the sphere of attraction of some planet. And even if they did fall, they would penetrate the earth to a great depth and shatter into a thousand pieces.

(p. 51)

It seems quite possible that Pictet’s “reading a memoir” here and “meeting with disfavor” (regardless of details, since Burke notes it had different descriptions at the time) is the same incident that Jaki describes as having been met with “We’ve had enough of such myths!” when Pictet “urged a reconsideration of the evidence.” If these words were ever said, then, they were presumably said by one of these brothers or someone else, and not by Laplace.

How does this sort of thing happen, if we charitably assume that Jaki was not being fundamentally dishonest? As stated above, it seems likely that he knew he did not have a source. He may even have been consciously aware that it might not have been Laplace who made this statement, if anyone did. But he was sure there was a dispute about the matter, and presumably thought that it just wasn’t too important who it was or what the details of the situation were, since the main point was that scientists are frequently reluctant to accept facts when those facts occur rarely and are not deliberately reproducible. And if we reduce Jaki’s position to these two things, namely, (1) that scientists at one point disputed the reality of meteorites, and (2) that this sort of thing frequently happens with rare and hard-to-reproduce phenomena, then the position is accurate.

But this behavior, the description of situations with the implication that the details just don’t matter much, is very bad, and directly contributes to the reluctance of many scientists to accept the reality of “extraordinary” phenomena, even in situations where they are, in fact, real.

Some Remarks on GPT-N

At the end of May, OpenAI published a paper on GPT-3, a language model which is a successor to their previous version, GPT-2. While quite impressive, the reaction from many people interested in artificial intelligence has been seriously exaggerated. Sam Altman, OpenAI’s CEO, has said as much himself:

The GPT-3 hype is way too much. It’s impressive (thanks for the nice compliments!) but it still has serious weaknesses and sometimes makes very silly mistakes. AI is going to change the world, but GPT-3 is just a very early glimpse. We have a lot still to figure out.

I used “GPT-N” in the title here because most of the comments I intend to make are almost completely general, and will apply to any future version that uses sufficiently similar methods.

What it does

GPT-3 is a predictive language model, that is, given an input text it tries to predict what would come next, much in the way that if you read the first few words of this sentence with the rest covered up, you might try to guess what would be likely to come next. To the degree that it does this well, it can be used to generate text from a “prompt,” that is, we give it something like a few words or a few sentences, and then add whatever it predicts should come next. For example, let’s take this very blog post and see what GPT-3 would like to say:

What it doesn’t do

While GPT-3 does seem to be able to generate some pretty interesting results, there are several limitations that need to be taken into account when using it.

First and foremost, and most importantly, it can’t do anything without a large amount of input data. If you want it to write like “a real human,” you need to give it a lot of real human writing. For most people, this means copying and pasting a lot. And while the program is able to read through that and get a feel for the way humans communicate, you can’t exactly use it to write essays or research papers. The best you could do is use it as a “fill in the blank” tool to write stories, and that’s not even very impressive.

While the program does learn from what it reads and is quite good at predicting words and phrases based on what has already been written, this method isn’t very effective at producing realistic prose. The best you could hope for is something like the “Deep Writing Machine” Twitter account, which spits out disconnected phrases in an ominous, but very bland voice.

In addition, the model is limited only to language. It does not understand context or human thought at all, so it has no way of tying anything together. You could use it to generate a massive amount of backstory and other material for a game, but that’s about it.

Finally, the limitations in writing are only reinforced by the limitations in reading. Even with a large library to draw on, the program is only as good as the parameters set for it. Even if you set it to the greatest writers mankind has ever known, without any special parameters, its writing would be just like anyone else’s.

The Model

GPT-3 consists of several layers. The first layer is a “memory network” that involves the program remembering previously entered data and using it when appropriate (i.e. it remembers commonly misspelled words and frequently used words). The next layer is the reasoning network, which involves common sense logic (i.e. if A, then B). The third is the repetition network, which involves pulling previously used material from memory and using it to create new combinations (i.e. using previously used words in new orders).

I added the bold formatting, the rest is as produced by the model. This was also done in one run, without repetitions. This is an important qualification, since many examples on the internet have been produced by deleting something produced by the model and forcing it to generate something new until something sensible resulted. Note that the model does not seem to have understood my line, “let’s take this very blog post and see what GPT-3 would like to say.” That is, rather than trying to “say” anything, it attempted to continue the blog post in the way I might have continued it without the block quote.
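For readers unfamiliar with how such models generate text, the loop is literally “predict a likely next word, append it, repeat.” GPT-3 does this with a very large neural network; the toy bigram sketch below is mine, and it resembles the real model in nothing but the shape of that loop:

```python
import random
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words followed it in the training text."""
    words = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, prompt, length=20):
    """Repeatedly predict a likely next word and append it to the text."""
    out = prompt.split()
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break
        # sample the next word in proportion to how often it followed before
        choices, weights = zip(*followers.items())
        out.append(random.choices(choices, weights=weights)[0])
    return " ".join(out)

corpus = "the model predicts the next word and then the next word after that"
print(generate(train_bigrams(corpus), "the model"))
```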

Truth vs Probability of Text

If we interpret the above text from GPT-3 “charitably”, much of it is true or close to true. But I use scare quotes here because when we speak of interpreting human speech charitably, we are assuming that someone was trying to speak the truth, and so we think, “What would they have meant if they were trying to say something true?” The situation is different here, because GPT-3 has no intention of producing truth, nor of avoiding it. Insofar as there is any intention, the intention is to produce the text which would be likely to come after the input text; in this case, as the input text was the beginning of this blog post, the intention was to produce the text that would likely follow in such a post. Note that there is an indirect relationship with truth, which explains why there is any truth at all in GPT-3’s remarks. If the input text is true, it is at least somewhat likely that what would follow would also be true, so if the model is good at guessing what would be likely to follow, it will be likely to produce something true in such cases. But it is just as easy to convince it to produce something false, simply by providing an input text that would be likely to be followed by something false.

This results in an absolute upper limit on the quality of the output of a model of this kind, including any successor version, as long as the model works by predicting the probability of the following text. Namely, its best output cannot be substantially better than the best content in its training data, which in this version is a large quantity of texts from the internet. The reason for this limitation is clear; to the degree that the model has any intention at all, the intention is to reflect the training data, not to surpass it. As an example, consider the difference between DeepMind’s AlphaGo and AlphaGo Zero. AlphaGo Zero is a better Go player than the original AlphaGo, and this is largely because the original is trained on human play, while AlphaGo Zero is trained from scratch on self play. In other words, the original version is to some extent predicting “what would a Go player play in this situation,” which is not the same as predicting “what move would win in this situation.”

Now I will predict (and perhaps even GPT-3 could predict) that many people will want to jump in and say, “Great. That shows you are wrong. Even the original AlphaGo plays Go much better than a human. So there is no reason that an advanced version of GPT-3 could not be better than humans at saying things that are true.”

The difference, of course, is that AlphaGo was trained in two ways, first on predicting what move would be likely in a human game, and second on what would be likely to win, based on its experience during self play. If you had trained the model only on predicting what would follow in human games, without the second aspect, the model would not have resulted in play that substantially improved upon human performance. But in the case of GPT-3 or any model trained in the same way, there is no selection whatsoever for truth as such; it is trained only to predict what would follow in a human text. So no successor to GPT-3, in the sense of a model of this particular kind, however large, will ever be able to produce output better than human, or in its own words, “its writing would be just like anyone else’s.”

Self Knowledge and Goals

OpenAI originally claimed that GPT-2 was too dangerous to release; ironically, they now intend to sell access to GPT-3. Nonetheless, many people, in large part those influenced by the opinions of Nick Bostrom and Eliezer Yudkowsky, continue to worry that an advanced version might turn out to be a personal agent with nefarious goals, or at least goals that would conflict with the human good. Thus Alexander Kruel:

GPT-2: *writes poems*
Skeptics: Meh
GPT-3: *writes code for a simple but functioning app*
Skeptics: Gimmick.
GPT-4: *proves simple but novel math theorems*
Skeptics: Interesting but not useful.
GPT-5: *creates GPT-6*
Skeptics: Wait! What?
GPT-6: *FOOM*
Skeptics: *dead*

In a sense the argument is moot, since I have explained above why no future version of GPT will ever be able to produce anything better than people can produce themselves. But even if we ignore that fact, GPT-3 is not a personal agent of any kind, and seeks goals in no meaningful sense, and the same will apply to any future version that works in substantially the same way.

The basic reason for this is that GPT-3 is disembodied, in the sense of this earlier post on Nick Bostrom’s orthogonality thesis. The only thing it “knows” is texts, and the only “experience” it can have is receiving an input text. So it does not know that it exists, it cannot learn that it can affect the world, and consequently it cannot engage in goal seeking behavior.

You might object that it can in fact affect the world, since it is in fact in the world. Its predictions cause an output, and that output is in the world. And that output can be reintroduced as input (which is how “conversations” with GPT-3 are produced). Thus it seems it can experience the results of its own activities, and thus should be able to acquire self knowledge and goals. This objection is not ultimately correct, but it is not so far from the truth. You would not need extremely large modifications in order to make something that in principle could acquire self knowledge and seek goals. The main reason that this cannot happen is the “P” in “GPT,” that is, the fact that the model is “pre-trained.” The only learning that can happen is the learning that happens while it is reading an input text, and the purpose of that learning is to guess what is happening in the one specific text, for the purpose of guessing what is coming next in this text. All of this learning vanishes upon finishing the prediction task and receiving another input. A secondary reason is that since the only experience it can have is receiving an input text, even if it were given a longer memory, it would probably not be possible for it to notice that its outputs were caused by its predictions, because it likely has no internal mechanism to reflect on the predictions themselves.

Nonetheless, if you “fixed” these two problems, by allowing it to continue to learn, and by allowing its internal representations to be part of its own input, there is nothing in principle that would prevent it from achieving self knowledge, and from seeking goals. Would this be dangerous? Not very likely. As indicated elsewhere, motivation produced in this way and without the biological history that produced human motivation is not likely to be very intense. In this context, if we are speaking of taking a text-predicting model and adding on an ability to learn and reflect on its predictions, it is likely to enjoy doing those things and not much else. For many this argument will seem “hand-wavy” and very weak. I could go into this at more depth, but I will not do so at this time, and will simply invite the reader to spend more time thinking about it. Dangerous or not, would it be easy to make these modifications? Nothing in this description sounds difficult, but no, it would not be easy. Actually making an artificial intelligence is hard. But this is a story for another time.

The Power of a Name

Fairy tales and other stories occasionally suggest the idea that a name gives some kind of power over the thing named, or at least that one’s problems concerning a thing may be solved by knowing its name, as in the story of Rumpelstiltskin. There is perhaps a similar suggestion in Revelation 2:17, “Whoever has ears, let them hear what the Spirit says to the churches. To the one who is victorious, I will give some of the hidden manna. I will also give that person a white stone with a new name written on it, known only to the one who receives it.” The secrecy of the new name may indicate (among other things) that others will have no power over that person.

There is more truth in this idea than one might assume without much thought. For example, anonymous authors do not want to be “doxxed” because knowing the name of the author really does give some power in relation to them which is not had without the knowledge of their name. Likewise, as a blogger, occasionally I want to cite something, but cannot remember the name of the author or article where the statement is made. Even if I remember the content fairly clearly, lacking the memory of the name makes finding the content far more difficult, while on the other hand, knowing the name gives me the power of finding the content much more easily.

But let us look a bit more deeply into this. Hilary Lawson, whose position was somewhat discussed here, has a discussion along these lines in Part II of his book, Closure: A Story of Everything. Since he denies that language truly refers to the world at all, as I mentioned in the linked post on his position, it is important to him that language has other effects, and in particular has practical goals. He says in chapter 4:

In order to understand the mechanism of practical linguistic closure consider an example where a proficient speaker of English comes across a new word. Suppose that we are visiting a zoo with a friend. We stand outside a cage and our friend says: ‘An aasvogel.’ …

It might appear at first from this example that nothing has been added by the realisation of linguistic closure. The sound ‘aasvogel’ still sounds the same, the image of the bird still looks the same. So what has changed? The sensory closures on either side may not have changed, but a new closure has been realised. A new closure which is in addition to the prior available closures and which enables intervention which was not possible previously. For example, we now have a means of picking out this particular bird in the zoo because the meaning that has been realised will have identified a something in virtue of which this bird is an aasvogel and which thus enables us to distinguish it from others. As a result there will be many consequences for how we might be able to intervene.

The important point here is simply that naming something, even before taking any additional steps, immediately gives one the ability to do various practical things that one could not previously do. In a passage by Helen Keller, previously quoted here, she says:

Since I had no power of thought, I did not compare one mental state with another. So I was not conscious of any change or process going on in my brain when my teacher began to instruct me. I merely felt keen delight in obtaining more easily what I wanted by means of the finger motions she taught me.

We may have similar experiences as adults learning a foreign language while living abroad. At first one has very little ability to interact with the foreign world, but suddenly everything is possible.

Or consider the situation of a hunter gatherer who may not know how to count. It may be obvious to them that a bigger pile of fruit is better than a smaller one, but if two piles look similar, they may have no way to know which is better. But once they decide to give “one fruit and another” a name like “two,” and “two and one” a name like “three,” and so on, suddenly they obtain a great advantage that they previously did not possess. It is now possible to count piles and to discover that one pile has sixty-four while another has sixty-three. And it turns out that by treating the “sixty-four” as bigger than the other pile, although it does not look bigger, they end up better off.

In this sense one could look at the scientific enterprise of looking for mathematical laws of nature as one long process of looking for better names. We can see that some things are faster and some things are slower, but the vague names “fast” and “slow” cannot accomplish much. Once we can name different speeds more precisely, we can put them all in order and accomplish much more, just as the hunter gatherer can accomplish more after learning to count. And this extends to the full power of technology: the men who landed on the moon did so ultimately due to the power of names.

If you take Lawson’s view, that language does not refer to the world at all, all of this is basically casting magic spells. In fact, he spells this out himself, in so many words, in chapter 3:

All material is in this sense magical. It enables intervention that cannot be understood. Ancient magicians were those who had access to closures that others did not know, in the same way that the Pharaohs had access to closures not available to their subjects. This gave them a supernatural character. It is now thought that their magic has been explained, as the knowledge of herbs, metals or the weather. No such thing has taken place. More powerful closures have been realised, more powerful magic that can subsume the feeble closures of those magicians. We have simply lost sight of its magical character. Anthropology has many accounts of tribes who on being observed by a Western scientist believe that the observer has access to some very powerful magic. Magic that produces sound and images from boxes, and makes travel swift. We are inclined to smile patronisingly believing that we merely have knowledge — the technology behind radio and television, and motor vehicles — and not magic. The closures behind the technology do indeed provide us with knowledge and understanding and enable us to handle activity, but they do not explain how the closures enable intervention. How the closures are successful remains incomprehensible and in this sense is our magic.

I don’t think we should dismiss this point of view entirely, but I do think it is more mistaken than otherwise, basically because of the original mistake of thinking that language cannot refer to the world. But the point that names are extremely powerful is correct and important, to the point where even the analogy of technology as “magic that works” does make a certain amount of sense.

Counterfactuals as Historical Fiction

Suppose someone reading Anne of Green Gables asks a question about what happened before the story begins. For example, what did Anne have for lunch 37 days before her arrival in Avonlea?

It is easy to see that this question does not have one true answer. There is no such thing as what she really had for lunch, because it is a story, and that meal is not included in it. On the other hand, despite the lack of any absolute truth here, some answers remain more reasonable than others. For example, “She had salad,” is a more sensible answer than “she ate crushed glass that day.” Just as I said in regard to “why” something is the case, one can give a partial answer, in the sense of showing that some options are more intelligible than others, without being able to exclude some options entirely.

These same things will apply to questions about a work of historical fiction, although the intended historical context will provide additional ways to show that some answers are more sensible than others. Thus if a story is set in ancient Rome, the claim that someone had corn for lunch is unreasonable due to the historical context, although not as unreasonable as some other possibilities that you could suggest.

Now consider a counterfactual question about your current situation: “What would you do if it were 120 degrees Fahrenheit in your house?”

There is no fundamental difference between this and the case of historical fiction. In effect, we just created a story about you: “It was 120 degrees in your house. You…”

Like the case of historical fiction, some answers will be more sensible than others, but there is no thing that you really would do in that situation. The story didn’t really take place, but if it did, it would have taken place with a lot more concrete detail, and that concrete detail could determine the specific answer to the question. If Anne of Green Gables were a true story, her concrete situation would have determined what she had for lunch that day. And if it were really 120 degrees in your house, what you would do would depend on how and why things got that way, as well as other factors in your concrete situation.

Some philosophers have spent a lot of time on this kind of counterfactual question, apparently largely from a desire for absolute answers. For example, some suggest that a counterfactual is true if the claim is true in the nearest possible world where the antecedent is true. In a similar way, Molinists argue that in order to be omniscient, God has to know what you would do if it were 120 degrees in your house, and that it must be one specific thing, so that there is one thing that you really would do in that situation. They call this kind of knowledge “middle” knowledge, namely something in between knowledge of what actually is and knowledge of what merely might have been.

All accounts of this kind are wasted effort. The brief account above is sufficient.

Rao’s Divergentism

The main point of this post is to encourage the reader who has not yet done so to read Venkatesh Rao’s essay “Can You Hear Me Now?”. I will not say too much about it; the purpose is partly future reference, and partly to point out a connection with some current topics here.

Rao begins:

The fundamental question of life, the universe and everything is the one popularized by the Verizon guy in the ad: Can you hear me now?

This conclusion grew out of a conversation I had about a year ago, with some friends, in which I proposed a modest-little philosophy I dubbed divergentism. Here is a picture.

[Image: divergentism diagram, from https://206hwf3fj4w52u3br03fi242-wpengine.netdna-ssl.com/wp-content/uploads/2015/12/divergentism.jpg]

Divergentism is the idea that as individuals grow out into the universe, they diverge from each other in thought-space. This, I argued, is true even if in absolute terms, the sum of shared beliefs is steadily increasing. Because the sum of beliefs that are not shared increases even faster on average. Unfortunately, you are unique, just like everybody else.

If you are a divergentist, you believe that as you age, the average answer to the fundamental Verizon question slowly drifts from yes, to no, to silence. If you’re unlucky, you’re a hedgehog and get unhappier and unhappier about this as you age. If you are lucky, you’re a fox and you increasingly make your peace with this condition. If you’re really lucky, you die too early to notice the slowly descending silence, before it even becomes necessary to Google the phrase existential horror.

To me, this seemed like a completely obvious idea. Much to my delight, most people I ran it by immediately hated it.

The entire essay is worth reading.

I would question whether this is really the “fundamental question of life, the universe, and everything,” but Rao has a point. People do tend to think of their life as meaningful on account of social connections, and if those social connections grow increasingly weaker, they will tend to worry that their life is becoming less meaningful.

The point about the intellectual life of an individual is largely true. This is connected to what I said about the philosophical progress of an individual some days ago. There is also a connection with Kuhn’s idea of how the progress of the sciences causes a gulf to arise between them in such a way that it becomes more and more difficult for scientists in different fields to communicate with one another. If we look at the overall intellectual life of an individual as a sort of individual advancing science, the “sciences” of each individual will generally speaking tend to diverge from one another, allowing less and less communication. This is not about people making mistakes, although obviously making mistakes will contribute to this process. As Rao says, it may be that “the sum of shared beliefs is steadily increasing,” but this will not prevent their intellectual lives overall from diverging, just as the divergence of the sciences does not result from falsity, but from increasingly detailed focus on different truths.

Technical Discussion and Philosophical Progress

In The Structure of Scientific Revolutions (p. 19-21), Thomas Kuhn remarks on the tendency of sciences to acquire a technical vocabulary and manner of discussion:

We shall be examining the nature of this highly directed or paradigm-based research in the next section, but must first note briefly how the emergence of a paradigm affects the structure of the group that practices the field. When, in the development of a natural science, an individual or group first produces a synthesis able to attract most of the next generation’s practitioners, the older schools gradually disappear. In part their disappearance is caused by their members’ conversion to the new paradigm. But there are always some men who cling to one or another of the older views, and they are simply read out of the profession, which thereafter ignores their work. The new paradigm implies a new and more rigid definition of the field. Those unwilling or unable to accommodate their work to it must proceed in isolation or attach themselves to some other group. Historically, they have often simply stayed in the departments of philosophy from which so many of the special sciences have been spawned. As these indications hint, it is sometimes just its reception of a paradigm that transforms a group previously interested merely in the study of nature into a profession or, at least, a discipline. In the sciences (though not in fields like medicine, technology, and law, of which the principal raison d’être is an external social need), the formation of specialized journals, the foundation of specialists’ societies, and the claim for a special place in the curriculum have usually been associated with a group’s first reception of a single paradigm. At least this was the case between the time, a century and a half ago, when the institutional pattern of scientific specialization first developed and the very recent time when the paraphernalia of specialization acquired a prestige of their own.

The more rigid definition of the scientific group has other consequences. When the individual scientist can take a paradigm for granted, he need no longer, in his major works, attempt to build his field anew, starting from first principles and justifying the use of each concept introduced. That can be left to the writer of textbooks. Given a textbook, however, the creative scientist can begin his research where it leaves off and thus concentrate exclusively upon the subtlest and most esoteric aspects of the natural phenomena that concern his group. And as he does this, his research communiqués will begin to change in ways whose evolution has been too little studied but whose modern end products are obvious to all and oppressive to many. No longer will his researches usually be embodied in books addressed, like Franklin’s Experiments . . . on Electricity or Darwin’s Origin of Species, to anyone who might be interested in the subject matter of the field. Instead they will usually appear as brief articles addressed only to professional colleagues, the men whose knowledge of a shared paradigm can be assumed and who prove to be the only ones able to read the papers addressed to them.

Today in the sciences, books are usually either texts or retrospective reflections upon one aspect or another of the scientific life. The scientist who writes one is more likely to find his professional reputation impaired than enhanced. Only in the earlier, pre-paradigm, stages of the development of the various sciences did the book ordinarily possess the same relation to professional achievement that it still retains in other creative fields. And only in those fields that still retain the book, with or without the article, as a vehicle for research communication are the lines of professionalization still so loosely drawn that the layman may hope to follow progress by reading the practitioners’ original reports. Both in mathematics and astronomy, research reports had ceased already in antiquity to be intelligible to a generally educated audience. In dynamics, research became similarly esoteric in the later Middle Ages, and it recaptured general intelligibility only briefly during the early seventeenth century when a new paradigm replaced the one that had guided medieval research. Electrical research began to require translation for the layman before the end of the eighteenth century, and most other fields of physical science ceased to be generally accessible in the nineteenth. During the same two centuries similar transitions can be isolated in the various parts of the biological sciences. In parts of the social sciences they may well be occurring today. Although it has become customary, and is surely proper, to deplore the widening gulf that separates the professional scientist from his colleagues in other fields, too little attention is paid to the essential relationship between that gulf and the mechanisms intrinsic to scientific advance.

As Kuhn says, this tendency has very well known results. Consider the papers constantly being published at arxiv.org, for example. If you are not familiar with the science in question, you will likely not be able to understand even the title, let alone the summary or the content. Many or most of the words will be meaningless to you, and even if they are not, their combinations will be.

It is also not difficult to see why this happens, and why it must happen. Everything we understand, we understand through form, which is a network of relationships. Thus if particular investigators wish to go into something in greater detail, these relationships will become more and more remote from the ordinary knowledge accessible to everyone. “Just say it in simple words” will become literally impossible, in the sense that explaining the “simple” statement will involve explaining a huge number of relationships that by default a person would have no knowledge of. That is the purpose, as Kuhn notes, of textbooks, namely to form connections between everyday knowledge and the more complex relationships studied in particular fields.

In Chapter XIII, Kuhn relates this sort of development to the word “science” and to progress:

The preceding pages have carried my schematic description of scientific development as far as it can go in this essay. Nevertheless, they cannot quite provide a conclusion. If this description has at all caught the essential structure of a science’s continuing evolution, it will simultaneously have posed a special problem: Why should the enterprise sketched above move steadily ahead in ways that, say, art, political theory, or philosophy does not? Why is progress a perquisite reserved almost exclusively for the activities we call science? The most usual answers to that question have been denied in the body of this essay. We must conclude it by asking whether substitutes can be found.

Notice immediately that part of the question is entirely semantic. To a very great extent the term ‘science’ is reserved for fields that do progress in obvious ways. Nowhere does this show more clearly than in the recurrent debates about whether one or another of the contemporary social sciences is really a science. These debates have parallels in the pre-paradigm periods of fields that are today unhesitatingly labeled science. Their ostensible issue throughout is a definition of that vexing term. Men argue that psychology, for example, is a science because it possesses such and such characteristics. Others counter that those characteristics are either unnecessary or not sufficient to make a field a science. Often great energy is invested, great passion aroused, and the outsider is at a loss to know why. Can very much depend upon a definition of ‘science’? Can a definition tell a man whether he is a scientist or not? If so, why do not natural scientists or artists worry about the definition of the term? Inevitably one suspects that the issue is more fundamental. Probably questions like the following are really being asked: Why does my field fail to move ahead in the way that, say, physics does? What changes in technique or method or ideology would enable it to do so? These are not, however, questions that could respond to an agreement on definition. Furthermore, if precedent from the natural sciences serves, they will cease to be a source of concern not when a definition is found, but when the groups that now doubt their own status achieve consensus about their past and present accomplishments. It may, for example, be significant that economists argue less about whether their field is a science than do practitioners of some other fields of social science. Is that because economists know what science is? Or is it rather economics about which they agree?

The last point is telling. There is significantly more consensus among economists than among practitioners of the other social sciences, and consequently less worry about whether their field is scientific or not. The difference, then, is a difference of how much agreement is found. There is not necessarily any difference with respect to the kind of increasingly detailed thought that results in increasingly technical discussion. Kuhn remarks:

The theologian who articulates dogma or the philosopher who refines the Kantian imperatives contributes to progress, if only to that of the group that shares his premises. No creative school recognizes a category of work that is, on the one hand, a creative success, but is not, on the other, an addition to the collective achievement of the group. If we doubt, as many do, that nonscientific fields make progress, that cannot be because individual schools make none. Rather, it must be because there are always competing schools, each of which constantly questions the very foundations of the others. The man who argues that philosophy, for example, has made no progress emphasizes that there are still Aristotelians, not that Aristotelianism has failed to progress.

In this sense, if a particular school believes they possess the general truth about some matter (here theology or philosophy), they will quite naturally begin to discuss it in greater detail and in ways which are mainly intelligible to students of that school, just as happens in other technical fields. The field is only failing to progress in the sense that there are other large communities making contrasting claims, while we begin to use the term “science” and to speak of progress when one school completely dominates the field, and to a first approximation even people who know nothing about it assume that the particular school has things basically right.

What does this imply about progress in philosophy?

1. There is progress in the knowledge of topics that were once considered “philosophy,” but when we get to this point, we usually begin to use the name of a particular science, and with good reason, since technical specialization arises in the manner discussed above. Tyler Cowen discusses this sort of thing here.

2. Areas in which there doesn’t seem to be such progress are probably most often areas where human knowledge remains at an early stage of development; it is precisely at such early stages that discussion does not have a technical character and can generally be understood by ordinary people without a specialized education. I pointed out that Aristotle was mistaken to assume that the sciences in general were fully developed. We would be equally mistaken to make such an assumption at the present time. As Kuhn notes, astronomy and mathematics achieved a “scientific” stage centuries before geology and biology did the same, and these long before economics and the like. The conclusion that one should draw is that metaphysics is hard, not that it is impossible or meaningless.

3. Even now, particular philosophical schools or individuals can make progress even without such consensus. This is evidently true if their overall position is correct or more correct than that of others, but it remains true even if their overall position is more wrong than that of other schools. Naturally, in the latter situation, they will not advance beyond the better position of other schools, but they will advance.

4. One who wishes to progress philosophically cannot avoid the tendency to technical specialization, even as an individual. This can be rather problematic for bloggers and people engaging in similar projects. John Nerst describes this problem:

The more I think about this issue the more unsolvable it seems to become. Loyal readers of a publication won’t be satisfied by having the same points reiterated again and again. News media get around this by focusing on, well, news. News are events, you can describe them and react to them for a while until they’re no longer news. Publications that aim to be more analytical and focus on discussing ideas, frameworks, slow processes and large-scale narratives instead of events have a more difficult task because their subject matter doesn’t change quickly enough for it to be possible to churn out new material every day without repeating yourself[2].

Unless you start building upwards. Instead of laying out stone after stone on the ground you put one on top of another, and then one on top of two others laying next to each other, and then one on top of all that, making a single three-level structure. In practice this means writing new material that builds on what came before, taking ideas further and further towards greater complexity, nuance and sophistication. This is what academia does when working correctly.

Mass media (including the more analytical outlets) do it very little and it’s obvious why: it’s too demanding[3]. If an article references six other things you need to have read to fully understand it you’re going to have a lot of difficulty attracting new readers.

Some of his conclusions:

I think that’s the real reason I don’t try to pitch more writing to various online publications. In my summary of 2018 I said it was because I thought my writing was “too idiosyncratic, abstract and personal to fit in anywhere but my own blog”. Now I think the main reason is that I don’t so much want to take part in public debate or make myself a career. I want to explore ideas that lie at the edge of my own thinking. To do that I must assume that a reader knows broadly the same things I know and I’m just not that interested in writing about things where I can’t do that[9]. I want to follow my thoughts to for me new and unknown places — and import whatever packages I need to do it. This style isn’t compatible with the expectation that a piece will be able to stand on its own and deliver a single recognizable (and defensible) point[10].

The downside is of course obscurity. To achieve both relevance in the wider world and to build on other ideas enough to reach for the sky you need extraordinary success — so extraordinary that you’re essentially pulling the rest of the world along with you.

Obscurity is certainly one result. Another (relevant at least from the VP’s point of view) is disrespect. Scientists are generally respected despite the general incomprehensibility of their writing, on account of the absence of opposing schools. This lack leads people to assume that their arguments must be mostly right, even though they cannot understand them themselves. This can actually lead to an “Emperor has No Clothes” situation, where a scientist publishes something basically crazy, but others, even in his field, are reluctant to say so because they might appear to be the ones who are ignorant. As an example, consider Joy Christian’s “Disproof of Bell’s Theorem.” After reading this text, Scott Aaronson comments:

In response to my post criticizing his “disproof” of Bell’s Theorem, Joy Christian taunted me that “all I knew was words.”  By this, he meant that my criticisms were entirely based on circumstantial evidence, for example that (1) Joy clearly didn’t understand what the word “theorem” even meant, (2) every other sentence he uttered contained howling misconceptions, (3) his papers were written in an obscure, “crackpot” way, and (4) several people had written very clear papers pointing out mathematical errors in his work, to which Joy had responded only with bluster.  But I hadn’t actually studied Joy’s “work” at a technical level.  Well, yesterday I finally did, and I confess that I was astonished by what I found.  Before, I’d actually given Joy some tiny benefit of the doubt—possibly misled by the length and semi-respectful tone of the papers refuting his claims.  I had assumed that Joy’s errors, though ultimately trivial (how could they not be, when he’s claiming to contradict such a well-understood fact provable with a few lines of arithmetic?), would nevertheless be artfully concealed, and would require some expertise in geometric algebra to spot.  I’d also assumed that of course Joy would have some well-defined hidden-variable model that reproduced the quantum-mechanical predictions for the Bell/CHSH experiment (how could he not?), and that the “only” problem would be that, due to cleverly-hidden mistakes, his model would be subtly nonlocal.

What I actually found was a thousand times worse: closer to the stuff freshmen scrawl on an exam when they have no clue what they’re talking about but are hoping for a few pity points.  It’s so bad that I don’t understand how even Joy’s fellow crackpots haven’t laughed this off the stage.  Look, Joy has a hidden variable λ, which is either 1 or -1 uniformly at random.  He also has a measurement choice a of Alice, and a measurement choice b of Bob.  He then defines Alice and Bob’s measurement outcomes A and B via the following functions:

A(a,λ) = something complicated = (as Joy correctly observes) λ

B(b,λ) = something complicated = (as Joy correctly observes) -λ

I shit you not.  A(a,λ) = λ, and B(b,λ) = -λ.  Neither A nor B has any dependence on the choices of measurement a and b, and the complicated definitions that he gives for them turn out to be completely superfluous.  No matter what measurements are made, A and B are always perfectly anticorrelated with each other.

You might wonder: what could lead anyone—no matter how deluded—even to think such a thing could violate the Bell/CHSH inequalities?

“Give opposite answers in all cases” is a trivially local rule that cannot possibly violate Bell’s inequality. Thus the rest of Joy’s paper has no bearing whatsoever on the issue: it is essentially meaningless nonsense. Aaronson says he was possibly “misled by the length and semi-respectful tone of the papers refuting his claims.” But it is not difficult to see why people would be cautious in this way: they feared that they would turn out to be the ones missing something important.
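The point is easy to check computationally. Here is a minimal sketch (my own illustration, not anything from Aaronson’s post) that computes the CHSH quantity for the model as Aaronson summarizes it; the measurement settings are arbitrary placeholder labels, since the model ignores them entirely:

```python
import random

# Joy's model, as Aaronson summarizes it: the outcomes ignore
# the measurement settings a and b entirely.
def A(a, lam):   # Alice's outcome
    return lam

def B(b, lam):   # Bob's outcome
    return -lam

def correlation(a, b, trials=100_000):
    total = 0
    for _ in range(trials):
        lam = random.choice([1, -1])   # hidden variable, uniform on {+1, -1}
        total += A(a, lam) * B(b, lam)
    return total / trials

# CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b').
# Any local hidden-variable model satisfies |S| <= 2; quantum
# mechanics can reach 2*sqrt(2), about 2.83.
a, a2, b, b2 = 0, 45, 22.5, 67.5   # placeholder setting labels, unused above
S = (correlation(a, b) - correlation(a, b2)
     + correlation(a2, b) + correlation(a2, b2))
print(S)   # exactly -2.0 on every run
```

Since A·B = -λ² = -1 on every single trial, every correlation is exactly -1 and S is exactly -2, safely inside the classical bound; the model is not even a candidate for violating the inequality.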

The individual blogger in philosophy, however, is in a different position. If they wish to develop their thought it must become more technical, and there is no similar community backing that would cause others to assume that the writing basically makes sense. Thus, one’s writing is not only likely to become more and more obscure, but others will become more and more likely to assume that it is more or less meaningless word salad. This will happen even more to the degree that there is cultural opposition to one’s vocabulary, concepts, and topics.

Tautologies Not Trivial

In mathematics and logic, one sometimes speaks of a “trivial truth” or “trivial theorem”, referring to a tautology. In answer to this Quora question, for example, Daniil Kozhemiachenko gives the following:

The fact that all groups of order 2 are isomorphic to one another and commutative entails that there are no non-Abelian groups of order 2.

This statement is a tautology because “Abelian group” here just means one that is commutative: the statement is like the customary example of asserting that “all bachelors are unmarried.”
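The tautology can even be verified mechanically. The following brute-force sketch (my own illustration, not from the Quora discussion) enumerates every possible multiplication table on a two-element set with a fixed identity, and confirms that exactly one group results and that it is commutative:

```python
from itertools import product

elements = [0, 1]   # a two-element set; we fix 0 as the identity

def is_group(table):
    # identity: 0 * x == x * 0 == x for all x
    if any(table[(0, x)] != x or table[(x, 0)] != x for x in elements):
        return False
    # associativity: (x * y) * z == x * (y * z)
    for x, y, z in product(elements, repeat=3):
        if table[(table[(x, y)], z)] != table[(x, table[(y, z)])]:
            return False
    # inverses: every element can be multiplied back to the identity
    return all(any(table[(x, y)] == 0 for y in elements) for x in elements)

tables = [t for t in (dict(zip(product(elements, repeat=2), values))
                      for values in product(elements, repeat=4))
          if is_group(t)]

print(len(tables))   # 1 -- a unique group of order 2, up to relabeling
t = tables[0]
print(all(t[(x, y)] == t[(y, x)]
          for x, y in product(elements, repeat=2)))   # True: commutative
```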

Some extend this usage of “trivial” to refer to all statements that are true in virtue of the meaning of the terms, sometimes called “analytic.” The effect of this is to say that all statements that are logically necessary are trivial truths. An example of this usage can be seen in this paper by Carin Robinson. Robinson says at the end of the summary:

Firstly, I do not ask us to abandon any of the linguistic practises discussed; merely to adopt the correct attitude towards them. For instance, where we use the laws of logic, let us remember that there are no known/knowable facts about logic. These laws are therefore, to the best of our knowledge, conventions not dissimilar to the rules of a game. And, secondly, once we pass sentence on knowing, a priori, anything but trivial truths we shall have at our disposal the sharpest of philosophical tools. A tool which can only proffer a better brand of empiricism.

While the word “trivial” does have a corresponding Latin form that means ordinary or commonplace, the English word seems to be taken mainly from the “trivium” of grammar, rhetoric, and logic. This would seem to make some sense of calling logical necessities “trivial,” in the sense that they pertain to logic. Still, even here something is missing, since Robinson wants to include the truths of mathematics as trivial, and classically these did not pertain to the aforesaid trivium.

Nonetheless, overall Robinson’s intention, and presumably that of others who speak this way, is to suggest that such things are trivial in the English sense of “unimportant.” That is, they may be important tools, but they are not important for understanding. This is clear at least in our example: Robinson calls them trivial because “there are no known/knowable facts about logic.” Logical necessities tell us nothing about reality, and therefore they provide us with no knowledge. They are true by the meaning of the words, and therefore they cannot be true by reason of facts about reality.

Things that are logically necessary are not trivial in this sense. They are important, both in a practical way and directly for understanding the world.

Consider the failure of the Mars Climate Orbiter:

On November 10, 1999, the Mars Climate Orbiter Mishap Investigation Board released a Phase I report, detailing the suspected issues encountered with the loss of the spacecraft. Previously, on September 8, 1999, Trajectory Correction Maneuver-4 was computed and then executed on September 15, 1999. It was intended to place the spacecraft at an optimal position for an orbital insertion maneuver that would bring the spacecraft around Mars at an altitude of 226 km (140 mi) on September 23, 1999. However, during the week between TCM-4 and the orbital insertion maneuver, the navigation team indicated the altitude may be much lower than intended at 150 to 170 km (93 to 106 mi). Twenty-four hours prior to orbital insertion, calculations placed the orbiter at an altitude of 110 kilometers; 80 kilometers is the minimum altitude that Mars Climate Orbiter was thought to be capable of surviving during this maneuver. Post-failure calculations showed that the spacecraft was on a trajectory that would have taken the orbiter within 57 kilometers of the surface, where the spacecraft likely skipped violently on the uppermost atmosphere and was either destroyed in the atmosphere or re-entered heliocentric space.[1]

The primary cause of this discrepancy was that one piece of ground software supplied by Lockheed Martin produced results in a United States customary unit, contrary to its Software Interface Specification (SIS), while a second system, supplied by NASA, expected those results to be in SI units, in accordance with the SIS. Specifically, software that calculated the total impulse produced by thruster firings produced results in pound-force seconds. The trajectory calculation software then used these results – expected to be in newton seconds – to update the predicted position of the spacecraft.

It is presumably an analytic truth that a pound-force second, given how the units are defined, is not equal to a newton second. But it was ignoring this analytic truth that was the primary cause of the space probe’s failure. So it is evident that analytic truths can be extremely important for practical purposes.
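The arithmetic of the mistake is simple. Here is a minimal sketch (with illustrative numbers, not the actual mission figures) of the kind of mismatch involved:

```python
# One pound-force second expressed in newton-seconds.
N_S_PER_LBF_S = 4.448222

def thruster_impulse_lbf_s(thrust_lbf, burn_seconds):
    """Ground software: reports total impulse in pound-force seconds."""
    return thrust_lbf * burn_seconds

reported = thruster_impulse_lbf_s(thrust_lbf=1.0, burn_seconds=10.0)

# The bug: the trajectory software consumed the number as newton-seconds
# without conversion, underestimating every impulse by a factor of ~4.45.
interpreted_as_N_s = reported
correct_N_s = reported * N_S_PER_LBF_S   # what the interface spec required

print(interpreted_as_N_s, correct_N_s)   # 10.0 vs ~44.5
```

Every thruster firing fed into the trajectory calculation was thus understated by a factor of about 4.45, and the accumulated error produced the fatally low approach.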

Such truths can also be important for understanding reality. In fact, they are typically more important for understanding than other truths. The argument against this is that if something is necessary in virtue of the meaning of the words, it cannot be telling us something about reality. But this argument is wrong for one simple reason: words and meaning themselves are both elements of reality, and so they do tell us something about reality, even when the truth is fully determinate given the meaning.

If one accepts the mistaken argument, in fact, sometimes one is led even further. Logically necessary truths cannot tell us anything important for understanding reality, since they are simply facts about the meaning of words. On the other hand, anything which is not logically necessary is in some sense accidental: it might have been otherwise. But accidental things that might have been otherwise cannot help us to understand reality in any deep way: it tells us nothing deep about reality to note that there is a tree outside my window at this moment, when this merely happens to be the case, and could easily have been otherwise. Therefore, since neither logically necessary things, nor logically contingent things, can help us to understand reality in any deep or important way, such understanding must be impossible.

It is fairly rare to make such an argument explicitly, but it is a common implication of many arguments that are actually made or suggested, or it at least influences the way people feel about arguments and understanding.  For example, consider this comment on an earlier post. Timocrates suggests that (1) if you have a first cause, it would have to be a brute fact, since it doesn’t have any other cause, and (2) describing reality can’t tell us any reasons but is “simply another description of how things are.” The suggestion behind these objections is that the very idea of understanding is incoherent. As I said there in response, it is true that every true statement is in some sense “just a description of how things are,” but that was what a true statement was meant to be in any case. It surely was not meant to be a description of how things are not.

That “analytic” or “tautologous” statements can indeed provide a non-trivial understanding of reality can also easily be seen by example. Some examples from this blog:

Good and being. The convertibility of being and goodness is “analytic,” in the sense that carefully thinking about the meaning of desire and the good reveals that a universe where existence as such was bad, or even failed to be good, is logically impossible. In particular, it would require a universe where there is no tendency to exist, and this is impossible given that it is posited that something exists.

Natural selection. One of the most important elements of Darwin’s theory of evolution is the following logically necessary statement: the things that have survived are more likely to be the things that were more likely to survive, and less likely to be the things that were less likely to survive. (A small simulation following this list illustrates the point.)

Limits of discursive knowledge. Knowledge that uses distinct thoughts and concepts is necessarily limited by issues relating to self-reference. It is clear that this is both logically necessary, and tells us important things about our understanding and its limits.

Knowledge and being. Kant rightly recognized a sense in which it is logically impossible to “know things as they are in themselves,” as explained in this post. But as I said elsewhere, the logically impossible assertion that knowledge demands an identity between the mode of knowing and the mode of being is the basis for virtually every sort of philosophical error. So a grasp on the opposite “tautology” is extremely useful for understanding.
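To illustrate the natural selection example above, here is a small simulation (my own, purely illustrative): assign each individual a survival probability, apply one round of survival, and compare averages. The survivors are necessarily enriched in the more survivable:

```python
import random

random.seed(0)

# Each individual's probability of surviving one round.
population = [random.random() for _ in range(100_000)]

# One round of survival: each individual survives with its own probability.
survivors = [p for p in population if random.random() < p]

print(sum(population) / len(population))   # ~0.50: average over everyone
print(sum(survivors) / len(survivors))     # ~0.67: survivors skew to higher p
```

With survival probabilities uniform on [0, 1], the average among survivors comes out near 2/3 rather than 1/2: the statement is logically necessary, and yet it has concrete, observable consequences.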


Employer and Employee Model: Truth

In the remote past, I suggested that I would someday follow up on this post. In the current post, I begin to keep that promise.

We can ask about the relationship of the various members of our company with the search for truth.

The CEO, as the predictive engine, has a fairly strong interest in truth, but only insofar as truth is frequently necessary in order to get predictive accuracy. Consequently our CEO will usually insist on the truth when it affects our expectations regarding daily life, but it will care less when we consider things remote from the senses. Additionally, the CEO is highly interested in predicting the behavior of the Employee, and it is not uncommon for falsehood to be better than truth for this purpose.

To put this in another way, the CEO’s interest in truth is instrumental: it is sometimes useful for the CEO’s true goal, predictive accuracy, but not always, and in some cases it can even be detrimental.

As I said here, the Employee is, roughly speaking, the human person as we usually think of one, and consequently the Employee has the same interest in truth that we do. I personally consider truth to be an ultimate end, and this is probably the opinion of most people, to a greater or lesser degree. In other words, most people consider truth a good thing, even apart from instrumental considerations. Nonetheless, all of us care about various things besides truth, and therefore we also occasionally trade truth for other things.

The Vice President has perhaps the least interest in truth. We could say that they too have some instrumental concern about truth. Thus for example the VP desires food, and this instrumentally requires true ideas about where food is to be found. Nonetheless, as I said in the original post, the VP is the least rational and coherent, and may easily fail to notice such a need. Thus the VP might desire the status resulting from winning an argument, so to speak, but also desire the similar status that results from ridiculing the person holding an opposing view. The frequent result is that a person believes the falsehood that ridiculing an opponent generally increases the chance that they will change their mind (e.g. see John Loftus’s attempt to justify ridicule).

Given this account, we can raise several disturbing questions.

First, although we have said the Employee values truth in itself, can this really be true, rather than simply a mistaken belief on the part of the Employee? As I suggested in the original account, the Employee is in some way a consequence of the CEO and the VP. Consequently, if neither of these places intrinsic value on truth, how is it possible that the Employee does?

Second, even if the Employee sincerely places an intrinsic value on truth, how is this not a misplaced value? Again, if the Employee is something like a result of the others, what is good for the Employee should be what is good for the others, and thus if truth is not intrinsically good for the others, it should not be intrinsically good for the Employee.

In response to the first question, the Employee can indeed believe in the intrinsic value of truth, and of many other things to which the CEO and VP do not assign intrinsic value. This happens because, as we are considering the model, there is a real division of labor, even if the Employee arises historically in a secondary manner. As I said in the other post, the Employee’s beliefs are our beliefs, and the Employee can believe anything that we believe. Furthermore, the Employee can really act on such beliefs about the goodness of truth or other things, even when the CEO and VP do not have the same values. The reason for this is the same as the reason that the CEO will often go along with the desires of the VP, even though the CEO places intrinsic value only on predictive accuracy. The linked post explains, in effect, why the CEO goes along with sex, even though only the VP really wants it. In a similar way, if the Employee believes that sex outside of marriage is immoral, the CEO often goes along with avoiding such sex, even though the CEO cares about predictive accuracy, not about sex or its avoidance.

Of course, in this particular case, there is a good chance of conflict between the Employee and VP, and the CEO dislikes conflict, since it makes it harder to predict what the person overall will end up doing. And since the VP very rarely changes its mind in this case, the CEO will often end up encouraging the Employee to change their mind about the morality of such sex: thus one of the most frequent reasons why people abandon their religion is that it says that sex in some situations is wrong, while they still desire sex in those situations.

In response to the second, the Employee is not wrong to suppose that truth is intrinsically valuable. The argument against this would be that the human good is based on human flourishing, and (it is claimed) we do not need truth for such flourishing, since the CEO and VP do not care about truth in itself. The problem with this is that such flourishing requires that the Employee care about truth, and even the CEO needs the Employee to care in this way, for the sake of its own goal of predictive accuracy. Consider a real-life company: the employer does not necessarily care about whether the employee is being paid, considered in itself, but only insofar as it is instrumentally useful for convincing the employee to work for the employer. But the employer does care about whether the employee cares about being paid: if the employee does not care about being paid, they will not work for the employer.

Concern for truth in itself, apart from predictive accuracy, affects us when we consider things that cannot possibly affect our future experience: thus in previous cases I have discussed the likelihood that there are stars and planets outside the boundaries of the visible universe. This is probably true; but if I did not care about truth in itself, I might as well say that the universe is surrounded by purple elephants. I do not expect any experience to verify or falsify the claim, so why not make it? But now notice the problem for the CEO: the CEO needs to predict what the Employee is going to do, including what they will say and believe. This will instantly become extremely difficult if the Employee decides that they can say and believe whatever they like, without regard for truth, whenever the claim will not affect their experiences. So for its own goal of predictive accuracy, the CEO needs the Employee to value truth in itself, just as an ordinary employer needs their employee to value their salary.

In real life this situation can cause problems. The employer needs their employee to care about being paid, but if they care too much, they may constantly be asking for raises, or they may quit and go work for someone who will pay more. The employer does not necessarily like these situations. In a similar way, the CEO in our company may worry if the Employee insists too much on absolute truth, because as discussed elsewhere, it can lead to other situations with unpredictable behavior from the Employee, or to situations where there is a great deal of uncertainty about how society will respond to the Employee’s behavior.

Overall, this post perhaps does not say much in substance that we have not said elsewhere, but it may provide an additional perspective on these matters.

Schrödinger’s Cat

Erwin Schrödinger describes the context for his thought experiment with a cat:

The other alternative consists of granting reality only to the momentarily sharp determining parts – or in more general terms to each variable a sort of realization just corresponding to the quantum mechanical statistics of this variable at the relevant moment.

That it is in fact not impossible to express the degree and kind of blurring of all variables in one perfectly clear concept follows at once from the fact that Q.M. as a matter of fact has and uses such an instrument, the so-called wave function or psi-function, also called system vector. Much more is to be said about it further on. That it is an abstract, unintuitive mathematical construct is a scruple that almost always surfaces against new aids to thought and that carries no great message. At all events it is an imagined entity that images the blurring of all variables at every moment just as clearly and faithfully as does the classical model its sharp numerical values. Its equation of motion too, the law of its time variation, so long as the system is left undisturbed, lags not one iota, in clarity and determinacy, behind the equations of motion of the classical model. So the latter could be straight-forwardly replaced by the psi-function, so long as the blurring is confined to atomic scale, not open to direct control. In fact the function has provided quite intuitive and convenient ideas, for instance the “cloud of negative electricity” around the nucleus, etc. But serious misgivings arise if one notices that the uncertainty affects macroscopically tangible and visible things, for which the term “blurring” seems simply wrong. The state of a radioactive nucleus is presumably blurred in such a degree and fashion that neither the instant of decay nor the direction, in which the emitted alpha-particle leaves the nucleus, is well-established. Inside the nucleus, blurring doesn’t bother us. The emerging particle is described, if one wants to explain intuitively, as a spherical wave that continuously emanates in all directions and that impinges continuously on a surrounding luminescent screen over its full expanse. The screen however does not show a more or less constant uniform glow, but rather lights up at one instant at one spot – or, to honor the truth, it lights up now here, now there, for it is impossible to do the experiment with only a single radioactive atom. If in place of the luminescent screen one uses a spatially extended detector, perhaps a gas that is ionised by the alpha-particles, one finds the ion pairs arranged along rectilinear columns, that project backwards on to the bit of radioactive matter from which the alpha-radiation comes (C.T.R. Wilson’s cloud chamber tracks, made visible by drops of moisture condensed on the ions).

One can even set up quite ridiculous cases. A cat is penned up in a steel chamber, along with the following device (which must be secured against direct interference by the cat): in a Geiger counter there is a tiny bit of radioactive substance, so small, that perhaps in the course of the hour one of the atoms decays, but also, with equal probability, perhaps none; if it happens, the counter tube discharges and through a relay releases a hammer which shatters a small flask of hydrocyanic acid. If one has left this entire system to itself for an hour, one would say that the cat still lives if meanwhile no atom has decayed. The psi-function of the entire system would express this by having in it the living and dead cat (pardon the expression) mixed or smeared out in equal parts.

It is typical of these cases that an indeterminacy originally restricted to the atomic domain becomes transformed into macroscopic indeterminacy, which can then be resolved by direct observation. That prevents us from so naively accepting as valid a “blurred model” for representing reality. In itself it would not embody anything unclear or contradictory. There is a difference between a shaky or out-of-focus photograph and a snapshot of clouds and fog banks.

We see here the two elements described at the end of this earlier post. The psi-function is deterministic, but there seems to be an element of randomness when someone comes to check on the cat.
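These two elements are easy to exhibit in a toy model. The sketch below (my own illustration; the “decay” dynamics are an arbitrary rotation, not a real Hamiltonian) evolves a two-state “cat” deterministically and then samples definite outcomes with Born-rule probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)

alive = np.array([1.0, 0.0])
dead  = np.array([0.0, 1.0])

def evolve(t):
    """Deterministic evolution: after time t the state is a superposition
    of alive and dead (an illustrative rotation, not a real Hamiltonian)."""
    theta = t * np.pi / 4
    return np.cos(theta) * alive + np.sin(theta) * dead

psi = evolve(1.0)            # deterministic: the same psi every time
probs = np.abs(psi) ** 2     # Born rule: here [0.5, 0.5]

# "Opening the box": each check yields one definite result at random.
outcomes = rng.choice(["alive", "dead"], size=10, p=probs)
print(psi)        # always the same amplitudes
print(outcomes)   # a random mixture of "alive" and "dead"
```

The amplitudes are fixed by the deterministic dynamics; only the act of checking introduces the apparent randomness, which is exactly the tension Schrödinger is pointing to.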

Hugh Everett amusingly describes a similar experiment performed on human beings (but without killing anyone):

Isolated somewhere out in space is a room containing an observer, A, who is about to perform a measurement upon a system S. After performing his measurement he will record the result in his notebook. We assume that he knows the state function of S (perhaps as a result of previous measurement), and that it is not an eigenstate of the measurement he is about to perform. A, being an orthodox quantum theorist, then believes that the outcome of his measurement is undetermined and that the process is correctly described by Process 1 [namely a random determination caused by measurement].

In the meantime, however, there is another observer, B, outside the room, who is in possession of the state function of the entire room, including S, the measuring apparatus, and A, just prior to the measurement. B is only interested in what will be found in the notebook one week hence, so he computes the state function of the room for one week in the future according to Process 2 [namely the deterministic evolution of the wave function]. One week passes, and we find B still in possession of the state function of the room, which this equally orthodox quantum theorist believes to be a complete description of the room and its contents. If B’s state function calculation tells beforehand exactly what is going to be in the notebook, then A is incorrect in his belief about the indeterminacy of the outcome of his measurement. We therefore assume that B’s state function contains non-zero amplitudes over several of the notebook entries.

At this point, B opens the door to the room and looks at the notebook (performs his observation.) Having observed the notebook entry, he turns to A and informs him in a patronizing manner that since his (B’s) wave function just prior to his entry into the room, which he knows to have been a complete description of the room and its contents, had non-zero amplitude over other than the present result of the measurement, the result must have been decided only when B entered the room, so that A, his notebook entry, and his memory about what occurred one week ago had no independent objective existence until the intervention by B. In short, B implies that A owes his present objective existence to B’s generous nature which compelled him to intervene on his behalf. However, to B’s consternation, A does not react with anything like the respect and gratitude he should exhibit towards B, and at the end of a somewhat heated reply, in which A conveys in a colorful manner his opinion of B and his beliefs, he rudely punctures B’s ego by observing that if B’s view is correct, then he has no reason to feel complacent, since the whole present situation may have no objective existence, but may depend upon the future actions of yet another observer.

Schrödinger’s problem was that the wave equation seems to describe something “blurred,” but if we assume that this is because something blurred exists, it seems to contradict our experience, which is of something quite definite: a live cat or a dead cat, but not something in between.

Everett proposes that his interpretation of quantum mechanics is able to resolve this difficulty. After presenting other interpretations, he proposes his own (“Alternative 5”):

Alternative 5: To assume the universal validity of the quantum description, by the complete abandonment of Process 1 [again, this was the apparently random measurement process]. The general validity of pure wave mechanics, without any statistical assertions, is assumed for all physical systems, including observers and measuring apparata. Observation processes are to be described completely by the state function of the composite system which includes the observer and his object-system, and which at all times obeys the wave equation (Process 2).

It is evident that Alternative 5 is a theory of many advantages. It has the virtue of logical simplicity and it is complete in the sense that it is applicable to the entire universe. All processes are considered equally (there are no “measurement processes” which play any preferred role), and the principle of psycho-physical parallelism is fully maintained. Since the universal validity of the state function is asserted, one can regard the state functions themselves as the fundamental entities, and one can even consider the state function of the whole universe. In this sense this theory can be called the theory of the “universal wave function,” since all of physics is presumed to follow from this function alone. There remains, however, the question whether or not such a theory can be put into correspondence with our experience.

This present thesis is devoted to showing that this concept of a universal wave mechanics, together with the necessary correlation machinery for its interpretation, forms a logically self consistent description of a universe in which several observers are at work.

Ultimately, Everett’s response to Schrödinger is that the cat is indeed “blurred,” and that this never goes away. When someone checks on the cat, the person checking is also “blurred,” becoming a composite of someone seeing a dead cat and someone seeing a live cat. However, these are in effect two entirely separate worlds, one in which someone sees a live cat, and one in which someone sees a dead cat.
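Everett’s picture can likewise be sketched in miniature (again my own illustration): model the observation as a correlation of observer with cat, and the joint state becomes a superposition of two internally consistent branches:

```python
import numpy as np

alive, dead = np.array([1.0, 0.0]), np.array([0.0, 1.0])
sees_alive, sees_dead = np.array([1.0, 0.0]), np.array([0.0, 1.0])

# Joint state after the observer checks on the cat:
# (|alive>|sees alive> + |dead>|sees dead>) / sqrt(2)
joint = (np.kron(alive, sees_alive) + np.kron(dead, sees_dead)) / np.sqrt(2)

# The joint state does not factor into (cat state) x (observer state):
# reshape the amplitudes into a 2x2 matrix and check its rank.
amplitudes = joint.reshape(2, 2)
print(np.linalg.matrix_rank(amplitudes))   # 2: entangled, i.e. two branches

# Within each branch the record is perfectly consistent: the amplitude
# for "live cat but observer sees dead" (and vice versa) is zero.
print(amplitudes[0, 1], amplitudes[1, 0])  # 0.0 0.0
```

The rank-2 amplitude matrix just says that the joint state does not factor: after the observation there is no longer any single “state of the cat,” only the two correlated branches, one with each outcome.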

Everett mentions “the necessary correlation machinery for its interpretation” because a mathematical theory of physics, as such, does not necessarily say that anyone should see anything in particular. So for example, when Newton says that there is a gravitational attraction between masses inversely proportional to the square of the distance between them, what exactly should we expect to see, given that? Obviously there is no way to answer this without adding something, and ultimately we need to add something non-mathematical, namely something about the way our experiences work.

I will not pretend to judge whether or not Everett does a good job defending his position. But there is an interesting point here whether or not his defense is ultimately a good one. “Orthodox” quantum mechanics, as Everett calls it, gives only statistical predictions about the future, and as long as nothing is added to the theory, it implies that deterministic predictions are impossible. It follows that if the position in our last post, on an open future, was correct, it must be possible to explain the results of quantum mechanics in terms of many worlds or multiple timelines. And I do not merely mean that we can give the same predictions with a one-world account or with a many-worlds account. I mean that there must be a many-worlds account whose contents are metaphysically identical to the contents of a one-world account with an open future.

This would nonetheless leave undetermined the question of what sort of account would be most useful to us in practice.