Structure of Explanation

When we explain a thing, we give a cause; we assign the thing an origin that explains it.

We can go into a little more detail here. When we ask “why” something is the case, there is always an implication of possible alternatives. At the very least, the question implies, “Why is this the case rather than not being the case?” Thus “being the case” and “not being the case” are two possible alternatives.

The alternatives can be seen as possibilities in the sense explained in an earlier post. There may or may not be any actual matter involved, but again, the idea is that reality (or more specifically some part of reality) seems like something that would be open to being formed in one way or another, and we are asking why it is formed in one particular way rather than the other way. “Why is it raining?” In principle, the sky is open to being clear, or being filled with clouds and a thunderstorm, and to many other possibilities.

A successful explanation will be a complete explanation when it says “once you take the origin into account, the apparent alternatives were only apparent, and not really possible.” It will be a partial explanation when it says, “once you take the origin into account, the other alternatives were less sensible (i.e. made less sense as possibilities) than the actual thing.”

Let’s consider some examples in the form of “why” questions and answers.

Q1. Why do rocks fall? (e.g. instead of the alternatives of hovering in the air, going upwards, or anything else.)

A1. Gravity pulls things downwards, and rocks are heavier than air.

The answer gives an efficient cause, and once this cause is taken into account, it can be seen that hovering in the air or going upwards were not possibilities relative to that cause.

Obviously this is not meant to be a deep explanation; the point here is to discuss the structure of explanation. The given answer is in fact basically Newton’s answer (although he provided more mathematical detail), while with general relativity Einstein provided a better explanation.

The explanation is incomplete in several ways. It is not a first cause; someone can now ask, “Why does gravity pull things downwards, instead of upwards or to the side?” Similarly, while it is in fact the cause of falling rocks, someone can still ask, “Why didn’t anything else prevent gravity from making the rocks fall?” This is a different question, and would require a different answer, but it seems to reopen the possibility of the rocks hovering or moving upwards, from a more general point of view. David Hume was in part appealing to the possibility of such additional questions when he said that we can see no necessary connection between cause and effect.

Q2. Why is 7 prime? (i.e. instead of the alternative of not being prime.)

A2. 7/2 = 3.5, so 7 is not divisible by 2. 7/3 = 2.333…, so 7 is not divisible by 3. In a similar way, it is not divisible by 4, 5, or 6. Thus in general it is not divisible by any number except 1 and itself, which is what it means to be prime.

If we assumed that the questioner did not know what being prime means, we could have given a purely formal response simply by noting that it is not divisible by numbers between 1 and itself, and explaining that this is what it is to be prime. As it is, the response gives a sufficient material disposition. Relative to this explanation, “not being prime,” was never a real possibility for 7 in the first place. The explanation is complete in that it completely excludes the apparent alternative.
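The check in A2 can be written out mechanically. What follows is only a minimal sketch in Python, and the function names are my own choice for illustration: it tries every number strictly between 1 and the given number and collects the ones that divide it evenly.

```python
def divisors_between(n):
    """Return the numbers strictly between 1 and n that divide n evenly."""
    return [d for d in range(2, n) if n % d == 0]


def is_prime(n):
    """A number greater than 1 is prime when nothing between 1 and itself divides it."""
    return n > 1 and not divisors_between(n)


print(divisors_between(7))  # [] : none of 2, 3, 4, 5, 6 divides 7
print(is_prime(7))          # True
print(divisors_between(6))  # [2, 3] : so 6 is not prime
```

For 7 the list of divisors comes back empty, which is the sense in which “not being prime” was excluded as a real alternative.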

Q3. Why did Peter go to the store? (e.g. instead of going to the park or the museum, or instead of staying home.)

A3. He went to the store in order to buy groceries.

The answer gives a final cause. In view of this cause the alternatives were merely apparent. Going to the park or the museum, or even staying home, were not possible since there were no groceries there.

As in the case of the rock, the explanation is partial in several ways. Someone can still ask, “Why did he want groceries?” And again someone can ask why he didn’t go to some other store, or why something didn’t hinder him, and so on. Such questions seem to reopen various possibilities, and thus the explanation is not an ultimately complete one.

Suppose, however, that someone brings up the possibility that instead of going to the store, he could have gone to his neighbor and offered money for groceries in his neighbor’s refrigerator. This possibility is not excluded simply by the purpose of buying groceries. Nonetheless, the possibility seems less sensible than getting them from the store, for multiple reasons. Again, the implication is that our explanation is only partial: it does not completely exclude alternatives, but it makes them less sensible.

Let’s consider a weirder question: Why is there something rather than nothing?

Now the alternatives are explicit, namely there being something, and there being nothing.

It can be seen that in one sense, as I said in the linked post, the question cannot have an answer, since there cannot be a cause or origin for “there is something” which would itself not be something. Nonetheless, if we consider the idea of possible alternatives, it is possible to see that the question does not need an answer; one of the alternatives was only an apparent alternative all along.

In other words, the sky can be open to being clear or cloudy. But there cannot be something which is open both to “there is something” and “there is nothing”, since any possibility of that kind would be “something which is open…”, which would already be something rather than nothing. The “nothing” alternative was merely apparent. Nothing was ever open to there being nothing.

Let’s consider another weird question. Suppose we throw a ball, and in the middle of the path we ask, Why is the ball in the middle of the path instead of at the end of the path?

We could respond in terms of a sufficient material disposition: it is in the middle of the path because you are asking your question at the middle, instead of waiting until the end.

Suppose the questioner responds: Look, I asked my question at the middle of the path. But that was just chance. I could have asked at any moment, including at the end. So I want to know why it was in the middle without considering when I am asking the question.

If we look at the question in this way, it can be seen in one way that no cause or origin can be given. Asked in this way, being at the end cannot be excluded, since they could have asked their question at the end. But like the question about something rather than nothing, the question does not need an answer. In this case, this is not because the alternatives were merely apparent in the sense that one was possible and the other not. But they were merely apparent in the sense that they were not alternatives. The ball both goes through the middle and reaches the end. With the stipulation that we not consider the time of the question, the two possibilities are not mutually exclusive.

Additional Considerations

The above considerations about the nature of “explanation” lead to various conclusions, but also to various new questions. For example, one commenter suggested that “explanation” is merely subjective. Now as I said there, all experience is subjective experience (what would “objective experience” even mean, except that someone truly had a subjective experience?), including the experience of having an explanation. Nonetheless, the thing experienced is not subjective: the origins that we call explanations objectively exclude the apparent possibilities, or objectively make them less intelligible. The explanation of explanation here, however, provides an answer to what was perhaps the implicit question. Namely, why are we so interested in explanations in the first place, so that the experience of understanding something becomes a particularly special type of experience? Why, as Aristotle puts it, do “all men desire to know,” and why is that desire particularly satisfied by explanations?

In one sense it is sufficient simply to say that understanding is good in itself. Nonetheless, there is something particular about the structure of a human being that makes knowledge good for us, and which makes explanation a particularly desirable form of knowledge. In my employer and employee model of human psychology, I said that “the whole company is functioning well overall when the CEO’s goal of accurate prediction is regularly being achieved.” This very obviously requires knowledge, and explanation is especially beneficial because it excludes alternatives, which reduces uncertainty and therefore tends to make prediction more accurate.

However, my account also raises new questions. If explanation eliminates alternatives, what would happen if everything was explained? We could respond that “explaining everything” is not possible in the first place, but this is probably an inadequate response, because (from the linked argument) we only know that we cannot explain everything all at once, the way the person in the room cannot draw everything at once; we do not know that there is any particular thing that cannot be explained, just as there is no particular aspect of the room that cannot be drawn. So there can still be a question about what would happen if every particular thing in fact has an explanation, even if we cannot know all the explanations at once. In particular, since explanation eliminates alternatives, does the existence of explanations imply that there are not really any alternatives? This would suggest something like Leibniz’s argument that the actual world is the best possible world. It is easy to see that such an idea implies that there was only one “possibility” in the first place: Leibniz’s “best possible world” would be rather “the only possible world,” since the apparent alternatives, given that they would have been worse, were not real alternatives in the first place.

On the other hand, if we suppose that this is not the case, and there are ultimately many possibilities, does this imply the existence of “brute facts,” things that could have been otherwise, but which simply have no explanation? Or at least things that have no complete explanation?

Let the reader understand. I have already implicitly answered these questions. However, I will not link here to the implicit answers because if one finds it unclear when and where this was done, one would probably also find those answers unclear and inconclusive. Of course it is also possible that the reader does see when this was done, but still believes those responses inadequate. In any case, it is possible to provide the answers in a form which is much clearer and more conclusive, but this will likely not be a short or simple project.

And Fire by Fire

Superstitious Nonsense asks about the last post:

So the answer here is that -some- of the form is present in the mind, but always an insufficient amount or accuracy that the knowledge will not be “physical”? You seem to be implying the part of the form that involves us in the self-reference paradox is precisely the part of the form that gives objects their separate, “physical” character. Is this fair? Certainly, knowing progressively more about an object does not imply the mental copy is becoming closer and closer to having a discrete physicality.

I’m not sure this is the best way to think about it. The self-reference paradox arises because we are trying to copy ourselves into ourselves, and thus we are adding something into ourselves, making the copy incomplete. The problem is not that there is some particular “part of the form” that we cannot copy, but that it is in principle impossible to copy it perfectly. This is different from saying that there is some specific “part” that cannot be copied.

Consider what happens when we make “non-physical” copies of something without involving a mind. Consider the image of a gold coin. There are certain relationships common to the image and to a gold coin in the physical world. So you could say we have a physical gold coin, and a non-physical one.

But wait. If the image of the coin is on paper, isn’t that a physical object? Or if the image is on your computer screen, isn’t your screen a physical object? And the image is just the colors on the screen, which are apparently just as “physical” (or non-physical) as the color of the actual coin. So why would we say that “this is not a physical coin”?

Again, as in the last post, the obvious answer is that the image is not made out of gold, while the physical coin is. But why not? Is it that the image is not accurate enough? If we made it more accurate, would it be made out of gold, or become closer to being made out of gold? Obviously not. This is like noting that a mental copy does not become closer and closer to being a physical one.

In a sense it is true that the reason the image of the coin is not physical is that it is not accurate enough. But that is because it cannot be accurate enough: the fact that it is an image positively excludes the copying of certain relationships. Some aspects can be copied, but others cannot be copied at all, as long as it is an image. On the other hand, you can look at this from the opposite direction: if you did copy those aspects, the image would no longer be an image, but a physical coin.

As a similar example, consider the copying of a colored scene into black and white. We can copy some aspects of the scene by using various shades of gray, but we cannot copy every aspect of the scene. There are simply not enough differences in a black and white image to reflect every aspect of a colored scene. The black and white image, as you make it more accurate, does not become closer to being colored, but this is simply because there are aspects of the colored scene that you never copy. If you do insist on copying those aspects, you will indeed make the black and white image into a colored image, and thus it will no longer be black and white.

The situation becomes significantly more complicated when we talk about a mind. In one way, there is an important similarity. When we say that the copy in the mind is “not physical,” that simply means that it is a copy in the mind, just as when we say that the image of the coin is not physical, it means that it is an image, made out of the stuff that images are made of. But just as the image is physical anyway, in another sense, so it is perfectly possible that the mind is physical in a similar sense. However, this is where things begin to become confusing.

Elsewhere, I discussed Aristotle’s argument that the mind is immaterial. Considering the cases above, we could put his argument in this way: the human brain is a limited physical object. So as long as the brain remains a brain, there are simply not enough potential differences in it to model all possible differences in the world, just as you cannot completely model a colored scene using black and white. But anything at all can be understood. Therefore we cannot be understanding by using the brain.

I have claimed myself that anything that can be, can be understood. But this needs to be understood generically, rather than as claiming that it is possible to understand reality in every detail simultaneously. The self-reference paradox shows that it is impossible in principle for a knower that copies forms into itself to understand itself in every aspect at once. But even apart from this, it is very obvious that we as human beings cannot understand every aspect of reality at once. This does not even need to be argued: you cannot even keep everything in mind at once, let alone understand every detail of everything. This directly suggests a problem with Aristotle’s argument: if being able to know all things suggests that the mind is immaterial, the obvious fact that we cannot know all things suggests that it is not.

Nonetheless, let us see what happens if we advance the argument on Aristotle’s behalf. Admittedly, we cannot understand everything at once. But in the case of the colored scene, there are aspects that cannot be copied at all into the black and white copy. And in the case of the physical coin, there are aspects that cannot be copied at all into the image. So if we are copying things into the brain, doesn’t that mean that there should be aspects of reality that cannot be copied at all into the mind? But this is false, since it would not only mean that we can’t understand everything, but it would also mean that there would be things that we cannot think about at all, and if it is so, then it is not so, because in that case we are right now talking about things that we supposedly cannot talk about.

Copying into the mind is certainly different from copying into a black and white scene or copying into a picture, and this does get at one of the differences. But the difference here is that the method of copying in the case of the mind is flexible, while the method of copying in the case of the pictures is rigid. In other words, we have a pre-defined method of copying in the case of the pictures that, from the beginning, only allows certain aspects to be copied. In the case of the mind, we determine the method differently from case to case, depending on our particular situation and the thing being copied. The result is that there is no particular aspect of things that cannot be copied, but you cannot copy every aspect at once.

In answer to the original question, then, the reason that the “mental copy” always remains mental is that you never violate the constraints of the mind, just as a black and white copy never violates the constraints of being black and white. But if you did violate the constraints of the black and white copy by copying every aspect of the scene, the image would become colored. And similarly, if you did violate the constraints of the mind in order to copy every aspect of reality, your mind would cease to be, and it would instead become the thing itself. But there is no particular aspect of “physicality” that you fail to copy: rather, you just ensure that one way or another you do not violate the constraints of the mind that you have.

Unfortunately, the explanation here for why the mind can copy any particular aspect of reality, although not every aspect at once, is rather vague. Perhaps a clearer explanation is possible? In fact, someone could use the vagueness to argue for Aristotle’s position and against mine. Perhaps my account is vague because it is wrong, and there is actually no way for a physical object to receive copied forms in this way.

Tautologies Not Trivial

In mathematics and logic, one sometimes speaks of a “trivial truth” or “trivial theorem”, referring to a tautology. Thus for example in this Quora question, Daniil Kozhemiachenko gives this example:

The fact that all groups of order 2 are isomorphic to one another and commutative entails that there are no non-Abelian groups of order 2.

This statement is a tautology because “Abelian group” here just means one that is commutative: the statement is like the customary example of asserting that “all bachelors are unmarried.”

Some extend this usage of “trivial” to refer to all statements that are true in virtue of the meaning of the terms, sometimes called “analytic.” The effect of this is to say that all statements that are logically necessary are trivial truths. An example of this usage can be seen in this paper by Carin Robinson. Robinson says at the end of the summary:

Firstly, I do not ask us to abandon any of the linguistic practises discussed; merely to adopt the correct attitude towards them. For instance, where we use the laws of logic, let us remember that there are no known/knowable facts about logic. These laws are therefore, to the best of our knowledge, conventions not dissimilar to the rules of a game. And, secondly, once we pass sentence on knowing, a priori, anything but trivial truths we shall have at our disposal the sharpest of philosophical tools. A tool which can only proffer a better brand of empiricism.

While the word “trivial” does have a corresponding Latin form that means ordinary or commonplace, the English word seems to be taken mainly from the “trivium” of grammar, rhetoric, and logic. This would seem to make some sense of calling logical necessities “trivial,” in the sense that they pertain to logic. Still, even here something is missing, since Robinson wants to include the truths of mathematics as trivial, and classically these did not pertain to the aforesaid trivium.

Nonetheless, overall Robinson’s intention, and presumably that of others who speak this way, is to suggest that such things are trivial in the English sense of “unimportant.” That is, they may be important tools, but they are not important for understanding. This is clear at least in our example: Robinson calls them trivial because “there are no known/knowable facts about logic.” Logical necessities tell us nothing about reality, and therefore they provide us with no knowledge. They are true by the meaning of the words, and therefore they cannot be true by reason of facts about reality.

Things that are logically necessary are not trivial in this sense. They are important, both in a practical way and directly for understanding the world.

Consider the failure of the Mars Climate Orbiter:

On November 10, 1999, the Mars Climate Orbiter Mishap Investigation Board released a Phase I report, detailing the suspected issues encountered with the loss of the spacecraft. Previously, on September 8, 1999, Trajectory Correction Maneuver-4 was computed and then executed on September 15, 1999. It was intended to place the spacecraft at an optimal position for an orbital insertion maneuver that would bring the spacecraft around Mars at an altitude of 226 km (140 mi) on September 23, 1999. However, during the week between TCM-4 and the orbital insertion maneuver, the navigation team indicated the altitude may be much lower than intended at 150 to 170 km (93 to 106 mi). Twenty-four hours prior to orbital insertion, calculations placed the orbiter at an altitude of 110 kilometers; 80 kilometers is the minimum altitude that Mars Climate Orbiter was thought to be capable of surviving during this maneuver. Post-failure calculations showed that the spacecraft was on a trajectory that would have taken the orbiter within 57 kilometers of the surface, where the spacecraft likely skipped violently on the uppermost atmosphere and was either destroyed in the atmosphere or re-entered heliocentric space.[1]

The primary cause of this discrepancy was that one piece of ground software supplied by Lockheed Martin produced results in a United States customary unit, contrary to its Software Interface Specification (SIS), while a second system, supplied by NASA, expected those results to be in SI units, in accordance with the SIS. Specifically, software that calculated the total impulse produced by thruster firings produced results in pound-force seconds. The trajectory calculation software then used these results – expected to be in newton seconds – to update the predicted position of the spacecraft.

It is presumably an analytic truth that the units defined in one way are unequal to the units defined in the other. But it was ignoring this analytic truth that was the primary cause of the space probe’s failure. So it is evident that analytic truths can be extremely important for practical purposes.
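To make the size of the discrepancy concrete, here is a minimal sketch in Python. The impulse value is hypothetical and the names are mine; nothing here is taken from the actual flight software. The only fact used is the definition of the units: one pound-force second is 4.4482216152605 newton seconds.

```python
# Newton seconds per pound-force second, fixed by the definitions of the units.
LBF_S_TO_N_S = 4.4482216152605


def to_newton_seconds(impulse_lbf_s):
    """Convert an impulse given in pound-force seconds into newton seconds."""
    return impulse_lbf_s * LBF_S_TO_N_S


reported = 100.0  # hypothetical thruster impulse, produced in lbf*s
misread = reported  # the same number taken to be N*s, with no conversion
correct = to_newton_seconds(reported)

ratio = correct / misread
print(f"{ratio:.2f}")  # 4.45
```

Under these assumptions, a reader that treats the number as newton seconds underestimates each impulse by a factor of about 4.45, whatever the particular value happens to be.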

Such truths can also be important for understanding reality. In fact, they are typically more important for understanding than other truths. The argument against this is that if something is necessary in virtue of the meaning of the words, it cannot be telling us something about reality. But this argument is wrong for one simple reason: words and meaning themselves are both elements of reality, and so they do tell us something about reality, even when the truth is fully determinate given the meaning.

If one accepts the mistaken argument, in fact, sometimes one is led even further. Logically necessary truths cannot tell us anything important for understanding reality, since they are simply facts about the meaning of words. On the other hand, anything which is not logically necessary is in some sense accidental: it might have been otherwise. But accidental things that might have been otherwise cannot help us to understand reality in any deep way: it tells us nothing deep about reality to note that there is a tree outside my window at this moment, when this merely happens to be the case, and could easily have been otherwise. Therefore, since neither logically necessary things, nor logically contingent things, can help us to understand reality in any deep or important way, such understanding must be impossible.

It is fairly rare to make such an argument explicitly, but it is a common implication of many arguments that are actually made or suggested, or it at least influences the way people feel about arguments and understanding. For example, consider this comment on an earlier post. Timocrates suggests that (1) if you have a first cause, it would have to be a brute fact, since it doesn’t have any other cause, and (2) describing reality can’t tell us any reasons but is “simply another description of how things are.” The suggestion behind these objections is that the very idea of understanding is incoherent. As I said there in response, it is true that every true statement is in some sense “just a description of how things are,” but that was what a true statement was meant to be in any case. It surely was not meant to be a description of how things are not.

That “analytic” or “tautologous” statements can indeed provide a non-trivial understanding of reality can also easily be seen by example. Some examples from this blog:

Good and being. The convertibility of being and goodness is “analytic,” in the sense that carefully thinking about the meaning of desire and the good reveals that a universe where existence as such was bad, or even failed to be good, is logically impossible. In particular, it would require a universe where there is no tendency to exist, and this is impossible given that it is posited that something exists.

Natural selection. One of the most important elements of Darwin’s theory of evolution is the following logically necessary statement: the things that have survived are more likely to be the things that were more likely to survive, and less likely to be the things that were less likely to survive.

Limits of discursive knowledge. Knowledge that uses distinct thoughts and concepts is necessarily limited by issues relating to self-reference. It is clear that this is both logically necessary, and tells us important things about our understanding and its limits.

Knowledge and being. Kant rightly recognized a sense in which it is logically impossible to “know things as they are in themselves,” as explained in this post. But as I said elsewhere, the logically impossible assertion that knowledge demands an identity between the mode of knowing and the mode of being is the basis for virtually every sort of philosophical error. So a grasp on the opposite “tautology” is extremely useful for understanding.

Start at the Beginning

This post will have two kinds of readers:

1) The few who have read the posts on this blog from the beginning, in chronological order, and who are now reading this one simply because it is the only one they have not read yet.

2) The vast majority who did not do the above.

For the first category, I don’t have any particular suggestion at the moment. Well done. That is the right way of reading this blog.

For the second category, you would do much better to stop right here in the middle of this post (without even finishing it), go back to the beginning, and read every post in chronological order.

….

So you are now in the first category? No? Since obviously you did not take my advice, let me explain both why you should, and why you will not.

It is possible to understand something through arguments, even if manipulating symbols may be an even more common result. And since conclusions follow from premises, you can only do this by thinking about the premises first, and the conclusions second. Since my own interest is in understanding things, I intentionally organize the blog in this way. Of course, since the concrete historical process of an individual coming to understand some particular thing is messier and more complicated than a single argument or even than multiple arguments, the order isn’t an exact representation of my own history or someone else’s potential history. But it is certainly closer to that than any other order of reading would be.

You will object that you do not have the time to read 300 blog posts. Fine. But then why do you have time to read this one? Even if you are definitely committed to reading a small number of posts, you would do better to read a small number from the beginning. If you are committed to reading not more than one post a week, you would do better to read the 300 posts over the next six years, rather than reading the posts that are current.

You might think of other similar objections, but they will all fail in similar ways. If you are actually interested in understanding something from your reading, chronological order is the right order.

Of course, other blog authors might well argue in similar ways, but the number of people who actually do this, on any blog, is tiny. Instead, people read a few recent posts, and perhaps a few others if there is a chain of links that leads them there. But they do not, in the vast majority of cases, read from the beginning, whether to read all or only a part.

So let me explain why you will not take this advice, despite the fact that it is irrefutably correct. In The Elephant in the Brain, Robin Hanson and Kevin Simler remark in a chapter on conversation:

This view of talking—as a way of showing off one’s “backpack”—explains the puzzles we encountered earlier, the ones that the reciprocal-exchange theory had trouble with. For example, it explains why we see people jockeying to speak rather than sitting back and “selfishly” listening—because the spoils of conversation don’t lie primarily in the information being exchanged, but rather in the subtextual value of finding good allies and advertising oneself as an ally. And in order to get credit in this game, you have to speak up; you have to show off your “tools.”

But why do speakers need to be relevant in conversation? If speakers deliver high-quality information, why should listeners care whether the information is related to the current topic? A plausible answer is that it’s simply too easy to rattle off memorized trivia. You can recite random facts from the encyclopedia until you’re blue in the face, but that does little to advertise your generic facility with information.

Similarly, when you meet someone for the first time, you’re more eager to sniff each other out for this generic skill, rather than to exchange the most important information each of you has gathered to this point in your lives. In other words, listeners generally prefer speakers who can impress them wherever a conversation happens to lead, rather than speakers who steer conversations to specific topics where they already know what to say.

Hanson and Simler are trying to explain various characteristics of conversation, such as the fact that people are typically more interested in speaking than in listening, as well as the requirement that conversational participants “stick to the topic.”

Later, they associate this with people’s interest in news:

Why have humans long been so obsessed with news? When asked to justify our strong interest, we often point to the virtues of staying apprised of the important issues of the day. During a 1945 newspaper strike in New York, for example, when the sociologist Bernard Berelson asked his fellow citizens, “Is it very important that people read the newspaper?” almost everyone answered with a “strong ‘yes,’ ” and most people cited the “ ‘serious’ world of public affairs.”

Now, it did make some sense for our ancestors to track news as a way to get practical information, such as we do today for movies, stocks, and the weather. After all, they couldn’t just go easily search for such things on Google like we can. But notice that our access to Google hasn’t made much of a dent in our hunger for news; if anything we read more news now that we have social media feeds, even though we can find a practical use for only a tiny fraction of the news we consume.

There are other clues that we aren’t mainly using the news to be good citizens (despite our high-minded rhetoric). For example, voters tend to show little interest in the kinds of information most useful for voting, including details about specific policies, the arguments for and against them, and the positions each politician has taken on each policy. Instead, voters seem to treat elections more like horse races, rooting for or against different candidates rather than spending much effort to figure out who should win. (See Chapter 16 for a more detailed discussion on politics.)

These patterns in behavior may be puzzling when we think of news as a source of useful information. But they make sense if we treat news as a larger “conversation” that extends our small-scale conversation habits. Just as one must talk on the current topic in face-to-face conversation, our larger news conversation also maintains a few “hot” topics—a focus so strong and so narrow that policy wonks say that there’s little point in releasing policy reports on topics not in the news in the last two weeks. (This is the criterion of relevance we saw earlier.)

The argument here suggests that blog readers will tend to prefer reading current posts to old ones because this is to remain more “relevant,” and that such relevance is necessary in order to impress other conversational participants. This, I suggest, is why you will not take my advice, despite its rightness. If you think this is an insulting explanation, just bear in mind that blog authors are even more insulted by Hanson and Simler’s explanations, since the reader at least is listening.