How to Build an Artificial Human

I was going to use “Artificial Intelligence” in the title here but realized after thinking about it that the idea is really more specific than that.

I came up with the idea here while thinking more about the problem I raised in an earlier post about a serious obstacle to creating an AI. As I said there:

Current AI systems are not universal, and clearly have no ability whatsoever to become universal, without first undergoing deep changes in those systems, changes that would have to be initiated by human beings. What is missing?

The problem is the training data. The process of evolution produced the general ability to learn by using the world itself as the training data. In contrast, our AI systems take a very small subset of the world (like a large set of Go games or a large set of internet text), and train a learning system on that subset. Why take a subset? Because the world is too large to fit into a computer, especially if that computer is a small part of the world.

This suggests that going from the current situation to “artificial but real” intelligence is not merely a question of making things better and better little by little. There is a more fundamental problem that would have to be overcome, and it won’t be overcome simply by larger training sets, by faster computing, and things of this kind. This does not mean that the problem is impossible, but it may turn out to be much more difficult than people expected. For example, if there is no direct solution, people might try to create Robin Hanson’s “ems”, where one would more or less copy the learning achieved by natural selection. Or even if that is not done directly, a better understanding of what it means to “know how to learn,” might lead to a solution, although probably one that would not depend on training a model on massive amounts of data.

Proposed Predictive Model

Perhaps I was mistaken in saying that “larger training sets” would not be enough, at any rate not enough to get past this basic obstacle. Perhaps it is enough to choose the subset correctly… namely by choosing the subset of the world that we know to contain general intelligence. Instead of training our predictive model on millions of Go games or millions of words, we will train it on millions of human lives.

This project will be extremely expensive. We might need to hire 10 million people to rigorously lifelog for the next 10 years. This has to be done with as much detail as possible; in particular we would want them recording constant audio and visual streams, along with as much else as possible. If we pay our crew an annual salary of $75,000 for this, the salaries will come to $7.5 trillion; there will be some small additions for equipment and maintenance, but all of this will be very small compared to the salary costs.
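Spelling out the arithmetic behind that figure:

$$10{,}000{,}000 \text{ people} \times \$75{,}000/\text{year} \times 10 \text{ years} = \$7.5 \times 10^{12} = \$7.5 \text{ trillion.}$$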

Presumably in order to actually build such a large model, various scaling issues would come up and need to be solved. And in principle nothing prevents these from being very hard to solve, or even impossible in practice. But since we do not know that this would happen, let us skip over this and pretend that we have succeeded in building the model. Once this is done, our model should be able to fairly easily take a point in a person’s life and give a fairly sensible continuation over at least a short period of time, just as GPT-3 can give fairly sensible continuations to portions of text.

It may be that this is enough to get past the obstacle described above, and once this is done, it might be enough to build a general intelligence using other known principles, perhaps with some research and refinement that could be done during the years in which our crew would be building their records.

Required Elements

Live learning. In the post discussing the obstacle, I noted that there are two kinds of learning: the type that comes from evolution, and the type that happens during life. Our model represents the type that comes from evolution; unlike GPT-3, which cannot learn anything new, we need our AI to remember what has actually happened during its life and to be able to use this to acquire knowledge about its particular situation. This is not difficult in theory, but you would need to think carefully about how this should interact with the general model; you do not want to simply add its particular experiences as another individual training example (not that such an addition to an already trained model is simple anyway).
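As a minimal sketch of the kind of separation I have in mind (the components here are entirely made up, standing in for whatever the real system would use), the AI's own experiences could be stored and consulted at prediction time rather than trained back into the general model:

```python
from collections import deque

class GeneralModel:
    """Stand-in for the big model trained on millions of human lives (hypothetical)."""
    def predict(self, situation):
        return "generic continuation of: " + situation

class LiveLearner:
    """Wraps the general model with an episodic memory of this one AI's life.

    Its own experiences are stored and consulted at prediction time; they are
    not folded back into the general model as just another training example.
    """
    def __init__(self, general_model, memory_size=10_000):
        self.general_model = general_model
        self.episodes = deque(maxlen=memory_size)  # this individual's own history

    def observe(self, situation, outcome):
        self.episodes.append((situation, outcome))

    def predict(self, situation):
        # Prefer what actually happened to this individual in the same situation.
        for past_situation, outcome in reversed(self.episodes):
            if past_situation == situation:
                return outcome
        # Otherwise fall back on the general human-life model.
        return self.general_model.predict(situation)

agent = LiveLearner(GeneralModel())
agent.observe("I flip the switch in my room", "the light turns on")
print(agent.predict("I flip the switch in my room"))  # answered from its own life
print(agent.predict("I drop a glass"))                # answered from the general model
```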

Causal model. Our AI needs not just a general predictive model of the world, but specifically a causal one; not just the general idea that “when you see A, you will soon see B,” but the idea that “when there is an A — which may or may not be seen — it will make a B, which you may or may not see.” This is needed for many reasons, but in particular, without such a causal model, long term predictions or planning will be impossible. If you take a model like GPT-3 and force it to continue producing text indefinitely, it will either repeat itself or eventually go completely off topic. The same thing would happen to our human life model — if we simply used the model without any causal structure, and forced it to guess what would happen indefinitely far into the future, it would eventually produce senseless predictions.
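A toy contrast, with invented details, may help show the difference between the two kinds of model: one tracks a hidden cause that keeps working indefinitely, while the other only repeats the surface pattern it has already seen.

```python
# A toy contrast between surface prediction ("after you see A you soon see B")
# and a causal model ("there is an A, seen or not, that makes B happen").
# Here the hidden cause is a counter that always advances but is only
# observed every third step.

def causal_rollout(steps, counter=0):
    """Roll the hidden state forward and report what would be observed."""
    observations = []
    for _ in range(steps):
        counter += 1                      # the unseen A: the counter always advances
        if counter % 3 == 0:
            observations.append(counter)  # the seen B: only every third value is visible
        else:
            observations.append(None)     # nothing is observed this step
    return observations

def surface_rollout(history, steps):
    """Predict by repeating the most recently seen value at the usual spacing."""
    last_seen = next((x for x in reversed(history) if x is not None), 0)
    return [last_seen if (t + 1) % 3 == 0 else None for t in range(steps)]

history = causal_rollout(6)          # what actually happened: [None, None, 3, None, None, 6]
print(causal_rollout(6, counter=6))  # causal model: [None, None, 9, None, None, 12]
print(surface_rollout(history, 6))   # surface model: [None, None, 6, None, None, 6], it repeats itself
```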

In the paper Making Sense of Raw Input, published by Google DeepMind, there is a discussion of an implementation of this sort of model, although trained on an extremely easy environment (compared to our task, which would be to train it on human lives).

The Apperception Engine attempts to discern the nomological structure that underlies the raw sensory input. In our experiments, we found the induced theory to be very accurate as a predictive model, no matter how many time steps into the future we predict. For example, in Seek Whence (Section 5.1), the theory induced in Fig. 5a allows us to predict all future time steps of the series, and the accuracy of the predictions does not decay with time.

In Sokoban (Section 5.2), the learned dynamics are not just 100% correct on all test trajectories, but they are provably 100% correct. These laws apply to all Sokoban worlds, no matter how large, and no matter how many objects. Our system is, to the best of our knowledge, the first that is able to go from raw video of non-trivial games to an explicit first-order nomological model that is provably correct.

In the noisy sequences experiments (Section 5.3), the induced theory is an accurate predictive model. In Fig. 19, for example, the induced theory allows us to predict all future time steps of the series, and does not degenerate as we go further into the future.

(6.1.2 Accuracy)

Note that this does not have the problem of quick divergence from reality as you predict into the distant future. It will also improve our AI’s live learning:

A system that can learn an accurate dynamics model from a handful of examples is extremely useful for model-based reinforcement learning. Standard model-free algorithms require millions of episodes before they can reach human performance on a range of tasks [31]. Algorithms that learn an implicit model are able to solve the same tasks in thousands of episodes [82]. But a system that learns an accurate dynamics model from a handful of examples should be able to apply that model to plan, anticipating problems in imagination rather than experiencing them in reality [83], thus opening the door to extremely sample efficient model-based reinforcement learning. We anticipate a system that can learn the dynamics of an ATARI game from a handful of trajectories, and then apply that model to plan, thus playing at reasonable human level on its very first attempt.

(6.1.3. Data efficiency)

“We anticipate,” as in Google has not yet built such a thing, but expects to be able to build it.

Scaling a causal model to work on our human life dataset will probably require some of the most difficult new research of this entire proposal.

Body. In order to engage in live learning, our AI needs to exist in the world in some way. And for the predictive model to do it any good, the world that it exists in needs to be a roughly human world. So there are two possibilities: either we simulate a human world in which it will possess a simulated human body, or we give it a robotic human-like body that will exist physically in the human world.

In relation to our proposal, these are not very different, but the former is probably more difficult, since we would have to simulate pretty much the entire world, and the more distant our simulation is from the actual world, the less helpful its predictive model would turn out to be.

Sensation. Our AI will need to receive input from the world through something like “senses.” These will need to correspond reasonably well with the data as provided in the model; e.g. since we expect to have audio and visual recording, our AI will need sight and hearing.

Predictive Processing. Our AI will need to function this way in order to acquire self-knowledge and free will, without which we would not consider it to possess general intelligence, however good it might be at particular tasks. In particular, at every point in time it will have predictions, based on the general human-life predictive model and on its causal model of the world, about what will happen in the near future. These predictions need to function in such a way that when it makes a relevant prediction, e.g. when it predicts that it will raise its arm, it will actually raise its arm.

(We might not want this to happen 100% of the time — if such a prediction is very far from the predictive model, we might want the predictive model to take precedence over this power over itself, much as happens with human beings.)

Thought and Internal Sensation. Our AI needs to be able to notice that when it predicts it will raise its arm, it succeeds, and it needs to learn that in these cases its prediction is the cause of raising the arm. Only in this way will its live learning produce a causal model of the world which actually has self knowledge: “When I decide to raise my arm, it happens.” This will also tell it the distinction between itself and the rest of the world; if it predicts the sun will change direction, this does not happen. In order for all this to happen, the AI needs to be able to see its own predictions, not just what happens; the predictions themselves have to become a kind of input, similar to sight and hearing.
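Purely as a sketch of these last elements working together (the names and numbers below are invented for illustration, not a proposed design), the loop might look something like this: predictions about the agent's own body are carried out, the prediction itself is fed back as an internal sense, and the agent keeps track of which predictions it can make come true.

```python
import random

class PredictiveAgent:
    """Toy predictive-processing loop (an illustrative sketch, not a design).

    At each step the agent makes a prediction; predictions about its own body
    are carried out, the prediction itself is fed back as an internal sense,
    and the agent tracks which predictions it can make come true.
    """
    def __init__(self):
        self.controls = {}  # prediction -> (times it came true, times it was made)

    def step(self, prediction):
        # Predictions about the agent's own body are (usually) carried out.
        if prediction == "I raise my arm":
            outcome = prediction if random.random() < 0.95 else "nothing happens"
        else:
            outcome = "nothing happens"  # e.g. "the sun changes direction"

        # The prediction itself becomes part of the input, like sight or hearing.
        internal_input = {"my prediction": prediction, "what happened": outcome}

        # Live learning about the self: which predictions does the agent control?
        came_true, made = self.controls.get(prediction, (0, 0))
        self.controls[prediction] = (came_true + (outcome == prediction), made + 1)
        return internal_input

agent = PredictiveAgent()
for _ in range(20):
    agent.step("I raise my arm")
    agent.step("the sun changes direction")
print(agent.controls)  # arm-raising is (mostly) under its control; the sun is not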

What was this again?

If we don’t run into any new fundamental obstacle along the way (I mentioned a few points where this might happen), the above procedure might actually be able to build an artificial general intelligence at a rough cost of $10 trillion (rounded up to account for hardware, research, and so on) and a time period of 10-20 years. But I would call your attention to a few things:

First, this is basically an artificial human, even to the extent that the easiest implementation likely requires giving it a robotic human body. It is not more general than that, and there is little reason to believe that our AI would be much more intelligent than a normal human, or that we could easily make it more intelligent. It would be fairly easy to give it quick mental access to other things, like mathematical calculation or internet searches, but this would not be much faster than a human being with a calculator or internet access. As with GPT-N, one factor that would tend to limit its intelligence is that its predictive model is based on the level of intelligence found in human beings; there is no reason it would predict that it would behave more intelligently, and so no reason why it would.

Second, it is extremely unlikely that anyone will implement this research program anytime soon. Why? Because you don’t get anything out of it except an artificial human. We have easier and less expensive ways to make humans, and $10 trillion is around the most any country has ever spent on anything, and never deliberately on one single project. Nonetheless, if no better way to make an AI is found, one can expect that eventually something like this will be implemented; perhaps by China in the 22nd century.

Third, note that “values” did not come up in this discussion. I mentioned this in one of the earlier posts on predictive processing:

The idea of the “desert landscape” seems to be that this account appears to do away with the idea of the good, and the idea of desire. The brain predicts what it is going to do, and those predictions cause it to do those things. This all seems purely intellectual: it seems that there is no purpose or goal or good involved.

The correct response to this, I think, is connected to what I have said elsewhere about desire and good. I noted there that we recognize our desires as desires for particular things by noticing that when we have certain feelings, we tend to do certain things. If we did not do those things, we would never conclude that those feelings are desires for doing those things. Note that someone could raise a similar objection here: if this is true, then are not desire and good mere words? We feel certain feelings, and do certain things, and that is all there is to be said. Where is good or purpose here?

The truth here is that good and being are convertible. The objection (to my definition and to Clark’s account) is not a reasonable objection at all: it would be a reasonable objection only if we expected good to be something different from being, in which case it would of course be nothing at all.

There was no need for an explicit discussion of values because they are an indirect consequence. What would our AI care about? It would care roughly speaking about the same things we care about, because it would predict (and act on the prediction) that it would live a life similar to a human life. There is definitely no specific reason to think it would be interested in taking over the world, although this cannot be excluded absolutely, since this is an interest that humans sometimes have. Note also that Nick Bostrom was wrong: I have just made a proposal that might actually succeed in making a human-like AI, but there is no similar proposal that would make an intelligent paperclip maximizer.

This is not to say that we should not expect any bad behavior at all from such a being; the behavior of the AI in the film Ex Machina is a plausible fictional representation of what could go wrong. Since what it is “trying” to do is to get predictive accuracy, and its predictions are based on actual human lives, it will “feel bad” about the lack of accuracy that results from the fact that it is not actually human, and it may act on those feelings.

Might People on the Internet Sometimes Tell the Truth?

Lies and Scott Alexander

Scott Alexander wrote a very good post called Might People on the Internet Sometimes Lie, which I have linked to several times in the past. In the first linked post (Lies, Religion, and Miscalibrated Priors), I answered Scott’s question (why it is hard to believe that people are lying even when they probably are), but also pointed out that “either they are lying or the thing actually happened in such and such a specific way” is a false dichotomy in any case.

In the example in my post, I spoke about Arman Razaali and his claim that he shuffled a deck of cards for 30 minutes and ended up with the deck in its original order. As I stated in the post,

People demonstrably lie at far higher rates than 1 in 10^67 or 1 in 10^40. This will remain the case even if you ask about the rate of “apparently unmotivated flat out lying for no reason.” Consequently, “he’s lying, period,” is far more likely than “the story is true, and happened by pure chance.” Nor can we fix this by pointing to the fact that an extraordinary claim is a kind of extraordinary evidence
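For reference, the “1 in 10^67” figure presumably corresponds to the chance that a well-shuffled deck lands in one particular order:

$$\frac{1}{52!} \approx \frac{1}{8.07 \times 10^{67}}.$$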

But as I also stated there, those are not the only options. As it turns out, although my readers may have missed this, Razaali himself stumbled upon my post somewhat later and posted something in the comments there:

At first, I must say that I was a bit flustered when I saw this post come up when I was checking what would happen when I googled myself. But it’s an excellent read, exceptionally done with excellent analysis. Although I feel the natural urge to be offended by this, I’m really not. Your message is very clear, and it articulates the inner workings of the human mind very well, and in fact, I found that I would completely agree. Having lost access to that Quora account a month or two ago, I can’t look back at what I wrote. I can easily see how the answer gave on Quora could very easily be seen as a lie, and if I read it with no context, I would probably think it was fake too. But having been there at the moment as I counted the cards, I am biased towards believing what I saw, even though I could have miscounted horrendously.

Does this sound like something written by one of Scott Alexander’s “annoying trolls”?

Not to me, anyway. I am aware that I am also disinclined for moral reasons to believe that Razaali was lying, for the reasons I stated in that post. Nonetheless, it seems fair to say that this comment fits better with some intermediate hypothesis (e.g. “it was mostly in order and he was mistaken”) rather than with the idea that “he was lying.”

Religion vs. UFOs

I participated in this exchange on Twitter:

Ross Douthat:

Of what use are our professionally-eccentric, no-heresy-too-wild reasoners like @robinhanson if they assume a priori that “spirits or creatures from other dimensions” are an inherently crazy idea?: https://overcomingbias.com/2021/05/ufos-say-govt-competence-is-either-surprisingly-high-or-surprisingly-low.html

Robin Hanson:

But we don’t want to present ourselves as finding any strange story as equally likely. Yes, we are willing to consider most anything, at least from a good source, & we disagree with others on which stories seem more plausible. But we present ourselves as having standards! 🙂

Me:

I think @DouthatNYT intended to hint that many religious experiences offer arguments for religions that are at least as strong as arguments from UFOs for aliens, and probably stronger.

I agree with him and find both unconvincing.

But find it very impressive you were willing to express those opinions.

Robin Hanson:

You can find videos on best recent evidence for ghosts, which to me looks much less persuasive than versions for UFOs. But evidence for non-ghost spirits, they don’t even bother to make videos for that, as there’s almost nothing.

Me:

It is just not true that there is “almost nothing.” E.g. see the discussion in my post here:

Miracles and Multiple Witnesses

Robin does not respond. Possibly he just does not want to spend more time on the matter. But I think there is also something else going on; engaging with this would suggest to people that he does not “have standards.” It is bad enough for his reputation if he talks about UFOs; it would be much worse if he engaged in a discussion about rosaries turning to gold, which sounds silly to most Catholics, let alone to non-Catholic Christians, people of other religions, and non-religious people.

But I meant what I said in that post, when I said, “these reports should be taken seriously.” Contrary to the debunkers, there is nothing silly about something being reported by thousands of people. It is possible that every one of those reports is a lie or a mistake. Likely, even. But I will not assume that this is the case when no one has even bothered to check.

Scott Alexander is probably one of the best bloggers writing today, and one of the most honest, so much so that his approach to religious experiences is somewhat better. For example, although I was unfortunately unable to find the text just now, possibly because it was in a comment (and some of those threads have thousands of comments) and not in a post, he once spoke about the Miracle of the Sun at Fatima, and jokingly called it something like “a glitch in the matrix.” The implication was that (1) he does not believe in the religious explanation, but nonetheless (2) the typical “debunkings” are just not very plausible. I agree with this. There are some hints that there might be a natural explanation, but the suggestions are fairly stretched compared to the facts.

December 24th, 2010 – January 4th, 2011

What follows is a description of events that happened to me personally in the period named. They are facts. They are not lies. There is no distortion, not due to human memory failures or anything else. The account here is based on detailed records that I made at the time, which I still possess, and which I just reviewed today to ensure that there would be no mistake.

At that time I was a practicing Catholic. On December 24th, 2010, I started a novena to Mary. I was concerned about a friend’s vocation; I believed that they were called to religious life; they had thought the same for a long time but were beginning to change their mind. The intention of the novena was to respond to this situation.

I did not mention this novena to anyone at the time, or to anyone at all before the events described here.

The last day of the novena was January 1st, 2011, a Marian feast day. (It is a typical practice to end a novena on a feast day of the saint to whom the novena is directed.)

On January 4th, 2011, I had a conversation with the same friend. I made no mention at any point during this conversation of the above novena, and there is no way that they could have known about it, or at any rate no way that our debunking friends would consider “ordinary.”

They told me about events that happened to them on January 2nd, 2011.

Note that these events were second hand for me (narrated by my friend) and third hand for any readers this blog might have. This does not matter, however; since my friend had no idea about the novena, even if they were completely making it up (which I do not believe at all), it would be nearly as surprising.

When praying a novena, it is typical to expect the “answer to the prayer” on the last day or on the day after, as in an example online:

The Benedictine nuns of St Cecilia’s Abbey on the Isle of Wight (http://www.stceciliasabbey.org.uk) recently started a novena to Fr Doyle with the specific intention of finding some Irish vocations. Anybody with even a passing awareness of the Catholic Church in Ireland is aware that there is a deep vocations crisis. Well, the day after the novena ended, a young Irish lady in her 20’s arrived for a visit at the convent. Today, the Feast of the Immaculate Conception, she will start her time as a postulant at St Cecilia’s Abbey.

Some might dismiss this as coincidence. Those with faith will see it in a different light. Readers can make up their own minds. 

January 2nd, 2011, was the day after my novena ended, and the day to which my friend (unaware of the novena) attributed the following event:

They happened to meet with another person, one who was basically a stranger to them, but met through a mutual acquaintance (mutual to my friend and the stranger; unknown to me). This person (the stranger) asked my friend to pray with her. She then told my friend that “Our Lady knows that you suffer a lot… She wants you to become a religious and she is afraid that you are going astray…”

Apart from a grammatical change for context, the above sentences are a direct quotation from my friend’s account. Note the relationship with the text I placed in bold earlier.

To be Continued

I may have more to say about these events, but for now I want to say two things:

(1) These events actually happened. The attitude of the debunkers is that if anything “extraordinary” ever happens, it is at best a psychological experience, not a question of the facts. This is just false, and this is what I referred to when I mentioned their second error in the previous post.

(2) I do not accept a religious explanation of these events (at any rate not in any sense that would imply that a body of religious doctrine is true as a whole).

The Debunkers

Why are they all blurry?

In a recent article, Michael Shermer says about UFOs:

UFOlogists claim that extraordinary evidence exists in the form of tens of thousands of UFO sightings. But SETI scientist Seth Shostak points out in his book Confessions of an Alien Hunter: A Scientist’s Search for Extraterrestrial Intelligence that this actually argues against UFOs being ETIs, because to date not one of these tens of thousands of sightings has materialized into concrete evidence that UFO sightings equal ETI contact. Lacking physical evidence or sharp and clear photographs and videos, more sightings equals less certainty because with so many unidentified objects purportedly zipping around our airspace we surely should have captured one by now, and we haven’t. And where are all the high-definition photographs and videos captured by passengers on commercial airliners? The aforementioned Navy pilot Ryan Graves told 60 Minutes’ correspondent Bill Whitaker that they had seen UAPs “every day for at least a couple of years.” If true, given that nearly every passenger has a smart phone with a high-definition camera, there should be thousands of unmistakable photographs and videos of these UAPs. To date there is not one. Here, the absence of evidence is evidence of absence.

So you say everything is always vague? There is never any clear evidence?

Richard Carrier accidentally gives the game away when making the same point:

Which leads to the next general principle: notice how real UFO videos (that is, ones that aren’t faked) are always out-of-focus or grainy, fuzzy, or in dim light or infrared or other conditions of extreme ambiguity (you can barely tell even what is being imaged). This is a huge red flag. Exactly as with the errors of human cognition, here we already know we should expect difficulty identifying an object, because we are looking at unclear footage. That “UFOs” always only ever show up in ambiguous footage like this is evidence they are not remarkable. Real alien ships endeavoring to be this visible would have been filmed in much clearer conditions by now. Whereas vehicles able to hide from such filming would never even show up under the conditions of these videos. When you make the conditions so bad you can barely discern obvious things, you have by definition made them so bad you won’t even see less-than-obvious things.

Notice what? “Ones that aren’t faked?” What I notice is that you aren’t actually saying that all UFO reports and videos and so on are vague and unclear. There are plenty of clear ones. You just believe that the clear reports are fake.

Which is fine. You are welcome to believe that. But don’t pretend that all the reports are vague. This drastically reduces the strength of the argument. Your real argument is more or less, “If UFOs were aliens, we would have expected, after all this time, there would be so much evidence that everyone would already have been convinced. But I am not convinced and many people are not convinced. Therefore UFOs must not be aliens.”

Even in its real form, this is not a weak argument. It is actually a pretty good one. It is nonetheless weaker in the case of UFOs than in many other cases where similar arguments are made, because the evidence could easily be reconciled with a situation where the vast majority of UFOs are not aliens, a few or many “clear” cases are hoaxes, and a few clear cases are aliens who typically are attempting to avoid human notice, but who fail or make an exception in a very small number of cases. And in general the argument is more likely to fail in situations where the phenomena might be very rare, or in situations where something is deliberately hidden (e.g. where there are actual conspiracies).

The Courage of Robin Hanson

In a sequence of posts beginning around last December, Robin Hanson has been attempting to think carefully about the possibility of UFOs as aliens. In a pair of posts at the end of March, he first presents a list of facts that would need to be explained under that hypothesis, and then in the second presents his proposal to explain those facts.

In the following post, he makes some comments on the fact of having the discussion in the first place:

I’ve noticed that this topic of UFOs makes me feel especially uncomfortable. I look at the many details, and many seem to cry out “there really is something important here.” But I know full well that most people refuse to look at the details, and are quick to denigrate those who do, being confident in getting wide social support when they do.

So I’m forced to choose between my intellectual standards, which say to go where the evidence leads, and my desire for social approval, or at least not extra disapproval. I know which one I’m idealistically supposed to pick, but I also know that I don’t really care as much for picking the things you are supposed to pick as I pretend to myself or others.

We often fantasize about being confronted with a big moral dilemma, so we can prove our morality to ourselves and others. But we should mostly be glad we don’t get what we wish for, as we are often quite wrong about how we would actually act.

This is not merely theoretical. He in fact receives quite a bit of pushback in these posts, some of it rather insulting. For example, in this recent post, someone says in the comments:

When there’s a phenomenon like Bigfoot or the Loch Ness Monster or Alien visitors, believers often point to “all the evidence”. But lots of bad evidence doesn’t equal good evidence! Navy pilots who say they see UFOs “everyday” actually are providing support for the idea that they are misidentifying something mundane. When talking to those who believe in a phenomenon with low plausibility, the best way to start is by saying, “Lets discuss the *single best piece of evidence you have* and then consider other pieces separately.”

I have seen UFO’s twice and each time my brow was furrowed in a vain attempt to understand what I had just witnessed. If I hadn’t simply been lucky enough to see the illusion again from another perspective, each time I would have walked away convinced that I had seen a large, extremely fast craft far away and not a small, slow object quite close to me. And I’m not easy to fool, as I already understand how perspective can be deceiving.

I get the idea that your skeptic skills may be under-exercised compared to the rest of your intellect. I’d recommend reading the Shermer book, “Why People Believe Weird Things” or Sagan’s “The Demon Haunted World.” Both are fun reads.

(5ive)

Robin replies,

Your response style, lecturing me about basics, illustrates my status point. People feel free to treat anyone who isn’t on board with full-skeptical like children in need of a lecture.

The debunkers, who are very often the same few people (note that 5ive refers to a book by Michael Shermer), tend to batch together a wide variety of topics (e.g. “Bigfoot or the Loch Ness Monster or Alien visitors”) as “bunk.” You could describe what these things have in common in various ways, but one of the most evident ways is what makes them count as bunk: There is “lots of bad evidence.” That is, as we noted above about UFOs, there is enough evidence to convince some people, but not enough to convince everyone, and the debunkers suppose this situation is just not believable; if the thing were real, they say, everyone would already know.

As I said, this is a pretty good argument, and this generally holds for the sorts of things the debunkers oppose. But this argument can also easily fail, as it did in the case of the meteorites. While people might accept this as a general remark, it nonetheless takes a great deal of courage to suggest that some particular case might be such a case, since as Robin notes, it automatically counts as low status and causes one to be subject to immediate ridicule.

In any case, whether or not the debunkers are right about UFOs or any other particular case, there are at least two general things that they are definitely mistaken about. One is the idea that people who discuss such topics without complete agreement with them are automatically ridiculous. The second will be the topic of another post.

A Correction Regarding Laplace

A few years ago, I quoted Stanley Jaki on an episode supposedly involving Laplace:

Laplace shouted, “We have had enough such myths,” when his fellow academician Marc-Auguste Pictet urged, in the full hearing of the Académie des Sciences, that attention be given to the report about a huge meteor shower that fell at L’Aigle, near Paris, on April 26, 1803.

I referred to this recently on Twitter. When another user found it surprising that Laplace would have said this, I attempted to track it down, and came to the conclusion that this very account is a “myth” itself, in some sense. Jaki tells the same story in different words in the book Miracles and Physics:

The defense of miracles done with an eye on physics should include a passing reference to meteorites. Characteristic of the stubborn resistance of scientific academies to those strange bits of matter was Laplace’s shouting, “We’ve had enough of such myths,” when Pictet, a fellow academician, urged a reconsideration of the evidence provided by “lay-people” as plain eyewitnesses.

(p. 94)

Jaki provides no reference in God and the Sun at Fatima. The text in Miracles and Physics has a footnote, but it provides generic related information that does not lead back to any such episode.

Did Jaki make it up? People do just make things up, but in this case whatever benefit Jaki might get from it would seem to be outweighed by the potential reputational damage of being discovered in such a lie, so it seems unlikely. More likely he is telling a story from memory, with the belief that the details just don’t matter very much. And since he provides plenty of other sources, I am sure he knows full well that he is omitting any source here, presumably because he does not have one at hand. He may even be trying to cover up this omission, in a sense, by footnoting the passage with information that does not source it. It seems likely that the story is a lecture hall account that has been modified by the passage of time. One reason to suppose such a source is that Jaki is not alone in the claim that Laplace opposed the idea of meteorites as stones from the sky until 1803. E.T. Jaynes, in Probability Theory: The Logic of Science, makes a similar claim:

Note that we can recognize the clear truth of this psychological phenomenon without taking any stand about the truth of the miracle; it is possible that the educated people are wrong. For example, in Laplace’s youth educated persons did not believe in meteorites, but dismissed them as ignorant folklore because they are so rarely observed. For one familiar with the laws of mechanics the notion that “stones fall from the sky” seemed preposterous, while those without any conception of mechanical law saw no difficulty in the idea. But the fall at Laigle in 1803, which left fragments studied by Biot and other French scientists, changed the opinions of the educated — including Laplace himself. In this case the uneducated, avid for the marvelous, happened to be right: c’est la vie.

(p. 505)

Like Jaki, Jaynes provides no source. Still, is that good enough reason to doubt the account? Let us examine a text from the book The History of Meteoritics and Key Meteorite Collections. In the article, “Meteorites in history,” Ursula Marvin remarks:

Early in 1802 the French mathematician Pierre-Simon de Laplace (1749-1827) raised the question at the National Institute of a lunar volcanic origin of fallen stones, and quickly gained support for this idea from two physicist colleagues Jean Baptiste Biot (1774-1862) and Siméon-Denis Poisson (1781-1840). The following September, Laplace (1802, p. 277) discussed it in a letter to von Zach.

The idea won additional followers when Biot (1803a) referred to it as ‘Laplace’s hypothesis’, although Laplace, himself, never published an article on it.

(p. 49)

This has a source for Laplace’s letter of 1802, although I was not able to find it online. It seems very unlikely that Laplace would have speculated on meteorites as coming from lunar volcanos in 1802, and then called them “myths” in 1803. So where does this story come from? In Cosmic Debris: Meteorites in History, John Burke gives this account:

There is also a problem with respect to the number of French scientists who, after Pictet published a résumé of Howard’s article in the May 1802 issue of the Bibliothèque Britannique, continued to oppose the idea that stones fell from the atmosphere. One can infer from a statement of Lamétherie that there was considerable opposition, for he reported that when Pictet read a memoir to the Institut on the results of Howard’s report “he met with such disfavor that it required a great deal of fortitude for him to finish his reading.” However, Biot’s description of the session varies a good deal. Pictet’s account, he wrote, was received with a “cautious eagerness,” though the “desire to explain everything” caused the phenomenon to be rejected for a long time. There were, in fact, only three scientists who publicly expressed their opposition: the brothers Jean-André and Guillaume-Antoine Deluc of Geneva, and Eugène Patrin, an associate member of the mineralogy section of the Institut and librarian at the École des mines.

When Pictet early in 1801 published a favorable review of Chladni’s treatise, it drew immediate fire from the Deluc brothers. Jean, a strict Calvinist, employed the same explanation of a fall that the Fougeroux committee had used thirty years before: stones did not fall; the event was imagined when lightning struck close to the observer. Just as no fragment of our globe separate and become lost in space, he wrote, fragments could not be detached from another planet. It was also very unlikely that solid masses had been wandering in space since the creation, because they would have long since fallen into the sphere of attraction of some planet. And even if they did fall, they would penetrate the earth to a great depth and shatter into a thousand pieces.

(p. 51)

It seems quite possible that Pictet’s “reading a memoir” here and “meeting with disfavor” (regardless of details, since Burke notes it had different descriptions at the time) is the same incident that Jaki describes as having been met with “We’ve had enough of such myths!” when Pictet “urged a reconsideration of the evidence.” If these words were ever said, then, they were presumably said by one of these brothers or someone else, and not by Laplace.

How does this sort of thing happen, if we charitably assume that Jaki was not being fundamentally dishonest? As stated above, it seems likely that he knew he did not have a source. He may even have been consciously aware that it might not have been Laplace who made this statement, if anyone did. But he was sure there was a dispute about the matter, and presumably thought that it just wasn’t too important who it was or what the details of the situation were, since the main point was that scientists are frequently reluctant to accept facts when those facts occur rarely and are not deliberately reproducible. And if we reduce Jaki’s position to these two things, namely, (1) that scientists at one point disputed the reality of meteorites, and (2) that this sort of thing frequently happens with rare and hard-to-reproduce phenomena, then the position is accurate.

But this behavior, the description of situations with the implication that the details just don’t matter much, is very bad, and directly contributes to the reluctance of many scientists to accept the reality of “extraordinary” phenomena, even in situations where they are, in fact, real.

What You Learned Before You Were Born

In Plato’s Meno, Socrates makes the somewhat odd claim that the ability of people to learn things without being directly told them proves that somehow they must have learned them or known them in advance. While we can reasonably assume this is wrong in a literal sense, there is some likeness of the truth here.

The whole of a human life is, generally speaking, a continuous learning process without any sudden jumps. We think of a baby’s learning as different from the learning of a child in school, and the learning of the child as rather different from the learning of an adult. But if you look at that process in itself, there may be sudden jumps in a person’s situation, such as when they graduate from school or when they get married, but there are no sudden jumps from not knowing anything about a topic or an object to suddenly knowing all about it. The learning itself happens gradually. It is the same with the manner in which it takes place; adults do indeed learn in a different manner from that in which children or infants learn. But if you ask how that manner got to be different, it certainly did so gradually, not suddenly.

But in addition to all this, there is a kind of “knowledge” that is not learned at all during one’s life, but is possessed from the beginning. From the beginning people have the ability to interact with the world in such a way that they will survive and go on to learn things. Thus from the beginning they must “know” how to do this. Now one might object that infants have no such knowledge, and that the only reason they survive is that their parents or others keep them alive. But the objection is mistaken: infants know to cry out when they are hungry or in pain, and this is part of what keeps them alive. Similarly, an infant knows to drink the milk from its mother rather than refusing it, and this is part of what keeps it alive. Likewise in regard to learning, if an infant did not know the importance of paying close attention to speech sounds, it would never learn a language.

When was this “knowledge” learned? Not in the form of a separated soul, but through the historical process of natural selection.

Selection and Artificial Intelligence

This has significant bearing on our final points in the last post. Is the learning found in AI in its current forms more like the first kind of learning above, or like the kind found in the process of natural selection?

There may be a little of both, but the vast majority of learning in such systems is very much the second kind, and not the first kind. For example, AlphaGo is trained by self-play, where moves and methods of play that tend to lose are eliminated in much the way that in the process of natural selection, manners of life that do not promote survival are eliminated. Likewise a predictive model like GPT-3 is trained, through a vast number of examples, to avoid predictions that turn out to be less accurate and to make predictions that tend to be more accurate.

Now (whether or not this is done in individual cases) you might take a model of this kind and fine tune it based on incoming data, perhaps even in real time, which is a bit more like the first kind of learning. But in our actual situation, the majority of what is known by our AI systems is based on the second kind of learning.

This state of affairs should not be surprising, because the first kind of learning described above is impossible without being preceded by the second. The truth in Socrates’ claim is that if a system does not already “know” how to learn, of course it will not learn anything.

Intelligence and Universality

Elsewhere I have mentioned the argument, often made in great annoyance, that people who take some new accomplishment in AI or machine learning and proclaim that it is “not real intelligence” or that the algorithm is “still fundamentally stupid”, and other things of that kind, are “moving the goalposts,” especially since in many such cases, there really were people who said that something that could do such a thing would be intelligent.

As I said in the linked post, however, there is no problem of moving goalposts unless you originally had them in the wrong place. And attaching intelligence to any particular accomplishment, such as “playing chess well” or even “producing a sensible sounding text,” or anything else with that sort of particularity, is misplacing the goalposts. As we might remember, what excited Francis Bacon was the thought that there were no clear limits, at all, on what science (namely the working out of intelligence) might accomplish. In fact he seems to have believed that there were no limits at all, which is false. Nonetheless, he was correct that those limits are extremely vague, and that much that many assumed to be impossible would turn out to be possible. In other words, human intelligence does not have very meaningful limits on what it can accomplish, and artificial intelligence will be real intelligence (in the same sense that artificial diamonds can be real diamonds) when artificial intelligence has no meaningful limits on what it can accomplish.

I have no time for playing games with objections like, “but humans can’t multiply two 1000 digit numbers in one second, and no amount of thought will give them that ability.” If you have questions of this kind, please answer them for yourself, and if you can’t, sit still and think about it until you can. I have full confidence in your ability to find the answers, given sufficient thought.

What is needed for “real intelligence,” then, is universality. In a sense everyone knew all along that this was the right place for the goalposts. Even if someone said “if a machine can play chess, it will be intelligent,” they almost certainly meant that their expectation was that a machine that could play chess would have no clear limits on what it could accomplish. If you could have told them for a fact that the future would be different, namely that a machine would be able to play chess but that that particular machine would never be able to do anything else, they would have conceded that the machine would not be intelligent.

Training and Universality

Current AI systems are not universal, and clearly have no ability whatsoever to become universal, without first undergoing deep changes in those systems, changes that would have to be initiated by human beings. What is missing?

The problem is the training data. The process of evolution produced the general ability to learn by using the world itself as the training data. In contrast, our AI systems take a very small subset of the world (like a large set of Go games or a large set of internet text), and train a learning system on that subset. Why take a subset? Because the world is too large to fit into a computer, especially if that computer is a small part of the world.

This suggests that going from the current situation to “artificial but real” intelligence is not merely a question of making things better and better little by little. There is a more fundamental problem that would have to be overcome, and it won’t be overcome simply by larger training sets, by faster computing, and things of this kind. This does not mean that the problem is impossible, but it may turn out to be much more difficult than people expected. For example, if there is no direct solution, people might try to create Robin Hanson’s “ems”, where one would more or less copy the learning achieved by natural selection. Or even if that is not done directly, a better understanding of what it means to “know how to learn,” might lead to a solution, although probably one that would not depend on training a model on massive amounts of data.

What happens if there is no solution, or no solution is found? At times people will object to the possibility of such a situation along these lines: “this situation is incoherent, since obviously people will be able to keep making better and better machine learning systems, so sooner or later they will be just as good as human intelligence.” But in fact the situation is not incoherent; if it happened, various types of AI system would approach various asymptotes, and this is entirely coherent. We can already see this in the case of GPT-3, where, as I noted, there is an absolute bound on its future performance. In general such bounds in their realistic form are more restrictive than their in-principle form; I do not actually expect some successor to GPT-3 to write sensible full-length books. Note however that even if this happened (as long as the content itself was not fundamentally better than what humans have done) I would not be “moving the goalposts”; I do not expect that to happen, but its happening would not imply any fundamental difference, since this is still within the “absolute” bounds that we have discussed. In contrast, if a successor to GPT-3 published a cure for cancer, this would prove that I had made some mistake on the level of principle.

Some Remarks on GPT-N

At the end of May, OpenAI published a paper on GPT-3, a language model which is a successor to their previous version, GPT-2. While quite impressive, the reaction from many people interested in artificial intelligence has been seriously exaggerated. Sam Altman, OpenAI’s CEO, has said as much himself:

The GPT-3 hype is way too much. It’s impressive (thanks for the nice compliments!) but it still has serious weaknesses and sometimes makes very silly mistakes. AI is going to change the world, but GPT-3 is just a very early glimpse. We have a lot still to figure out.

I used “GPT-N” in the title here because most of the comments I intend to make are almost completely general, and will apply to any future version that uses sufficiently similar methods.

What it does

GPT-3 is a predictive language model, that is, given an input text it tries to predict what would come next, much in the way that if you read the first few words of this sentence with the rest covered up, you might try to guess what would be likely to come next. To the degree that it does this well, it can be used to generate text from a “prompt,” that is, we give it something like a few words or a few sentences, and then add whatever it predicts should come next. For example, let’s take this very blog post and see what GPT-3 would like to say:

What it doesn’t do

While GPT-3 does seem to be able to generate some pretty interesting results, there are several limitations that need to be taken into account when using it.

First and foremost, and most importantly, it can’t do anything without a large amount of input data. If you want it to write like “a real human,” you need to give it a lot of real human writing. For most people, this means copying and pasting a lot. And while the program is able to read through that and get a feel for the way humans communicate, you can’t exactly use it to write essays or research papers. The best you could do is use it as a “fill in the blank” tool to write stories, and that’s not even very impressive.

While the program does learn from what it reads and is quite good at predicting words and phrases based on what has already been written, this method isn’t very effective at producing realistic prose. The best you could hope for is something like the “Deep Writing Machine” Twitter account, which spits out disconnected phrases in an ominous, but very bland voice.

In addition, the model is limited only to language. It does not understand context or human thought at all, so it has no way of tying anything together. You could use it to generate a massive amount of backstory and other material for a game, but that’s about it.

Finally, the limitations in writing are only reinforced by the limitations in reading. Even with a large library to draw on, the program is only as good as the parameters set for it. Even if you set it to the greatest writers mankind has ever known, without any special parameters, its writing would be just like anyone else’s.

The Model

GPT-3 consists of several layers. The first layer is a “memory network” that involves the program remembering previously entered data and using it when appropriate (i.e. it remembers commonly misspelled words and frequently used words). The next layer is the reasoning network, which involves common sense logic (i.e. if A, then B). The third is the repetition network, which involves pulling previously used material from memory and using it to create new combinations (i.e. using previously used words in new orders).

I added the bold formatting; the rest is as produced by the model. This was also done in one run, without repetitions. This is an important qualification, since many examples on the internet have been produced by deleting something produced by the model and forcing it to generate something new until something sensible resulted. Note that the model does not seem to have understood my line, “let’s take this very blog post and see what GPT-3 would like to say.” That is, rather than trying to “say” anything, it attempted to continue the blog post in the way I might have continued it without the block quote.
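Readers who want to try this kind of continuation themselves can do so with the openly available GPT-2, as a stand-in for GPT-3 (which is only reachable through OpenAI's paid API), via the Hugging Face transformers library. A minimal sketch, where the prompt text and settings are just examples:

```python
# A minimal sketch of prompt continuation, using the open GPT-2 model as a
# stand-in for GPT-3 (which is only accessible through OpenAI's API).
# Assumes: pip install transformers torch
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = ("GPT-3 is a predictive language model, that is, given an input text "
          "it tries to predict what would come next")
result = generator(prompt, max_length=100, do_sample=True)

# The output is the prompt plus the model's guess at how it would continue.
print(result[0]["generated_text"])
```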

Truth vs Probability of Text

If we interpret the above text from GPT-3 “charitably”, much of it is true or close to true. But I use scare quotes here because when we speak of interpreting human speech charitably, we are assuming that someone was trying to speak the truth, and so we think, “What would they have meant if they were trying to say something true?” The situation is different here, because GPT-3 has no intention of producing truth, nor of avoiding it. Insofar as there is any intention, the intention is to produce the text which would be likely to come after the input text; in this case, as the input text was the beginning of this blog post, the intention was to produce the text that would likely follow in such a post. Note that there is an indirect relationship with truth, which explains why there is any truth at all in GPT-3’s remarks. If the input text is true, it is at least somewhat likely that what would follow would also be true, so if the model is good at guessing what would be likely to follow, it will be likely to produce something true in such cases. But it is just as easy to convince it to produce something false, simply by providing an input text that would be likely to be followed by something false.
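To put the same point a little more formally: a model of this kind is trained only to maximize the probability it assigns to the next token of its training texts, roughly

$$\max_\theta \; \sum_t \log p_\theta(x_t \mid x_1, \dots, x_{t-1}),$$

and nothing in this objective refers to whether a continuation is true; a false continuation that is typical of the training data scores better than a true one that is atypical.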

This results in an absolute upper limit on the quality of the output of a model of this kind, including any successor version, as long as the model works by predicting the probability of the following text. Namely, its best output cannot be substantially better than the best content in its training data, which in this version is a large quantity of texts from the internet. The reason for this limitation is clear; to the degree that the model has any intention at all, the intention is to reflect the training data, not to surpass it. As an example, consider the difference between DeepMind’s AlphaGo and AlphaGo Zero. AlphaGo Zero is a better Go player than the original AlphaGo, and this is largely because the original is trained on human play, while AlphaGo Zero is trained from scratch on self-play. In other words, the original version is to some extent predicting “what would a Go player play in this situation,” which is not the same as predicting “what move would win in this situation.”

Now I will predict (and perhaps even GPT-3 could predict) that many people will want to jump in and say, “Great. That shows you are wrong. Even the original AlphaGo plays Go much better than a human. So there is no reason that an advanced version of GPT-3 could not be better than humans at saying things that are true.”

The difference, of course, is that AlphaGo was trained in two ways, first on predicting what move would be likely in a human game, and second on what would be likely to win, based on its experience during self play. If you had trained the model only on predicting what would follow in human games, without the second aspect, the model would not have resulted in play that substantially improved upon human performance. But in the case of GPT-3 or any model trained in the same way, there is no selection whatsoever for truth as such; it is trained only to predict what would follow in a human text. So no successor to GPT-3, in the sense of a model of this particular kind, however large, will ever be able to produce output better than human, or in its own words, “its writing would be just like anyone else’s.”

Self Knowledge and Goals

OpenAI originally claimed that GPT-2 was too dangerous to release; ironically, they now intend to sell access to GPT-3. Nonetheless, many people, in large part those influenced by the opinions of Nick Bostrom and Eliezer Yudkowsky, continue to worry that an advanced version might turn out to be a personal agent with nefarious goals, or at least goals that would conflict with the human good. Thus Alexander Kruel:

GPT-2: *writes poems*
Skeptics: Meh
GPT-3: *writes code for a simple but functioning app*
Skeptics: Gimmick.
GPT-4: *proves simple but novel math theorems*
Skeptics: Interesting but not useful.
GPT-5: *creates GPT-6*
Skeptics: Wait! What?
GPT-6: *FOOM*
Skeptics: *dead*

In a sense the argument is moot, since I have explained above why no future version of GPT will ever be able to produce anything better than people can produce themselves. But even if we ignore that fact, GPT-3 is not a personal agent of any kind, and seeks goals in no meaningful sense, and the same will apply to any future version that works in substantially the same way.

The basic reason for this is that GPT-3 is disembodied, in the sense of this earlier post on Nick Bostrom’s orthogonality thesis. The only thing it “knows” is texts, and the only “experience” it can have is receiving an input text. So it does not know that it exists, it cannot learn that it can affect the world, and consequently it cannot engage in goal seeking behavior.

You might object that it can in fact affect the world, since it is in fact in the world. Its predictions cause an output, and that output is in the world. And that output can be reintroduced as input (which is how “conversations” with GPT-3 are produced). Thus it seems it can experience the results of its own activities, and so it should be able to acquire self knowledge and goals. This objection is not ultimately correct, but it is not so far from the truth. You would not need extremely large modifications in order to make something that in principle could acquire self knowledge and seek goals. The main reason that this cannot happen is the “P” in “GPT,” that is, the fact that the model is “pre-trained.” The only learning that can happen is the learning that happens while it is reading an input text, and that learning serves only to work out what is happening in this one specific text, in order to guess what is coming next in it. All of this learning vanishes upon finishing the prediction task and receiving another input. A secondary reason is that since the only experience it can have is receiving an input text, even if it were given a longer memory, it would probably not be able to notice that its outputs were caused by its predictions, because it likely has no internal mechanism to reflect on the predictions themselves.
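Here is a minimal sketch of the conversation loop just described, using the publicly released GPT-2 weights from the Hugging Face transformers library as a stand-in for GPT-3 (whose weights are not public); the prompt text is made up, and the mechanism rather than the specific output is the point.

```python
# A minimal sketch of how "conversations" with a pre-trained text predictor work:
# the model's own output is simply appended to the growing prompt and fed back in.
# GPT-3's weights are not public, so this uses GPT-2 from the Hugging Face
# `transformers` library as a stand-in; the mechanism is the same.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")  # weights stay frozen: no learning occurs below

context = "Q: What is missing from current AI systems?\nA:"
for turn in range(3):
    inputs = tokenizer(context, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=40,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    # The decoded output contains the old context plus the continuation;
    # reusing it as the next context is the entire "memory" of the conversation.
    context = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    context += "\nQ: And why is that?\nA:"

print(context)
```

Nothing in the loop updates the model’s weights; whatever the model “worked out” about the conversation exists only in the text being passed back in, and disappears as soon as that text is no longer supplied.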

Nonetheless, if you “fixed” these two problems, by allowing it to continue to learn, and by allowing its internal representations to be part of its own input, there is nothing in principle that would prevent it from achieving self knowledge and from seeking goals. Would this be dangerous? Not very likely. As indicated elsewhere, motivation produced in this way, without the biological history that produced human motivation, is not likely to be very intense. In this context, if we are speaking of taking a text-predicting model and adding on an ability to learn and reflect on its predictions, it is likely to enjoy doing those things and not much else. To many this argument will seem “hand-wavy” and very weak. I could go into this at more depth, but I will not do so at this time, and will simply invite the reader to spend more time thinking about it. Dangerous or not, would it be easy to make these modifications? Nothing in this description sounds difficult, but no, it would not be easy. Actually making an artificial intelligence is hard. But this is a story for another time.

Fire, Water, and Numbers

Fire vs. Water

“All things are water,” says Thales.

“All things are fire,” says Heraclitus.

“Wait,” says David Hume’s Philo. “You both agree that all things are made up of one substance. Thales, you prefer to call it water, and Heraclitus, you prefer to call it fire. But isn’t that merely a verbal dispute? According to both of you, whatever you point at is fundamentally the same stuff. So whether you point at water or fire, or anything else, for that matter, you are always pointing at the same fundamental stuff. Where is the real disagreement?”

Philo has a somewhat valid point here, and I mentioned the same thing in the linked post referring to Thales. Nonetheless, as I also said in that post, as well as in the discussion of the disagreement about God, while there is some common ground, there are also likely to be remaining points of disagreement. It might depend on context, and perhaps the disagreement is more about the best way of thinking about things than about the things themselves, somewhat like the question of whether it is the earth or the rest of the universe that is spinning. Heraclitus could respond, for example, that thinking of the fundamental stuff as fire is more apt because fire is constantly changing, while water often appears to be completely still, and (Heraclitus claims) everything is in fact constantly changing. This could represent a real disagreement, but it is not a large one, and Thales could simply respond: “Ok, everything is flowing water. Problem fixed.”

Numbers

It is said that Pythagoras and his followers held that “all things are numbers.” To what degree and in what sense this attribution is accurate is unclear, but in any case, some people hold this very position today, even if they would not call themselves Pythagoreans. Thus for example in a recent episode of Sean Carroll’s podcast, Carroll speaks with Max Tegmark, who seems to adopt this position:

0:23:37 MT: It’s squishy a little bit blue and moose like. [laughter] Those properties, I just described don’t sound very mathematical at all. But when we look at it, Sean through our physics eyes, we see that it’s actually a blob of quarks and electrons. And what properties does an electron have? It has the property, minus one, one half, one, and so on. We, physicists have made up these nerdy names for these properties like electric charge, spin, lepton number. But it’s just we humans who invented that language of calling them that, they are really just numbers. And you know as well as I do that the only difference between an electron and a top quark is what numbers its properties are. We have not discovered any other properties that they actually have. So that’s the stuff in space, all the different particles, in the Standard Model, you’ve written so much nice stuff about in your books are all described by just by sets of numbers. What about the space that they’re in? What property does the space have? I think I actually have your old nerdy non-popular, right?

0:24:50 SC: My unpopular book, yes.

0:24:52 MT: Space has, for example, the property three, that’s a number and we have a nerdy name for that too. We call it the dimensionality of space. It’s the maximum number of fingers I can put in space that are all perpendicular to each other. The name dimensionality is just the human language thing, the property is three. We also discovered that it has some other properties, like curvature and topology that Einstein was interested in. But those are all mathematical properties too. And as far as we know today in physics, we have never discovered any properties of either space or the stuff in space yet that are actually non-mathematical. And then it starts to feel a little bit less insane that maybe we are living in a mathematical object. It’s not so different from if you were a character living in a video game. And you started to analyze how your world worked. You would secretly be discovering just the mathematical workings of the code, right?

Tegmark presumably would believe that by saying that things “are really just numbers,” he would disagree with Thales and Heraclitus about the nature of things. But does he? Philo might well be skeptical that there is any meaningful disagreement here, just as between Thales and Heraclitus. As soon as you begin to say, “all things are this particular kind of thing,” the same issues will arise to hinder your disagreement with others who characterize things in a different way.

The discussion might be clearer if I put my cards on the table in advance:

First, there is some validity to the objection, just as there is to the objection concerning the difference between Thales and Heraclitus.

Second, there is nonetheless some residual disagreement, and on that basis it turns out that Tegmark and Pythagoras are more correct than Thales and Heraclitus.

Third, Tegmark most likely does not understand the sense in which he might be correct, rather supposing himself correct the way Thales might suppose himself correct in insisting, “No, things are really not fire, they are really water.”

Mathematical and non-mathematical properties

As an approach to these issues, consider the statement by Tegmark, “We have never discovered any properties of either space or the stuff in space yet that are actually non-mathematical.”

What would it look like if we found a property that was “actually non-mathematical?” Well, what about the property of being blue? As Tegmark remarks, that does not sound very mathematical. But it turns out that color is a certain property of a surface regarding how it reflects light, and this is much more of a “mathematical” property, at least in the sense that we can give it a mathematical description, which we would have a hard time doing if we simply took the word “blue.”

So presumably we would find a non-mathematical property by seeing some property of things, then investigating it, and then concluding, “We have fully investigated this property and there is no mathematical description of it.” This did not happen with the color blue, nor has it yet happened with any other property; either we can say that we have not yet fully investigated it, or we can give some sort of mathematical description.

Tegmark appears to take the above situation to be surprising. Wow, we might have found reality to be non-mathematical, but it actually turns out to be entirely mathematical! I suggest something different. As the connection with the linked post hints, things could not have turned out differently. A sufficiently detailed analysis of anything will be a mathematical analysis or something very like it. But this is not because things “are actually just numbers,” as though this were some deep discovery about the essence of things, but because of what it is for people to engage in “a detailed analysis” of anything.

Suppose you want to investigate some thing or some property. The first thing you need to do is to distinguish it from other things or other properties. The color blue is not the color red, the color yellow, or the color green.

Numbers are involved right here at the very first step. There are at least three colors, namely red, yellow, and blue.

Of course we can find more colors, but what if there seems to be no definite number of them, and we can always find more? Even in this situation, in order to “analyze” them, we need some way of distinguishing and comparing them. We will put them in some sort of order: one color is brighter than another, or one length is greater than another, or one sound is higher pitched than another.

As soon as you find some ordering of that sort (brightness, or greatness of length, or pitch), it will become possible to give a mathematical analysis in terms of the real numbers, as we discussed in relation to “good” and “better.” Now someone defending Tegmark might respond: there was no guarantee we would find any such measure or any such method to compare them. Without such a measure, you could perhaps count your property along with other properties. But you could not give a mathematical analysis of the property itself. So it is surprising that it turned out this way.

But you distinguished your property from other properties, and that must have involved recognizing some things in common with other properties, at least that it was something rather than nothing and that it was a property, and some ways in which it was different from other properties. Thus for example blue, like red, can be seen, while a musical note can be heard but not seen (at least by most people.) Red and blue have in common that they are colors. But what is the difference between them? If we are to respond in any way to this question, except perhaps, “it looks different,” we must find some comparison. And if we find a comparison, we are well on the way to a mathematical account. If we don’t find a comparison, people might rightly complain that we have not yet done any detailed investigation.

But to make the point stronger, let’s assume the best we can do is “it looks different.” Even if this is the case, this very thing will allow us to construct a comparison that will ultimately allow us to construct a mathematical measure. For “it looks different” is itself something that comes in degrees. Blue looks different from red, but orange does so as well, just less different. Insofar as this judgment is somewhat subjective, it might be hard to get a great deal of accuracy with this method. But it would indeed begin to supply us with a kind of sliding scale of colors, and we would be able to number this scale with the real numbers.
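Here is a toy sketch of that last move, from “it looks different” to a numbered scale; the dissimilarity judgments below are invented for illustration, and the method (anchoring everything to its judged distance from red) is just one crude way of doing it.

```python
# A toy version of the move from "it looks different" to a numbered scale.
# The dissimilarity judgments below are invented for illustration; the only point
# is that once differences come in degrees, a real-number scale follows.

# Judged "how different does it look from red?" (0 = looks identical).
difference_from_red = {
    "red": 0.0,
    "orange": 1.0,
    "yellow": 2.0,
    "green": 3.5,
    "blue": 5.0,
    "violet": 6.0,
}

# Ordering the colors by these judgments yields a sliding scale, and every color
# now has a position that can be expressed as a real number.
for color, position in sorted(difference_from_red.items(), key=lambda item: item[1]):
    print(f"{color:>7}: {position}")
```

Rough and subjective as such judgments are, the moment differences come in degrees, each item acquires a position expressible as a real number.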

From a historical point of view, it took a while for people to realize that this would always be possible. Thus for example Isidore of Seville said that “unless sounds are held by the memory of man, they perish, because they cannot be written down.” It was not, however, so much ignorance of sound that caused this, as ignorance of “detailed analysis.”

This is closely connected to what we said about names. A mathematical analysis is a detailed system of naming, where we name not only individual items, but also various groups, using names like “two,” “three,” and “four.” If we find that we cannot simply count the thing, but we can always find more examples, we look for comparative ways to name them. And when we find a comparison, we note that some things are more distant from one end of the scale and other things are less distant. This allows us to analyze the property using real numbers or some similar mathematical concept. This is also related to our discussion of technical terminology; in an advanced stage any science will begin to use somewhat mathematical methods. Unfortunately, this can also result in people adopting mathematical language in order to look like their understanding has reached an advanced stage, when it has not.

It should be sufficiently clear from this why I suggested that things could not have turned out otherwise. A “non-mathematical” property, in Tegmark’s sense, can only be a property you haven’t analyzed, or one that you haven’t succeeded in analyzing if you did attempt it.

The three consequences

Above, I made three claims about Tegmark’s position. The reasons for them may already be somewhat clarified by the above, but nonetheless I will look at this in a bit more detail.

First, I said there was some truth in the objection that “everything is numbers” is not much different from “everything is water,” or “everything is fire.” One notices some “hand-waving,” so to speak, in Tegmark’s claim that “We, physicists have made up these nerdy names for these properties like electric charge, spin, lepton number. But it’s just we humans who invented that language of calling them that, they are really just numbers.” A measure of charge or spin or whatever may be a number. But who is to say the thing being measured is a number? Nonetheless, there is a reasonable point there. If you are to give an account at all, it will in some way express the form of the thing, which implies explaining relationships, which depends on the distinction of various related things, which entails the possibility of counting the things that are related. In other words, someone could say, “You have a mathematical account of a thing. But the thing itself is non-mathematical.” But if you then ask them to explain that non-mathematical thing, the new explanation will be just as mathematical as the original explanation.

Given this fact, namely that the “mathematical” aspect is a question of how detailed explanations work, what is the difference between saying “we can give a mathematical explanation, but apart from explanations, the things are numbers,” and “we can give a mathematical explanation, but apart from explanations, the things are fires?”

Exactly. There isn’t much difference. Nonetheless, I made the second claim that there is some residual disagreement and that by this measure, the mathematical claim is better than the one about fire or water. Of course we don’t really know what Thales or Heraclitus thought in detail. But Aristotle, at any rate, claimed that Thales intended to assert that material causes alone exist. And this would be at least a reasonable understanding of the claim that all things are water, or fire. Just as Heraclitus could say that fire is a better term than water because fire is always changing, Thales, if he really wanted to exclude other causes, could say that water is a better term than “numbers” because water seems to be material and numbers do not. But since other causes do exist, the opposite is the case: the mathematical claim is better than the materialistic ones.

Many people say that Tegmark’s account is flawed in a similar way, but with respect to another cause; that is, they say that mathematical accounts exclude final causes. But this is a lot like Ed Feser’s claim that a mathematical account of color implies that colors don’t really exist; namely, the two claims are alike in simply being wrong. A mathematical account of color does not imply that things are not colored, and a mathematical account of the world does not imply that final causes do not exist. As I said early on, a final cause explains why an efficient cause does what it does, and there is nothing about a mathematical explanation that prevents you from saying why the efficient cause does what it does.

My third point, that Tegmark does not understand the sense in which he is right, should be plain enough. As I stated above, he takes it to be a somewhat surprising discovery that we consistently find it possible to give mathematical accounts of the world, and this only makes sense if we assume it would in theory have been possible to discover something else. But that could not have happened, not because the world couldn’t have been a certain way, but because of the nature of explanation.

The Power of a Name

Fairy tales and other stories occasionally suggest the idea that a name gives some kind of power over the thing named, or at least that one’s problems concerning a thing may be solved by knowing its name, as in the story of Rumpelstiltskin. There is perhaps a similar suggestion in Revelation 2:7, “Whoever has ears, let them hear what the Spirit says to the churches. To the one who is victorious, I will give some of the hidden manna. I will also give that person a white stone with a new name written on it, known only to the one who receives it.” The secrecy of the new name may indicate (among other things) that others will have no power over that person.

There is more truth in this idea than one might assume without much thought. For example, anonymous authors do not want to be “doxxed” because knowing the name of the author really does give some power in relation to them which one does not have without knowing their name. Likewise, as a blogger, I occasionally want to cite something, but cannot remember the name of the author or the article where the statement is made. Even if I remember the content fairly clearly, lacking the name makes finding the content far more difficult, while on the other hand, knowing the name gives me the power of finding the content much more easily.

But let us look a bit more deeply into this. Hilary Lawson, whose position was somewhat discussed here, has a discussion along these lines in Part II of his book, Closure: A Story of Everything. Since he denies that language truly refers to the world at all, as I mentioned in the linked post on his position, it is important to him that language has other effects, and in particular has practical goals. He says in chapter 4:

In order to understand the mechanism of practical linguistic closure consider an example where a proficient speaker of English comes across a new word. Suppose that we are visiting a zoo with a friend. We stand outside a cage and our friend says: ‘An aasvogel.” …

It might appear at first from this example that nothing has been added by the realisation of linguistic closure. The sound ‘aasvogel’ still sounds the same, the image of the bird still looks the same. So what has changed? The sensory closures on either side may not have changed, but a new closure has been realised. A new closure which is in addition to the prior available closures and which enables intervention which was not possible previously. For example, we now have a means of picking out this particular bird in the zoo because the meaning that has been realised will have identified a something in virtue of which this bird is an aasvogel and which thus enables us to distinguish it from others. As a result there will be many consequences for how we might be able to intervene.

The important point here is simply that naming something, even before taking any additional steps, immediately gives one the ability to do various practical things that one could not previously do. In a passage by Helen Keller, previously quoted here, she says:

Since I had no power of thought, I did not compare one mental state with another. So I was not conscious of any change or process going on in my brain when my teacher began to instruct me. I merely felt keen delight in obtaining more easily what I wanted by means of the finger motions she taught me.

We may have similar experiences as adults learning a foreign language while living abroad. At first one has very little ability to interact with the foreign world, but suddenly everything is possible.

Or consider the situation of a hunter-gatherer who may not know how to count. It may be obvious to them that a bigger pile of fruit is better than a smaller one, but if two piles look similar, they may have no way to know which is better. But once they decide to give “one fruit and another” a name like “two,” and “two and one” a name like “three,” and so on, suddenly they obtain a great advantage that they previously did not possess. It is now possible to count piles and to discover that one pile has sixty-four while another has sixty-three. And it turns out that by treating the “sixty-four” as bigger than the other pile, although it does not look bigger, they end up better off.

In this sense one could look at the scientific enterprise of looking for mathematical laws of nature as one long process of looking for better names. We can see that some things are faster and some things are slower, but the vague names “fast” and “slow” cannot accomplish much. Once we can name different speeds more precisely, we can put them all in order and accomplish much more, just as the hunter-gatherer can accomplish more after learning to count. And this extends to the full power of technology: the men who landed on the moon did so ultimately through the power of names.

If you take Lawson’s view, that language does not refer to the world at all, all of this is basically casting magic spells. In fact, he spells this out himself, in so many words, in chapter 3:

All material is in this sense magical. It enables intervention that cannot be understood. Ancient magicians were those who had access to closures that others did not know, in the same way that the Pharaohs had access to closures not available to their subjects. This gave them a supernatural character. It is now thought that their magic has been explained, as the knowledge of herbs, metals or the weather. No such thing has taken place. More powerful closures have been realised, more powerful magic that can subsume the feeble closures of those magicians. We have simply lost sight of its magical character. Anthropology has many accounts of tribes who on being observed by a Western scientist believe that the observer has access to some very powerful magic. Magic that produces sound and images from boxes, and makes travel swift. We are inclined to smile patronisingly believing that we merely have knowledge — the technology behind radio and television, and motor vehicles — and not magic. The closures behind the technology do indeed provide us with knowledge and understanding and enable us to handle activity, but they do not explain how the closures enable intervention. How the closures are successful remains incomprehensible and in this sense is our magic.

I don’t think we should dismiss this point of view entirely, but I do think it is more mistaken than otherwise, basically because of the original mistake of thinking that language cannot refer to the world. But the point that names are extremely powerful is correct and important, to the point where even the analogy of technology as “magic that works” does make a certain amount of sense.

Anticipations of Darwin

I noted here that long before Darwin, there was fairly decent evidence for some sort of theory of evolution, even evidence available from the general human experience of plant and animal life, without deep scientific study.

As said in the earlier post, Aristotle notes that Empedocles hypothesized something along the lines of natural selection:

Wherever then all the parts came about just what they would have been if they had come to be for an end, such things survived, being organized spontaneously in a fitting way; whereas those which grew otherwise perished and continue to perish, as Empedocles says his ‘man-faced ox-progeny’ did.

Since Aristotle is arguing against Empedocles, we should be cautious in assuming that the characterization of his position is entirely accurate. But as presented by Aristotle, the position is an argument against the existence of final causes: since things can be “organized spontaneously” in the way “they would have been if they had come to be for an end,” there is no reason to think they in fact came to be for an end.

This particular conclusion, namely that in such a process nothing comes to be for an end, is a mistake, based on the assumption that different kinds of causes are mutually exclusive, rather than recognizing that different kinds of causes are different ways of explaining one and the same thing. But the general idea regarding what happened historically is correct: good conditions are more capable of persisting, bad conditions less so, and thus over time good conditions tend to predominate.

Other interesting anticipations may be found in Ibn Khaldun‘s book, The Muqaddimah, published in 1377. For example we find this passage:

It should be known that we — may God guide you and us — notice that this world with all the created things in it has a certain order and solid construction. It shows nexuses between causes and things caused, combinations of some parts of creation with others, and transformations of some existent things into others, in a pattern that is both remarkable and endless. Beginning with the world of the body and sensual perception, and therein first with the world of the visible elements, (one notices) how these elements are arranged gradually and continually in an ascending order, from earth to water, (from water) to air, and (from air) to fire. Each one of the elements is prepared to be transformed into the next higher or lower one, and sometimes is transformed. The higher one is always finer than the one preceding it. Eventually, the world of the spheres is reached. They are finer than anything else. They are in layers which are interconnected, in a shape which the senses are able to perceive only through the existence of motions. These motions provide some people with knowledge of the measurements and positions of the spheres, and also with knowledge of the existence of the essences beyond, the influence of which is noticeable in the spheres through the fact (that they have motion).

One should then look at the world of creation. It started out from the minerals and progressed, in an ingenious, gradual manner, to plants and animals. The last stage of minerals is connected with the first stage of plants, such as herbs and seedless plants. The last stage of plants, such as palms and vines, is connected with the first stage of animals, such as snails and shellfish which have only the power of touch. The word “connection” with regard to these created things means that the last stage of each group is fully prepared to become the first stage of the next group.

The animal world then widens, its species become numerous, and, in a gradual process of creation, it finally leads to man, who is able to think and to reflect. The higher stage of man is reached from the world of the monkeys, in which both sagacity and perception are found, but which has not reached the stage of actual reflection and thinking. At this point we come to the first stage of man after (the world of monkeys). This is as far as our (physical) observation extends.

It is possible that he makes his position clearer elsewhere (I have not read the entire work). The passage here does not explicitly assert that humans arose from lower animals, but does suggest it, correctly associating human beings with monkeys in particular, even if some of his other connections are somewhat strange. In other words, both here and elsewhere, he speaks of one stage of things being “prepared to become” another stage, and says that this transition sometimes happens: “Each one of the elements is prepared to be transformed into the next higher or lower one, and sometimes is transformed.”

While Ibn Khaldun is at least suggesting that we notice a biological order that corresponds to some degree to an actual historical order, we do not see in this text any indication of what the mechanism is supposed to be. In contrast, Empedocles gives us a mechanism but no clarity regarding historical order. Admittedly, this may be an artifact of the fact that I have not read more of Ibn Khaldun and the fact that we have only fragments from Empedocles.

One of the strongest anticipations of all, although put in very general terms, can be found in David Hume’s Dialogues Concerning Natural Religion, in the following passage:

Besides, why may not motion have been propagated by impulse through all eternity, and the same stock of it, or nearly the same, be still upheld in the universe? As much is lost by the composition of motion, as much is gained by its resolution. And whatever the causes are, the fact is certain, that matter is, and always has been, in continual agitation, as far as human experience or tradition reaches. There is not probably, at present, in the whole universe, one particle of matter at absolute rest.

And this very consideration too, continued PHILO, which we have stumbled on in the course of the argument, suggests a new hypothesis of cosmogony, that is not absolutely absurd and improbable. Is there a system, an order, an economy of things, by which matter can preserve that perpetual agitation which seems essential to it, and yet maintain a constancy in the forms which it produces? There certainly is such an economy; for this is actually the case with the present world. The continual motion of matter, therefore, in less than infinite transpositions, must produce this economy or order; and by its very nature, that order, when once established, supports itself, for many ages, if not to eternity. But wherever matter is so poised, arranged, and adjusted, as to continue in perpetual motion, and yet preserve a constancy in the forms, its situation must, of necessity, have all the same appearance of art and contrivance which we observe at present. All the parts of each form must have a relation to each other, and to the whole; and the whole itself must have a relation to the other parts of the universe; to the element in which the form subsists; to the materials with which it repairs its waste and decay; and to every other form which is hostile or friendly. A defect in any of these particulars destroys the form; and the matter of which it is composed is again set loose, and is thrown into irregular motions and fermentations, till it unite itself to some other regular form. If no such form be prepared to receive it, and if there be a great quantity of this corrupted matter in the universe, the universe itself is entirely disordered; whether it be the feeble embryo of a world in its first beginnings that is thus destroyed, or the rotten carcass of one languishing in old age and infirmity. In either case, a chaos ensues; till finite, though innumerable revolutions produce at last some forms, whose parts and organs are so adjusted as to support the forms amidst a continued succession of matter.

Suppose (for we shall endeavour to vary the expression), that matter were thrown into any position, by a blind, unguided force; it is evident that this first position must, in all probability, be the most confused and most disorderly imaginable, without any resemblance to those works of human contrivance, which, along with a symmetry of parts, discover an adjustment of means to ends, and a tendency to self-preservation. If the actuating force cease after this operation, matter must remain for ever in disorder, and continue an immense chaos, without any proportion or activity. But suppose that the actuating force, whatever it be, still continues in matter, this first position will immediately give place to a second, which will likewise in all probability be as disorderly as the first, and so on through many successions of changes and revolutions. No particular order or position ever continues a moment unaltered. The original force, still remaining in activity, gives a perpetual restlessness to matter. Every possible situation is produced, and instantly destroyed. If a glimpse or dawn of order appears for a moment, it is instantly hurried away, and confounded, by that never-ceasing force which actuates every part of matter.

Thus the universe goes on for many ages in a continued succession of chaos and disorder. But is it not possible that it may settle at last, so as not to lose its motion and active force (for that we have supposed inherent in it), yet so as to preserve an uniformity of appearance, amidst the continual motion and fluctuation of its parts? This we find to be the case with the universe at present. Every individual is perpetually changing, and every part of every individual; and yet the whole remains, in appearance, the same. May we not hope for such a position, or rather be assured of it, from the eternal revolutions of unguided matter; and may not this account for all the appearing wisdom and contrivance which is in the universe? Let us contemplate the subject a little, and we shall find, that this adjustment, if attained by matter of a seeming stability in the forms, with a real and perpetual revolution or motion of parts, affords a plausible, if not a true solution of the difficulty.

It is in vain, therefore, to insist upon the uses of the parts in animals or vegetables, and their curious adjustment to each other. I would fain know, how an animal could subsist, unless its parts were so adjusted? Do we not find, that it immediately perishes whenever this adjustment ceases, and that its matter corrupting tries some new form? It happens indeed, that the parts of the world are so well adjusted, that some regular form immediately lays claim to this corrupted matter: and if it were not so, could the world subsist? Must it not dissolve as well as the animal, and pass through new positions and situations, till in great, but finite succession, it falls at last into the present or some such order?

Although extremely general, Hume is suggesting both a history and a mechanism. Hume posits conservation of motion or other similar laws of nature, presumably mathematical, and describes what will happen when you apply such laws to a world. Most situations are unstable, and precisely because they are unstable, they will not last, and other situations will come to be. But some situations are stable, and when such situations occur, they will last.

The need for conservation of motion or similar natural laws is not accidental here. This is why I included the first paragraph above, rather than beginning the quotation where Hume begins to describe his “new hypothesis of cosmogony.” Without motion, the situation could not change, so a new situation could not come to be, and the very ideas of stable and unstable situations would not make sense. Likewise, if motion existed but did not follow any law, all situations should be unstable, so no amount of change could lead to a stable situation. Thus since things always fall downwards instead of in random directions, things stabilize near a center, while merely random motion could not be expected to have this effect. Thus a critic might argue that Hume seems to be positing randomness as the origin of things, but is cheating, so to speak, by positing original stabilities like natural laws, which are not random at all. Whatever might be said of this, it is an important point, and I will be returning to it later.

Since his description is more general than a description of living things in particular, Hume does not mention anything like the theory of the common descent of living things. But there is no huge gulf here: this would simply be a particular application. In fact, some people have suggested that Hume may have had textual influence on Darwin.

While there are other anticipations (there is one in Immanuel Kant that I am not currently inclined to seek out), I will skip to Philip Gosse, who published two years before Darwin. As described in the linked post, while Gosse denies the historicity of evolution in a temporal sense, he posits that the geological evidence was deliberately constructed (by God) to be evidence of common descent.

What was Darwin’s own role, then, if all the elements of his theory were known to various people years, centuries, or even millennia in advance? If we look at this in terms of Thomas Kuhn’s account of scientific progress, it is not so much that Darwin invented new ideas, as that he brought the evidence and arguments together in such a way as to produce — extremely quickly after the publication of his work — a newly formed consensus on those ideas.

Infinity

I discussed this topic previously, but without coming to a definite conclusion. Here I will give what I think is the correct explanation.

In his book Infinity, Causation, and Paradox, Alexander Pruss argues for what he calls “causal finitism,” or the principle that nothing can be affected by infinitely many causes:

In this volume, I will present a number of paradoxes of infinity, some old like Thomson’s Lamp and some new, and offer a unified metaphysical response to all of them by means of the hypothesis of causal finitism, which roughly says that nothing can be affected by infinitely many causes. In particular, Thomson’s Lamp story is ruled out since the final state of the lamp would be affected by infinitely many switch togglings. And in addition to arguing for the hypothesis as the best unified resolution to the paradoxes I shall offer some direct arguments against infinite regresses.

Thomson’s Lamp, if the reader is not familiar with it, is the question of what happens to a lamp if you switch it on and off an infinite number of times in a finite interval, doubling your velocity after each switch. At the end of the interval, is it on or off?
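To set the timing out explicitly (one standard way of arranging the story, not necessarily Pruss’s own formulation): let each switching come after half the gap that preceded it.

```latex
% The n-th switching occurs at time t_n, each gap half the previous one:
t_n = 1 - 2^{-n}, \qquad n = 1, 2, 3, \dots
% The gaps sum to a finite length, so all infinitely many switchings occur before t = 1:
\sum_{n=1}^{\infty} 2^{-n} = 1
% If the lamp starts off, its state after the n-th switching alternates, so the
% state at t = 1 would have to be
\lim_{n \to \infty} (-1)^{n}
% a limit that does not exist: the story as told assigns the lamp no state at t = 1.
```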

I think Pruss’s account is roughly speaking correct. I say “roughly speaking” because I would be hesitant to claim that nothing can be “affected” by infinitely many causes. Rather I would say that nothing is one effect simultaneously of infinitely many causes, and this is true for the same reason that there cannot be an infinite causal regress. That is, an infinite causal regress removes the notion of cause by removing the possibility of explanation, which is an intrinsic part of the idea of a cause. Similarly, it is impossible to explain anything using an infinite number of causes, because that infinity as such cannot be comprehended, and thus cannot be used to understand the thing which is the supposed effect. And since the infinity cannot explain the thing, neither can it be the cause of the thing.

What does this imply about the sorts of questions that were raised in my previous discussion, as for example about an infinite past or an infinite future, or a spatially infinite universe?

I presented an argument there, without necessarily claiming it to be correct, that such things are impossible precisely because they seem to imply an infinite causal regress. If there were an infinite number of stars in the universe, for example, there would seem to be an infinite regress of material causes: the universe seems to be composed of this local portion plus the rest, with the rest composed in a similar way, ad infinitum.

Unfortunately, there is an error in this argument against a spatially infinite world, and in similar arguments against a temporally infinite world, whether past or future. This can be seen in my response to Bertrand Russell when I discuss the material causes of water. Even if it is possible to break every portion of water down into smaller portions, it does not follow that this is an infinite sequence of material causes, or that it helps to explain water. In a similar way, even if the universe can be broken down into an infinite number of pieces in the above way, it does not follow that the universe has an infinite number of material causes: rather, this breakdown fails to explain, and fails to give causes at all.

St. Thomas gives a different argument against an infinite multitude, roughly speaking that it would lack a formal cause:

This, however, is impossible; since every kind of multitude must belong to a species of multitude. Now the species of multitude are to be reckoned by the species of numbers. But no species of number is infinite; for every number is multitude measured by one. Hence it is impossible for there to be an actually infinite multitude, either absolute or accidental.

By this argument, it would be impossible for there to be “an infinite number of stars” because the collection would lack “a species of multitude.” Unfortunately there is a problem with this argument as well, namely that it presupposes that the number is inherently fixed before it is considered by human beings. In reality, counting depends on someone who counts and a method they use for counting; to talk about the “number of stars” is a choice to break down the world in that particular way. There are other ways to think of it, as for example when we use the word “universe”, we count everything at once as a unit.

According to my account here, are some sorts of infinity actually impossible? Yes, namely those which demand an infinite sequence of explanation, or which demand an infinite number of things in order to explain something. Thus for example consider this story from Pruss about shuffling an infinite deck of cards:

Suppose I have an infinitely deep deck of cards, numbered with the positive integers. Can I shuffle it?

Given an infinite past, here is a procedure: n days ago, I perfectly fairly shuffle the top n cards in the deck.

This procedure is impossible because it makes the current state of the deck the direct effect of what I did n days ago, for all n. And the effect is a paradox: it is mathematically impossible for the integers to be randomly shuffled, because any series of integers will be biased towards lower numbers. Note that the existence of an infinite past is not the problem so much as assuming that one could have carried out such a procedure during an infinite past; in reality, if there was an infinite past, its contents are equally “infinite,” that is, they do not have such a definable, definite, “finite” relationship with the present.
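One way to spell out the impossibility (my gloss on the standard argument, not Pruss’s wording): a “perfectly fair” shuffle of the whole deck would require every card to have the same chance of landing in any given position, say the top, and no such assignment of chances exists.

```latex
% Suppose each card k = 1, 2, 3, ... had the same probability p of ending up on top.
% The probabilities over all the cards must sum to 1:
\sum_{k=1}^{\infty} p = 1
% But the left-hand side is 0 if p = 0 and infinite if p > 0, so no value of p works:
% there is no uniform distribution over the positive integers, and hence no
% perfectly fair shuffle of the whole infinite deck.
```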