How to Build an Artificial Human

I was going to use “Artificial Intelligence” in the title here but realized after thinking about it that the idea is really more specific than that.

I came up with the idea here while thinking more about the problem I raised in an earlier post about a serious obstacle to creating an AI. As I said there:

Current AI systems are not universal, and clearly have no ability whatsoever to become universal, without first undergoing deep changes in those systems, changes that would have to be initiated by human beings. What is missing?

The problem is the training data. The process of evolution produced the general ability to learn by using the world itself as the training data. In contrast, our AI systems take a very small subset of the world (like a large set of Go games or a large set of internet text), and train a learning system on that subset. Why take a subset? Because the world is too large to fit into a computer, especially if that computer is a small part of the world.

This suggests that going from the current situation to “artificial but real” intelligence is not merely a question of making things better and better little by little. There is a more fundamental problem that would have to be overcome, and it won’t be overcome simply by larger training sets, by faster computing, and things of this kind. This does not mean that the problem is impossible, but it may turn out to be much more difficult than people expected. For example, if there is no direct solution, people might try to create Robin Hanson’s “ems”, where one would more or less copy the learning achieved by natural selection. Or even if that is not done directly, a better understanding of what it means to “know how to learn,” might lead to a solution, although probably one that would not depend on training a model on massive amounts of data.

Proposed Predictive Model

Perhaps I was mistaken in saying that “larger training sets” would not be enough, at any rate not enough to get past this basic obstacle. Perhaps it is enough to choose the subset correctly: namely, by choosing the subset of the world that we know to contain general intelligence. Instead of training our predictive model on millions of Go games or millions of words, we will train it on millions of human lives.

This project will be extremely expensive. We might need to hire 10 million people to rigorously lifelog for the next 10 years. This has to be done in as much detail as possible; in particular we would want them recording constant audio and visual streams, along with as much else as possible. If we pay our crew an annual salary of $75,000, the salaries alone will come to $7.5 trillion over the decade; there will be some small additions for equipment and maintenance, but these will be very small compared to the salary costs.
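For concreteness, the arithmetic behind that estimate can be laid out in a few lines of Python. The salary figures come from the proposal above; the per-person equipment budget is a hypothetical placeholder, included only to show how little it adds to the total.

```python
# Back-of-the-envelope cost of the lifelogging crew described above.
# The salary figures come from the text; the per-person equipment budget
# is a hypothetical placeholder, included only to show how little it adds.

participants = 10_000_000
years = 10
annual_salary = 75_000

salary_cost = participants * years * annual_salary
print(f"Salaries:  ${salary_cost / 1e12:.1f} trillion")       # $7.5 trillion

equipment_per_person = 2_000   # hypothetical recording rig and storage
equipment_cost = participants * equipment_per_person
print(f"Equipment: ${equipment_cost / 1e9:.0f} billion")       # $20 billion
print(f"Total:     ${(salary_cost + equipment_cost) / 1e12:.2f} trillion")
```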

Presumably, in order to actually build such a large model, various scaling issues would come up and need to be solved. And in principle nothing prevents these from being very hard to solve, or even impossible in practice. But since we do not know that this would happen, let us skip over it and pretend that we have succeeded in building the model. Once this is done, our model should be able to take a point in a person’s life and fairly easily give a sensible continuation over at least a short period of time, just as GPT-3 can give fairly sensible continuations to portions of text.
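To make the analogy with GPT-3 concrete, here is a toy sketch of what “giving a continuation” means: a model trained on sequences of discretized life events samples a short continuation from a given starting point. The bigram model and the miniature “lifelogs” below are stand-ins invented for illustration; a real system at the scale described would of course look nothing like this.

```python
import random
from collections import defaultdict, Counter

# Toy version of the "continuation" idea: a model trained on sequences of
# discretized life events produces a short continuation from a chosen starting
# point, the way GPT-3 continues a text prompt.  The bigram counts below stand
# in for the enormously larger model the post imagines.

def train(sequences):
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def continuation(model, prompt, steps=5):
    out = list(prompt)
    for _ in range(steps):
        options = model.get(out[-1])
        if not options:
            break
        events, weights = zip(*options.items())
        out.append(random.choices(events, weights)[0])
    return out

# Hypothetical "lifelog" training data: streams of coarse event labels.
logs = [
    ["wake", "coffee", "commute", "work", "lunch", "work", "commute", "dinner", "sleep"],
    ["wake", "run", "coffee", "work", "lunch", "meeting", "commute", "dinner", "sleep"],
]
model = train(logs)
print(continuation(model, ["wake", "coffee"], steps=6))
```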

It may be that this is enough to get past the obstacle described above, and once this is done, it might be enough to build a general intelligence using other known principles, perhaps with some research and refinement that could be done during the years in which our crew would be building their records.

Required Elements

Live learning. In the post discussing the obstacle, I noted that there are two kinds of learning, the type that comes from evolution and the type that happens during life. Our model represents the type that comes from evolution; unlike GPT-3, which cannot learn anything new, we need our AI to remember what has actually happened during its life and to be able to use this to acquire knowledge about its particular situation. This is not difficult in theory, but you would need to think carefully about how this should interact with the general model; you do not want to simply add its particular experiences as just another training example (not that such an addition to an already trained model is simple anyway).
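One way to picture the separation suggested here is a sketch along the following lines, in which the general model stays frozen and the AI’s own experiences go into a separate episodic memory that is consulted first. The class and its methods are invented for illustration, not a claim about how a real design would work.

```python
# A minimal sketch of the "live learning" separation suggested above: the
# general model (here, frozen) is never retrained on the AI's own experiences;
# instead, episodes go into a separate memory that is consulted first, with
# the general model as a fallback.  All names here are illustrative.

class LiveLearner:
    def __init__(self, general_model):
        self.general_model = general_model   # pretrained, left untouched
        self.episodic_memory = {}            # situation -> what followed last time

    def observe(self, situation, outcome):
        """Record what actually happened during this AI's own life."""
        self.episodic_memory[situation] = outcome

    def predict(self, situation):
        """Prefer the AI's own experience; otherwise fall back on the general model."""
        if situation in self.episodic_memory:
            return self.episodic_memory[situation]
        return self.general_model(situation)

# Example: the general model only knows what people typically do.
learner = LiveLearner(general_model=lambda s: "typical human response")
learner.observe("my front door sticks", "push harder on the lower corner")
print(learner.predict("my front door sticks"))   # learned from its own life
print(learner.predict("ordering at a cafe"))     # falls back on the general model
```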

Causal model. Our AI needs not just a general predictive model of the world, but specifically a causal one; not just the general idea that “when you see A, you will soon see B,” but the idea that “when there is an A — which may or may not be seen — it will make a B, which you may or may not see.” This is needed for many reasons, but in particular, without such a causal model, long term predictions or planning will be impossible. If you take a model like GPT-3 and force it to continue producing text indefinitely, it will either repeat itself or eventually go completely off topic. The same thing would happen to our human life model — if we simply used the model without any causal structure, and forced it to guess what would happen indefinitely far into the future, it would eventually produce senseless predictions.
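A toy example may make the distinction clearer. In the sketch below, the world has a hidden cause that the senses report only coarsely; a model that tracks the hidden cause predicts correctly indefinitely, while a model that only maps one observation to the next cannot. The “world” and its rule are invented purely for illustration.

```python
# Toy illustration of the distinction drawn above.  The world has a hidden
# counter (an "A" that is never seen directly); the senses only report a
# coarse reading of it.  A model that tracks the hidden cause predicts the
# observations correctly forever; a model that only maps "what I just saw"
# to "what I usually see next" cannot, because the observation alone does
# not determine what comes next.  The world and its rule are invented.

def step(state):                 # hidden causal rule
    return (state + 1) % 4

def observe(state):              # what the senses report
    return "low" if state < 3 else "high"

def causal_rollout(state, steps):
    out = []
    for _ in range(steps):
        state = step(state)
        out.append(observe(state))
    return out

def surface_rollout(last_obs, steps):
    # "when you see A, you will soon see B": after "low" the most common
    # next observation is "low", so a surface model predicts that every time.
    most_common_next = {"low": "low", "high": "low"}
    out = []
    for _ in range(steps):
        last_obs = most_common_next[last_obs]
        out.append(last_obs)
    return out

print(causal_rollout(0, 8))        # low, low, high, low, low, low, high, low
print(surface_rollout("low", 8))   # low, low, low, ...  (misses every "high")
```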

In the paper Making Sense of Raw Input, published by researchers at Google DeepMind, there is a discussion of an implementation of this sort of model, although it is trained on an extremely easy environment (compared to our task, which would be to train it on human lives).

The Apperception Engine attempts to discern the nomological structure that underlies the raw sensory input. In our experiments, we found the induced theory to be very accurate as a predictive model, no matter how many time steps into the future we predict. For example, in Seek Whence (Section 5.1), the theory induced in Fig. 5a allows us to predict all future time steps of the series, and the accuracy of the predictions does not decay with time.

In Sokoban (Section 5.2), the learned dynamics are not just 100% correct on all test trajectories, but they are provably 100% correct. These laws apply to all Sokoban worlds, no matter how large, and no matter how many objects. Our system is, to the best of our knowledge, the first that is able to go from raw video of non-trivial games to an explicit first-order nomological model that is provably correct.

In the noisy sequences experiments (Section 5.3), the induced theory is an accurate predictive model. In Fig. 19, for example, the induced theory allows us to predict all future time steps of the series, and does not degenerate as we go further into the future.

(6.1.2 Accuracy)

Note that this does not have the problem of quick divergence from reality as you predict into the distant future. It will also improve our AI’s live learning:

A system that can learn an accurate dynamics model from a handful of examples is extremely useful for model-based reinforcement learning. Standard model-free algorithms require millions of episodes before they can reach human performance on a range of tasks [31]. Algorithms that learn an implicit model are able to solve the same tasks in thousands of episodes [82]. But a system that learns an accurate dynamics model from a handful of examples should be able to apply that model to plan, anticipating problems in imagination rather than experiencing them in reality [83], thus opening the door to extremely sample efficient model-based reinforcement learning. We anticipate a system that can learn the dynamics of an ATARI game from a handful of trajectories,19 and then apply that model to plan, thus playing at reasonable human level on its very first attempt.

(6.1.3. Data efficiency)

“We anticipate,” as in: Google has not yet built such a thing, but expects to be able to build it.
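To make the “anticipating problems in imagination” idea concrete, here is a minimal sketch of model-based planning: candidate action sequences are rolled out inside a dynamics model and scored before anything is tried in reality. The tiny one-dimensional world and the hand-written model function are stand-ins for a learned model, not anything taken from the paper.

```python
import itertools

# A minimal sketch of planning in imagination: candidate plans are rolled out
# inside a dynamics model and scored before anything is tried in the real
# environment.  The one-dimensional "world" and hand-coded model function are
# stand-ins for a learned dynamics model, invented purely for illustration.

GOAL = 4

def model(state, action):          # learned dynamics model (stand-in)
    return state + {"left": -1, "right": +1, "stay": 0}[action]

def plan(start, horizon=4):
    """Search action sequences in imagination; return the best one found."""
    best_plan, best_dist = None, float("inf")
    for actions in itertools.product(["left", "right", "stay"], repeat=horizon):
        state = start
        for a in actions:
            state = model(state, a)
        dist = abs(GOAL - state)
        if dist < best_dist:
            best_plan, best_dist = actions, dist
    return best_plan

print(plan(start=0))   # ('right', 'right', 'right', 'right')
```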

Scaling a causal model to work on our human life dataset will probably require some of the most difficult new research of this entire proposal.

Body. In order to engage in live learning, our AI needs to exist in the world in some way. And for the predictive model to do it any good, the world that it exists in needs to be a roughly human world. So there are two possibilities: either we simulate a human world in which it will possess a simulated human body, or we give it a robotic human-like body that will exist physically in the human world.

In relation to our proposal, these are not very different, but the former is probably more difficult, since we would have to simulate pretty much the entire world; and the more distant our simulation is from the actual world, the less helpful its predictive model would turn out to be.

Sensation. Our AI will need to receive input from the world through something like “senses.” These will need to correspond reasonably well with the data as provided in the model; e.g. since we expect to have audio and visual recording, our AI will need sight and hearing.

Predictive Processing. Our AI will need to function this way in order to acquire self-knowledge and free will, without which we would not consider it to possess general intelligence, however good it might be at particular tasks. In particular, at every point in time it will have predictions, based on the general human-life predictive model and on its causal model of the world, about what will happen in the near future. These predictions need to function in such a way that when it makes a relevant prediction, e.g. when it predicts that it will raise its arm, it will actually raise its arm.

(We might not want this to happen 100% of the time: if such a prediction is very far from the general predictive model, we might want that model to take precedence over this power that the AI has over itself, much as happens with human beings.)
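A minimal sketch of how such a mechanism might be wired, including the caveat just mentioned: predictions about the AI’s own behavior are routed to its motor system, unless the general model finds them too implausible to enact. All of the names and thresholds here are illustrative assumptions.

```python
# A minimal sketch of the control scheme described above: predictions about
# the AI's own behavior are routed to the motor system, so that predicting
# "my arm goes up" is what makes the arm go up.  The plausibility check
# reflects the parenthetical caveat that wildly implausible self-predictions
# should not automatically be enacted.  All names here are illustrative.

class PredictiveAgent:
    def __init__(self, plausibility, threshold=0.05):
        self.plausibility = plausibility   # general model's probability of an action
        self.threshold = threshold
        self.arm_raised = False

    def act_on_prediction(self, predicted_action):
        if self.plausibility(predicted_action) < self.threshold:
            return "prediction overridden by the general model"
        if predicted_action == "raise arm":
            self.arm_raised = True
        return f"executed: {predicted_action}"

agent = PredictiveAgent(plausibility=lambda a: 0.9 if a == "raise arm" else 0.001)
print(agent.act_on_prediction("raise arm"))   # executed: raise arm
print(agent.act_on_prediction("levitate"))    # overridden as implausible
print(agent.arm_raised)                       # True
```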

Thought and Internal Sensation. Our AI needs to be able to notice that when it predicts it will raise its arm, it succeeds, and it needs to learn that in these cases its prediction is the cause of raising the arm. Only in this way will its live learning produce a causal model of the world which actually has self knowledge: “When I decide to raise my arm, it happens.” This will also tell it the distinction between itself and the rest of the world; if it predicts the sun will change direction, this does not happen. In order for all this to happen, the AI needs to be able to see its own predictions, not just what happens; the predictions themselves have to become a kind of input, similar to sight and hearing.
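One simple way to picture this is to feed the AI’s predictions back to it alongside what actually happened, and let it keep statistics on which predictions reliably come true; those are the ones it should come to count as its own actions. The events and counts in the sketch below are invented for illustration.

```python
from collections import defaultdict

# A minimal sketch of the "internal sensation" point above: the AI's own
# predictions are fed back to it as inputs, alongside what actually happened,
# so it can notice which predictions reliably come true (its own actions) and
# which do not (the rest of the world).  The events and numbers are invented.

history = defaultdict(lambda: [0, 0])   # prediction -> [times made, times fulfilled]

def record(prediction, outcome_occurred):
    history[prediction][0] += 1
    history[prediction][1] += int(outcome_occurred)

# Its arm obeys its predictions; the sun does not.
for _ in range(20):
    record("I will raise my arm", True)
    record("the sun will reverse direction", False)

for prediction, (made, fulfilled) in history.items():
    control = fulfilled / made
    label = "part of me" if control > 0.9 else "not under my control"
    print(f"{prediction!r}: fulfilled {fulfilled}/{made} -> {label}")
```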

What was this again?

If we don’t run into any new fundamental obstacle along the way (I mentioned a few points where this might happen), the above procedure might be able to actually build an artificial general intelligence at a rough cost of $10 trillion (rounded up to account for hardware, research, and so on) and a time period of 10-20 years. But I would call your attention to a couple of things:

First, this is basically an artificial human, even to the extent that the easiest implementation likely requires giving it a robotic human body. It is not more general than that, and there is little reason to believe that our AI would be much more intelligent than a normal human, or that we could easily make it more intelligent. It would be fairly easy to give it quick mental access to other things, like mathematical calculation or internet searches, but this would not be much faster than a human being with a calculator or internet access. Like with GPT-N, one factor that would tend to limit its intelligence is that its predictive model is based on the level of intelligence found in human beings; there is no reason it would predict it would behave more intelligently, and so no reason why it would.

Second, it is extremely unlikely that anyone will implement this research program anytime soon. Why? Because you don’t get anything out of it except an artificial human. We have easier and less expensive ways to make humans, and $10 trillion is around the most any country has ever spent on anything, and never deliberately on a single project. Nonetheless, if no better way to make an AI is found, one can expect that eventually something like this will be implemented; perhaps by China in the 22nd century.

Third, note that “values” did not come up in this discussion. I mentioned this in one of the earlier posts on predictive processing:

The idea of the “desert landscape” seems to be that this account appears to do away with the idea of the good, and the idea of desire. The brain predicts what it is going to do, and those predictions cause it to do those things. This all seems purely intellectual: it seems that there is no purpose or goal or good involved.

The correct response to this, I think, is connected to what I have said elsewhere about desire and good. I noted there that we recognize our desires as desires for particular things by noticing that when we have certain feelings, we tend to do certain things. If we did not do those things, we would never conclude that those feelings are desires for doing those things. Note that someone could raise a similar objection here: if this is true, then are not desire and good mere words? We feel certain feelings, and do certain things, and that is all there is to be said. Where is good or purpose here?

The truth here is that good and being are convertible. The objection (to my definition and to Clark’s account) is not a reasonable objection at all: it would be a reasonable objection only if we expected good to be something different from being, in which case it would of course be nothing at all.

There was no need for an explicit discussion of values because they are an indirect consequence. What would our AI care about? It would care roughly speaking about the same things we care about, because it would predict (and act on the prediction) that it would live a life similar to a human life. There is definitely no specific reason to think it would be interested in taking over the world, although this cannot be excluded absolutely, since this is an interest that humans sometimes have. Note also that Nick Bostrom was wrong: I have just made a proposal that might actually succeed in making a human-like AI, but there is no similar proposal that would make an intelligent paperclip maximizer.

This is not to say that we should not expect any bad behavior at all from such a being; the behavior of the AI in the film Ex Machina is a plausible fictional representation of what could go wrong. Since what it is “trying” to do is to get predictive accuracy, and its predictions are based on actual human lives, it will “feel bad” about the lack of accuracy that results from the fact that it is not actually human, and it may act on those feelings.

Age of Em

This is Robin Hanson’s first book. Hanson gradually introduces his topic:

You, dear reader, are special. Most humans were born before 1700. And of those born after, you are probably richer and better educated than most. Thus you and most everyone you know are special, elite members of the industrial era.

Like most of your kind, you probably feel superior to your ancestors. Oh, you don’t blame them for learning what they were taught. But you’d shudder to hear of many of your distant farmer ancestors’ habits and attitudes on sanitation, sex, marriage, gender, religion, slavery, war, bosses, inequality, nature, conformity, and family obligations. And you’d also shudder to hear of many habits and attitudes of your even more ancient forager ancestors. Yes, you admit that lacking your wealth your ancestors couldn’t copy some of your habits. Even so, you tend to think that humanity has learned that your ways are better. That is, you believe in social and moral progress.

The problem is, the future will probably hold new kinds of people. Your descendants’ habits and attitudes are likely to differ from yours by as much as yours differ from your ancestors. If you understood just how different your ancestors were, you’d realize that you should expect your descendants to seem quite strange. Historical fiction misleads you, showing your ancestors as more modern than they were. Science fiction similarly misleads you about your descendants.

As an example of the kind of past difference that Robin is discussing, even in the fairly recent past, consider this account by William Ewald of a trial from the sixteenth century:

In 1522 some rats were placed on trial before the ecclesiastical court in Autun. They were charged with a felony: specifically, the crime of having eaten and wantonly destroyed some barley crops in the jurisdiction. A formal complaint against “some rats of the diocese” was presented to the bishop’s vicar, who thereupon cited the culprits to appear on a day certain, and who appointed a local jurist, Barthelemy Chassenée (whose name is sometimes spelled Chassanée, or Chasseneux, or Chasseneuz), to defend them. Chassenée, then forty-two, was known for his learning, but not yet famous; the trial of the rats of Autun was to establish his reputation, and launch a distinguished career in the law.

When his clients failed to appear in court, Chassenée resorted to procedural arguments. His first tactic was to invoke the notion of fair process, and specifically to challenge the original writ for having failed to give the rats due notice. The defendants, he pointed out, were dispersed over a large tract of countryside, and lived in many villages; a single summons was inadequate to notify them all. Moreover, the summons was addressed only to some of the rats of the diocese; but technically it should have been addressed to them all.

Chassenée was successful in his argument, and the court ordered a second summons to be read from the pulpit of every local parish church; this second summons now correctly addressed all the local rats, without exception.

But on the appointed day the rats again failed to appear. Chassenée now made a second argument. His clients, he reminded the court, were widely dispersed; they needed to make preparations for a great migration, and those preparations would take time. The court once again conceded the reasonableness of the argument, and granted a further delay in the proceedings. When the rats a third time failed to appear, Chassenée was ready with a third argument. The first two arguments had relied on the idea of procedural fairness; the third treated the rats as a class of persons who were entitled to equal treatment under the law. He addressed the court at length, and successfully demonstrated that, if a person is cited to appear at a place to which he cannot come in safety, he may lawfully refuse to obey the writ. And a journey to court would entail serious perils for his clients. They were notoriously unpopular in the region; and furthermore they were rightly afraid of their natural enemies, the cats. Moreover (he pointed out to the court) the cats could hardly be regarded as neutral in this dispute; for they belonged to the plaintiffs. He accordingly demanded that the plaintiffs be enjoined by the court, under the threat of severe penalties, to restrain their cats, and prevent them from frightening his clients. The court again found this argument compelling; but now the plaintiffs seem to have come to the end of their patience. They demurred to the motion; the court, unable to settle on the correct period within which the rats must appear, adjourned on the question sine die, and judgment for the rats was granted by default.

Most of us would assume at once that this is all nothing but an elaborate joke; but Ewald strongly argues that it was all quite serious. This would actually be worthy of its own post, but I will leave it aside for now. In any case it illustrates the existence of extremely different attitudes even a few centuries ago.

In any event, Robin continues:

New habits and attitudes result less than you think from moral progress, and more from people adapting to new situations. So many of your descendants’ strange habits and attitudes are likely to violate your concepts of moral progress; what they do may often seem wrong. Also, you likely won’t be able to easily categorize many future ways as either good or evil; they will instead just seem weird. After all, your world hardly fits the morality tales your distant ancestors told; to them you’d just seem weird. Complex realities frustrate simple summaries, and don’t fit simple morality tales.

Many people of a more conservative temperament, such as myself, might wish to swap out “moral progress” here with “moral regress,” but the point stands in any case. This is related to our discussions of the effects of technology and truth on culture, and of the idea of irreversible changes.

Robin finally gets to the point of his book:

This book presents a concrete and plausible yet troubling view of a future full of strange behaviors and attitudes. You may have seen concrete troubling future scenarios before in science fiction. But few of those scenarios are in fact plausible; their details usually make little sense to those with expert understanding. They were designed for entertainment, not realism.

Perhaps you were told that fictional scenarios are the best we can do. If so, I aim to show that you were told wrong. My method is simple. I will start with a particular very disruptive technology often foreseen in futurism and science fiction: brain emulations, in which brains are recorded, copied, and used to make artificial “robot” minds. I will then use standard theories from many physical, human, and social sciences to describe in detail what a world with that future technology would look like.

I may be wrong about some consequences of brain emulations, and I may misapply some science. Even so, the view I offer will still show just how troublingly strange the future can be.

I greatly enjoyed Robin’s book, but unfortunately I have to admit that relatively few people will. It is easy enough to see the reason for this from Robin’s introduction. Who would expect to be interested? Possibly those who enjoy the “futurism and science fiction” concerning brain emulations; but if Robin does what he set out to do, those people will find themselves strangely uninterested. As he says, science fiction is “designed for entertainment, not realism,” while he is attempting to answer the question, “What would this actually be like?” This intention is very remote from the intention of the science fiction, and consequently the book will likely appeal to different people.

Whether or not Robin gets the answer to this question right, he definitely succeeds in making his approach and appeal differ from those of science fiction.

One might illustrate this with almost any random passage from the book. Here are portions of his discussion of the climate of em cities:

As we will discuss in Chapter 18, Cities section, em cities are likely to be big, dense, highly cost-effective concentrations of computer and communication hardware. How might such cities interact with their surroundings?

Today, computer and communication hardware is known for being especially temperamental about its environment. Rooms and buildings designed to house such hardware tend to be climate-controlled to ensure stable and low values of temperature, humidity, vibration, dust, and electromagnetic field intensity. Such equipment housing protects it especially well from fire, flood, and security breaches.

The simple assumption is that, compared with our cities today, em cities will also be more climate-controlled to ensure stable and low values of temperature, humidity, vibrations, dust, and electromagnetic signals. These controls may in fact become city level utilities. Large sections of cities, and perhaps entire cities, may be covered, perhaps even domed, to control humidity, dust, and vibration, with city utilities working to absorb remaining pollutants. Emissions within cities may also be strictly controlled.

However, an em city may contain temperatures, pressures, vibrations, and chemical concentrations that are toxic to ordinary humans. If so, ordinary humans are excluded from most places in em cities for safety reasons. In addition, we will see in Chapter 18, Transport section, that many em city transport facilities are unlikely to be well matched to the needs of ordinary humans.

Cities today are the roughest known kind of terrain, in the sense that cities slow down the wind the most compared with other terrain types. Cities also tend to be hotter than neighboring areas. For example, Las Vegas is 7° Fahrenheit hotter in the summer than are surrounding areas. This hotter city effect makes ozone pollution worse and this effect is stronger for bigger cities, in the summer, at night, with fewer clouds, and with slower wind (Arnfield 2003).

This is a mild reason to expect em cities to be hotter than other areas, especially at night and in the summer. However, as em cities are packed full of computing hardware, we shall now see that em cities will actually be much hotter.

While the book considers a wide variety of topics, e.g. the social relationships among ems, which look quite different from the above passage, the general mode of treatment is the same. As Robin put it, he uses “standard theories” to describe the em world, much as he employs standard theories about cities, about temperature and climate, and about computing hardware in the above passage.

One might object that basically Robin is positing a particular technological change (brain emulations), but then assuming that everything else is the same, and working from there. And there is some validity to this objection. But in the end there is actually no better way to try to predict the future; despite David Hume’s opinion, generally the best way to estimate the future is to say, “Things will be pretty much the same.”

At the end of the book, Robin describes various criticisms. First are those who simply said they weren’t interested: “If we include those who declined to read my draft, the most common complaint is probably ‘who cares?’” And indeed, that is what I would expect, since as Robin remarked himself, people are interested in an entertaining account of the future, not an attempt at a detailed description of what is likely.

Others, he says, “doubt that one can ever estimate the social consequences of technologies decades in advance.” This is basically the objection I mentioned above.

He lists one objection that I am partly in agreement with:

Many doubt that brain emulations will be our next huge technology change, and aren’t interested in analyses of the consequences of any big change except the one they personally consider most likely or interesting. Many of these people expect traditional artificial intelligence, that is, hand-coded software, to achieve broad human level abilities before brain emulations appear. I think that past rates of progress in coding smart software suggest that at previous rates it will take two to four centuries to achieve broad human level abilities via this route. These critics often point to exciting recent developments, such as advances in “deep learning,” that they think make prior trends irrelevant.

I don’t think Robin is necessarily mistaken in regard to his expectations about “traditional artificial intelligence,” although he may be, and I don’t find myself uninterested by default in things that I don’t think are the most likely. But I do think that traditional artificial intelligence is more likely than his scenario of brain emulations; more on this below.

There are two other likely objections that Robin does not include in this list, although he does touch on them elsewhere. First, people are likely to say that the creation of ems would be immoral, even if it is possible, and similarly that the kinds of habits and lives that he describes would themselves be immoral. On the one hand, this should not be a criticism at all, since Robin can respond that he is simply describing what he thinks is likely, not saying whether it should happen or not; on the other hand, it is in fact obvious that Robin does not have much disapproval, if any, of his scenario. The book ends in fact by calling attention to this objection:

The analysis in this book suggests that lives in the next great era may be as different from our lives as our lives are from farmers’ lives, or farmers’ lives are from foragers’ lives. Many readers of this book, living industrial era lives and sharing industrial era values, may be disturbed to see a forecast of em era descendants with choices and life styles that appear to reject many of the values that they hold dear. Such readers may be tempted to fight to prevent the em future, perhaps preferring a continuation of the industrial era. Such readers may be correct that rejecting the em future holds them true to their core values.

But I advise such readers to first try hard to see this new era in some detail from the point of view of its typical residents. See what they enjoy and what fills them with pride, and listen to their criticisms of your era and values. This book has been designed in part to assist you in such a soul-searching examination. If after reading this book, you still feel compelled to disown your em descendants, I cannot say you are wrong. My job, first and foremost, has been to help you see your descendants clearly, warts and all.

Our own discussions of the flexibility of human morality are relevant. The creatures Robin is describing are in many ways quite different from humans, and it is in fact very appropriate for their morality to differ from human morality.

A second likely objection is that Robin’s ems are simply impossible, on account of the nature of the human mind. I think that this objection is mistaken, but I will leave the details of this explanation for another time. Robin appears to agree with Sean Carroll about the nature of the mind, as can be seen for example in this post. Robin is mistaken about this, for the reasons suggested in my discussion of Carroll’s position. Part of the problem is that Robin does not seem to understand the alternative. Here is a passage from the linked post on Overcoming Bias:

Now what I’ve said so far is usually accepted as uncontroversial, at least when applied to the usual parts of our world, such as rivers, cars, mountains, laptops, or ants. But as soon as one claims that all this applies to human minds, suddenly it gets more controversial. People often state things like this:

“I am sure that I’m not just a collection of physical parts interacting, because I’m aware that I feel. I know that physical parts interacting just aren’t the kinds of things that can feel by themselves. So even though I have a physical body made of parts, and there are close correlations between my feelings and the states of my body parts, there must be something more than that to me (and others like me). So there’s a deep mystery: what is this extra stuff, where does it arise, how does it change, and so on. We humans care mainly about feelings, not physical parts interacting; we want to know what out there feels so we can know what to care about.”

But consider a key question: Does this other feeling stuff interact with the familiar parts of our world strongly and reliably enough to usually be the actual cause of humans making statements of feeling like this?

If yes, this is a remarkably strong interaction, making it quite surprising that physicists have missed it so far. So surprising in fact as to be frankly unbelievable. If this type of interaction were remotely as simple as all the interactions we know, then it should be quite measurable with existing equipment. Any interaction not so measurable would have to be vastly more complex and context dependent than any we’ve ever seen or considered. Thus I’d bet heavily and confidently that no one will measure such an interaction.

But if no, if this interaction isn’t strong enough to explain human claims of feeling, then we have a remarkable coincidence to explain. Somehow this extra feeling stuff exists, and humans also have a tendency to say that it exists, but these happen for entirely independent reasons. The fact that feeling stuff exists isn’t causing people to claim it exists, nor vice versa. Instead humans have some sort of weird psychological quirk that causes them to make such statements, and they would make such claims even if feeling stuff didn’t exist. But if we have a good alternate explanation for why people tend to make such statements, what need do we have of the hypothesis that feeling stuff actually exists? Such a coincidence seems too remarkable to be believed.

There is a false dichotomy here, and it is the same one that C.S. Lewis falls into when he says, “Either we can know nothing or thought has reasons only, and no causes.” And in general it is like the error of the pre-Socratics, that if a thing has some principles which seem sufficient, it can have no other principles, failing to see that there are several kinds of cause, and each can be complete in its own way. And perhaps I am getting ahead of myself here, since I said this discussion would be for later, but the objection that Robin’s scenario is impossible is mistaken in exactly the same way, and for the same reason: people believe that if a “materialistic” explanation could be given of human behavior in the way that Robin describes, then people do not truly reason, make choices, and so on. But this is simply to adopt the other side of the false dichotomy, much like C.S. Lewis rejects the possibility of causes for our beliefs.

One final point. I mentioned above that I see Robin’s scenario as less plausible than traditional artificial intelligence. I agree with Tyler Cowen in this post. This present post is already long enough, so again I will leave a detailed explanation for another time, but I will remark that Robin and I have a bet on the question.