How to Build an Artificial Human

I was going to use “Artificial Intelligence” in the title here but realized after thinking about it that the idea is really more specific than that.

I came up with the idea here while thinking more about the problem I raised in an earlier post about a serious obstacle to creating an AI. As I said there:

Current AI systems are not universal, and clearly have no ability whatsoever to become universal, without first undergoing deep changes in those systems, changes that would have to be initiated by human beings. What is missing?

The problem is the training data. The process of evolution produced the general ability to learn by using the world itself as the training data. In contrast, our AI systems take a very small subset of the world (like a large set of Go games or a large set of internet text), and train a learning system on that subset. Why take a subset? Because the world is too large to fit into a computer, especially if that computer is a small part of the world.

This suggests that going from the current situation to “artificial but real” intelligence is not merely a question of making things better and better little by little. There is a more fundamental problem that would have to be overcome, and it won’t be overcome simply by larger training sets, by faster computing, and things of this kind. This does not mean that the problem is impossible, but it may turn out to be much more difficult than people expected. For example, if there is no direct solution, people might try to create Robin Hanson’s “ems”, where one would more or less copy the learning achieved by natural selection. Or even if that is not done directly, a better understanding of what it means to “know how to learn,” might lead to a solution, although probably one that would not depend on training a model on massive amounts of data.

Proposed Predictive Model

Perhaps I was mistaken in saying that “larger training sets” would not be enough, or at any rate not enough to get past this basic obstacle. Perhaps it is enough to choose the subset correctly… namely by choosing the subset of the world that we know to contain general intelligence. Instead of training our predictive model on millions of Go games or millions of words, we will train it on millions of human lives.

This project will be extremely expensive. We might need to hire 10 million people to lifelog rigorously for the next 10 years. This has to be done in as much detail as possible; in particular we would want them recording constant audio and visual streams, as well as recording as much else as possible. If we pay our crew an annual salary of $75,000 for this, the salaries will come to $7.5 trillion; there will be some additional costs for equipment and maintenance, but these will be very small compared to the salary costs.
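
The salary arithmetic can be checked in a few lines; the figures are simply the ones assumed above:

```python
# A quick check of the salary arithmetic above (equipment and maintenance
# are left out, as in the text, since they are small by comparison).
participants = 10_000_000        # people hired to lifelog
annual_salary = 75_000           # dollars per person per year
years = 10

total_salaries = participants * annual_salary * years
print(f"${total_salaries:,}")    # $7,500,000,000,000, i.e. $7.5 trillion
```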

Presumably in order to actually build such a large model, various scaling issues would come up and need to be solved. And in principle nothing prevents these from being very hard to solve, or even impossible in practice. But since we do not know that this would happen, let us skip over this and pretend that we have succeeded in building the model. Once this is done, our model should be able to fairly easily take a point in a person’s life and give a fairly sensible continuation over at least a short period of time, just as GPT-3 can give fairly sensible continuations to portions of text.
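
Since the whole comparison here is with GPT-3, it may help to sketch, in Python (using PyTorch), what the training objective might look like. This assumes, and it is a large assumption, that each short slice of a lifelog (a video frame, an audio chunk, and whatever else was recorded) has already been discretized into tokens from one shared vocabulary, so that a recorded life becomes one long sequence. The class name, architecture, and hyperparameters below are placeholders of my own; the scaling problems being waved away in the previous paragraph are exactly the ones this sketch ignores.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LifelogPredictor(nn.Module):
    """A small causal transformer over discretized lifelog tokens (illustrative)."""

    def __init__(self, vocab_size, d_model=256, n_heads=4, n_layers=4, max_len=2048):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):                       # tokens: (batch, time)
        T = tokens.size(1)
        pos = torch.arange(T, device=tokens.device)
        h = self.tok_emb(tokens) + self.pos_emb(pos)
        # Causal mask: each moment may attend only to its own past.
        mask = torch.triu(
            torch.full((T, T), float("-inf"), device=tokens.device), diagonal=1
        )
        h = self.encoder(h, mask=mask)
        return self.head(h)                          # logits over the next token


def training_step(model, tokens, optimizer):
    """One autoregressive step: predict token t+1 from tokens up to t."""
    logits = model(tokens[:, :-1])
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1)
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Sampling from such a model, token by token, is what it would mean to “give a sensible continuation” of a point in someone’s life.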

It may be that this is enough to get past the obstacle described above, and once this is done, it might be enough to build a general intelligence using other known principles, perhaps with some research and refinement that could be done during the years in which our crew would be building their records.

Required Elements

Live learning. In the post discussing the obstacle, I noted that there are two kinds of learning: the type that comes from evolution and the type that happens during life. Our model represents the type that comes from evolution; unlike GPT-3, which cannot learn anything new, our AI needs to remember what has actually happened during its life and to be able to use this to acquire knowledge about its particular situation. This is not difficult in theory, but you would need to think carefully about how this should interact with the general model; you do not want to simply add its particular experiences as another individual example (not that such an addition to an already trained model is simple anyway).
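
As one way of picturing this, purely as an illustration of my own and not part of the proposal: keep the general model frozen, and give the AI an episodic memory that is consulted at prediction time, so that its own life shapes its predictions without being treated as one more training example.

```python
import numpy as np

class EpisodicMemory:
    """Stores embeddings of past experiences and recalls the most similar ones."""

    def __init__(self, dim):
        self.keys = np.empty((0, dim))
        self.episodes = []

    def store(self, embedding, episode):
        self.keys = np.vstack([self.keys, embedding[None, :]])
        self.episodes.append(episode)

    def recall(self, query, k=3):
        if not self.episodes:
            return []
        # Cosine similarity between the query and every stored episode.
        keys = self.keys / np.linalg.norm(self.keys, axis=1, keepdims=True)
        query = query / np.linalg.norm(query)
        scores = keys @ query
        best = np.argsort(scores)[::-1][:k]
        return [self.episodes[i] for i in best]


def predict(general_model, memory, embedding, observation):
    """Condition the frozen general model on recalled personal experience.
    `general_model` is hypothetical; how the recalled episodes are actually
    injected into it is exactly the open design question raised above."""
    recalled = memory.recall(embedding)
    return general_model(context=recalled, observation=observation)
```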

Causal model. Our AI needs not just a general predictive model of the world, but specifically a causal one; not just the general idea that “when you see A, you will soon see B,” but the idea that “when there is an A — which may or may not be seen — it will make a B, which you may or may not see.” This is needed for many reasons, but in particular, without such a causal model, long-term prediction or planning will be impossible. If you take a model like GPT-3 and force it to continue producing text indefinitely, it will either repeat itself or eventually go completely off topic. The same thing would happen to our human life model — if we simply used the model without any causal structure, and forced it to guess what would happen indefinitely far into the future, it would eventually produce senseless predictions.
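
To illustrate the distinction with a toy (my own toy, not a sketch of the system discussed below): the world here is a hidden three-state cycle, and the two predictors differ only in whether they model that hidden structure.

```python
TRANSITION = {"a": "b", "b": "c", "c": "a"}          # the hidden dynamics

def associative_rollout(last_obs, steps):
    """'When you see A, you will soon see B': predict each observation
    from the previous one only. One unfamiliar or garbled reading ("?")
    and the whole chain degenerates, because predictions are feeding on
    predictions with nothing underneath them."""
    rule = {"A": "B", "B": "C", "C": "A"}
    preds = []
    for _ in range(steps):
        last_obs = rule.get(last_obs, "?")
        preds.append(last_obs)
    return preds

def causal_rollout(state, steps):
    """'There is an A, which may or may not be seen, and it makes a B':
    infer the hidden state once, then simulate the dynamics themselves.
    The rollout stays coherent however far into the future it goes."""
    preds = []
    for _ in range(steps):
        state = TRANSITION[state]
        preds.append(state.upper())
    return preds

print(associative_rollout("?", 6))   # ['?', '?', '?', '?', '?', '?']
print(causal_rollout("a", 6))        # ['B', 'C', 'A', 'B', 'C', 'A']
```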

In the paper Making Sense of Raw Input, published by Google DeepMind, there is a discussion of an implementation of this sort of model, although trained on an extremely easy environment (compared to our task, which would be to train it on human lives).

The Apperception Engine attempts to discern the nomological structure that underlies the raw sensory input. In our experiments, we found the induced theory to be very accurate as a predictive model, no matter how many time steps into the future we predict. For example, in Seek Whence (Section 5.1), the theory induced in Fig. 5a allows us to predict all future time steps of the series, and the accuracy of the predictions does not decay with time.

In Sokoban (Section 5.2), the learned dynamics are not just 100% correct on all test trajectories, but they are provably 100% correct. These laws apply to all Sokoban worlds, no matter how large, and no matter how many objects. Our system is, to the best of our knowledge, the first that is able to go from raw video of non-trivial games to an explicit first-order nomological model that is provably correct.

In the noisy sequences experiments (Section 5.3), the induced theory is an accurate predictive model. In Fig. 19, for example, the induced theory allows us to predict all future time steps of the series, and does not degenerate as we go further into the future.

(6.1.2 Accuracy)

Note that this does not have the problem of quick divergence from reality as you predict into the distant future. It will also improve our AI’s live learning:

A system that can learn an accurate dynamics model from a handful of examples is extremely useful for model-based reinforcement learning. Standard model-free algorithms require millions of episodes before they can reach human performance on a range of tasks [31]. Algorithms that learn an implicit model are able to solve the same tasks in thousands of episodes [82]. But a system that learns an accurate dynamics model from a handful of examples should be able to apply that model to plan, anticipating problems in imagination rather than experiencing them in reality [83], thus opening the door to extremely sample efficient model-based reinforcement learning. We anticipate a system that can learn the dynamics of an ATARI game from a handful of trajectories,19 and then apply that model to plan, thus playing at reasonable human level on its very first attempt.

(6.1.3 Data efficiency)

“We anticipate,” as in: Google has not yet built such a thing, but they expect to be able to build it.

Scaling a causal model to work on our human life dataset will probably require some of the most difficult new research of this entire proposal.

Body. In order to engage in live learning, our AI needs to exist in the world in some way. And for the predictive model to do it any good, the world that it exists in needs to be a roughly human world. So there are two possibilities: either we simulate a human world in which it will possess a simulated human body, or we give it a robotic human-like body that will exist physically in the human world.

In relation to our proposal, these are not very different, but the former is probably more difficult, since we would have to simulate pretty much the entire world, and the more distant our simulation is from the actual world, the less helpful its predictive model would turn out to be.

Sensation. Our AI will need to receive input from the world through something like “senses.” These will need to correspond reasonably well with the data as provided in the model; e.g. since we expect to have audio and visual recording, our AI will need sight and hearing.

Predictive Processing. Our AI will need to function this way in order to acquire self-knowledge and free will, without which we would not consider it to possess general intelligence, however good it might be at particular tasks. In particular, at every point in time it will have predictions, based on the general human-life predictive model and on its causal model of the world, about what will happen in the near future. These predictions need to function in such a way that when it makes a relevant prediction, e.g. when it predicts that it will raise its arm, it will actually raise its arm.

(We might not want this to happen 100% of the time — if such a prediction is very far from the predictive model, we might want the predictive model to take precedence over this power over itself, much as happens with human beings.)
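
A rough sketch of this loop, with made-up names for interfaces that do not exist (nothing here is from the predictive processing literature; it only shows the control flow being described, including the caveat in the parentheses above):

```python
PLAUSIBILITY_THRESHOLD = 0.01    # illustrative value, not from the post

def predictive_processing_step(model, body, memory, observation):
    # The model predicts the near future, including what *it* will do next.
    predicted_action, predicted_world = model.predict(observation, memory)

    # If the self-prediction is wildly implausible under the general
    # human-life model, the general model takes precedence over this
    # power the AI has over itself.
    if model.probability(predicted_action, given=observation) < PLAUSIBILITY_THRESHOLD:
        predicted_action = model.typical_action(observation)

    # Acting is, in effect, making the self-prediction come true.
    body.execute(predicted_action)

    # The prediction itself is recorded as a kind of internal input
    # (this is the "internal sensation" of the next section).
    memory.record(observation, predicted_action, predicted_world)
    return predicted_action
```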

Thought and Internal Sensation. Our AI needs to be able to notice that when it predicts it will raise its arm, it succeeds, and it needs to learn that in these cases its prediction is the cause of raising the arm. Only in this way will its live learning produce a causal model of the world which actually includes self-knowledge: “When I decide to raise my arm, it happens.” This will also teach it the distinction between itself and the rest of the world; if it predicts that the sun will change direction, this does not happen. In order for all this to happen, the AI needs to be able to see its own predictions, not just what happens; the predictions themselves have to become a kind of input, similar to sight and hearing.
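
A toy version of this self-discovery, using made-up data: the AI looks back over its records and asks which of its predictions reliably come true when it makes them; whatever falls into that class belongs to the “self.”

```python
def controllable(history, event):
    """Fraction of the times the agent predicted `event` that it actually happened."""
    outcomes = [happened for predicted, happened in history[event] if predicted]
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

history = {
    # (the agent predicted it, it actually happened)
    "my arm rises":           [(True, True), (True, True), (True, True)],
    "the sun changes course": [(True, False), (True, False)],
}

for event in history:
    print(f"{event}: part of the self? {controllable(history, event) > 0.9}")
```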

What was this again?

If we don’t run into any new fundamental obstacle along the way (I mentioned a few points where this might happen), the above procedure might be able to actually build an artificial general intelligence at a rough cost of $10 trillion (rounded up to account for hardware, research, and so on) and a time period of 10-20 years. But I would call your attention to a few things:

First, this is basically an artificial human, even to the extent that the easiest implementation likely requires giving it a robotic human body. It is not more general than that, and there is little reason to believe that our AI would be much more intelligent than a normal human, or that we could easily make it more intelligent. It would be fairly easy to give it quick mental access to other things, like mathematical calculation or internet searches, but this would not be much faster than a human being with a calculator or internet access. As with GPT-N, one factor that would tend to limit its intelligence is that its predictive model is based on the level of intelligence found in human beings; there is no reason it would predict it would behave more intelligently, and so no reason why it would.

Second, it is extremely unlikely that anyone will implement this research program anytime soon. Why? Because you don’t get anything out of it except an artificial human. We have easier and less expensive ways to make humans, and $10 trillion is around the most any country has ever spent on anything, and never deliberately on one single project. Nonetheless, if no better way to make an AI is found, one can expect that eventually something like this will be implemented; perhaps by China in the 22nd century.

Third, note that “values” did not come up in this discussion. I mentioned this in one of the earlier posts on predictive processing:

The idea of the “desert landscape” seems to be that this account appears to do away with the idea of the good, and the idea of desire. The brain predicts what it is going to do, and those predictions cause it to do those things. This all seems purely intellectual: it seems that there is no purpose or goal or good involved.

The correct response to this, I think, is connected to what I have said elsewhere about desire and good. I noted there that we recognize our desires as desires for particular things by noticing that when we have certain feelings, we tend to do certain things. If we did not do those things, we would never conclude that those feelings are desires for doing those things. Note that someone could raise a similar objection here: if this is true, then are not desire and good mere words? We feel certain feelings, and do certain things, and that is all there is to be said. Where is good or purpose here?

The truth here is that good and being are convertible. The objection (to my definition and to Clark’s account) is not a reasonable objection at all: it would be a reasonable objection only if we expected good to be something different from being, in which case it would of course be nothing at all.

There was no need for an explicit discussion of values because they are an indirect consequence. What would our AI care about? It would care roughly speaking about the same things we care about, because it would predict (and act on the prediction) that it would live a life similar to a human life. There is definitely no specific reason to think it would be interested in taking over the world, although this cannot be excluded absolutely, since this is an interest that humans sometimes have. Note also that Nick Bostrom was wrong: I have just made a proposal that might actually succeed in making a human-like AI, but there is no similar proposal that would make an intelligent paperclip maximizer.

This is not to say that we should not expect any bad behavior at all from such a being; the behavior of the AI in the film Ex Machina is a plausible fictional representation of what could go wrong. Since what it is “trying” to do is to get predictive accuracy, and its predictions are based on actual human lives, it will “feel bad” about the lack of accuracy that results from the fact that it is not actually human, and it may act on those feelings.

Patience

St. Thomas describes the virtue of patience:

I answer that, As stated above (II-II:123:1), the moral virtues are directed to the good, inasmuch as they safeguard the good of reason against the impulse of the passions. Now among the passions sorrow is strong to hinder the good of reason, according to 2 Corinthians 7:10, “The sorrow of the world worketh death,” and Sirach 30:25, “Sadness hath killed many, and there is no profit in it.” Hence the necessity for a virtue to safeguard the good of reason against sorrow, lest reason give way to sorrow: and this patience does. Wherefore Augustine says (De Patientia ii): “A man’s patience it is whereby he bears evil with an equal mind,” i.e. without being disturbed by sorrow, “lest he abandon with an unequal mind the goods whereby he may advance to better things.” It is therefore evident that patience is a virtue.

This brings to mind things like a martyr afflicted by others for the truth that he holds and enduring this steadfastly. But in fact it applies well even to the ordinary idea of patience, according to which we might say, for example, that Ray Kurzweil’s impatience for technological progress leads him to false opinions about current historical trends.

We can illustrate this with a little story. Peter, impatient to get home from work, exceeds the speed limit and weaves in and out of traffic. Minutes before getting home, he hits a slippery patch on the road. His car goes off the road, ramming a tree and killing him.

Despite being nothing but a story, it is one that has without a doubt been played out in real life with minor or major variations again and again. We can apply the saying of St. Augustine quoted by St. Thomas. Peter’s patience would consist in “bearing evil with an equal mind,” that is, enduring the fact that he is not home yet without disturbance, “lest he abandon with an unequal mind the goods whereby he may advance to better things,” that is, since his disturbed and unequal mind leads him to abandon the goods, that is, the ordered manner of driving, whereby he may advance to better things, that is, actually to get home.

Patience is rightly thought to be related to the virtue of humility. One who judges rightly about his place in the order of things will understand that it is natural in this order that what is best tends to come last. The good wine is served last. Thus such a person should endure without disturbance the lack that comes earlier, in order not to abandon the good by which he might achieve the good that comes later.

The Good I Do Not Want

St. Paul says in the letter to the Romans, “I can will what is right, but I cannot do it. For I do not do the good I want, but the evil I do not want is what I do.”

This happens because the person is divided. Simply speaking I may believe that the thing that I want to do is right; but in another way, I perceive or suppose that “the evil I do not want” is good.

This sort of division can happen in the opposite way as well, so that a person wills the evil that he takes to be good, but cannot do it, because another part of him perceives that it is evil and to be avoided.

Procrastination can work as an example of both cases. Without a doubt procrastinating is often failing to do the good that one wills; but it is also often refusing to do something that would be mostly pointless, and in this sense, it is refusing to do something bad, and thus one could say that “I do not do the evil I want, but the good I do not want is what I do.”


Desire and The Good

A confusing thing about the meanings of “one” and “many” is that the meaning of each seems to depend on the other. The reality behind this is that there is a back and forth process in which each is used to understand the other better. First we understand being, which is something one, although without the specific idea of unity. Then we understand distinction, which implies several things, again without the specific idea of “many.” Then we understand the one by contrast with things that are distinct. Finally we understand the many as a whole composed of ones as parts.

A similar thing happens with the meanings of “desire” and “good”. Thus St. Thomas defines the good in reference to desire:

I answer that, Goodness and being are really the same, and differ only in idea; which is clear from the following argument. The essence of goodness consists in this, that it is in some way desirable. Hence the Philosopher says (Ethic. i): “Goodness is what all desire.” Now it is clear that a thing is desirable only in so far as it is perfect; for all desire their own perfection. But everything is perfect so far as it is actual. Therefore it is clear that a thing is perfect so far as it exists; for it is existence that makes all things actual, as is clear from the foregoing (3, 4; 4, 1). Hence it is clear that goodness and being are the same really. But goodness presents the aspect of desirableness, which being does not present.

But he also seems to define desire in relation to the good:

I answer that, We must needs assert that in God there is love: because love is the first movement of the will and of every appetitive faculty. For since the acts of the will and of every appetitive faculty tend towards good and evil, as to their proper objects: and since good is essentially and especially the object of the will and the appetite, whereas evil is only the object secondarily and indirectly, as opposed to good; it follows that the acts of the will and appetite that regard good must naturally be prior to those that regard evil; thus, for instance, joy is prior to sorrow, love to hate: because what exists of itself is always prior to that which exists through another. Again, the more universal is naturally prior to what is less so. Hence the intellect is first directed to universal truth; and in the second place to particular and special truths. Now there are certain acts of the will and appetite that regard good under some special condition, as joy and delight regard good present and possessed; whereas desire and hope regard good not as yet possessed. Love, however, regards good universally, whether possessed or not. Hence love is naturally the first act of the will and appetite; for which reason all the other appetite movements presuppose love, as their root and origin. For nobody desires anything nor rejoices in anything, except as a good that is loved: nor is anything an object of hate except as opposed to the object of love. Similarly, it is clear that sorrow, and other things like to it, must be referred to love as to their first principle. Hence, in whomsoever there is will and appetite, there must also be love: since if the first is wanting, all that follows is also wanting. Now it has been shown that will is in God (19, 1), and hence we must attribute love to Him.

This seems circular. Desire is a tendency towards the good, while the good is something that is desirable.

The correct response is that here too we have a back and forth process where each thing makes the other understood better. The first thing in this order is desire, but for the moment without the specific idea of tendency towards the good. Taken in this way, it expresses a way of feeling, a sensible experience. It does not matter here whether we take desire in particular, or its principle, namely love, or its consequence, namely pleasure or joy, or their opposites, such as hate, aversion or sadness. In any case we wish to consider them in a very subjective way, as a way of feeling.

Taken in this way, we can consider them much like a kind of sensation. People sometimes ask how we know that pain is a property of the one who feels pain, rather than of the object that inflicts pain. It seems perfectly possible to say that “this knife is painful” could be just as much an objective fact about the knife, as the fact that the handle of the knife is brown. Of course, no one actually believes this. But the question is why they do not.

It would be easy to suppose that the experiences themselves, namely of seeing the knife and being cut by it, are self explanatory. Of course being cut by a knife is something that happens to me, and of course the color of the knife is a property of the knife.

I agree with the conclusion, naturally, but I do not agree with the reasoning. I do not think that we know this in virtue of the experiences themselves. I think we learn it, very quickly and without a need for conscious attention, from the contexts in which those experiences happen. As I said in the linked post on truth in the senses, sensations are not descriptions of a thing, and they do not make claims. Pain does not assert, “I do not belong to this painful thing”; it does not say anything. Nor does color assert, “I am a property of this body.” It does not say anything. And if we simply consider the sensations as such, we could not give a reason why pain could not be a property of the painful thing, nor why color could not be a property of ourselves rather than the thing. But the contexts in which we have these sensations teach us that color belongs to the colored object, and pain to ourselves, rather than to the painful thing.

Consider the case of sadness. It is easy enough to see that sadness is a property of ourselves, and not of an objectively sad fact. Part of the reason it is easy to see this is that we can be sad, and we can know that we are sad, without noticing any particular reason for being sad. In other words, it is the context of the experience that shows us that it is a property of ourselves.

Something similar is the case with love and desire. Insofar as they are feelings that can be experienced, they can be experienced without noticing any particular object. Katja Grace talks about this situation:

Sometimes I find myself longing for something, with little idea what it is.

This suggests that perceiving desire and perceiving which thing it is that is desired by the desire are separable mental actions.

In this state, I make guesses as to what I want. Am I thirsty? (I consider drinking some water and see if that feels appealing.) Do I want to have sex? (A brief fantasy informs me that sex would be good, but is not what I crave.) Do I want social comfort? (I open Facebook, maybe that has social comfort I could test with…)

If I do infer the desire in this way, I am still not directly reading it from my own mind. I am making educated guesses and testing them using my mind’s behavior.

In this way, it is possible to feel desire as a mere feeling, without defining it in reference to something good. And this kind of feeling is the origin of the idea of “desire,” but it is not yet sufficient.

We learn from experience that when we have desires, we tend to do things. And we notice that not all desires are the same, and that when we have similar desires, we end up doing similar things. And so from this we get the idea of the good as the end and final cause of our actions. We do similar things when we have similar desires, and what those things have in common is that they result in the same ends, even if they use different means. So the end is “why” and explains the choice of means.

In turn, this understanding of the end allows us to understand desire more precisely, now as an inclination towards the good.