Saturday 22 October 2011

The memory bound on AI systems: The move towards self-awareness

These days, we often hear about the memory explosion in the world of computers. This explosion refers to the increase in the amount of memory available per unit size, and there has indeed been an unanticipated memory explosion. Recall the days of SDR SDRAM, when computers used to have 64/128/256/512 MB of RAM, and compare that with the current times. Computers of the present era use DDR SDRAM, and typical RAM sizes have gone up to 8 GB/16 GB/64 GB, and even more than that in specialized architectures. Even in terms of storage there has been quite an improvement, and there has also been an explosion in processing speeds: compare the 686-class processors with the i-series and you will find a breathtaking improvement.

That was the general scenario. In the case of AI systems, we generally use specialized architectures like the ones used in supercomputers, where primary memory is measured in terabytes, secondary storage in petabytes, and MIPS rates are much higher than those of general-purpose computers. But even after all these improvements, for generic artificially intelligent systems the explosion in processing speed may suffice, while the explosion in memory (both primary and secondary) is not nearly enough.

The reason for this is that AI systems capable of learning may need tonnes of memory to remain stable and effective. Unlike conventional computing systems, AI systems organize their memory mainly in the form of neural networks (although there is an entire variety of knowledge representation structures used in AI systems, we will concentrate on neural networks to keep the discussion simple). Whereas conventional computers have tree-like directory structures and file systems, AI systems form a fully connected network which is much more exhaustive and much more effective for what AI systems have to do. Neural networks are an imitation of the human brain. A neural network is composed of nodes and connectors, just like the neurons and the connections between them in our brains. Like the impulses that are transferred between the different neurons of our brain, the nodes of a neural network too transfer signals (information) among themselves.

Now we will try to see how a neural network actually works. Look at this diagram:

[Diagram: a basic neural network with a layer of input nodes, a layer of hidden nodes and a layer of output nodes]
This is the diagram of a basic neural network. A basic neural network has 3 layers of nodes: input nodes, hidden nodes and output nodes. Input nodes are passive, which means that they do not contain any information and they do not manipulate the information that comes to them. The input nodes simply pass on the data they get (data here means variables; we will consider these variables to be numbers) to the many connectors that leave them. For example, in the figure above, look at the first input node. It gets a single variable (X11) and passes that same value on to the four connectors that connect it to 4 hidden nodes.


The hidden nodes (the internal nodes in the middle layer) as well as the output nodes are not passive. Every connector that reaches a hidden node multiplies the value it carries by a weight. A weight here is just a number. For example, if a value of 10 arrives on a connector whose weight is 0.7, then the weighted value is 0.7 * 10 = 7. So what comes to a hidden node is a set of weighted values. The hidden node then applies a sigmoid function, which combines all these weighted values (by summing them) into a single number that lies between 0 and 1. So every hidden node gives an output that lies between 0 and 1.
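To make the arithmetic concrete, here is a minimal Python sketch of a single hidden node. The function names and the example numbers are my own, purely for illustration:

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def hidden_node_output(inputs, weights):
    """Weight each incoming value, sum them, and apply the sigmoid."""
    weighted_sum = sum(value * weight for value, weight in zip(inputs, weights))
    return sigmoid(weighted_sum)

# Example: a value of 10 arriving on a connector with weight 0.7
# contributes 0.7 * 10 = 7 to the sum, exactly as described above.
print(hidden_node_output([10, 4], [0.7, -0.2]))   # a number between 0 and 1
```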

After that, the output nodes receive values from the hidden nodes. Output nodes have multiple input connectors but only a single output. So these nodes combine the input values to reduce the number of outputs that the network produces. Hence they too manipulate the information that they get.

There can be multiple layers of input nodes, hidden nodes and output nodes. Input nodes connect either to more input nodes or to hidden nodes, whereas hidden nodes connect either to more hidden nodes or to the output nodes. In this way we get a fully interconnected neural network.
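Putting the layers together, a full forward pass through a toy network might look like the sketch below. The network shape (2 inputs, 4 hidden nodes, 1 output) and all the weight values are arbitrary choices made only for the example:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weight_matrix):
    """Compute one layer's output: one weighted sum + sigmoid per node."""
    return [sigmoid(sum(i * w for i, w in zip(inputs, weights)))
            for weights in weight_matrix]

# Hypothetical weights: 4 hidden nodes, each fed by the 2 inputs,
# and 1 output node fed by the 4 hidden nodes.
hidden_weights = [[0.5, -0.3], [0.8, 0.1], [-0.6, 0.9], [0.2, 0.4]]
output_weights = [[0.7, -0.5, 0.3, 0.1]]

def forward(inputs):
    hidden = layer_forward(inputs, hidden_weights)
    return layer_forward(hidden, output_weights)

print(forward([1.0, 0.5]))   # a single value between 0 and 1
```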

So this is how a neural network keeps information in it. The input nodes accept the raw information and the output nodes present the results of applying the knowledge. The weights are the most important part, because it is these weights that determine how good the results will be. So the overall problem of having effective knowledge comes down to fine calibration of the weights.

Now, coming back to the original problem. AI systems are of two types. The first type has an existing set of neural networks, and no new neural networks are added during operation. Expert systems come under this category. Expert systems are AI systems that are specialized for some particular task; they are just like human experts. In these systems, the amount of knowledge the AI system will use is known in advance. The knowledge is structured in the form of neural networks, and as the system starts working, it uses this knowledge to solve problems. As the system works, it keeps improving its knowledge by adjusting the values of the original weights. For instance, if the system knows that it failed on a few occasions because of some faulty weight, it can recalibrate the value of that faulty weight on the basis of its findings. These systems need only a limited amount of primary memory and storage to function.
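One common way to picture this kind of weight recalibration (not necessarily what any particular expert system uses) is a simple error-driven update, sketched below. The function name, learning rate and example numbers are assumptions made for illustration:

```python
def recalibrate(weights, inputs, predicted, expected, learning_rate=0.1):
    """Nudge each weight in the direction that reduces the observed error.

    A simple delta-rule style update, shown only to illustrate the idea of
    adjusting a faulty weight on the basis of the system's findings.
    """
    error = expected - predicted
    return [w + learning_rate * error * x for w, x in zip(weights, inputs)]

# If the system predicted 0.2 where 0.9 was expected, the weights
# on the active inputs are increased slightly.
weights = [0.7, -0.2]
weights = recalibrate(weights, inputs=[1.0, 0.5], predicted=0.2, expected=0.9)
print(weights)
```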

The other class of systems is different. These AI systems are capable of much more than the previous class. They too can re-calibrate the weights of their existing neural networks, but they are also capable of generating new neural networks and expanding the existing ones. For example, let's take a humanoid robot. This robot knows only a few things at the beginning. As it starts its operation, it is going to learn new things. The amount of knowledge required by a humanoid to function is so large that it is never possible to incorporate all of it at the very beginning. Hence humanoids start functioning with a minimal amount of knowledge, and they are equipped to learn new things on their own. Now suppose the humanoid comes across an entirely new task. As it learns how to do it, it builds a new neural network based on the knowledge that it gathers. Hence, as it learns new things, it keeps generating new neural networks. The humanoid may also extend an existing neural network when it learns a new variant of something it already knows: say it knows how to cook plain rice and has just learned how to cook fried rice. It will add some new nodes to its existing neural network so that the network becomes more versatile.
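A toy sketch of both behaviours (building a brand-new network for a new task, and growing an existing one for a new variant) could look like this. The class, its methods and the random weights are all hypothetical, introduced only to make the idea tangible:

```python
import random

class GrowingBrain:
    """A toy stand-in for a humanoid's knowledge store.

    'skills' maps a task name to the weight matrix of the network that
    handles it.
    """
    def __init__(self):
        self.skills = {}

    def learn_new_task(self, task, n_inputs, n_hidden):
        # Encountering an entirely new thing: build a fresh network for it.
        self.skills[task] = [[random.uniform(-1, 1) for _ in range(n_inputs)]
                             for _ in range(n_hidden)]

    def extend_task(self, task, extra_hidden):
        # Learning a new variant of a known task (plain rice -> fried rice):
        # add new hidden nodes to the existing network instead of a new one.
        n_inputs = len(self.skills[task][0])
        self.skills[task] += [[random.uniform(-1, 1) for _ in range(n_inputs)]
                              for _ in range(extra_hidden)]

brain = GrowingBrain()
brain.learn_new_task("cook plain rice", n_inputs=8, n_hidden=4)
brain.extend_task("cook plain rice", extra_hidden=2)   # now also fried rice
print(len(brain.skills["cook plain rice"]))             # 6 hidden nodes
```

Either way, every new task and every new variant adds weights that have to live somewhere, which is exactly where the memory pressure comes from.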

It is for these systems that memory limits impose a restriction on functioning. In the beginning, we do not know what a humanoid will learn or at what pace it will learn it, and even if we gave it the largest memory chips available, that would not suffice. The problem is that humanoids have to mimic humans, and we human beings have, for all practical purposes, a limitless amount of memory. Our brains are so powerful and our memory capacity is so vast that no humanoid can even think of matching it (at least for the next decade or so). Although we use only a fraction of our brain at any time, we are sure that we will never run out of memory. That is not the case with our humanoid. It has to build new neural networks and expand the existing ones that it possesses. As it starts learning, it has to learn a lot of things and retain most of what it learns. The humanoids built to date started learning at exponential rates until they either had to shut down for lack of memory or finished learning what they had set out to learn.

All research humanoids were started with minimal knowledge, and as they began interacting with the world, they started learning new things. The problem is that the algorithms are not very good at deciding when to stop. As a result, the humanoids learn a lot of things, keep generating more and more networks, and keep expanding their existing networks. So although they begin learning at a good rate, eventually they always fall short of memory. This happens because we human beings know our limits, but the humanoids don't. They fall short of both primary memory and permanent storage. As they expand a neural network, they have to retain the existing network in current memory, and they also have to keep using the updated network if they are to continue doing the job they were learning. Hence the amount of neural network to be retained in memory crosses the threshold of what is available. Moreover, humanoids are inherently multitasking, and therefore they have to keep multiple neural networks in memory while solving problems.

There have been a few solutions to the problem of limited primary memory. Modified algorithms can help the humanoid decide what portion of its neural networks to keep in current memory. But even in that case, we are eventually going to reach a bound.
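As an illustration of what "deciding what to keep in current memory" might look like, here is a sketch of one possible policy, a least-recently-used cache of networks. The class and its parameters are assumptions; the point is only that some eviction policy has to exist, and that any such policy still runs into the same overall bound:

```python
from collections import OrderedDict

class NetworkCache:
    """Keep only the most recently used networks in primary memory."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.resident = OrderedDict()   # task name -> network weights

    def use(self, task, load_from_storage):
        if task in self.resident:
            self.resident.move_to_end(task)         # mark as recently used
        else:
            self.resident[task] = load_from_storage(task)
            if len(self.resident) > self.capacity:  # evict the oldest network
                self.resident.popitem(last=False)
        return self.resident[task]
```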

The second problem is that of permanent storage. As humanoids keep learning new things, they have to store the enlarged and newly created neural networks so that the acquired knowledge remains available for future use. As a result, they have to keep storing the knowledge that they acquire over time, so the permanent information held by every humanoid keeps increasing. Imagine how much knowledge a humanoid would end up holding.

Let's try to get a sense of the magnitude of information we are talking about. If a humanoid has to learn to make coffee, it must have knowledge of how to see things and how to distinguish coffee and its assisting ingredients from the other objects in the world. Then the humanoid must also know how to make coffee. Now, the problem of recognizing objects in the world is itself a big one. Learning by example is the approach used for objects: if the humanoid has seen a coffee jar, it will store the jar in terms of its height, weight, and other visual and physical aspects. If the next jar is a bit different, it will have to add that new information and refine its existing class of coffee jars to cover it. So, over time, the humanoid will refine its classes of objects by adding more information to them, and it will also define new procedures for the new things it learns. These things are stored in different forms: classes are stored as sets of facts, whereas procedures are stored as sequences of steps. These forms are either entirely new or are variations of the neural networks we discussed. Irrespective of the form in which this information is stored, the amount of memory needed is humongous. And mind you, the amount of memory we are talking about here runs into thousands of petabytes if the humanoid is to learn and retain most of what it encounters.
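To make the two storage forms a little less abstract, here is a toy illustration of a class stored as a set of facts and a procedure stored as a sequence of steps. The particular attributes, steps and the refinement rule are invented for the example:

```python
# Object class as a set of facts: observed ranges and features of coffee jars.
coffee_jar_class = {
    "height_cm": (10, 20),        # range refined with each new jar seen
    "weight_g": (150, 400),
    "visual_features": {"cylindrical", "labelled", "lidded"},
}

# Procedure as an ordered list of steps.
make_coffee_procedure = [
    "locate the coffee jar",
    "boil water",
    "add one spoon of coffee to the cup",
    "pour the water and stir",
]

def refine_class(cls, new_height, new_weight):
    """Seeing a slightly different jar widens the stored ranges."""
    lo, hi = cls["height_cm"]
    cls["height_cm"] = (min(lo, new_height), max(hi, new_height))
    lo, hi = cls["weight_g"]
    cls["weight_g"] = (min(lo, new_weight), max(hi, new_weight))

refine_class(coffee_jar_class, new_height=22, new_weight=500)
```

Multiply structures like these across every object and every activity a humanoid might encounter, and the storage figures quoted above stop looking far-fetched.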

So, is it possible to put that much memory into a robot that looks like a human? Not in the current times, that's for sure. But a modification of the client-server architecture can be used to regularly transfer tonnes of information from the humanoid to some remote storage where that much memory is available. Of course, given the network bandwidths of the current times, a single transfer would take a considerable amount of time, but we have no other option as of now. The problem arises when the humanoid has to perform some action and knows that it has the knowledge for it. If the neural network (or the portion of it) needed for solving the problem is in local storage (storage within the humanoid), then it's fine. Otherwise, it will have to access the remote repository where it has stored all the knowledge it has gathered. In the latter case, imagine how long it will have to wait before all the needed information becomes available and it can start acting.
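In pseudocode terms, the local-versus-remote lookup is just this. Both `local_store` and `fetch_remote` are hypothetical placeholders; the sketch only shows where the waiting happens:

```python
def solve(task, local_store, fetch_remote):
    """Use the local copy of a network if present, otherwise fetch it."""
    network = local_store.get(task)
    if network is None:
        # The needed knowledge lives only on the remote storage: the
        # humanoid must wait for the slow transfer before it can act.
        network = fetch_remote(task)
        local_store[task] = network
    return network
```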

So where is the catch? Conventional methods, and modifications of them, do not seem to offer any viable solution to this problem. But a solution does exist: incorporating self-awareness into AI systems.

As the term suggests, self-awareness means that the humanoid or AI system becomes aware of its own existence and of its own limits. Obviously the system already knows what memory and processing capabilities it has; here the emphasis is on being aware of how much it is capable of learning, just as we human beings are. In this case, every humanoid will still start learning at an exponential rate: as it encounters new problems, it will gather more and more knowledge from its interaction with the world.
But because it knows its own limits, it will keep deleting obsolete and temporary knowledge over time, and it will also learn only a portion of what it would have learnt in the previous case. The learning-by-example method would have made it classify coffee jars exhaustively, and that was an effective means of learning; now that it is self-aware, it will include only the few aspects that it considers necessary, keeping its memory limits in mind. This is analogous to a student who, while reading a chapter, notes the more important things and emphasizes them. Likewise, the self-aware humanoid will grasp only the important aspects and store only those. Later on, if it fails while attempting to solve the problem, it tries to grasp the other things that it believes it missed in the first attempt.
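One way to sketch this "forgetting within a memory budget" is shown below. The data layout, the importance score and the staleness discount are all assumptions chosen for illustration, not a description of how any real humanoid does it:

```python
import time

def prune_knowledge(knowledge, memory_limit, importance, now=None):
    """Drop the least important, least recently used items until we fit.

    'knowledge' maps an item name to (last_used_timestamp, payload);
    'importance' is a scoring function the self-aware system maintains.
    """
    now = now if now is not None else time.time()
    while len(knowledge) > memory_limit:
        # Score each item by importance, discounted by how stale it is,
        # and delete the lowest-scoring one.
        stalest = min(knowledge,
                      key=lambda k: importance(k) / (1 + now - knowledge[k][0]))
        del knowledge[stalest]
    return knowledge
```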

Hence, the system which previously gained a lot of knowledge on the first attempt, so that it was sure of solving the problem when it encountered it again, now takes a small gamble and gains only partial knowledge. As this humanoid fails again and again, it keeps improving its knowledge base. Eventually, when the success rate goes above a threshold, it knows that it has gained enough knowledge and stops adding more. This is the essence of self-awareness: the system should know when to stop, which is why the threshold values must be chosen very carefully. The robot thereby begins to learn the way human beings do. Every human being tries a thing two or three times before he or she starts to succeed, and that is how the humanoid would now work. Another aspect is that, with time, the humanoid becomes aware of what its skills are and can guarantee some success in those domains. It keeps refining its knowledge base by adding new knowledge and discarding unused and obsolete knowledge. Effectiveness is somewhat reduced this way: if there was a problem it solved long ago, it might take a lot of time to solve it again, because the corresponding neural net was deleted and the problem has to be solved from scratch. But the system will need much less memory than before.
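The try-fail-learn loop with a stopping threshold might be sketched like this. The callables `attempt_task` and `improve_knowledge`, the window size and the threshold value are all hypothetical; the threshold is exactly the quantity the paragraph above says must be chosen very carefully:

```python
def learn_until_good_enough(attempt_task, improve_knowledge,
                            threshold=0.9, window=20):
    """Try, learn a little more from each failure, and stop adding
    knowledge once the recent success rate crosses the threshold."""
    recent = []
    while True:
        succeeded = attempt_task()
        recent = (recent + [succeeded])[-window:]
        if len(recent) == window and sum(recent) / window >= threshold:
            break                      # 'enough' knowledge: stop growing
        if not succeeded:
            improve_knowledge()        # grasp what was missed this time
```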

This concept cannot be used where the success rate is critical, but it can be used in humanoids that mimic the life of a regular human being who is in a training phase. Even after becoming self-aware, the system will still need some help from technical advancements, because even with these mechanisms, the amount of permanent information needed would be difficult to fit into a machine the size of a human being.

At the end, I have to tell you that this post is by no means exhaustive. It's just a small snippet of a very big research area. A fully exhaustive post would have taken at least 200 pages, which is why this one is a seriously scaled-down version. I just wanted to share these enticing insights with you and have you share that exhilarating imagination with me. That was the sole purpose behind writing this post.

Thanks for your patience.




