
Tuesday, 12 June 2012

Using AI to Combat Cyber Crime

Artificial Intelligence has, for quite some time now, been used to combat credit card fraud. Data Mining, which is in a way an application of AI, detects credit card fraud through various mechanisms. In the most common scenario, a pattern of the user's credit card usage is built from his transaction records, and every future transaction is added to the pattern only after it conforms to the pattern itself. Whenever a transaction, or a pair of transactions, violates the pattern, the system prompts the surveillance personnel to check in. It is then at the personnel's discretion whether the transactions are to be investigated or entered into the system and incorporated into the ever-changing pattern. In a more advanced form, the user's normal usage pattern is combined with the usage pattern seen across the whole group to which the user belongs. This group may be formed on the basis of income, credit card category (gold, silver, etc.) or even the company the user works for. This scheme is more robust and more resistant to single high-value transactions that appear to drift away from the pattern but are actually genuine.
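As a minimal sketch of the idea, the "pattern" below is just the mean and standard deviation of a user's past transaction amounts, and a transaction is flagged when it deviates too far from that pattern. Real systems model many more dimensions (merchant, location, time of day); the function names and the 3-sigma threshold are illustrative assumptions.

```python
# Flag transactions that deviate too far from the user's spending pattern.
from statistics import mean, stdev

def is_suspicious(history, amount, threshold=3.0):
    """Return True if `amount` lies outside the user's usual pattern."""
    if len(history) < 2:
        return False  # not enough history to form a pattern yet
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return amount != mu
    return abs(amount - mu) / sigma > threshold

history = [40, 55, 38, 60, 45, 52]
print(is_suspicious(history, 50))    # a typical amount: False
print(is_suspicious(history, 900))   # far outside the pattern: True
```

A flagged transaction would then go to the surveillance personnel, and amounts they approve would be appended to `history`, letting the pattern evolve as described above.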

The above approaches have been quite effective in combating credit card fraud, and as a result, agencies all over the world have started looking at AI for combating other forms of electronic/cyber crime. They turned to AI because, given the humongous number of transactions, it is utterly impossible to employ humans to track movement over the internet. They need a machine to do that, and in fact they need a machine that's smart enough to match the wits of a human expert. The intelligence may either be embedded in the individual application servers, just like the spam filters used by mail servers, or it may be implemented at the firewalls at the gateways. The advantage of embedding it into the individual servers is that logic specific to the application can be included. For example, a traffic pattern may be acceptable for a mail server but not for an office application server. In fact, the best approach is to divide the intelligence between the two places: general intelligence is embedded at the firewalls, and application-specific intelligence is embedded in the individual servers.

The general model suggests using some traffic analysis technique, which would differ from network to network. Traffic could be analyzed at one or several levels: the datagram traffic, the IP-level traffic, or both. The observed traffic is matched against a general traffic pattern, just like the pattern matching in credit card fraud detection. At the firewalls, the overall traffic pattern is analyzed, and at the individual servers, the application-level and session-level traffic is analyzed. At the application level, once again two patterns could be used - a user pattern and a group pattern. At the firewalls, however, a single pattern has to be used. In fact, instead of a single pattern, the system may keep different patterns for different days or different times of day and apply them accordingly. Like every cognitive learning mechanism, these patterns would improve with time. The system would match the actual traffic against the stored pattern and keep adjusting the stored pattern according to what it observes. For example, if the system reported an anomaly and the network admin decides it is normal traffic, the system would incorporate this into its traffic pattern model and improve itself. Hence, with time, the system becomes more and more effective.
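The feedback loop described above can be sketched in miniature: keep a per-hour baseline of request counts, flag deviations, and fold admin-approved "false alarms" back into the baseline so the model improves over time. The class name, the exponential-moving-average update rule and the tolerance value are all illustrative assumptions, not a real design.

```python
# A toy traffic model that learns its baseline from admin feedback.

class TrafficModel:
    def __init__(self, alpha=0.2, tolerance=0.5):
        self.baseline = {}          # hour of day -> expected request count
        self.alpha = alpha          # learning rate for baseline updates
        self.tolerance = tolerance  # allowed relative deviation

    def is_anomalous(self, hour, count):
        expected = self.baseline.get(hour)
        if expected is None:
            return False  # no stored pattern for this hour yet
        return abs(count - expected) > self.tolerance * expected

    def learn(self, hour, count):
        """Incorporate observed traffic (or an admin-approved anomaly)."""
        old = self.baseline.get(hour, count)
        self.baseline[hour] = (1 - self.alpha) * old + self.alpha * count

model = TrafficModel()
model.learn(14, 1000)                # normal traffic at 2 pm
print(model.is_anomalous(14, 1050))  # within tolerance: False
print(model.is_anomalous(14, 5000))  # flagged for the admin: True
```

When the admin marks a flagged burst as normal, calling `learn` with that count widens the baseline, which is exactly the self-improvement loop the paragraph describes.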

Tuesday, 1 May 2012

Artificially Intelligent Caching

The first two paragraphs are only for those who have very little or no knowledge of caching.

The cache is considered to be the most expensive and most important resource a computer possesses. Cache is the memory the CPU can access in the least amount of time, so cache accesses are much faster than other memory accesses. Whenever the CPU needs some data, it checks the cache first, and only if the cache doesn't have the data does it begin searching the other memories. The size of the cache is limited by architectural constraints as well as by the fact that cache is made up of highly expensive SRAM. Now the jinx is that an access to cache takes the least amount of time, but the cache can store only a small part of the information that a computer holds and needs. So Operating Systems use replacement policies for replacing existing data in the cache with other data from the higher-level memories (RAM and secondary storage). It is this very replacement policy that decides the fate of the OS. If the replacement policy is designed such that on most occasions the data the CPU needs is in the cache (a cache hit), the overall system will be fast and performance will go up. On the other hand, if the CPU frequently cannot find the needed data in the cache (a cache miss), performance will go down.

Data enters the cache when it is accessed for the first time. Suppose the CPU needs some data that resides in secondary storage. First the data goes from secondary storage to RAM, and then from RAM to the cache. The next time the CPU needs that data, it checks whether the cache still has it. If the cache doesn't, a miss occurs, and the RAM and secondary storage are searched, in that order. Once the data is found, it is again transferred to the cache. The replacement policy comes into action after some fixed time period, or after the number of misses has crossed some threshold. It tries to predict the future: it removes from the cache the data least likely to be accessed again, and stores the new data from the higher memories that is most likely to be accessed. It is this very future-prediction capability that makes a replacement policy succeed or fail. The most popular replacement policy is LRU (Least Recently Used), where the data that was accessed least recently is removed and the data accessed most recently is retained.
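The LRU policy just described can be written compactly on top of an ordered dictionary: each access moves an entry to the "most recent" end, and the entry at the other end is evicted when capacity is exceeded.

```python
# A compact LRU cache: the least recently used entry is evicted first.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None                 # cache miss
        self.data.move_to_end(key)      # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the LRU entry

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" becomes most recently used
cache.put("c", 3)      # evicts "b", the least recently used
print(cache.get("b"))  # None - "b" was evicted
print(cache.get("a"))  # 1
```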

LRU is driven by heuristics (a history of data usage, where time of access is the primary driver), and it is obviously not perfect. Given the impact the replacement policy can have on performance, one has to strive to improve its future-prediction capabilities. This is where researchers believe AI can be put to good use. AI has been a good predictor in several domains, and this is very likely another one where it can succeed. For replacing existing data in the cache with data that is more likely to be accessed, one would need heuristics (already put to good use by LRU) together with some other predictor.

All the data of a computer system is organized as a file system, and modern file systems use tree structures to organize files. To add to the heuristics, the system needs to explore the most likely patterns in which data can be accessed. It is not just about the times at which data is accessed, but also about the pattern in which data is accessed across the file system tree. So the proposed replacement policy would strive to find the most probable patterns in which data might be accessed in the future. It can predict these patterns by utilizing metadata from the file system, by storing observed access patterns, and by constructing new patterns by following some logic.

Basic patterns can be generated using file access times and frequencies (something most modern file systems already store). These patterns can be compared with the real file access patterns observed in recent history (stored in some separate dedicated memory), and the comparison can be used to eliminate certain patterns. Finally, a probable set of file access patterns is generated. During operation, this set of patterns is combined with heuristics (the same ones used in LRU), and replacement is done on the basis of the most probable access pattern chosen by the policy. The number of misses and hits seen while using the set of patterns can be used both to make changes to the set itself and to switch to the next most probable pattern from the set. This corresponds to a kind of feedback, which is the core of learning in AI systems.
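One way to sketch the proposed policy: combine an LRU-style recency score with a learned "what usually follows what" table built from the observed file-access sequence, so that files likely to be needed next are protected from eviction. The class name, the score weights and the bigram table are illustrative assumptions, not the article's concrete design.

```python
# A pattern-aware replacement policy: recency plus learned access patterns.
from collections import defaultdict

class PatternAwareCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = {}   # file -> tick of its last access
        self.follows = defaultdict(lambda: defaultdict(int))  # a -> {b: count}
        self.prev = None
        self.tick = 0

    def access(self, f):
        self.tick += 1
        if self.prev is not None:
            self.follows[self.prev][f] += 1   # learn the observed pattern
        if f not in self.cache and len(self.cache) >= self.capacity:
            self.evict()
        self.cache[f] = self.tick
        self.prev = f

    def evict(self):
        # Protect files that usually follow the most recent access.
        likely = self.follows.get(self.prev, {})
        def score(f):
            return self.cache[f] + 10 * likely.get(f, 0)
        victim = min(self.cache, key=score)
        del self.cache[victim]

cache = PatternAwareCache(2)
for f in ["app", "config", "app", "config", "app", "log"]:
    cache.access(f)
# "config" usually follows "app", so it survives even though plain LRU
# would have kept the more recently used "app" instead.
print(sorted(cache.cache))  # ['config', 'log']
```

The hit/miss feedback from the article would then adjust the score weights or rebuild the `follows` table, which is the learning loop the paragraph describes.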

It is pretty clear that such a scheme would need some extra memory and logic for computing these patterns and then storing and changing them. But the patterns would take up very little space, and the processing can be done during free CPU cycles. In the overall context, the approach would be beneficial, as the cache hit ratio is very likely to increase. Such an approach is useful for general computer systems, where both files and file data are accessed in structured patterns. For example, certain applications always read some files before others. The approach can also be put to very good use in servers (both database and web), as even in these systems the users are very likely to view some specific data/webpage before others. However, the approach may break down in cases where usage is highly unpredictable. The file systems won't need any drastic changes, but additional logic for pattern prediction and updating would be needed.

Like all other AI systems, this system too will get better with operation. Initially, the system generates file patterns on the basis of the metadata it gets from the file system. As the system is put into operation, it refines these initial patterns on the basis of the patterns actually observed. Finally, the patterns are refined further once they are put into operation and feedback is incorporated into them. One may also think of some other prediction scheme for file usage, but the core concept remains the same - the system has to predict which files or data will be used the most at a point in time. And it's pretty obvious that this approach too would be using AI to serve its purpose. That's the power of AI!


Thursday, 10 November 2011

Predicting Earthquakes In Advance

Surprised after reading the title of this post? More than that, you might be wondering if it is even possible. Well, this may turn out to be a reality in the near future. Earthquakes are perhaps one of the most devastating forces of nature. Ever since the inception of civilization, earthquakes have claimed countless lives and caused heavy damage to property. Whereas damage to property can be controlled by making earthquake-resistant structures, loss of life can be reduced both by making stronger structures and by finding ways of predicting earthquakes well in advance.

Now, the question is, how? Earthquakes are of two types - shallow ones and deep ones. Shallow earthquakes originate within a depth of around 300 km beneath the surface of the planet, and deep earthquakes originate at larger depths. The mechanism behind shallow earthquakes is well understood; however, there is no clear-cut explanation for deep earthquakes. Hence, the concept that will be used to predict earthquakes is applicable only to shallow earthquakes. Moreover, shallow earthquakes cause more loss and wreak more havoc than the deep ones.

Now, the principal point is that shallow earthquakes have a definite relation with seismic activity and seismic waves. Seismic waves are basically the waves that originate because of movements inside the Earth. The seismograph shows more activity during an actual earthquake, and the Richter scale measurement of an earthquake's magnitude is actually the magnitude of the largest variation recorded on the seismograph during the earthquake's span. Monitoring centers throughout the globe keep recording the seismographs of their corresponding zones, and these seismographs easily tell when an earthquake occurred. The monitoring centers are placed after an analysis of tectonics (there are several tectonic plates in the Earth, and shallow earthquakes are related to collisions and other interactions among these plates; the study of this is tectonics). The seismographs also give us information about various parameters of the geographical area to which they pertain. You must have heard about danger zones, in terms of the probability of occurrence of an earthquake: countries and states are divided into seismic zones. Some zones have a higher risk of seeing an earthquake than others, and some zones are also likely to see more powerful earthquakes than others. This zonal distribution turns out to be very useful during planning. Such zones are drawn up after analyzing the seismic activity of a place over a long period and also its tectonics - for example, places closer to the meeting point of two tectonic plates are at a higher risk.

So, the question is, can't we make better use of seismographs - use them for better things than just planning zones? Well, seismographs may turn out to be one of the biggest boons for mankind. Seismographs record the strength of seismic waves and are analyzed across various parameters. They are recorded at all times, and most places in the world will have a large database of seismographs by now. There are two suggestions for using these seismographs to predict future earthquakes - a statistics-based philosophy and a Data Mining based philosophy. The statistics-based philosophy is the conventional one. The seismographs of all the years till now are analyzed, and the values of various parameters are calculated. The values recorded during earthquakes are given higher weights while compiling an area-based formula that can be used for prediction. The current values fed into the formula then help us find out whether we are nearing an earthquake. Now, the disadvantages of this approach are:

1. Values in the future may resemble past values just by chance, and hence may turn out to be false predictors in the end.

2. The approach might predict an upcoming earthquake, but not well in advance, so the authorities may not get the time required to warn people.

3. Every statistical approach has its own disadvantages.

4. If there were heavy variations in parameter values during past earthquakes, the formula for that area would be very fragile.

5. The process used for computing the formula is based on knowledge of a subject that is not well understood, so the approach is not perfect.

The second approach, though, is the one that should interest us the most. It is based on Data Mining. Data Mining is basically a process in which tonnes and tonnes of existing data are analyzed by a program in an attempt to find hidden and potentially important information. This information may be hidden relationships between different items, or anything else that holds a lot of value for the organization to which the data belongs. Data mining can only be done when you have tonnes and tonnes of data to mine. Just to give you an example, consider a departmental store. The store sees tonnes of visitors every day, and all the billing information gets stored in its databases. When the database grows large enough, it is combined with even older databases, and all the billing information stored to date is moved to a Data Warehouse (which is just like the data archive of an organization). Now, suppose the store wants to find hidden information in this archive (since the archive is humongous, manual mining is not an option). It runs a data mining tool on the data and finds out that about 50 percent of the users who bought bread of brand A also bought cheese of brand B. This is very valuable information in this context. The store may introduce an offer with a combo of brand A bread and brand B cheese. Since 50 percent of the users already love this combination, a great proportion of those who haven't tried it yet will also have an urge to try the new combo. The store can reap huge profits like this. That is Data Mining for you.
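The bread-and-cheese example can be sketched as a toy co-occurrence miner: count how often pairs of items are bought together and report the pairs whose co-occurrence rate crosses a support threshold. Real association-rule miners such as Apriori scale this idea up; the basket contents and the 0.5 threshold here are made-up illustrations.

```python
# Toy market-basket mining: find item pairs bought together frequently.
from itertools import combinations
from collections import Counter

def frequent_pairs(baskets, min_support=0.5):
    """Return pairs co-occurring in at least `min_support` of the baskets."""
    pair_counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    n = len(baskets)
    return {pair: c / n for pair, c in pair_counts.items()
            if c / n >= min_support}

baskets = [
    ["bread_A", "cheese_B", "milk"],
    ["bread_A", "cheese_B"],
    ["bread_A", "eggs"],
    ["cheese_B", "bread_A", "butter"],
]
print(frequent_pairs(baskets))  # {('bread_A', 'cheese_B'): 0.75}
```

A store would read this output as "75 percent of baskets containing either item contained both", which is exactly the kind of hidden relationship the paragraph describes.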

So, in the context of earthquakes, what has Data Mining got to offer us? Well, it can do wonders. Seismographic experts have long recognized that the seismographs of most areas might show some specific behavior just before an earthquake. Now, we don't know how long this behavior lasts or what sort of behavior it is. But we do know one thing: we have the seismic activity recorded, both as graphs and as values, and we can assume that we have a significantly large seismograph database. By nature, seismographs are going to give you a lot of data, since they are recorded continuously, and we have a sufficiently large number of data mining tools for mining both graphical and numerical data. Hence, if there is any such behavior, Data Mining will find it. In terms of graphical mining, the tool may come up with some pattern seen some time before the earthquakes, or throughout some period before them; in terms of numerical mining, the tool may come up with a set of values seen some time before the earthquakes, or even with some averages. Hence, if there is some pattern in the seismographs and we have appropriate seismographic data for an area, an effective data mining process will come up with this hidden behavior, and experts can use this information to formulate models. In fact, Data Mining tools also provide the lower-level details behind their findings and help the experts make detailed models. A separate program may then monitor seismic activity against this model and report results to the experts at all times. It does not matter whether the behavior was transient or prolonged - if there was a specific behavior, Data Mining will find it. The strength of Data Mining lies in the Artificial Intelligence that the various tools possess.
Data Mining tools use neural networks, genetic algorithms, clustering algorithms and various other approaches to analyze the data across various dimensions and come up with hidden information. But this approach too has a few drawbacks:

1. The behavior may not be very useful if it was exhibited just a few seconds before the earthquake.

2. Data Mining tools take a lot of time to mine information, so running the tool on the go is not possible. One has to plan properly when the latest mining session should be run, and after how much new data has been collected.

3. Data Mining may at times come up with a lot of possible alternatives for explaining a particular piece of information. This is not the fault of the tool; it is the nature of the case. Here, experts will have to use their knowledge to narrow down the cases and formulate the final model.
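The precursor-mining idea above can be sketched in miniature: slide a window over recorded amplitudes, collect the windows that immediately precede known quake times, and check whether they resemble each other. The synthetic series, the window width and the mean-absolute-difference similarity measure are deliberate simplifications for illustration.

```python
# Extract the readings just before each known quake and compare them.

def windows_before(series, quake_indices, width):
    """Window of readings immediately preceding each known quake."""
    return [series[i - width:i] for i in quake_indices if i >= width]

def similar(w1, w2, tol=1.0):
    """Mean absolute difference between two windows, within tolerance."""
    return sum(abs(a - b) for a, b in zip(w1, w2)) / len(w1) <= tol

# Synthetic record: spikes to 9 mark quakes; note the 3, 5 ramp before each.
series = [1, 1, 3, 5, 9, 1, 2, 1, 3, 5, 9, 1]
quakes = [4, 10]
pre = windows_before(series, quakes, width=2)
print(pre)                      # [[3, 5], [3, 5]]
print(similar(pre[0], pre[1]))  # True - the same ramp precedes both quakes
```

A real mining run would search over many window widths and similarity measures across years of data, which is why drawback 2 above (long mining times) matters.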

So, the best thing we can do is to combine the first approach with the second and build a combined model for predicting earthquakes. There's no doubt that a lot of capital and time will be spent, but just imagine the benefit it holds for mankind. Some research has already started in this field. A team from the Indian Institute of Technology (IIT), Hyderabad is working on a project in which several small sensors will be placed in the Himalayan belt and Data Mining will be done to predict earthquakes a day in advance. The sensors are from Japan, so Japanese teams too are a part of this, and teams from other IITs will also be contributing. The project will get into full flow by 2015. Some more research from other universities throughout the world is under way. We can only hope that this research comes up with some encouraging results and gives us a model with which areas all over the world can find out if an earthquake is approaching, and that too, well in advance. Just imagine the world then. That is what technology can do.

Saturday, 22 October 2011

The memory bound on AI systems: The move towards self-awareness

These days, we often hear about the memory explosion in the world of computers. This explosion basically refers to the increase in the amount of memory per unit size, and there has indeed been an unanticipated memory explosion. Recall the days of SDR SDRAM, when computers used to have 64/128/256/512 MB of RAM, and compare that with the current times. The computers of the present era have DDR SDRAM, and typical RAM sizes have gone up to 8 GB/16 GB/64 GB, and even more in specialized architectures. Even in terms of storage there has been quite an improvement, and there has also been an explosion in processing speeds: compare the 686 series with the i-series and you will find a breathtaking improvement. That was the general scenario, but for AI systems we generally use specialized architectures like those used in supercomputers. In such systems, primary memory is measured in terabytes and secondary storage in petabytes, and even the MIPS rates are much higher than in general computers. But even after all these improvements, for generic artificially intelligent systems the processing speed explosion may suffice, but the explosion in memory (both primary and secondary) isn't good enough.

The reason for this is that AI systems capable of learning may need tonnes of memory to remain stable and effective. AI systems, unlike conventional computing systems, organize their memory mainly in the form of neural networks (although an entire variety of knowledge representation structures has been used in AI systems, we will concentrate on neural networks to keep the discussion simple). Whereas conventional computers have tree-like directory structures and file systems, AI systems form a fully connected network which is much more exhaustive and much more effective for what AI systems have to do. Neural networks are an imitation of the human brain: a neural network is composed of nodes and connectors, just like the neurons and connections in our brains. Like the impulses transferred between the different neurons of our brain, the nodes of a neural network too transfer signals (information) among themselves.

Now we will try to see how a neural network actually works. Picture a basic three-layer network: a column of input nodes, a column of hidden nodes and a column of output nodes, with connectors running between the layers.

Every neural network has 3 layers of nodes: input nodes, hidden nodes and output nodes. Input nodes are passive, which means that they do not contain any information and do not manipulate the information that comes to them. An input node simply passes on the data it receives (data here means variables; we will take these variables to be numbers) to the many connectors that leave it. For example, the first input node may receive a single variable (X11) and pass the same value on to four connectors that connect it to 4 hidden nodes.


The hidden nodes (the internal nodes in the middle layer) as well as the output nodes are not passive. Every connector that reaches a hidden node multiplies the value it carries by a weight, which is just a number. For example, if a value of 10 is coming to a hidden node and the weight on that connector is 0.7, then the weighted value will be 0.7 * 10 = 7. So what comes to a hidden node is a set of weighted values. The hidden node contains a sigmoid function, which combines all these weighted values into a single number lying between 0 and 1. So every hidden node gives an output between 0 and 1.

After that, the output nodes receive values from the hidden nodes. Output nodes have multiple input connectors but only a single output, so these nodes combine their input values to reduce the number of outputs the network produces. Hence they too manipulate the information they get.

There can be multiple layers of input nodes, hidden nodes and output nodes. Input nodes connect either to more input nodes or to hidden nodes, whereas hidden nodes connect either to more hidden nodes or to the output nodes. In this way we get a fully interconnected neural network.
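The description above maps directly onto a single forward pass: inputs pass through unchanged, each hidden node applies a sigmoid to its weighted sum, and the output node combines the hidden activations. The specific weights and inputs below are arbitrary illustrations.

```python
# One forward pass through a tiny input -> hidden -> output network.
import math

def sigmoid(x):
    """Squash any number into the (0, 1) range."""
    return 1 / (1 + math.exp(-x))

def forward(inputs, hidden_weights, output_weights):
    # Each hidden node takes a weighted sum of all inputs, then a sigmoid.
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)))
              for ws in hidden_weights]
    # The output node combines the hidden activations into a single value.
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))

inputs = [0.5, -1.0]
hidden_weights = [[0.7, 0.2], [-0.4, 0.9]]  # one weight row per hidden node
output_weights = [1.5, -0.8]
print(forward(inputs, hidden_weights, output_weights))  # a value in (0, 1)
```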

So, this is how a neural network keeps information in it. The input nodes accept the raw information and the output nodes present the results of applying the knowledge. The weights form the most important part, because it is these weights that determine how good the results will be. So the overall problem of having effective knowledge comes down to fine calibration of the weights.

Now, coming back to the original problem. AI systems are of two types. The first type has an existing set of neural networks, and no new networks are added during operation. Expert systems come under this category. Expert systems are AI systems specialized for some particular task - they are just like human experts. In these systems, the amount of knowledge the AI system will use is already known. The knowledge is structured in the form of neural networks, and as the system starts working, it starts using this knowledge to solve problems. As the system works, it keeps improving its knowledge by adjusting the values of the original weights. If the system learns that it failed on a few occasions because of some faulty weight, it can recalibrate the value of that weight on the basis of its findings. These systems need a limited amount of primary memory and storage to function.
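The recalibration just described can be sketched with a simple error-driven update: when the output misses the expected answer, each weight is nudged in proportion to the error and its input. This is a perceptron-style rule chosen for illustration, not the specific method any particular expert system uses.

```python
# Nudge faulty weights toward the expected output after each failure.

def recalibrate(weights, inputs, expected, rate=0.1):
    """Return weights adjusted in proportion to the output error."""
    actual = sum(w * x for w, x in zip(weights, inputs))
    error = expected - actual
    return [w + rate * error * x for w, x in zip(weights, inputs)]

weights = [0.5, -0.2]   # initially "faulty": output is 0.1, not 1.0
for _ in range(50):     # repeated failures gradually correct the weights
    weights = recalibrate(weights, inputs=[1.0, 2.0], expected=1.0)
print(round(sum(w * x for w, x in zip(weights, [1.0, 2.0])), 2))  # 1.0
```

Each pass shrinks the error, so after enough corrections the system's output matches what was expected - the "improving with operation" behavior described above.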

The other class of systems is different, and capable of much more than the previous one. These systems can also recalibrate the weights of their existing neural networks, but they are additionally capable of generating new neural networks and expanding the existing ones. For example, let's take a humanoid robot. This robot knows only a few things at the beginning. As it starts operating, it is going to learn new things. The amount of knowledge required by a humanoid to function is so large that it is never possible to incorporate all of it at the very beginning. Hence the humanoids start functioning with a minimal amount of knowledge and are equipped to learn new things on their own. Now suppose that the humanoid comes across an entirely new task. As it learns how to do it, it builds a new neural network based on the knowledge that it gathers. Hence, as it learns new things, it keeps generating new neural networks. The humanoid may also extend an existing neural network when it learns a new variant of something it already knows. Say it knows how to cook plain rice, but it recently learned how to cook fried rice: it will add some new nodes to its existing neural network so that the network becomes more versatile.

It is for these systems that memory limits restrict functioning. In the beginning, we are unaware of how a humanoid will learn things and at what pace, and even if we gave it the highest-capacity memory chips available, that would not suffice. The problem is that humanoids have to mimic humans, and we human beings have literally got an infinite amount of memory. Our brains are so powerful and our memory capacity so vast that no humanoid can even think of matching it (at least for the next decade or so). Although we use only a limited proportion of our brain, we are sure that we will never run out of memory. But that is not the case with our humanoid. It has to build new neural networks and expand the ones it possesses. So once it starts learning, it has to learn a lot of things and retain most of what it learns. The humanoids built to date started learning at exponential rates until they either had to shut down for lack of memory or finished learning what they intended to learn.

All research humanoids were started with minimal knowledge, and as they began interacting with the world, they started learning new things. But the problem is that the algorithms are not good at telling them when to stop. As a result, they learn a lot of things, keep generating more and more networks, and keep expanding their existing ones. So although they begin learning at a good rate, eventually they always fall short of memory. This happens because we human beings know our limits, but the humanoids don't. They fall short of both primary memory and permanent storage. As they expand their neural networks, they have to retain the existing networks in current memory, and they also have to keep using the updated networks if they are to continue doing the job they are learning. Hence the portion of the neural network that must be retained in memory crosses the threshold. Moreover, humanoids are inherently multitasking, and therefore they have to keep multiple neural networks in memory while solving problems.

There have been a few solutions to the problem of limited primary memory. Modified algorithms can help a humanoid decide which portion of the neural network to keep in current memory. But even then, we are eventually going to reach a bound.

The second problem is that of permanent storage. As the humanoids keep learning new things, they have to store the enlarged and new neural networks so that this acquired knowledge is available for future use. As a result, they have to keep storing the knowledge they acquire, and so, with time, the permanent information held by every humanoid also increases. Imagine how much knowledge a humanoid would be holding.

Let's try to guess at the magnitude of information we are talking about. If a humanoid has to learn to make coffee, it must have knowledge of how to see things and how to distinguish coffee and its assisting ingredients from the other objects in the world. Then the humanoid must also know how to make coffee itself. The problem of recognizing objects in the world is, by itself, a big one. Learning by example is the phenomenon used for objects: if the humanoid has seen a coffee jar, it will store it in terms of the height, weight and other visual and physical aspects of the jar. But if another jar is a bit different, it will have to add information and refine its existing class of coffee jars to include it. So, with time, the humanoid will refine its classes of objects by adding more information to them, and it will also define new procedures for the new things it learns. All these things will be stored in different forms: the classes will be stored as sets of facts, whereas the procedures will be stored as sets of steps. These forms are either entirely new or variations of the neural networks we discussed. Irrespective of the form in which this information is stored, the amount of memory needed is humongous. And mind you, the amount of memory we are talking about here runs into thousands of petabytes if the humanoid were to learn and retain most things.

So, is it possible to put that much memory in a robot that looks like a human? Not in the current times, that's for sure. But a modification of the client-server architecture can be used to regularly transfer tonnes of information from the humanoid to some remote storage where that much memory is available. Of course, given the network bandwidths of the current times, a single transfer would take a considerable amount of time, but we have no other option as of now. The problem here arises when the humanoid has to perform some action and knows it has the knowledge for it. If the neural network, or the portion of it needed for solving the problem, is within local storage (storage inside the humanoid), then it's okay. But otherwise, it will have to access the remote repository where it stores all the knowledge it gathers, and in that case, imagine the time it will have to wait before all the needed information becomes available and it can start acting.

So where's the catch? Well, the conventional methods, and modifications of them, don't seem to offer any viable solution to this problem. But a solution does exist: incorporating self-awareness into AI systems.

As the term suggests, self-awareness means that the humanoid, or the AI system, becomes aware of its own existence and of its own limits. Obviously the system already knows what memory and processing capabilities it has, but here the emphasis is on being aware of how much it is capable of learning, just like we human beings are. Left to itself, every humanoid will keep learning at an exponential rate: as it encounters new problems, it will gather more and more knowledge through its interaction with the world.
But since it knows about itself, a self-aware humanoid will keep deleting obsolete and transient knowledge over time, and it will also learn only a portion of what it would have learnt otherwise. The learning-by-example method would have had it classify coffee jars exhaustively, and that was an effective means of learning; but now that it is self-aware, it will include only those few aspects it considers necessary, keeping its memory limits in mind. This is analogous to a student who, while reading a chapter, makes a note of the more important things and emphasizes them. Likewise, the self-aware humanoid will grasp only the important aspects and store only those. Later, while attempting to solve the problem, if it fails, it tries to grasp the other things it believes it missed out on in the first attempt.

Hence, the system which previously used to gain a lot of knowledge in the first attempt, and was then sure it would be able to solve the problem when it encountered it again, now tries its luck a little and gains only partial knowledge. As this humanoid fails again and again, it keeps improving its knowledge base. Eventually, when the success rate goes above a threshold, it knows that it has gained enough knowledge and stops adding more. This is the essence of self-awareness: the system should know when to stop, and that is why the threshold values must be chosen very carefully. The robot thus begins to learn the way human beings do. Every human being tries a thing two or three times before he/she starts to succeed, and that is how the humanoid would now work. Another aspect is that, with time, the humanoid becomes aware of what its skills are, and it will be able to guarantee some success in those domains. It will keep refining its knowledge base by adding new knowledge and discarding the unused and obsolete. The effectiveness will be somewhat reduced: if there was a problem it solved way back in the past, it might take a lot of time to solve it again, because the corresponding neural net was deleted and the problem has to be solved from scratch. But the system will need much less memory than before.
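The try-fail-learn loop with a success-rate threshold can be sketched as follows. The idea that a task succeeds once all of its "aspects" are known, the aspect names, and the 0.8 threshold are all illustrative assumptions, not a real learning algorithm:

```python
# Sketch of threshold-driven learning: attempt a task repeatedly, learn one
# missed aspect per failure, and stop learning once the running success rate
# crosses the chosen threshold.

def self_aware_learn(required, threshold=0.8, max_attempts=50):
    known = set()
    successes = attempts = 0
    while attempts < max_attempts:
        attempts += 1
        if required <= known:                   # enough knowledge: success
            successes += 1
        else:                                   # failure: grasp one more aspect
            known.add(sorted(required - known)[0])
        if successes / attempts >= threshold:
            break                               # self-aware stop: it knows enough
    return known, successes / attempts

known, rate = self_aware_learn({"grind beans", "boil water", "pour"})
print(sorted(known), rate)
```

The first few attempts fail while the humanoid picks up one aspect at a time; after that, successes accumulate until the rate crosses the threshold and learning stops. Choosing the threshold poorly either stops learning too early or never stops it at all, which is the point made above about picking it carefully.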

This concept cannot be used in places where the success rate is critical, but it can be used in humanoids that mimic the life of a regular human being who is in a training phase. Even after becoming self-aware, the system will still need a little more help from technical advancements, because even with these mechanisms, the amount of permanent information needed would be difficult to fit into a machine the size of a human being.

At the end, I have to tell you that this post is by no means exhaustive. It's just a small snippet of a big, big research area. A fully exhaustive post would have taken at least 200 pages, which is why this one is a seriously scaled-down version. I just wanted to share these enticing insights with you and have you share that exhilarating imagination with me. That was the sole purpose behind putting up this post.

Thanks for your patience.

Thursday, 20 October 2011

Building a smarter planet(The quest for information)

Every one of us is surrounded by a pool of information-emitting entities. Perhaps this is the first time you have heard that term, but in reality such entities have been there ever since the inception of this planet. Consider our bodies, or the body of any living being. As a whole, we don't seem to emit any information apart from the regular biochemical excretions and lingual/non-lingual communication. But, surprisingly, that very body emits a whole lot of information every now and then, and the EEG and ECG are classical examples of how to capture that information and put it to some useful purpose.


So, now that you have a brief idea of what this post is all about, let's get to the main point directly. Information, as we see it, is some useful knowledge, but therein lies the flaw in the definition that we follow. What is considered useless today can become a very useful bit of information tomorrow. The EEG and ECG, for example, would have appeared to be totally nonsensical things to the doctors of the medieval era. Hence, our definition of information almost always prevents us from seeing the real picture, and it almost always makes us miss out on the potential, yet untouched, aspects. Let's get to a real example now.

Every computer is composed of components like the RAM, MoBo (motherboard), chipset, processors, buses, cooling units, HDDs, etc. All of these components add together to form the computer as a whole. Now, there are two types of signals associated with these devices. One is the digital signals which the devices use to communicate with each other via the buses; the other is the electrical supply used to run the individual components. A lot of emphasis has been laid on how the buses should be organised and how the overall architecture has to be designed, all to make the digital signals travel faster and work more effectively than before. Most of the innovation went into the improvement of the buses and communication interfaces, because it is these very things that shape the speed and response time of a computer, and there has indeed been tremendous improvement in these aspects. The device interfaces progressed from ATA/IDE to SATA, and the bus specifications improved from SCSI to USB to the upcoming Light Peak. But the SMPS, the component that supplies electricity to all the devices, hasn't seen a lot of improvement, and as of now there is very little hope that it will.

Why, one may ask? Well, once the SMPS reached a stage where it seemed to be doing what it was intended to do, designers concluded that it did not need any further improvements. The only improvements added later on were to make it comply with the latest bus and communication interface specifications. But these tweaks to voltage and current specifications don't constitute a breakthrough. And there could indeed have been a breakthrough improvement that we missed out on.

Every time your computer breaks down, there is either some component or some particular sub-component (a resistor, capacitor, etc.) that needs to be replaced. This happens when an incompatible device is connected, when a faulty device is connected, when some jumper setting goes wrong, or even when there is an internal power surge. The reason these components or sub-components blow up is that some component got more electricity than it needed, and this extra electricity often flows through the supply wires of the SMPS. Now, the SMPS is based on fixed logic, so it simply knows how much pre-specified voltage or current has to be passed through a certain wire. The transformers and other cut-out mechanisms inside the SMPS help ensure that, whatever the external voltage, the voltage it supplies is what the specifications dictate.

So, where do they miss the trick? Well, if all the voltages and currents are already within specification, then why do the components blow? The SMPS is responsible only for the power that is supplied to the MoBo and peripherals; after that, the MoBo distributes the power to the buses and the internal circuitry. The reason the current exceeds the limits, at times, is that either non-compliant components are connected, or a particular device is (or becomes) faulty and draws more than what was needed. Now, the SMPS is unaware of the actually connected devices, whereas the MoBo can get a sound knowledge of what each device actually is. If the SMPS and the MoBo were configured to exchange even a minimal amount of information, the MoBo could use some low-power signal (driven by CMOS battery power) to find out the internal configuration before the actual boot-up. This low-power signal would simply ask the individual components for their interface-related information. Hence, by the time the system is ready to boot, the MoBo already has that information, along with information regarding its own specifications. Then, even with simplistic logic inside the SMPS, this information could be used to work out the voltages and currents that have to be transmitted through every outlet cable of the SMPS.

So, what's the real deal? Well, if the SMPS has detailed knowledge of what has to be transferred, it can either change its effective capacitance or resistance to provide exactly that value, or it can simply cut off the supply to prevent damage to some component(s). Compare this with the previous situation. The former SMPS knew only how to supply some fixed voltage and current across its wires, whereas our new SMPS is aware of the overall computer configuration and can change its internal configuration to ensure that the voltages it supplies do not blow away any components. The previously static SMPS with very limited knowledge becomes a smart SMPS that knows a lot about the computer system and can adapt itself accordingly. With just a little bit of knowledge about how the system is configured, the SMPS and MoBo will be able to ensure that the computer system never breaks down. This is just a matter of harnessing some information, and harnessing it correctly. Although the computer BIOS does get updated automatically when the configuration changes, that update takes place during boot-up, so if any device is wrongly configured, or a non-complying device is connected, it will blow right there. The suggested method, however, acts like a system diagnosis even before the system actually starts, and hence prevents any faulty configuration from running. A static computer system thus becomes a dynamic one that can adjust itself according to the different h/w connected to it.
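A toy sketch of that pre-boot negotiation is below, under the assumption that each device can report which power rail it uses and its rated current. The rail names, capacities and device figures are invented for illustration; real power-delivery schemes (ATX supplies, USB Power Delivery and the like) work quite differently:

```python
# Sketch of the MoBo/SMPS negotiation: a low-power probe phase collects
# each device's demand, then the supply either provisions exactly that
# much per rail or cuts the rail off rather than blow a component.

SMPS_RAIL_CAPACITY = {"12V": 18.0, "5V": 20.0}   # max current per rail, in amps

def probe_devices(devices):
    """Probe phase: total up the rated current demanded on each rail."""
    demand = {}
    for dev in devices:
        demand[dev["rail"]] = demand.get(dev["rail"], 0.0) + dev["rated_amps"]
    return demand

def configure_supply(demand):
    """Set per-rail current budgets, or cut off a rail that is over capacity."""
    plan = {}
    for rail, amps in demand.items():
        if amps <= SMPS_RAIL_CAPACITY.get(rail, 0.0):
            plan[rail] = amps          # supply exactly what the devices need
        else:
            plan[rail] = 0.0           # cut off rather than damage a component
    return plan

devices = [
    {"name": "HDD", "rail": "12V", "rated_amps": 2.0},
    {"name": "GPU", "rail": "12V", "rated_amps": 10.0},
    {"name": "SSD", "rail": "5V",  "rated_amps": 1.5},
]
plan = configure_supply(probe_devices(devices))
print(plan)   # {'12V': 12.0, '5V': 1.5}
```

The key design point is that the decision happens before any real power flows: a non-compliant or over-demanding device shows up in the probe phase and results in a cut-off, not a blown component.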

Now, there is no doubt that costs will go up with the addition of this extra logic, but won't a user be willing to pay a little extra for a computer system that is as infallible as it can get? Even this computer system can fail when there is a problem with the initial logic or the SMPS itself, but the individual components, and more importantly the data, will stay safe. In fact, a few IBM laptops even have a BIOS presetting feature that runs solely on CMOS battery power. But the idea suggested here is much more effective.

So we have had an example of how information which was always there, but was always unattended, can be used to make an "invincible" computer system. In fact, this was just one of several ideas. Some universities, and the R&D departments of organizations like IBM, have already come up with a whole list of such things that they are working on. Some of these are:

1. Tracking every piece of medicine as it goes from manufacturing units to inventories to supply chains and finally to the stores. In this way, information about the medicine's lifetime can be used to counter adulteration and the repackaging of old medicines.


2. Collecting information (EEG and ECG patterns, breathing rate, temperature variations, movements, growth, and some miniature signals emitted by the body) for a newborn baby, and combining it with information collected from his/her DNA to find out the potential for any future diseases or abnormalities.

3. Making the electricity supply of metropolitan areas smarter by having every grid and every transformer keep a local computer informed about its current state. In this case, if any grid or transformer crosses its limits, or senses that it is about to, it can either shut down to prevent a total breakdown or ask the computer to update the configuration by balancing the load. All such local computers will connect to a central power distribution network that may be regulated by humans or by some other powerful computer. In this way, all the systems will remain up most of the time and potential breakdowns can be prevented. In fact, these computers don't need to be complete computers; they can be minimized, specialized versions of a full-fledged computer.
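The load-balancing behaviour described in point 3 can be sketched as follows; the transformer names and capacity figures are purely illustrative:

```python
# Toy sketch of the local controller: each transformer reports its load,
# and the controller shifts any excess onto transformers with spare
# capacity, shutting a unit down only if no spare capacity exists.

def rebalance(transformers):
    """transformers: name -> {'load': kW, 'capacity': kW}. Mutates in place."""
    for name, t in transformers.items():
        excess = t["load"] - t["capacity"]
        if excess <= 0:
            continue                       # this unit is within its limits
        for other, o in transformers.items():
            if other == name:
                continue
            spare = o["capacity"] - o["load"]
            if spare <= 0:
                continue
            shifted = min(excess, spare)   # move what the other unit can absorb
            o["load"] += shifted
            t["load"] -= shifted
            excess -= shifted
            if excess <= 0:
                break
        if excess > 0:
            t["load"] = 0.0                # no spare capacity anywhere: shut down

grid = {
    "T1": {"load": 120.0, "capacity": 100.0},
    "T2": {"load": 60.0,  "capacity": 100.0},
}
rebalance(grid)
print(grid["T1"]["load"], grid["T2"]["load"])  # 100.0 80.0
```

A real grid controller would of course account for line topology and transient behaviour; the sketch only captures the report-then-rebalance-or-shut-down decision described above.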


This is just one list, but in reality we can take information out of everything that we come across. Of course, the implications of the use to which that information will be put are very important, but if we start looking at the world from an entirely different perspective, then most of our problems can get solved. It's just a matter of "Thinking Differently".


Monday, 17 October 2011

Artificial Intelligence: The Unforeseen Consequences

The simplest definition of artificial intelligence, or AI, is that it is a science that tends to make computers behave and act in the way human beings do, and it has been this very definition that has attracted various scientists and engineers from around the world to work in this domain. AI, ever since its inception in the '50s, when the first thoughts of developing such systems were conceived, has been a very fascinating domain of study, one considered very different from the others because of the very approach followed to model AI systems. AI systems are different from normal ones in that they approach a solution in the way we human beings do, whereas conventional computing systems approach a solution in a rather rigid and procedural way. Whereas conventional systems can solve only those problems they were coded to solve, recently developed AI systems can generate theorems and prove them. It is this very aspect of AI systems that has earned them a separate place in the world of computer science.

To most readers, AI systems primarily mean robots, as this is what has always been highlighted. But there is a lot more to AI than just robots. The umbrella of AI covers expert systems, theorem provers, artificially intelligent assembly lines, knowledge base systems and a lot more. Although all these systems have varying architectures and very different characteristics, there is one thing that ties them all together: their ability to learn from their mistakes. AI systems have been programmed to find out whether their attempt at doing something resulted in a success or a failure, and they have been further designed to learn from their failures and use this knowledge in future attempts to solve the same problem. A real-life example of this was when the IBM computer Deep Blue, which was programmed to play chess, beat the then world chess champion Garry Kasparov in 1997. Deep Blue actually lost its earlier matches, played against people who knew Kasparov's moves, but it gradually got to know which moves were favorable and which were not, and it used this knowledge to beat Kasparov in the actual match-up. It has been this very trait that has made designing AI systems both difficult and challenging.

Computer scientists may argue with the next point that I am going to put forward, but it is something that has always concerned ethical thinkers and some other people from the science background. Although AI promises to do a whole lot of good for the human race, it brings a risk with its massive-scale implementation. AI systems on one hand can help our race by managing knowledge for us, exploring new scientific concepts, and assisting us in our day-to-day jobs, among a whole host of other things. But on the other hand, they pose a threat to our own existence. As pointed out in the articles of Hubert Dreyfus and John Sutton of the University of California, Berkeley, the rate at which the capabilities of AI systems are increasing can be dreadful. According to them, we are not very far from the day when AI systems will become better than human beings at performing almost any task. We already have AI systems that perform not only more efficiently but also more effectively than human beings in various fields. Such fields are currently limited to analytical reasoning, concept exploration, logical inference, optimization and concept proving. At this point in time this list may seem a bit restricted and may not bother a lot of people, but the next generation of AI systems, designed for particular domains, will expand it in a very big way. In the near future we are going to see systems capable of programming a system on the basis of pure structured logic, systems able to replace doctors in a few critical surgeries where doctors haven't been very successful, and systems able to do space exploration on their own. In fact, such systems have already been implemented, but they were assisted by human beings at some point. Now one might ask why such systems were not developed in the past, when they were first thought to be developable.
The answer is that certain hardware characteristics of such systems proved to be the bottleneck. The above-mentioned systems need very high processing power to support run-time reasoning, decision making and logic design, and they also need a very large memory to support the massive amounts of information that such systems have to process. They also need large storage so that they can retain whatever they have learnt. Until a few years ago, the available processing power and memory were nowhere near what is actually required to build such systems. But now, with the inception of multi-core processors and recent breakthroughs in memory technology, both the processing power and the amount of memory available per unit chip space have gone up. As a result, we are finally able to see such systems coming into action.

Now, with such systems coming into action, we can expect them to be actually used in the field three to four years from now, and going by past experience with similar trials and the advent of similar systems, they will indeed outperform human beings in the fields in which they replace them. And if this turns out to be the case, we are going to face the biggest problem we have faced to date: massive-scale unemployment. Managers, who are always hungry for more efficiency and more effectiveness without many demands, are going to be the first who will prefer such systems over human beings. They will get what they always wanted, and they will stay happy until the day they themselves are replaced by such systems on the orders of still-higher-level managers. The whole hierarchy of workflow will then comprise AI systems. This may seem a distant reality, but going by the predictions it may actually happen. The sales of organizations will indeed go up. Companies will make profits higher than they had ever expected, but on the other hand governments will be struggling to cope with all-time-high unemployment figures. The nations that manage this surge by passing appropriate regulations will be the ones that eventually sustain themselves, and the ones that fail to do so will be dragged into a state where the economy is at its peak but society is at an all-time low. The whole balance of such nations will be disrupted and the overall administration will become total chaos. Planners will be clueless, as they will be confronted with something they have never faced before, and leaders will be clueless, as they will have no one to assist them in decision making. In short, the whole world may head towards an irrecoverable disaster.
As of now, when we haven't seen such systems yet, this all may seem a bit far-fetched, but then ask your grandma how she felt when she saw the television for the first time.