Tuesday 12 June 2012

Using AI to Combat Cyber Crime

Artificial Intelligence has, for quite some time now, been used for combating credit card fraud. Data Mining, which is in a way an application of AI, is used to detect credit card frauds by using various mechanisms. In the most general scenario, a pattern of the user's credit card usage is drawn by making use of his credit card transaction records and all the future transactions are inserted in the pattern only after conforming to this pattern itself.Whenever a transaction or a pair of transactions that violates the pattern is noticed, the system prompts the surveillance personnel to check in. Then its upon the discretion of the personnel to see if the transactions are to be investigated or to be entered in the system and inculcated in the ever changing pattern.In a more advanced form,the normal credit card usage pattern of the user is used along with a pattern of the usage seen in the credit cards of the whole group to which the user belongs. This group may be created on the basis of income,credit card category(gold,silver etc.) or even the company to which the user belongs.This scheme is more robust and more resistant against single high value transactions that may appear to drift away from the pattern but are actually genuine.

The above approaches have been quite effective in combating credit cards frauds to some extent, and as a result, agencies all over the world have started looking at AI for combating other forms of electronic/cyber crime.They sought to AI because of the fact that due to the humongous number of transactions, its utterly impossible to employ humans to track movement over the internet. They need a machine to do that and in fact they need a machine that's smart enough to match the wits of a human expert.The intelligence may either be embedded in the individual application servers, just like spam filters used by mail servers or the intelligence may be implemented at the firewalls at the gateways. The advantage with embedding it into the individual servers is that the logic related to the specific application can be embedded. E.g a traffic pattern may be acceptable if destined for mail server but not for some office application server.In fact the best approach is to divide the intelligence amongst the two places. General intelligence is embedded at the firewalls and the application specific intelligence is embedded in the individual servers.

The general model suggests that some traffic analysis technique be used. This technique would differ according to the networks. Traffic could be analyzed at one or all levels. Either only the datagarm traffic could be analyzed or the ip level traffic or both. The traffic is again matched with the general pattern of traffic just like pattern matching in credit card fraud detection. At the firewalls, the overall traffic pattern is analyzed, and at the individual servers, the application level and session level traffic is analyzed. At the application level, once again two patterns could be used - a user pattern and a group pattern. At the firewalls however, a single pattern has to be used.In fact, the system may keep different patterns for different days or different times instead of a single pattern, and these different patterns may then be used accordingly.Like every cognitive learning mechanism, these patterns would also improve with time. The system would match actual pattern with the stored pattern and also keep changing according to the patterns that it analyzes. For example, if the system reported an anomaly and the network admin thinks its normal traffic, the system would inculcate this in the traffic pattern model and would improve itself. Hence, with time, the system will become more and more effective.