MLAs for Intrusion Detection

PCQ Bureau
Intrusion detection is a major concern in information security, as it aims to detect exploitation of system vulnerabilities by known legitimate users or by unknown illegitimate users. If the exploited system belongs to the financial sector or to any mission-critical organization, the consequences for that organization can be catastrophic. A well-known instance of masquerader activity dates back to Nov 2002, when 30,000 credit histories were stolen by a helpdesk employee and sold to hackers, who logged in as legitimate users and downloaded the credit reports. The information in those reports was used to withdraw money from bank accounts and to carry out illegal transactions using the victims' cards. This is considered one of the most alarming cases in US history, and it caused major losses to a large number of users. Whenever an illegitimate user logs into a system, that user has the potential to inflict tremendous economic loss on the organization.

Direct Hit!
Applies To: IT managers
USP: Develop intrusion detection solutions using Open Source software
Primary Link: None
Google Keywords: Intrusion detection

Earlier research on intrusion detection applied statistical distance-based methods such as Euclidean Distance, Mahalanobis Distance, Manhattan Distance, the Canberra Metric and the Czekanowski Coefficient. Probabilistic techniques were then applied to detect intrusions using data collected offline, and Finite Automata models were also built for detection. However, these methods suffer from low detection accuracy and a low detection rate. More recently, Neural Networks, Fuzzy Logic, Genetic Algorithms and their combinations have been applied to data collected both online and offline, and detection accuracy has improved compared with the earlier techniques.
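As a rough illustration of how some of these distance measures compare two command-frequency vectors, here is a minimal Python sketch. The function names and sample vectors are hypothetical, and the Czekanowski Coefficient is shown in its dissimilarity form (one minus the similarity), which is an assumption about how it would be used for detection.

```python
import math

def euclidean(x, y):
    # Square root of the summed squared differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    # Sum of absolute differences.
    return sum(abs(a - b) for a, b in zip(x, y))

def canberra(x, y):
    # Sum of |a - b| / (|a| + |b|); terms where both values are zero are skipped.
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom:
            total += abs(a - b) / denom
    return total

def czekanowski(x, y):
    # Dissimilarity form for non-negative vectors:
    # 1 - 2 * sum(min(a, b)) / sum(a + b).
    denom = sum(a + b for a, b in zip(x, y))
    if denom == 0:
        return 0.0
    return 1.0 - 2.0 * sum(min(a, b) for a, b in zip(x, y)) / denom

# Hypothetical command counts for a stored profile and a test block.
profile = [3, 1, 0]
block = [1, 1, 2]
```

A larger value from any of these functions means the test block looks less like the stored profile.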

Intrusions can be divided into eight basic categories: Eavesdropping and Packet Sniffing; Snooping and Downloading; Tampering or Data Diddling; Spoofing; Jamming or Flooding; Masquerading; Exploiting Vulnerabilities; and Cracking Passwords and Keys. Many techniques have been developed to test for these intrusions. One benchmark was developed by Matthias Schonlau, who collected Unix command data from 50 users; it is used to evaluate IDSs that work on command sequences. Six methods evaluated on this benchmark, viz Uniqueness, Bayes One-step Markov, Hybrid Multi-step Markov, Compression, Sequence-Match and Incremental Probabilistic Action Modeling, showed very low false positives, but their missing alarms (false negatives) fell in the range of 30-60%. Roy A Maxion extended Schonlau's work by testing the hypothesis that enriched command lines improve detection, and achieved a detection level of 82%.

Snort is an Open Source network intrusion prevention and detection system that uses a rule-driven language and combines the benefits of signature, protocol and anomaly-based inspection methods. Together with Apache, MySQL, PHP and ACID (Analysis Console for Intrusion Databases), it forms an advanced IDS. It can work in three modes: Sniffer, Packet Logger and Intrusion Detection. Using Snort, data can be collected from various parts of a network and intruders can be detected.

But when such a detection system is examined in real life, it does not produce appreciable results. The reason is that these algorithms work well only with clean training and testing instances that contain little noise. So the information security world needs novel intrusion detection algorithms to achieve excellent detection accuracy.

Masquerader detection



The aim of this research is to apply the machine learning algorithms most appropriate for classifying proper and improper usage of resources, and thereby to detect improper users, called masqueraders. Masqueraders are detected using two methods: the first uses variations in the probability distributions of system usage, and the second uses a machine learning algorithm.

The audit source for our experiment is the enriched and truncated command sequences, which are processed to detect masqueraders. For example:

Enriched command: purge rm -i -i; clear ; /bin/ls -al | more
Truncated command: purge

Naïve Bayes Classifier





The Naïve Bayes Classifier is a supervised learning algorithm with a very high success rate in text classification and information retrieval. It is based on Bayes' rule. The posterior probability p(u | cs) of user u given a command sequence cs is:

$$p(u \mid cs) = \frac{p(u)\, p(cs \mid u)}{p(cs)} \qquad \text{Eq (1)}$$

where p(u) is the prior probability of user u, p(cs | u) is the probability that the command sequence cs was generated by user u, and p(cs) is the probability of occurrence of the command sequence cs.
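To make Eq (1) concrete, here is a small Python sketch that computes the posterior for each user. The user names and probability values are made up for illustration, and p(cs) is obtained by summing p(u) p(cs | u) over all users (the law of total probability).

```python
def posterior(priors, likelihoods):
    # p(cs) = sum over users of p(u) * p(cs | u)
    evidence = sum(priors[u] * likelihoods[u] for u in priors)
    # Bayes' rule for each user: p(u | cs) = p(u) * p(cs | u) / p(cs)
    return {u: priors[u] * likelihoods[u] / evidence for u in priors}

# Hypothetical values: two equally likely users, and the probability that
# each of them would generate the observed command sequence.
priors = {"user_a": 0.5, "user_b": 0.5}
likelihoods = {"user_a": 0.02, "user_b": 0.005}
post = posterior(priors, likelihoods)
```

With these numbers the sequence is four times more likely under user_a, so the posterior favors user_a by the same factor.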

The approach uses four phases for detection and comparison. The first phase is data preprocessing: the enriched command sequences are filtered to remove their arguments, and the resulting truncated commands are processed. In the second phase, learning is done using Naïve Bayes Classifiers, which use the multi-variate Bernoulli method to model command sequences. In the third phase, a probabilistic approach is applied, using the Euclidean distance measure to model sequences. The fourth phase compares the results.
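The third phase can be pictured with the following Python sketch, which turns a user's training commands and a test block into relative-frequency distributions and measures the Euclidean distance between them. The function names are hypothetical, and the use of relative command frequencies as the probability distribution is an assumption; a decision threshold on the distance would have to be chosen separately.

```python
import math
from collections import Counter

def command_distribution(commands, vocabulary):
    # Relative frequency of each vocabulary command in the given list.
    counts = Counter(commands)
    total = len(commands)
    return [counts[c] / total if total else 0.0 for c in vocabulary]

def profile_distance(train_commands, test_block):
    # Build both distributions over a shared vocabulary, then compare them.
    vocabulary = sorted(set(train_commands) | set(test_block))
    p = command_distribution(train_commands, vocabulary)
    q = command_distribution(test_block, vocabulary)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
```

A block whose distance from the user's profile is large would be a candidate masquerader block.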

The data set is organized into blocks of 100 commands. The learning task here is to build a binary classifier.
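Splitting a user's command stream into blocks can be sketched as follows in Python. The function name is hypothetical, and dropping a trailing partial block is an assumption made here for simplicity.

```python
def split_into_blocks(commands, block_size=100):
    # Consecutive, non-overlapping blocks; a trailing partial block is dropped
    # (an assumption, not something the data set description specifies).
    last_start = len(commands) - block_size
    return [commands[i:i + block_size]
            for i in range(0, last_start + 1, block_size)]

# Hypothetical stream of 250 commands: yields two full blocks of 100.
stream = ["ls"] * 250
blocks = split_into_blocks(stream)
```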

The output of the binary classification can be stated as:

Classification = True, if not Masquerader
Classification = False, if Masquerader

The masquerader data were taken as positive examples and the legitimate user's data as negative examples. Here, a true positive is a masquerader block of 100 commands that is classified as a masquerader block. A false positive is a legitimate user's block that is misclassified as a masquerader block.

Probability of masquerader detection



In our experiment, only examples from the user's own (self) profile are used for training. We compute p(ci | u) only for the user's self profile. For non-self, we assume that each command has an equal random probability of 1/m. For a given test block d, p(d | self) and p(d | non-self) can then be compared: if the ratio of p(d | self) to p(d | non-self) is high, it is more likely that the command block d comes from user u.
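The self versus non-self comparison can be sketched in Python as a log-likelihood ratio. Several things here are assumptions rather than details from the experiment: m is taken to be the number of distinct commands in the vocabulary, add-one smoothing is used so that unseen commands do not get zero probability, each command occurrence is scored independently (a multinomial-style simplification, not necessarily the multi-variate Bernoulli model mentioned earlier), and the threshold of 0 is arbitrary. All names are hypothetical.

```python
import math
from collections import Counter

def train_self_profile(train_commands, vocabulary, alpha=1.0):
    # Smoothed estimate of p(c | u) for every command c in the vocabulary.
    counts = Counter(train_commands)
    total = len(train_commands) + alpha * len(vocabulary)
    return {c: (counts[c] + alpha) / total for c in vocabulary}

def log_likelihood_ratio(block, self_profile):
    # log p(d | self) - log p(d | non-self), where non-self gives every
    # command the same probability 1/m. Every command in the block is
    # assumed to be in the vocabulary used for training.
    m = len(self_profile)
    log_self = sum(math.log(self_profile[c]) for c in block)
    log_nonself = len(block) * math.log(1.0 / m)
    return log_self - log_nonself

def is_self(block, self_profile, threshold=0.0):
    # A ratio above the threshold suggests the block comes from user u.
    return log_likelihood_ratio(block, self_profile) > threshold

# Hypothetical data: a user who mostly runs "ls" and "cd".
vocabulary = ["ls", "cd", "rm", "gcc"]
profile = train_self_profile(["ls"] * 8 + ["cd"] * 2, vocabulary)
```

Working in log space avoids numerical underflow when the probabilities of 100 commands are multiplied together.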

Comparison results



Performance is measured in terms of false positives and false negatives. A false positive is a legitimate block that is incorrectly flagged as a masquerader block (a false alarm); the false positive rate is computed as the number of incorrect classifications divided by the sum of correct positives and correct negatives. A false negative is a missed alarm for an actual masquerader block; the false negative rate is computed as the number of incorrect negatives divided by the number of correct positives. The table shows that Naïve Bayes Classifiers perform well compared with the usual probability distribution model: the detection rate is higher and the false alarm rate is lower. Individual users are identified by the Naïve Bayes Classifiers.

Conclusion and future work



In our experiment, we tested the masquerade detection problem with two methods: Probability Distribution using the Euclidean Distance Measure, and Naïve Bayes Classifiers. We observed that Naïve Bayes Classifiers perform well in detecting masqueraders. This work can be extended by applying hybrid machine learning algorithms to detect the class of intruders, using data collected from a free/Open Source software parallel environment. We have also analyzed the sequences of system calls executed by privileged programs to detect intrusions based on the normal usage patterns of the system. The enriched and truncated command sequences are collected from the network and used to build a model of the user's normal behavior with Naïve Bayes Classifiers; the detection rate is very high, in line with that of the usual probability techniques.

Dr S Mercy Shalinie and Ms T Subbulakshmi, Dept of Computer Science and Engineering, Thiagarajar College of Engineering, Madurai
