EPSRC Reference: |
EP/K033344/1 |
Title: |
Mining the Network Behaviour of Bots |
Principal Investigator: |
Cavallaro, Professor L |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
Information Security |
Organisation: |
Royal Holloway, Univ of London |
Scheme: |
Standard Research |
Starts: |
16 June 2013 |
Ends: |
17 June 2017 |
Value (£): |
680,623 |
EPSRC Research Topic Classifications: |
Artificial Intelligence |
Networks & Distributed Systems |
Statistics & Appl. Probability |
|
|
EPSRC Industrial Sector Classifications: |
|
Related Grants: |
|
Panel History: |
Panel Date | Panel Name | Outcome |
20 Feb 2013 | EPSRC CEReS Feb 2013 | Announced |
|
Summary on Grant Application Form |
The botnet phenomenon has quickly become a major security concern for all
Internet users. Not only has it attracted wide attention in the mass media, but
it has also drawn the interest of the research community in understanding,
analyzing, and detecting bot-infected machines.
Once infected with a bot, the victim host joins a botnet, a network of
compromised machines that are under the control of a malicious entity. Botnets
are the primary means by which cyber-criminals carry out illicit tasks,
such as sending spam, launching denial-of-service attacks, or stealing
personal data (for example, mail accounts or bank credentials).
Clustering and correlating network events represent the state of the art when
it comes to detecting and understanding the botnet phenomenon from a network
perspective. While effective, such approaches rest on weak foundations: they are
vulnerable to easy-to-perform (time and network) obfuscation attacks.
The goal of this project is to build on the promising results of our previous
work and explore novel machine-learning techniques that make the state of the
art more accurate and more robust against evasion and advanced malware.
Exploring the capabilities of advanced malware (and thus bots) in order to
develop novel mathematical techniques that address such threats is not a mere
academic exercise. On the contrary, it is of paramount importance for building
robust and hard-to-evade mitigation approaches, something we currently lack, as
acknowledged by the research community at large.
On the cyber security side, we will develop techniques to analyze the network
traffic generated by a bot sample. Our analysis will focus on inferring the
interesting part of a bot's network behaviour to automatically generate models
that faithfully describe it. Our analysis aims to be independent of the
underlying botnet infrastructure, payload-agnostic, and able to pinpoint
malicious activities that resemble legitimate ones. The network flows of a monitored
bot will be initially filtered to remove well-defined attack patterns. The
remaining flows will be clustered using a number of network features and
suitable similarity functions. Clusters whose size exceeds a given threshold
will then be analyzed for periodicity: bots tend to engage in similar network
activities whose inter-flow intervals are either sampled independently
from a potentially unknown probability distribution, or belong to a small
number of well-defined clusters. Once clusters exhibiting interesting
periodicity patterns are identified, they can be used, along with their network
features, for detecting (or understanding the behaviour of) bots in a mixed
population containing both compromised and clean hosts.
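The periodicity step described above can be illustrated with a minimal sketch.
It is not the project's actual method: the function names, the gap-based
grouping rule, and the `tolerance`, `max_groups`, and `min_flows` parameters are
all hypothetical choices made for illustration. The sketch checks only the second
case mentioned in the text, namely whether the inter-flow intervals of one flow
cluster fall into a small number of tight groups.

```python
from typing import List


def interflow_intervals(timestamps: List[float]) -> List[float]:
    """Return the gaps (in seconds) between consecutive flow start times."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def group_intervals(intervals: List[float], tolerance: float = 1.0) -> List[List[float]]:
    """Group intervals by sorting them and starting a new group whenever the
    gap to the previous sorted value exceeds `tolerance` (assumed threshold)."""
    groups: List[List[float]] = []
    for value in sorted(intervals):
        if groups and value - groups[-1][-1] <= tolerance:
            groups[-1].append(value)
        else:
            groups.append([value])
    return groups


def looks_periodic(timestamps: List[float], tolerance: float = 1.0,
                   max_groups: int = 2, min_flows: int = 5) -> bool:
    """Flag a flow cluster whose intervals form at most `max_groups` groups.
    Clusters with fewer than `min_flows` flows are never flagged."""
    if len(timestamps) < min_flows:
        return False
    groups = group_intervals(interflow_intervals(timestamps), tolerance)
    return len(groups) <= max_groups


# Hypothetical data: a beacon-like cluster (roughly one flow per minute)
# and an irregular cluster.
beacon = [0.0, 60.2, 120.1, 180.3, 240.0, 300.1]
irregular = [0.0, 3.0, 50.0, 51.0, 200.0, 420.0]
print(looks_periodic(beacon))     # True
print(looks_periodic(irregular))  # False
```

A fixed tolerance is the simplest possible grouping rule; a time-obfuscating
bot could defeat it by adding jitter larger than the tolerance, which is the kind
of evasion the project aims to make harder.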
On the machine learning side, we propose to explore the use of conformal
prediction, developed by our team, to make such cluster-based analysis more
accurate and robust against arbitrary obfuscation-based evasion attacks. A
powerful approach to clustering is based on nonparametric probability density
estimation. Recent work proposes a computationally efficient method of
nonparametric density estimation that is based on conformal prediction and
inherits its validity properties. We plan to explore the use of this method for
robust clustering. A theoretical challenge is to spell out and study the
robustness properties that this clustering method inherits from the validity of
the density estimation method mentioned above. In addition, the validity of
conformal predictors is usually established under the randomness assumption; we
will explore how this assumption can be relaxed. Finally, the property of
validity can be used to control the number of "alarms" (predictions that a host
is compromised) raised by a bot detection algorithm.
This is valuable in situations where alarms have to be investigated by human
experts but the available manpower is limited.
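How validity can bound the alarm rate may be easier to see in a small sketch of
conformal anomaly detection. This is a generic textbook-style construction, not
the detector proposed in the project: the nonconformity scores, the function
names, and the `epsilon` parameter are assumptions made for illustration. Given
scores from hosts known to be clean, a new host receives a p-value, and an alarm
is raised only when that p-value is at most `epsilon`. If the new host is clean
and its score is exchangeable with the calibration scores, the probability of a
false alarm is at most `epsilon`.

```python
from typing import List


def conformal_p_value(calibration_scores: List[float], score: float) -> float:
    """Fraction of scores (calibration scores plus the new one) that are
    at least as large as the new host's nonconformity score."""
    at_least_as_strange = sum(1 for s in calibration_scores if s >= score)
    return (at_least_as_strange + 1) / (len(calibration_scores) + 1)


def raise_alarm(calibration_scores: List[float], score: float,
                epsilon: float = 0.05) -> bool:
    """Raise an alarm when the p-value does not exceed the significance
    level `epsilon` (an assumed, analyst-chosen alarm budget)."""
    return conformal_p_value(calibration_scores, score) <= epsilon


# Hypothetical nonconformity scores for 99 clean hosts: 1.0, 2.0, ..., 99.0.
clean_scores = [float(i) for i in range(1, 100)]
print(conformal_p_value(clean_scores, 100.0))  # 0.01
print(raise_alarm(clean_scores, 100.0))        # True
print(raise_alarm(clean_scores, 50.0))         # False
```

Lowering `epsilon` reduces the expected number of alarms on clean hosts, which is
how an analyst team with limited capacity could size its workload.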
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Potential use in non-academic contexts |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Impacts |
Description |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
Summary |
|
Date Materialised |
|
|
Sectors submitted by the Researcher |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Project URL: |
|
Further Information: |
|
Organisation Website: |
|