EPSRC Reference: |
GR/T04298/01 |
Title: |
Multi-Objective Meta-Heuristic Algorithms for Finding Interesting Rules in Large Complex Databases |
Principal Investigator: |
de la Iglesia, Dr B |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
Computing Sciences |
Organisation: |
University of East Anglia |
Scheme: |
First Grant Scheme Pre-FEC |
Starts: |
01 September 2004 |
Ends: |
31 August 2007 |
Value (£): |
126,751
|
EPSRC Research Topic Classifications: |
Information & Knowledge Mgmt |
|
|
EPSRC Industrial Sector Classifications: |
|
Related Grants: |
|
Panel History: |
|
Summary on Grant Application Form |
We propose a novel algorithm for the extraction of partial classification rules in large complex databases. The novelty in this proposal comes from the merging of a number of developments in data mining and optimisation to produce an efficient algorithm that can find a set of interesting rules in a large database in the presence of high levels of uncertainty, while utilising the existence of concept relationships. This research will be particularly applicable to medical databases, where existing algorithms often do not perform adequately due to the complexity and uncertainty of the data.
The new algorithm will be based on the use of multi-objective optimisation techniques. Multi-objective heuristic algorithms have become efficient problem solvers. They can be applied to the problem of rule induction by considering measures of interest of the rules as the measures to be optimised. We can perform a search for interesting partial classification rules containing numerical or nominal attributes. Most existing algorithms do not address the extraction of rules containing numerical attributes adequately due to the size of the search space. Research into measures of interest also has to be considered and advanced to deliver a set of rules that are interesting in themselves but also in the context of other rules in the set. A strong emphasis on scalability will mean incorporation of Feature Subset Selection (FSS) and sampling mechanisms. Adaptation to cope with missing or uncertain data and with complex data described by an ontology will provide further functionality. The incorporation of all these concepts into an efficient and effective algorithm presents many research challenges, including the interactions between the different algorithm components. The final product should allow databases containing many features (including some extracted from text data, e.g. medical reports) with high levels of uncertainty to be analysed efficiently to discover sets of individually interesting nuggets.
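As an illustration only, the idea of treating measures of interest as objectives to be optimised can be sketched in a few lines of Python. Everything in this sketch is an assumption made for the example and is not taken from the proposal: the two measures (confidence and coverage), the toy records, the hand-written candidate rules, and all function names. The meta-heuristic search itself is replaced here by a plain filter over a fixed candidate list, so the sketch shows only the multi-objective comparison (Pareto dominance) that such a search would rely on.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]

# Toy data (hypothetical): one numerical and one nominal attribute plus a class label.
RECORDS: List[Record] = [
    {"age": 65, "smoker": "yes", "label": "pos"},
    {"age": 70, "smoker": "no", "label": "pos"},
    {"age": 50, "smoker": "yes", "label": "pos"},
    {"age": 30, "smoker": "yes", "label": "neg"},
    {"age": 40, "smoker": "no", "label": "neg"},
    {"age": 62, "smoker": "no", "label": "neg"},
]

# Candidate rule antecedents (hypothetical); a real search would generate these.
RULES: Dict[str, Callable[[Record], bool]] = {
    "age_60_plus": lambda r: r["age"] >= 60,
    "old_smoker": lambda r: r["age"] >= 60 and r["smoker"] == "yes",
    "age_45_plus": lambda r: r["age"] >= 45,
    "smoker": lambda r: r["smoker"] == "yes",
}


def evaluate(antecedent: Callable[[Record], bool],
             records: List[Record], target: str) -> Tuple[float, float]:
    """Return (confidence, coverage) of the rule 'antecedent -> target'."""
    matched = [r for r in records if antecedent(r)]
    in_class = [r for r in records if r["label"] == target]
    hits = sum(1 for r in matched if r["label"] == target)
    confidence = hits / len(matched) if matched else 0.0
    coverage = hits / len(in_class) if in_class else 0.0
    return confidence, coverage


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Tuple[float, float]]) -> List[str]:
    """Names of rules whose scores are not dominated by any other rule."""
    return [
        name for name, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != name)
    ]


scores = {name: evaluate(rule, RECORDS, "pos") for name, rule in RULES.items()}
front = pareto_front(scores)
print(front)
```

In this sketch the non-dominated set keeps both a highly accurate but narrow rule and a broader but less accurate one, which is the kind of trade-off a set of partial classification rules is meant to expose.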
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Potential use in non-academic contexts |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Impacts |
Description |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
Summary |
|
Date Materialised |
|
|
Sectors submitted by the Researcher |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Project URL: |
|
Further Information: |
|
Organisation Website: |
http://www.uea.ac.uk |