Details of Grant

EPSRC Reference:

GR/T04298/01

Title:

Multi-Objective Meta-Heuristic Algorithms for Finding Interesting Rules in Large Complex Databases

Principal Investigator:

de la Iglesia, Dr B

Other Investigators:

Researcher Co-Investigators:

Project Partners:

Department:

Computing Sciences

Organisation:

University of East Anglia

Scheme:

First Grant Scheme Pre-FEC

Starts:

01 September 2004

Ends:

31 August 2007

Value (£):

126,751

EPSRC Research Topic Classifications:

Information & Knowledge Mgmt

EPSRC Industrial Sector Classifications:

Information Technologies

Related Grants:

Panel History:

Summary on Grant Application Form

We propose a novel algorithm for the extraction of partial classification rules in large complex databases. The novelty in this proposal comes from merging a number of developments in data mining and optimisation to produce an efficient algorithm that can find a set of interesting rules in a large database in the presence of high levels of uncertainty, while making use of concept relationships. This research will be particularly applicable to medical databases, where existing algorithms often do not perform adequately due to the complexity and uncertainty of the data.

The new algorithm will be based on multi-objective optimisation techniques. Multi-objective heuristic algorithms have become efficient problem solvers. They can be applied to the problem of rule induction by treating the measures of interest of the rules as the objectives to be optimised. We can perform a search for interesting partial classification rules containing numerical or nominal attributes. Most existing algorithms do not adequately address the extraction of rules containing numerical attributes because of the size of the search space. Research into measures of interest also has to be considered and advanced to deliver a set of rules that are interesting in themselves but also in the context of the other rules in the set. A strong emphasis on scalability will mean incorporating Feature Subset Selection (FSS) and sampling mechanisms. Adaptation to cope with missing or uncertain data, and with complex data described by an ontology, will provide further functionality. Incorporating all these concepts into an efficient and effective algorithm presents many research challenges, including the interactions between the different algorithm components. The final product should allow databases containing many features (including some extracted from text data, e.g. medical reports) with high levels of uncertainty to be analysed efficiently to discover sets of individually interesting nuggets.
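The idea of treating rule interest measures as objectives can be illustrated with a minimal sketch. It is not the algorithm proposed in this grant: the two measures (confidence and coverage), the toy records, the rule labels and all function names are assumptions made for illustration, and the exhaustive Pareto filter below stands in for the meta-heuristic search the summary describes.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Measures = Tuple[float, float]  # (confidence, coverage), both to be maximised

# Hypothetical toy data: "cls" is the class attribute.
RECORDS: List[Record] = [
    {"age": 70, "smoker": True, "cls": "ill"},
    {"age": 65, "smoker": False, "cls": "ill"},
    {"age": 40, "smoker": True, "cls": "ill"},
    {"age": 30, "smoker": False, "cls": "well"},
    {"age": 68, "smoker": False, "cls": "well"},
    {"age": 25, "smoker": True, "cls": "well"},
]

# Hypothetical candidate rules "antecedent -> cls = ill", keyed by a label.
RULES: Dict[str, Callable[[Record], bool]] = {
    "age>=60 and smoker": lambda r: r["age"] >= 60 and r["smoker"],
    "age>=60": lambda r: r["age"] >= 60,
    "any record": lambda r: True,
    "age<35": lambda r: r["age"] < 35,
    "smoker and age<50": lambda r: r["smoker"] and r["age"] < 50,
}


def rule_measures(antecedent: Callable[[Record], bool],
                  records: List[Record], target: str) -> Measures:
    """Confidence = correct / matched; coverage = correct / records in target class."""
    matched = [r for r in records if antecedent(r)]
    correct = [r for r in matched if r["cls"] == target]
    in_class = [r for r in records if r["cls"] == target]
    confidence = len(correct) / len(matched) if matched else 0.0
    coverage = len(correct) / len(in_class) if in_class else 0.0
    return confidence, coverage


def dominates(a: Measures, b: Measures) -> bool:
    """a dominates b if it is no worse on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scored: Dict[str, Measures]) -> List[str]:
    """Labels of rules that no other rule dominates (exhaustive check)."""
    return [label for label, m in scored.items()
            if not any(dominates(other, m) for other in scored.values())]


scored = {label: rule_measures(pred, RECORDS, "ill") for label, pred in RULES.items()}
front = pareto_front(scored)
print(front)
```

On this toy data the non-dominated set contains a precise but narrow rule, a general but imprecise one, and a compromise between them, which is the kind of trade-off a multi-objective search is meant to expose rather than collapse into a single score.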

Key Findings

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Potential use in non-academic contexts

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Impacts

Description:

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Summary:

Date Materialised:

Sectors submitted by the Researcher

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Project URL:

Further Information:

Organisation Website:

http://www.uea.ac.uk