EPSRC Reference: |
EP/J020230/1 |
Title: |
Imperfect data: accuracy, impacts and extraction of meaningful information |
Principal Investigator: |
Foody, Professor G |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
Sch of Geography |
Organisation: |
University of Nottingham |
Scheme: |
Standard Research |
Starts: |
04 May 2012 |
Ends: |
31 July 2013 |
Value (£): |
69,249 |
EPSRC Research Topic Classifications: |
Information & Knowledge Mgmt |
|
|
EPSRC Industrial Sector Classifications: |
Aerospace, Defence and Marine |
Information Technologies |
|
Related Grants: |
|
Panel History: |
Panel Date | Panel Name | Outcome |
09 Feb 2012 | Data Intensive Systems (DaISy) | Announced |
|
Summary on Grant Application Form |
Meaningful information is a fundamental requirement for informed, logical and reasoned activity. Extracting meaningful information from data can, however, be a challenge, especially because data may, amongst other things, be inaccurate, incomplete and possibly contradictory, as they arise from a variety of sources of variable quality and trust level.
Data imperfections are a generic problem in information extraction and decision making and so the work is relevant in many disciplines. Imperfect data are, for example, evident in medical diagnosis (e.g. a patient's test results are typically only an imperfect indicator of a condition), in defining nature reserves for species conservation (e.g. the species distribution maps and models are often highly sensitive to 'absence' data: was the species actually present but not observed?) and in security and defence applications (e.g. sub-pixel target detection algorithms applied to surveillance imagery vary in performance and utility between environments). Some problems with imperfect data were highly apparent in the response to the Haiti earthquake of 2010, especially in relation to damage mapping to inform relief activities. Vast amounts of well-intentioned assistance were provided by numerous professional and amateur bodies, with unprecedented data rates, but the volumes of data and the problems with them were a concern. Key problems were that maps were inaccurate, inconsistent and sometimes contradictory. As such, a major mapping challenge arises in how to work with such data. One key issue is the need for information on the accuracy of data sources and for methods to help use imperfect data. This project seeks to contribute to this task. It aims to illustrate the impacts of using imperfect data, and to explore methods to characterise the quality of the data and methods to combine data sources to yield an enhanced product of known accuracy.
A range of methods will be used but the core focus is on the use of latent class modelling. This type of analysis is based on multiple observations or data from a variety of sources. The relationships between the observers/data sources are used to attempt to explain their quality and suggest how the data could be interpreted to yield information. The approach is a form of statistical modelling and is highly attractive for the specific research proposal because, if a model can be formed that fits the observed data, then the model's parameters define the accuracy of the data sources and its outputs can be used to form new products of known accuracy. As such, the modelling analysis may add value to data by indicating its quality and combining it usefully for extraction of information.
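As an illustration of the latent class idea described above, the following is a minimal sketch and not the project's actual implementation. It assumes a simple two-class problem (e.g. damaged / not damaged), binary labels from several sources, and conditional independence of the sources given the true class; the model is fitted with a basic EM loop. The function names `simulate_sources` and `fit_latent_class`, and all parameter values, are hypothetical and chosen only for the example.

```python
import numpy as np


def simulate_sources(n_cases, prev, sens, spec, seed=0):
    """Simulate binary labels from several imperfect sources (illustrative only)."""
    rng = np.random.default_rng(seed)
    sens = np.asarray(sens, dtype=float)
    spec = np.asarray(spec, dtype=float)
    truth = rng.random(n_cases) < prev
    # Probability that each source reports 1, given the true class of each case.
    p_report = np.where(truth[:, None], sens, 1.0 - spec)
    obs = (rng.random((n_cases, sens.size)) < p_report).astype(int)
    return truth.astype(int), obs


def fit_latent_class(obs, n_iter=300):
    """Fit a two-class latent class model by EM, without reference data.

    obs: (n_cases, n_sources) array of 0/1 labels.
    Returns estimated prevalence, per-source sensitivity and specificity,
    and the posterior probability that each case belongs to class 1.
    """
    obs = np.asarray(obs, dtype=float)
    n_sources = obs.shape[1]
    eps = 1e-6
    # Starting values assume sources are better than chance, which also
    # fixes the labelling of the two latent classes.
    prev = 0.5
    sens = np.full(n_sources, 0.8)
    spec = np.full(n_sources, 0.8)
    for _ in range(n_iter):
        # E-step: posterior probability of class 1 for each case.
        like_pos = prev * np.prod(sens**obs * (1 - sens) ** (1 - obs), axis=1)
        like_neg = (1 - prev) * np.prod((1 - spec) ** obs * spec ** (1 - obs), axis=1)
        post = like_pos / (like_pos + like_neg)
        # M-step: update parameters from the posterior-weighted counts.
        prev = post.mean()
        sens = (post[:, None] * obs).sum(axis=0) / post.sum()
        spec = ((1 - post)[:, None] * (1 - obs)).sum(axis=0) / (1 - post).sum()
        sens = np.clip(sens, eps, 1 - eps)
        spec = np.clip(spec, eps, 1 - eps)
    return prev, sens, spec, post


if __name__ == "__main__":
    truth, obs = simulate_sources(
        20000, prev=0.3,
        sens=[0.90, 0.85, 0.80, 0.75],
        spec=[0.95, 0.90, 0.85, 0.80],
    )
    prev_hat, sens_hat, spec_hat, post = fit_latent_class(obs)
    print("estimated prevalence:", round(float(prev_hat), 3))
    print("estimated sensitivity:", np.round(sens_hat, 3))
    print("estimated specificity:", np.round(spec_hat, 3))
    # Combined product: label each case by its most probable class.
    fused = (post > 0.5).astype(int)
    print("agreement of combined labels with truth:", (fused == truth).mean())
```

In this sketch the fitted sensitivities and specificities play the role of the accuracy estimates for each source, and the posterior class probabilities are one way to combine the sources into a single product. The conditional independence assumption is a simplification that real data sources may not satisfy.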
As the problems of imperfect data are generic, the proposal has broad potential impacts. For the specific DaISy call there are clear impacts in relation to security and defence. For example, methods that enable rapid and qualified information to be derived from sources of variable accuracy, completeness and trust level will increase effectiveness and the quality of decision making. Additionally, as a model-based approach, it removes or reduces the need for reference data to be acquired for validation, which could otherwise require deployment of personnel to dangerous locations, and so is of considerable benefit to health and well-being.
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Potential use in non-academic contexts |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Impacts |
Description |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
Summary |
|
Date Materialised |
|
|
Sectors submitted by the Researcher |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Project URL: |
|
Further Information: |
|
Organisation Website: |
http://www.nottingham.ac.uk |