Details of Grant 

EPSRC Reference: EP/F031092/1
Title: On-Demand Data Integration: Dataspaces by Refinement
Principal Investigator: Paton, Professor NW
Other Investigators:
Fernandes, Dr AAA
Embury, Dr SM
Researcher Co-Investigators:
Dr K Belhajjame
Project Partners:
Department: Computer Science
Organisation: University of Manchester, The
Scheme: Standard Research
Starts: 01 July 2008 Ends: 30 June 2011 Value (£): 572,897
EPSRC Research Topic Classifications:
Information & Knowledge Mgmt
EPSRC Industrial Sector Classifications:
No relevance to Underpinning Sectors
Related Grants:
Panel History:
Panel Date | Panel Name | Outcome
18 Oct 2007 | ICT Prioritisation Panel (Technology) | Announced
Summary on Grant Application Form
Web search engines, such as Google or Yahoo, provide access to large numbers of distributed resources. However, the questions such search engines can support are limited, and do not exploit structure within the accessed resources. For example, it is not possible to ask the question "what is the phone number of the department where Suzanne Embury works", even though this information can be obtained by navigating from the result of a search for "Suzanne Embury". However, one feature of search engines that has made them successful is that they need minimal configuration; for example, no manual annotation of pages is required before they can be searched. As a result, search engines can be seen as providing low-cost, low-quality access to distributed data resources.
Data integration infrastructures from the database community, by contrast, provide relatively high-cost, high-quality solutions. Where there are multiple data resources, distributed query processing systems provide the illusion that there is only one data resource, and allow complex questions to be answered that refer to data from multiple resources. For example, they could support the question about phone numbers above, even when the information about who Suzanne works for is stored in a different database from the phone number of her department. However, this precision in question answering can be supported only where the relationships between data sources have been manually identified, and inconsistencies resolved, as part of a time-consuming and largely manual data integration process.
This proposal seeks to explore the space between search engines and distributed data management systems by providing several of the benefits of the latter with much reduced configuration costs. The term "dataspace" has been coined to refer to infrastructures that support precise question answering over resources that have been integrated at minimal cost.
At present, dataspaces are more a vision than a reality; many design decisions need to be made that explore cost/quality trade-offs, and new techniques will be required for inter-relating data resources, ranking query answers, and interacting with users about the likely quality of the answers obtained. The proposed research hypothesizes that there is no single best position in the cost/quality trade-off between fully automated and manually constructed data integration. As a result, we propose to develop a flexible software architecture in which it is possible to experiment with different components for constructing mappings between resources, annotating the mappings with measures of their quality, and ranking results according to user-specified criteria. This architecture, in turn, enables exploration of alternative approaches to the design of the components, in particular with a view to allowing incremental refinement of an initial integration that was constructed automatically.
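The idea of quality-annotated mappings and ranked answers can be illustrated with a minimal sketch. It is not the project's architecture: the two sources, the attribute names, the `Mapping` class, the quality scores, and the placeholder phone values are all hypothetical, chosen only to show how the phone-number question from the summary might be answered over automatically proposed mappings and then refined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """Hypothetical candidate correspondence between two sources."""
    staff_attr: str      # attribute in the staff source
    directory_attr: str  # attribute in the directory source
    quality: float       # assumed confidence score in [0, 1]


# Two independently managed sources (illustrative placeholder data).
STAFF = [{"name": "Suzanne Embury", "dept": "Computer Science"}]
DIRECTORY = [
    {"unit": "Computer Science", "office": "Building 1", "phone": "PHONE-A"},
    {"unit": "Mathematics", "office": "Computer Science", "phone": "PHONE-B"},
]

# Mappings as an automatic matcher might propose them: one plausible, one not.
MAPPINGS = [
    Mapping("dept", "unit", 0.9),
    Mapping("dept", "office", 0.3),
]


def department_phone(person, mappings):
    """Answer the question under every mapping; rank answers by mapping quality."""
    answers = {}
    for m in mappings:
        for s in STAFF:
            if s["name"] != person:
                continue
            for d in DIRECTORY:
                if s[m.staff_attr] == d[m.directory_attr]:
                    phone = d["phone"]
                    # Keep the best score with which each answer was derived.
                    answers[phone] = max(answers.get(phone, 0.0), m.quality)
    return sorted(answers.items(), key=lambda kv: kv[1], reverse=True)


def refine(mappings, rejected_attr):
    """Incremental refinement: drop a mapping that user feedback has rejected."""
    return [m for m in mappings if m.directory_attr != rejected_attr]


ranked = department_phone("Suzanne Embury", MAPPINGS)
refined = department_phone("Suzanne Embury", refine(MAPPINGS, "office"))
```

Here `ranked` lists both candidate answers, with the one derived from the higher-quality mapping first, while `refined` contains only the answer that survives once the doubtful mapping is rejected. Taking the maximum score per answer is one simple ranking choice among many; the summary leaves the ranking criteria to the user.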
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description: This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary:
Date Materialised:
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.man.ac.uk