Details of Grant

EPSRC Reference:

EP/V050869/1

Title:

ConCur: Knowledge Base Construction and Curation

Principal Investigator:

Horrocks, Professor I

Other Investigators:

Motik, Professor B

Cuenca Grau, Professor B

Researcher Co-Investigators:

Dr J Chen

Project Partners:

Samsung

Tencent

Department:

Computer Science

Organisation:

University of Oxford

Scheme:

Standard Research

Starts:

01 June 2021

Ends:

31 May 2024

Value (£):

1,131,073

EPSRC Research Topic Classifications:

Artificial Intelligence

Information & Knowledge Mgmt

EPSRC Industrial Sector Classifications:

Information Technologies

Panel History:

Panel Date	Panel Name	Outcome
23 Mar 2021	EPSRC ICT Prioritisation Panel March 2021	Announced

Summary on Grant Application Form

Knowledge graphs are graph-structured knowledge resources which are often expressed as triples such as ("UK", "hasCapital", "London") and ("London", "instanceOf", "City"). As well as such basic "facts", knowledge graphs often include structural knowledge about the domain, typically based on a hierarchy of entity types (also known as classes or concepts); e.g., ("City", "subClassOf", "HumanSettlement"). A knowledge graph that consists largely or wholly of structural knowledge is often called an ontology.
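The triple representation described above can be illustrated with a small sketch. It is only a toy under stated assumptions: the three triples are the examples from the text, while the function names `superclasses` and `types_of` are hypothetical helpers introduced here and are not part of any project software.

```python
from collections import defaultdict

# The three example triples quoted in the summary text.
TRIPLES = [
    ("UK", "hasCapital", "London"),
    ("London", "instanceOf", "City"),
    ("City", "subClassOf", "HumanSettlement"),
]


def superclasses(cls, triples):
    """Return every class reachable from cls by following subClassOf edges."""
    parents = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "subClassOf":
            parents[subject].add(obj)
    seen, stack = set(), [cls]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def types_of(entity, triples):
    """Return the asserted types of entity plus those implied by the hierarchy."""
    direct = {o for s, p, o in triples if s == entity and p == "instanceOf"}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred


print(types_of("London", TRIPLES))
```

Here the structural triple lets a system conclude that London is a HumanSettlement even though only ("London", "instanceOf", "City") is stated as a fact.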

Some knowledge graphs are general purpose, such as Wikidata and the Google knowledge graph, while others are developed for specific domains such as medicine. They are rapidly gaining in importance and are playing a key role in many applications. For example, Google uses its knowledge graph for search, question answering and Google Assistant, while Amazon and Apple also use knowledge graphs to power their personal assistants Alexa and Siri, respectively. Knowledge graphs are widely used in the domain of health and wellbeing, e.g., for organising and exchanging information and to power clinical artificial intelligence (AI). One example is FoodOn, an ontology representing food knowledge such as fine-grained food product categorisation, nutrition and allergens, as well as related activities such as agriculture.

Knowledge graph construction and maintenance are, however, very challenging, and may require a considerable amount of human effort. Notwithstanding the high cost of knowledge creation, knowledge graphs are often still biased, incomplete or too coarse-grained. Take HeLis, an ontology for health and lifestyle, as an example. Its food knowledge is quite simple and often represents many different variants with a single entity (e.g., "Banana" for all kinds and derivatives of bananas), and its knowledge of health is highly incomplete when compared with dedicated biomedical ontologies. In addition, it is hard to avoid errors such as incorrect facts and categorisations in knowledge graphs; e.g., FoodOn categorises soy milk as a kind of milk, but not as a kind of soy product. Such errors may be inherited from the information source or be caused by the construction procedure. These issues significantly impact the usefulness of knowledge graphs and the reliability of the systems that use them; e.g., this categorisation of soy milk could be dangerous if the knowledge graph were used in a food allergen alert system.

Therefore, effective knowledge graph construction and curation is urgently required and will play a critical role in exploiting the full value of knowledge graphs. As there are now many available knowledge resources, one possible approach is to use multiple sources to address both coverage and quality issues, e.g., via integration and cross-checking. For example, integrating HeLis with FoodOn would combine fine-grained categorisation of food products (including bananas) with lifestyle knowledge. Moreover, cross-checking FoodOn with HeLis would reveal the problem with soy milk, which is correctly categorised as a soy product in HeLis. Automating the integration of knowledge resources is challenging, but combining semantic and learning-based techniques seems to be a very promising approach, and we have already obtained some encouraging preliminary results in this direction.
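The cross-checking idea can be sketched as a simple comparison of class hierarchies. This is a minimal illustration, not the project's method: the two triple sets are hand-written toy fragments that mirror the soy milk example in the text, the "Banana"/"Fruit" triple is invented for contrast, and the sketch assumes that class names in the two sources have already been aligned, which in practice is one of the hard parts of integration. The function names are hypothetical.

```python
# Toy fragments only; these are not extracts from the real ontologies.
FOODON_TOY = {
    ("SoyMilk", "subClassOf", "Milk"),
    ("Banana", "subClassOf", "Fruit"),  # invented triple, for contrast
}
HELIS_TOY = {
    ("SoyMilk", "subClassOf", "SoyProduct"),
    ("Banana", "subClassOf", "Fruit"),  # invented triple, for contrast
}


def direct_parents(triples):
    """Map each class to the set of its directly asserted superclasses."""
    parents = {}
    for subject, predicate, obj in triples:
        if predicate == "subClassOf":
            parents.setdefault(subject, set()).add(obj)
    return parents


def cross_check(target, reference):
    """For classes present in both sources, list superclasses that the
    reference asserts but the target does not. The result is a set of
    candidates for human review, not a list of confirmed errors."""
    t, r = direct_parents(target), direct_parents(reference)
    candidates = {}
    for cls in t.keys() & r.keys():
        missing = r[cls] - t[cls]
        if missing:
            candidates[cls] = missing
    return candidates


print(cross_check(FOODON_TOY, HELIS_TOY))
```

On these toy inputs the check flags that SoyMilk lacks a SoyProduct superclass in the FoodOn-like fragment, while Banana, on which both fragments agree, is not flagged. Running the check in the opposite direction would flag Milk as absent from the HeLis-like fragment, which shows why such output needs review rather than automatic acceptance.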

The proposed research will therefore study a range of semantic and machine learning techniques, and how to combine them to support knowledge graph construction and curation. Beyond its application to knowledge graph construction and curation, this research will contribute to the development of new neural-symbolic theories, paradigms and methods, such as deep semantic embedding for learning representations of expressive knowledge, and knowledge-guided learning for addressing sample shortage problems. These techniques promise to revolutionise many AI and big data technologies.

Key Findings

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Potential use in non-academic contexts

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Impacts

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Sectors submitted by the Researcher

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Organisation Website:

http://www.ox.ac.uk