EPSRC Reference: |
GR/M96889/01 |
Title: |
WIDE COVERAGE PARSING AND GRAMMAR INDUCTION USING CCG |
Principal Investigator: |
Steedman, Professor M |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
School of Informatics |
Organisation: |
University of Edinburgh |
Scheme: |
Standard Research (Pre-FEC) |
Starts: |
30 September 2000 |
Ends: |
29 September 2003 |
Value (£): |
159,266
|
EPSRC Research Topic Classifications: |
Human Communication in ICT |
|
|
EPSRC Industrial Sector Classifications: |
Financial Services |
Creative Industries |
Information Technologies |
No relevance to Underpinning Sectors |
|
Related Grants: |
|
Panel History: |
|
Summary on Grant Application Form |
Parsers that integrate corpus-based head-dependency likelihood estimates with linguistically well-motivated grammars currently perform better than the alternatives, and are attractive in practical NLP applications because they can build interpretable structure. So far, such parsers have been confined to quite weak context-free systems, both in the collection of dependency data and in the grammar used by the parser itself. They have also been restricted to local dependencies, and lack a linguistically proper treatment of unbounded dependencies and of constructions involving movement and/or deletion. As a result, they ignore information that is available in the training data, which distorts frequency counts, and they make incomplete use of the probabilities during parsing. The present proposal seeks to combine the expressive power of Combinatory Categorial Grammar (CCG) with a generalisation of the head-driven, dependency-based probabilistic parsing technique of Collins (1998), so that a parser can be guided by more of the information on unbounded dependencies that is already available in the Penn Treebank training corpus. CCG has been developed as a theory of unbounded dependency, particularly as implicated in relativization, coordination, and parentheticalization, all of which are very frequent in training corpora such as the WSJ/Penn Treebank.
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Potential use in non-academic contexts |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Impacts |
Description |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
Summary |
|
Date Materialised |
|
|
Sectors submitted by the Researcher |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Project URL: |
|
Further Information: |
|
Organisation Website: |
http://www.ed.ac.uk |