Details of Grant 

EPSRC Reference: EP/D074959/1
Title: Discriminative Phrase-Based Statistical Machine Translation
Principal Investigator: Osborne, Dr M
Other Investigators:
Researcher Co-Investigators:
Project Partners:
Department: Sch of Informatics
Organisation: University of Edinburgh
Scheme: Standard Research
Starts: 01 February 2007
Ends: 31 January 2010
Value (£): 269,262
EPSRC Research Topic Classifications:
Human Communication in ICT
EPSRC Industrial Sector Classifications:
No relevance to Underpinning Sectors
Related Grants:
Panel History:  
Summary on Grant Application Form
Statistical Machine Translation (SMT) has made great improvements over the last decade. A striking property of these systems is that they make minimal use of linguistic knowledge about translation. All knowledge about how to translate sentences is gathered in a data-driven manner from parallel corpora (sentences paired with their translations).

In tandem with this observation, projecting ahead, we can see that the volume of parallel corpora available for training will not increase at a substantial rate. This suggests that further progress in SMT will come from better modelling of the data we already have: this means bringing linguistics to the translation problem.

For some languages, linguistic constraints are easily obtained. For other languages, this information is less widely available. We intend to see whether an improvement in translation can be obtained even when using impoverished knowledge sources.

To carry out this integration successfully, we need a flexible framework. We shall extend an existing approach (which yields state-of-the-art results) using techniques from discriminative machine learning (maximum entropy). These approaches will not only allow us to integrate linguistics easily into the translation process, but should also allow us to improve upon the state of the art simply through better modelling. Associated with better modelling are serious scaling problems, which we have experience in tackling.

The language pairs we shall investigate will include German-English, Arabic-English and Chinese-English.

Finally, we shall compete in international Machine Translation evaluation exercises. This will involve automatic and manual evaluation of our translation quality, and will allow comparison of our approaches with those of other groups.
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description: This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary:
Date Materialised:
Sectors submitted by the Researcher: This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.ed.ac.uk