EPSRC Reference: |
EP/I031022/1 |
Title: |
Natural Speech Technology |
Principal Investigator: |
Renals, Professor S |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
Sch of Informatics |
Organisation: |
University of Edinburgh |
Scheme: |
Programme Grants |
Starts: |
01 May 2011 |
Ends: |
31 July 2016 |
Value (£): |
6,236,104
|
EPSRC Research Topic Classifications: |
Digital Signal Processing |
Human Communication in ICT |
|
EPSRC Industrial Sector Classifications: |
Communications |
Healthcare |
Information Technologies |
|
|
Related Grants: |
|
Panel History: |
Panel Date | Panel Name | Outcome |
02 Feb 2011
|
ICT Programme Grants Interviews
|
Announced
|
|
Summary on Grant Application Form |
Humans are highly adaptable, and speech is our natural medium for informal communication. When communicating, we continuously adjust to other people, to the situation, and to the environment, using previously acquired knowledge to make this adaptation seem almost instantaneous. Humans generalise, enabling efficient communication in unfamiliar situations and rapid adaptation to new speakers or listeners. Current speech technology works well for certain controlled tasks and domains, but is far from natural, a consequence of its limited ability to acquire knowledge about people or situations, to adapt, and to generalise. This accounts for the uneasy public reaction to speech-driven systems. For example, text-to-speech synthesis can be as intelligible as human speech, but lacks expression and is not perceived as natural. Similarly, the accuracy of speech recognition systems can collapse if the acoustic environment or task domain changes, conditions which a human listener would handle easily. Research approaches to these problems have hitherto been piecemeal and as a result progress has been patchy. In contrast NST will focus on the integrated theoretical development of new joint models for speech recognition and synthesis. These models will allow us to incorporate knowledge about the speakers, the environment, the communication context and awareness of the task, and will learn and adapt from real world data in an online, unsupervised manner. This theoretical unification is already underway within the NST labs and, combined with our record of turning theory into practical state-of-the-art applications, will enable us to bring a naturalness to speech technology that is not currently attainable.The NST programme will yield technology which (1) approaches human adaptability to new communication situations, (2) is capable of personalised communication, and (3) takes account of speaker intention and expressiveness in speech recognition and synthesis. This is an ambitious vision. Its success will be measured in terms of how the theoretical development reshapes the field over the next decade, the takeup of the software systems that we shall develop, and through the impact of our exemplar interactive applications.We shall establish a strong User Group to maximise the impact of the project, with a members concerned with clinical applications, as well as more general speech technology. Members of the User Group include Toshiba, EADS Innovation Works, Cisco, Barnsley Hospital NHS Foundation Trust, and the Euan MacDonald Centre for MND Research. An important interaction with the User Group will be validating our systems on their data and tasks, discussed at an annual user workshop.
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Potential use in non-academic contexts |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Impacts |
Description |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
Summary |
|
Date Materialised |
|
|
Sectors submitted by the Researcher |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Project URL: |
|
Further Information: |
|
Organisation Website: |
http://www.ed.ac.uk |