Details of Grant 

EPSRC Reference: GR/J51757/01
Title: FEATURE-SPACE TRANSFORMATIONS FOR SPEAKER-ADAPTIVE SPEECH RECOGNITION
Principal Investigator: Brookes, Mr DM
Other Investigators:
Researcher Co-Investigators:
Project Partners:
Department: Electrical and Electronic Engineering
Organisation: Imperial College London
Scheme: Standard Research (Pre-FEC)
Starts: 21 March 1994
Ends: 20 March 1997
Value (£): 124,406
EPSRC Research Topic Classifications:
Human Communication in ICT
EPSRC Industrial Sector Classifications:
Related Grants:
Panel History:  
Summary on Grant Application Form
(1) To derive feature-space transformations to improve the performance of HMM-based speech recognition systems by adapting them to new speakers without the need for additional training.
(2) To derive algorithms for extracting from a speech waveform the speaker-specific parameters that are needed for speaker normalisation.

Progress:

The first task to be addressed in this project was the investigation of algorithms for estimating vocal-tract length from the speech waveform. A number of published algorithms have been evaluated and the one giving the most consistent results identified. This algorithm has been applied to speech from a number of male and female speakers and has been found to give reasonable results: inter-speaker differences are fairly consistent and the estimated vocal-tract length increases predictably with lip-rounding. No serious attempt has yet been made to validate the results by comparison with physiological measurements.

The second section of the project has been concerned with the derivation of speaker-specific transformations. This investigation has so far been restricted to modifications of the input filter bank, defined by a linear transformation of a DFT power spectrum. Using a combination of dynamic programming and gradient descent techniques, it has proved possible to derive speaker-dependent, phoneme-independent transformations that can substantially reduce the differences between the log filterbank outputs from a test speaker and those from a target average speaker. Work is currently in progress to reduce the computational burden of the optimisation procedure and to confirm that the very promising initial results translate into an improvement in phone recognition performance.
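The filter-bank adaptation described in the summary can be illustrated with a minimal sketch. It is not the project's actual procedure: it assumes the test-speaker and target frames have already been time-aligned (the dynamic-programming step is not attempted), it treats the filter bank as a plain weight matrix applied to one DFT power-spectrum frame, and it uses ordinary gradient descent on the mean squared difference of the log outputs. All names (log_filterbank, adapt_filterbank, FRAMES, W_INIT, W_REF) and all numbers are hypothetical toy values chosen only for illustration.

```python
import math

def log_filterbank(weights, power_spectrum, floor=1e-10):
    """Log of a linear transformation of one DFT power-spectrum frame."""
    return [math.log(max(sum(w * p for w, p in zip(row, power_spectrum)), floor))
            for row in weights]

def mean_sq_log_error(weights, frames, targets):
    """Mean over frames of the squared difference between log outputs and targets."""
    total = 0.0
    for p, t in zip(frames, targets):
        y = log_filterbank(weights, p)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, t))
    return total / len(frames)

def adapt_filterbank(weights, frames, targets, lr=0.02, steps=500):
    """Gradient descent on the filter-bank weights (assumed update rule).

    frames[i] and targets[i] are assumed to be already aligned in time.
    Weights are clipped at zero after every step so that each channel
    remains a non-negative combination of power-spectrum bins.
    """
    w = [row[:] for row in weights]
    n = len(frames)
    for _ in range(steps):
        grad = [[0.0] * len(row) for row in w]
        for p, t in zip(frames, targets):
            for m, row in enumerate(w):
                s = max(sum(wk * pk for wk, pk in zip(row, p)), 1e-10)
                # d/dw_mk of (log(s) - t_m)^2 is 2 * (log(s) - t_m) * p_k / s
                coeff = 2.0 * (math.log(s) - t[m]) / (s * n)
                for k, pk in enumerate(p):
                    grad[m][k] += coeff * pk
        for m, row in enumerate(w):
            for k in range(len(row)):
                row[k] = max(row[k] - lr * grad[m][k], 0.0)
    return w

# Toy data: five power-spectrum frames (4 bins) from a hypothetical test speaker.
FRAMES = [
    [1.0, 2.0, 0.5, 1.0],
    [0.5, 1.0, 2.0, 1.5],
    [2.0, 0.5, 1.0, 0.5],
    [1.5, 1.5, 0.5, 2.0],
    [1.0, 0.5, 1.5, 1.0],
]
# Unadapted two-channel filter bank, and a made-up reference filter bank used
# only to synthesise "target average speaker" log outputs for this demo.
W_INIT = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
W_REF = [[0.7, 0.3, 0.1, 0.0], [0.0, 0.2, 0.5, 0.6]]
TARGETS = [log_filterbank(W_REF, p) for p in FRAMES]

error_before = mean_sq_log_error(W_INIT, FRAMES, TARGETS)
W_ADAPTED = adapt_filterbank(W_INIT, FRAMES, TARGETS)
error_after = mean_sq_log_error(W_ADAPTED, FRAMES, TARGETS)
print("mean squared log error before:", error_before)
print("mean squared log error after: ", error_after)
```

Because the adapted matrix is fitted once per speaker and applied to every frame, a transformation of this kind is speaker-dependent but phoneme-independent, matching the description in the summary. A real system would use far more frames and filter-bank channels than this toy case.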
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description: This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.imperial.ac.uk