This project targets a new processing paradigm for the development and optimization of spoken dialogue systems (SDS) that are context-aware, efficient, and, most importantly, robust to the uncertainty that pervades natural language. We will develop tractable and effective techniques for the integrated end-to-end treatment of uncertainty in context-aware SDS, using learning algorithms combined with Partially Observable Markov Decision Processes (POMDPs). This requires us to develop effective methods for training and testing such systems. We will also determine, through system tests with real users, whether the end-to-end statistical treatment of uncertainty improves SDS for users, in comparison to rule-based and standard MDP-based techniques.

No current SDS treats dialogue processing as an end-to-end integrated statistical system, constrained by context, where uncertainty in one process feeds into other processes, where uncertainty in one dialogue state feeds into the next dialogue state, and where this whole system is constrained via contextual feedback. It is still standard practice to ignore the uncertainty in the output of a lower-level process by passing only a single best analysis to higher-level processes, with the side effect that lower-level processes do not take into account important high-level constraints. For example, contextual features of dialogues such as user goals or previous speech acts are not systematically exploited in speech recognition or utterance interpretation. This is a serious shortcoming for current SDS, given that uncertainty pervades and proliferates through every level of dialogue processing, from speech recognition errors through interpretation ambiguities, to uncertain dialogue states and competing strategies. These problems lead to the current situation where SDS are not robust or efficient enough for any but very simple tasks.

We will build and evaluate SDS which use statistical processing end-to-end, and which use context representations to constrain the uncertainty inherent in dialogue. We will build on existing knowledge and techniques developed in the TALK project, as well as recent corpora (COMMUNICATOR, TALK, AMI). The SDS development tools, components, and environments used and developed at Edinburgh's HCRC (e.g. DIPPER, HTK, Festival) also provide a number of existing dialogue systems (FLIGHTS, TALK, WITAS), forming a platform to be extended using the new methods developed in the project. These systems can then be used for testing, evaluation, and further data collection.

The proposal thus aims to improve dialogue system robustness and efficiency, and to allow SDS to be developed and optimized using data-driven approaches. There is much user frustration with currently deployed SDS, so there is much to be gained from improved robustness and efficiency. Data-driven optimization will also lead to decreased deployment and development costs for industry. Thus the beneficiaries of this research will potentially be all future users of IT (including the illiterate and IT-illiterate, also in the developing world).
In the short to medium term, commercial applications include: interactive SDS, dialogue and meeting summarisation, interactive entertainment, intelligent tutoring systems, intelligent personal assistants, and dialogue-supported question-answering and search.

With recent advances in speech recognition, parsing, context-sensitive statistical dialogue management, and the theory of learning with partially observable states, together with the availability of new, large, and richly annotated dialogue corpora, we are now in a position to treat dialogue processing as an end-to-end context-aware statistical system. We believe this model will lead to a breakthrough in robust, efficient, and natural human-computer SDS, and has the potential to radically improve the state-of-the-art in dialogue management.
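To make the intended end-to-end treatment of uncertainty concrete, the following is a minimal illustrative sketch (not project code) of POMDP-style belief tracking over dialogue states: an N-best list of uncertain recognition hypotheses updates a probability distribution over user goals, rather than committing to a single best hypothesis. All names, the toy goal set, and the probabilities are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Toy set of user goals (dialogue states) -- purely illustrative.
GOALS = ["book_flight", "check_status", "cancel_booking"]

def update_belief(belief, nbest, likelihood, transition):
    """
    One POMDP-style belief update over user goals.

    belief      -- dict mapping goal -> prior probability
    nbest       -- list of (hypothesis, confidence) pairs from the recogniser
    likelihood  -- likelihood(hypothesis, goal): P(hypothesis | goal)
    transition  -- transition(new_goal, old_goal): P(new_goal | old_goal)
    """
    new_belief = defaultdict(float)
    for g_new in belief:
        # Predict: how likely is g_new given where the dialogue was before?
        predicted = sum(transition(g_new, g_old) * belief[g_old] for g_old in belief)
        # Correct: weight by the whole N-best list, not just the single best hypothesis.
        observed = sum(conf * likelihood(hyp, g_new) for hyp, conf in nbest)
        new_belief[g_new] = observed * predicted
    total = sum(new_belief.values()) or 1.0
    return {g: p / total for g, p in new_belief.items()}

# Hypothetical usage: uniform prior, users rarely switch goals mid-dialogue,
# and a crude keyword-based observation model.
prior = {g: 1.0 / len(GOALS) for g in GOALS}
transition = lambda g_new, g_old: 0.9 if g_new == g_old else 0.05
likelihood = lambda hyp, goal: 0.8 if goal.split("_")[0] in hyp else 0.1
nbest = [("book a flight to Rome", 0.6), ("look at my bookings", 0.3), ("cook at night", 0.1)]

posterior = update_belief(prior, nbest, likelihood, transition)
print(posterior)  # probability mass shifts toward "book_flight" without discarding alternatives
```

The key point the sketch illustrates is that uncertainty is propagated rather than discarded: competing recognition hypotheses and competing dialogue states all contribute to the next belief state, which is what allows contextual constraints (here, the transition model) to feed back into the interpretation of noisy input.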