Over the last decade, advances in natural language processing technologies have paved the way for the emergence of complex spoken language interfaces. A persistent and important problem in the development of these systems is their lack of robustness when confronted with understanding errors. The problem stems mostly from the unreliability of current speech recognition technology, and is present across all domains and interaction types. My research addresses this problem by: (1) endowing spoken language interfaces with better error awareness, (2) constructing and evaluating a rich repertoire of error recovery strategies, and (3) developing data-driven approaches for making error handling decisions.
In this talk, I focus on the first of these problems: error awareness. Traditionally, spoken dialog systems rely on recognition confidence scores and simple heuristics to guard against potential misunderstandings. While confidence scores provide an initial reliability assessment, ideally a system should also leverage information from subsequent user turns in the conversation to continuously update and improve the accuracy of its beliefs.
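The heuristic baseline mentioned above can be illustrated with a minimal sketch: a fixed pair of confidence thresholds maps each recognition hypothesis to an accept, confirm, or reject action. The threshold values and function name here are illustrative assumptions, not part of any particular system.

```python
def heuristic_action(confidence: float,
                     accept_threshold: float = 0.8,
                     reject_threshold: float = 0.3) -> str:
    """Map a speech recognition confidence score to a dialog action.

    The thresholds are illustrative; deployed systems tune them per
    domain, and this one-shot decision never revisits the hypothesis
    in light of later user turns -- the limitation discussed above.
    """
    if confidence >= accept_threshold:
        return "accept"    # treat the hypothesis as correct
    if confidence >= reject_threshold:
        return "confirm"   # ask the user to confirm the hypothesis
    return "reject"        # discard the hypothesis and re-prompt
```

Because the decision is made once, from the confidence score alone, any evidence in the user's next turn (e.g., a correction or a disconfirmation) is ignored.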
I describe a scalable machine learning solution for this belief updating problem. The proposed approach relies on a compressed, abstracted concept-level representation of beliefs and casts the belief updating problem as a multinomial regression task. Experimental results indicate that the constructed belief updating models significantly outperform typical heuristic rules used in current systems. Furthermore, a user study with a mixed-initiative spoken dialog system shows that the proposed approach leads to significant improvements in both the effectiveness and the efficiency of the interaction across a wide range of recognition error rates.
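The core idea of casting belief updating as multinomial regression can be sketched as follows: the belief over a small, abstracted set of concept hypotheses is updated after each user turn by scoring every hypothesis with a linear function of the prior belief and turn-level features, then normalizing with a softmax. The feature names and weights below are hypothetical stand-ins; in the work described, the model parameters are learned from annotated dialog data.

```python
import math

def belief_update(prior, features, weights):
    """One belief-updating step framed as multinomial regression.

    prior    -- belief over the compressed hypothesis set, e.g.
                [P(top hypothesis), P(everything else)]
    features -- evidence from the latest user turn, e.g.
                {"confirm_yes": 1.0} (hypothetical feature)
    weights  -- per-feature, per-hypothesis coefficients (made up
                here; learned from data in the actual approach)
    """
    scores = []
    for i, p in enumerate(prior):
        # log-prior term keeps the update anchored to the old belief
        s = weights["prior"] * math.log(max(p, 1e-6))
        for name, value in features.items():
            s += weights[name][i] * value
        scores.append(s)
    # softmax normalization (shifted by the max for stability)
    m = max(scores)
    exp_scores = [math.exp(s - m) for s in scores]
    z = sum(exp_scores)
    return [e / z for e in exp_scores]

# Illustrative use: after the user answers "yes" to a confirmation,
# belief in the top hypothesis should rise.
w = {"prior": 1.0, "confirm_yes": [2.0, -2.0]}
posterior = belief_update([0.6, 0.4], {"confirm_yes": 1.0}, w)
```

The compressed concept-level representation is what keeps this scalable: rather than tracking every recognition hypothesis, the belief is folded into a handful of abstracted classes, so one small regression model suffices per concept.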