Elizabeth Shriberg
SRI & ICSI Berkeley
Adventures in Prosody Modeling for Speech Processing
Abstract
This talk describes a "direct modeling" approach for incorporating
prosody (the rhythm and melody of speech) into the automatic
processing of spontaneous speech. In contrast to methods that train
models from hand-labeled prosodic annotations, our approach is fully automatic and
requires no human annotation of prosody. I'll first provide an
overview of methods for feature processing, machine learning
techniques for predicting target classes from prosodic features, and
approaches for combining prosodic models with information from
statistical language modeling. After presenting the general approach, I'll
discuss a range of interesting speech processing problems to which
this framework of prosodic modeling has been successfully
applied. These include automatic punctuation detection, disfluency
modeling, dialog act segmentation and classification, emotion
recognition, and speaker recognition. Data come from a range of
corpora of spontaneous speech, including human-computer dialog,
telephone conversations, and multi-party meetings.