The information content of speech is embedded in its segmental aspects (phonemes) as well as its suprasegmental, or prosodic, aspects (pitch, loudness and duration). The latter serve important functions in discourse, including expressing attitudes and assigning prominence to specific words. In this talk I will present results on the relationship between perceptual prominence and its correlates in the acoustic and visual channels. Since the meaning of an utterance changes with the relative salience assigned to its syllables or words, an important research question is how cues in the acoustic and visual signals are employed to emphasize or tone down parts of a message. I will discuss this topic based on earlier and recent results of perceptual studies, also with a view to A-V synthesis and cross-language aspects.
Hansjoerg Mixdorff is a professor of Digital Audio and Video Processing at the Faculty of Computer Science and Media at BHT Berlin, where he teaches Speech Communication, Speech Signal Processing and Perception. His main research interests are the modeling of prosodic features of speech, especially in cross-language comparison, and Text-to-Speech systems. His research in intonation modeling has been successfully applied to several of the world's languages. Recently he has been active in the area of speech perception, especially in an auditory-visual setting.