Welcome to the Digital Audio Processing Lab

The Digital Audio Processing Lab is a signal processing research facility dedicated to speech and audio processing applications. Research activities include audio content analysis and retrieval, speech recognition, and speech enhancement and coding. The lab is equipped with computers, recording and listening equipment, and real-time DSP development tools.

Research Areas

Audio Processing

Music content analysis and retrieval

The automatic extraction of musically relevant attributes from audio signals is an important component of music data mining systems. Our work is directed towards melody-based retrieval, with active research on pitch detection in polyphonic music and on melodic similarity scoring.

Related publications

Online Demo - Tansen

Speech Processing

Knowledge-based speech recognition

In a linguistically motivated approach to speech recognition, acoustic-phonetic representations for phoneme classification are expected to provide greater robustness to speaker, language and environment variability. Acoustic events, or landmarks, associated with broad phonetic classes are first located in the speech signal. Next, appropriate phone-class-dependent features are extracted from the speech samples in the vicinity of each landmark in order to recognize the phone. Our work is presently directed towards the reliable detection of speech landmarks and the accurate classification of stop consonants.
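The two-stage idea (locate landmarks, then analyse the samples around them) can be illustrated with a minimal sketch. It is not the lab's detector: it assumes, purely for illustration, that a landmark is an abrupt change in short-time energy between adjacent frames. The function names, the 9 dB threshold, the 80-sample frame length and the synthetic test signal are all hypothetical.

```python
import math

def frame_energies_db(samples, frame_len):
    """Short-time log energy (dB) of consecutive non-overlapping frames."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small floor avoids log10(0) on silent frames.
        energies.append(10.0 * math.log10(power + 1e-10))
    return energies

def find_landmarks(energies_db, threshold_db=9.0):
    """Frame indices where energy jumps by at least threshold_db (assumed criterion)."""
    return [i for i in range(1, len(energies_db))
            if abs(energies_db[i] - energies_db[i - 1]) >= threshold_db]

def segment_around(samples, landmark_frame, frame_len, context_frames=2):
    """Samples in the vicinity of a landmark, for later feature extraction."""
    center = landmark_frame * frame_len
    lo = max(0, center - context_frames * frame_len)
    hi = min(len(samples), center + context_frames * frame_len)
    return samples[lo:hi]

# Synthetic signal: 400 samples of silence, then 400 samples of a 200 Hz tone
# at an assumed 8 kHz sampling rate.
FRAME_LEN = 80
signal = [0.0] * 400 + [0.5 * math.sin(2 * math.pi * 200 * n / 8000)
                        for n in range(400)]

landmarks = find_landmarks(frame_energies_db(signal, FRAME_LEN))
segments = [segment_around(signal, lm, FRAME_LEN) for lm in landmarks]
print(landmarks, [len(s) for s in segments])
```

In a real system the energy criterion would be replaced by class-specific acoustic cues, and each extracted segment would be passed to a feature extractor and classifier suited to the phone class of its landmark.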

Related publications

Low bit rate speech coding

Speech coding algorithms are judged by their speech quality and their bit rate. In the low (below 8 kbps) to very low (below 2 kbps) bit rate region, there is a distinct trade-off between speech quality and bit rate. However, there are a number of applications where voice compression at very low bit rates is essential. Our research efforts are directed toward "communication" quality speech coding at bit rates below 2 kbps. We are developing a communication-quality speech codec based on the Multiband Excitation (MBE) model that operates at a range of bit rates below 2 kbps, with a complexity suitable for a real-time DSP implementation.
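To give a feel for how tight the budget is at these rates, the short calculation below converts a bit rate into bits available per analysis frame and into a compression ratio relative to uncompressed PCM. The 20 ms frame duration and the 8 kHz, 16-bit PCM reference are assumptions chosen for illustration, not parameters of the lab's codec.

```python
def bits_per_frame(bit_rate_bps, frame_ms):
    """Bits available to encode one frame at the given bit rate."""
    return bit_rate_bps * frame_ms / 1000.0

def compression_ratio(bit_rate_bps, sample_rate_hz=8000, bits_per_sample=16):
    """Ratio of the uncompressed PCM bit rate to the coded bit rate."""
    return (sample_rate_hz * bits_per_sample) / bit_rate_bps

# Assumed 20 ms frames; 8 kbps and 2 kbps are the region boundaries named above.
for rate in (8000, 2000):
    print(rate, bits_per_frame(rate, 20), compression_ratio(rate))
```

Under these assumptions, a 2 kbps coder has 40 bits per frame to represent all of its model parameters, a 64-fold reduction from the 128 kbps PCM reference.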

Related publications

Online Demo - MBE Coder