Consolidated List of Publications

A consolidated list of the lab's publications in international conferences, journals and national conferences.

2026

Mildred Pereira, Preeti Rao, Hansjörg Mixdorff, "Comparing Acoustic Cues to Phrase Boundary in Hindi and Indian English", Speech Prosody 2026, Philadelphia, USA

2025

P. Paradkar, N. Nayak, P. Rao and E. Leen, "A Teacher Training Application for Expressive Reading", T4E 2025, Chennai, India. [Poster for Tools and Demo Session]
S. Dhar, S. R. Chetupalli and P. Rao, "Speaker Anonymization for Children’s Oral Reading Assessment", Proc. of EAAI 2026, Singapore.
S. Dhar, M. Gupta and P. Rao, "LAPS-Diff: A Diffusion-Based Framework for Hindi Singing Voice Synthesis With Language Aware Prosody-Style Guided Learning", Proc. of APSIPA ASC 2025, Singapore. [Poster]
R. Rajakrishnan, S. Raman, A. Ranjan, M. Pereira, N. Nayak, and P. Rao, "Psycholinguistic Features Predict Word Duration in Hindi Read Aloud Speech", Proc. of ICASSP 2025, Hyderabad. [Video]
Y. Bhake and P. Rao, "Expressive Timing in Hindustani Vocal Music", Proc. of WIMAGA, ICASSP 2025 Workshop, Hyderabad.
S. Raman and P. Rao, "Oral Reading Errors by Grade 3 Children in Indian Schools: A Hindi-English Perspective", Proc. of Interspeech, 2025, Rotterdam
Y. Bhake, A. Anand and P. Rao, "Melodic And Metrical Elements of Expressiveness in Hindustani Vocal Music", Proc. of ISMIR 2025, Daejeon, Korea
S. Roychowdhury and P. Rao, "Multimodal Raga Classification from Vocal Performances with Disentanglement and Contrastive Loss", Transactions of the International Society for Music Information Retrieval, July, 2025.
M. Dhamne, S. Raman and P. Rao, "Predicting Prosodic Boundaries for Children's Texts", Proc. of EMNLP 2025, Suzhou, China. [Poster]
M. Rilliard et al., "Cross-Cultural Dimensions Organizing Prosodic Attitudes Reception: A Meta-Analysis of Free Labeling Studies", Journal of Speech Sciences (JOSS), v. 14, 2025.

2024

S. Nadkarni, P. Rao and M. Clayton, "Identifying Melodic Motifs and Stable Notes from Gestural Information in Indian Vocal Performances", Transactions of the International Society for Music Information Retrieval, December, 2024.
Sujoy Roychowdhury, Preeti Rao, Sharat Chandran, "Human Pose Estimation for Expressive Movement Descriptors in Vocal Musical Performances", Proceedings of ISMIR 2024, San Francisco
Ankit Anand, Preeti Rao, Vinoo Alluri "Seconds Matter: Exploring the Timeframe for Music Preferences", European Society for the Cognitive Sciences of Music 2024 - 12th Triennial Conference
Pavan Kalyan, Preeti Rao, Preethi Jyothi, Pushpak Bhattacharyya "Emotion Arithmetic: Emotional Speech Synthesis via Weight Space Interpolation", In Proceedings of Interspeech, 2024.
Raj Gothi, Rahul Kumar, Mildred Pereira, Nagesh Nayak and Preeti Rao "A Dataset and Two-pass System for Reading Miscue Detection", In Proceedings of Interspeech, 2024.
Vaidya, M., Kumar Sahoo, B. and Rao,P. "Deep Learning for Assessment of Oral Reading Fluency" https://arxiv.org/abs/2405.19426
Muller, M., Dixon, S., Volk, A., Sturm, B.L.T., Rao, P. and Gotham, M. "Introducing the TISMIR Educational Track", Transactions of the International Society for Music Information Retrieval, May 2024.
Kamini Sabu, Preeti Rao "Predicting Children’s Perceived Reading Proficiency with Prosody Modeling", Computer Speech & Language 84 (2024): 101557.
Pavan Kalyan, P., Jyothi, P., Rao, P., & Bhattacharyya, P. (2024, March) "STORiCo: Storytelling TTS for Hindi with Character Voice Modulation", In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers) (pp. 426-431).

2023

Duan, Z., van Kranenburg, P., Nam, J., & Rao, P. (2023). "Editorial for TISMIR Special Collection: Cultural Diversity in MIR Research", Transactions of the International Society for Music Information Retrieval, 6(1), 203–205
Raj Gothi, Preeti Rao, "Improving Automatic Speech Recognition with Dialect-Specific Language Models", Speech and Computer: 25th International Conference, SPECOM 2023, Dharwad, India, November 29 – December 2, 2023, Proceedings, Part I (pp. 57-67), doi: 10.1007/978-3-031-48309-7_5.
Ananthanarayana, R. M., Bhattacharjee, A., and Rao, P. (2023). Four-way Classification of Tabla Strokes with Transfer Learning using Western Drums, Transactions of the International Society for Music Information Retrieval
Shreyas Nadkarni, Sujoy Roychowdhury, Preeti Rao, Martin Clayton, "Exploring the Correspondence of Melodic Contour with Gesture in Raga Alap Singing", In Proceedings of ISMIR, 2023. Nominated for Best Paper.
T Pavan Kalyan, Preeti Rao, Preethi Jyothi, Pushpak Bhattacharyya, "Narrator or Character: Voice Modulation in an Expressive Multi-speaker TTS", In Proceedings of Interspeech, 2023
Preeti Rao, "Bridging the Gap Between Musicological Knowledge and Performance Practice with Audio MIR", Book chapter in Computer Assisted Music and Dramatics, Springer Singapore, 2023
Siddhartha CV, Preeti Rao, Rajbabu Velmurugan, "Classroom Activity Detection in Noisy Preschool Environments with Audio Analysis", In Proceedings of ICSSES 2023
Hansjörg Mixdorff, Navneet Nayan, Albert Rilliard, Preeti Rao, Debashis Ghosh, "Developing a corpus of Audio-Visual attitudinal expressions in Hindi", to appear in Proc. of the International Congress of Phonetic Sciences, 2023
Rao, P. and Rohit, M.A., "Music Structure Analysis of Concert Recordings", in Indian Art Music: A Computational Perspective, Sriranga Digital Press, 2023, pp. 197-214
Clayton, M., Rao, P. and Rohit, M.A., "Rhythm and Structural Segmentation in Dhrupad Bandish Performance", in Indian Art Music: A Computational Perspective, Sriranga Digital Press, 2023, pp. 215-238
Rao, P., Murthy H.A. and Prasanna, S.R.M., Editors, Indian Art Music: A Computational Perspective, Sriranga Digital Press, 2023

2022

Vaidya Mithilesh, Kamini Sabu, and Preeti Rao, "Deep Learning for Prominence Detection In Children’s Read Speech", ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022. (Poster)
Martin Clayton, Preeti Rao, Nithya Shikarpur, Sujoy Roychowdhury and Jin Li, "Raga Classification From Vocal Performances Using Multimodal Analysis", Proceedings of ISMIR 2022. Best Special Call Paper Award.
Rishabh Dahale, Vaibhav Talwadker, Preeti Rao and Prateek Verma, "Generating Coherent Drum Accompaniment with Fills and Improvisations", Proceedings of ISMIR 2022

2021

K. Sabu, M. Vaidya, and P. Rao, "CNN encoding of acoustic parameters for prominence detection", arXiv preprint arXiv:2104.05488, 2021.
Rohit M. A., Amitrajit Bhattacharjee and Preeti Rao, "Four-way Classification of Tabla Strokes with Models Adapted from Automatic Drum Transcription", Proceedings of ISMIR 2021. (Poster) (Video). Nominated for Best Paper in Special Track for Cultural Diversity.
Nithya Shikarpur, Asawari Keskar and Preeti Rao, "Computational Analysis of Melodic Mode Switching In Raga Performance", Proceedings of ISMIR 2021. (Poster) (Video)
Titas Chakraborty, Vaishali Patil and Preeti Rao, "The Four-way Classification of Stops with Voicing and Aspiration for Non-native Speech Evaluation", Proceedings of Interspeech 2021.
Rohit M. A. and Preeti Rao, "Automatic Stroke Classification of Tabla Accompaniment in Hindustani Vocal Concert Audio", Published in The Journal of Acoustical Society of India: Vol. 48, No. 1-2, 2021 (arXiv preprint)
Preeti Rao and Kaustuv K. Ganguli, "How learned schema influence melodic phrase perception", Proceedings of the Future Directions of Music Cognition International Conference, March 2021.
Kamini Sabu and Preeti Rao " Prosodic Event Detection in Children’s Read Speech", Published in Computer Speech and Language, Feb 2021, doi: https://doi.org/10.1016/j.csl.2021.101200.
Kaustuv K. Ganguli and Preeti Rao, "A Study of Variability in Raga Motifs in Performance Contexts", Journal of New Music Research, pp. 1--15, Feb 2021 (link).

2020

Kamini Sabu and Preeti Rao " Automatic prediction of confidence level from children’s oral reading recordings", Proceedings of Interspeech, October 2020, Shanghai, China.
Rohit M. A., Vinutha T. P., and Preeti Rao, "Structural Segmentation of Dhrupad Vocal Bandish Audio Based on Tempo", Proceedings of ISMIR, October 2020, Montreal, Canada. (Poster, Accompanying webpage)
P. Rao, T. P. Vinutha and M. A. Rohit "Structural Segmentation of Alap in Dhrupad Vocal Concerts ", Transactions of International Society for Music Information Retrieval (TISMIR), 3(1), pp. 137--152, Sep 2020.
Rohit M. A. and Preeti Rao " Structure and Automatic Segmentation of Dhrupad Vocal Bandish Audio", Unpublished technical report, arXiv:2008.00756 [eess.AS], Aug 2020.
Amruta Vidwans, Prateek Verma and Preeti Rao " Classifying Cultural Music using Melodic Features ", Proceedings of SPCOM, July 2020, I.I.Sc, Bangalore. (Video Presentation)
Krishna Subramani, Preeti Rao and Alexandre D’Hooge " VAPAR Synth - A Variational Parametric Model for Audio Synthesis ", Proceedings of ICASSP, May 2020, Barcelona, Spain.
Jayneel Parekh, Preeti Rao and Yi-Hsuan Yang " Speech-to-Singing Conversion in an Encoder-Decoder Framework ", Proceedings of ICASSP, May 2020, Barcelona, Spain.
K. Sabu, S. Chaudhuri, P. Rao and M. Patil " An Optimized Signal Processing Pipeline for Syllable Detection and Speech Rate Estimation ", Accepted for NCC, February 2020, Kharagpur, India (https://arxiv.org/abs/2103.04346).

2019

Krishna Subramani, Alexandre D’Hooge and Preeti Rao, "Generative Audio Synthesis with a Parametric Model", Late-Breaking/Demo, 20th annual conference of the International Society for Music Information Retrieval, Nov 2019, Delft, The Netherlands. (Poster)
K. K. Ganguli and P. Rao "On the perception of raga motifs by trained musicians", The Journal of the Acoustical Society of America, Vol. 145(4), pp. 2418–2434, April 2019; doi: 10.1121/1.5097588
(Copyright (2019) Acoustical Society of America. This article may be downloaded for personal use only. Any other use requires prior permission of the author and the Acoustical Society of America).
Amruta Vidwans, Prateek Verma, and Preeti Rao "Understanding and Classifying Cultural Music Using Melodic Features - Case Of Hindustani, Carnatic And Turkish Music", arXiv:1906.08916 [cs.SD]

2018

P. Rao and S. Joshi, "Vowel contrasts in Indian English by native Marathi speakers at different acquisition levels", English in India and Indian Englishes, Satellite Workshop of Interspeech 2018, Hyderabad, India.
K. K. Ganguli and P. Rao "On the distributional representation of ragas: experiments with allied raga-pairs ", Transactions of International Society for Music Information Retrieval (TISMIR), 1(1), pp. 79--95, Dec 2018.
K. Sabu, K. Kumar and P. Rao " Automatic detection of expressiveness in oral reading ", Special session: Show And Tell, Interspeech, Sep 2018, Hyderabad, India.
N. Mohanan, R. Velmurugan and P.Rao " A Non-convolutive NMF Model for Speech Dereverberation ", Proc. of Interspeech, Sep 2018, Hyderabad, India.
Rohit M. A. and P. Rao " Acoustic-Prosodic Features of Tabla Bol Recitation and Correspondence with the Tabla Imitation ", Proc. of Interspeech, Sep 2018, Hyderabad, India. (Poster)
P. Rao, M. Pandya, K. Sabu, K. Kumar and N. Bondale " A Study of Lexical and Prosodic Cues to Segmentation in a Hindi-English Code-switched Discourse ", Proc. of Interspeech, Sep 2018, Hyderabad, India.
K. Sabu and P. Rao " Automatic assessment of children's oral reading using speech recognition and prosody modeling ", CSI Transactions on ICT, S.I. Visvesvaraya, pp 1-5, Jun 2018, Springer.
K. Sabu and P.Rao " Detection of Prominent Words in Oral Reading by Children ", Proc. of Speech Prosody, Jun 2018, Poznan, Poland.
K. Subramani, S. Sridhar, Rohit M. A., and P.Rao " Energy-Weighted Multi-Band Novelty Functions for Onset Detection in Piano Music ", Proc. of NCC, Feb 2018, Hyderabad, India.
K. Sabu, K. Kumar, and P.Rao " Improving the Noise Robustness of Prominence Detection for Children's Oral Reading Assessment ", Proc. of NCC, Feb 2018, Hyderabad, India.

2017

P. Rao and K. K. Ganguli " Linking prototypical, stock knowledge with the creative musicianship displayed in raga performance ", Invited talk at Frontiers of Research on Speech and Music (FRSM), Dec 2017, Rourkela, India.
K. K. Ganguli and P. Rao " Imitate or recall: How do musicians perform raga phrases? ", Proc. of Frontiers of Research on Speech and Music (FRSM), Dec 2017, Rourkela, India.
M. Khan, M.A. Rohit and P. Rao " Perceptual discrimination of tone quality associated with Sitar jawari ", Proc. of Frontiers of Research on Speech and Music (FRSM), Dec 2017, Rourkela, India.
S. Nag, M.A. Rohit and P. Rao " Reliability of acoustic perception of Tabla strokes in determining their quality ", Proc. of Frontiers of Research on Speech and Music (FRSM), Dec 2017, Rourkela, India.
D. Gudi, T.P. Vinutha and P. Rao " Discrimination of Sitar and Tabla strokes in instrumental concerts using spectral features ", Proc. of Frontiers of Research on Speech and Music (FRSM), Dec 2017, Rourkela, India.
A. Srinivasamurthy, A. Holzapfel, K. K. Ganguli and X. Serra " Aspects of Tempo and Rhythmic Elaboration in Hindustani Music: A Corpus Study ", Frontiers in Digital Humanities, 4 (20), 2017.
P. Rao, N. Sanghvi, H. Mixdorff and K. Sabu, "Acoustic correlates of focus in Marathi: Production and perception", Journal of Phonetics 65C (2017) pp. 110-125.
C. Gupta, D. Grunberg, P. Rao and Y. Wang " Towards automatic mispronunciation detection in singing ", Proc. of the 18th International Society for Music Information Retrieval Conference (ISMIR), Oct 2017, Suzhou, China. Poster
K. K. Ganguli and P. Rao " Towards computational modeling of the ungrammatical in a raga performance ", Proc. of the 18th International Society for Music Information Retrieval Conference (ISMIR), Oct 2017, Suzhou, China. Nominated for Best Paper.
J. C. Ross, A. Mishra, K. K. Ganguli, P. Bhattacharyya and P. Rao " Identifying raga similarity through embeddings learned from compositions' notation ", Proc. of the 18th International Society for Music Information Retrieval Conference (ISMIR), Oct 2017, Suzhou, China.
K. Narang and P. Rao " Acoustic features for determining goodness of tabla strokes ", Proc. of the 18th International Society for Music Information Retrieval Conference (ISMIR), Oct 2017, Suzhou, China.
K. Sabu, P. Swarup, H. Tulsiani and P.Rao " Automatic Assessment of Children's L2 Reading for Accuracy and Fluency ", Proc. of SLaTE, Aug 2017, Stockholm, Sweden.
K. K. Ganguli and P.Rao " Validating stock musicological knowledge via audio analyses of contemporary raga performance ", Invited talk at the 20th Quinquennial Congress of the International Musicological Society: Digital Musicology Study Session, Mar 2017, Tokyo, Japan.
P.Rao " North-Indian Sitar and Sarod Concerts: Visualization of the Rhythmic Structure ", Invited talk at Workshop on Cross-disciplinary and Multi-cultural Perspectives on Musical Rhythm III, Mar 2017, NYU Abu Dhabi, UAE.
N. Mohanan, R. Velmurugan and P.Rao " Speech Dereverberation using NMF with Regularized Room Impulse Response ", Proc. of ICASSP, Mar 2017, New Orleans, USA.
K. K. Ganguli, A. Lele, S. Pinjani, P. Rao, A. Srinivasamurthy and S. Gulati " Melodic Shape Stylization for Robust and Efficient Motif Detection in Hindustani Vocal Music ", Proc. of the National Conference on Communications (NCC), Mar 2017, IIT Madras, India.
A. Pasad, K. Sabu and P. Rao " Voice Activity Detection for Children's Read Speech Recognition in Noisy Conditions ", Proc. of the National Conference on Communications (NCC), Mar 2017, IIT Madras, India.
H. Tulsiani, P. Swarup and P. Rao " Acoustic and Language Modeling for Children's Read Speech Assessment ", Proc. of the National Conference on Communications (NCC), Mar 2017, IIT Madras, India.

2016

P. Rao, P. Swarup, A. Pasad, H. Tulsiani and G. G. Das " Automatic Assessment of Reading with Speech Recognition Technology ", Proc. of 24th Int. Conf. on Computers in Education. India: Asia-Pacific Society for Computers in Education, Dec 2016, Mumbai, India. Poster
K. K. Ganguli and P. Rao " Perceptual anchor or attractor: how do musicians perceive raga phrases? ", Proc. of Frontiers of Research on Speech and Music (FRSM), Nov 2016, Baripada, India.
A. Lele, S. Pinjani, K. K. Ganguli and P. Rao " Improved melodic sequence matching for query based searching in Indian classical music ", Proc. of Frontiers of Research on Speech and Music (FRSM), Nov 2016, Baripada, India.
S. Barhate, S. Kshirsagar, N. Sanghvi, K. Sabu, P. Rao and N. Bondale " Prosodic Features of Marathi News Reading Style ", Proc. of IEEE TENCON, Nov 2016, Singapore.
K. K. Ganguli and P. Rao " Exploring melodic similarity in Hindustani classical music through the synthetic manipulation of raga phrases ", Presented in the workshop on Cognitive Music Information Retrieval (CogMIR), Aug 2016, New York, USA.
K. K. Ganguli, S. Gulati, X. Serra and P. Rao " Data-driven exploration of melodic structures in Hindustani music ", Proc. of the 17th International Society for Music Information Retrieval Conference (ISMIR), Aug 2016, New York, USA. Poster
S. Gulati, J. Serra, K. K. Ganguli, S. Senturk and X. Serra " Time-delayed melody surfaces for raga recognition ", Proc. of the 17th International Society for Music Information Retrieval Conference (ISMIR), Aug 2016, New York, USA.
T.P. Vinutha, S. Suryanarayana, K. K. Ganguli and P. Rao " Structural segmentation and visualization of Sitar and Sarod concert audio ", Proc. of the 17th International Society for Music Information Retrieval Conference (ISMIR), Aug 2016, New York, USA. Poster
T.P. Vinutha, S. Suryanarayana and P. Rao " Reliable Tempo Detection for Structural Segmentation in Sarod concerts ", Proc. of the National Conference on Communications (NCC), March 2016, IIT Guwahati, India.
P. Rao, H. Mixdorff, I. Deshpande, N. Sanghvi and S. Kshirsagar " A Quantitative Study of Focus Shift in Marathi ", Proc. of Speech Prosody, May 2016, Boston, U.S.A.
V. Patil and P. Rao " Detection of Phonemic Aspiration for Spoken Hindi Pronunciation Evaluation ", Journal of Phonetics 54 (2016):202-221.

2015

Amruta Vidwans, Nachiket Deo and Preeti Rao "Audio segmentation based on melodic style with hand-crafted features and with convolutional neural networks", arXiv:1807.11138 [cs.SD]
H. Tulsiani and P. Rao " The IIT-B Query-by-Example System for MediaEval 2015 ", Working Notes Proc. of the MediaEval 2015 Workshop, Sept 2015, Wurzen, Germany.
K. K. Ganguli, A. Rastogi, V. Pandit, P. Kantan and P. Rao " Efficient Melodic Query based Audio Search for Hindustani Vocal Compositions ", Proc. of the 16th International Society for Music Information Retrieval Conference (ISMIR), Oct 2015, Malaga, Spain.
S. Joshi, N. Deo and P. Rao " Vowel Mispronunciation Detection using DNN Acoustic Models with Cross Lingual Training ", Proc. of Interspeech, Sep 2015, Dresden, Germany.
P. Verma, T. P. Vinutha, P. Pandit and P. Rao " Structural Segmentation of Hindustani Concert Audio with Posterior Features ", Proc. of 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2015, Brisbane, Australia.
K. K. Ganguli and P. Rao "Discrimination of Melodic Patterns in Indian Classical Music", Proc. of the National Conference on Communications (NCC), Feb 2015, IIT Bombay, India.
M. Jagtap and P. Rao "Enhancing Speech Intelligibility Based on Noise Characteristics", Proc. of the National Conference on Communications (NCC), Feb 2015, IIT Bombay, India.

2014

P. Rao, N. Nayak and S. Adavanne "Singing Voice Separation Using Adaptive Window Harmonic Sinusoidal Modeling ", Presented at MIREX at the 15th International Society for Music Information Retrieval Conference (ISMIR), Oct 2014, Taipei, Taiwan. Poster
S. Venkataramani, N. Nayak, P. Rao and R. Velmurugan "Vocal Separation Using Singer-Vowel Priors Obtained From Polyphonic Audio", Proc. of the 15th International Society for Music Information Retrieval Conference (ISMIR), Oct 2014, Taipei, Taiwan.
S. Gulati, J. Serra, K. K. Ganguli and X. Serra "Landmark Detection in Hindustani Music Melodies ", Proc. of ICMC and SMC, Sept 2014, Athens, Greece.
N. Jathar and P. Rao, "Acoustic Characteristics of Critical Message Utterances in Noise Applied to Speech Intelligibility Enhancement", Proc. of Interspeech, Sep 2014, Singapore.
T. P. Vinutha and P. Rao "Audio Segmentation of Hindustani Music Concert Recordings", Proc. of Frontiers of Research on Speech and Music (FRSM), Mar 2014, Mysore, India.
K. K. Ganguli and P. Rao "Tempo Dependence of Melodic Shapes in Hindustani Classical Music", Proc. of Frontiers of Research on Speech and Music (FRSM), Mar 2014, Mysore, India.
P. Rao, J. Ross, K. K. Ganguli, V. Pandit, V. Ishwar, A. Bellur and H. Murthy "Classification of Melodic Motifs in Raga Music with Time-series Matching", Journal of New Music Research, Volume 43, Issue 1, 31 Mar 2014.
S. Rao and P. Rao " An Overview of Hindustani Music in the Context of Computational Musicology", Journal of New Music Research, Vol. 43, Issue 1, 31 Mar 2014.
N. Prajapati, H. Tulsiani, J. Gada and P. Rao, "Better Phone Alignment for Confidence Measures in Voice based Querying", Proc. of the National Conference on Communications (NCC), Feb 2014, IIT Kanpur, India.
S. Venkataramani, R. Velmurugan and P. Rao "Improving Mobile Phone based Query Recognition using a Microphone Array", Proc. of the National Conference on Communications (NCC), Feb 2014, IIT Kanpur, India.

2013

V. Karjigi and P. Rao, "Knowledge based Features for Place Classification of Unvoiced Stops", Journal of Intelligent Systems, 22(3), 215-228, 2013.
P. Rao, J. Ross and K. K. Ganguli, "Distinguishing Raga-specific Intonation of Phrases with Audio Analysis", Journal of ITC SRA, Vol. 26-27, Dec 2013.
S. Joshi and P. Rao, "Acoustic Models for Pronunciation Assessment of Vowels of Indian English", Proc. of Oriental COCOSDA Conference, Nov 2013, KIIT Gurgaon, India.
P. Verma and P. Rao, "Distinguishing Musical Instrument Playing Styles with Acoustic Signal Analyses", Proc. of Acoustics, Nov 2013, New Delhi, India.
N. Jathar and P. Rao, "Prosody modification of Speech and Singing for Tutoring Applications", Proc. of Acoustics, Nov 2013, New Delhi, India.
V. Patil and P. Rao, "Automatic pronunciation feedback for phonemic aspiration", Proc. of SLaTE Workshop, Sep 2013, Grenoble, France.
V. Patil and P. Rao, "Acoustic features for detection of phonemic aspiration in voiced plosives", Proc. of Interspeech, Aug 2013, Lyon, France.
J. Gada, P. Rao and K. Samudravijaya, "Confidence Measures for Detecting Speech Recognition Errors", Proc. of the National Conference on Communications (NCC), Feb 2013, IIT Delhi, India.

2012

V. Patil and P. Rao, "Automatic pronunciation assessment for language learners with acoustic-phonetic features", Proc. of SLP-TED Workshop at COLING, Dec 2012, Mumbai, India.
V. Karjigi and P. Rao, Classification of place of articulation in unvoiced stops with spectro-temporal surface modeling, Speech Communication, Vol. 54 (10), pp. 1104-1120, Dec 2012.
S. Gulati, V. Rao and P. Rao, "Meter detection from audio for Indian music", S. Ystad et al. (Eds.): CMMR/FRSM 2011, Springer LNCS 7172, pp. 34-43, 2012.
C. Gupta and P. Rao, "An objective evaluation tool for ornamentation in singing", S. Ystad et al. (Eds.): CMMR/FRSM 2011, Springer LNCS 7172, pp. 1-25, 2012.
J. C. Ross, T. P. Vinutha, and P. Rao, Detecting Melodic Motifs from Audio for Hindustani Classical Music, Proc. of the 13th International Society for Music Information Retrieval Conference (ISMIR), Oct 2012, Porto, Portugal.
J. C. Ross and P. Rao, Detection of Raga-Characteristic Phrases from Hindustani Classical Music Audio, Proc. of 2nd CompMusic Workshop, Jul 2012, Istanbul, Turkey.
A. Vidwans, K. K. Ganguli and P. Rao, Classification of Indian Classical Vocal Styles from Melodic Contours, Proc. of 2nd CompMusic Workshop, Jul 2012, Istanbul, Turkey.
P. Rao, Audio Metadata Extraction: The Case For Hindustani Classical Music, Proc. of International Conference On Signal Processing And Communications, Jul 2012, IISC-Bangalore, India.
G. K. Koduri, S. Gulati, P. Rao and X. Serra, Raga Recognition based on Pitch Distribution Methods, Journal of New Music Research, 41:4, 337-350, Apr 2012.
M. V. Jha and P. Rao, "Assessing Vowel Quality for Singing Evaluation", Proc. of the National Conference on Communications (NCC), Feb 2012, IIT-Kharagpur, India.
N. H. Potty, V. Rajbabu and P. Rao "Architecture of a Teleconference System based on Minimum Audible Angle", Proc. of the National Conference on Communications (NCC), Feb 2012, IIT-Kharagpur, India.
P. Verma and P. Rao, Real-time Melodic Accompaniment System for Indian Music Using TMS320C6713, Proc. of the International Conference on VLSI Design and Embedded Systems, Jan 2012, Hyderabad, India.
A. Vidwans and P. Rao, "Identifying Indian Classical Music Styles using Melodic Contours", Proc. of Frontiers of Research on Speech and Music (FRSM), Jan 2012, Gurgaon, India.
V. Rao, P. Gaddipati and P. Rao: "Signal-driven window-length adaptation for sinusoid detection in polyphonic music", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, no.1, pp. 342-348, Jan 2012.

2011

R. Shah, K. Chandrayan and P. Rao, "Efficient Broadcast Monitoring using Audio Change Detection", Proc. of the Fifth Indian International Conference on Artificial Intelligence, Dec 2011, Tumkur, India
G. Koduri, S. Gulati and P. Rao, "A survey of raaga recognition techniques and improvements to the state-of-the-art", Proc. of Sound and Music Computing, Jul 2011, Padova, Italy.
V. Rao, C. Gupta and P. Rao, "Context-aware features for singing voice detection in polyphonic music", Proc. of Adaptive Multimedia Retrieval, Jul 2011, Barcelona, Spain.
N. Bhave and P. Rao, "Vehicle engine sound analysis applied to traffic congestion Estimation", Proc. of International Symposium on Computer Music Modeling and Retrieval (CMMR) and Frontiers of Research on Speech and Music (FRSM), Mar 2011, Bhubaneswar, India.
V. Patil and P. Rao, Acoustic features for detection of aspirated stops, Proc. of the National Conference on Communications (NCC), Jan 2011, Bangalore, India.
S. Kini, S. Gulati and P. Rao, Automatic genre classification of North Indian devotional music, Proc. of the National Conference on Communications (NCC), Jan 2011, Bangalore, India.
N. H. Potty, D. Sengupta, V. Rajbabu and P. Rao, Azimuth-dependent spatialization for a teleconference audio display, Proc. of the National Conference on Communications (NCC), Jan 2011, Bangalore, India.

2010

S. Gulati and P. Rao, Rhythm pattern representation for tempo detection in music, Proc. of the First International Conference on Intelligent Interactive Technologies and Multimedia, Dec, 2010, Allahabad, India.
C. Bhat, K. L. Srinivas and P. Rao, Pronunciation scoring for language learners using a phone recognition system, Proc. of the First International Conference on Intelligent Interactive Technologies and Multimedia, Dec. 2010, Allahabad, India.
V. Rao and P. Rao, Vocal melody extraction in the presence of pitched accompaniment in polyphonic music, IEEE Transactions on Audio Speech and Language Processing, vol. 18, no. 8, pp. 2145–2154, Nov. 2010.
P. Gaddipati, N. Dave, P. Rao and V. Rajbabu, Improving time-frequency sparsity for audio spatialization by time-adaptive windowing, Proc. of the National Conference on Communications (NCC), 2010, Chennai, India.
A. Patil, C. Gupta and P. Rao, Evaluating vowel pronunciation quality: Formant space matching versus ASR confidence scoring, Proc. of the National Conference on Communications (NCC), 2010, Chennai, India.
S. Pant, V. Rao and P. Rao, A melody detection user interface for polyphonic music, Proc. of the National Conference on Communications (NCC), 2010, Chennai, India.
M. Deepak and P. Rao, Trajectory and surface modeling of LSF for low rate speech coding, Proc. of the National Conference on Communications (NCC), 2010, Chennai, India.

2009

V. Rao, S. Pant, M. Bhaskar and P. Rao, Applications of a semi-automatic melody extraction interface for Indian music, Proc. of Frontiers of Research on Speech and Music (FRSM), December 2009, Gwalior, India
S. Belle, R. K. Joshi and P. Rao, Raga identification by using swara intonation, Journal of ITC Sangeet Research Academy, vol. 23, December, 2009.
P. Rao, Musical information extraction from the singing voice, Invited paper, Proc. of the National Conference on Signal & Image Processing Applications, IET Mumbai and C.O.E.P., September, 2009.
V. Rao and P. Rao, Improving polyphonic melody extraction by dynamic programming based dual F0 tracking, Proc. of the 12th Intl. Conf. on Digital Audio Effects (DAFx-09), Como, Italy, 2009.
V. Patil, S. Joshi and P. Rao, Improving the robustness of phonetic segmentation to accent and style variation with a two-staged approach, Proc. of Interspeech 2009, Brighton, U.K., 2009.
V. Rao, S. Ramakrishnan and P. Rao, Singing voice detection in polyphonic music using predominant pitch, Proc. of Interspeech 2009, Brighton, U.K., 2009.
N. Santosh, S. Ramakrishnan, V. Rao and P. Rao, Improving singing voice detection in presence of pitched accompaniment, Proc. of the National Conference on Communications (NCC), 2009.
N. Tamrakar, M. Deepak and P. Rao, An 800 bps MBE vocoder with low delay, Proc. of the National Conference on Communications (NCC), 2009.

2008

V. Rao and P. Rao, Melody extraction using harmonic matching, Extended Abstract for MIREX 2008 Audio Melody Extraction Evaluation task.
V. Rao and P. Rao, Vocal melody detection in pitched accompaniment using harmonic matching methods, Proc. of the 11th International Conference on Digital Audio Effects (DAFx-08), September 2008, Espoo, Finland.
V. Karjigi and P. Rao, Landmark based recognition of stops: acoustic attributes versus smoothed spectra, Interspeech 2008, September 2008, Brisbane, Australia.
V. Rao and P. Rao, Objective evaluation of a melody extractor for North Indian classical vocal performances, Proc. of Frontier of Research in Speech & Music (FRSM), February 2008, Kolkata, India.
V. Patil and P. Rao, Acoustic cues to manner of articulation of obstruents in Marathi, Proc. of Frontier of Research in Speech & Music (FRSM), February 2008, Kolkata, India.
R. Siva Kumar, Neelesh Tamrakar and P. Rao, Segment based MBE speech coding at 1000 bps, Proc. of NCC 2008, Mumbai, India.
V. Rao, Ramakrishnan and P. Rao, Singing voice detection in North Indian classical music, Proc. of NCC 2008, Mumbai, India.

2007

P. Rao, Audio Signal Processing, Chapter in Speech, Audio, Image and Biomedical Signal Processing using Neural Networks, (Eds.) Bhanu Prasad and S. R. Mahadeva Prasanna, Springer-Verlag, 2007.
V. Karjigi and P. Rao, Four-way classification of place of articulation of Marathi unvoiced stops from burst spectra, Proc. of WISP December 2007, Guwahati, India.
Pradeep Kumar P., P. Rao and S. Dutta Roy, Note onset detection in natural humming, Proc. of ICCIMA 2007, Volume 04, pp. 176-180, 2007.
V. Karjigi, B. Patel and P. Rao, Identification of stop consonants for acoustic keyword spotting in continuous speech, Proc. of Wireless Personal Multimedia Communications (WPMC), September 2007, Jaipur, India.
A.Bapat, V. Rao and P. Rao, Melodic contour extraction for Indian classical vocal music, Proc. of Music-AI (International Workshop on Artificial Intelligence and Music) in IJCAI, 2007, Hyderabad, India.
P. Kumar, M. Joshi, S. Hariharan, S. Dutta-Roy and P. Rao, Sung note segmentation for a query-by-humming system, Proc. of Music-AI (International Workshop on Artificial Intelligence and Music) in IJCAI, 2007, Hyderabad, India.
P. Patwardhan and P. Rao, Signal adaptive MBE model for low bit rate speech coding, Proc. of National Conference on Communications (NCC), 2007, Kanpur, India.
R. Singh and P. Rao, Spectral subtraction speech enhancement with RASTA filtering, Proc. of National Conference on Communications (NCC), 2007, Kanpur, India.
V. Karjigi, P. Rao and K. Samudravijaya, Investigation of acoustic attributes of Marathi unvoiced stops for classification, Proc. of Frontiers of Research in Speech and Music (FRSM), 2007, Mysore, India.
V. Rao and P. Rao, Vocal trill and glissando thresholds for Indian listeners, Proc. of Frontiers of Research in Speech and Music (FRSM), 2007, Mysore, India.

2006

N. Kamath, V. Rao and P. Rao, Voice quality synthesis with the bandwidth enhanced sinusoidal model, Proc. of IEEE International Conference on Signal and Image Processing (ICSIP), 2006, Hubli, India.
P. Patwardhan and P. Rao, Effect of voice quality on frequency-warped modeling of vowel spectra, Speech Communication, vol. 48(8), pp. 1009-1023, January, 2006.
K. Manohar and P. Rao, Speech enhancement in non-stationary noise environments using noise properties, Speech Communication, vol. 48(1), pp. 96-109, January, 2006.

2005

P. Rao and P. Patwardhan, Frequency-warped modeling of vowel spectra: dependence on vowel quality, Speech Communication, vol. 47, pp. 322-335, Nov, 2005.
A. Bapat and P. Rao, Pitch tracking of voice in tabla background by the two-way mismatch method, Proc. of the 13th Int. Conf. on Advanced Computing and Communications, 2005, Coimbatore, India.
A. Bapat, P. Rao and H. V. Sahasrabuddhe, On errors in pitch tracking with tabla accompaniment, Proc. of IMAE, 2005, Pune.
A. K. Gupta and P. Rao, Designing a channel coding scheme for the low rate MBE vocoder, Proc. of the National Conference on Communications, 2005, I.I.T., Kharagpur.
S. Dutta Roy, P. Rao, A. S. Galinde and R. Bhargava, Melodic-contour based QBH systems: analytical modeling and performance evaluation, Proc. of the National Conference on Communications, 2005, I.I.T., Kharagpur.

2004

P. Patwardhan and P. Rao, Adaptive frequency warping for improved spectral modeling, Proc. of the Int. Conf. on Spoken Language Technology (ICSLT-O-COCOSDA), 2004, New Delhi, India.
K. Manohar and P. Rao, Reduction of burst noises in STSA speech enhancement, Proc. of the Int. Conf. on Spoken Language Technology (ICSLT-O-COCOSDA), 2004, New Delhi, India.
P. Rao and S. Shandilya, On the detection of melodic pitch in a percussive background, Journal of the Audio Engineering Society (JAES), vol. 50-4, pp. 378-390, April, 2004.
K. Manohar and P. Rao, A comparison study of spectral subtraction speech enhancement methods, Proc. of the National Conference on Communications, 2004, I.I.Sc., Bangalore.
Pradeep Kumar P. and P. Rao, A study of frequency-scale warping for speaker recognition, Proc. of the National Conference on Communications, 2004, I.I.Sc., Bangalore.

2003

P. Rao and P. Patwardhan, On the representation of voice source aperiodicities in the MBE speech coding model, Proc. of ISCA Workshop on Voice Quality: Functions, Analysis and Synthesis, Geneva, August 2003.
S. K. Shandilya and P. Rao, Pitch detection of singing voice in fully musical audio, Proc. of 114th Convention of Audio Engineering Society, Amsterdam, March 2003.
S. Shandilya and P. Rao, Retrieving pitch of the singing voice in polyphonic audio, Proc. of the National Conference on Communications, 2003, I.I.T. Madras.
P. Rao, M. Pathak and T. R. Ramamohan, Robustness of the MBE vocoder to acoustic background noise, Proc. of the National Conference on Communications, 2003, I.I.T. Madras.
M. Anand Raju, B. Sundaram and P. Rao, TANSEN: a query-by-humming based music retrieval system, Proc. of the National Conference on Communications, 2003, I.I.T. Madras.
P. Patwardhan and P. Rao, Frequency warped all-pole modeling of vowel spectra: Dependence on voice and vowel quality, Proc. of Workshop on Spoken Language Processing, T.I.F.R., Mumbai, January 2003.

2002

P. Patwardhan and P. Rao, Controlling perceived degradation in spectral envelope modeling via predistortion, Proc. of the Int. Conf. on Spoken Language Processing, September, 2002, Denver, U.S.A.
G. Moharir, P. Patwardhan and P. Rao, Spectral enhancement preprocessing for the sinusoidal coding of noisy speech, Proc. of the Int. Conf. on Spoken Language Processing, September, 2002, Denver, U.S.A.
M. Anand Raju and P. Rao, Building a melody retrieval system, Proc. of the National Conference on Communications, 2002, I.I.T. Bombay.

2001

A. Malot, P. Rao and V. M. Gadre, Spectrum interpolation synthesis for the compression of music signals, Proc. of the COST G-6 Conf. On Digital Audio Effects, Limerick, Ireland, 2001.
P. Rao, R. van Dinther, R. Veldhuis and A. Kohlrausch, A measure for predicting audibility discrimination thresholds for spectral envelope distortions in vowel sounds, Journal of the Acoustical Society of America, vol. 109(5), pp. 2085-2097, May, 2001.
P. Rao, Applying perceptual distance to the discrimination of sounds, Proc. of the National Conference on Communications, I.I.T. Kanpur, January 2001.

2000

P. Rao and A. Das Barman, Speech formant frequency estimation: evaluating a nonstationary analysis approach, Signal Processing, Elsevier Science, vol. 80, no. 8, pp. 1655-1667, August, 2000.

1999

P. Patwardhan, S. P. Sira and P. Rao, Some insights into psychoacoustic modeling for MPEG Layer 3 audio coding, Proc. of the National Conference on Communications, I. I. T. Kharagpur, 1999.

1996

P. Rao, A robust method for the estimation of formant frequency modulations in speech signals, Proc. of the International Conf. on Acoustics, Speech and Signal Processing, Atlanta, U.S.A., 1996.

1995

Y. Asakawa, P. Rao and H. Sekine, 8 kb/s speech coding with 4 ms frame size, IEICE Trans., Special Issue on Digital Signal Processing, August, 1995.

1994

P. Rao, Y. Asakawa, and H. Sekine, 8 kb/s low-delay speech coding with 4 ms frame size, Proc. of the Int. Conf. on Spoken Language Processing, Yokohama, Japan, 1994.
A. Swami and P. Rao, Cumulant-based CDT detectors for oto-acoustic signals, Proc. SPIE-94, vol. 2296, Advanced Signal Processing: Algorithms, Architectures, and Implementations V, 1994.

1993

P. Rao, C. Griffin and F. Taylor, Time-delay estimation using the Wigner distribution, in "Coherence and Time-Delay Estimation", G. C. Carter, Ed., IEEE Press, New York, 1993.

1992

P. Rao and R. C. Bilger, Detection of cubic difference tones using the trispectrum, Journal of the Acoustical Society of America, vol. 4, p. 2409, 1992.
P. Rao, R. C. Bilger, A. Russell, and T. Meyer, Techniques for measuring low-level otoacoustic emissions, Journal of the Acoustical Society of America, vol. 91, no. 4, p. 2409, 1992.

1991

P. Rao and F. Taylor, Detection and localization of narrowband transient signals using the Wigner distribution, Journal of the Acoustical Society of America, vol. 90, no. 3, 1991, pp. 1423-1434.
C. Griffin, P. Rao, and F. J. Taylor, Round-off error analysis of the discrete-Wigner distribution using fixed point arithmetic, IEEE Trans. Signal Processing, vol. 39, no. 9, 1991.
... signal detector, Proc. of the International Conf. on Acoustics, Speech and Signal Processing, Toronto, Canada, 1991.

1990

P. Rao, G. Harrison and F. Taylor, Real-time monitoring of vibration using the Wigner distribution, Sound and Vibration, vol. 24, no. 5, pp. 22-25, 1990.
P. Rao and F. Taylor, Estimation of instantaneous frequency using the discrete-Wigner distribution, Electronics Letters, vol. 26, no. 4, pp. 246-248, 1990.

1989

C. Griffin, P. Rao and F. Taylor, Round-off error analysis of the discrete-Wigner distribution using fixed point arithmetic, Proc. of the International Conf. on Acoustics, Speech and Signal Processing, Glasgow, Scotland, 1989.

