Non-negative matrix factorization (NMF) has been a popular approach for separating a mixture into its constituent sources. The NMF model for a source is learned from clean training sounds and can then be used to explain the contribution of that source to any mixture. Recently, neural networks and deep learning have become popular for source separation. In this talk, we will present a neural network that can act as an equivalent to NMF and show how it can be used for source separation. We will also present a neural network that can act as an equivalent to short-time front-end transforms and demonstrate the ability of the network to learn optimal, real-valued basis functions directly from the raw waveform of a signal. Combining the two, we propose end-to-end models that bring the concepts of NMF and deep learning together for source separation. We will also show how such a model leads naturally to novel extensions that can potentially improve separation performance.
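For readers unfamiliar with the NMF baseline that the talk builds on, the following is a minimal sketch of supervised NMF separation, not the speaker's implementation. It assumes non-negative magnitude spectrograms as input, a Euclidean cost with multiplicative updates, and a soft-mask reconstruction; the function names `nmf` and `separate` and all parameter values are illustrative assumptions.

```python
import numpy as np

def nmf(V, rank, n_iter=200, W=None, seed=0, eps=1e-9):
    """Approximate V by W @ H using multiplicative updates (Euclidean cost).

    If W is supplied it is held fixed and only the activations H are
    estimated; `rank` is ignored in that case.
    """
    rng = np.random.default_rng(seed)
    update_W = W is None
    if update_W:
        W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        if update_W:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def separate(V_mix, W_a, W_b, n_iter=200, eps=1e-9):
    """Split a mixture spectrogram using bases learned from clean sources."""
    # Stack the per-source bases and fit only the activations on the mixture.
    W = np.concatenate([W_a, W_b], axis=1)
    _, H = nmf(V_mix, W.shape[1], n_iter=n_iter, W=W)
    k = W_a.shape[1]
    est_a = W_a @ H[:k]   # contribution explained by source A's bases
    est_b = W_b @ H[k:]   # contribution explained by source B's bases
    total = est_a + est_b + eps
    # Soft masks distribute the mixture energy between the two sources.
    return V_mix * est_a / total, V_mix * est_b / total
```

In use, one would call `nmf` on a clean spectrogram of each source to obtain `W_a` and `W_b`, then pass them with the mixture spectrogram to `separate`.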
Shrikant Venkataramani is a PhD candidate in the Department of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign. He received his Bachelor's degree in Electronics and Communications Engineering from Mumbai University and his M.Tech degree from IIT Bombay. His research focuses on developing machine learning algorithms for audio processing and source separation, with an emphasis on generative neural network models and non-negative models for audio. He has completed two internships at Adobe Research and received the "Best student paper award" at MLSP 2017.