Given a set of images that might share a common object, can we detect and extract it? We answer this by performing image co-segmentation, the problem of segmenting similar objects from more than one image. Variations across image sources make it very difficult to extract the common object(s) accurately, because one does not know which local features are good to use or how to match them across images. The problem becomes harder still when only a subset of the images contains the common object and the remaining images act as outliers. In this talk, we present our unsupervised and supervised approaches to this problem. One unsupervised approach uses super-pixel segmentation to represent every image as a region adjacency graph (RAG) and performs computationally efficient subgraph matching to obtain the common segment. This is achieved using an intermediate latent class graph and performing maximally occurring common subgraph matching. The other unsupervised approach treats co-segmentation as a foreground-background classification problem. We solve it as a common foreground labeling problem in a discriminative feature space that we learn in an unsupervised manner using an improved linear discriminant analysis (LDA). Next, we present a supervised approach that performs this classification better using a convolutional neural network framework. The network is composed of a metric learning sub-network and a decision sub-network, leading to a novel conditional Siamese encoder-decoder network for estimating a co-segmentation mask. The details of the methods and their performance on benchmark datasets will be presented.
Rajbabu Velmurugan is an Associate Professor in the Department of Electrical Engineering, Indian Institute of Technology Bombay. He received his Ph.D. in electrical and computer engineering from Georgia Institute of Technology, USA, in 2007. He was with L&T, India, from 1995 to 1996 and with The MathWorks, USA, from 1998 to 2001. He joined IIT Bombay in 2007. His research interests are broadly in signal processing: inverse problems with applications in image and audio processing, such as blind deconvolution and source separation; low-level image processing and video analysis; speech enhancement using multi-microphone arrays; and the development of efficient hardware systems for signal processing applications.