Semi-supervised learning methods form a family of machine learning techniques that use a small set of labelled points together with a similarity graph to classify data points into predefined classes. For each class, a semi-supervised method produces a classification function, under the assumption that this function should vary smoothly over the similarity graph; this idea is formulated as an optimization problem. The Standard Laplacian (transductive learning) and Normalized Laplacian (diffusion kernel) methods are well-known examples. The choice of kernel reflects how the underlying similarity graph influences the values of the classification functions. In this talk, we analyze a general family of semi-supervised methods, explain the differences between them, and provide recommendations for the choice of kernel parameters and labelled points. In particular, it appears preferable to choose the method and the kernel based on the properties of the labelled points. Our general framework yields a particularly promising PageRank-based method. We illustrate our theoretical conclusions with a typical benchmark example, a clustered preferential attachment model, and two applications: classification of Wikipedia pages and of content in P2P networks. (This talk is based on joint work with P. Goncalves, A. Mishenin and M. Sokol.)
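To make the flavour of these methods concrete, here is a minimal sketch of a PageRank-style semi-supervised classifier on a toy graph. This is an illustrative assumption, not the exact method of the talk: for each class we seed a personalized PageRank vector at its labelled point and assign every node to the class whose vector gives it the most mass. The graph, the damping parameter `alpha`, and the labelled nodes are all made up for the example.

```python
import numpy as np

# Toy similarity graph: two clusters {0,1,2} and {3,4,5} joined by one edge (2-3).
W = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

n = W.shape[0]
D_inv = np.diag(1.0 / W.sum(axis=1))   # inverse degree matrix
alpha = 0.85                           # damping parameter (illustrative choice)

# One labelled point per class: node 0 -> class 0, node 5 -> class 1.
Y = np.zeros((n, 2))
Y[0, 0] = 1.0
Y[5, 1] = 1.0

# Classification functions: each column of F is a personalized PageRank vector,
# F = (1 - alpha) * (I - alpha * W D^{-1})^{-1} Y, with W D^{-1} column-stochastic.
F = np.linalg.solve(np.eye(n) - alpha * W @ D_inv, (1 - alpha) * Y)

# Assign each node to the class with the largest classification-function value.
labels = F.argmax(axis=1)
print(labels)  # -> [0 0 0 1 1 1]: each cluster inherits its labelled point's class
```

The smoothness assumption shows up in the matrix inverse: random-walk mass spreads from each labelled point along the edges, so tightly connected nodes receive similar classification-function values and end up in the same class.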
Dr. Konstantin Avrachenkov is Director of Research at INRIA Sophia Antipolis in the MISTRAL/MAESTRO team. He obtained his Master's degree in Control Engineering from St. Petersburg State Polytechnic University and his Ph.D. in Mathematics from the University of South Australia. His research interests include matrix perturbation theory, Markov chains and Markov decision processes, game theory (particularly network games), and queueing theory, with an application focus on scheduling, optimization, and inference in computer and communication networks, the Internet, the World Wide Web, and complex networks in general.