Many learning problems are formulated as the minimization of a loss function over a training set of examples. Distributed gradient methods running on a cluster are often used for this purpose. In this talk, we discuss how the variability of task execution times at cluster nodes affects system throughput. In particular, a simple but accurate model allows us to quantify how the time to solve the minimization problem depends on the network of information exchanges among the nodes. Interestingly, we show that, even when communication overhead is negligible, the clique is not necessarily the most effective topology, contrary to what previous work commonly assumes.
Dr. Giovanni Neglia is with INRIA Sophia Antipolis, France. He obtained his Ph.D. in 2004 from Università degli Studi di Palermo, Italy. His research is in complex networks, particularly communication networks, peer-to-peer networks, and smart grids. He has won several best paper awards.