Ensuring fast and seamless service to users is critical for today's cloud services. However, guaranteeing fast responses can be challenging because of the random service delays that are common in data centers. In this talk, I explore the use of redundancy to combat such service variability. For example, replicating a computing task at multiple servers and waiting for the earliest copy to finish reduces service time. However, the redundant tasks can consume more computing resources and also delay subsequent tasks. I present a queueing-theoretic framework to answer fundamental questions such as: 1) How many replicas to launch? 2) Which queues to join? 3) When to issue and cancel the replicas? This framework reveals surprising regimes where replication reduces both delay and resource cost. The task replication idea can also be generalized to analyze latency in content download from erasure-coded storage. More broadly, this work lays the theoretical foundation for studying queues with redundancy, uncovering many interesting future directions in cloud infrastructure, crowdsourcing, and beyond.
Gauri Joshi is a Research Staff Member at IBM T. J. Watson Research Center. She will be joining the Carnegie Mellon ECE department as an assistant professor in Fall 2017. Gauri completed a Ph.D. at MIT EECS in June 2016. Before coming to MIT, she completed a B.Tech. and M.Tech. in Electrical Engineering at the Indian Institute of Technology (IIT) Bombay in 2010. Her awards and honors include the Best Thesis Prize in Computer Science at MIT (2012), the Institute Gold Medal of IIT Bombay (2010), the Claude Shannon Research Assistantship (2015-16), and the Schlumberger Faculty for the Future fellowship (2011-2015).