The Case for Decentralized Scheduling in Modern Datacenters
If you have a question about this talk, please contact Richard Mortier.

Modern data centres serve as a backbone for executing diverse user workloads. The growing demand for their resources has led to high volumes of traffic, requiring clusters to operate at high utilization. In this talk, I shall detail how data centre schedulers, which are responsible for mapping workload tasks to resources, perform under such challenging conditions.

I will show that centralized schedulers, while globally informed, do not scale well under high load because they generate a lot of network traffic by continuously transferring updated node data. Conversely, distributed schedulers scale well but lack a precise global view of cluster resources, which leads to suboptimal task allocations. Consequently, these existing schedulers impose wait times up to three times longer on tail tasks, leading to large variance in inter-task start times and hence longer task and job completion times.

I will then describe recent advances in decentralized scheduling, focusing on performance, scalability, and load balancing. I will present our approach of job-aware decentralized scheduling, which effectively reduces task wait times even under high cluster load. I will also talk about how distributed optimization algorithms can be implemented within the framework of decentralized scheduling in order to provide theoretical guarantees of convergence to an optimal schedule. By the end of this talk, I hope to convince you that decentralized schedulers achieve a good balance of scale and performance, and are indeed the most practical solution for data centres.

Bio: Smita Vijayakumar recently completed her PhD in Computer Science at the University of Cambridge, under the supervision of Evangelia Kalyvianaki. As part of her thesis, she developed a decentralized scheduling framework to reduce tail task latencies in highly utilized datacenters. She has over twelve years of industry experience at companies such as Cisco and Juniper, working on cloud computing, networking, and distributed systems. She also has an MS in Computer Science from The Ohio State University, where her work investigated cloud resource allocation to bottleneck stages for processing streaming applications. Her research has been published in top-tier ACM and IEEE conferences. She has also been actively involved in mentoring, teaching, and community leadership, including founding Women Who Go in India. Smita’s expertise spans cloud scheduling, resource management, and scalable distributed systems.

This talk is part of the Computer Laboratory Systems Research Group Seminar series.