Scaling AI Systems with Optical I/O
If you have a question about this talk, please contact Srinivasan Keshav.

The emergence of optical I/O chiplets enables compute and memory chips to communicate at several Tbps of bandwidth. Many technology trends point to the arrival of optical I/O chiplets as a key industry inflection point for realizing fully disaggregated systems. In this talk, I will focus on the potential of optical I/O-enabled accelerators for building high-bandwidth interconnects tailored to distributed machine learning training. Our goal is to scale state-of-the-art ML training platforms, such as NVIDIA's DGX, from a few tightly connected GPUs in one package to hundreds of GPUs while maintaining Tbps communication bandwidth across the chips. Our design accelerates the training of popular ML models using a device placement algorithm that partitions the training job across nodes with data, model, and pipeline parallelism, while ensuring a sparse and local communication pattern that the interconnect can support efficiently.

Bio: Manya Ghobadi is an assistant professor in the EECS department at MIT. Before MIT, she was a researcher at Microsoft Research and a software engineer at Google Platforms. Manya is a computer systems researcher with a networking focus and has worked on a broad set of topics, including data center networking, optical networks, transport protocols, and network measurement. Her work has won the best dataset award and the best paper award at the ACM Internet Measurement Conference (IMC), as well as a Google research excellent paper award.

This talk is part of the Computer Laboratory Systems Research Group Seminar series.