Dynamic Causal Monitoring for Distributed Systems
If you have a question about this talk, please contact Andrew Rice.

Monitoring and troubleshooting distributed systems is notoriously difficult; potential problems are complex, varied, and unpredictable. The de facto monitoring and diagnosis tools at our disposal today (logs, counters, and metrics) have two important limitations: what gets recorded is defined a priori, and the information is recorded in a component- or machine-centric way, making it hard to correlate events that cross these boundaries.

In this talk I will describe Pivot Tracing, a monitoring framework for distributed systems that addresses both limitations by combining dynamic instrumentation with causal tracing to fundamentally increase the power of both. Through a novel relational operator, the happened-before join, Pivot Tracing gives users the ability to define arbitrary metrics at runtime at one point of the system, while selecting, filtering, and grouping by events meaningful in other parts of the system, even when crossing component or machine boundaries.

I will describe our prototype of Pivot Tracing for Java-based systems, and show some examples from our evaluation on a heterogeneous Hadoop cluster comprising HDFS, HBase, MapReduce, and YARN. We found that Pivot Tracing can effectively identify a diverse range of root causes such as software bugs, misconfiguration, and limping hardware. Further, Pivot Tracing is dynamic, extensible, and enables cross-tier analysis between any inter-operating applications, with low execution overhead.

Bio: Rodrigo Fonseca is an assistant professor at Brown University’s Computer Science Department. He holds a PhD from UC Berkeley, and prior to Brown was a visiting researcher at Yahoo! Research. He is broadly interested in networking, distributed systems, and operating systems. His research involves seeking better ways to build, operate, and diagnose distributed systems, including large-scale internet systems, cloud computing, and mobile computing. He is currently working on dynamic tracing infrastructures for these systems, on new ways to leverage network programmability, and on better ways to manage energy usage in mobile devices.

This talk is part of the Computer Laboratory Digital Technology Group (DTG) Meetings series.
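
As a rough illustration of the happened-before join mentioned in the abstract, the sketch below joins client-side events with disk-read events recorded on other machines, keeping only pairs where the client event causally precedes the read, and then groups the bytes read by the client application. It is a toy model, not Pivot Tracing's implementation or query language: the `Event` class, the explicit `parents` links used to represent causality, and all machine and attribute names are assumptions made up for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Event:
    """A recorded event (hypothetical model, not Pivot Tracing's data format)."""
    event_id: str
    machine: str
    attrs: dict
    # Ids of the events that directly caused this one (assumed to be tracked).
    parents: list = field(default_factory=list)


def happened_before(a, b, by_id):
    """Return True if event a is a causal ancestor of event b."""
    stack = list(b.parents)
    seen = set()
    while stack:
        pid = stack.pop()
        if pid == a.event_id:
            return True
        if pid not in seen:
            seen.add(pid)
            stack.extend(by_id[pid].parents)
    return False


def happened_before_join(left, right, by_id):
    """Pair each right event with every left event that causally precedes it."""
    return [(a, b) for b in right for a in left if happened_before(a, b, by_id)]


# Two client requests; their effects reach disk reads on other machines.
c1 = Event("c1", "client-1", {"app": "scan"})
c2 = Event("c2", "client-2", {"app": "get"})
m1 = Event("m1", "server-1", {}, parents=["c1"])            # intermediate hop
d1 = Event("d1", "storage-a", {"bytes": 100}, parents=["c1"])
d2 = Event("d2", "storage-b", {"bytes": 50}, parents=["m1"])  # reached via m1
d3 = Event("d3", "storage-a", {"bytes": 7}, parents=["c2"])

events = [c1, c2, m1, d1, d2, d3]
by_id = {e.event_id: e for e in events}

clients = [c1, c2]
reads = [d1, d2, d3]

# Metric defined at the storage layer, grouped by an attribute from the client.
totals = defaultdict(int)
for client_event, read_event in happened_before_join(clients, reads, by_id):
    totals[client_event.attrs["app"]] += read_event.attrs["bytes"]

print(dict(totals))
```

The point of the example is that the grouping key (`app`) exists only at the client while the metric (`bytes`) exists only at the storage machines; the causal relation is what lets the two be combined. A real system would have to propagate this causal information along with each request at runtime rather than look it up in a global event table as done here.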