Modern Cache Prefetching and Page Size Aware Cache Prefetching
If you have a question about this talk, please contact Prof Simon Moore.

Note unusual time

The working set sizes of contemporary applications are growing faster than cache sizes, resulting in frequent main memory accesses that degrade system performance because of the disparity between processor and memory speeds. Prefetching data blocks ahead of demand accesses has proven successful at attenuating this bottleneck. However, prefetchers operating in the physical address space leave significant performance on the table by limiting their pattern detection to 4KB physical page boundaries, even though modern systems use page sizes larger than 4KB to mitigate address translation overheads.

In this talk I will first discuss the design and operation of the signature-path prefetcher (SPP), the baseline prefetcher for this work, which is representative of the prefetchers seen in many current mid-to-last-level cache designs in industry. I will then use it as a basis for discussing how the high usage of large pages in modern systems can be exploited to increase the effectiveness of spatial cache prefetching. I will explore the design of Page Size Aware (PSA) prefetchers, which leverage large page sizes through a microarchitectural scheme that propagates page size information to the lower-level cache prefetchers, enabling the prefetcher's metadata to be managed at the actual size of the page the data resides in. Our scheme enables safe prefetching beyond 4KB physical page boundaries when the accessed blocks reside in large pages. We show that our scheme for page-size awareness is compatible with any cache prefetcher without requiring modifications to the prefetcher's design. Interestingly, we find that in several cases, even when data lies in large pages, it is sometimes best to manage the prefetcher state at a 4KB boundary; we therefore introduce a set dueling technique to determine the ideal metadata page size for a given workload.
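The page-boundary restriction described above can be illustrated with a minimal Python sketch. It is not taken from the talk or from SPP: the function name `prefetch_candidates`, the 64-byte block size, the 2MB large-page size, and the fixed-stride pattern are all assumptions chosen for illustration. It only shows why a prefetcher that knows the true page size can safely issue more prefetches than one that assumes 4KB.

```python
BLOCK_SIZE = 64                # bytes per cache block (assumed)
PAGE_4KB = 4 * 1024
PAGE_2MB = 2 * 1024 * 1024     # one common large-page size (assumed)


def prefetch_candidates(trigger_addr, delta_blocks, degree, page_size):
    """Hypothetical helper: return block-aligned physical addresses to
    prefetch, stopping as soon as a candidate leaves the page that
    contains trigger_addr."""
    page_base = trigger_addr - (trigger_addr % page_size)
    page_end = page_base + page_size
    block_addr = trigger_addr - (trigger_addr % BLOCK_SIZE)

    candidates = []
    for _ in range(degree):
        block_addr += delta_blocks * BLOCK_SIZE
        # Crossing a physical page boundary is unsafe: the next physical
        # page may belong to unrelated data, so stop here.
        if not (page_base <= block_addr < page_end):
            break
        candidates.append(block_addr)
    return candidates


# Trigger access near the end of a 4KB region, stride of +1 block.
trigger = 0x1F80
print([hex(a) for a in prefetch_candidates(trigger, 1, 4, PAGE_4KB)])
print([hex(a) for a in prefetch_candidates(trigger, 1, 4, PAGE_2MB)])
```

With the 4KB assumption only one candidate (0x1fc0) survives before the boundary at 0x2000; when the same access is known to lie in a 2MB page, all four candidates are kept.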
Our evaluation shows that our proposals improve single-core geomean performance by 8.1% (up to 90% for some workloads) over the original implementation of the considered prefetchers, across 80 memory-intensive workloads.

Bio: Paul V. Gratz is a Professor in the Department of Electrical and Computer Engineering at Texas A&M University. His research interests include efficient and reliable design in the context of high-performance computer architecture, processor memory systems, and on-chip interconnection networks. He received his B.S. and M.S. degrees in Electrical Engineering from the University of Florida in 1994 and 1997, respectively. From 1997 to 2002 he was a design engineer with Intel Corporation. He received his Ph.D. degree in Electrical and Computer Engineering from the University of Texas at Austin in 2008. His paper "Synchronized Progress in Interconnection Networks (SPIN): A New Theory for Deadlock Freedom" was selected as a Top Pick from the architecture conferences in 2018 by IEEE Micro. His papers "Path Confidence based Lookahead Prefetching" and "B-Fetch: Branch Prediction Directed Prefetching for Chip-Multiprocessors" were nominated for best paper at MICRO '16 and MICRO '14, respectively. At ASPLOS '09, Dr. Gratz received a best paper award for "An Evaluation of the TRIPS Computer System." In 2016 he received the "Distinguished Achievement Award in Teaching – College Level" from the Texas A&M Association of Former Students, and in 2017 he received the "Excellence Award in Teaching, 2017" from the Texas A&M College of Engineering.

This talk is part of the Computer Laboratory Computer Architecture Group Meeting series.