Perplexity AI: Under the Hood of LLM Inference
If you have a question about this talk, please contact Ben Karniely.

Abstract: Perplexity is a search and answer engine that leverages LLMs to provide high-quality, citation-backed answers. The AI Inference team within the company is responsible for serving the models behind the product, ranging from single-GPU embedding models to multi-node sparse Mixture-of-Experts language models. This talk provides insight into the in-house runtime behind inference at Perplexity, with a particular focus on efficiently serving some of the largest available open-source models.

Biography: Nandor Licker is an AI Inference Engineer at Perplexity, focusing on LLM runtime implementation and GPU performance optimization.

Register for the talk at the following link: https://luma.com/dx1ggxgk

Some catering will be provided after the talk.

This talk is part of the Technical Talks - Department of Computer Science and Technology series.