Surgical data using LLMs
If you have a question about this talk, please contact Pietro Lio.

The automatic summarization of surgical videos is crucial for improving procedural documentation, surgical training, and post-operative analysis. This thesis presents a new method at the intersection of artificial intelligence and medicine, seeking to develop innovative machine-learning models with real-world applications in surgery. To this end, we propose a multi-modal approach that generates video summaries by drawing on recent advances in both computer vision and large language models. Concretely, the model processes surgical videos in three key steps. First, after dividing the video into clips, visual features are extracted at the frame level with visual transformers, with the goal of detecting the tools, organs, tissues, and actions performed by the surgeon. These visual features are then translated into frame captions using large language models. Second, at the video level, the emphasis shifts to temporal features, which are obtained with a ViViT-based encoder that takes as input both the clips and the frame captions extracted earlier. Analogously to the frame captions, the temporal features are converted into clip captions that capture the overall context of each clip. The final phase combines the clip descriptions into a surgical report using an LLM designed for this task. We train and evaluate our model on the CholecT50 dataset, leveraging instrument and action frame annotations across 50 laparoscopic videos. Experimental results demonstrate that our method produces coherent and contextually meaningful summaries, with 96% precision for tool detection and a BERTScore of 0.74 for temporal context extraction. This research contributes to the development of AI-assisted tools for surgical reporting and analysis.

This talk is part of the Foundation AI series.
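To make the three-stage pipeline described in the abstract easier to follow, here is a minimal sketch of the data flow (frame-level detection and captioning, clip-level temporal captioning, report generation). All class, function, and variable names are hypothetical placeholders for illustration only, not the authors' implementation; the real system uses visual transformers, a ViViT-based encoder, and LLMs where this sketch uses simple stubs.

```python
# Hypothetical sketch of the three-stage surgical-video summarization pipeline.
# All names and stub outputs are illustrative, not the authors' code.

from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Clip:
    """A short segment of the surgical video (a handful of frames)."""
    frames: List[str]  # placeholder frame identifiers or file paths


def extract_visual_features(clip: Clip) -> List[Dict]:
    """Stage 1a: frame-level visual transformer (stubbed).

    In the described system this detects tools, organs, tissues and the
    surgeon's actions per frame; here we return dummy detections.
    """
    return [{"tool": "grasper", "action": "retract", "target": "gallbladder"}
            for _ in clip.frames]


def caption_frames(detections: List[Dict]) -> List[str]:
    """Stage 1b: turn per-frame detections into captions (an LLM in the
    described system; a simple template here)."""
    return [f"The surgeon uses a {d['tool']} to {d['action']} the {d['target']}."
            for d in detections]


def caption_clip(clip: Clip, frame_captions: List[str]) -> str:
    """Stage 2: a ViViT-style temporal encoder takes the clip plus its frame
    captions and produces a clip-level caption capturing overall context."""
    return f"Over {len(clip.frames)} frames: {frame_captions[0]}"


def generate_report(clip_captions: List[str]) -> str:
    """Stage 3: an LLM aggregates the clip captions into a surgical report."""
    return "Surgical report:\n" + "\n".join(f"- {c}" for c in clip_captions)


def summarise_video(clips: List[Clip]) -> str:
    """End-to-end flow: clips -> frame captions -> clip captions -> report."""
    clip_captions = []
    for clip in clips:
        detections = extract_visual_features(clip)
        frame_captions = caption_frames(detections)
        clip_captions.append(caption_clip(clip, frame_captions))
    return generate_report(clip_captions)


if __name__ == "__main__":
    video = [Clip(frames=[f"frame_{i}" for i in range(8)]) for _ in range(3)]
    print(summarise_video(video))
```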