Proteomizer & GhostBuster: AI Tools for Proteomic Inference and Ghost Gene Annotation
If you have a question about this talk, please contact Pietro Lio.

Hybrid (in person and online)

Despite the rapid growth of multiomic datasets, our ability to interpret and integrate transcriptomic, proteomic, and regulatory information remains limited by both biological complexity and systemic biases in data and literature. In this talk, I will present two complementary machine learning frameworks, Proteomizer and GhostBuster, that address these challenges from distinct but synergistic angles.

Proteomizer is a deep learning platform that predicts proteomic landscapes from transcriptomic and miRNomic profiles, achieving state-of-the-art accuracy (r = 0.68) on over 8,600 matched samples. Beyond prediction, Proteomizer enhances differential expression analysis and enables mechanistic insights through explainable AI, revealing regulatory interactions that underlie transcript-protein discrepancies.

GhostBuster tackles a different but equally critical issue: literature bias in gene annotation. Many human genes remain understudied because of sociological dynamics that skew research focus. GhostBuster is the first ML framework explicitly designed to mitigate this bias, using unbiased datasets (e.g., TCGA, LINCS) to uncover novel gene functions, disease associations, and pathway memberships. It demonstrates that models trained on less-biased data are significantly more effective at identifying emerging biological knowledge, particularly for “ghost genes”.

Together, these tools exemplify a new generation of interpretable, bias-aware machine learning approaches that not only improve predictive performance but also expand our capacity to generate biologically meaningful hypotheses, especially for the vast uncharted regions of the human genome.

This talk is part of the Data Science and AI in Medicine series.