
Neural Networks for High-Dimensional Tabular Biomedical Datasets


If you have a question about this talk, please contact Mateja Jamnik.

Modern machine learning algorithms frequently overfit on small-sample, high-dimensional tabular datasets, which are common in medicine, bioinformatics and drug discovery. How can we reduce overfitting on tabular datasets where the number of features D far exceeds the number of samples N (D >> N)?

This talk presents two neural methods for learning from small-sample, high-dimensional tabular datasets. First, we present WPFS, a parameter-efficient neural architecture that performs global feature selection during training. Second, we present GCondNet, a general approach that uses Graph Neural Networks (GNNs) to incorporate the implicit relationships between samples when training standard neural networks. GCondNet exploits the high dimensionality of the data by creating many small graphs that capture the structure between samples within a feature. We show that WPFS and GCondNet outperform both standard and more recent methods on real-world biomedical datasets.
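
To make the idea of "many small graphs" concrete, here is a minimal sketch of one way such graphs could be built: one graph per feature, with samples as nodes and edges linking each sample to the samples whose value for that feature is closest. The abstract does not say how GCondNet constructs its graphs, so the nearest-neighbour rule, the function name per_feature_knn_graphs and the parameter k are illustrative assumptions, not the authors' implementation. The GNN step that would consume these graphs is omitted.

```python
import random


def per_feature_knn_graphs(X, k=2):
    """Build one small graph per feature (column) of X.

    Hypothetical helper: nodes are samples, and each sample is linked to
    the k other samples whose value for that feature is closest.
    Returns a list with one edge list per feature; each edge is a
    (source_sample, target_sample) pair of row indices.
    """
    n_samples = len(X)
    n_features = len(X[0])
    graphs = []
    for j in range(n_features):
        col = [row[j] for row in X]
        edges = []
        for i in range(n_samples):
            # Rank all other samples by how close their value is for feature j.
            others = [m for m in range(n_samples) if m != i]
            others.sort(key=lambda m: abs(col[m] - col[i]))
            edges.extend((i, m) for m in others[:k])
        graphs.append(edges)
    return graphs


# Toy data with D >> N: 6 samples, 50 features.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(6)]
graphs = per_feature_knn_graphs(X, k=2)
```

With N = 6 and D = 50 this yields 50 graphs of 6 nodes each, which shows how a large feature count can be turned into many small per-feature graphs over the same handful of samples.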

You can also join us on Zoom

This talk is part of the Artificial Intelligence Research Group Talks (Computer Laboratory) series.

