BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Evaluating machine learning models for prediction of attention
  deficit hyperactivity disorder among autistic individuals using genetic
  data - Niran Okewole\, Department of Psychiatry
DTSTART:20260216T120000Z
DTEND:20260216T123000Z
UID:TALK241858@talks.cam.ac.uk
CONTACT:Sam Nallaperuma-Herzberg
DESCRIPTION:Autistic individuals with co-occurring attention deficit hyper
 activity disorder (ADHD) experience additional challenges which are amenab
 le to interventions\, especially if identified early. Prediction algorithm
 s often utilise regression methods which have limited decision boundaries.
  This study thus aimed to evaluate the utility of machine learning methods
  in combination with genetic data to predict co-occurring ADHD among autis
 tic individuals. The study was conducted among autistic individuals [n=13\
 ,290] of genetically inferred European ancestry in the Simons Foundation P
 owering Autism Research (SPARK) dataset. ADHD diagnosis was based on infor
 mant-reported clinician diagnosis. Features included age at registration
  in
  the study\, sex\, presence/absence of cognitive impairment\, three polyge
 nic scores (ADHD\, depression and educational attainment)\, and the first 
 10 genetic principal components. Models tested included logistic
  regression
 \, elastic net\, random forest\, gradient boosting and a stacking ensemble
  classifier. The data was split into training (80%) and test (20%) sets\, 
 with iterations of 5-fold and 10-fold cross-validation in the training set
 . Model specifications included hyperparameter tuning using a grid search 
 approach. Global model explainability was assessed using Shapley Additive 
 Explanations (SHAP). External validation was conducted in genetically infe
 rred African ancestry individuals in the SPARK dataset [n=2\,489]. Tree-ba
 sed methods were found to perform better than linear models by ~3 percenta
 ge points in AUC. The best performing model utilising all features was Ran
 dom Forest (training set: AUC = 0.707\, F1 = 0.653\, Balanced Accuracy = 0
 .659\; test set: AUC = 0.703\, F1 = 0.667\, Balanced Accuracy = 0.652). Mo
 del performance was better for males than for females. External validation
  showed reduced performance in African ancestry individuals (AUC = 0.649\,
  F1 = 0.587\, Balanced Accuracy = 0.625). We conclude that tree-based pred
 ictive models incorporating genetic data are promising although not curren
 tly suitable for individual-level prediction. Optimisations will be requir
 ed for females and individuals of African ancestry.
LOCATION:SS03 Seminar Room\, William Gates building (Department of Computer
  Science and Technology)
END:VEVENT
END:VCALENDAR
