Brain-phenotype predictions can survive across diverse real-world data.

Adkinson, Brendan D; Rosenblatt, Matthew; Dadashkarimi, Javid; Tejavibulya, Link; Jiang, Rongtao; Noble, Stephanie; Scheinost, Dustin

Adkinson, Brendan D; Rosenblatt, Matthew; Dadashkarimi, Javid; Tejavibulya, Link; Jiang, Rongtao; Noble, Stephanie; Scheinost, Dustin.

Affiliation

Adkinson BD; Interdepartmental Neuroscience Program, Yale School of Medicine, New Haven, CT, 06510, USA.
Rosenblatt M; Department of Biomedical Engineering, Yale University, New Haven, CT, 06520, USA.
Dadashkarimi J; Department of Radiology, Athinoula. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, 02129, USA.
Tejavibulya L; Department of Radiology, Harvard Medical School, Boston, MA, 02129, USA.
Jiang R; Interdepartmental Neuroscience Program, Yale School of Medicine, New Haven, CT, 06510, USA.
Noble S; Department of Radiology & Biomedical Imaging, Yale School of Medicine, New Haven, CT, 06510, USA.
Scheinost D; Department of Radiology & Biomedical Imaging, Yale School of Medicine, New Haven, CT, 06510, USA.

bioRxiv ; 2024 Jan 24.

Article in En | MEDLINE | ID: mdl-38328100

ABSTRACT

ABSTRACT

Recent work suggests that machine learning models predicting psychiatric treatment outcomes based on clinical data may fail when applied to unharmonized samples. Neuroimaging predictive models offer the opportunity to incorporate neurobiological information, which may be more robust to dataset shifts. Yet, among the minority of neuroimaging studies that undertake any form of external validation, there is a notable lack of attention to generalization across dataset-specific idiosyncrasies. Research settings, by design, remove the between-site variations that real-world and, eventually, clinical applications demand. Here, we rigorously test the ability of a range of predictive models to generalize across three diverse, unharmonized samples the Philadelphia Neurodevelopmental Cohort (n=1291), the Healthy Brain Network (n=1110), and the Human Connectome Project in Development (n=428). These datasets have high inter-dataset heterogeneity, encompassing substantial variations in age distribution, sex, racial and ethnic minority representation, recruitment geography, clinical symptom burdens, fMRI tasks, sequences, and behavioral measures. We demonstrate that reproducible and generalizable brain-behavior associations can be realized across diverse dataset features with sample sizes in the hundreds. Results indicate the potential of functional connectivity-based predictive models to be robust despite substantial inter-dataset variability. Notably, for the HCPD and HBN datasets, the best predictions were not from training and testing in the same dataset (i.e., cross-validation) but across datasets. This result suggests that training on diverse data may improve prediction in specific cases. Overall, this work provides a critical foundation for future work evaluating the generalizability of neuroimaging predictive models in real-world scenarios and clinical settings.

Fulltext

Add to My VHL

XML

PubMed Links

Search on Google

Full text: 1 Collection: 01-internacional Database: MEDLINE Type of study: Prognostic_studies / Risk_factors_studies Language: En Journal: BioRxiv Year: 2024 Document type: Article Affiliation country: Estados Unidos

Fulltext

Add to My VHL

XML

PubMed Links

Search on Google