ABSTRACT
Functional connectomes reveal biomarkers of individual psychological or clinical traits. However, the analytic pipelines typically used to derive them from rest-fMRI cohorts vary greatly. Here, we consider a specific type of study, one that applies predictive models to the edge weights of functional connectomes, and highlight the best modeling choices for it. We systematically study the prediction performance of models in 6 different cohorts totaling 2000 individuals, encompassing neuro-degenerative (Alzheimer's, Post-traumatic stress disorder), neuro-psychiatric (Schizophrenia, Autism), and drug-impact (Cannabis use) clinical settings, as well as a psychological trait (fluid intelligence). The typical prediction procedure from rest-fMRI consists of three main steps: defining brain regions, representing the interactions, and supervised learning. For each step we benchmark typical choices: 8 different ways of defining regions (either pre-defined or generated from the rest-fMRI data), 3 measures to build functional connectomes from the extracted time-series, and 10 classification models to compare functional interactions across subjects. Our benchmarks summarize more than 240 different pipelines and outline modeling choices that show consistent prediction performance despite variations in populations and sites. We find that regions defined from functional data work best; that it is beneficial to capture between-region interactions with a tangent-based parametrization of covariances, a middle ground between correlations and partial correlations; and that simple linear predictors such as logistic regression give the best predictions. Our work is a step toward establishing reproducible imaging-based biomarkers for clinical settings.
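The last two steps of the procedure described above (representing interactions, then supervised learning) can be sketched in a few lines. This is only an illustration on synthetic time series, not the benchmarked pipelines: the helper names `_sym_apply` and `tangent_features` are hypothetical, the region-definition step is skipped, and the tangent space is taken at the arithmetic mean of the covariances, which is a simplifying assumption (a geometric mean is the more usual reference point).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def _sym_apply(mat, func):
    """Apply `func` to the eigenvalues of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * func(vals)) @ vecs.T


def tangent_features(covs):
    """Project covariances onto the tangent space at a reference point.

    Assumption: the reference is the arithmetic mean of the covariances.
    """
    ref = covs.mean(axis=0)
    whitener = _sym_apply(ref, lambda v: 1.0 / np.sqrt(v))
    rows, cols = np.triu_indices(covs.shape[1])
    feats = []
    for cov in covs:
        # Whiten by the reference, then take the matrix logarithm.
        tangent = _sym_apply(whitener @ cov @ whitener, np.log)
        # Keep the upper triangle (with diagonal) as the edge-weight vector.
        feats.append(tangent[rows, cols])
    return np.array(feats)


# Synthetic stand-in for region time series: 40 subjects, 10 regions.
rng = np.random.default_rng(0)
n_subjects, n_timepoints, n_regions = 40, 120, 10
labels = np.repeat([0, 1], n_subjects // 2)

covs = []
for label in labels:
    ts = rng.standard_normal((n_timepoints, n_regions))
    if label == 1:
        # Couple regions 0 and 1 in one group to create a group difference.
        ts[:, 1] += 0.8 * ts[:, 0]
    covs.append(np.cov(ts, rowvar=False))
covs = np.array(covs)

X = tangent_features(covs)
clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, labels, cv=5)
print("mean cross-validated accuracy:", scores.mean())
```

On real data, the covariances would come from time series extracted on the chosen brain regions rather than from random numbers.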
Subjects
Benchmarking/methods , Brain/diagnostic imaging , Connectome/methods , Magnetic Resonance Imaging/methods , Models, Neurological , Brain/physiology , Connectome/standards , Humans , Magnetic Resonance Imaging/standards , Rest
ABSTRACT
Resting-state functional Magnetic Resonance Imaging (R-fMRI) holds the promise to reveal functional biomarkers of neuropsychiatric disorders. However, extracting such biomarkers is challenging for complex multi-faceted neuropathologies, such as autism spectrum disorders. Large multi-site datasets increase sample sizes to compensate for this complexity, at the cost of uncontrolled heterogeneity. This heterogeneity raises new challenges, akin to those faced in realistic diagnostic applications. Here, we demonstrate the feasibility of inter-site classification of neuropsychiatric status, with an application to the Autism Brain Imaging Data Exchange (ABIDE) database, a large (N=871) multi-site autism dataset. For this purpose, we investigate pipelines that extract the most predictive biomarkers from the data. These R-fMRI pipelines build participant-specific connectomes from functionally-defined brain areas. Connectomes are then compared across participants to learn patterns of connectivity that differentiate typical controls from individuals with autism. We predict this neuropsychiatric status for participants from the same acquisition sites or from different, unseen ones. Good choices of methods for the various steps of the pipeline lead to 67% prediction accuracy on the full ABIDE data, which is significantly better than previously reported results. We perform extensive validation on multiple subsets of the data defined by different inclusion criteria. This enables detailed analysis of the factors contributing to successful connectome-based prediction. First, prediction accuracy improves as we include more subjects, up to the maximum number of subjects available. Second, the definition of functional brain areas is of paramount importance for biomarker discovery: brain areas extracted from large R-fMRI datasets outperform reference atlases in the classification tasks.
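Prediction on unseen acquisition sites, as described above, amounts to a cross-validation scheme where each held-out fold is an entire site. A minimal sketch with scikit-learn's `LeaveOneGroupOut` follows. The data are synthetic, and the site offsets, feature counts, and effect size are illustrative assumptions, not values from ABIDE.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
n_sites, n_per_site, n_features = 4, 30, 20

X_parts, y_parts, site_parts = [], [], []
for site in range(n_sites):
    # Balanced diagnostic labels within each site.
    y_site = np.tile([0, 1], n_per_site // 2)
    # A site-specific offset mimics scanner and protocol heterogeneity.
    site_offset = rng.normal(scale=0.5, size=n_features)
    X_site = rng.standard_normal((n_per_site, n_features)) + site_offset
    # One feature carries the (synthetic) group difference.
    X_site[:, 0] += 1.5 * y_site
    X_parts.append(X_site)
    y_parts.append(y_site)
    site_parts.append(np.full(n_per_site, site))

X = np.vstack(X_parts)
y = np.concatenate(y_parts)
sites = np.concatenate(site_parts)

# Each fold trains on all sites but one and tests on the held-out site.
cv = LeaveOneGroupOut()
scores = cross_val_score(
    LogisticRegression(max_iter=1000), X, y, groups=sites, cv=cv
)
print("accuracy per held-out site:", scores)
```

Swapping `LeaveOneGroupOut` for a stratified shuffle over all participants would give the intra-site setting instead, where training and test sets share sites.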
Subjects
Autism Spectrum Disorder/diagnostic imaging , Cerebral Cortex/physiopathology , Connectome/methods , Datasets as Topic , Image Processing, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Multicenter Studies as Topic/methods , Adolescent , Adult , Biomarkers , Cerebral Cortex/diagnostic imaging , Child , Connectome/standards , Datasets as Topic/standards , Humans , Image Processing, Computer-Assisted/standards , Magnetic Resonance Imaging/standards , Multicenter Studies as Topic/standards , Reproducibility of Results
ABSTRACT
The disparity between the chronological age of an individual and their brain age, estimated from biological information, has the potential to offer clinically relevant biomarkers of neurological syndromes that emerge late in the lifespan. While prior brain-age prediction studies have relied exclusively on either structural or functional brain data, here we investigate how multimodal brain-imaging data improves age prediction. Using cortical anatomy and whole-brain functional connectivity on a large adult lifespan sample (N=2354, age 19-82), we found that multimodal data improves brain-based age prediction, resulting in a mean absolute prediction error of 4.29 years. Furthermore, we found that the discrepancy between predicted age and chronological age captures cognitive impairment. Importantly, the brain-age measure was robust to confounding effects: head motion did not drive brain-based age prediction and our models generalized reasonably to an independent dataset acquired at a different site (N=475). Generalization performance was increased by training models on a larger and more heterogeneous dataset. The robustness of multimodal brain-age prediction to confounds, its generalizability across sites, and its sensitivity to clinically-relevant impairments suggest promising future application to the early prediction of neurocognitive disorders.
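The two quantities this abstract relies on, the mean absolute prediction error and the discrepancy between predicted and chronological age, can be computed as sketched below. The example is synthetic and makes several assumptions: the two modalities are simply concatenated, the model is a ridge regression, and the feature counts and noise levels are arbitrary. The abstract does not state how the modalities were combined.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_subjects = 200
age = rng.uniform(19, 82, size=n_subjects)

# Two synthetic "modalities", each weakly and linearly related to age.
anatomy = 0.05 * np.outer(age, rng.standard_normal(15))
anatomy += rng.standard_normal((n_subjects, 15))
connectivity = 0.05 * np.outer(age, rng.standard_normal(30))
connectivity += rng.standard_normal((n_subjects, 30))

# Assumption: multimodal features are combined by simple concatenation.
X = np.hstack([anatomy, connectivity])

model = RidgeCV(alphas=np.logspace(-2, 2, 9))
# Out-of-sample predictions: each subject is predicted by a model
# that did not see that subject during training.
predicted_age = cross_val_predict(model, X, age, cv=5)

mae = mean_absolute_error(age, predicted_age)
# Positive values mean the brain "looks older" than the chronological age.
brain_age_delta = predicted_age - age
print("mean absolute error (years):", mae)
```

The per-subject `brain_age_delta` is the value one would then relate to cognitive measures.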
Subjects
Brain/diagnostic imaging , Brain/growth & development , Cognitive Dysfunction/diagnostic imaging , Multimodal Imaging/methods , Adult , Aged , Aged, 80 and over , Cerebral Cortex/diagnostic imaging , Cerebral Cortex/growth & development , Cognitive Dysfunction/psychology , Female , Head Movements , Humans , Magnetic Resonance Imaging , Male , Middle Aged , Models, Neurological , Neuropsychological Tests , Predictive Value of Tests , Reproducibility of Results , Young Adult
ABSTRACT
Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g., multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g., resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain.
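As a small illustration of the unsupervised use mentioned above (finding sub-populations in a cohort), the sketch below reduces dimensionality with PCA and then clusters subjects with k-means. The data are synthetic, and the choices of 5 components and 2 clusters are arbitrary assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_per_group, n_features = 50, 100

# Two synthetic sub-populations that differ by a constant shift.
group_a = rng.standard_normal((n_per_group, n_features))
group_b = rng.standard_normal((n_per_group, n_features)) + 1.0
features = np.vstack([group_a, group_b])

# Reduce dimensionality first, then cluster subjects.
model = make_pipeline(
    PCA(n_components=5, random_state=0),
    KMeans(n_clusters=2, n_init=10, random_state=0),
)
cluster_labels = model.fit_predict(features)
print("subjects per cluster:", np.bincount(cluster_labels))
```

Cluster indices are arbitrary, so any comparison with known groups has to allow for label permutation.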
ABSTRACT
Spontaneous brain activity reveals mechanisms of brain function and dysfunction. Its population-level statistical analysis based on functional images often relies on the definition of brain regions that must summarize efficiently the covariance structure between the multiple brain networks. In this paper, we extend a network-discovery approach, namely dictionary learning, to readily extract brain regions. To do so, we introduce a new tool drawing from clustering and linear decomposition methods by carefully crafting a penalty. Our approach automatically extracts regions from rest fMRI that better explain the data and are more stable across subjects than reference decomposition or clustering methods.
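To give a sense of the general family of methods, here is a sketch of plain sparse dictionary learning with scikit-learn on synthetic data. It is not the approach of this paper: it uses a standard l1 penalty on the codes rather than the paper's crafted penalty, and the data layout (voxels as samples, so that the sparse codes act as spatial maps), the value of `alpha`, and the hard assignment at the end are all assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
n_voxels, n_timepoints, n_networks = 60, 80, 3

# Ground truth: three non-overlapping blocks of 20 voxels each.
true_maps = np.zeros((n_voxels, n_networks))
for k in range(n_networks):
    true_maps[k * 20:(k + 1) * 20, k] = 1.0
time_courses = rng.standard_normal((n_networks, n_timepoints))
noise = 0.1 * rng.standard_normal((n_voxels, n_timepoints))
data = true_maps @ time_courses + noise

# Voxels are the samples, so the sparse codes play the role of spatial
# maps and the dictionary atoms play the role of time courses.
dico = DictionaryLearning(
    n_components=n_networks,
    alpha=1.0,
    max_iter=50,
    transform_algorithm="lasso_lars",
    random_state=0,
)
maps = dico.fit_transform(data)  # shape: (n_voxels, n_networks)

# Crude hard assignment: each voxel goes to the map with the largest
# absolute loading (voxels with all-zero codes fall into region 0).
regions = np.abs(maps).argmax(axis=1)
print("voxels per region:", np.bincount(regions, minlength=n_networks))
```

A penalty that also encourages spatially contiguous maps, as the abstract suggests, would need a dedicated solver that scikit-learn does not provide.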