ABSTRACT
Many clinical and research studies of the human brain require accurate structural MRI segmentation. While traditional atlas-based methods can be applied to volumes from any acquisition site, recent deep learning algorithms ensure high accuracy only when tested on data from the same sites used in training (i.e., internal data). The performance degradation observed on external data (i.e., unseen volumes from unseen sites) is due to the inter-site variability in intensity distributions, and to unique artefacts caused by different MR scanner models and acquisition parameters. To mitigate this site dependency, often referred to as the scanner effect, we propose LOD-Brain, a 3D convolutional neural network with progressive levels-of-detail (LOD) able to segment brain data from any site. Coarser network levels are responsible for learning a robust anatomical prior useful for identifying brain structures and their locations, while finer levels refine the model to handle site-specific intensity distributions and anatomical variations. We ensure robustness across sites by training the model on an unprecedentedly rich dataset aggregating data from open repositories: almost 27,000 T1w volumes from around 160 acquisition sites, at 1.5–3T, from a population spanning 8 to 90 years of age. Extensive tests demonstrate that LOD-Brain produces state-of-the-art results, with no significant difference in performance between internal and external sites, and remains robust to challenging anatomical variations. Its portability paves the way for large-scale applications across different healthcare institutions, patient populations, and imaging technology manufacturers. Code, model, and demo are available on the project website.
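As a purely illustrative sketch of the coarse-to-fine idea described above (this is not the LOD-Brain implementation: the two-level structure, layer widths, and all names are assumptions), a minimal PyTorch version of a progressive level-of-detail 3D network might look as follows:

```python
# Minimal sketch of a coarse-to-fine, level-of-detail 3D CNN.
# NOT the LOD-Brain implementation: sizes, depths, and names are
# illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch):
    """Two 3x3x3 convolutions with ReLU, preserving spatial size."""
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )

class TwoLevelLODNet(nn.Module):
    """Coarse level learns a low-resolution anatomical prior;
    the fine level refines it at full resolution."""
    def __init__(self, n_classes=7):
        super().__init__()
        self.coarse = conv_block(1, 16)            # sees a downsampled volume
        self.coarse_head = nn.Conv3d(16, n_classes, 1)
        self.fine = conv_block(1 + n_classes, 16)  # full-res input + upsampled prior
        self.fine_head = nn.Conv3d(16, n_classes, 1)

    def forward(self, x):
        # Coarse pass on a half-resolution copy of the input volume.
        x_low = F.interpolate(x, scale_factor=0.5, mode="trilinear",
                              align_corners=False)
        prior = self.coarse_head(self.coarse(x_low))
        # Upsample the coarse prediction back to full resolution and let
        # the fine level refine it together with the raw intensities.
        prior_up = F.interpolate(prior, size=x.shape[2:], mode="trilinear",
                                 align_corners=False)
        out = self.fine_head(self.fine(torch.cat([x, prior_up], dim=1)))
        return out, prior  # fine logits + coarse prior (both can be supervised)

# Example: segment a toy 64^3 T1w volume into 7 classes.
if __name__ == "__main__":
    model = TwoLevelLODNet(n_classes=7)
    t1w = torch.randn(1, 1, 64, 64, 64)   # (batch, channel, D, H, W)
    logits, _ = model(t1w)
    labels = logits.argmax(dim=1)         # per-voxel class map
    print(labels.shape)                   # torch.Size([1, 64, 64, 64])
```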
Subjects
Magnetic Resonance Imaging, Neuroimaging, Humans, Child, Adolescent, Young Adult, Adult, Middle Aged, Aged, Aged 80 and over, Brain/diagnostic imaging, Algorithms, Artifacts

ABSTRACT
Ultra-high-field magnetic resonance imaging (MRI) enables sub-millimetre resolution imaging of the human brain, allowing the study of functional circuits of cortical layers at the meso-scale. An essential step in many functional and structural neuroimaging studies is segmentation, the operation of partitioning MR images into anatomical structures. Despite recent efforts in brain imaging analysis, the literature lacks accurate and fast methods for segmenting 7-tesla (7T) brain MRI. We here present CEREBRUM-7T, an optimised end-to-end convolutional neural network that performs fully automatic segmentation of a whole 7T T1w MRI brain volume at once, without partitioning it, pre-processing it, or aligning it to an atlas. The trained model produces accurate multi-structure segmentation masks over six classes plus background in only a few seconds. The experimental part, a combination of objective numerical evaluations and subjective analysis, confirms that the proposed solution outperforms the labels it was trained on and is suitable for neuroimaging studies, such as layer fMRI studies. Taking advantage of a fine-tuning operation on a reduced set of volumes, we also show how CEREBRUM-7T can be effectively applied to data from different sites. Furthermore, we release the code, the 7T data, and other materials, including the training labels and the Turing test.
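To make the "whole volume at once" inference pipeline concrete, here is a minimal sketch; `load_cerebrum7t` is a hypothetical stand-in for the released model, and the intensity normalisation shown is an assumption, not the paper's exact pre-processing:

```python
# Sketch of whole-volume, single-pass inference as described in the
# abstract: no patching, no atlas registration, no heavy pre-processing.
# `load_cerebrum7t()` is a hypothetical loader standing in for the
# released model; the intensity scaling here is an assumption.
import nibabel as nib
import numpy as np
import torch

def segment_whole_volume(model, t1w_path):
    vol = nib.load(t1w_path).get_fdata().astype(np.float32)
    vol /= vol.max()                       # simple intensity normalisation (assumed)
    x = torch.from_numpy(vol)[None, None]  # (1, 1, D, H, W): one whole volume
    with torch.no_grad():
        logits = model(x)                  # single forward pass on the full volume
    # Six structures + background -> per-voxel label in {0..6}.
    return logits.argmax(dim=1).squeeze(0).numpy()

# Usage (hypothetical): mask = segment_whole_volume(load_cerebrum7t(),
#                                                   "sub-01_T1w.nii.gz")
```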
Subjects
Brain/anatomy & histology, Brain/diagnostic imaging, Image Processing, Computer-Assisted/methods, Magnetic Resonance Imaging/methods, Neural Networks, Computer, Neuroimaging/methods, Humans

ABSTRACT
The promise of artificial intelligence for understanding biological vision relies on comparing computational models with brain data, with the goal of capturing functional principles of visual information processing. Convolutional neural networks (CNNs) have successfully matched the transformations of the hierarchical processing occurring along the brain's feedforward visual pathway, extending into ventral temporal cortex. However, it remains to be seen whether CNNs can successfully describe feedback processes in early visual cortex. Here, we investigated similarities between human early visual cortex and a CNN with an encoder/decoder architecture, trained with self-supervised learning to fill occlusions and reconstruct an unseen image. Using representational similarity analysis (RSA), we compared 3T functional magnetic resonance imaging (fMRI) data from a non-stimulated patch of early visual cortex in human participants viewing partially occluded images with the activations of the different CNN layers for the same images. Results show that our self-supervised image-completion network outperforms a classical supervised object-recognition network (VGG16) in terms of similarity to the fMRI data. This work provides additional evidence that optimal models of the visual system might come from less feedforward architectures trained with less supervision. We also find that activations in the CNN decoder pathway are more similar to brain processing than encoder activations, suggesting an integration of mid- and low/middle-level features in early visual cortex. Challenging an artificial intelligence model to learn natural image representations via self-supervised learning, and comparing them with brain data, can help constrain our understanding of information processing, such as neuronal predictive coding.
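A minimal sketch of the RSA comparison described above, using synthetic arrays in place of the real fMRI patterns and CNN activations (all shapes and names are illustrative):

```python
# Minimal representational similarity analysis (RSA) sketch: build a
# representational dissimilarity matrix (RDM) for the fMRI patterns and
# for each CNN layer, then compare RDMs with Spearman correlation.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(patterns):
    """patterns: (n_images, n_features) -> condensed RDM (1 - Pearson r)."""
    return pdist(patterns, metric="correlation")

rng = np.random.default_rng(0)
n_images = 24
fmri = rng.standard_normal((n_images, 500))   # voxels in the non-stimulated patch
layers = {f"layer{i}": rng.standard_normal((n_images, 1024)) for i in range(1, 4)}

fmri_rdm = rdm(fmri)
for name, acts in layers.items():
    rho, p = spearmanr(fmri_rdm, rdm(acts))   # RDM-to-RDM similarity
    print(f"{name}: rho={rho:.3f} (p={p:.3g})")
```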
Subjects
Magnetic Resonance Imaging, Visual Cortex, Artificial Intelligence, Humans, Neural Networks, Computer, Visual Cortex/diagnostic imaging, Visual Perception

ABSTRACT
Many functional and structural neuroimaging studies call for accurate morphometric segmentation of different brain structures, starting from the image intensity values of MRI scans. Current automatic (multi-)atlas-based segmentation strategies often lack accuracy on difficult-to-segment brain structures and, since these methods rely on atlas-to-scan alignment, may require long processing times. Alternatively, recent methods deploying solutions based on Convolutional Neural Networks (CNNs) are enabling the direct analysis of out-of-the-scanner data. However, current CNN-based solutions partition the test volume into 2D or 3D patches, which are processed independently. This entails a loss of global contextual information, thereby negatively impacting segmentation accuracy. In this work, we design and test an optimised end-to-end CNN architecture that makes the exploitation of global spatial information computationally tractable, allowing a whole MRI volume to be processed at once. We adopt a weakly supervised learning strategy, exploiting a large dataset of 947 out-of-the-scanner MR images (3 T, T1-weighted, 1 mm isotropic, MP-RAGE 3D sequences). The resulting model produces accurate multi-structure segmentation results in only a few seconds. Different quantitative measures demonstrate an improved accuracy of our solution when compared to state-of-the-art techniques. Moreover, through a randomised survey involving expert neuroscientists, we show that subjective judgements favour our solution over widely adopted atlas-based software.
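The abstract does not list its quantitative measures, but a standard choice for evaluating multi-structure segmentation is the per-class Dice coefficient; a minimal NumPy sketch (shapes and the toy corruption are illustrative):

```python
# Per-class Dice coefficient between a predicted and a reference label
# volume: 2|P & T| / (|P| + |T|) for each structure, skipping background.
import numpy as np

def dice_per_class(pred, truth, n_classes):
    """pred, truth: integer label volumes of identical shape."""
    scores = {}
    for c in range(1, n_classes):          # class 0 = background, skipped
        p, t = pred == c, truth == c
        denom = p.sum() + t.sum()
        scores[c] = 2.0 * np.logical_and(p, t).sum() / denom if denom else np.nan
    return scores

# Toy example on a random 32^3 label volume with 7 classes.
rng = np.random.default_rng(1)
truth = rng.integers(0, 7, size=(32, 32, 32))
pred = truth.copy()
pred[:4] = 0                               # corrupt a slab to lower the scores
print(dice_per_class(pred, truth, n_classes=7))
```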
Subjects
Brain, Cerebrum, Image Processing, Computer-Assisted, Magnetic Resonance Imaging, Brain/diagnostic imaging, Humans, Neural Networks, Computer

ABSTRACT
BACKGROUND: Deep neural networks have revolutionised machine learning, with unparalleled performance in object classification. However, in brain imaging (e.g., fMRI), the direct application of Convolutional Neural Networks (CNNs) to decoding subject states or perception from imaging data seems impractical, given the scarcity of available data. NEW METHOD: In this work, we propose a robust method to transfer information from deep learning (DL) features to brain fMRI data with the goal of decoding. By adopting Reduced Rank Regression with Ridge Regularisation, we establish a multivariate link between imaging data and the fully connected layer (fc7) of a CNN. We exploit the reconstructed fc7 features by performing an object image classification task on two datasets: one of the largest fMRI databases, acquired on different scanners from more than two hundred subjects watching different movie clips, and another with fMRI data acquired while watching static images. RESULTS: The fc7 features could be significantly reconstructed from the imaging data and led to significant decoding performance. COMPARISON WITH EXISTING METHODS: The decoding based on reconstructed fc7 features outperformed the decoding based on imaging data alone. CONCLUSION: In this work, we show how fMRI-based decoding can be improved by exploiting the mapping between functional data and CNN features. The potential advantage of the proposed method is twofold: the extraction of stimulus representations by means of an automatic (unsupervised) procedure, and the embedding of high-dimensional neuroimaging data into a space designed for visual object discrimination, yielding a space of more manageable dimensionality.
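A minimal sketch of reduced rank regression with ridge regularisation, under one common formulation (fit ridge coefficients, then project them onto the top singular directions of the fitted values); the hyperparameters and toy data below are assumptions, not the paper's:

```python
# Reduced rank regression with ridge regularisation, mapping fMRI
# responses X to CNN fc7 features Y. alpha and rank are illustrative.
import numpy as np
from sklearn.linear_model import Ridge

def reduced_rank_ridge(X, Y, alpha=10.0, rank=20):
    B = Ridge(alpha=alpha, fit_intercept=False).fit(X, Y).coef_.T  # (p, q)
    _, _, Vt = np.linalg.svd(X @ B, full_matrices=False)  # response-space SVD
    V_r = Vt[:rank].T                                     # top-r directions, (q, r)
    return (B @ V_r) @ V_r.T                              # rank-r coefficients

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 300))              # 200 samples of fMRI data
Y = X @ rng.standard_normal((300, 4096)) * 0.1   # toy stand-in for 4096-d fc7
fc7_hat = X @ reduced_rank_ridge(X, Y)           # reconstructed fc7 features
print(fc7_hat.shape)                             # (200, 4096)
```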
Subjects
Brain Mapping/methods, Brain/physiology, Deep Learning, Image Processing, Computer-Assisted/methods, Magnetic Resonance Imaging/methods, Transfer (Psychology), Visual Perception/physiology, Adult, Brain/diagnostic imaging, Humans

ABSTRACT
Major methodological advancements have recently been made in the field of neural decoding, which is concerned with the reconstruction of mental content from neuroimaging measures. However, in the absence of a large-scale examination of the validity of decoding models across subjects and content, the extent to which these models can be generalized is not clear. This study addresses the challenge of producing generalizable decoding models, which allow the reconstruction of perceived audiovisual features from human functional magnetic resonance imaging (fMRI) data without prior training of the algorithm on the decoded content. We applied an adapted version of kernel ridge regression combined with temporal optimization to data acquired during film viewing (234 runs) to generate standardized brain models for sound loudness, speech presence, perceived motion, face-to-frame ratio, lightness, and color brightness. The prediction accuracies were tested on data collected from different subjects watching other movies, mainly in another scanner. Substantial and significant (q(FDR) < 0.05) correlations between the reconstructed and the original descriptors were found for the first three features (loudness, speech, and motion) in all of the 9 test movies (mean R = 0.62, 0.60, and 0.60, respectively), with high reproducibility of the predictors across subjects. The face ratio model produced significant correlations in 7 out of 8 movies (mean R = 0.56). The lightness and brightness models did not show robustness (mean R = 0.23 and 0, respectively). Further analysis of additional data (95 runs) indicated that the veridicality of loudness reconstruction can consistently reveal relevant group differences in musical experience. The findings point to the validity and generalizability of our loudness, speech, motion, and face ratio models for complex cinematic stimuli (as well as for music, in the case of loudness). While future research should further validate these models using controlled stimuli and explore the feasibility of extracting more complex models via this method, the reliability of our results indicates the potential usefulness of the approach and the resulting models in basic scientific and diagnostic contexts.
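A minimal sketch of the train-on-some-runs, test-on-unseen-movies logic with kernel ridge regression and FDR correction, on fully synthetic data (the temporal optimization step is omitted, and all names and parameters are assumptions):

```python
# Generalisation test sketch: fit kernel ridge regression on one set of
# film-viewing runs, predict a stimulus feature (e.g., loudness) for
# held-out "movies", and FDR-correct the prediction/truth correlations.
import numpy as np
from scipy.stats import pearsonr
from sklearn.kernel_ridge import KernelRidge
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
W = rng.standard_normal(100)              # hidden brain-to-feature mapping

def make_run(n_trs=200, noise=1.0):
    fmri = rng.standard_normal((n_trs, 100))        # voxel time courses
    loudness = fmri @ W + noise * rng.standard_normal(n_trs)
    return fmri, loudness

X_train, y_train = make_run()
model = KernelRidge(kernel="linear", alpha=1.0).fit(X_train, y_train)

# Test on several unseen "movies"; collect one correlation per movie.
rs, ps = [], []
for _ in range(9):
    X_test, y_test = make_run()
    r, p = pearsonr(model.predict(X_test), y_test)
    rs.append(r)
    ps.append(p)

reject, _, _, _ = multipletests(ps, alpha=0.05, method="fdr_bh")
print(f"mean R = {np.mean(rs):.2f}; significant movies: {reject.sum()}/9")
```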