Results 1 - 9 of 9
1.
Hum Brain Mapp ; 45(11): e26795, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39045881

ABSTRACT

The architecture of the brain is too complex to be intuitively surveyable without the use of compressed representations that project its variation into a compact, navigable space. The task is especially challenging with high-dimensional data, such as gene expression, where the joint complexity of anatomical and transcriptional patterns demands maximum compression. The established practice is to use standard principal component analysis (PCA), whose computational felicity is offset by limited expressivity, especially at high compression ratios. Employing whole-brain, voxel-wise Allen Brain Atlas transcription data, here we systematically compare compressed representations based on the most widely supported linear and non-linear methods: PCA, kernel PCA, non-negative matrix factorisation (NMF), t-distributed stochastic neighbour embedding (t-SNE), uniform manifold approximation and projection (UMAP), and deep auto-encoding. We quantify reconstruction fidelity, anatomical coherence, and predictive utility across signalling, microstructural, and metabolic targets drawn from large-scale open-source MRI and PET data. We show that deep auto-encoders yield superior representations across all metrics of performance and target domains, supporting their use as the reference standard for representing transcription patterns in the human brain.
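
As a rough illustration of the "reconstruction fidelity" criterion named in the abstract, the sketch below compresses a toy data matrix to a single principal component and reports the fraction of variance that the reconstruction retains. It is a minimal, standard-library-only sketch: the function names, the power-iteration solver, and the toy data are assumptions made for illustration, and it is not the authors' pipeline or their fidelity metric.

```python
import math


def top_component(rows, iters=200):
    """Return (centred rows, leading principal axis) via power iteration.

    Assumes the data are not constant and that the uniform starting vector
    is not orthogonal to the leading axis (true for the toy data below).
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - mean[j] for j in range(d)] for r in rows]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        # w = X^T (X v): the unnormalised covariance matrix applied to v
        scores = [sum(c[j] * v[j] for j in range(d)) for c in centred]
        w = [sum(scores[i] * centred[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return centred, v


def rank1_variance_retained(rows):
    """Fraction of total variance kept after projecting onto one component."""
    centred, v = top_component(rows)
    total = sum(x * x for c in centred for x in c)
    residual = 0.0
    for c in centred:
        score = sum(cj * vj for cj, vj in zip(c, v))
        residual += sum((cj - score * vj) ** 2 for cj, vj in zip(c, v))
    return 1.0 - residual / total


# Hypothetical toy data: four samples, two strongly correlated features.
noisy = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8]]
print(f"variance retained by 1 component: {rank1_variance_retained(noisy):.4f}")
```

The same retained-variance figure could in principle be computed for any of the compared methods that provide a decoder, which is one way such representations can be put on a common footing.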


Subjects
Brain , Magnetic Resonance Imaging , Genetic Transcription , Humans , Brain/diagnostic imaging , Brain/metabolism , Genetic Transcription/physiology , Positron-Emission Tomography , Computer-Assisted Image Processing/methods , Principal Component Analysis , Data Compression/methods , Atlases as Topic
2.
Diabetologia ; 66(3): 495-507, 2023 03.
Article in English | MEDLINE | ID: mdl-36538063

ABSTRACT

AIMS/HYPOTHESIS: Type 2 diabetes is highly polygenic and influenced by multiple biological pathways. Rapid expansion in the number of type 2 diabetes loci can be leveraged to identify such pathways.
METHODS: We developed a high-throughput pipeline to enable clustering of type 2 diabetes loci based on variant-trait associations. Our pipeline extracted summary statistics from genome-wide association studies (GWAS) for type 2 diabetes and related traits to generate a matrix of 323 variants × 64 trait associations and applied Bayesian non-negative matrix factorisation (bNMF) to identify genetic components of type 2 diabetes. Epigenomic enrichment analysis was performed in 28 cell types and single pancreatic cells. We generated cluster-specific polygenic scores and performed regression analysis in an independent cohort (N=25,419) to assess clinical relevance.
RESULTS: We identified ten clusters of genetic loci, recapturing the five from our prior analysis as well as novel clusters related to beta cell dysfunction, pronounced insulin secretion, and levels of alkaline phosphatase, lipoprotein A and sex hormone-binding globulin. Four clusters related to mechanisms of insulin deficiency and five to insulin resistance, and one had an unclear mechanism. The clusters displayed tissue-specific epigenomic enrichment, notably with the two beta cell clusters differentially enriched in functional and stressed pancreatic beta cell states. Additionally, cluster-specific polygenic scores were differentially associated with patient clinical characteristics and outcomes. The pipeline was applied to coronary artery disease and chronic kidney disease, identifying multiple overlapping clusters with type 2 diabetes.
CONCLUSIONS/INTERPRETATION: Our approach stratifies type 2 diabetes loci into physiologically interpretable genetic clusters associated with distinct tissues and clinical outcomes. The pipeline allows for efficient updating as additional GWAS become available and can be readily applied to other conditions, facilitating clinical translation of GWAS findings. Software to perform this clustering pipeline is freely available.


Subjects
Type 2 Diabetes Mellitus , Humans , Type 2 Diabetes Mellitus/genetics , Genome-Wide Association Study , Genetic Predisposition to Disease/genetics , Bayes Theorem , Cluster Analysis , Single Nucleotide Polymorphism
3.
J Biomed Inform ; 112: 103606, 2020 12.
Article in English | MEDLINE | ID: mdl-33127447

ABSTRACT

Multimorbidity, or the presence of several medical conditions in the same individual, has been increasing in the population, both in absolute and relative terms. Nevertheless, multimorbidity remains poorly understood, and the evidence from existing research to describe its burden, determinants and consequences has been limited. Previous studies attempting to understand multimorbidity patterns are often cross-sectional and do not explicitly account for the evolution of multimorbidity patterns over time; some of them are based on small datasets and/or use arbitrary and narrow age ranges; and those that employed advanced models usually lack appropriate benchmarking and validation. In this study, we (1) introduce a novel approach for using Non-negative Matrix Factorisation (NMF) for temporal phenotyping (i.e., simultaneously mining disease clusters and their trajectories); (2) provide quantitative metrics for the evaluation of these clusters and trajectories; and (3) demonstrate how the temporal characteristics of the disease clusters that result from our model can help mine multimorbidity networks and generate new hypotheses for the emergence of various multimorbidity patterns over time. We trained and evaluated our models on one of the world's largest electronic health records (EHR) datasets, containing more than 7 million patients, of which over 2 million were relevant to, and hence included in, this study.


Subjects
Electronic Health Records , Multimorbidity , Algorithms , Cross-Sectional Studies , Humans
4.
Int J Mol Sci ; 20(22), 2019 Nov 16.
Article in English | MEDLINE | ID: mdl-31744086

ABSTRACT

Using pan-cancer data from The Cancer Genome Atlas (TCGA), we investigated how patterns in copy number alterations in cancer cells vary both by tissue type and as a function of genetic alteration. We find that patterns in both chromosomal ploidy and individual arm copy number are dependent on tumour type. We highlight, for example, the significant losses in chromosome arm 3p and the gain of ploidy in 5q in kidney renal clear cell carcinoma tissue samples. We find that specific gene mutations are associated with genome-wide copy number changes. Using signatures derived from non-negative matrix factorisation, we also find gene mutations that are associated with particular patterns of ploidy change. Finally, utilising a set of machine learning classifiers, we successfully predicted the presence of mutated genes in a sample using arm-wise copy number patterns as features. This demonstrates that mutations in specific genes are correlated with, and may lead to, specific patterns of ploidy loss and gain across chromosome arms. Using these same classifiers, we highlight which arms are most predictive of commonly mutated genes in kidney renal clear cell carcinoma (KIRC).


Subjects
Renal Cell Carcinoma/pathology , DNA Copy Number Variations/genetics , Kidney Neoplasms/pathology , Area Under Curve , Renal Cell Carcinoma/genetics , Chromosomes/genetics , Humans , Kidney Neoplasms/genetics , Machine Learning , Mutation , Ploidies , ROC Curve , Tumor Suppressor Protein p53/genetics , Von Hippel-Lindau Tumor Suppressor Protein/genetics
5.
Br J Nutr ; 116(2): 300-15, 2016 07.
Article in English | MEDLINE | ID: mdl-27189191

ABSTRACT

Identification and characterisation of dietary patterns are needed to define public health policies that promote better food behaviours. The aim of this study was to identify the major dietary patterns in the French adult population and to determine their main demographic, socio-economic, nutritional and environmental characteristics. Dietary patterns were defined from food consumption data collected in the second French national cross-sectional dietary survey (2006-2007). A non-negative matrix factorisation method, followed by a cluster analysis, was used to derive the dietary patterns. Logistic regressions were then used to determine their main demographic and socio-economic characteristics. Finally, the nutritional profiles and contaminant exposure levels of the dietary patterns were compared using ANOVA. Seven dietary patterns, with specific food consumption behaviours, were identified: 'Small eater', 'Health conscious', 'Mediterranean', 'Sweet and processed', 'Traditional', 'Snacker' and 'Basic consumer'. For instance, the 'Health conscious' pattern was characterised by a high consumption of low-fat and light products. Individuals belonging to this pattern were likely to be older and to have a better nutritional profile than the overall population, but were more exposed to many contaminants. Conversely, individuals in the 'Snacker' pattern were likely to be younger, consumed more highly processed foods and had a nutrient-poor profile, but were exposed to a limited number of food contaminants. The study identified the main dietary patterns in the French adult population, with distinct food behaviours and specific demographic, socio-economic, nutritional and environmental features. Paradoxically, potential health risks cannot be ruled out for the better dietary patterns. This study therefore demonstrated the need to conduct a risk-benefit analysis to define efficient public health policies regarding diet.


Subjects
Diet , Feeding Behavior , Adult , Cross-Sectional Studies , Diet/classification , Diet Surveys , Female , France , Humans , Male , Socioeconomic Factors
6.
Front Neurol ; 14: 1027160, 2023.
Article in English | MEDLINE | ID: mdl-37064187

ABSTRACT

Background: There is growing interest in the topography of brain regions associated with disorders of consciousness. This has increased research output, yielding many publications that investigate the topic with varying methodologies. The objective of this study was to ascertain the topographical regions of the brain most frequently associated with disorders of consciousness.
Methods: We performed a cross-sectional text-mining analysis of disorders of consciousness studies. A text-mining algorithm built in the Python programming language searched documents for anatomical brain terminology. We reviewed primary PubMed studies published between 1 January 2000 and 8 February 2023 for the search query "Disorders of Consciousness." The frequency of brain regions mentioned in these articles was recorded, ranked, then built into a graphical network. Subgroup analysis was performed by evaluating the impact on our results of basing the analyses on abstracts, full texts, or topic-modeled groups (non-negative matrix factorization was used to create subgroups of each collection based on their key topics). Brain terms were ranked by their frequency, and concordance was measured between subgroups. Graphical analysis was performed to explore relationships between the anatomical regions mentioned. The PageRank algorithm (used by Google to list search results in order of relevance) was used to determine the global importance of the regions.
Results: The PubMed search yielded 24,944 abstracts and 3,780 full texts. The topic-modeled subgroups contained 2,015 abstracts and 283 full texts. Text mining across all document groups concordantly ranked the thalamus the highest (Savage score = 11.716), followed by the precuneus (Savage score = 4.983) and the hippocampus (Savage score = 4.483). Graphical analysis identified 5 clusters, with the thalamus once again having the highest PageRank score (PageRank = 0.0344).
Conclusion: The thalamus, precuneus and cingulate cortex are strongly associated with disorders of consciousness, likely due to the roles they play in maintaining awareness and involvement in the default mode network, respectively. The findings also suggest that other areas of the brain, such as the cerebellum, cuneus, amygdala and hippocampus, share connections to consciousness and should be further investigated.
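
To make the PageRank step in the Methods more concrete, here is a minimal power-iteration sketch on a tiny co-mention graph. The region names come from the abstract, but the edges are hypothetical and invented purely for illustration; the damping factor of 0.85 is the conventional default rather than a value reported by the authors, and the function name is an assumption.

```python
def pagerank(links, damping=0.85, iters=100):
    """Power-iteration PageRank on a dict mapping node -> list of targets."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        # Every node receives the uniform "teleport" share first.
        new = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            targets = links.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling node: spread its rank uniformly over all nodes.
                for t in nodes:
                    new[t] += damping * rank[u] / n
        rank = new
    return rank


# Hypothetical co-mention graph: an edge means two regions appear together.
co_mentions = {
    "thalamus": ["precuneus", "hippocampus", "cingulate cortex"],
    "precuneus": ["thalamus"],
    "hippocampus": ["thalamus"],
    "cingulate cortex": ["thalamus"],
}
scores = pagerank(co_mentions)
for region in sorted(scores, key=scores.get, reverse=True):
    print(region, round(scores[region], 4))
```

In this toy graph the hub node ends up with the largest score because every other node links to it, which mirrors, in spirit only, how a frequently co-mentioned region would rise in such a ranking.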

7.
J Raman Spectrosc ; 54(3): 258-268, 2023 Mar.
Article in English | MEDLINE | ID: mdl-38505661

ABSTRACT

Raman spectroscopy shows promise as a biomarker for complex nerve and muscle (neuromuscular) diseases. To maximise its potential, several challenges must be addressed. These include the sensitivity to different instrument configurations, translation across preclinical/human tissues and the development of multivariate analytics that can derive interpretable spectral outputs for disease identification. Nonnegative matrix factorisation (NMF) can extract features from high-dimensional data sets, and the nonnegative constraint results in physically realistic outputs. In this study, we have undertaken NMF on Raman spectra of muscle obtained from different clinical and preclinical settings. First, we obtained and combined Raman spectra from human patients with mitochondrial disease and healthy volunteers, using both a commercial microscope and an in-house fibre optic probe. NMF was applied across all data, and spectral patterns common to both equipment configurations were identified. Linear discriminant models utilising these patterns were able to accurately classify disease states (accuracy 70.2-84.5%). Next, we applied NMF to spectra obtained from the mdx mouse model of Duchenne muscular dystrophy and patients with dystrophic muscle conditions. Spectral fingerprints common to mouse/human were obtained and were able to accurately identify disease (accuracy 79.5-98.8%). We conclude that NMF can be used to analyse Raman data across different equipment configurations and the preclinical/clinical divide. Thus, the application of NMF decomposition methods could enhance the potential of Raman spectroscopy for the study of fatal neuromuscular diseases.
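
The non-negativity constraint described above can be sketched with the classic multiplicative-update rules for NMF, which factor a non-negative matrix V into non-negative factors W and H while never increasing the squared reconstruction error. This is a minimal, standard-library-only sketch under stated assumptions: the toy "spectra", the function names, the rank `k=2`, and the iteration count are all hypothetical, and the abstract does not say which NMF algorithm or software the authors used.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Sum of squared differences between V and the product W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Multiplicative-update NMF: V (n x m) is approximated by W (n x k) H (k x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h


# Hypothetical toy "spectra": rows 3 and 4 are mixtures of rows 1 and 2.
spectra = [[1, 0, 2, 0], [0, 3, 0, 1], [1, 3, 2, 1], [2, 3, 4, 1]]
weights, patterns = nmf(spectra, k=2)
print("squared reconstruction error:", frobenius_error(spectra, weights, patterns))
```

Because every update multiplies a non-negative entry by a non-negative ratio, the factors stay non-negative throughout, which is why the recovered patterns can be read as additive spectral components rather than signed contrasts.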

8.
Med Eng Phys ; 57: 51-60, 2018 07.
Article in English | MEDLINE | ID: mdl-29703696

ABSTRACT

The muscle synergy concept provides a widely accepted paradigm to break down the complexity of motor control. In order to identify the synergies, different matrix factorisation techniques have been used in a range of fields such as prosthesis control and biomechanical and clinical studies. However, the relevance of these matrix factorisation techniques is still open for discussion since there is no ground truth for the underlying synergies. Here, we evaluate factorisation techniques and investigate the factors that affect the quality of estimated synergies. We compared commonly used matrix factorisation methods: principal component analysis (PCA), independent component analysis (ICA), non-negative matrix factorisation (NMF) and second-order blind identification (SOBI). Publicly available real data were used to assess the synergies extracted by each factorisation method in the classification of wrist movements. Synthetic datasets were utilised to explore the effect of muscle synergy sparsity, level of noise and number of channels on the extracted synergies. Results suggest that the sparse synergy model and a higher number of channels would result in better estimated synergies. Without dimensionality reduction, SOBI showed better results than other factorisation methods. This suggests that SOBI would be an alternative when a limited number of electrodes is available, although its performance was still poor in that case. Otherwise, NMF had the best performance when the number of channels was higher than the number of synergies. Therefore, NMF would be the best method for muscle synergy extraction.


Subjects
Algorithms , Electromyography , Movement , Muscles/physiology , Computer-Assisted Signal Processing , Humans , Biological Models , Principal Component Analysis
9.
Food Chem Toxicol ; 98(Pt B): 179-188, 2016 Dec.
Article in English | MEDLINE | ID: mdl-27984160

ABSTRACT

Through their diet, humans are exposed to a wide range of substances with possible adverse effects. Total diet studies (TDS) assess exposure and risk for many single substances or mixtures from the same chemical family. This research aims to identify, from the 440 substances in the second French TDS, the major mixtures to which the French population is exposed and their associated diets. Firstly, substances with a contamination value over the detection limit were selected. Secondly, consumption systems comprising major consumed foods were identified using non-negative matrix factorisation and combined with concentration levels to form the main mixtures. Thirdly, individuals were clustered to identify "diet clusters" with similar consumption patterns and co-exposure profiles. Six main consumption systems and their associated mixtures were identified. For example, a mixture of ten pesticides, six trace elements and bisphenol A was identified. Exposure to this mixture is related to fruit and vegetables consumed by a diet cluster comprising 62% of women with a mean age of 51 years. Six other clusters are described with their associated diets and mixtures. Cluster co-exposures were compared to those of the whole population. This work helps prioritise mixtures for which it is crucial to investigate possible toxicological effects.


Subjects
Environmental Exposure/analysis , Adult , Diet Surveys , Mediterranean Diet , Feeding Behavior , Female , Food Contamination/analysis , France , Fruit/chemistry , Humans , Male , Metals/analysis , Middle Aged , Pesticide Residues/analysis , Risk Assessment , Vegetables