1.
Entropy (Basel) ; 23(1)2021 Jan 07.
Article in English | MEDLINE | ID: mdl-33430463

ABSTRACT

If regularity in data takes the form of higher-order functions among groups of variables, models which are biased towards lower-order functions may easily mistake the data for noise. To distinguish whether this is the case, one must be able to quantify the contribution of different orders of dependence to the total information. Recent work in information theory attempts to do this through measures of multivariate mutual information (MMI) and information decomposition (ID). Despite substantial theoretical progress, practical issues related to tractability and learnability of higher-order functions are still largely unaddressed. In this work, we introduce a new approach to information decomposition, termed Neural Information Decomposition (NID), which is both theoretically grounded and can be efficiently estimated in practice using neural networks. We show on synthetic data that NID can learn to distinguish higher-order functions from noise, while many unsupervised probability models cannot. Additionally, we demonstrate the usefulness of this framework as a tool for exploring biological and artificial neural networks.
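
As an illustration of why higher-order dependence can look like noise, the sketch below uses the XOR function, a standard textbook case that is not taken from the paper. Each input alone carries no information about the output, while the two inputs together determine it. The helper names `entropy` and `mutual_information` are hypothetical, and this is plain plug-in estimation on an exhaustive truth table, not the NID estimator.

```python
import math
from collections import Counter
from itertools import product


def entropy(samples):
    """Shannon entropy (bits) of the empirical distribution of `samples`."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Truth table of z = x XOR y with all four input pairs equally likely.
rows = [(x, y, x ^ y) for x, y in product([0, 1], repeat=2)]
xs = [r[0] for r in rows]
ys = [r[1] for r in rows]
zs = [r[2] for r in rows]

pairwise_xz = mutual_information(xs, zs)                # each input alone
pairwise_yz = mutual_information(ys, zs)
joint_xy_z = mutual_information(list(zip(xs, ys)), zs)  # both inputs together

print(pairwise_xz, pairwise_yz, joint_xy_z)
```

A model that only inspects pairwise statistics would see zero dependence here, even though the pair (x, y) carries one full bit about z.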

2.
Magn Reson Med ; 84(4): 2174-2189, 2020 10.
Article in English | MEDLINE | ID: mdl-32250475

ABSTRACT

PURPOSE: In the present work, we describe the correction of diffusion-weighted MRI for site and scanner biases using a novel method based on invariant representation. THEORY AND METHODS: Pooled imaging data from multiple sources are subject to variation between the sources. Correcting for these biases has become very important as imaging studies increase in size and multi-site cases become more common. We propose learning an intermediate representation invariant to site/protocol variables, a technique adapted from information theory-based algorithmic fairness; by leveraging the data processing inequality, such a representation can then be used to create an image reconstruction that is uninformative of its original source, yet still faithful to underlying structures. To implement this, we use a deep learning method based on variational auto-encoders (VAE) to construct scanner invariant encodings of the imaging data. RESULTS: To evaluate our method, we use training data from the 2018 MICCAI Computational Diffusion MRI (CDMRI) Challenge Harmonization dataset. Our proposed method shows improvements on independent test data relative to a recently published baseline method on each subtask, mapping data from three different scanning contexts to and from one separate target scanning context. CONCLUSIONS: As imaging studies continue to grow, the use of pooled multi-site imaging will similarly increase. Invariant representation presents a strong candidate for the harmonization of these data.
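
The role of the data processing inequality can be written out in one line. The notation is assumed here and does not come from the abstract: $S$ is the site/scanner variable, $Z$ the learned encoding, and $\hat{X}$ a reconstruction computed from $Z$ for a fixed target context. This is a sketch of the general argument, not the authors' exact formulation.

```latex
% Assumed Markov chain: the reconstruction depends on the source site only through Z.
S \;\to\; Z \;\to\; \hat{X}
\quad\Longrightarrow\quad
I(\hat{X}; S) \;\le\; I(Z; S)

% Hence an encoding that is (approximately) invariant to site,
% I(Z; S) \approx 0, forces I(\hat{X}; S) \approx 0 as well.
```

In words, no processing of $Z$ can restore site information that $Z$ itself does not contain.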


Subjects
Brain , Diffusion Magnetic Resonance Imaging , Brain/diagnostic imaging , Image Processing, Computer-Assisted
3.
Biomolecules ; 14(3)2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38540759

ABSTRACT

Recent advancements in AI-driven technologies, particularly in protein structure prediction, are significantly reshaping the landscape of drug discovery and development. This review focuses on the question of how these technological breakthroughs, exemplified by AlphaFold2, are revolutionizing our understanding of protein structure and function changes underlying cancer and improving our approaches to counter them. By enhancing the precision and speed at which drug targets are identified and drug candidates can be designed and optimized, these technologies are streamlining the entire drug development process. We explore the use of AlphaFold2 in cancer drug development, scrutinizing its efficacy, limitations, and potential challenges. We also compare AlphaFold2 with other algorithms like ESMFold, explaining the diverse methodologies employed in this field and the practical effects of these differences for the application of specific algorithms. Additionally, we discuss the broader applications of these technologies, including the prediction of protein complex structures and the generative AI-driven design of novel proteins.


Subjects
Antineoplastic Agents , Neoplasms , Humans , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Neoplasms/drug therapy , Drug Discovery , Drug Development , Artificial Intelligence
4.
Commun Biol ; 7(1): 400, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38565955

ABSTRACT

Unlocking the full dimensionality of single-cell RNA sequencing data (scRNAseq) is the next frontier to a richer, fuller understanding of cell biology. We introduce q-diffusion, a framework for capturing the coexpression structure of an entire library of genes, improving on state-of-the-art analysis tools. The method is demonstrated via three case studies. In the first, q-diffusion helps gain statistical significance for differential effects on patient outcomes when analyzing the CALGB/SWOG 80405 randomized phase III clinical trial, suggesting precision guidance for the treatment of metastatic colorectal cancer. Secondly, q-diffusion is benchmarked against existing scRNAseq classification methods using an in vitro PBMC dataset, in which the proposed method discriminates IFN-γ stimulation more accurately. The same case study demonstrates improvements in unsupervised cell clustering with the recent Tabula Sapiens human atlas. Finally, a local distributional segmentation approach for spatial scRNAseq, driven by q-diffusion, yields interpretable structures of human cortical tissue.


Subjects
Leukocytes, Mononuclear , Single-Cell Analysis , Humans , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Cluster Analysis
5.
Front Neurosci ; 18: 1387196, 2024.
Article in English | MEDLINE | ID: mdl-39015378

ABSTRACT

Abnormal β-amyloid (Aβ) accumulation in the brain is an early indicator of Alzheimer's disease (AD) and is typically assessed through invasive procedures such as PET (positron emission tomography) or CSF (cerebrospinal fluid) assays. As new anti-Alzheimer's treatments can now successfully target amyloid pathology, there is a growing interest in predicting Aβ positivity (Aβ+) from less invasive, more widely available types of brain scans, such as T1-weighted (T1w) MRI. Here we compare multiple approaches to infer Aβ+ from standard anatomical MRI: (1) classical machine learning algorithms, including logistic regression, XGBoost, and shallow artificial neural networks, (2) deep learning models based on 2D and 3D convolutional neural networks (CNNs), (3) a hybrid ANN-CNN, combining the strengths of shallow and deep neural networks, (4) transfer learning models based on CNNs, and (5) 3D Vision Transformers. All models were trained on paired MRI/PET data from 1,847 elderly participants (mean age: 75.1 years ± 7.6 SD; 863 females/984 males; 661 healthy controls, 889 with mild cognitive impairment (MCI), and 297 with dementia), scanned as part of the Alzheimer's Disease Neuroimaging Initiative. We evaluated each model's balanced accuracy and F1 scores. While further tests on more diverse data are warranted, deep learning models trained on standard MRI showed promise for estimating Aβ+ status, at least in people with MCI. This may offer a potential screening option before resorting to more invasive procedures.
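
For readers unfamiliar with the two evaluation metrics named above, the sketch below computes them from binary confusion-matrix counts using their standard definitions. The counts are invented for illustration and are not results from the study, and the function names are hypothetical.

```python
def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity (true positive rate) and specificity (true negative rate)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up counts for a hypothetical amyloid-positive vs. amyloid-negative classifier.
tp, fn = 30, 10   # positives: correctly flagged / missed
tn, fp = 50, 10   # negatives: correctly cleared / falsely flagged

ba = balanced_accuracy(tp, fp, tn, fn)
f1 = f1_score(tp, fp, fn)
print(f"balanced accuracy = {ba:.3f}, F1 = {f1:.3f}")
```

Balanced accuracy weights both classes equally regardless of their sizes, which matters when positive and negative cases are unevenly represented.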

6.
J Vis Exp ; (152)2019 10 11.
Article in English | MEDLINE | ID: mdl-31657800

ABSTRACT

Differential gene expression analysis is an important technique for understanding disease states. The machine learning algorithm CorEx has shown utility in analyzing differential expression of groups of genes in tumor RNA-seq in a way that may be helpful for advancing precision oncology. However, CorEx produces many factors that can be challenging to analyze and connect to existing understanding. To facilitate such connections, we have built a website, CorExplorer, that allows users to interactively explore the data and answer common questions related to its analysis. We trained CorEx on RNA-seq gene expression data for four tumor types: ovarian, lung, melanoma, and colorectal. We then incorporated corresponding survival, protein-protein interactions, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichments, and heatmaps into the website for association with the factor graph visualization. Here we employ example protocols to illustrate use of the database for comprehending the significance of the learned tumor factors in the context of this external data.


Subjects
Algorithms , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic/genetics , Machine Learning , Neoplasms/genetics , Patient Portals , Gene Expression Profiling/instrumentation , Genome , Humans , Neoplasms/metabolism , Precision Medicine/instrumentation , Precision Medicine/methods , RNA/biosynthesis , RNA/genetics
7.
Sci Data ; 6(1): 96, 2019 06 17.
Article in English | MEDLINE | ID: mdl-31209213

ABSTRACT

Health care is one of the most exciting frontiers in data mining and machine learning. Successful adoption of electronic health records (EHRs) created an explosion in digital clinical data available for analysis, but progress in machine learning for healthcare research has been difficult to measure because of the absence of publicly available benchmark data sets. To address this problem, we propose four clinical prediction benchmarks using data derived from the publicly available Medical Information Mart for Intensive Care (MIMIC-III) database. These tasks cover a range of clinical problems including modeling risk of mortality, forecasting length of stay, detecting physiologic decline, and phenotype classification. We propose strong linear and neural baselines for all four tasks and evaluate the effect of deep supervision, multitask training and data-specific architectural modifications on the performance of neural models.


Subjects
Benchmarking , Electronic Health Records , Machine Learning , Data Mining , Databases, Factual , Humans
8.
Front Aging Neurosci ; 10: 390, 2018.
Article in English | MEDLINE | ID: mdl-30555318

ABSTRACT

Brain aging is a multifaceted process that remains poorly understood. Despite significant advances in technology, progress toward identifying reliable risk factors for suboptimal brain health requires realistically complex analytic methods to explain relationships between genetics, biology, and environment. Here we show the utility of a novel unsupervised machine learning technique, Correlation Explanation (CorEx), to discover how individual measures from structural brain imaging, genetics, plasma, and CSF markers can jointly provide information on risk for Alzheimer's disease (AD). We examined 829 participants (mean age: 75.3 ± 6.9 years; 350 women and 479 men) from the Alzheimer's Disease Neuroimaging Initiative database to identify multivariate predictors of cognitive decline and brain atrophy over a 1-year period. Our sample included 231 cognitively normal individuals, 397 with mild cognitive impairment (MCI), and 201 with AD as their baseline diagnosis. Analyses revealed latent factors, based on data-driven combinations of plasma markers and brain metrics, that were aligned with established biological pathways in AD. These factors were able to improve disease prediction along the trajectory from normal cognition and MCI to AD, with an area under the receiver operating characteristic curve of up to 99%, and prediction accuracy of up to 89.9% on independent "held out" testing data. Further, the most important latent factors that predicted AD consisted of a novel set of variables that are essential for cardiovascular, immune, and bioenergetic functions. Collectively, these results demonstrate the strength of unsupervised network measures in the detection and prediction of AD.

9.
BMC Med Genomics ; 10(1): 12, 2017 03 15.
Article in English | MEDLINE | ID: mdl-28292312

ABSTRACT

BACKGROUND: De novo inference of clinically relevant gene function relationships from tumor RNA-seq remains a challenging task. Current methods typically either partition patient samples into a few subtypes or rely upon analysis of pairwise gene correlations that will miss some groups in noisy data. Leveraging higher dimensional information can be expected to increase the power to discern targetable pathways, but this is commonly thought to be an intractable computational problem. METHODS: In this work we adapt a recently developed machine learning algorithm for sensitive detection of complex gene relationships. The algorithm, CorEx, efficiently optimizes over multivariate mutual information and can be iteratively applied to generate a hierarchy of relatively independent latent factors. The learned latent factors are used to stratify patients for survival analysis with respect to both single factors and combinations. These analyses are performed and interpreted in the context of biological function annotations and protein network interactions that might be utilized to match patients to multiple therapies. RESULTS: Analysis of ovarian tumor RNA-seq samples demonstrates the algorithm's power to infer well over one hundred biologically interpretable gene cohorts, several times more than standard methods such as hierarchical clustering and k-means. The CorEx factor hierarchy is also informative, with related but distinct gene clusters grouped by upper nodes. Some latent factors correlate with patient survival, including one for a pathway connected with the epithelial-mesenchymal transition in breast cancer that is regulated by a microRNA that modulates epigenetics. Further, combinations of factors lead to a synergistic survival advantage in some cases. 
CONCLUSIONS: In contrast to studies that attempt to partition patients into a small number of subtypes (typically 4 or fewer) for treatment purposes, our approach utilizes subgroup information for combinatoric transcriptional phenotyping. Considering only the 66 gene expression groups that are found to both have significant Gene Ontology enrichment and are small enough to indicate specific drug targets implies a computational phenotype for ovarian cancer that allows for 3^66 possible patient profiles, enabling truly personalized treatment. The findings here demonstrate a new technique that sheds light on the complexity of gene expression dependencies in tumors and could eventually enable the use of patient RNA-seq profiles for selection of personalized and effective cancer treatments.
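
The size of the profile space can be checked with a line of arithmetic. The sketch assumes each of the 66 gene expression groups is scored at one of three levels per patient; the three-level reading is an interpretation of the profile count rather than something the abstract states, and the variable names are hypothetical.

```python
# Assumption: 66 gene groups, each taking one of 3 discrete levels per patient.
n_groups = 66
levels_per_group = 3

# Independent choices multiply, so the count is levels ** groups.
n_profiles = levels_per_group ** n_groups

# For contrast: a conventional partition into at most 4 subtypes.
n_subtypes = 4

print(f"{n_profiles:.3e} possible profiles vs. {n_subtypes} subtypes")
```

The count is on the order of 10^31, far more than the number of patients who could ever be profiled, which is the sense in which such a phenotype is effectively individual.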


Subjects
Computational Biology/methods , Gene Expression Profiling , Ovarian Neoplasms/genetics , Ovarian Neoplasms/therapy , Cluster Analysis , Epithelial-Mesenchymal Transition/genetics , Female , Humans , Molecular Sequence Annotation , Neoplasm Metastasis , Neoplastic Stem Cells/pathology , Ovarian Neoplasms/metabolism , Ovarian Neoplasms/pathology , Protein Interaction Mapping , Sequence Analysis, RNA
10.
PLoS One ; 10(6): e0130167, 2015.
Article in English | MEDLINE | ID: mdl-26115446

ABSTRACT

We suggest an information-theoretic approach for measuring stylistic coordination in dialogues. The proposed measure has a simple predictive interpretation and can account for various confounding factors through proper conditioning. We revisit some of the previous studies that reported strong signatures of stylistic accommodation, and find that a significant part of the observed coordination can be attributed to a simple confounding effect: length coordination. Specifically, longer utterances tend to be followed by longer responses, which gives rise to spurious correlations in the other stylistic features. We propose a test to distinguish correlations in length due to contextual factors (topic of conversation, user verbosity, etc.) and turn-by-turn coordination. We also suggest a test to identify whether stylistic coordination persists even after accounting for length coordination and contextual factors.


Subjects
Communication , Linguistics , Models, Theoretical , Confounding Factors, Epidemiologic , Humans
11.
Proc IEEE Int Symp Biomed Imaging ; 2015: 980-984, 2015 Apr.
Article in English | MEDLINE | ID: mdl-26413208

ABSTRACT

Cognitive decline in old age is tightly linked with brain atrophy, causing significant burden. It is critical to identify which biomarkers are most predictive of cognitive decline and brain atrophy in the elderly. In 566 older adults from the Alzheimer's Disease Neuroimaging Initiative (ADNI), we used a novel unsupervised machine learning approach to evaluate an extensive list of more than 200 potential brain, blood and cerebrospinal fluid (CSF)-based predictors of cognitive decline. The method, called CorEx, discovers groups of variables with high multivariate mutual information and then constructs latent factors that explain these correlations. The approach produces a hierarchical structure, and the predictive power of biological variables and latent factors is compared using regression. We found that a group of variables containing the well-known AD risk gene APOE and CSF tau and amyloid levels were highly correlated. This latent factor was the most predictive of cognitive decline and brain atrophy.
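
The "multivariate mutual information" of a group of variables is commonly formalized as total correlation, the sum of the individual entropies minus the joint entropy. The sketch below computes that quantity on two toy tables to show what a high or a zero value means. It assumes this formalization and uses exhaustive toy data; it is not the CorEx algorithm, which additionally learns latent factors, and the helper names are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def entropy(samples):
    """Shannon entropy (bits) of the empirical distribution of `samples`."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def total_correlation(rows):
    """TC(X_1..X_n) = sum_i H(X_i) - H(X_1, ..., X_n), in bits."""
    columns = list(zip(*rows))  # one tuple of observations per variable
    return sum(entropy(col) for col in columns) - entropy(rows)


# Three variables that are exact copies of one fair bit: strongly dependent.
copied = [(b, b, b) for b in (0, 1)]

# Three independent fair bits (all eight combinations equally likely).
independent = list(product([0, 1], repeat=3))

tc_copied = total_correlation(copied)
tc_independent = total_correlation(independent)
print(tc_copied, tc_independent)
```

A group like `copied` would be a natural candidate for a single latent factor, since knowing one hidden bit explains all of the dependence among the three variables.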
