1.
Bioinformatics ; 40(5)2024 May 02.
Article in English | MEDLINE | ID: mdl-38603606

ABSTRACT

MOTIVATION: Predictive biological signatures provide utility as biomarkers for disease diagnosis and prognosis, as well as prediction of responses to vaccination or therapy. These signatures are identified from high-throughput profiling assays through a combination of dimensionality reduction and machine learning techniques. The genes, proteins, metabolites, and other biological analytes that compose signatures also generate hypotheses on the underlying mechanisms driving biological responses, thus improving biological understanding. Dimensionality reduction is a critical step in signature discovery to address the large number of analytes in omics datasets, especially for multi-omics profiling studies with tens of thousands of measurements. Latent factor models, which can account for the structural heterogeneity across diverse assays, effectively integrate multi-omics data and reduce dimensionality to a small number of factors that capture correlations and associations among measurements. These factors provide biologically interpretable features for predictive modeling. However, multi-omics integration and predictive modeling are generally performed independently in sequential steps, leading to suboptimal factor construction. Combining these steps can yield better multi-omics signatures that are more predictive while still being biologically meaningful. RESULTS: We developed a supervised variational Bayesian factor model that extracts multi-omics signatures from high-throughput profiling datasets that can span multiple data types. Signature-based multiPle-omics intEgration via lAtent factoRs (SPEAR) adaptively determines factor rank, emphasis on factor structure, data relevance and feature sparsity. The method improves the reconstruction of underlying factors in synthetic examples and prediction accuracy of coronavirus disease 2019 severity and breast cancer tumor subtypes. 
AVAILABILITY AND IMPLEMENTATION: SPEAR is a publicly available R-package hosted at https://bitbucket.org/kleinstein/SPEAR.


Subjects
Bayes Theorem , Computational Biology , Multiomics , Humans , Computational Biology/methods , Genomics/methods , Supervised Machine Learning
2.
Hum Vaccin Immunother ; 19(2): 2251830, 2023 08 01.
Article in English | MEDLINE | ID: mdl-37697867

ABSTRACT

Overfitting describes the phenomenon where a highly predictive model on the training data generalizes poorly to future observations. It is a common concern when applying machine learning techniques to contemporary medical applications, such as predicting vaccination response and disease status in infectious disease or cancer studies. This review examines the causes of overfitting and offers strategies to counteract it, focusing on model complexity reduction, reliable model evaluation, and harnessing data diversity. Through discussion of the underlying mathematical models and illustrative examples using both synthetic data and published real datasets, our objective is to equip analysts and bioinformaticians with the knowledge and tools necessary to detect and mitigate overfitting in their research.


Subjects
Machine Learning , Vaccination
3.
Nature ; 623(7985): 139-148, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37748514

ABSTRACT

Post-acute infection syndromes may develop after acute viral disease [1]. Infection with SARS-CoV-2 can result in the development of a post-acute infection syndrome known as long COVID. Individuals with long COVID frequently report unremitting fatigue, post-exertional malaise, and a variety of cognitive and autonomic dysfunctions [2-4]. However, the biological processes that are associated with the development and persistence of these symptoms are unclear. Here 275 individuals with or without long COVID were enrolled in a cross-sectional study that included multidimensional immune phenotyping and unbiased machine learning methods to identify biological features associated with long COVID. Marked differences were noted in circulating myeloid and lymphocyte populations relative to the matched controls, as well as evidence of exaggerated humoral responses directed against SARS-CoV-2 among participants with long COVID. Furthermore, higher antibody responses directed against non-SARS-CoV-2 viral pathogens were observed among individuals with long COVID, particularly Epstein-Barr virus. Levels of soluble immune mediators and hormones varied among groups, with cortisol levels being lower among participants with long COVID. Integration of immune phenotyping data into unbiased machine learning models identified the key features that are most strongly associated with long COVID status. Collectively, these findings may help to guide future studies into the pathobiology of long COVID and help with developing relevant biomarkers.


Subjects
Antibodies, Viral , Herpesvirus 4, Human , Hydrocortisone , Lymphocytes , Myeloid Cells , Post-Acute COVID-19 Syndrome , SARS-CoV-2 , Humans , Antibodies, Viral/blood , Antibodies, Viral/immunology , Biomarkers/blood , Cross-Sectional Studies , Herpesvirus 4, Human/immunology , Hydrocortisone/blood , Immunophenotyping , Lymphocytes/immunology , Machine Learning , Myeloid Cells/immunology , Post-Acute COVID-19 Syndrome/diagnosis , Post-Acute COVID-19 Syndrome/immunology , Post-Acute COVID-19 Syndrome/physiopathology , Post-Acute COVID-19 Syndrome/virology , SARS-CoV-2/immunology
4.
bioRxiv ; 2023 Sep 27.
Article in English | MEDLINE | ID: mdl-36747790

ABSTRACT

MOTIVATION: Predictive biological signatures provide utility as biomarkers for disease diagnosis and prognosis, as well as prediction of responses to vaccination or therapy. These signatures are identified from high-throughput profiling assays through a combination of dimensionality reduction and machine learning techniques. The genes, proteins, metabolites, and other biological analytes that compose signatures also generate hypotheses on the underlying mechanisms driving biological responses, thus improving biological understanding. Dimensionality reduction is a critical step in signature discovery to address the large number of analytes in omics datasets, especially for multi-omics profiling studies with tens of thousands of measurements. Latent factor models, which can account for the structural heterogeneity across diverse assays, effectively integrate multi-omics data and reduce dimensionality to a small number of factors that capture correlations and associations among measurements. These factors provide biologically interpretable features for predictive modeling. However, multi-omics integration and predictive modeling are generally performed independently in sequential steps, leading to suboptimal factor construction. Combining these steps can yield better multi-omics signatures that are more predictive while still being biologically meaningful. RESULTS: We developed a supervised variational Bayesian factor model that extracts multi-omics signatures from high-throughput profiling datasets that can span multiple data types. Signature-based multiPle-omics intEgration via lAtent factoRs (SPEAR) adaptively determines factor rank, emphasis on factor structure, data relevance and feature sparsity. The method improves the reconstruction of underlying factors in synthetic examples and prediction accuracy of COVID-19 severity and breast cancer tumor subtypes. AVAILABILITY: SPEAR is a publicly available R-package hosted at https://bitbucket.org/kleinstein/SPEAR.

5.
PLoS Comput Biol ; 13(12): e1005875, 2017 12.
Article in English | MEDLINE | ID: mdl-29281633

ABSTRACT

Mass cytometry (CyTOF) has greatly expanded the capability of cytometry. It is now easy to generate multiple CyTOF samples in a single study, with each sample containing single-cell measurement on 50 markers for more than hundreds of thousands of cells. Current methods do not adequately address the issues concerning combining multiple samples for subpopulation discovery, and these issues can be quickly and dramatically amplified with increasing number of samples. To overcome this limitation, we developed Partition-Assisted Clustering and Multiple Alignments of Networks (PAC-MAN) for the fast automatic identification of cell populations in CyTOF data closely matching that of expert manual-discovery, and for alignments between subpopulations across samples to define dataset-level cellular states. PAC-MAN is computationally efficient, allowing the management of very large CyTOF datasets, which are increasingly common in clinical studies and cancer studies that monitor various tissue samples for each subject.


Subjects
Single-Cell Analysis/statistics & numerical data , Animals , Biomarkers/analysis , Cluster Analysis , Computational Biology , Computer Simulation , Data Interpretation, Statistical , Databases, Factual , Flow Cytometry/statistics & numerical data , Gene Expression , Humans , Mice