1.
BMC Bioinformatics ; 25(1): 195, 2024 May 17.
Article in English | MEDLINE | ID: mdl-38760692

ABSTRACT

BACKGROUND: Pathogenic infections pose a significant threat to global health, affecting millions of people every year and presenting substantial challenges to healthcare systems worldwide. Efficient and timely testing plays a critical role in disease control and transmission prevention. Group testing is a well-established method for reducing the number of tests needed to screen large populations when the disease prevalence is low. However, it does not fully utilize the quantitative information provided by qPCR methods, nor is it able to accommodate a wide range of pathogen loads. RESULTS: To address these issues, we introduce a novel adaptive semi-quantitative group testing (SQGT) scheme to efficiently screen populations via two-stage qPCR testing. The SQGT method quantizes cycle threshold (Ct) values into multiple bins, leveraging the information from the first stage of screening to improve the detection sensitivity. Dynamic Ct threshold adjustments mitigate dilution effects and enhance test accuracy. Comparisons with traditional binary-outcome group testing (GT) methods show that SQGT reduces the number of tests by 24% on the only complete real-world qPCR group testing dataset from Israel, while maintaining a negligible false negative rate. CONCLUSION: Our adaptive SQGT approach, utilizing qPCR data and dynamic threshold adjustments, offers a promising solution for efficient population screening. With a reduction in the number of tests and minimal false negatives, SQGT holds potential to enhance disease control and testing strategies on a global scale.
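
For orientation, the classical two-stage binary scheme that the abstract uses as its baseline can be sketched as follows. This is the textbook Dorfman calculation, not the SQGT method itself. It assumes independent infections at a fixed prevalence and error-free tests; the function names and the prevalence value are illustrative and do not come from the paper.

```python
def dorfman_tests_per_person(prevalence: float, group_size: int) -> float:
    """Expected tests per person under two-stage Dorfman pooling.

    Assumes independent infections and perfectly accurate tests.
    """
    if group_size == 1:
        return 1.0  # no pooling: everyone is tested individually
    # One pooled test is shared by the whole group; every member is
    # retested individually whenever the pool is positive.
    p_pool_positive = 1.0 - (1.0 - prevalence) ** group_size
    return 1.0 / group_size + p_pool_positive


def best_group_size(prevalence: float, max_size: int = 64) -> int:
    """Group size in 1..max_size that minimizes expected tests per person."""
    return min(
        range(1, max_size + 1),
        key=lambda n: dorfman_tests_per_person(prevalence, n),
    )


if __name__ == "__main__":
    p = 0.01  # hypothetical prevalence, chosen only for illustration
    n = best_group_size(p)
    print(n, round(dorfman_tests_per_person(p, n), 4))
```

The binary scheme records only whether a pool is positive or negative. According to the abstract, SQGT additionally bins the pool's Ct value, which is the extra information this baseline discards.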


Subject(s)
Real-Time Polymerase Chain Reaction; Real-Time Polymerase Chain Reaction/methods; Humans
2.
Brief Bioinform ; 23(1), 2022 Jan 17.
Article in English | MEDLINE | ID: mdl-34524425

ABSTRACT

To enable personalized cancer treatment, machine learning models have been developed to predict drug response as a function of tumor and drug features. However, most algorithm development efforts have relied on cross-validation within a single study to assess model accuracy. While an essential first step, cross-validation within a biological data set typically provides an overly optimistic estimate of the prediction performance on independent test sets. To provide a more rigorous assessment of model generalizability between different studies, we use machine learning to analyze five publicly available cell line-based data sets: National Cancer Institute 60, Cancer Therapeutics Response Portal (CTRP), Genomics of Drug Sensitivity in Cancer, Cancer Cell Line Encyclopedia and Genentech Cell Line Screening Initiative (gCSI). Based on observed experimental variability across studies, we explore estimates of prediction upper bounds. We report performance results of a variety of machine learning models, with a multitasking deep neural network achieving the best cross-study generalizability. By multiple measures, models trained on CTRP yield the most accurate predictions on the remaining testing data, and gCSI is the most predictable among the cell line data sets included in this study. With these experiments and further simulations on partial data, two lessons emerge: (1) differences in viability assays can limit model generalizability across studies and (2) drug diversity, more than tumor diversity, is crucial for raising model generalizability in preclinical screening.


Subject(s)
Neoplasms; Algorithms; Cell Line; Humans; Machine Learning; Neoplasms/drug therapy; Neoplasms/genetics; Neural Networks, Computer
3.
PLoS Comput Biol ; 19(11): e1011563, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37971967

ABSTRACT

The mRNA levels of all genes in a genome are a critical piece of information defining the overall state of the cell in a given environmental condition. Being able to reconstruct such condition-specific expression in fungal genomes is particularly important to metabolically engineer these organisms to produce desired chemicals in industrially scalable conditions. Most previous deep learning approaches focused on predicting the average expression levels of a gene based on its promoter sequence, ignoring its variation across different conditions. Here we present FUN-PROSE, a deep learning model trained to predict differential expression of individual genes across various conditions using their promoter sequences and expression levels of all transcription factors. We train and test our model on three fungal species and obtain correlations between predicted and observed condition-specific gene expression as high as 0.85. We then interpret our model to extract promoter sequence motifs responsible for variable expression of individual genes. We also carried out input feature importance analysis to connect individual transcription factors to their gene targets. A sizeable fraction of both sequence motifs and TF-gene interactions learned by our model agree with previously known biological information, while the rest correspond to either novel biological facts or indirect correlations.


Subject(s)
Deep Learning; Saccharomyces cerevisiae; Saccharomyces cerevisiae/genetics; Computational Biology; Transcription Factors/genetics; Transcription Factors/metabolism; Gene Expression
4.
Proc Natl Acad Sci U S A ; 118(8), 2021 Feb 23.
Article in English | MEDLINE | ID: mdl-33593911

ABSTRACT

The central question in the origin of life is to understand how structure can emerge from randomness. Eigen's theory of replication states that, for sequences copied one base at a time, the replication fidelity must surpass an error threshold to keep specific replicated sequences from becoming random through incorporated replication errors [M. Eigen, Naturwissenschaften 58 (10), 465-523 (1971)]. Here, we showed that linking short oligomers from a random sequence pool in a templated ligation reaction reduced the sequence space of product strands. We started from 12-mer oligonucleotides with two bases in all possible combinations and triggered enzymatic ligation under temperature cycles. Surprisingly, we found the robust creation of long, highly structured sequences with low entropy. At the ligation site, complementary and alternating sequence patterns developed. However, between the ligation sites, we found either an A-rich or a T-rich sequence within a single oligonucleotide. Our modeling suggests that avoidance of hairpins was the likely cause for these two complementary sequence pools. What emerged was a network of complementary sequences that acted both as templates and substrates of the reaction. This self-selecting ligation reaction could be restarted by only a few majority sequences. The findings showed that replication by random templated ligation from a random sequence input will lead to a highly structured, long, and nonrandom sequence pool. This is a favorable starting point for a subsequent Darwinian evolution searching for higher catalytic functions in an RNA world scenario.
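
The error threshold mentioned in the abstract can be illustrated with the standard textbook form of Eigen's condition: a master sequence of length L copied with per-base fidelity q persists only if q^L times its selective superiority exceeds 1. The sketch below solves that inequality for the longest sustainable length. The symbol names and the numerical values are illustrative assumptions, not taken from the paper.

```python
import math


def max_sequence_length(per_base_fidelity: float, superiority: float) -> float:
    """Largest length L satisfying q**L * sigma > 1 (Eigen's error threshold).

    per_base_fidelity (q): probability that one base is copied correctly.
    superiority (sigma): replication advantage of the master sequence
    over the average of its mutants. Both names are illustrative.
    """
    if not 0.0 < per_base_fidelity < 1.0:
        raise ValueError("per_base_fidelity must lie strictly between 0 and 1")
    if superiority <= 1.0:
        raise ValueError("superiority must exceed 1 for a threshold to exist")
    # q**L * sigma > 1  <=>  L < ln(sigma) / -ln(q)
    return math.log(superiority) / -math.log(per_base_fidelity)


if __name__ == "__main__":
    # Hypothetical values: 1% error per base, tenfold superiority.
    print(max_sequence_length(per_base_fidelity=0.99, superiority=10.0))
```

Because the bound grows only logarithmically with the superiority but roughly as 1/(1 - q) with fidelity, base-by-base copying at low fidelity supports only short sequences, which is the constraint the ligation experiments in the abstract are set against.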


Subject(s)
Evolution, Molecular; Nucleic Acid Conformation; Oligonucleotides/chemistry; Origin of Life; Templates, Genetic; DNA-Directed DNA Polymerase/metabolism
5.
Proc Natl Acad Sci U S A ; 118(17), 2021 Apr 27.
Article in English | MEDLINE | ID: mdl-33833080

ABSTRACT

Epidemics generally spread through a succession of waves that reflect factors on multiple timescales. On short timescales, superspreading events lead to burstiness and overdispersion, whereas long-term persistent heterogeneity in susceptibility is expected to lead to a reduction in both the infection peak and the herd immunity threshold (HIT). Here, we develop a general approach to encompass both timescales, including time variations in individual social activity, and demonstrate how to incorporate them phenomenologically into a wide class of epidemiological models through reparameterization. We derive a nonlinear dependence of the effective reproduction number [Formula: see text] on the susceptible population fraction S. We show that a state of transient collective immunity (TCI) emerges well below the HIT during early, high-paced stages of the epidemic. However, this is a fragile state that wanes over time due to changing levels of social activity, and so the infection peak is not an indication of long-lasting herd immunity: Subsequent waves may emerge due to behavioral changes in the population, driven by, for example, seasonal factors. Transient and long-term levels of heterogeneity are estimated using empirical data from the COVID-19 epidemic and from real-life face-to-face contact networks. These results suggest that the hardest hit areas, such as New York City, have achieved TCI following the first wave of the epidemic, but likely remain below the long-term HIT. Thus, in contrast to some previous claims, these regions can still experience subsequent waves.
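
To make the link between heterogeneity and a lower herd immunity threshold concrete, the sketch below assumes a power-law dependence of the effective reproduction number on the susceptible fraction, R_e = R_0 * S**lam. The abstract states only that the dependence is nonlinear, so this functional form, the exponent name `lam`, and the numerical values are assumptions made purely for illustration. Setting lam = 1 recovers the classical homogeneous result HIT = 1 - 1/R_0.

```python
def effective_reproduction_number(r0: float, susceptible_fraction: float,
                                  lam: float = 1.0) -> float:
    """Assumed power-law form R_e = R_0 * S**lam (lam = 1 is homogeneous)."""
    return r0 * susceptible_fraction ** lam


def herd_immunity_threshold(r0: float, lam: float = 1.0) -> float:
    """Fraction no longer susceptible when R_e first drops to 1."""
    if r0 <= 1.0:
        return 0.0  # the epidemic cannot grow in the first place
    # Solve r0 * S**lam = 1 for S, then HIT = 1 - S.
    return 1.0 - r0 ** (-1.0 / lam)


if __name__ == "__main__":
    r0 = 3.0  # hypothetical basic reproduction number
    for lam in (1.0, 2.0, 4.0):  # hypothetical heterogeneity exponents
        print(lam, round(herd_immunity_threshold(r0, lam), 3))
```

Under this assumed form, a larger exponent means the most susceptible or most active individuals are removed first, so R_e falls to 1 at a smaller infected fraction than the homogeneous formula predicts.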


Subject(s)
COVID-19; Epidemics; Immunity, Herd; Models, Immunological; SARS-CoV-2/immunology; COVID-19/epidemiology; COVID-19/immunology; COVID-19/transmission; Humans; United States/epidemiology
6.
PLoS Comput Biol ; 18(12): e1010244, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36574450

ABSTRACT

Recent observations have revealed that closely related strains of the same microbial species can stably coexist in natural and laboratory settings subject to boom and bust dynamics and serial dilutions, respectively. However, the possible mechanisms enabling the coexistence of only a handful of strains, but not more, have thus far remained unknown. Here, using a consumer-resource model of microbial ecosystems, we propose that by differentiating along Monod parameters characterizing microbial growth rates in high and low nutrient conditions, strains can coexist in patterns similar to those observed. In our model, boom and bust environments create satellite niches due to resource concentrations varying in time. These satellite niches can be occupied by closely related strains, thereby enabling their coexistence. We demonstrate that this result is valid even in complex environments consisting of multiple resources and species. In these complex communities, each species partitions resources differently and creates separate sets of satellite niches for their own strains. While there is no theoretical limit to the number of coexisting strains, in our simulations, we always find between 1 and 3 strains coexisting, consistent with known experiments and observations.
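
The trade-off the abstract describes, strains differing in their Monod parameters at high and low nutrient levels, can be illustrated with the standard Monod growth law g(c) = g_max * c / (K + c). The sketch below is not the paper's consumer-resource model; the two strains and all parameter values are hypothetical, chosen so that one strain grows faster when nutrients are abundant and the other when they are scarce.

```python
def monod_growth_rate(concentration: float, g_max: float, k_half: float) -> float:
    """Monod law: growth rate saturates at g_max, half-maximal at k_half."""
    return g_max * concentration / (k_half + concentration)


def crossover_concentration(g_a: float, k_a: float, g_b: float, k_b: float) -> float:
    """Nutrient concentration at which two Monod curves intersect.

    Obtained by solving g_a*c/(k_a + c) = g_b*c/(k_b + c) for c > 0.
    """
    return (g_b * k_a - g_a * k_b) / (g_a - g_b)


# Hypothetical strains: A is the "boom" specialist (high g_max),
# B is the "bust" specialist (low half-saturation constant).
STRAIN_A = {"g_max": 1.0, "k_half": 1.0}
STRAIN_B = {"g_max": 0.6, "k_half": 0.1}

if __name__ == "__main__":
    for c in (0.05, 10.0):
        rate_a = monod_growth_rate(c, **STRAIN_A)
        rate_b = monod_growth_rate(c, **STRAIN_B)
        print(c, round(rate_a, 3), round(rate_b, 3))
    print(crossover_concentration(1.0, 1.0, 0.6, 0.1))
```

When the resource concentration sweeps across the crossover point during each boom and bust cycle, each strain has a time window in which it grows faster, which is the intuition behind the satellite niches described in the abstract.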


Subject(s)
Ecosystem; Microbiota
7.
PLoS Comput Biol ; 16(8): e1008135, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32810127

ABSTRACT

Social interaction between microbes can be described at many levels of detail: from the biochemistry of cell-cell interactions to the ecological dynamics of populations. Choosing an appropriate level to model microbial communities without losing generality remains a challenge. Here we show that modeling cross-feeding interactions at an intermediate level between genome-scale metabolic models of individual species and consumer-resource models of ecosystems is well suited to experimental data. We applied our modeling framework to three published examples of multi-strain Escherichia coli communities with increasing complexity: uni-, bi-, and multi-directional cross-feeding of either substitutable metabolic byproducts or essential nutrients. The intermediate-scale model accurately fit empirical data and quantified metabolic exchange rates that are hard to measure experimentally, even for a complex community of 14 amino acid auxotrophies. By studying the conditions of species coexistence, the ecological outcomes of cross-feeding interactions, and each community's robustness to perturbations, we extracted new quantitative insights from these three published experimental datasets. Our analysis provides a foundation to quantify cross-feeding interactions from experimental data, and highlights the importance of metabolic exchanges in the dynamics and stability of microbial communities.


Subject(s)
Microbiota; Bacteria/classification; Bacteria/metabolism; Models, Biological
8.
PLoS Comput Biol ; 15(12): e1007524, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31856158

ABSTRACT

The human gut microbiome is a complex ecosystem, in which hundreds of microbial species and metabolites coexist, in part due to an extensive network of cross-feeding interactions. However, both the large-scale trophic organization of this ecosystem, and its effects on the underlying metabolic flow, remain unexplored. Here, using a simplified model, we provide quantitative support for a multi-level trophic organization of the human gut microbiome, where microbes consume and secrete metabolites in multiple iterative steps. Using a manually-curated set of metabolic interactions between microbes, our model suggests about four trophic levels, each characterized by a high level-to-level metabolic transfer of byproducts. It also quantitatively predicts the typical metabolic environment of the gut (fecal metabolome) in approximate agreement with the real data. To understand the consequences of this trophic organization, we quantify the metabolic flow and biomass distribution, and explore patterns of microbial and metabolic diversity in different levels. The hierarchical trophic organization suggested by our model can help mechanistically establish causal links between the abundances of microbes and metabolites in the human gut.


Subject(s)
Gastrointestinal Microbiome/physiology; Models, Biological; Biomass; Computational Biology; Computer Simulation; Ecosystem; Humans; Metabolome; Microbial Interactions; Systems Biology
9.
Nucleic Acids Res ; 45(13): 7615-7622, 2017 Jul 27.
Article in English | MEDLINE | ID: mdl-28605556

ABSTRACT

Among several quantitative invariants found in evolutionary genomics, one of the most striking is the scaling of the overall abundance of proteins, or protein domains, sharing a specific functional annotation across genomes of given size. The sizes of these functional categories change, on average, as power laws in the total number of protein-coding genes. Here, we show that such regularities are not restricted to the overall behavior of high-level functional categories, but also exist systematically at the level of single evolutionary families of protein domains. Specifically, the number of proteins within each family follows family-specific scaling laws with genome size. Functionally similar sets of families tend to follow similar scaling laws, but this is not always the case. To understand this systematically, we provide a comprehensive classification of families based on their scaling properties. Additionally, we develop a quantitative score for the heterogeneity of the scaling of families belonging to a given category or predefined group. Under the common reasonable assumption that selection is driven solely or mainly by biological function, these findings point to fine-tuned and interdependent functional roles of specific protein domains, beyond our current functional annotations. This analysis provides a deeper view of the links between evolutionary expansion of protein families and the functional constraints shaping the gene repertoire of bacterial genomes.
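
A family-specific scaling law of the kind described here has the form n_family ≈ a * N**b, where N is the number of protein-coding genes. One common way to estimate the exponent b is ordinary least squares on log-transformed values, sketched below. This is a generic fitting recipe, not necessarily the estimator used in the paper, and the genome sizes and counts are synthetic.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(genome_sizes: Sequence[float],
                  family_counts: Sequence[float]) -> Tuple[float, float]:
    """Fit count = prefactor * size**exponent by least squares in log-log space.

    Returns (exponent, prefactor). All inputs must be strictly positive.
    """
    xs = [math.log(size) for size in genome_sizes]
    ys = [math.log(count) for count in family_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx  # slope of the log-log regression line
    prefactor = math.exp(mean_y - exponent * mean_x)
    return exponent, prefactor


if __name__ == "__main__":
    # Synthetic data following an exact quadratic law, for illustration only.
    sizes = [1000, 2000, 4000, 8000]
    counts = [1e-5 * s ** 2 for s in sizes]
    print(fit_power_law(sizes, counts))
```

Families with zero members in some genomes cannot be log-transformed directly, so real data would need a choice about how to handle zeros; that choice is outside the scope of this sketch.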


Subject(s)
Evolution, Molecular; Genome, Bacterial; Bacterial Proteins/chemistry; Bacterial Proteins/classification; Bacterial Proteins/genetics; Genome Size; Protein Domains/genetics; Proteome/chemistry; Proteome/classification; Proteome/genetics; Transcription Factors/chemistry; Transcription Factors/classification; Transcription Factors/genetics
10.
BMC Bioinformatics ; 19(Suppl 18): 486, 2018 Dec 21.
Article in English | MEDLINE | ID: mdl-30577754

ABSTRACT

BACKGROUND: The National Cancer Institute drug pair screening effort against 60 well-characterized human tumor cell lines (NCI-60) presents an unprecedented resource for modeling combinational drug activity. RESULTS: We present a computational model for predicting cell line response to a subset of drug pairs in the NCI-ALMANAC database. Based on residual neural networks for encoding features as well as predicting tumor growth, our model explains 94% of the response variance. While our best result is achieved with a combination of molecular feature types (gene expression, microRNA and proteome), we show that most of the predictive power comes from drug descriptors. To further demonstrate value in detecting anticancer therapy, we rank the drug pairs for each cell line based on model predicted combination effect and recover 80% of the top pairs with enhanced activity. CONCLUSIONS: We present promising results in applying deep learning to predicting combinational drug response. Our feature analysis indicates screening data involving more cell lines are needed for the models to make better use of molecular features.


Subject(s)
Deep Learning/trends; Drug Evaluation, Preclinical/methods; Cell Line, Tumor; Humans; National Cancer Institute (U.S.); Neural Networks, Computer; United States