Results 1 - 14 of 14
1.
Sci Rep ; 14(1): 8165, 2024 04 08.
Article in English | MEDLINE | ID: mdl-38589653

ABSTRACT

Accurately calling indels with next-generation sequencing (NGS) data is critical for clinical application. The precisionFDA team collaborated with the U.S. Food and Drug Administration's (FDA's) National Center for Toxicological Research (NCTR) and successfully completed the NCTR Indel Calling from Oncopanel Sequencing Data Challenge, to evaluate the performance of indel calling pipelines. Top performers were selected based on precision, recall, and F1-score. The performance of many other pipelines was close to that of the top performers, producing a top cluster of performers. The performance was significantly higher in high confidence regions and coding regions, and significantly lower in low complexity regions. Oncopanel capture and other issues may have occurred that affected the recall rate. Indels with higher variant allele frequency (VAF) may generally be called with higher confidence. Many of the indel calling pipelines had good performance. Some of them performed generally well across all three oncopanels, while others were better for a specific oncopanel. The performance of indel calling can further be improved by restricting the calls within high confidence intervals (HCIs) and coding regions, and by excluding low complexity regions (LCRs). Certain VAF cut-offs could be applied according to the applications.


Subject(s)
High-Throughput Nucleotide Sequencing , INDEL Mutation , Single Nucleotide Polymorphism
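
The abstract above ranks pipelines by precision, recall, and F1-score. As a minimal sketch of how those three metrics relate, assuming indel calls have already been compared against a truth set and reduced to counts of true positives, false positives, and false negatives (the function name and the counts are hypothetical, not taken from the challenge):

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from true-positive, false-positive,
    and false-negative counts. Undefined ratios are reported as 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for one pipeline on one oncopanel (illustrative only).
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Because F1 is a harmonic mean, a pipeline with missed calls (low recall) cannot compensate fully with high precision, which is one reason capture issues that depress recall matter for the ranking.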
2.
medRxiv ; 2023 Dec 13.
Article in English | MEDLINE | ID: mdl-38168217

ABSTRACT

The COVID-19 pandemic had disproportionate effects on the Veteran population due to the increased prevalence of medical and environmental risk factors. Synthetic electronic health record (EHR) data can help meet the acute need for Veteran population-specific predictive modeling efforts by avoiding the strict barriers to access currently present within Veterans Health Administration (VHA) datasets. The U.S. Food and Drug Administration (FDA) and the VHA launched the precisionFDA COVID-19 Risk Factor Modeling Challenge to develop COVID-19 diagnostic and prognostic models; identify Veteran population-specific risk factors; and test the usefulness of synthetic data as a substitute for real data. The use of synthetic data boosted challenge participation by providing a dataset that was accessible to all competitors. Models trained on synthetic data showed performance metrics that were similar to, but systematically inflated relative to, those of models trained on real data. The important risk factors identified in the synthetic data largely overlapped with those identified from the real data, and both sets of risk factors were validated in the literature. Tradeoffs exist between synthetic data generation approaches based on whether a real EHR dataset is required as input. Synthetic data generated directly from real EHR input will more closely align with the characteristics of the relevant cohort. This work shows that synthetic EHR data will have practical value to the Veterans' health research community for the foreseeable future.

3.
Glob Health Epidemiol Genom ; 2022: 6499217, 2022.
Article in English | MEDLINE | ID: mdl-35707747

ABSTRACT

The 2019 coronavirus disease (COVID-19) pandemic has demonstrated the importance of predicting, identifying, and tracking mutations throughout a pandemic event. As the COVID-19 global pandemic surpassed one year, several variants had emerged resulting in increased severity and transmissibility. Here, we used PCR as a surrogate for viral load and consequent severity to evaluate the real-world capabilities of a genome-based clinical severity predictive algorithm. Using a previously published algorithm, we compared the viral genome-based severity predictions to clinically derived PCR-based viral load of 716 viral genomes. For those samples predicted to be "severe" (probability of severe illness >0.5), we observed an average cycle threshold (Ct) of 18.3, whereas those in the "mild" category (severity probability <0.5) had an average Ct of 20.4 (P = 0.0017). We also found a nontrivial correlation between predicted severity probability and cycle threshold (r = -0.199). Finally, when divided into severity probability quartiles, the group most likely to experience severe illness (≥75% probability) had a Ct of 16.6 (n = 10), whereas the group least likely to experience severe illness (<25% probability) had a Ct of 21.4 (n = 350) (P = 0.0045). Taken together, our results suggest that the severity predicted by a genome-based algorithm can be related to clinical diagnostic tests and that relative severity may be inferred from diagnostic values.


Subject(s)
COVID-19 , SARS-CoV-2 , COVID-19/diagnosis , COVID-19/genetics , Humans , Real-Time Polymerase Chain Reaction , SARS-CoV-2/genetics , Severity of Illness Index , Viral Load/genetics
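
The abstract above reports a correlation coefficient between predicted severity probability and cycle threshold. The following is a minimal sketch of a Pearson correlation, assuming that is the statistic meant by r (the abstract does not name it); the function name and the three data points are hypothetical and chosen only so the arithmetic is easy to follow:

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical, perfectly linear toy data (not from the study):
# higher predicted severity probability paired with lower Ct.
severity_prob = [0.1, 0.5, 0.9]
ct_values = [22.0, 20.0, 18.0]
r = pearson_r(severity_prob, ct_values)
print(f"r = {r:.3f}")
```

A negative r means that samples with higher predicted severity tend to have lower Ct values. Since a lower Ct indicates that more viral material was present in the sample, the sign of the reported r = -0.199 is in the direction the authors' hypothesis would predict, although the magnitude is modest.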
4.
Cell Genom ; 2(5)2022 May 11.
Article in English | MEDLINE | ID: mdl-35720974

ABSTRACT

The precisionFDA Truth Challenge V2 aimed to assess the state of the art of variant calling in challenging genomic regions. Starting with FASTQs, 20 challenge participants applied their variant-calling pipelines and submitted 64 variant call sets for one or more sequencing technologies (Illumina, PacBio HiFi, and Oxford Nanopore Technologies). Submissions were evaluated following best practices for benchmarking small variants with updated Genome in a Bottle benchmark sets and genome stratifications. Challenge submissions included numerous innovative methods, with graph-based and machine learning methods scoring best for short-read and long-read datasets, respectively. With machine learning approaches, combining multiple sequencing technologies performed particularly well. Recent developments in sequencing and variant calling have enabled benchmarking variants in challenging genomic regions, paving the way for the identification of previously unknown clinically relevant variants.

5.
Evol Med Public Health ; 9(1): 267-275, 2021.
Article in English | MEDLINE | ID: mdl-34447577

ABSTRACT

INTRODUCTION: The coronavirus disease 2019 (COVID-19) pandemic is a global public health emergency causing a disparate burden of death and disability around the world. The viral genetic variants associated with outcome severity are still being discovered. METHODS: We downloaded 155 958 severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes from GISAID. Of these genomes, 3637 samples included useable metadata on patient outcomes. Using this subset, we evaluated whether SARS-CoV-2 viral genomic variants improved prediction of reported severity beyond age and region. First, we established whether including genomic variants as model features meaningfully increased the predictive power of our model. Next, we evaluated specific variants in order to determine the magnitude of association with severity and the frequency of these variants among SARS-CoV-2 genomes. RESULTS: Logistic regression models that included viral genomic variants outperformed other models (area under the curve = 0.91 as compared with 0.68 for age and gender alone; P < 0.001). We found 84 variants with odds ratios greater than 2 for outcome severity (17 and 67 for higher and lower severity, respectively). The median frequency of associated variants was 0.15% (interquartile range 0.09-0.45%). Altogether 85% of genomes had at least one variant associated with patient outcome. CONCLUSION: Numerous SARS-CoV-2 variants have 2-fold or greater association with odds of mild or severe outcome and collectively, these variants are common. In addition to comprehensive mitigation efforts, public health measures should be prioritized to control the more severe manifestations of COVID-19 and the transmission chains linked to these severe cases. LAY SUMMARY: This study explores which, if any, SARS-CoV-2 viral genomic variants are associated with mild or severe COVID-19 patient outcomes. Our results suggest that there are common genomic variants in SARS-CoV-2 that are more often associated with negative patient outcomes, which may impact downstream public health measures.
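
The abstract above reports variants by odds ratio. As a minimal sketch of what a 2-fold association means, the code below computes an unadjusted odds ratio from a 2x2 table of outcome by variant status. This is a simplification: the study used logistic regression models, whose odds ratios also account for other features. The function names, the counts, and the exact handling of the threshold are hypothetical.

```python
def odds_ratio(severe_with: int, mild_with: int,
               severe_without: int, mild_without: int) -> float:
    """Unadjusted odds ratio for a severe outcome, given variant presence.

    Odds with the variant    = severe_with / mild_with
    Odds without the variant = severe_without / mild_without
    """
    if mild_with == 0 or severe_without == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (severe_with * mild_without) / (mild_with * severe_without)


def is_twofold(or_value: float) -> bool:
    """True if the association is more than 2-fold in either direction
    (toward higher severity, or toward lower severity)."""
    return or_value > 2.0 or or_value < 0.5


# Hypothetical counts (illustrative only, not from the study).
example = odds_ratio(severe_with=20, mild_with=10,
                     severe_without=100, mild_without=200)
print(example, is_twofold(example))
```

An odds ratio below 0.5 is the mirror image of one above 2, which is how a single threshold can yield variants associated with both higher and lower severity, as the abstract reports.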

6.
Patterns (N Y) ; 2(5): 100245, 2021 May 14.
Article in English | MEDLINE | ID: mdl-34036290

ABSTRACT

Sample mislabeling or misannotation has been a long-standing problem in scientific research, particularly prevalent in large-scale, multi-omic studies due to the complexity of multi-omic workflows. There is an urgent need for quality controls that automatically screen for and correct sample mislabels or misannotations in multi-omic studies. Here, we describe the crowdsourced precisionFDA NCI-CPTAC Multi-omics Enabled Sample Mislabeling Correction Challenge, which provides a framework for systematic benchmarking and evaluation of mislabel identification and correction methods for integrative proteogenomic studies. The challenge received a large number of submissions from domestic and international data scientists, with highly variable performance observed across the submitted methods. Post-challenge collaboration between the top-performing teams and the challenge organizers has produced open-source software, COSMO, with demonstrated high accuracy and robustness in mislabeling identification and correction in simulated and real multi-omic datasets.

7.
J Am Med Inform Assoc ; 28(7): 1582-1590, 2021 07 14.
Article in English | MEDLINE | ID: mdl-33895824

ABSTRACT

Artificial intelligence (AI) is critical to harnessing value from exponentially growing health and healthcare data. Expectations are high for AI solutions to effectively address current health challenges. However, there have been prior periods of enthusiasm for AI followed by periods of disillusionment and reduced investment and progress, known as "AI Winters." We are now at risk of another AI Winter in health/healthcare due to increasing publicity of AI solutions that do not deliver the touted breakthroughs, thereby decreasing users' trust in AI. In this article, we first highlight recently published literature on AI risks and mitigation strategies that would be relevant for groups considering designing, implementing, and promoting self-governance. We then describe a process for how a diverse group of stakeholders could develop and define standards for promoting trust, as well as AI risk-mitigating practices through greater industry self-governance. We also describe how adherence to such standards could be verified, specifically through certification/accreditation. Self-governance could be encouraged by governments to complement existing regulatory schema or legislative efforts to mitigate AI risks. Greater adoption of industry self-governance could fill a critical gap to construct a more comprehensive approach to the governance of AI solutions than US legislation/regulations currently encompass. In this more comprehensive approach, AI developers, AI users, and government/legislators all have critical roles to play to advance practices that maintain trust in AI and prevent another AI Winter.


Subject(s)
Artificial Intelligence , Trust , Accreditation , Delivery of Health Care , Health Facilities
8.
Genetics ; 208(4): 1643-1656, 2018 04.
Article in English | MEDLINE | ID: mdl-29487137

ABSTRACT

Insulin resistance is associated with obesity, cardiovascular disease, non-alcoholic fatty liver disease, and type 2 diabetes. These complications are exacerbated by a high-calorie diet, which we used to model type 2 diabetes in Drosophila melanogaster. Our studies focused on the fat body, an adipose- and liver-like tissue that stores fat and maintains circulating glucose. A gene regulatory network was constructed to predict potential regulators of insulin signaling in this tissue. Genomic characterization of fat bodies suggested a central role for the transcription factor Seven-up (Svp). Here, we describe a new role for Svp as a positive regulator of insulin signaling. Tissue-specific loss-of-function showed that Svp is required in the fat body to promote glucose clearance, lipid turnover, and insulin signaling. Svp appears to promote insulin signaling, at least in part, by inhibiting ecdysone signaling. Svp also impairs the immune response, possibly via inhibition of antimicrobial peptide expression in the fat body. Taken together, these studies show that gene regulatory networks can help identify positive regulators of insulin signaling and metabolic homeostasis using the Drosophila fat body.


Subject(s)
DNA-Binding Proteins/metabolism , Insulin/metabolism , Steroid Receptors/metabolism , Signal Transduction , Adipose Tissue , Animal Feed , Animals , DNA-Binding Proteins/genetics , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Dyslipidemias/etiology , Dyslipidemias/metabolism , Energy Metabolism , Female , Gene Expression Profiling , Gene Expression Regulation , Gene Knockdown Techniques , Gene Regulatory Networks , Glucose/metabolism , Homeostasis , Male , Metabolome , Metabolomics/methods , Protein Binding , Steroid Receptors/genetics , Transcriptome
9.
Bioinformatics ; 34(2): 249-257, 2018 Jan 15.
Article in English | MEDLINE | ID: mdl-28968736

ABSTRACT

MOTIVATION: Cells process information, in part, through transcription factor (TF) networks, which control the rates at which individual genes produce their products. A TF network map is a graph that indicates which TFs bind and directly regulate each gene. Previous work has described network mapping algorithms that rely exclusively on gene expression data and 'integrative' algorithms that exploit a wide range of data sources including chromatin immunoprecipitation sequencing (ChIP-seq) of many TFs, genome-wide chromatin marks, and binding specificities for many TFs determined in vitro. However, such resources are available only for a few major model systems and cannot be easily replicated for new organisms or cell types. RESULTS: We present NetProphet 2.0, a 'data light' algorithm for TF network mapping, and show that it is more accurate at identifying direct targets of TFs than other, similarly data light algorithms. In particular, it improves on the accuracy of NetProphet 1.0, which used only gene expression data, by exploiting three principles. First, combining multiple approaches to network mapping from expression data can improve accuracy relative to the constituent approaches. Second, TFs with similar DNA binding domains bind similar sets of target genes. Third, even a noisy, preliminary network map can be used to infer DNA binding specificities from promoter sequences and these inferred specificities can be used to further improve the accuracy of the network map. AVAILABILITY AND IMPLEMENTATION: Source code and comprehensive documentation are freely available at https://github.com/yiming-kang/NetProphet_2.0. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

10.
Proc Natl Acad Sci U S A ; 113(47): E7428-E7437, 2016 11 22.
Article in English | MEDLINE | ID: mdl-27810962

ABSTRACT

The ability to rationally manipulate the transcriptional states of cells would be of great use in medicine and bioengineering. We have developed an algorithm, NetSurgeon, which uses genome-wide gene-regulatory networks to identify interventions that force a cell toward a desired expression state. We first validated NetSurgeon extensively on existing datasets. Next, we used NetSurgeon to select transcription factor deletions aimed at improving ethanol production in Saccharomyces cerevisiae cultures that are catabolizing xylose. We reasoned that interventions that move the transcriptional state of cells using xylose toward that of cells producing large amounts of ethanol from glucose might improve xylose fermentation. Some of the interventions selected by NetSurgeon successfully promoted a fermentative transcriptional state in the absence of glucose, resulting in strains with a 2.7-fold increase in xylose import rates, a 4-fold improvement in xylose integration into central carbon metabolism, or a 1.3-fold increase in ethanol production rate. We conclude by presenting an integrated model of transcriptional regulation and metabolic flux that will enable future efforts aimed at improving xylose fermentation to prioritize functional regulators of central carbon metabolism.


Subject(s)
Gene Deletion , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/growth & development , Transcription Factors/genetics , Algorithms , Ethanol/metabolism , Fermentation , Gene Regulatory Networks , Glucose/metabolism , Metabolic Engineering , Genetic Models , Saccharomyces cerevisiae/genetics , Transcriptome , Xylose/metabolism
11.
mBio ; 7(2): e00313-16, 2016 Apr 19.
Article in English | MEDLINE | ID: mdl-27094327

ABSTRACT

Cryptococcus neoformans is a ubiquitous, opportunistic fungal pathogen that kills over 600,000 people annually. Here, we report integrated computational and experimental investigations of the role and mechanisms of transcriptional regulation in cryptococcal infection. Major cryptococcal virulence traits include melanin production and the development of a large polysaccharide capsule upon host entry; shed capsule polysaccharides also impair host defenses. We found that both transcription and translation are required for capsule growth and that Usv101 is a master regulator of pathogenesis, regulating melanin production, capsule growth, and capsule shedding. It does this by directly regulating genes encoding glycoactive enzymes and genes encoding three other transcription factors that are essential for capsule growth: GAT201, RIM101, and SP1. Murine infection with cryptococci lacking Usv101 significantly alters the kinetics and pathogenesis of disease, with extended survival and, unexpectedly, death by pneumonia rather than meningitis. Our approaches and findings will inform studies of other pathogenic microbes. IMPORTANCE: Cryptococcus neoformans causes fatal meningitis in immunocompromised individuals, mainly HIV positive, killing over 600,000 each year. A unique feature of this yeast, which makes it particularly virulent, is its polysaccharide capsule; this structure impedes host efforts to combat infection. Capsule size and structure respond to environmental conditions, such as those encountered in an infected host. We have combined computational and experimental tools to elucidate capsule regulation, which we show primarily occurs at the transcriptional level. We also demonstrate that loss of a novel transcription factor alters virulence factor expression and host cell interactions, changing the lethal condition from meningitis to pneumonia with an exacerbated host response. We further demonstrate the relevant targets of regulation and kinetically map key regulatory and host interactions. Our work elucidates mechanisms of capsule regulation, provides methods and resources to the research community, and demonstrates an altered pathogenic outcome that resembles some human conditions.


Subject(s)
Cryptococcosis/microbiology , Cryptococcus neoformans/pathogenicity , Fungal Proteins/metabolism , Fungal Gene Expression Regulation , Transcription Factors/metabolism , Animals , Computational Biology , Cryptococcus neoformans/genetics , Cryptococcus neoformans/metabolism , Female , Fungal Proteins/genetics , Gene Regulatory Networks , Humans , Melanins/metabolism , Mice , Transcription Factors/genetics , Virulence
12.
Genome Res ; 25(5): 690-700, 2015 May.
Article in English | MEDLINE | ID: mdl-25644834

ABSTRACT

Key steps in understanding a biological process include identifying genes that are involved and determining how they are regulated. We developed a novel method for identifying transcription factors (TFs) involved in a specific process and used it to map regulation of the key virulence factor of a deadly fungus: its capsule. The map, built from expression profiles of 41 TF mutants, includes 20 TFs not previously known to regulate virulence attributes. It also reveals a hierarchy comprising executive, midlevel, and "foreman" TFs. When grouped by temporal expression pattern, these TFs explain much of the transcriptional dynamics of capsule induction. Phenotypic analysis of TF deletion mutants revealed complex relationships among virulence factors and virulence in mice. These resources and analyses provide the first integrated, systems-level view of capsule regulation and biosynthesis. Our methods dramatically improve the efficiency with which transcriptional networks can be analyzed, making genomic approaches accessible to laboratories focused on specific physiological processes.


Subject(s)
Chromosome Mapping/methods , Gene Regulatory Networks , Virulence Factors/genetics , Animals , Cryptococcus neoformans/genetics , Cryptococcus neoformans/pathogenicity , Female , Fungal Proteins/genetics , Mice , Inbred C57BL Mice , Genetic Models , Transcription Factors/genetics
13.
Eukaryot Cell ; 13(6): 832-42, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24747214

ABSTRACT

Cryptococcus neoformans is an opportunistic yeast responsible for lethal meningoencephalitis in humans. This pathogen elaborates a polysaccharide capsule, which is its major virulence factor. Mannose constitutes over one-half of the capsule mass and is also extensively utilized in cell wall synthesis and in glycosylation of proteins and lipids. The activated mannose donor for most biosynthetic reactions, GDP-mannose, is made in the cytosol, although it is primarily consumed in secretory organelles. This compartmentalization necessitates specific transmembrane transporters to make the donor available for glycan synthesis. We previously identified two cryptococcal GDP-mannose transporters, Gmt1 and Gmt2. Biochemical studies of each protein expressed in Saccharomyces cerevisiae showed that both are functional, with similar kinetics and substrate specificities in vitro. We have now examined these proteins in vivo and demonstrate that cells lacking Gmt1 show significant phenotypic differences from those lacking Gmt2 in terms of growth, colony morphology, protein glycosylation, and capsule phenotypes. Some of these observations may be explained by differential expression of the two genes, but others suggest that the two proteins play overlapping but nonidentical roles in cryptococcal biology. Furthermore, gmt1 gmt2 double mutant cells, which are unexpectedly viable, exhibit severe defects in capsule synthesis and protein glycosylation and are avirulent in mouse models of cryptococcosis.


Subject(s)
Carrier Proteins/metabolism , Cryptococcus neoformans/metabolism , Fungal Proteins/metabolism , Animals , Carrier Proteins/genetics , Cryptococcus neoformans/genetics , Cryptococcus neoformans/growth & development , Cryptococcus neoformans/pathogenicity , Fungal Proteins/genetics , Mice , Virulence/genetics
14.
Genome Res ; 23(8): 1319-28, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23636944

ABSTRACT

A critical step in understanding how a genome functions is determining which transcription factors (TFs) regulate each gene. Accordingly, extensive effort has been devoted to mapping TF networks. In Saccharomyces cerevisiae, protein-DNA interactions have been identified for most TFs by ChIP-chip, and expression profiling has been done on strains deleted for most TFs. These studies revealed that there is little overlap between the genes whose promoters are bound by a TF and those whose expression changes when the TF is deleted, leaving us without a definitive TF network for any eukaryote and without an efficient method for mapping functional TF networks. This paper describes NetProphet, a novel algorithm that improves the efficiency of network mapping from gene expression data. NetProphet exploits a fundamental observation about the nature of TF networks: The response to disrupting or overexpressing a TF is strongest on its direct targets and dissipates rapidly as it propagates through the network. Using S. cerevisiae data, we show that NetProphet can predict thousands of direct, functional regulatory interactions, using only gene expression data. The targets that NetProphet predicts for a TF are at least as likely to have sites matching the TF's binding specificity as the targets implicated by ChIP. Unlike most ChIP targets, the NetProphet targets also show evidence of functional regulation. This suggests a surprising conclusion: The best way to begin mapping direct, functional TF-promoter interactions may not be by measuring binding. We also show that NetProphet yields new insights into the functions of several yeast TFs, including a well-studied TF, Cbf1, and a completely unstudied TF, Eds1.


Subject(s)
Gene Regulatory Networks , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/genetics , Software , Transcription Factors/metabolism , Algorithms , Binding Sites , Chromatin Immunoprecipitation , Gene Expression Profiling , Fungal Gene Expression Regulation , Fungal Genome , Genetic Models , Genetic Promoter Regions , Protein Binding , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/genetics