Results 1 - 20 of 69
1.
Mol Cell ; 78(5): 960-974.e11, 2020 06 04.
Article in English | MEDLINE | ID: mdl-32330456

ABSTRACT

Dynamic cellular processes such as differentiation are driven by changes in the abundances of transcription factors (TFs). However, despite years of studies, our knowledge about the protein copy number of TFs in the nucleus is limited. Here, by determining the absolute abundances of 103 TFs and co-factors during the course of human erythropoiesis, we provide a dynamic and quantitative scale for TFs in the nucleus. Furthermore, we establish the first gene regulatory network of cell fate commitment that integrates temporal protein stoichiometry data with mRNA measurements. The model revealed quantitative imbalances in TFs' cross-antagonistic relationships that underlie lineage determination. Finally, we made the surprising discovery that, in the nucleus, co-repressors are dramatically more abundant than co-activators at the protein level, but not at the RNA level, with profound implications for understanding transcriptional regulation. These analyses provide a unique quantitative framework to understand transcriptional regulation of cell differentiation in a dynamic context.


Subjects
Erythropoiesis/genetics , Gene Regulatory Networks/genetics , Transcription Factors/genetics , Databases, Factual , Gene Expression Regulation/genetics , Hematopoiesis/genetics , Humans , Proteomics/methods , Transcription Factors/analysis , Transcription Factors/metabolism
2.
Genes Dev ; 30(5): 508-21, 2016 Mar 01.
Article in English | MEDLINE | ID: mdl-26944678

ABSTRACT

T-cell acute lymphoblastic leukemia (T-ALL) is a heterogeneous group of hematological tumors composed of distinct subtypes that vary in their genetic abnormalities, gene expression signatures, and prognoses. However, it remains unclear whether T-ALL subtypes differ at the functional level, and, as such, T-ALL treatments are uniformly applied across subtypes, leading to variable responses between patients. Here we reveal the existence of a subtype-specific epigenetic vulnerability in T-ALL by which a particular subgroup of T-ALL characterized by expression of the oncogenic transcription factor TAL1 is uniquely sensitive to variations in the dosage and activity of the histone 3 Lys27 (H3K27) demethylase UTX/KDM6A. Specifically, we identify UTX as a coactivator of TAL1 and show that it acts as a major regulator of the TAL1 leukemic gene expression program. Furthermore, we demonstrate that UTX, previously described as a tumor suppressor in T-ALL, is in fact a pro-oncogenic cofactor essential for leukemia maintenance in TAL1-positive (but not TAL1-negative) T-ALL. Exploiting this subtype-specific epigenetic vulnerability, we propose a novel therapeutic approach based on UTX inhibition through in vivo administration of an H3K27 demethylase inhibitor that efficiently kills TAL1-positive primary human leukemia. These findings provide the first opportunity to develop personalized epigenetic therapy for T-ALL patients.


Subjects
Basic Helix-Loop-Helix Transcription Factors/metabolism , Epigenesis, Genetic , Gene Expression Regulation, Neoplastic/genetics , Genetic Therapy , Histone Demethylases/genetics , Nuclear Proteins/genetics , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/therapy , Proto-Oncogene Proteins/metabolism , Cell Line, Tumor , Gene Knockdown Techniques , Histone Demethylases/metabolism , Humans , Nuclear Proteins/metabolism , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/genetics , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/physiopathology , Proto-Oncogene Proteins/genetics , T-Cell Acute Lymphocytic Leukemia Protein 1
3.
Bioinformatics ; 38(6): 1593-1599, 2022 03 04.
Article in English | MEDLINE | ID: mdl-34951624

ABSTRACT

MOTIVATION: Bioinformatic tools capable of annotating, rapidly and reproducibly, large, targeted lipidomic datasets are limited. Specifically, few programs enable high-throughput peak assessment of liquid chromatography-electrospray ionization tandem mass spectrometry data acquired in either selected or multiple reaction monitoring modes. RESULTS: We present here Bayesian Annotations for Targeted Lipidomics, a Gaussian naïve Bayes classifier for targeted lipidomics that annotates peak identities according to eight features related to retention time, intensity, and peak shape. Lipid identification is achieved by modeling distributions of these eight input features across biological conditions and maximizing the joint posterior probabilities of all peak identities at a given transition. When applied to sphingolipid and glycerophosphocholine selected reaction monitoring datasets, we demonstrate over 95% of all peaks are rapidly and correctly identified. AVAILABILITY AND IMPLEMENTATION: BATL software is freely accessible online at https://complimet.ca/batl/ and is compatible with Safari, Firefox, Chrome and Edge. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Lipidomics , Software , Bayes Theorem , Mass Spectrometry , Chromatography, Liquid/methods
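
To illustrate the kind of classifier named in the abstract of entry 3, here is a minimal Python sketch of Gaussian naive Bayes. It is not the BATL implementation: it uses two made-up features rather than eight, labels each peak independently instead of maximizing the joint posterior over all peaks at a transition, and every name in it (fit_gaussian_nb, predict, lipid_A, lipid_B) is hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels, var_floor=1e-6):
    """Estimate a log prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    model = {}
    n_total = len(samples)
    for y, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Floor the variance so a constant feature cannot cause division by zero.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, var_floor)
            for col, m in zip(columns, means)
        ]
        model[y] = (math.log(n / n_total), means, variances)
    return model


def log_posteriors(model, x):
    """Unnormalized log posterior of each class for one feature vector."""
    scores = {}
    for y, (log_prior, means, variances) in model.items():
        log_lik = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
        scores[y] = log_prior + log_lik
    return scores


def predict(model, x):
    """Return the class with the highest log posterior."""
    scores = log_posteriors(model, x)
    return max(scores, key=scores.get)


# Hypothetical training peaks: (retention time in minutes, log10 intensity).
train_x = [(5.0, 4.1), (5.2, 4.0), (5.1, 4.3), (7.9, 3.0), (8.1, 3.2), (8.0, 2.9)]
train_y = ["lipid_A", "lipid_A", "lipid_A", "lipid_B", "lipid_B", "lipid_B"]
model = fit_gaussian_nb(train_x, train_y)
```

Calling predict(model, (5.05, 4.2)) returns the label whose prior and class-conditional Gaussians give the highest log posterior for that peak.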
4.
J Theor Biol ; 575: 111632, 2023 11 07.
Article in English | MEDLINE | ID: mdl-37804942

ABSTRACT

Elementary flux modes (EFMs) are minimal, steady state pathways characterizing a flux network. Fundamentally, all steady state fluxes in a network are decomposable into a linear combination of EFMs. While there is typically no unique set of EFM weights that reconstructs these fluxes, several optimization-based methods have been proposed to constrain the solution space by enforcing some notion of parsimony. However, it has long been recognized that optimization-based approaches may fail to uniquely identify EFM weights and return different feasible solutions across objective functions and solvers. Here we show that, for flux networks only involving single molecule transformations, these problems can be avoided by imposing a Markovian constraint on EFM weights. Our Markovian constraint guarantees a unique solution to the flux decomposition problem, and that solution is arguably more biophysically plausible than other solutions. We describe an algorithm for computing Markovian EFM weights via steady state analysis of a certain discrete-time Markov chain, based on the flux network, which we call the cycle-history Markov chain. We demonstrate our method with a differential analysis of EFM activity in a lipid metabolic network comparing healthy and Alzheimer's disease patients. Our method is the first to uniquely decompose steady state fluxes into EFM weights for any unimolecular metabolic network.


Subjects
Escherichia coli , Models, Biological , Humans , Escherichia coli/metabolism , Metabolic Networks and Pathways , Algorithms , Metabolic Flux Analysis/methods
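
The non-uniqueness problem described in the abstract of entry 4 can be seen in a toy case. The Python sketch below uses a hypothetical four-reaction network that is not taken from the paper, and it only demonstrates that several EFM weightings reproduce the same steady-state fluxes. It does not implement the Markovian constraint or the cycle-history Markov chain.

```python
# Toy unimolecular network (hypothetical): reactions r1 and r2 both convert
# A to B, and reactions r3 and r4 both convert B back to A. Each EFM is a
# simple cycle made of one forward and one backward reaction.
EFMS = {
    "r1+r3": ("r1", "r3"),
    "r1+r4": ("r1", "r4"),
    "r2+r3": ("r2", "r3"),
    "r2+r4": ("r2", "r4"),
}


def fluxes_from_weights(weights):
    """Add each EFM weight to every reaction that the EFM passes through."""
    flux = {"r1": 0.0, "r2": 0.0, "r3": 0.0, "r4": 0.0}
    for name, w in weights.items():
        for reaction in EFMS[name]:
            flux[reaction] += w
    return flux


# Three different weightings of the same four EFMs.
sparse_a = {"r1+r3": 1.0, "r1+r4": 0.0, "r2+r3": 0.0, "r2+r4": 1.0}
sparse_b = {"r1+r3": 0.0, "r1+r4": 1.0, "r2+r3": 1.0, "r2+r4": 0.0}
even = {name: 0.5 for name in EFMS}

for weights in (sparse_a, sparse_b, even):
    print(fluxes_from_weights(weights))
```

All three weightings yield a flux of 1.0 through every reaction, so the fluxes alone cannot distinguish them; an extra criterion is needed to select one.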
5.
EMBO Rep ; 21(12): e49499, 2020 12 03.
Article in English | MEDLINE | ID: mdl-33047485

ABSTRACT

The function and maintenance of muscle stem cells (MuSCs) are tightly regulated by signals originating from their niche environment. Skeletal myofibers are a principal component of the MuSC niche and are in direct contact with the muscle stem cells. Here, we show that Myf6 establishes a ligand/receptor interaction between muscle stem cells and their associated muscle fibers. Our data show that Myf6 transcriptionally regulates a broad spectrum of myokines and muscle-secreted proteins in skeletal myofibers, including EGF. EGFR signaling blocks p38 MAP kinase-induced differentiation of muscle stem cells. Homozygous deletion of Myf6 causes a significant reduction in the ability of muscle to produce EGF, leading to a deregulation in EGFR signaling. Consequently, although Myf6-knockout mice are born with a normal muscle stem cell compartment, they undergo a progressive reduction in their stem cell pool during postnatal life due to spontaneous exit from quiescence. Taken together, our data uncover a novel role for Myf6 in promoting the expression of key myokines, such as EGF, in the muscle fiber which prevents muscle stem cell exhaustion by blocking their premature differentiation.


Subjects
Myogenic Regulatory Factors , Stem Cells , Animals , Cell Differentiation/genetics , Homozygote , Mice , Muscle, Skeletal , Myogenic Regulatory Factors/genetics , Sequence Deletion
6.
BMC Bioinformatics ; 22(1): 69, 2021 Feb 15.
Article in English | MEDLINE | ID: mdl-33588754

ABSTRACT

BACKGROUND: Chromatin immunoprecipitation followed by high throughput sequencing (ChIP-seq), initially introduced more than a decade ago, is widely used by the scientific community to detect protein/DNA binding and histone modifications across the genome. Every experiment is prone to noise and bias, and ChIP-seq experiments are no exception. To alleviate bias, the incorporation of control datasets in ChIP-seq analysis is an essential step. The controls are used to account for the background signal, while the remainder of the ChIP-seq signal captures true binding or histone modification. However, a recurrent issue is different types of bias in different ChIP-seq experiments. Depending on which controls are used, different aspects of ChIP-seq bias are better or worse accounted for, and peak calling can produce different results for the same ChIP-seq experiment. Consequently, generating "smart" controls, which model the non-signal effect for a specific ChIP-seq experiment, could enhance contrast and increase the reliability and reproducibility of the results. RESULT: We propose a peak calling algorithm, Weighted Analysis of ChIP-seq (WACS), which is an extension of the well-known peak caller MACS2. There are two main steps in WACS: First, weights are estimated for each control using non-negative least squares regression. The goal is to customize controls to model the noise distribution for each ChIP-seq experiment. This is then followed by peak calling. We demonstrate that WACS significantly outperforms MACS2 and AIControl, another recent algorithm for generating smart controls, in the detection of enriched regions along the genome, in terms of motif enrichment and reproducibility analyses. CONCLUSIONS: This ultimately improves our understanding of ChIP-seq controls and their biases, and shows that WACS results in a better approximation of the noise distribution in controls.


Subjects
Chromatin Immunoprecipitation Sequencing , High-Throughput Nucleotide Sequencing , Algorithms , Chromatin Immunoprecipitation , Reproducibility of Results , Sequence Analysis, DNA
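
The first WACS step in entry 6, estimating a non-negative weight per control, can be sketched in Python as a small non-negative least squares problem. This is an illustration under assumptions, not the WACS code: the use of binned read counts as the regression variables, the projected gradient solver, and all names (nnls_projected_gradient, control_1, target) are hypothetical choices made here.

```python
def nnls_projected_gradient(columns, target, iterations=5000):
    """Minimise 0.5 * ||sum_j w_j * columns[j] - target||^2 with w_j >= 0.

    Plain projected gradient descent, chosen for brevity.
    """
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A
    # from above, so its reciprocal is a safe step size.
    frob_sq = sum(v * v for col in columns for v in col)
    step = 1.0 / frob_sq
    weights = [0.0] * len(columns)
    for _ in range(iterations):
        fitted = [
            sum(w * col[i] for w, col in zip(weights, columns))
            for i in range(len(target))
        ]
        residual = [f - t for f, t in zip(fitted, target)]
        gradient = [sum(r * v for r, v in zip(residual, col)) for col in columns]
        # Gradient step, then projection onto the non-negative orthant.
        weights = [max(0.0, w - step * g) for w, g in zip(weights, gradient)]
    return weights


# Hypothetical read counts per genomic bin for three control datasets.
control_1 = [1.0, 2.0, 3.0, 4.0, 1.0, 0.0]
control_2 = [4.0, 3.0, 2.0, 1.0, 0.0, 2.0]
control_3 = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
# Target built as 2.0 * control_1 + 0.5 * control_3, so the best
# non-negative weights are known in advance for this toy case.
target = [2.0 * a + 0.5 * c for a, c in zip(control_1, control_3)]
weights = nnls_projected_gradient([control_1, control_2, control_3], target)
```

The recovered weights can then be used to combine the controls into one customized background track before peak calling.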
7.
J Biol Chem ; 294(52): 20097-20108, 2019 12 27.
Article in English | MEDLINE | ID: mdl-31753917

ABSTRACT

Skeletal muscle is a heterogeneous tissue. Individual myofibers that make up muscle tissue exhibit variation in their metabolic and contractile properties. Although biochemical and histological assays are available to study myofiber heterogeneity, efficient methods to analyze the whole transcriptome of individual myofibers are lacking. Here, we report on a single-myofiber RNA-sequencing (smfRNA-Seq) approach to analyze the whole transcriptome of individual myofibers by combining single-fiber isolation with Switching Mechanism at 5' end of RNA Template (SMART) technology. Using smfRNA-Seq, we first determined the genes that are expressed in the whole muscle, including in nonmyogenic cells. We also analyzed the differences in the transcriptome of myofibers from young and old mice to validate the effectiveness of this new method. Our results suggest that aging leads to significant changes in the expression of metabolic genes, such as Nos1, and structural genes, such as Myl1, in myofibers. We conclude that smfRNA-Seq is a powerful tool to study developmental, disease-related, and age-related changes in the gene expression profile of skeletal muscle.


Subjects
Gene Expression Profiling/methods , RNA, Messenger/metabolism , Aging , Animals , Cell Separation/methods , Gene Library , Genome , Mice , Muscle Fibers, Skeletal/cytology , Muscle Fibers, Skeletal/metabolism , Muscle, Skeletal/metabolism , RNA, Messenger/chemistry , Sequence Analysis, RNA/methods , Single-Cell Analysis , Transcriptome
8.
Bioinformatics ; 35(19): 3592-3598, 2019 10 01.
Article in English | MEDLINE | ID: mdl-30824903

ABSTRACT

MOTIVATION: Chromatin Immunoprecipitation (ChIP)-seq is used extensively to identify sites of transcription factor binding or regions of epigenetic modifications to the genome. A key step in ChIP-seq analysis is peak calling, where genomic regions enriched for ChIP versus control reads are identified. Many programs have been designed to solve this task, but nearly all fall into the statistical trap of using the data twice: once to determine candidate enriched regions, and again to assess enrichment by classical statistical hypothesis testing. This double use of the data invalidates the statistical significance assigned to enriched regions, and thus the true significance or reliability of peak calls remains unknown. RESULTS: Using simulated and real ChIP-seq data, we show that three well-known peak callers, MACS, SICER and diffReps, output biased P-values and false discovery rate estimates that can be many orders of magnitude too optimistic. We propose a wrapper algorithm, RECAP, that uses resampling of ChIP-seq and control data to estimate a monotone transform correcting for biases built into peak calling algorithms. When applied to null hypothesis data, where there is no enrichment between ChIP-seq and control, P-values recalibrated by RECAP are approximately uniformly distributed. On data where there is genuine enrichment, RECAP P-values give a better estimate of the true statistical significance of candidate peaks and better false discovery rate estimates, which correlate better with empirical reproducibility. RECAP is a powerful new tool for assessing the true statistical significance of ChIP-seq peak calls. AVAILABILITY AND IMPLEMENTATION: The RECAP software is available through www.perkinslab.ca or on github at https://github.com/theodorejperkins/RECAP. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Chromatin Immunoprecipitation Sequencing , Chromatin , Algorithms , Binding Sites , High-Throughput Nucleotide Sequencing , Reproducibility of Results , Sequence Analysis, DNA
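
The recalibration idea in entry 8 can be sketched in Python with an empirical-CDF transform, which is one simple monotone mapping. It is an assumption of this sketch that such a transform is a reasonable stand-in; the RECAP software may estimate its transform differently, and the function name recalibrate and all p-values below are made up.

```python
from bisect import bisect_right


def recalibrate(p_values, null_p_values):
    """Map each reported p-value to its empirical rank among null p-values.

    The +1 terms keep every recalibrated value strictly positive.
    """
    null_sorted = sorted(null_p_values)
    n = len(null_sorted)
    return [(1 + bisect_right(null_sorted, p)) / (1 + n) for p in p_values]


# Hypothetical p-values that a peak caller reported on resampled data in
# which ChIP and control reads are exchangeable (no true enrichment).
null_p = [1e-8, 1e-6, 1e-5, 1e-4, 1e-3, 1e-3, 1e-2, 1e-2, 0.1]
# Hypothetical p-values reported for candidate peaks in the real data.
reported = [1e-9, 1e-5, 1e-2]
recalibrated = recalibrate(reported, null_p)
print(recalibrated)
```

In this toy case a reported value of 1e-5 maps to 0.4, because a third of the null p-values are already that small, which shows how optimistic raw values can be pulled back toward a uniform null.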
9.
Nucleic Acids Res ; 46(14): 7221-7235, 2018 08 21.
Article in English | MEDLINE | ID: mdl-30016497

ABSTRACT

Muscle-specific transcription factor MyoD orchestrates the myogenic gene expression program by binding to short DNA motifs called E-boxes within myogenic cis-regulatory elements (CREs). Genome-wide analyses of MyoD cistrome by chromatin immunoprecipitation sequencing show that MyoD-bound CREs contain multiple E-boxes of various sequences. However, how E-box numbers, sequences and their spatial arrangement within CREs collectively regulate the binding affinity and transcriptional activity of MyoD remains largely unknown. Here, by an integrative analysis of MyoD cistrome combined with genome-wide analysis of key regulatory histones and gene expression data we show that the affinity landscape of MyoD is driven by multiple E-boxes, and that the overall binding affinity, and the associated nucleosome positioning and epigenetic features of the CREs, crucially depend on the variant sequences and positioning of the E-boxes within the CREs. By comparative genomic analysis of single nucleotide polymorphisms (SNPs) across publicly available data from 17 strains of laboratory mice, we show that variant sequences within the MyoD-bound motifs, but not their genome-wide counterparts, are under selection. Lastly, we show that the quantitative regulatory effect of MyoD binding on the nearby genes can, in part, be predicted by the motif composition of the CREs to which it binds. Taken together, our data suggest that motif numbers, sequences and their spatial arrangement within the myogenic CREs are important determinants of the cis-regulatory code of myogenic CREs.


Subjects
E-Box Elements/genetics , Muscle Development/genetics , MyoD Protein/genetics , MyoD Protein/metabolism , Transcription, Genetic/genetics , Transcriptional Activation/genetics , Animals , Base Sequence/genetics , Chromatin Immunoprecipitation , DNA-Binding Proteins/genetics , Gene Expression/genetics , Gene Expression Regulation , Genome-Wide Association Study , Mice , Muscle Development/physiology , Nucleotide Motifs/genetics , Polymorphism, Single Nucleotide/genetics , Promoter Regions, Genetic/genetics
10.
BMC Genomics ; 20(1): 941, 2019 Dec 07.
Article in English | MEDLINE | ID: mdl-31810449

ABSTRACT

BACKGROUND: Phenotypic variability of human populations is partly the result of gene polymorphism and differential gene expression. As such, understanding the molecular basis for diversity requires identifying genes with both high and low population expression variance and identifying the mechanisms underlying their expression control. Key issues remain unanswered with respect to expression variability in human populations. The role of gene methylation as well as the contribution that age, sex and tissue-specific factors have on expression variability are not well understood. RESULTS: Here we used a novel method that accounts for sampling error to classify human genes based on their expression variability in normal human breast and brain tissues. We find that high expression variability is almost exclusively unimodal, indicating that variance is not the result of segregation into distinct expression states. Genes with high expression variability differ markedly between tissues and we find that genes with high population expression variability are likely to have age-, but not sex-dependent expression. Lastly, we find that methylation likely has a key role in controlling expression variability insofar as genes with low expression variability are likely to be non-methylated. CONCLUSIONS: We conclude that gene expression variability in the human population is likely to be important in tissue development and identity, methylation, and in natural biological aging. The expression variability of a gene is an important functional characteristic of the gene itself and the classification of a gene as one with Hyper-Variability or Hypo-Variability in a human population or in a specific tissue should be useful in the identification of important genes that functionally regulate development or disease.


Subjects
Aging/genetics , Breast/chemistry , DNA Methylation , Gene Expression Profiling/methods , Gene Regulatory Networks , Age Factors , Brain Chemistry , Cadaver , CpG Islands , Epigenesis, Genetic , Female , Gene Expression Regulation , Humans , Male , Organ Specificity , Phenotype
11.
Bioinformatics ; 32(17): i790-i797, 2016 09 01.
Article in English | MEDLINE | ID: mdl-27587702

ABSTRACT

MOTIVATION: In competitive endogenous RNA (ceRNA) networks, different mRNAs targeted by the same miRNA can 'cross-talk' by absorbing miRNAs and relieving repression on the other mRNAs. This creates correlations in mRNA expression even without direct interaction. Most previous theoretical studies of cross-talk have focused on correlations in stochastic fluctuations of mRNAs around their steady state values. However, the experimentally known examples of cross-talk do not involve single-cell fluctuations, but rather bulk tissue-level changes between conditions, such as due to differentiation or disease. In our study, we quantify for the first time both fluctuational and cross-conditional cross-talk in chemical kinetic models of miRNA-mRNA interaction networks. We study the parameter regions under which these different types of cross-talk arise, and how they are affected by network structure. RESULTS: We find that while a network may support both fluctuational and cross-conditional cross-talk, the parameter regimes under which each type of cross-talk tends to emerge are rather different. Consistent with previous studies, fluctuational cross-talk occurs when miRNA and mRNA expression levels are 'balanced', whereas cross-conditional cross-talk tends to emerge when average miRNA levels are high and average mRNA levels are low. Conversely, cross-conditional miRNA cross-talk, a little-discussed phenomenon, is greatest when miRNA levels are low and mRNA levels are high. We show that the parameter ranges where cross-talk is maximized can, to some degree, be predicted based on network structure. Indeed, we find that the dominant effect of network structure on correlations happens through the effect of network structure on the overall balance between miRNA and mRNA expression. However, it is not the only effect, as we find that the density of connections between miRNAs and mRNAs in larger networks increases cross-talk without altering the expression balance. CONCLUSION: Our results deepen the theoretical understanding of cross-talk in ceRNA networks, and have implications for the experimental identification of ceRNA cross-talk phenomena. AVAILABILITY AND IMPLEMENTATION: Simulation software available upon request. CONTACT: tperkins@ohri.ca.


Subjects
Gene Regulatory Networks , MicroRNAs , RNA, Messenger , Receptor Cross-Talk
12.
EMBO Rep ; 16(10): 1334-57, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26265008

ABSTRACT

In embryonic stem cells (ESCs), gene regulatory networks (GRNs) coordinate gene expression to maintain ESC identity; however, the complete repertoire of factors regulating the ESC state is not fully understood. Our previous temporal microarray analysis of ESC commitment identified the E3 ubiquitin ligase protein Makorin-1 (MKRN1) as a potential novel component of the ESC GRN. Here, using multilayered systems-level analyses, we compiled a MKRN1-centered interactome in undifferentiated ESCs at the proteomic and ribonomic level. Proteomic analyses in undifferentiated ESCs revealed that MKRN1 associates with RNA-binding proteins, and ensuing RIP-chip analysis determined that MKRN1 associates with mRNAs encoding functionally related proteins including proteins that function during cellular stress. Subsequent biological validation identified MKRN1 as a novel stress granule-resident protein, although MKRN1 is not required for stress granule formation, or survival of unstressed ESCs. Thus, our unbiased systems-level analyses support a role for the E3 ligase MKRN1 as a ribonucleoprotein within the ESC GRN.


Subjects
Embryonic Stem Cells/physiology , Gene Regulatory Networks/genetics , Nerve Tissue Proteins/genetics , Ribonucleoproteins/genetics , Animals , Cytoplasm/metabolism , Genomics , Mice , Nerve Tissue Proteins/chemistry , Proteomics , RNA/metabolism , RNA-Binding Proteins/metabolism , Ribonucleoproteins/chemistry , Ubiquitin-Protein Ligases/metabolism
13.
Bioinformatics ; 31(16): 2676-82, 2015 Aug 15.
Article in English | MEDLINE | ID: mdl-25847008

ABSTRACT

MOTIVATION: Stem cell differentiation is largely guided by master transcriptional regulators, but it also depends on the expression of other types of genes, such as cell cycle genes, signaling genes, metabolic genes, trafficking genes, etc. Traditional approaches to understanding gene expression patterns across multiple conditions, such as principal components analysis or K-means clustering, can group cell types based on gene expression, but they do so without knowledge of the differentiation hierarchy. Hierarchical clustering can organize cell types into a tree, but in general this tree is different from the differentiation hierarchy itself. METHODS: Given the differentiation hierarchy and gene expression data at each node, we construct a weighted Euclidean distance metric such that the minimum spanning tree with respect to that metric is precisely the given differentiation hierarchy. We provide a set of linear constraints that are provably sufficient for the desired construction and a linear programming approach to identify sparse sets of weights, effectively identifying genes that are most relevant for discriminating different parts of the tree. RESULTS: We apply our method to microarray gene expression data describing 38 cell types in the hematopoiesis hierarchy, constructing a weighted Euclidean metric that uses just 175 genes. However, we find that there are many alternative sets of weights that satisfy the linear constraints. Thus, in the style of random-forest training, we also construct metrics based on random subsets of the genes and compare them to the metric of 175 genes. We then report on the selected genes and their biological functions. Our approach offers a new way to identify genes that may have important roles in stem cell differentiation. CONTACT: tperkins@ohri.ca SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Cell Differentiation/genetics , Genes , Programming, Linear , Stem Cells/cytology , Blood Cells/cytology , Cluster Analysis , Gene Expression Regulation , Humans
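
The goal stated in the METHODS of entry 13, a weighted Euclidean metric whose minimum spanning tree equals a given differentiation hierarchy, can be checked on a toy example. The Python sketch below only verifies a candidate weight vector; it does not reproduce the paper's linear constraints or linear program, and the cell types, expression values, and function names are all hypothetical.

```python
import math


def weighted_distance(a, b, weights):
    """Euclidean distance with one non-negative weight per gene."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def minimum_spanning_tree(expression, weights):
    """Prim's algorithm on the complete graph over cell types."""
    names = sorted(expression)
    in_tree = {names[0]}
    edges = set()
    while len(in_tree) < len(names):
        # Cheapest edge leaving the current tree.
        _, u, v = min(
            (weighted_distance(expression[u], expression[v], weights), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        in_tree.add(v)
        edges.add(frozenset((u, v)))
    return edges


# Hypothetical expression of three genes in four cell types.
expression = {
    "stem": (0.0, 0.0, 0.0),
    "progenitor": (1.0, 0.0, 3.0),
    "myeloid": (2.0, 1.0, 0.0),
    "lymphoid": (2.0, -1.0, 3.0),
}
# Hypothetical hierarchy: stem -> progenitor -> {myeloid, lymphoid}.
hierarchy = {
    frozenset(("stem", "progenitor")),
    frozenset(("progenitor", "myeloid")),
    frozenset(("progenitor", "lymphoid")),
}

uniform_tree = minimum_spanning_tree(expression, (1.0, 1.0, 1.0))
sparse_tree = minimum_spanning_tree(expression, (1.0, 1.0, 0.0))
print(uniform_tree == hierarchy, sparse_tree == hierarchy)
```

With uniform weights the third gene pulls the stem and myeloid types together and the tree differs from the hierarchy; setting that gene's weight to zero recovers the hierarchy, which mirrors the idea of finding a sparse set of discriminating genes.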
14.
Bioinformatics ; 29(4): 444-50, 2013 Feb 15.
Article in English | MEDLINE | ID: mdl-23300135

ABSTRACT

MOTIVATION: Reliable estimation of the mean fragment length for next-generation short-read sequencing data is an important step in next-generation sequencing analysis pipelines, most notably because of its impact on the accuracy of the enriched regions identified by peak-calling algorithms. Although many peak-calling algorithms include a fragment-length estimation subroutine, the problem has not been adequately solved, as demonstrated by the variability of the estimates returned by different algorithms. RESULTS: In this article, we investigate the use of strand cross-correlation to estimate mean fragment length of single-end data and show that traditional estimation approaches have mixed reliability. We observe that the mappability of different parts of the genome can introduce an artificial bias into cross-correlation computations, resulting in incorrect fragment-length estimates. We propose a new approach, called mappability-sensitive cross-correlation (MaSC), which removes this bias and allows for accurate and reliable fragment-length estimation. We analyze the computational complexity of this approach, and evaluate its performance on a test suite of NGS datasets, demonstrating its superiority to traditional cross-correlation analysis. AVAILABILITY: An open-source Perl implementation of our approach is available at http://www.perkinslab.ca/Software.html.


Subjects
Algorithms , High-Throughput Nucleotide Sequencing/methods , Chromosome Mapping , Data Interpretation, Statistical , Genomics , Humans , Reproducibility of Results
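
For orientation, the traditional strand cross-correlation that entry 14 takes as its starting point can be sketched in Python as follows. This is the naive estimator, not MaSC: it applies no mappability correction, uses a raw co-occurrence count in place of a normalized correlation, ignores off-by-one conventions for fragment ends, and the read positions and function name are invented for the example.

```python
def naive_fragment_length(forward_starts, reverse_starts, max_shift):
    """Return the shift d that maximises forward/reverse co-occurrence.

    forward_starts: 5' positions of reads mapped to the + strand.
    reverse_starts: 5' positions of reads mapped to the - strand.
    Positions are treated as binary presence, so duplicates collapse.
    """
    forward = set(forward_starts)
    reverse = set(reverse_starts)
    best_shift, best_score = 0, -1
    for shift in range(max_shift + 1):
        # Count + strand positions that have a - strand read `shift` bases downstream.
        score = sum(1 for pos in forward if pos + shift in reverse)
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


# Hypothetical single-end read positions on one chromosome.
forward = [100, 480, 1200, 2050, 3300, 4100]
reverse = [250, 630, 1350, 2200, 3451, 500]
estimate = naive_fragment_length(forward, reverse, max_shift=300)
print(estimate)
```

A mappability-aware variant would restrict the count at each shift to positions where both strands are uniquely mappable, which is the bias the abstract says the naive computation suffers from.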
15.
Sci Rep ; 14(1): 1550, 2024 01 18.
Article in English | MEDLINE | ID: mdl-38233494

ABSTRACT

One of the fundamental computational problems in cancer genomics is the identification of single nucleotide variants (SNVs) from DNA sequencing data. Many statistical models and software implementations for SNV calling have been developed in the literature, yet they still disagree widely on real datasets. Based on an empirical Bayesian approach, we introduce a local false discovery rate (LFDR) estimator for germline SNV calling. Our approach learns model parameters without prior information, and simultaneously accounts for information across all sites in the genomic regions of interest. We also propose another LFDR-based algorithm that reliably prioritizes a given list of mutations called by any other variant-calling algorithm. We use a suite of gold-standard cell line data to compare our LFDR approach against a collection of widely used, state-of-the-art programs. We find that our LFDR approach approximately matches or exceeds the performance of all of these programs, despite some very large differences among them. Furthermore, when prioritizing other algorithms' calls by our LFDR score, we find that by manipulating the type I-type II tradeoff we can select subsets of variant calls with minimal loss of sensitivity but dramatic increases in precision.


Subjects
Nucleotides , Polymorphism, Single Nucleotide , Bayes Theorem , Nucleotides/genetics , Software , Algorithms , High-Throughput Nucleotide Sequencing
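
The local false discovery rate used in entry 15 has a standard two-group definition: the posterior probability that a site is null given its test statistic. The Python sketch below computes it with fixed, assumed Gaussian densities and an assumed null proportion of 0.9. The paper instead learns its model parameters by empirical Bayes, so the numbers, the densities, and the site names here are purely illustrative.

```python
import math


def normal_pdf(z, mean, sd):
    """Density of a normal distribution at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def local_fdr(z, pi0, null=(0.0, 1.0), alt=(3.0, 1.0)):
    """Two-group LFDR: pi0 * f0(z) / (pi0 * f0(z) + (1 - pi0) * f1(z))."""
    f0 = pi0 * normal_pdf(z, *null)
    f1 = (1.0 - pi0) * normal_pdf(z, *alt)
    return f0 / (f0 + f1)


# Hypothetical test statistics for candidate variant sites.
candidates = {"site_1": 0.2, "site_2": 4.1, "site_3": 2.0}
# Prioritise calls: the lowest LFDR (most likely a true variant) comes first.
ranked = sorted(candidates, key=lambda s: local_fdr(candidates[s], pi0=0.9))
print(ranked)
```

Choosing an LFDR cutoff along this ranking is one way to trade sensitivity against precision when filtering another caller's output.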
16.
Res Pract Thromb Haemost ; 8(3): 102403, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38706783

ABSTRACT

Background: Anticoagulation therapy is the mainstay of therapy for patients with venous thromboembolism (VTE). However, continuing or stopping anticoagulants after the first 3 to 6 months is a difficult decision that requires ascertainment of the risk of bleeding and recurrent VTE. Despite the development of several statistical models to predict bleeding, the benefit of machine learning (ML) models has not been investigated in depth. Objectives: To assess the benefits of ML algorithms in bleeding risk evaluation in VTE patients and gain insight into their baseline information. Methods: The baseline clinical, demographic, and genotype information was collected for 2542 patients with VTE who were on extended anticoagulation therapy. Six unsupervised dimensionality reduction and clustering ML algorithms were used to visualize and cluster the data for patients with major bleeding (118 patients) and nonbleeders. Eight supervised ML algorithms were trained and compared with the previously derived clinical models using a 5-fold nested cross-validation scheme. Results: The baseline dataset for bleeders and nonbleeders showed a high degree of similarity. Two novel clusters were discovered within the dataset for bleeders based on the presence of isolated pulmonary embolism or isolated deep vein thrombosis, though the difference in bleeding risks was not statistically significant (P = .32). The supervised analysis showed that the ML and clinical models have similar discrimination (c-statistics, ∼62%) and calibration performance (Brier score, ∼0.045). Conclusion: The clinical variables recorded at baseline are not distinctive enough to improve bleeding prediction beyond the performance of the existing models, and other strategies or data modalities should be considered.
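
The two performance measures quoted in the Results of entry 16 can be computed as in the Python sketch below. It is an illustration only: it does not reproduce the study's models or its nested cross-validation, and it assumes, for the sake of a baseline, that the evaluation data share the cohort's event rate of 118 bleeders among 2542 patients.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


def c_statistic(probabilities, outcomes):
    """Fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    events = [p for p, y in zip(probabilities, outcomes) if y == 1]
    non_events = [p for p, y in zip(probabilities, outcomes) if y == 0]
    concordant = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Baseline: predict the cohort event rate for every patient.
n_patients, n_bleeders = 2542, 118
base_rate = n_bleeders / n_patients
outcomes = [1] * n_bleeders + [0] * (n_patients - n_bleeders)
constant_risk = [base_rate] * n_patients

baseline_brier = brier_score(constant_risk, outcomes)
baseline_c = c_statistic(constant_risk, outcomes)
print(round(baseline_brier, 4), baseline_c)
```

Under that assumption, a constant prediction already gives a Brier score of about 0.044 with a c-statistic of 0.5, which offers a reference point for reading the reported values.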

17.
J Thromb Haemost ; 22(7): 1997-2008, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38642704

ABSTRACT

BACKGROUND: Thus far, all the clinical models developed to predict major bleeding in patients on extended anticoagulation therapy use the baseline predictors to stratify patients into different risk groups. Therefore, these models do not account for the clinical changes and events that occur after the baseline visit, which can modify risk of bleeding. However, it is difficult to develop predictive models from the routine follow-up clinical interviews, which are irregular sequences of multivariate time series data. OBJECTIVES: To demonstrate that deep learning can incorporate patient time series follow-up data to improve prediction of major bleeding. METHODS: We used the baseline and follow-up data that were collected over 8 years in a longitudinal cohort study of 2542 patients, of whom 118 had major bleeding. Four supervised neural network-based machine-learning models were trained on the baseline, follow-up, or both datasets using 70% of the data. The performance of these models was evaluated, along with modified versions of 6 previously developed clinical models, on the remaining 30% of the data. RESULTS: An ensemble of feedforward and recurrent neural networks that used the baseline and follow-up data was the best-performing model, achieving a sensitivity and a specificity of 61% and 82%, respectively, in identifying major bleeding, and it outperformed the previously developed clinical models in terms of area under the receiver operating characteristic curve (82%) and area under the precision-recall curve (14%). CONCLUSION: Time series follow-up data can improve major bleeding prediction in patients on extended anticoagulation therapy.


Subjects
Anticoagulants , Deep Learning , Hemorrhage , Humans , Anticoagulants/adverse effects , Anticoagulants/administration & dosage , Hemorrhage/chemically induced , Male , Female , Aged , Risk Assessment , Time Factors , Risk Factors , Middle Aged , Longitudinal Studies , Predictive Value of Tests , Drug Administration Schedule , Treatment Outcome , Neural Networks, Computer , Aged, 80 and over
18.
Methods Mol Biol ; 2587: 537-553, 2023.
Article in English | MEDLINE | ID: mdl-36401049

ABSTRACT

High-content screening is commonly performed on 2D cultured cells, which is high throughput but has low biological relevance. In contrast, single myofiber culture assay preserves the satellite cell niche between the basal lamina and sarcolemma and consequently has high biological relevance but is low throughput. We describe here a high-content screening method that utilizes single myofiber culture that addresses the caveats of both techniques. Our method utilizes the transgenic reporter allele Myf5-Cre:R26R-eYFP to differentiate stem and committed cells within a dividing couplet that can be quantified by high-content throughput immunodetection and bioinformatic analysis.


Subjects
Satellite Cells, Skeletal Muscle , Muscles , Cells, Cultured , Cell Division
19.
Nat Commun ; 14(1): 535, 2023 02 01.
Article in English | MEDLINE | ID: mdl-36726011

ABSTRACT

Adult stem cells are indispensable for tissue regeneration, but their function declines with age. The niche environment in which the stem cells reside plays a critical role in their function. However, quantification of the niche effect on stem cell function is lacking. Using muscle stem cells (MuSC) as a model, we show that aging leads to a significant transcriptomic shift in their subpopulations accompanied by locus-specific gain and loss of chromatin accessibility and DNA methylation. By combining in vivo MuSC transplantation and computational methods, we show that the expression of approximately half of all age-altered genes in MuSCs from aged male mice can be restored by exposure to a young niche environment. While there is a correlation between gene reversibility and epigenetic alterations, restoration of gene expression occurs primarily at the level of transcription. The stem cell niche environment therefore represents an important therapeutic target to enhance tissue regeneration in aging.


Subjects
Adult Stem Cells , Muscle, Skeletal , Male , Mice , Animals , Muscle, Skeletal/metabolism , Muscle Fibers, Skeletal , Stem Cells/metabolism , Aging/physiology
20.
PLoS Comput Biol ; 7(5): e1002048, 2011 May.
Article in English | MEDLINE | ID: mdl-21589890

ABSTRACT

Inferring regulatory and metabolic network models from quantitative genetic interaction data remains a major challenge in systems biology. Here, we present a novel quantitative model for interpreting epistasis within pathways responding to an external signal. The model provides the basis of an experimental method to determine the architecture of such pathways, and establishes a new set of rules to infer the order of genes within them. The method also allows the extraction of quantitative parameters enabling a new level of information to be added to genetic network models. It is applicable to any system where the impact of combinatorial loss-of-function mutations can be quantified with sufficient accuracy. We test the method by conducting a systematic analysis of a thoroughly characterized eukaryotic gene network, the galactose utilization pathway in Saccharomyces cerevisiae. For this purpose, we quantify the effects of single and double gene deletions on two phenotypic traits, fitness and reporter gene expression. We show that applying our method to fitness traits reveals the order of metabolic enzymes and the effects of accumulating metabolic intermediates. Conversely, the analysis of expression traits reveals the order of transcriptional regulatory genes, secondary regulatory signals and their relative strength. Strikingly, when the analyses of the two traits are combined, the method correctly infers ~80% of the known relationships without any false positives.
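The abstract does not spell out its quantitative model, so the sketch below is not the authors' method. It only illustrates how epistasis is commonly scored from single and double deletions: the double-mutant fitness is compared with the multiplicative expectation from the two single mutants. The function names, the tolerance, and the fitness values are all hypothetical, with fitness expressed relative to a wild type of 1.0.

```python
def epistasis(w_a, w_b, w_ab):
    """Deviation of double-mutant fitness from the multiplicative expectation."""
    return w_ab - w_a * w_b


def classify(eps, tol=0.05):
    """Label the interaction; tol is an arbitrary, assumed noise threshold."""
    if eps > tol:
        return "positive (alleviating)"
    if eps < -tol:
        return "negative (aggravating)"
    return "none detected"


# Hypothetical relative fitness values, not measurements from the study.
w_a, w_b, w_ab = 0.8, 0.5, 0.5
eps = epistasis(w_a, w_b, w_ab)
print("expected double-mutant fitness:", w_a * w_b)
print("epistasis:", round(eps, 3), "->", classify(eps))
```

In this toy case the double mutant is as fit as the sicker single mutant, so the score is positive. Interactions of this kind are what pathway-ordering analyses typically build on, although the specific inference rules are those of the paper and are not reproduced here.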


Subjects
Computational Biology/methods , Epistasis, Genetic , Gene Regulatory Networks , Models, Genetic , Galactose/genetics , Galactose/metabolism , Gene Deletion , Gene Expression Regulation, Fungal , Genes, Fungal , Metabolic Networks and Pathways , Phenotype , Saccharomyces cerevisiae/genetics , Signal Transduction