Results 1 - 20 of 71
1.
PLoS Genet ; 20(6): e1011310, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38857303

ABSTRACT

Growth deficiency is a characteristic feature of both Kabuki syndrome 1 (KS1) and Kabuki syndrome 2 (KS2), Mendelian disorders of the epigenetic machinery with similar phenotypes but distinct genetic etiologies. We previously described skeletal growth deficiency in a mouse model of KS1 and further established that a Kmt2d-/- chondrocyte model of KS1 exhibits precocious differentiation. Here we characterized growth deficiency in a mouse model of KS2, Kdm6atm1d/+. We show that Kdm6atm1d/+ mice have decreased femur and tibia length compared to controls and exhibit abnormalities in cortical and trabecular bone structure. Kdm6atm1d/+ growth plates are also shorter, due to decreases in hypertrophic chondrocyte size and hypertrophic zone height. Given these disturbances in the growth plate, we generated Kdm6a-/- chondrogenic cell lines. Similar to our prior in vitro model of KS1, we found that Kdm6a-/- cells undergo premature, enhanced differentiation towards chondrocytes compared to Kdm6a+/+ controls. RNA-seq showed that Kdm6a-/- cells have a distinct transcriptomic profile that indicates dysregulation of cartilage development. Finally, we performed RNA-seq simultaneously on Kmt2d-/-, Kdm6a-/-, and control lines at Days 7 and 14 of differentiation. This revealed surprising resemblance in gene expression between Kmt2d-/- and Kdm6a-/- at both time points and indicates that the similarity in phenotype between KS1 and KS2 also exists at the transcriptional level.


Subjects
Multiple Abnormalities, Chondrocytes, Animal Disease Models, Face, Hematologic Diseases, Histone Demethylases, Vestibular Diseases, Animals, Vestibular Diseases/genetics, Vestibular Diseases/pathology, Mice, Face/abnormalities, Histone Demethylases/genetics, Histone Demethylases/metabolism, Hematologic Diseases/genetics, Hematologic Diseases/pathology, Chondrocytes/metabolism, Multiple Abnormalities/genetics, Multiple Abnormalities/pathology, Cell Differentiation/genetics, Chondrogenesis/genetics, DNA-Binding Proteins/genetics, DNA-Binding Proteins/deficiency, Humans, Knockout Mice, Phenotype, Histone-Lysine N-Methyltransferase, Myeloid-Lymphoid Leukemia Protein
2.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38711367

ABSTRACT

Hi-C data are commonly normalized using single sample processing methods, with focus on comparisons between regions within a given contact map. Here, we aim to compare contact maps across different samples. We demonstrate that unwanted variation, of likely technical origin, is present in Hi-C data with replicates from different individuals, and that properties of this unwanted variation change across the contact map. We present band-wise normalization and batch correction, a method for normalization and batch correction of Hi-C data and show that it substantially improves comparisons across samples, including in a quantitative trait loci analysis as well as differential enrichment across cell types.


Subjects
Quantitative Trait Loci, Humans, Computational Biology
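The band-wise idea in the abstract above can be illustrated with a small sketch. In a contact matrix, band d is the d-th off-diagonal, that is, all contacts between loci d bins apart. The snippet below extracts one band from each sample and rescales it so that all samples share the same band mean. This mean rescaling is a deliberately simple stand-in chosen for illustration and is not the published method; the function names and toy matrices are hypothetical.

```python
def band(matrix, d):
    """Return the d-th off-diagonal: contacts between loci d bins apart."""
    n = len(matrix)
    return [matrix[i][i + d] for i in range(n - d)]


def normalize_band_across_samples(samples, d):
    """Rescale each sample's band d so that all samples share one band mean.

    Simple illustrative stand-in, not the normalization used in the paper.
    """
    bands = [band(m, d) for m in samples]
    means = [sum(b) / len(b) for b in bands]
    target = sum(means) / len(means)
    # Leave a band untouched if its mean is zero (nothing to rescale).
    return [
        [v * target / m if m > 0 else v for v in b]
        for b, m in zip(bands, means)
    ]


# Toy symmetric contact matrices for two samples (hypothetical values);
# sample_b has twice the sequencing depth of sample_a.
sample_a = [[10, 4, 1], [4, 10, 6], [1, 6, 10]]
sample_b = [[20, 8, 2], [8, 20, 12], [2, 12, 20]]

normalized = normalize_band_across_samples([sample_a, sample_b], d=1)
print(normalized)
```

Working one band at a time keeps the comparison between contacts at the same genomic distance, which is where the abstract says the properties of the unwanted variation change.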
3.
Genome Res ; 34(5): 696-710, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38702196

ABSTRACT

Many Mendelian developmental disorders caused by coding variants in epigenetic regulators have now been discovered. Epigenetic regulators are broadly expressed, and each of these disorders typically shows phenotypic manifestations from many different organ systems. An open question is whether the chromatin disruption-the root of the pathogenesis-is similar in the different disease-relevant cell types. This is possible in principle, because all these cell types are subject to effects from the same causative gene, which has the same kind of function (e.g., methylates histones) and is disrupted by the same germline variant. We focus on mouse models for Kabuki syndrome types 1 and 2 and find that the chromatin accessibility changes in neurons are mostly distinct from changes in B or T cells. This is not because the neuronal accessibility changes occur at regulatory elements that are only active in neurons. Neurons, but not B or T cells, show preferential chromatin disruption at CpG islands and at regulatory elements linked to aging. A sensitive analysis reveals that regulatory elements disrupted in B/T cells do show chromatin accessibility changes in neurons, but these are very subtle and of uncertain functional significance. Finally, we are able to identify a small set of regulatory elements disrupted in all three cell types. Our findings reveal the cellular-context-specific effect of variants in epigenetic regulators and suggest that blood-derived episignatures, although useful diagnostically, may not be well suited for understanding the mechanistic basis of neurodevelopment in Mendelian disorders of the epigenetic machinery.


Subjects
Multiple Abnormalities, Aging, Chromatin, CpG Islands, Face, Hematologic Diseases, Neurons, Vestibular Diseases, Animals, Hematologic Diseases/genetics, Hematologic Diseases/metabolism, Mice, Face/abnormalities, Chromatin/metabolism, Chromatin/genetics, Vestibular Diseases/genetics, Neurons/metabolism, Aging/genetics, Multiple Abnormalities/genetics, Animal Disease Models, Genetic Epigenesis, T-Lymphocytes/metabolism, B-Lymphocytes/metabolism
4.
bioRxiv ; 2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38559266

ABSTRACT

Tens of thousands of RNA-sequencing experiments comprising hundreds of thousands of individual samples have now been performed. These data represent a broad range of experimental conditions, sequencing technologies, and hypotheses under study. The Recount project has aggregated and uniformly processed hundreds of thousands of publicly available RNA-seq samples. Most of these samples only include RNA expression measurements; genotype data for these same samples would enable a wide range of analyses including variant prioritization, eQTL analysis, and studies of allele-specific expression. Here, we developed a statistical model based on the existing reference and alternative read counts from the RNA-seq experiments available through Recount3 to predict genotypes at autosomal biallelic loci in coding regions. We demonstrate the accuracy of our model using large-scale studies that measured both gene expression and genotype genome-wide. We show that our predictive model is highly accurate with 99.5% overall accuracy, 99.6% major allele accuracy, and 90.4% minor allele accuracy. Our model is robust to tissue and study effects, provided the coverage is high enough. We applied this model to genotype all the samples in Recount3 and provide the largest ready-to-use expression repository containing genotype information. We illustrate that the predicted genotype from RNA-seq data is sufficient to unravel the underlying population structure of samples in Recount3 using Principal Component Analysis.
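To make concrete how reference and alternative read counts can carry genotype information, here is a minimal sketch using a plain binomial likelihood over three genotypes. This is a textbook-style illustration and not the statistical model developed in the paper; the `error_rate` and `min_coverage` values and the function names are assumptions chosen for the example.

```python
import math


def log_binom_pmf(k, n, p):
    """Log-probability of k successes in n Bernoulli(p) trials."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )


def call_genotype(ref_count, alt_count, error_rate=0.01, min_coverage=8):
    """Pick the genotype with the highest binomial likelihood.

    Returns None when coverage is below min_coverage (hypothetical cutoff).
    """
    n = ref_count + alt_count
    if n < min_coverage:
        return None
    # Expected fraction of alternative-allele reads under each genotype.
    alt_fraction = {
        "ref/ref": error_rate,
        "ref/alt": 0.5,
        "alt/alt": 1 - error_rate,
    }
    loglik = {g: log_binom_pmf(alt_count, n, p) for g, p in alt_fraction.items()}
    return max(loglik, key=loglik.get)


print(call_genotype(20, 0))
print(call_genotype(11, 9))
print(call_genotype(0, 15))
print(call_genotype(2, 1))
```

The coverage cutoff mirrors the abstract's caveat that accuracy holds "provided the coverage is high enough": with only a handful of reads, the three likelihoods are too close to separate.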

5.
JCI Insight ; 9(1)2024 Jan 09.
Article in English | MEDLINE | ID: mdl-38015625

ABSTRACT

Weaver syndrome is a Mendelian disorder of the epigenetic machinery (MDEM) caused by germline pathogenic variants in EZH2, which encodes the predominant H3K27 methyltransferase and key enzymatic component of Polycomb repressive complex 2 (PRC2). Weaver syndrome is characterized by striking overgrowth and advanced bone age, intellectual disability, and distinctive facies. We generated a mouse model for the most common Weaver syndrome missense variant, EZH2 p.R684C. Ezh2R684C/R684C mouse embryonic fibroblasts (MEFs) showed global depletion of H3K27me3. Ezh2R684C/+ mice had abnormal bone parameters, indicative of skeletal overgrowth, and Ezh2R684C/+ osteoblasts showed increased osteogenic activity. RNA-Seq comparing osteoblasts differentiated from Ezh2R684C/+ and Ezh2+/+ BM-mesenchymal stem cells (BM-MSCs) indicated collective dysregulation of the BMP pathway and osteoblast differentiation. Inhibition of the opposing H3K27 demethylases KDM6A and KDM6B substantially reversed the excessive osteogenesis in Ezh2R684C/+ cells both at the transcriptional and phenotypic levels. This supports both the ideas that writers and erasers of histone marks exist in a fine balance to maintain epigenome state and that epigenetic modulating agents have therapeutic potential for the treatment of MDEMs.


Subjects
Fibroblasts, Osteogenesis, Animals, Mice, Osteogenesis/physiology, Fibroblasts/metabolism, Polycomb Repressive Complex 2, Animal Disease Models, Histone Demethylases
6.
Genome Biol ; 24(1): 246, 2023 Oct 26.
Article in English | MEDLINE | ID: mdl-37885016

ABSTRACT

BACKGROUND: RNA velocity analysis of single cells offers the potential to predict temporal dynamics from gene expression. In many systems, RNA velocity has been observed to produce a vector field that qualitatively reflects known features of the system. However, the limitations of RNA velocity estimates are still not well understood. RESULTS: We analyze the impact of different steps in the RNA velocity workflow on direction and speed. We consider both high-dimensional velocity estimates and low-dimensional velocity vector fields mapped onto an embedding. We conclude the transition probability method for mapping velocity estimates onto an embedding is effectively interpolating in the embedding space. Our findings reveal a significant dependence of the RNA velocity workflow on smoothing via the k-nearest-neighbors (k-NN) graph of the observed data. This reliance results in considerable estimation errors for both direction and speed in both high- and low-dimensional settings when the k-NN graph fails to accurately represent the true data structure; this is an unknown feature of real data. RNA velocity performs poorly at estimating speed in both low- and high-dimensional spaces, except in very low noise settings. We introduce a novel quality measure that can identify when RNA velocity should not be used. CONCLUSIONS: Our findings emphasize the importance of choices in the RNA velocity workflow and highlight critical limitations of data analysis. We advise against over-interpreting expression dynamics using RNA velocity, particularly in terms of speed. Finally, we emphasize that the use of RNA velocity in assessing the correctness of a low-dimensional embedding is circular.


Subjects
Probability, Cluster Analysis
7.
PLoS Genet ; 19(10): e1010997, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37871105

ABSTRACT

Diet-related metabolic syndrome is the largest contributor to adverse health in the United States. However, the study of gene-environment interactions and their epigenomic and transcriptomic integration is complicated by the lack of environmental and genetic control in humans that is possible in mouse models. Here we exposed three mouse strains, C57BL/6J (BL6), A/J, and NOD/ShiLtJ (NOD), to a high-fat, high-carbohydrate diet, leading to varying degrees of metabolic syndrome. We then performed transcriptomic and genome-wide DNA methylation analyses for each strain and found overlapping but also highly divergent changes in gene expression and methylation upstream of the discordant metabolic phenotypes. Strain-specific pathway analysis of dietary effects revealed a dysregulation of cholesterol biosynthesis common to all three strains but distinct regulatory networks driving this dysregulation. This suggests a strategy for strain-specific targeted pharmacologic intervention of these upstream regulators informed by epigenetic and transcriptional regulation. As a pilot study, we administered the drug GW4064 to target one of these genotype-dependent networks, the farnesoid X receptor pathway, and found that GW4064 exerts strain-specific protection against dietary effects in BL6, as predicted by our transcriptomic analysis. Furthermore, GW4064 treatment induced inflammatory-related gene expression changes in NOD, indicating a strain-specific effect in its associated toxicities as well as its therapeutic efficacy. This pilot study demonstrates the potential efficacy of precision therapeutics for genotype-informed dietary metabolic intervention and a mouse platform for guiding this approach.


Subjects
Metabolic Syndrome, Humans, Mice, Animals, Metabolic Syndrome/drug therapy, Metabolic Syndrome/genetics, Metabolic Syndrome/metabolism, Epigenomics, Pilot Projects, Liver/metabolism, Inbred C57BL Mice, Inbred NOD Mice, High-Fat Diet/adverse effects, Genetic Epigenesis
8.
bioRxiv ; 2023 Aug 03.
Article in English | MEDLINE | ID: mdl-37577516

ABSTRACT

Many Mendelian developmental disorders caused by coding variants in epigenetic regulators have now been discovered. Epigenetic regulators are broadly expressed, and each of these disorders typically exhibits phenotypic manifestations from many different organ systems. An open question is whether the chromatin disruption - the root of the pathogenesis - is similar in the different disease-relevant cell types. This is possible in principle, since all these cell-types are subject to effects from the same causative gene, that has the same kind of function (e.g. methylates histones) and is disrupted by the same germline variant. We focus on mouse models for Kabuki syndrome types 1 and 2, and find that the chromatin accessibility abnormalities in neurons are mostly distinct from those in B or T cells. This is not because the neuronal abnormalities occur at regulatory elements that are only active in neurons. Neurons, but not B or T cells, show preferential chromatin disruption at CpG islands and at regulatory elements linked to aging. A sensitive analysis reveals that the regions disrupted in B/T cells do exhibit chromatin accessibility changes in neurons, but these are very subtle and of uncertain functional significance. Finally, we are able to identify a small set of regulatory elements disrupted in all three cell types. Our findings reveal the cellular-context-specific effect of variants in epigenetic regulators, and suggest that blood-derived "episignatures" may not be well-suited for understanding the mechanistic basis of neurodevelopment in Mendelian disorders of the epigenetic machinery.

9.
bioRxiv ; 2023 Jun 30.
Article in English | MEDLINE | ID: mdl-37425751

ABSTRACT

Weaver syndrome is a Mendelian disorder of the epigenetic machinery (MDEM) caused by germline pathogenic variants in EZH2, which encodes the predominant H3K27 methyltransferase and key enzymatic component of Polycomb repressive complex 2 (PRC2). Weaver syndrome is characterized by striking overgrowth and advanced bone age, intellectual disability, and distinctive facies. We generated a mouse model for the most common Weaver syndrome missense variant, EZH2 p.R684C. Ezh2R684C/R684C mouse embryonic fibroblasts (MEFs) showed global depletion of H3K27me3. Ezh2R684C/+ mice had abnormal bone parameters indicative of skeletal overgrowth, and Ezh2R684C/+ osteoblasts showed increased osteogenic activity. RNA-seq comparing osteoblasts differentiated from Ezh2R684C/+ and Ezh2+/+ bone marrow mesenchymal stem cells (BM-MSCs) indicated collective dysregulation of the BMP pathway and osteoblast differentiation. Inhibition of the opposing H3K27 demethylases Kdm6a/6b substantially reversed the excessive osteogenesis in Ezh2R684C/+ cells both at the transcriptional and phenotypic levels. This supports both the ideas that writers and erasers of histone marks exist in a fine balance to maintain epigenome state, and that epigenetic modulating agents have therapeutic potential for the treatment of MDEMs.

10.
Nat Commun ; 14(1): 4059, 2023 Jul 10.
Article in English | MEDLINE | ID: mdl-37429865

ABSTRACT

Feature selection to identify spatially variable genes or other biologically informative genes is a key step during analyses of spatially-resolved transcriptomics data. Here, we propose nnSVG, a scalable approach to identify spatially variable genes based on nearest-neighbor Gaussian processes. Our method (i) identifies genes that vary in expression continuously across the entire tissue or within a priori defined spatial domains, (ii) uses gene-specific estimates of length scale parameters within the Gaussian process models, and (iii) scales linearly with the number of spatial locations. We demonstrate the performance of our method using experimental data from several technological platforms and simulations. A software implementation is available at https://bioconductor.org/packages/nnSVG .


Subjects
Gene Expression Profiling, Software, Cluster Analysis, Normal Distribution
11.
bioRxiv ; 2023 Apr 28.
Article in English | MEDLINE | ID: mdl-37163127

ABSTRACT

Diet-related metabolic syndrome is the largest contributor to adverse health in the United States. However, the study of gene-environment interactions and their epigenomic and transcriptomic integration is complicated by the lack of environmental and genetic control in humans that is possible in mouse models. Here we exposed three mouse strains, C57BL/6J (BL6), A/J, and NOD/ShiLtJ (NOD), to a high-fat high-carbohydrate diet, leading to varying degrees of metabolic syndrome. We then performed transcriptomic and genomic DNA methylation analyses and found overlapping but also highly divergent changes in gene expression and methylation upstream of the discordant metabolic phenotypes. Strain-specific pathway analysis of dietary effects reveals a dysregulation of cholesterol biosynthesis common to all three strains but distinct regulatory networks driving this dysregulation. This suggests a strategy for strain-specific targeted pharmacologic intervention of these upstream regulators informed by transcriptional regulation. As a pilot study, we administered the drug GW4064 to target one of these genotype-dependent networks, the Farnesoid X receptor pathway, and found that GW4064 exerts genotype-specific protection against dietary effects in BL6, as predicted by our transcriptomic analysis, as well as increased inflammatory-related gene expression changes in NOD. This pilot study demonstrates the potential efficacy of precision therapeutics for genotype-informed dietary metabolic intervention, and a mouse platform for guiding this approach.

12.
Biostatistics ; 2023 May 31.
Article in English | MEDLINE | ID: mdl-37257175

ABSTRACT

In complex tissues containing cells that are difficult to dissociate, single-nucleus RNA-sequencing (snRNA-seq) has become the preferred experimental technology over single-cell RNA-sequencing (scRNA-seq) to measure gene expression. To accurately model these data in downstream analyses, previous work has shown that droplet-based scRNA-seq data are not zero-inflated, but whether droplet-based snRNA-seq data follow the same probability distributions has not been systematically evaluated. Using pseudonegative control data from nuclei in mouse cortex sequenced with the 10x Genomics Chromium system and mouse kidney sequenced with the DropSeq system, we found that droplet-based snRNA-seq data follow a negative binomial distribution, suggesting that parametric statistical models applied to scRNA-seq are transferable to snRNA-seq. Furthermore, we found that the quantification choices in adapting quantification mapping strategies from scRNA-seq to snRNA-seq can play a significant role in downstream analyses and biological interpretation. In particular, reference transcriptomes that do not include intronic regions result in significantly smaller library sizes and incongruous cell type classifications. We also confirmed the presence of a gene length bias in snRNA-seq data, which we show is present in both exonic and intronic reads, and investigate potential causes for the bias.
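The "not zero-inflated" claim in the abstract above rests on a simple comparison: under a negative binomial with mean mu and size (inverse dispersion) r, the expected fraction of zeros is (r / (r + mu))^r, and zero inflation would show up as an observed zero fraction well above that. The sketch below computes both quantities for one gene. It assumes the size parameter is already known, and the toy counts and function names are hypothetical; it is not the evaluation procedure used in the paper.

```python
import math


def nb_zero_probability(mean, size):
    """P(X = 0) for a negative binomial with the given mean and size."""
    if mean < 0 or size <= 0:
        raise ValueError("mean must be >= 0 and size must be > 0")
    return (size / (size + mean)) ** size


def excess_zero_fraction(counts, size):
    """Observed zero fraction minus the fraction expected under the NB."""
    n = len(counts)
    mean = sum(counts) / n
    observed = sum(1 for c in counts if c == 0) / n
    return observed - nb_zero_probability(mean, size)


# Toy UMI counts for one gene across ten nuclei (hypothetical values).
counts = [0, 0, 1, 0, 2, 0, 0, 3, 0, 1]
print(nb_zero_probability(sum(counts) / len(counts), size=1.0))
print(excess_zero_fraction(counts, size=1.0))
```

As the size parameter grows, the expected zero fraction approaches the Poisson value exp(-mu), so the same check covers the Poisson case as a limit.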

13.
Cancer Res ; 83(11): 1905-1916, 2023 Jun 02.
Article in English | MEDLINE | ID: mdl-36989344

ABSTRACT

Pancreatic ductal adenocarcinoma (PDAC) is believed to arise from the accumulation of a series of somatic mutations and is also frequently associated with pancreatic intraepithelial neoplasia (PanIN) lesions. However, there is still debate as to whether the cell type-of-origin of PanINs and PDACs in humans is acinar or ductal. As cell type identity is maintained epigenetically, DNA methylation changes during pancreatic neoplasia can provide a compelling perspective to examine this question. Here, we performed laser-capture microdissection on surgically resected specimens from 18 patients to isolate, with high purity, DNA for whole-genome bisulfite sequencing from four relevant cell types: acini, nonneoplastic ducts, PanIN lesions, and PDAC lesions. Differentially methylated regions (DMR) were identified using two complementary analytical approaches: bsseq, which identifies any DMRs but is particularly useful for large block-like DMRs, and informME, which profiles the potential energy landscape across the genome and is particularly useful for identifying differential methylation entropy. Both global methylation profiles and block DMRs clearly implicated an acinar origin for PanINs. At the gene level, PanIN lesions exhibited an intermediate acinar-ductal phenotype resembling acinar-to-ductal metaplasia. In 97.6% of PanIN-specific DMRs, PanIN lesions had an intermediate methylation level between normal and PDAC, which suggests from an information theory perspective that PanIN lesions are epigenetically primed to progress to PDAC. Thus, epigenomic analysis complements histopathology to define molecular progression toward PDAC. The shared epigenetic lineage between PanIN and PDAC lesions could provide an opportunity for prevention by targeting aberrantly methylated progression-related genes. 
SIGNIFICANCE: Analysis of DNA methylation landscapes provides insights into the cell-of-origin of PanIN lesions, clarifies the role of PanIN lesions as metaplastic precursors to human PDAC, and suggests potential targets for chemoprevention.


Subjects
Carcinoma in Situ, Pancreatic Ductal Carcinoma, Pancreatic Neoplasms, Humans, DNA Methylation, Pancreatic Neoplasms/pathology, Carcinogenesis/genetics, Carcinogenesis/pathology, Pancreatic Ductal Carcinoma/pathology, Carcinoma in Situ/genetics, Carcinoma in Situ/pathology, Pancreatic Neoplasms
14.
Bioinform Adv ; 3(1): vbad020, 2023.
Article in English | MEDLINE | ID: mdl-36874953

ABSTRACT

Summary: Thousands of DNA methylation (DNAm) array samples from human blood are publicly available on the Gene Expression Omnibus (GEO), but they remain underutilized for experiment planning, replication and cross-study and cross-platform analyses. To facilitate these tasks, we augmented our recountmethylation R/Bioconductor package with 12 537 uniformly processed EPIC and HM450K blood samples on GEO as well as several new features. We subsequently used our updated package in several illustrative analyses, finding (i) study ID bias adjustment increased variation explained by biological and demographic variables, (ii) most variation in autosomal DNAm was explained by genetic ancestry and CD4+ T-cell fractions and (iii) the dependence of power to detect differential methylation on sample size was similar for each of peripheral blood mononuclear cells (PBMC), whole blood and umbilical cord blood. Finally, we used PBMC and whole blood to perform independent validations, and we recovered 38-46% of differentially methylated probes between sexes from two previously published epigenome-wide association studies. Availability and implementation: Source code to reproduce the main results is available on GitHub (repo: recountmethylation_flexible-blood-analysis_manuscript; url: https://github.com/metamaden/recountmethylation_flexible-blood-analysis_manuscript). All data was publicly available and downloaded from the Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/). Compilations of the analyzed public data can be accessed from the website recount.bio/data (preprocessed HM450K array data: https://recount.bio/data/remethdb_h5se-gm_epic_0-0-2_1589820348/; preprocessed EPIC array data: https://recount.bio/data/remethdb_h5se-gm_epic_0-0-2_1589820348/). Supplementary information: Supplementary data are available at Bioinformatics Advances online.

15.
PLoS Comput Biol ; 18(3): e1009954, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35353807

ABSTRACT

Estimates of correlation between pairs of genes in co-expression analysis are commonly used to construct networks among genes using gene expression data. As previously noted, the distribution of such correlations depends on the observed expression level of the involved genes, which we refer to as a mean-correlation relationship in RNA-seq data, both bulk and single-cell. This dependence introduces an unwanted technical bias in co-expression analysis whereby highly expressed genes are more likely to be highly correlated. Such a relationship is not observed in protein-protein interaction data, suggesting that it is not reflecting biology. Ignoring this bias can lead to missing potentially biologically relevant pairs of genes that are lowly expressed, such as transcription factors. To address this problem, we introduce spatial quantile normalization (SpQN), a method for normalizing local distributions in a correlation matrix. We show that spatial quantile normalization removes the mean-correlation relationship and corrects the expression bias in network reconstruction.


Subjects
Gene Expression Profiling, Transcription Factors, RNA Sequence Analysis/methods, Transcription Factors/genetics, Exome Sequencing
16.
Cell Genom ; 2(1)2022 Jan 12.
Article in English | MEDLINE | ID: mdl-35199087

ABSTRACT

The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL; https://anvilproject.org) was developed to address a widespread community need for a unified computing environment for genomics data storage, management, and analysis. In this perspective, we present AnVIL, describe its ecosystem and interoperability with other platforms, and highlight how this platform and associated initiatives contribute to improved genomic data sharing efforts. The AnVIL is a federated cloud platform designed to manage and store genomics and related data, enable population-scale analysis, and facilitate collaboration through the sharing of data, code, and analysis results. By inverting the traditional model of data sharing, the AnVIL eliminates the need for data movement while also adding security measures for active threat detection and monitoring and provides scalable, shared computing resources for any researcher. We describe the core data management and analysis components of the AnVIL, which currently consists of Terra, Gen3, Galaxy, RStudio/Bioconductor, Dockstore, and Jupyter, and describe several flagship genomics datasets available within the AnVIL. We continue to extend and innovate the AnVIL ecosystem by implementing new capabilities, including mechanisms for interoperability and responsible data sharing, while streamlining access management. The AnVIL opens many new opportunities for analysis, collaboration, and data sharing that are needed to drive research and to make discoveries through the joint analysis of hundreds of thousands to millions of genomes along with associated clinical and molecular data types.

17.
Nat Commun ; 13(1): 783, 2022 Feb 10.
Article in English | MEDLINE | ID: mdl-35145108

ABSTRACT

Infinium methylation arrays are not available for the vast majority of non-human mammals. Moreover, even if species-specific arrays were available, probe differences between them would confound cross-species comparisons. To address these challenges, we developed the mammalian methylation array, a single custom array that measures up to 36k CpGs per species that are well conserved across many mammalian species. We designed a set of probes that can tolerate specific cross-species mutations. We annotate the array in over 200 species and report CpG island status and chromatin states in select species. Calibration experiments demonstrate the high fidelity in humans, rats, and mice. The mammalian methylation array has several strengths: it applies to all mammalian species even those that have not yet been sequenced, it provides deep coverage of conserved cytosines facilitating the development of epigenetic biomarkers, and it increases the probability that biological insights gained in one species will translate to others.


Subjects
Conserved Sequence, DNA Methylation, Mammals/genetics, Mammals/metabolism, Post-Translational Protein Processing/genetics, Post-Translational Protein Processing/physiology, Animals, Biomarkers, CpG Islands, Genetic Epigenesis, Humans, Mice, Mutation, Rats, Transcriptome
18.
Genome Biol ; 23(1): 41, 2022 Jan 31.
Article in English | MEDLINE | ID: mdl-35101061

ABSTRACT

BACKGROUND: The cell cycle is a highly conserved, continuous process which controls faithful replication and division of cells. Single-cell technologies have enabled increasingly precise measurements of the cell cycle both as a biological process of interest and as a possible confounding factor. Despite its importance and conservation, there is no universally applicable approach to infer position in the cell cycle with high resolution from single-cell RNA-seq data. RESULTS: Here, we present tricycle, an R/Bioconductor package, to address this challenge by leveraging key features of the biology of the cell cycle, the mathematical properties of principal component analysis of periodic functions, and the use of transfer learning. We estimate a cell-cycle embedding using a fixed reference dataset and project new data into this reference embedding, an approach that overcomes key limitations of learning a dataset-dependent embedding. Tricycle then predicts a cell-specific position in the cell cycle based on the data projection. The accuracy of tricycle compares favorably to gold-standard experimental assays, which generally require specialized measurements in specifically constructed in vitro systems. Using internal controls which are available for any dataset, we show that tricycle predictions generalize to datasets with multiple cell types, across tissues, species, and even sequencing assays. CONCLUSIONS: Tricycle generalizes across datasets and is highly scalable and applicable to atlas-level single-cell RNA-seq data.


Subjects
Machine Learning, Single-Cell Analysis, Cell Cycle/genetics, Principal Component Analysis, RNA Sequence Analysis, Exome Sequencing
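The projection step described in the tricycle abstract above can be sketched as follows. A cell's centered expression vector is projected onto two fixed reference axes, and its position in the cycle is read off as the polar angle of the projected point. Reading the position as an angle is an assumption made for this illustration, since the abstract does not spell out the formula; the loadings and function names below are hypothetical and this is not the tricycle package API.

```python
import math


def project(expression, loadings):
    """Project one cell's centered expression vector onto two fixed axes.

    loadings holds one (axis1_weight, axis2_weight) pair per gene.
    """
    x = sum(e * w[0] for e, w in zip(expression, loadings))
    y = sum(e * w[1] for e, w in zip(expression, loadings))
    return x, y


def cycle_position(expression, loadings):
    """Polar angle of the projected cell, in radians within [0, 2*pi)."""
    x, y = project(expression, loadings)
    return math.atan2(y, x) % (2 * math.pi)


# Hypothetical reference loadings for three genes.
reference_loadings = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]

for cell in ([2, 0, 0], [0, 3, 0], [0, 0, 1], [0, -1, 0]):
    print(cell, round(cycle_position(cell, reference_loadings), 3))
```

Because the loadings are fixed in advance, cells from any new dataset land in the same coordinate system, which is the point of projecting into a reference embedding instead of learning a dataset-dependent one.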
19.
Kidney Int ; 101(2): 369-378, 2022 Feb.
Article in English | MEDLINE | ID: mdl-34843755

ABSTRACT

Uremic symptoms are common in patients with advanced chronic kidney disease, but the toxins that cause these symptoms are unknown. To evaluate this, we performed a cross-sectional study of the 12-month post-randomization follow-up visit of Modification of Diet in Renal Disease (MDRD) participants reporting uremic symptoms who also had available stored serum. We quantified 1,163 metabolites by liquid chromatography-tandem mass spectrometry. For each uremic symptom, we calculated a score as the severity multiplied by the number of days the symptom was experienced. We analyzed the associations of the individual symptom scores with metabolites using linear models with empirical Bayesian inference, adjusted for multiple comparisons. Among 695 participants, the mean measured glomerular filtration rate (mGFR) was 28 mL/min/1.73 m². Uremic symptoms were more common in the subgroup of 214 patients with an mGFR under 20 mL/min/1.73 m² (mGFR under 20 subgroup) than in the full group. For all metabolites with significant associations, the direction of the association was concordant in the full group and the subgroup. For gastrointestinal symptoms (bad taste, loss of appetite, nausea, and vomiting), eleven metabolites were associated with symptoms. For neurologic symptoms (decreased alertness, falling asleep during the day, forgetfulness, lack of pep and energy, and tiring easily/weakness), seven metabolites were associated with symptoms. Associations were consistent across sensitivity analyses. Thus, our proof-of-principle study demonstrates the potential for metabolomics to understand metabolic pathways associated with uremic symptoms. Larger, prospective studies with external validation are needed.


Subjects
Chronic Renal Insufficiency, Bayes Theorem, Cross-Sectional Studies, Glomerular Filtration Rate, Humans, Metabolomics, Prospective Studies, Chronic Renal Insufficiency/complications, Chronic Renal Insufficiency/diagnosis
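The symptom score defined in the abstract above (severity multiplied by the number of days the symptom was experienced) is simple enough to write out directly. The sketch below does so for a few made-up reports. The record structure, the severity values and the day counts are hypothetical, since the abstract does not give the rating scale or the reporting window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymptomReport:
    """One participant's report for one symptom (hypothetical structure)."""
    symptom: str
    severity: int  # assumed ordinal rating; the scale is not given in the abstract
    days: int      # number of days the symptom was experienced


def symptom_score(report: SymptomReport) -> int:
    """Score = severity multiplied by number of days, as stated in the abstract."""
    if report.severity < 0 or report.days < 0:
        raise ValueError("severity and days must be non-negative")
    return report.severity * report.days


reports = [
    SymptomReport("nausea", severity=2, days=5),
    SymptomReport("forgetfulness", severity=0, days=0),
    SymptomReport("bad taste", severity=3, days=10),
]
scores = {r.symptom: symptom_score(r) for r in reports}
print(scores)
```

A product score of this kind treats a mild symptom on many days and a severe symptom on few days as comparable burdens, which is worth keeping in mind when reading the metabolite associations.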
20.
Genome Biol ; 22(1): 323, 2021 Nov 29.
Article in English | MEDLINE | ID: mdl-34844637

ABSTRACT

We present recount3, a resource consisting of over 750,000 publicly available human and mouse RNA sequencing (RNA-seq) samples uniformly processed by our new Monorail analysis pipeline. To facilitate access to the data, we provide the recount3 and snapcount R/Bioconductor packages as well as complementary web resources. Using these tools, data can be downloaded as study-level summaries or queried for specific exon-exon junctions, genes, samples, or other features. Monorail can be used to process local and/or private data, allowing results to be directly compared to any study in recount3. Taken together, our tools help biologists maximize the utility of publicly available RNA-seq data, especially to improve their understanding of newly collected data. recount3 is available from http://rna.recount.bio .


Subjects
RNA Splicing, RNA-Seq/methods, RNA/genetics, Animals, Base Sequence, Computational Biology/methods, Exons, Gene Expression Regulation, High-Throughput Nucleotide Sequencing, Humans, Mice, RNA Sequence Analysis/methods, Software
...