1.
Sci Adv ; 9(29): eadf4163, 2023 07 21.
Article in English | MEDLINE | ID: mdl-37467337

ABSTRACT

Aging is a leading risk factor for cancer. While it is proposed that age-related accumulation of somatic mutations drives this relationship, it is likely not the full story. We show that aging and cancer share a common epigenetic replication signature, which we modeled using DNA methylation from extensively passaged immortalized human cells in vitro and tested on clinical tissues. This signature, termed CellDRIFT, increased with age across multiple tissues, distinguished tumor from normal tissue, was escalated in normal breast tissue from cancer patients, and was transiently reset upon reprogramming. In addition, within-person tissue differences were correlated with predicted lifetime tissue-specific stem cell divisions and tissue-specific cancer risk. Our findings suggest that age-related replication may drive epigenetic changes in cells and could push them toward a more tumorigenic state.


Subject(s)
Epigenome , Neoplasms , Humans , Neoplasms/genetics , Neoplasms/pathology , Epigenesis, Genetic , Aging/genetics , Risk Factors
2.
Cell Genom ; 3(5): 100303, 2023 May 10.
Article in English | MEDLINE | ID: mdl-37228754

ABSTRACT

Although the role of RNA binding proteins (RBPs) in extracellular RNA (exRNA) biology is well established, their exRNA cargo and distribution across biofluids are largely unknown. To address this gap, we extend the exRNA Atlas resource by mapping exRNAs carried by extracellular RBPs (exRBPs). This map was developed through an integrative analysis of ENCODE enhanced crosslinking and immunoprecipitation (eCLIP) data (150 RBPs) and human exRNA profiles (6,930 samples). Computational analysis and experimental validation identified exRBPs in plasma, serum, saliva, urine, cerebrospinal fluid, and cell-culture-conditioned medium. exRBPs carry exRNA transcripts from small non-coding RNA biotypes, including microRNA (miRNA), piRNA, tRNA, small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), Y RNA, and lncRNA, as well as protein-coding mRNA fragments. Computational deconvolution of exRBP RNA cargo reveals associations of exRBPs with extracellular vesicles, lipoproteins, and ribonucleoproteins across human biofluids. Overall, we mapped the distribution of exRBPs across human biofluids, presenting a resource for the community.

3.
Cell ; 186(7): 1493-1511.e40, 2023 03 30.
Article in English | MEDLINE | ID: mdl-37001506

ABSTRACT

Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.


Subject(s)
Epigenome , Quantitative Trait Loci , Genome-Wide Association Study , Genomics , Phenotype , Polymorphism, Single Nucleotide
5.
Front Cell Dev Biol ; 10: 804164, 2022.
Article in English | MEDLINE | ID: mdl-35317387

ABSTRACT

One promising goal for utilizing the molecular information circulating in biofluids is the discovery of clinically useful biomarkers. Extracellular RNAs (exRNAs) are one of the most diverse classes of molecular cargo, easily assayed by sequencing and with expression that changes rapidly in response to subject status. Despite diverse exRNA cargo, most evaluations from biofluids have focused on small RNA sequencing and analysis, specifically on microRNAs (miRNAs). Another goal of characterizing circulating molecular information is to correlate expression to injuries associated with specific tissues of origin. Biomarker candidates are often described as being specific, enriched in a particular tissue or associated with a disease process. Likewise, miRNA data are often reported to be specific, or enriched for a tissue, without rigorous testing to support the claim. Here we provide a tissue atlas of small RNAs from 30 different tissues and three different blood cell types. We analyzed the tissues for enrichment of small RNA sequences and assessed their expression in biofluids: plasma, cerebrospinal fluid, urine, and saliva. We employed published data sets representing physiological (resting vs. acute exercise) and pathologic states (early- vs. late-stage liver fibrosis, and differential subtypes of stroke) to determine differential tissue-enriched small RNAs. We also developed an online tool that provides information about exRNA sequences found in different biofluids and tissues. The data can be used to better understand the various types of small RNA sequences in different tissues as well as their potential release into biofluids, which should help in the validation or design of biomarker studies.

6.
Genome Med ; 13(1): 70, 2021 04 26.
Article in English | MEDLINE | ID: mdl-33902690

ABSTRACT

BACKGROUND: Inflammatory breast cancer (IBC) has a highly invasive and metastatic phenotype. However, little is known about its genetic drivers. To address this, we report the largest cohort of whole-genome sequencing (WGS) of IBC cases. METHODS: We performed WGS of 20 IBC samples and paired normal blood DNA to identify genomic alterations. For comparison, we used 23 matched non-IBC samples from the Cancer Genome Atlas Program (TCGA). We also validated our findings using WGS data from the International Cancer Genome Consortium (ICGC) and the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium. We examined a wide selection of genomic features to search for differences between IBC and conventional breast cancer. These include (i) somatic and germline single-nucleotide variants (SNVs), in both coding and non-coding regions; (ii) the mutational signature and the clonal architecture derived from these SNVs; (iii) copy number and structural variants (CNVs and SVs); and (iv) non-human sequence in the tumors (i.e., exogenous sequences of bacterial origin). RESULTS: Overall, IBC has similar genomic characteristics to non-IBC, including specific alterations, overall mutational load and signature, and tumor heterogeneity. In particular, we observed similar mutation frequencies between IBC and non-IBC, for each gene and most cancer-related pathways. Moreover, we found no exogenous sequences of infectious agents specific to IBC samples. Even though we could not find any strongly statistically distinguishing genomic features between the two groups, we did find some suggestive differences in IBC: (i) The MAST2 gene was more frequently mutated (20% IBC vs. 0% non-IBC). (ii) The TGF-β pathway was more frequently disrupted by germline SNVs (50% vs. 13%). (iii) Different copy number profiles were observed in several genomic regions harboring cancer genes. (iv) Complex SVs were more frequent. (v) The clonal architecture was simpler, suggesting more homogenous tumor-evolutionary lineages. CONCLUSIONS: Whole-genome sequencing of IBC manifests a similar genomic architecture to non-IBC. We found no unique genomic alterations shared in just IBCs; however, subtle genomic differences were observed including germline alterations in TGF-β pathway genes and somatic mutations in the MAST2 kinase that could represent potential therapeutic targets.


Subject(s)
Genome, Human , Inflammatory Breast Neoplasms/genetics , Mutation/genetics , Whole Genome Sequencing , Clone Cells , DNA Copy Number Variations/genetics , Evolution, Molecular , Humans , Inflammatory Breast Neoplasms/microbiology , Inflammatory Breast Neoplasms/pathology , Molecular Sequence Annotation , Phenotype , Signal Transduction/genetics
7.
Front Genet ; 12: 778416, 2021.
Article in English | MEDLINE | ID: mdl-35047007

ABSTRACT

We now know RNA can survive the harsh environment of biofluids when encapsulated in vesicles or by associating with lipoproteins or RNA binding proteins. These extracellular RNAs (exRNAs) play a role in intercellular signaling, serve as biomarkers of disease, and form the basis of new strategies for disease treatment. The Extracellular RNA Communication Consortium (ERCC) hosted a two-day online workshop (April 19-20, 2021) on the unique challenges of exRNA data analysis. The goal was to foster an open dialog about best practices and discuss open problems in the field, focusing initially on small exRNA sequencing data. Video recordings of workshop presentations and discussions are available (https://exRNA.org/exRNAdata2021-videos/). There were three target audiences: experimentalists who generate exRNA sequencing data, computational and data scientists who work with those groups to analyze their data, and experimental and data scientists new to the field. Here we summarize issues explored during the workshop, including progress on an effort to develop an exRNA data analysis challenge to engage the community in solving some of these open problems.

8.
Nat Methods ; 17(8): 807-814, 2020 08.
Article in English | MEDLINE | ID: mdl-32737473

ABSTRACT

Enhancers are important non-coding elements, but they have traditionally been hard to characterize experimentally. The development of massively parallel assays allows the characterization of large numbers of enhancers for the first time. Here, we developed a framework using Drosophila STARR-seq to create shape-matching filters based on meta-profiles of epigenetic features. We integrated these features with supervised machine-learning algorithms to predict enhancers. We further demonstrated that our model could be transferred to predict enhancers in mammals. We comprehensively validated the predictions using a combination of in vivo and in vitro approaches, involving transgenic assays in mice and transduction-based reporter assays in human cell lines (153 enhancers in total). The results confirmed that our model can accurately predict enhancers in different species without re-parameterization. Finally, we examined the transcription factor binding patterns at predicted enhancers versus promoters. We demonstrated that these patterns enable the construction of a secondary model that effectively distinguishes enhancers and promoters.


Subject(s)
Epigenesis, Genetic/physiology , Pattern Recognition, Automated/methods , Animals , Cell Line , Drosophila , Histones/genetics , Histones/metabolism , Humans , Mice , Mice, Transgenic , Reproducibility of Results
9.
Nat Commun ; 11(1): 3696, 2020 07 29.
Article in English | MEDLINE | ID: mdl-32728046

ABSTRACT

ENCODE comprises thousands of functional genomics datasets, and the encyclopedia covers hundreds of cell types, providing a universal annotation for genome interpretation. However, for particular applications, it may be advantageous to use a customized annotation. Here, we develop such a custom annotation by leveraging advanced assays, such as eCLIP, Hi-C, and whole-genome STARR-seq on a number of data-rich ENCODE cell types. A key aspect of this annotation is comprehensive and experimentally derived networks of both transcription factors and RNA-binding proteins (TFs and RBPs). Cancer, a disease of system-wide dysregulation, is an ideal application for such a network-based annotation. Specifically, for cancer-associated cell types, we put regulators into hierarchies and measure their network change (rewiring) during oncogenesis. We also extensively survey TF-RBP crosstalk, highlighting how SUB1, a previously uncharacterized RBP, drives aberrant tumor expression and amplifies the effect of MYC, a well-known oncogenic TF. Furthermore, we show how our annotation allows us to place oncogenic transformations in the context of a broad cell space; here, many normal-to-tumor transitions move towards a stem-like state, while oncogene knockdowns show an opposing trend. Finally, we organize the resource into a coherent workflow to prioritize key elements and variants, in addition to regulators. We showcase the application of this prioritization to somatic burdening, cancer differential expression and GWAS. Targeted validations of the prioritized regulators, elements and variants using siRNA knockdowns, CRISPR-based editing, and luciferase assays demonstrate the value of the ENCODE resource.


Subject(s)
Databases, Genetic , Genomics , Neoplasms/genetics , Cell Line, Tumor , Cell Transformation, Neoplastic/genetics , Gene Regulatory Networks , Humans , Mutation/genetics , Reproducibility of Results , Transcription Factors/metabolism
10.
Nature ; 583(7818): 699-710, 2020 07.
Article in English | MEDLINE | ID: mdl-32728249

ABSTRACT

The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE and Roadmap Epigenomics data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.


Subject(s)
DNA/genetics , Databases, Genetic , Genome/genetics , Genomics , Molecular Sequence Annotation , Registries , Regulatory Sequences, Nucleic Acid/genetics , Animals , Chromatin/genetics , Chromatin/metabolism , DNA/chemistry , DNA Footprinting , DNA Methylation/genetics , DNA Replication Timing , Deoxyribonuclease I/metabolism , Genome, Human , Histones/metabolism , Humans , Mice , Mice, Transgenic , RNA-Binding Proteins/genetics , Transcription, Genetic/genetics , Transposases/metabolism