Results 1 - 7 of 7
1.
Res Sq ; 2023 Jul 19.
Article in English | MEDLINE | ID: mdl-37503119

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) project is a collaborative effort to create a comprehensive catalog of functional elements in the human genome. The current database comprises more than 19,000 functional genomics experiments across more than 1,000 cell lines and tissues, using a wide array of experimental techniques to study the chromatin structure and the regulatory and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All experimental data, metadata, and associated computational analyses created by the ENCODE consortium are submitted to the Data Coordination Center (DCC) for validation, tracking, storage, and distribution to community resources and the scientific community. The ENCODE project has engineered and distributed uniform processing pipelines to promote data provenance and reproducibility and to allow interoperability between genomic resources and other consortia. All data files, reference genome versions, software versions, and parameters used by the pipelines are captured and available via the ENCODE Portal. The pipeline code, developed using Docker and Workflow Description Language (WDL; https://openwdl.org/), is publicly available on GitHub, with images available on Docker Hub (https://hub.docker.com), making it accessible to a diverse range of biomedical researchers. ENCODE pipelines maintained and used by the DCC can be installed to run on personal computers, local HPC clusters, or in cloud computing environments via Cromwell. Access to the pipelines and data via the cloud allows small labs to use the data or software without access to institutional compute clusters. Standardization of the computational methodologies for analysis and quality control leads to comparable results from different ENCODE collections, a prerequisite for successful integrative analyses.

2.
bioRxiv ; 2023 May 16.
Article in English | MEDLINE | ID: mdl-37292896

ABSTRACT

The majority of mammalian genes encode multiple transcript isoforms that result from differential promoter use, changes in exonic splicing, and alternative 3' end choice. Detecting and quantifying transcript isoforms across tissues, cell types, and species has been extremely challenging because transcripts are much longer than the short reads normally used for RNA-seq. By contrast, long-read RNA-seq (LR-RNA-seq) gives the complete structure of most transcripts. We sequenced 264 LR-RNA-seq PacBio libraries totaling over 1 billion circular consensus reads (CCS) for 81 unique human and mouse samples. We detect at least one full-length transcript from 87.7% of annotated human protein coding genes and a total of 200,000 full-length transcripts, 40% of which have novel exon junction chains. To capture and compute on the three sources of transcript structure diversity, we introduce a gene and transcript annotation framework that uses triplets representing the transcript start site, exon junction chain, and transcript end site of each transcript. Using triplets in a simplex representation demonstrates how promoter selection, splice pattern, and 3' processing are deployed across human tissues, with nearly half of multi-transcript protein coding genes showing a clear bias toward one of the three diversity mechanisms. Evaluated across samples, the predominantly expressed transcript changes for 74% of protein coding genes. In evolution, the human and mouse transcriptomes are globally similar in types of transcript structure diversity, yet among individual orthologous gene pairs, more than half (57.8%) show substantial differences in mechanism of diversification in matching tissues. This initial large-scale survey of human and mouse long-read transcriptomes provides a foundation for further analyses of alternative transcript usage, and is complemented by short-read and microRNA data on the same samples and by epigenome data elsewhere in the ENCODE4 collection.
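
The triplet idea in the abstract above can be illustrated with a minimal sketch. It assumes each transcript is already labeled with a transcript start site (TSS), an exon junction chain (EC), and a transcript end site (TES) identifier; the identifiers, the function names, and the choice to place a gene on the simplex by plain normalization of the three counts are all illustrative assumptions, not the authors' published method, which may weight or scale the counts differently.

```python
from collections import defaultdict


def gene_triplets(transcripts):
    """Count distinct TSSs, exon junction chains, and TESs per gene.

    `transcripts` is an iterable of (gene, tss_id, ec_id, tes_id) tuples.
    The identifiers are hypothetical labels used only for this sketch.
    """
    seen = defaultdict(lambda: (set(), set(), set()))
    for gene, tss, ec, tes in transcripts:
        tss_set, ec_set, tes_set = seen[gene]
        tss_set.add(tss)
        ec_set.add(ec)
        tes_set.add(tes)
    return {gene: tuple(len(s) for s in sets) for gene, sets in seen.items()}


def simplex_coords(counts):
    """Normalize a (n_tss, n_ec, n_tes) triplet so the parts sum to 1.

    Assumption: plain proportional normalization; the real framework
    may use a different scaling.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("triplet must contain at least one non-zero count")
    return tuple(c / total for c in counts)


# Toy input: three transcripts of one hypothetical gene.
EXAMPLE = [
    ("geneA", "tss1", "ec1", "tes1"),
    ("geneA", "tss1", "ec2", "tes1"),
    ("geneA", "tss2", "ec3", "tes1"),
]

TRIPLETS = gene_triplets(EXAMPLE)
COORDS = simplex_coords(TRIPLETS["geneA"])
```

In this toy input, geneA has two start sites, three junction chains, and one end site, so its largest simplex coordinate is the splicing one. Under the stated assumption, that is the kind of bias toward one diversity mechanism the abstract refers to.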

3.
bioRxiv ; 2023 Apr 06.
Article in English | MEDLINE | ID: mdl-37066421

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) project is a collaborative effort to create a comprehensive catalog of functional elements in the human genome. The current database comprises more than 19,000 functional genomics experiments across more than 1,000 cell lines and tissues, using a wide array of experimental techniques to study the chromatin structure and the regulatory and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All experimental data, metadata, and associated computational analyses created by the ENCODE consortium are submitted to the Data Coordination Center (DCC) for validation, tracking, storage, and distribution to community resources and the scientific community. The ENCODE project has engineered and distributed uniform processing pipelines to promote data provenance and reproducibility and to allow interoperability between genomic resources and other consortia. All data files, reference genome versions, software versions, and parameters used by the pipelines are captured and available via the ENCODE Portal. The pipeline code, developed using Docker and Workflow Description Language (WDL; https://openwdl.org/), is publicly available on GitHub, with images available on Docker Hub (https://hub.docker.com), making it accessible to a diverse range of biomedical researchers. ENCODE pipelines maintained and used by the DCC can be installed to run on personal computers, local HPC clusters, or in cloud computing environments via Cromwell. Access to the pipelines and data via the cloud allows small labs to use the data or software without access to institutional compute clusters. Standardization of the computational methodologies for analysis and quality control leads to comparable results from different ENCODE collections, a prerequisite for successful integrative analyses.

4.
Genome Res ; 29(11): 1900-1909, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31645363

ABSTRACT

MicroRNAs (miRNAs) play a critical role as posttranscriptional regulators of gene expression. The ENCODE Project profiled the expression of miRNAs in an extensive set of organs during a time-course of mouse embryonic development and captured the expression dynamics of 785 miRNAs. We found distinct organ-specific and developmental stage-specific miRNA expression clusters, with an overall pattern of increasing organ-specific expression as embryonic development proceeds. Comparative analysis of conserved miRNAs in mouse and human revealed stronger clustering of expression patterns by organ type rather than by species. An analysis of messenger RNA expression clusters compared with miRNA expression clusters identifies the potential role of specific miRNA expression clusters in suppressing the expression of mRNAs specific to other developmental programs in the organ in which these miRNAs are expressed during embryonic development. Our results provide the most comprehensive time-course of miRNA expression as part of an integrated ENCODE reference data set for mouse embryonic development.


Subject(s)
Embryonic Development/genetics , MicroRNAs/genetics , Animals , Female , Gene Expression Regulation, Developmental , Mice , Pregnancy , RNA, Messenger/genetics
5.
MAbs ; 6(5): 1274-82, 2014.
Article in English | MEDLINE | ID: mdl-25517312

ABSTRACT

Antibody engineering to enhance thermostability may enable further application and ease of use of antibodies across a number of different areas. A modified human IgG framework has been developed through a combination of engineering approaches, which can be used to stabilize antibodies of diverse specificity. This is achieved through a combination of complementarity-determining region (CDR)-grafting onto the stable framework, mammalian cell display and in vitro somatic hypermutation (SHM). This approach allows both stabilization and maturation to affinities beyond those of the original antibody, as shown by the stabilization of an anti-HA33 antibody by approximately 10°C and affinity maturation of approximately 300-fold over the original antibody. Specificities of 10 antibodies of diverse origin were successfully transferred to the stable framework through CDR-grafting, with 8 of these successfully stabilized, including the therapeutic antibodies adalimumab (stabilized by 9.9°C), denosumab (7°C), cetuximab (6.9°C) and, to a lesser extent, trastuzumab (0.8°C). These data suggest that this approach may be broadly useful for improving the biophysical characteristics of antibodies across a number of applications.


Subject(s)
Antibodies/immunology , Complementarity Determining Regions/immunology , Immunoglobulin G/immunology , Protein Engineering/methods , Adalimumab , Animals , Antibodies/chemistry , Antibodies/genetics , Antibodies, Monoclonal, Humanized/genetics , Antibodies, Monoclonal, Humanized/immunology , Antibody Affinity/immunology , Calorimetry, Differential Scanning , Cell Surface Display Techniques , Cetuximab , Complementarity Determining Regions/genetics , Denosumab , HEK293 Cells , Humans , Immunoglobulin G/genetics , Models, Molecular , Protein Conformation , Protein Stability , Somatic Hypermutation, Immunoglobulin , Temperature , Trastuzumab
6.
Sci Rep ; 4: 3925, 2014 Jan 29.
Article in English | MEDLINE | ID: mdl-24473230

ABSTRACT

There has been no recent systematic, objective assessment of the metabolic capabilities of the human platelet. A manually curated, functionally tested, and validated biochemical reaction network of platelet metabolism, iAT-PLT-636, was reconstructed using 33 proteomic datasets and 354 literature references. The network contains enzymes mapping to 403 diseases and 231 FDA-approved drugs, alluding to an expansive scope of biochemical transformations that may affect or be affected by disease processes in multiple organ systems. The effect of aspirin (ASA) resistance on platelet metabolism was evaluated using constraint-based modeling, which revealed a redirection of glycolytic, fatty acid, and nucleotide metabolism reaction fluxes in order to accommodate eicosanoid synthesis and reactive oxygen species stress. These results were confirmed with independent proteomic data. The construction and availability of iAT-PLT-636 should stimulate further data-driven, systems analysis of platelet metabolism towards the understanding of pathophysiological conditions including, but not strictly limited to, coagulopathies.


Subject(s)
Aspirin/pharmacology , Blood Platelets/drug effects , Blood Platelets/metabolism , Biochemical Phenomena/drug effects , Biochemical Phenomena/physiology , Eicosanoids/metabolism , Fatty Acids/metabolism , Glycolysis/drug effects , Glycolysis/physiology , Humans , Nucleotides/metabolism , Proteome/metabolism , Proteomics/methods , Reactive Oxygen Species/metabolism
7.
Nat Protoc ; 6(9): 1290-307, 2011 Aug 04.
Article in English | MEDLINE | ID: mdl-21886097

ABSTRACT

Over the past decade, a growing community of researchers has emerged around the use of constraint-based reconstruction and analysis (COBRA) methods to simulate, analyze and predict a variety of metabolic phenotypes using genome-scale models. The COBRA Toolbox, a MATLAB package for implementing COBRA methods, was presented earlier. Here we present a substantial update of this in silico toolbox. Version 2.0 of the COBRA Toolbox expands the scope of computations by including in silico analysis methods developed since its original release. New functions include (i) network gap filling, (ii) ¹³C analysis, (iii) metabolic engineering, (iv) omics-guided analysis and (v) visualization. As with the first version, the COBRA Toolbox reads and writes systems biology markup language-formatted models. In version 2.0, we improved performance, usability and the level of documentation. A suite of test scripts can now be used to learn the core functionality of the toolbox and validate results. This toolbox lowers the barrier of entry to use powerful COBRA methods.


Subject(s)
Computational Biology/methods , Metabolic Networks and Pathways , Software , Algorithms
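
The constraint-based methods named in record 7 rest on two constraints: steady-state mass balance (S · v = 0, where S is the stoichiometric matrix and v the flux vector) and lower and upper bounds on each flux. The sketch below checks both for a toy three-reaction network in plain Python. It is not the COBRA Toolbox API (the toolbox is a MATLAB package); the network, the bounds, and the function names are hypothetical and chosen only to show the idea.

```python
def mass_balance(S, v):
    """Return S . v, the net production rate of each metabolite."""
    return [sum(coeff * flux for coeff, flux in zip(row, v)) for row in S]


def is_feasible(S, v, bounds, tol=1e-9):
    """True if flux vector v is at steady state and within its bounds."""
    balanced = all(abs(rate) <= tol for rate in mass_balance(S, v))
    in_bounds = all(lo - tol <= flux <= hi + tol
                    for flux, (lo, hi) in zip(v, bounds))
    return balanced and in_bounds


# Hypothetical toy network (rows: metabolites A, B; columns: R1, R2, R3):
#   R1: -> A      (uptake)
#   R2: A -> B    (conversion)
#   R3: B ->      (export)
S = [
    [1, -1, 0],   # A: produced by R1, consumed by R2
    [0, 1, -1],   # B: produced by R2, consumed by R3
]
BOUNDS = [(0, 10), (0, 10), (0, 10)]  # assumed flux limits, arbitrary units

STEADY = is_feasible(S, [2, 2, 2], BOUNDS)        # balanced, within bounds
ACCUMULATING = is_feasible(S, [2, 1, 1], BOUNDS)  # A builds up, not steady
OVER_LIMIT = is_feasible(S, [20, 20, 20], BOUNDS) # balanced, exceeds bounds
```

A full analysis such as flux balance analysis would go one step further and search the feasible set for the flux vector that maximizes an objective, typically with a linear programming solver. That step is omitted here to keep the sketch free of dependencies.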