Search | BVS Infectious and Parasitic Diseases

ENCODE data at the ENCODE portal.

Sloan, Cricket A; Chan, Esther T; Davidson, Jean M; Malladi, Venkat S; Strattan, J Seth; Hitz, Benjamin C; Gabdank, Idan; Narayanan, Aditi K; Ho, Marcus; Lee, Brian T; Rowe, Laurence D; Dreszer, Timothy R; Roe, Greg; Podduturi, Nikhil R; Tanaka, Forrest; Hong, Eurie L; Cherry, J Michael.

Nucleic Acids Res ; 44(D1): D726-32, 2016 Jan 04.

Article in English | MEDLINE | ID: mdl-26527727

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) Project is in its third phase of creating a comprehensive catalog of functional elements in the human genome. This phase of the project includes an expansion of assays that measure diverse RNA populations, identify proteins that interact with RNA and DNA, probe regions of DNA hypersensitivity, and measure levels of DNA methylation in a wide range of cell and tissue types to identify putative regulatory elements. To date, results for almost 5000 experiments have been released for use by the scientific community. These data are available for searching, visualization and download at the new ENCODE Portal (www.encodeproject.org). The revamped ENCODE Portal provides new ways to browse and search the ENCODE data based on the metadata that describe the assays as well as summaries of the assays that focus on data provenance. In addition, it is a flexible platform that allows integration of genomic data from multiple projects. The portal experience was designed to improve access to ENCODE data by relying on metadata that allow reusability and reproducibility of the experiments.

Subjects

Genetic Databases , Human Genome , Genomics , Animals , DNA/metabolism , Genes , Humans , Mice , Proteins/metabolism , RNA/metabolism
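
The abstract above describes searching ENCODE data through the metadata that describe each assay. A minimal sketch of what such a metadata-driven query could look like follows. Only the portal address comes from the abstract; the `/search/` path and the `type`, `assay_title`, `limit` and `format` parameter names are assumptions made for illustration and should be checked against the portal's own documentation. The function only builds a URL and performs no network request.

```python
from urllib.parse import urlencode

# Portal address as given in the abstract.
PORTAL = "https://www.encodeproject.org"


def build_search_url(filters, limit=5):
    """Build a metadata search URL.

    The "/search/" path and every parameter name used here are
    assumptions for illustration, not a documented contract.
    """
    params = [("type", "Experiment")]
    # Sort the filters so the same query always yields the same URL.
    params += sorted(filters.items())
    params += [("limit", str(limit)), ("format", "json")]
    return f"{PORTAL}/search/?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url({"assay_title": "ChIP-seq"}))
```

Keeping URL construction separate from any HTTP call makes the query easy to inspect and to record alongside downloaded results, which fits the abstract's emphasis on provenance.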

The ENCODE Uniform Analysis Pipelines.

Hitz, Benjamin C; Lee, Jin-Wook; Jolanki, Otto; Kagda, Meenakshi S; Graham, Keenan; Sud, Paul; Gabdank, Idan; Strattan, J Seth; Sloan, Cricket A; Dreszer, Timothy; Rowe, Laurence D; Podduturi, Nikhil R; Malladi, Venkat S; Chan, Esther T; Davidson, Jean M; Ho, Marcus; Miyasato, Stuart; Simison, Matt; Tanaka, Forrest; Luo, Yunhai; Whaling, Ian; Hong, Eurie L; Lee, Brian T; Sandstrom, Richard; Rynes, Eric; Nelson, Jemma; Nishida, Andrew; Ingersoll, Alyssa; Buckley, Michael; Frerker, Mark; Kim, Daniel S; Boley, Nathan; Trout, Diane; Dobin, Alex; Rahmanian, Sorena; Wyman, Dana; Balderrama-Gutierrez, Gabriela; Reese, Fairlie; Durand, Neva C; Dudchenko, Olga; Weisz, David; Rao, Suhas S P; Blackburn, Alyssa; Gkountaroulis, Dimos; Sadr, Mahdi; Olshansky, Moshe; Eliaz, Yossi; Nguyen, Dat; Bochkov, Ivan; Shamim, Muhammad Saad.

Res Sq ; 2023 Jul 19.

Article in English | MEDLINE | ID: mdl-37503119

ABSTRACT

The Encyclopedia of DNA elements (ENCODE) project is a collaborative effort to create a comprehensive catalog of functional elements in the human genome. The current database comprises more than 19000 functional genomics experiments across more than 1000 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All experimental data, metadata, and associated computational analyses created by the ENCODE consortium are submitted to the Data Coordination Center (DCC) for validation, tracking, storage, and distribution to community resources and the scientific community. The ENCODE project has engineered and distributed uniform processing pipelines in order to promote data provenance and reproducibility as well as allow interoperability between genomic resources and other consortia. All data files, reference genome versions, software versions, and parameters used by the pipelines are captured and available via the ENCODE Portal. The pipeline code, developed using Docker and Workflow Description Language (WDL; https://openwdl.org/), is publicly available on GitHub, with images available on Dockerhub (https://hub.docker.com), enabling access to a diverse range of biomedical researchers. ENCODE pipelines maintained and used by the DCC can be installed to run on personal computers, local HPC clusters, or in cloud computing environments via Cromwell. Access to the pipelines and data via the cloud lets small labs use the data or software without access to institutional compute clusters. Standardization of the computational methodologies for analysis and quality control leads to comparable results from different ENCODE collections - a prerequisite for successful integrative analyses.
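
The abstract above states that data files, reference genome versions, software versions, and parameters are captured for each pipeline run. The sketch below shows one way such a provenance record could be represented and compared. It is not the ENCODE implementation: the `PipelineRun` class, its field names, and the fingerprinting scheme are all hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PipelineRun:
    """Hypothetical provenance record for one pipeline execution."""

    pipeline: str
    software_version: str
    reference_genome: str
    input_files: tuple
    parameters: dict = field(default_factory=dict)

    def fingerprint(self):
        """Return a SHA-256 digest of the canonical JSON form of the record."""
        payload = asdict(self)
        # Input order should not matter, so sort before serializing.
        payload["input_files"] = sorted(payload["input_files"])
        canonical = json.dumps(payload, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    run = PipelineRun(
        pipeline="example-pipeline",  # placeholder name
        software_version="1.0.0",
        reference_genome="GRCh38",
        input_files=("rep2.fastq.gz", "rep1.fastq.gz"),
        parameters={"threads": 4},
    )
    print(run.fingerprint())
```

Two runs with the same fingerprint used the same inputs, software version, reference genome, and parameters. That is the condition the abstract names as a prerequisite for comparing results across collections.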

The ENCODE Uniform Analysis Pipelines.

Hitz, Benjamin C; Lee, Jin-Wook; Jolanki, Otto; Kagda, Meenakshi S; Graham, Keenan; Sud, Paul; Gabdank, Idan; Strattan, J Seth; Sloan, Cricket A; Dreszer, Timothy; Rowe, Laurence D; Podduturi, Nikhil R; Malladi, Venkat S; Chan, Esther T; Davidson, Jean M; Ho, Marcus; Miyasato, Stuart; Simison, Matt; Tanaka, Forrest; Luo, Yunhai; Whaling, Ian; Hong, Eurie L; Lee, Brian T; Sandstrom, Richard; Rynes, Eric; Nelson, Jemma; Nishida, Andrew; Ingersoll, Alyssa; Buckley, Michael; Frerker, Mark; Kim, Daniel S; Boley, Nathan; Trout, Diane; Dobin, Alex; Rahmanian, Sorena; Wyman, Dana; Balderrama-Gutierrez, Gabriela; Reese, Fairlie; Durand, Neva C; Dudchenko, Olga; Weisz, David; Rao, Suhas S P; Blackburn, Alyssa; Gkountaroulis, Dimos; Sadr, Mahdi; Olshansky, Moshe; Eliaz, Yossi; Nguyen, Dat; Bochkov, Ivan; Shamim, Muhammad Saad.

bioRxiv ; 2023 Apr 06.

Article in English | MEDLINE | ID: mdl-37066421


SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata.

Hitz, Benjamin C; Rowe, Laurence D; Podduturi, Nikhil R; Glick, David I; Baymuradov, Ulugbek K; Malladi, Venkat S; Chan, Esther T; Davidson, Jean M; Gabdank, Idan; Narayanan, Aditi K; Onate, Kathrina C; Hilton, Jason; Ho, Marcus C; Lee, Brian T; Miyasato, Stuart R; Dreszer, Timothy R; Sloan, Cricket A; Strattan, J Seth; Tanaka, Forrest Y; Hong, Eurie L; Cherry, J Michael.

PLoS One ; 12(4): e0175310, 2017.

Article in English | MEDLINE | ID: mdl-28403240

ABSTRACT

The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details becomes increasingly intricate and demands careful curation. The ENCODE DCC has created a general purpose software system, known as SnoVault, that supports metadata and file submission, a database used for metadata storage, web pages for displaying the metadata and a robust API for querying the metadata. The software is fully open-source; code and installation instructions can be found at http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ (for storing genomic data in the manner of ENCODE). The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data), has been released as a separate Python package.

Subjects

Genetic Databases , Genomics/methods , Metadata , Software , Animals , DNA/genetics , Genome , Humans , Mice
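
The abstract above describes a system that accepts metadata submissions, stores them, and exposes an API for querying them. The sketch below illustrates that general pattern with an in-memory store that validates each submission against a simple schema before accepting it. It does not use or reproduce the SnoVault API: the `MetadataStore` class, its methods, the schema format, and the sample properties are all hypothetical.

```python
import uuid


class MetadataStore:
    """Hypothetical in-memory store of schema-validated metadata objects."""

    def __init__(self, schemas):
        # schemas maps an item type to {property name: required Python type}.
        self.schemas = schemas
        self.objects = {}

    def submit(self, item_type, properties):
        """Validate a submission and store it; return the new object's id."""
        schema = self.schemas[item_type]
        for name, expected in schema.items():
            if name not in properties:
                raise ValueError(f"{item_type}: missing property '{name}'")
            if not isinstance(properties[name], expected):
                raise ValueError(f"{item_type}: '{name}' has the wrong type")
        object_id = str(uuid.uuid4())
        self.objects[object_id] = {"@type": item_type, **properties}
        return object_id

    def query(self, item_type, **filters):
        """Return stored objects of one type whose properties match all filters."""
        return [
            obj
            for obj in self.objects.values()
            if obj["@type"] == item_type
            and all(obj.get(key) == value for key, value in filters.items())
        ]


if __name__ == "__main__":
    store = MetadataStore({"experiment": {"assay": str, "organism": str}})
    store.submit("experiment", {"assay": "RNA-seq", "organism": "human"})
    store.submit("experiment", {"assay": "ChIP-seq", "organism": "mouse"})
    print(store.query("experiment", organism="human"))
```

Rejecting invalid submissions at the point of entry keeps every stored object queryable by the same property names, which is what makes a uniform query interface possible.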

Principles of metadata organization at the ENCODE data coordination center.

Hong, Eurie L; Sloan, Cricket A; Chan, Esther T; Davidson, Jean M; Malladi, Venkat S; Strattan, J Seth; Hitz, Benjamin C; Gabdank, Idan; Narayanan, Aditi K; Ho, Marcus; Lee, Brian T; Rowe, Laurence D; Dreszer, Timothy R; Roe, Greg R; Podduturi, Nikhil R; Tanaka, Forrest; Hilton, Jason A; Cherry, J Michael.

Database (Oxford) ; 2016, 2016.

Article in English | MEDLINE | ID: mdl-26980513

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center (DCC) is responsible for organizing, describing and providing access to the diverse data generated by the ENCODE project. The description of these data, known as metadata, includes the biological sample used as input, the protocols and assays performed on these samples, the data files generated from the results and the computational methods used to analyze the data. Here, we outline the principles and philosophy used to define the ENCODE metadata in order to create a metadata standard that can be applied to diverse assays and multiple genomic projects. In addition, we present how the data are validated and used by the ENCODE DCC in creating the ENCODE Portal (https://www.encodeproject.org/). Database URL: www.encodeproject.org.

Subjects

Computational Biology/methods , DNA/genetics , Genetic Databases , Algorithms , Animals , Caenorhabditis elegans , Computational Biology/standards , Data Collection , Drosophila melanogaster , High-Throughput Nucleotide Sequencing , Humans , Mice , Nucleic Acids/genetics , Quality Control , Reproducibility of Results , Sequence Alignment

Ontology application and use at the ENCODE DCC.

Malladi, Venkat S; Erickson, Drew T; Podduturi, Nikhil R; Rowe, Laurence D; Chan, Esther T; Davidson, Jean M; Hitz, Benjamin C; Ho, Marcus; Lee, Brian T; Miyasato, Stuart; Roe, Gregory R; Simison, Matt; Sloan, Cricket A; Strattan, J Seth; Tanaka, Forrest; Kent, W James; Cherry, J Michael; Hong, Eurie L.

Database (Oxford) ; 2015, 2015.

Article in English | MEDLINE | ID: mdl-25776021

ABSTRACT

The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a catalog of genomic annotations. To date, the project has generated over 4000 experiments across more than 350 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory network and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All ENCODE experimental data, metadata and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage and distribution to community resources and the scientific community. As the volume of data increases, the organization of experimental details becomes increasingly complicated and demands careful curation to identify related experiments. Here, we describe the ENCODE DCC's use of ontologies to standardize experimental metadata. We discuss how ontologies, when used to annotate metadata, provide improved searching capabilities and facilitate the ability to find connections within a set of experiments. Additionally, we provide examples of how ontologies are used to annotate ENCODE metadata and how the annotations can be identified via ontology-driven searches at the ENCODE portal. As genomic datasets grow larger and more interconnected, standardization of metadata becomes increasingly vital to allow for exploration and comparison of data between different scientific projects.

Subjects

Data Curation/methods , Genetic Databases , Gene Ontology , Gene Regulatory Networks/physiology , Molecular Sequence Annotation/methods , Genetic Transcription/physiology , Animals , Humans , Mice
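
The abstract above explains that annotating metadata with ontology terms improves searching and helps find connections within a set of experiments. A minimal sketch of that idea follows: a query for a broad term is expanded to all of its more specific descendants before matching. The small `IS_A` hierarchy, the experiment records, and the `EXP-` accessions are toy data invented for illustration and are not taken from any ontology release or from the ENCODE portal.

```python
from collections import deque

# Toy "is a" hierarchy: each term maps to its direct parent terms.
IS_A = {
    "T cell": ["lymphocyte"],
    "B cell": ["lymphocyte"],
    "lymphocyte": ["leukocyte"],
    "monocyte": ["leukocyte"],
    "hepatocyte": ["epithelial cell"],
}


def descendants(term, is_a):
    """Return the term plus every term below it in the hierarchy."""
    children = {}
    for child, parents in is_a.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)
    found, queue = {term}, deque([term])
    while queue:  # breadth-first walk down the hierarchy
        for child in children.get(queue.popleft(), []):
            if child not in found:
                found.add(child)
                queue.append(child)
    return found


def search(experiments, term, is_a):
    """Return accessions of experiments annotated at or below the term."""
    wanted = descendants(term, is_a)
    return [e["accession"] for e in experiments if e["cell_type"] in wanted]


if __name__ == "__main__":
    experiments = [
        {"accession": "EXP-1", "cell_type": "T cell"},
        {"accession": "EXP-2", "cell_type": "monocyte"},
        {"accession": "EXP-3", "cell_type": "hepatocyte"},
    ]
    print(search(experiments, "leukocyte", IS_A))
```

A plain text match on "leukocyte" would find none of these records. Expanding the query through the hierarchy retrieves the T cell and monocyte experiments, which is the kind of connection the abstract attributes to ontology-driven search.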
