1.
Cell ; 183(4): 905-917.e16, 2020 11 12.
Article in English | MEDLINE | ID: mdl-33186529

ABSTRACT

The generation of functional genomics datasets is surging, because they provide insight into gene regulation and organismal phenotypes (e.g., genes upregulated in cancer). The intent behind functional genomics experiments is not necessarily to study genetic variants, yet they pose privacy concerns due to their use of next-generation sequencing. Moreover, there is a great incentive to broadly share raw reads for better statistical power and general research reproducibility. Thus, we need new modes of sharing beyond traditional controlled-access models. Here, we develop a data-sanitization procedure allowing raw functional genomics reads to be shared while minimizing privacy leakage, enabling principled privacy-utility trade-offs. Our protocol works with traditional Illumina-based assays and newer technologies such as 10x single-cell RNA sequencing. It involves quantifying the privacy leakage in reads by statistically linking study participants to known individuals. We carried out these linkages using data from highly accurate reference genomes and more realistic environmental samples.


Subject(s)
Computer Security , Genomics , Privacy , Human Genome , Genotype , High-Throughput Nucleotide Sequencing , Humans , Phenotype , Phylogeny , Reproducibility of Results , RNA Sequence Analysis , Single-Cell Analysis
2.
Cell ; 158(3): 673-88, 2014 Jul 31.
Article in English | MEDLINE | ID: mdl-25083876

ABSTRACT

Trimethylation of histone H3 at lysine 4 (H3K4me3) is a chromatin modification known to mark the transcription start sites of active genes. Here, we show that H3K4me3 domains that spread more broadly over genes in a given cell type preferentially mark genes that are essential for the identity and function of that cell type. Using the broadest H3K4me3 domains as a discovery tool in neural progenitor cells, we identify novel regulators of these cells. Machine learning models reveal that the broadest H3K4me3 domains represent a distinct entity, characterized by increased marks of elongation. The broadest H3K4me3 domains also have more paused polymerase at their promoters, suggesting a unique transcriptional output. Indeed, genes marked by the broadest H3K4me3 domains exhibit enhanced transcriptional consistency and [corrected] increased transcriptional levels, and perturbation of H3K4me3 breadth leads to changes in transcriptional consistency. Thus, H3K4me3 breadth contains information that could ensure transcriptional precision at key cell identity/function genes.


Subject(s)
Cells/metabolism , Histone Code , Histones/metabolism , Genetic Transcription , Animals , Artificial Intelligence , Genomics , Humans , Lysine/metabolism , Methylation , Inbred C57BL Mice , Neural Stem Cells/metabolism , RNA Polymerase II/metabolism
3.
Nature ; 583(7818): 693-698, 2020 07.
Article in English | MEDLINE | ID: mdl-32728248

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) Project launched in 2003 with the long-term goal of developing a comprehensive map of functional elements in the human genome. These included genes, biochemical regions associated with gene regulation (for example, transcription factor binding sites, open chromatin, and histone marks) and transcript isoforms. The marks serve as sites for candidate cis-regulatory elements (cCREs) that may serve functional roles in regulating gene expression. The project has been extended to model organisms, particularly the mouse. In the third phase of ENCODE, nearly a million and more than 300,000 cCRE annotations have been generated for human and mouse, respectively, and these have provided a valuable resource for the scientific community.


Subject(s)
Genetic Databases , Genome/genetics , Genomics , Molecular Sequence Annotation , Animals , Binding Sites , Chromatin/genetics , Chromatin/metabolism , DNA Methylation , Genetic Databases/standards , Genetic Databases/trends , Gene Expression Regulation/genetics , Human Genome/genetics , Genomics/standards , Genomics/trends , Histones/metabolism , Humans , Mice , Molecular Sequence Annotation/standards , Quality Control , Nucleic Acid Regulatory Sequences/genetics , Transcription Factors/metabolism
4.
Nature ; 583(7818): 744-751, 2020 07.
Article in English | MEDLINE | ID: mdl-32728240

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) project has established a genomic resource for mammalian development, profiling a diverse panel of mouse tissues at 8 developmental stages from 10.5 days after conception until birth, including transcriptomes, methylomes and chromatin states. Here we systematically examined the state and accessibility of chromatin in the developing mouse fetus. In total we performed 1,128 chromatin immunoprecipitation with sequencing (ChIP-seq) assays for histone modifications and 132 assay for transposase-accessible chromatin using sequencing (ATAC-seq) assays for chromatin accessibility across 72 distinct tissue-stages. We used integrative analysis to develop a unified set of chromatin state annotations, infer the identities of dynamic enhancers and key transcriptional regulators, and characterize the relationship between chromatin state and accessibility during developmental gene regulation. We also leveraged these data to link enhancers to putative target genes and demonstrate tissue-specific enrichments of sequence variants associated with disease in humans. The mouse ENCODE data sets provide a compendium of resources for biomedical researchers and achieve, to our knowledge, the most comprehensive view of chromatin dynamics during mammalian fetal development to date.


Subject(s)
Chromatin/genetics , Chromatin/metabolism , Datasets as Topic , Fetal Development/genetics , Histones/metabolism , Molecular Sequence Annotation , Nucleic Acid Regulatory Sequences/genetics , Animals , Chromatin/chemistry , Chromatin Immunoprecipitation Sequencing , Disease/genetics , Genetic Enhancer Elements/genetics , Female , Developmental Gene Expression Regulation/genetics , Genetic Variation , Histones/chemistry , Humans , Male , Mice , Inbred C57BL Mice , Organ Specificity/genetics , Reproducibility of Results , Transposases/metabolism
7.
Mol Cell ; 65(4): 761-774.e5, 2017 Feb 16.
Article in English | MEDLINE | ID: mdl-28132844

ABSTRACT

We have developed a general progressive procedure, Active Interaction Mapping, to guide assembly of the hierarchy of functions encoding any biological system. Using this process, we assemble an ontology of functions comprising autophagy, a central recycling process implicated in numerous diseases. A first-generation model, built from existing gene networks in Saccharomyces, captures most known autophagy components in broad relation to vesicle transport, cell cycle, and stress response. Systematic analysis identifies synthetic-lethal interactions as most informative for further experiments; consequently, we saturate the model with 156,364 such measurements across autophagy-activating conditions. These targeted interactions provide more information about autophagy than all previous datasets, producing a second-generation ontology of 220 functions. Approximately half are previously unknown; we confirm roles for Gyp1 at the phagophore-assembly site, Atg24 in cargo engulfment, Atg26 in cytoplasm-to-vacuole targeting, and Ssd1, Did4, and others in selective and non-selective autophagy. The procedure and autophagy hierarchy are at http://atgo.ucsd.edu/.


Subject(s)
Autophagy/genetics , Gene Regulatory Networks , Genomics/methods , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , Systems Biology/methods , Autophagy-Related Proteins/genetics , Autophagy-Related Proteins/metabolism , Genetic Databases , Endosomal Sorting Complexes Required for Transport/genetics , Endosomal Sorting Complexes Required for Transport/metabolism , GTPase-Activating Proteins/genetics , GTPase-Activating Proteins/metabolism , Fungal Gene Expression Regulation , Glucosyltransferases/genetics , Glucosyltransferases/metabolism , Humans , Genetic Models , Pichia/genetics , Pichia/metabolism , Protein Interaction Maps , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Systems Integration
10.
Nucleic Acids Res ; 48(D1): D743-D748, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31612944

ABSTRACT

The Saccharomyces Genome Database (SGD; www.yeastgenome.org) maintains the official annotation of all genes in the Saccharomyces cerevisiae reference genome and aims to elucidate the function of these genes and their products by integrating manually curated experimental data. Technological advances have allowed researchers to profile RNA expression and identify transcripts at high resolution. These data can be configured in web-based genome browser applications for display to the general public. Accordingly, SGD has incorporated published transcript isoform data in our instance of JBrowse, a genome visualization platform. This resource will help clarify S. cerevisiae biological processes by furthering studies of transcriptional regulation, untranslated regions, genome engineering, and expression quantification in S. cerevisiae.


Subject(s)
Fungal Genome , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , Transcriptome , Computational Biology/methods , Genetic Databases , Genomics , Molecular Sequence Annotation , Open Reading Frames , Protein Isoforms , RNA-Seq , Reference Values , User-Computer Interface , Web Browser
11.
Nucleic Acids Res ; 48(D1): D882-D889, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31713622

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) is an ongoing collaborative research project aimed at identifying all the functional elements in the human and mouse genomes. Data generated by the ENCODE consortium are freely accessible at the ENCODE portal (https://www.encodeproject.org/), which is developed and maintained by the ENCODE Data Coordinating Center (DCC). Since the initial portal release in 2013, the ENCODE DCC has updated the portal to make ENCODE data more findable, accessible, interoperable and reusable. Here, we report on recent updates, including new ENCODE data and assays, ENCODE uniform data processing pipelines, new visualization tools, a dataset cart feature, unrestricted public access to ENCODE data on the cloud (Amazon Web Services open data registry, https://registry.opendata.aws/encode-project/) and more comprehensive tutorials and documentation.


Subject(s)
DNA/genetics , Genetic Databases , Human Genome , Software , Animals , Genomics , Humans , Mice
12.
Am J Hum Genet ; 100(6): 895-906, 2017 Jun 01.
Article in English | MEDLINE | ID: mdl-28552198

ABSTRACT

With advances in genomic sequencing technology, the number of reported gene-disease relationships has rapidly expanded. However, the evidence supporting these claims varies widely, confounding accurate evaluation of genomic variation in a clinical setting. Despite the critical need to differentiate clinically valid relationships from less well-substantiated relationships, standard guidelines for such evaluation do not currently exist. The NIH-funded Clinical Genome Resource (ClinGen) has developed a framework to define and evaluate the clinical validity of gene-disease pairs across a variety of Mendelian disorders. In this manuscript we describe a proposed framework to evaluate relevant genetic and experimental evidence supporting or contradicting a gene-disease relationship and the subsequent validation of this framework using a set of representative gene-disease pairs. The framework provides a semiquantitative measurement for the strength of evidence of a gene-disease relationship that correlates to a qualitative classification: "Definitive," "Strong," "Moderate," "Limited," "No Reported Evidence," or "Conflicting Evidence." Within the ClinGen structure, classifications derived with this framework are reviewed and confirmed or adjusted based on clinical expertise of appropriate disease experts. Detailed guidance for utilizing this framework and access to the curation interface is available on our website. This evidence-based, systematic method to assess the strength of gene-disease relationships will facilitate more knowledgeable utilization of genomic variants in clinical and research settings.


Subject(s)
Genetic Association Studies , Genetic Predisposition to Disease , Genomics , Humans , Reproducibility of Results
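
The abstract for this entry describes a semiquantitative evidence score that maps onto a qualitative classification. A minimal Python sketch of such a mapping follows. The abstract gives no numeric cut-offs, so the thresholds in `LOWER_BOUNDS`, the `replicated_over_time` flag used to separate "Strong" from "Definitive", and the `classify` function itself are illustrative assumptions, not the framework's actual rules.

```python
from bisect import bisect_right

# Hypothetical lower bounds for each tier; the abstract does not state
# any numeric thresholds, so these values are for illustration only.
LOWER_BOUNDS = [0.1, 7.0, 12.0]
LABELS = ["No Reported Evidence", "Limited", "Moderate", "Strong"]


def classify(score, replicated_over_time=False, conflicting=False):
    """Map a semiquantitative evidence score to a qualitative label."""
    if conflicting:
        # Contradictory evidence overrides the numeric score.
        return "Conflicting Evidence"
    # bisect_right counts how many lower bounds the score has reached.
    label = LABELS[bisect_right(LOWER_BOUNDS, score)]
    # Assumption: the top tier is promoted only with replication over time.
    if label == "Strong" and replicated_over_time:
        return "Definitive"
    return label


if __name__ == "__main__":
    for s in (0, 3, 9, 14):
        print(s, classify(s))
```

Keeping the cut-offs in one sorted list means a change to the tier boundaries touches a single line, and expert review (which the abstract says can confirm or adjust a classification) would sit outside this function.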
13.
Nucleic Acids Res ; 46(D1): D736-D742, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29140510

ABSTRACT

The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is an expertly curated database of literature-derived functional information for the model organism budding yeast, Saccharomyces cerevisiae. SGD constantly strives to synergize new types of experimental data and bioinformatics predictions with existing data, and to organize them into a comprehensive and up-to-date information resource. The primary mission of SGD is to facilitate research into the biology of yeast and to provide this wealth of information to advance, in many ways, research on other organisms, even those as evolutionarily distant as humans. To build such a bridge between biological kingdoms, SGD is curating data regarding yeast-human complementation, in which a human gene can successfully replace the function of a yeast gene, and/or vice versa. These data are manually curated from published literature, made available for download, and incorporated into a variety of analysis tools provided by SGD.


Subject(s)
Genetic Databases , Fungal Genome , Saccharomyces cerevisiae/genetics , Forecasting , Gene Ontology , Fungal Genes , Human Genome , Humans , Mutation , Species Specificity
14.
Nucleic Acids Res ; 46(D1): D794-D801, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29126249

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center has developed the ENCODE Portal database and website as the source for the data and metadata generated by the ENCODE Consortium. Two principles have motivated the design. First, experimental protocols, analytical procedures and the data themselves should be made publicly accessible through a coherent, web-based search and download interface. Second, the same interface should serve carefully curated metadata that record the provenance of the data and justify its interpretation in biological terms. Since its initial release in 2013 and in response to recommendations from consortium members and the wider community of scientists who use the Portal to access ENCODE data, the Portal has been regularly updated to better reflect these design principles. Here we report on these updates, including results from new experiments, uniformly processed data from other projects, new visualization tools and more comprehensive metadata to describe experiments and analyses. Additionally, the Portal is now home to (meta)data from related projects including Genomics of Gene Regulation, Roadmap Epigenome Project, Model organism ENCODE (modENCODE) and modERN. The Portal now makes available over 13,000 datasets and their accompanying metadata and can be accessed at: https://www.encodeproject.org/.


Subject(s)
DNA/genetics , Genetic Databases , Gene Components , Genomics , High-Throughput Nucleotide Sequencing , Metadata , Animals , Caenorhabditis elegans/genetics , Data Display , Datasets as Topic , Drosophila melanogaster/genetics , Forecasting , Human Genome , Humans , Mice/genetics , User-Computer Interface
15.
Nucleic Acids Res ; 45(D1): D128-D134, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27794554

ABSTRACT

RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within the context of a single species. The website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality. All RNAcentral data is provided for free and is available for browsing, bulk downloads, and programmatic access at http://rnacentral.org/.


Subject(s)
Nucleic Acid Databases , Untranslated RNA/chemistry , Animals , Genomics , Humans , Nucleotides/chemistry , RNA Sequence Analysis , Species Specificity
16.
Dev Biol ; 426(2): 155-164, 2017 06 15.
Article in English | MEDLINE | ID: mdl-27157655

ABSTRACT

The Xenopus community has embraced recent advances in sequencing technology, resulting in the accumulation of numerous RNA-Seq and ChIP-Seq datasets. However, easily accessing and comparing datasets generated by multiple laboratories is challenging. Thus, we have created a central space to view, search and analyze data, providing essential information on gene expression changes and regulatory elements present in the genome. XenMine (www.xenmine.org) is a user-friendly website containing published genomic datasets from both Xenopus tropicalis and Xenopus laevis. We have established an analysis pipeline where all published datasets are uniformly processed with the latest genome releases. Information from these datasets can be extracted and compared using an array of pre-built or custom templates. With these search tools, users can easily extract sequences for all putative regulatory domains surrounding a gene of interest, identify the expression values of a gene of interest over developmental time, and analyze lists of genes for gene ontology terms and publications. Additionally, XenMine hosts an in-house genome browser that allows users to visualize all available ChIP-Seq data, extract specifically marked sequences, and aid in identifying important regulatory elements within the genome. Altogether, XenMine is an excellent tool for visualizing, accessing and querying analyzed datasets rapidly and efficiently.


Subject(s)
Data Mining , Genetic Databases , Genome , Genomics/methods , Xenopus/genetics , Animals , Base Sequence , Datasets as Topic , Gene Expression , Gene Ontology , Internet , RNA/biosynthesis , RNA/genetics , Nucleic Acid Regulatory Sequences , Software
17.
Nucleic Acids Res ; 44(D1): D698-702, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26578556

ABSTRACT

The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at http://www.yeastgenome.org/variant-viewer.


Subject(s)
Genetic Databases , Genetic Variation , Fungal Genome , Saccharomyces cerevisiae/genetics , Molecular Sequence Annotation , Sequence Alignment , DNA Sequence Analysis , Protein Sequence Analysis , User-Computer Interface
18.
Nucleic Acids Res ; 44(D1): D726-32, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26527727

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) Project is in its third phase of creating a comprehensive catalog of functional elements in the human genome. This phase of the project includes an expansion of assays that measure diverse RNA populations, identify proteins that interact with RNA and DNA, probe regions of DNA hypersensitivity, and measure levels of DNA methylation in a wide range of cell and tissue types to identify putative regulatory elements. To date, results for almost 5000 experiments have been released for use by the scientific community. These data are available for searching, visualization and download at the new ENCODE Portal (www.encodeproject.org). The revamped ENCODE Portal provides new ways to browse and search the ENCODE data based on the metadata that describe the assays as well as summaries of the assays that focus on data provenance. In addition, it is a flexible platform that allows integration of genomic data from multiple projects. The portal experience was designed to improve access to ENCODE data by relying on metadata that allow reusability and reproducibility of the experiments.


Subject(s)
Genetic Databases , Human Genome , Genomics , Animals , DNA/metabolism , Genes , Humans , Mice , Proteins/metabolism , RNA/metabolism
19.
Nucleic Acids Res ; 43(Database issue): D123-9, 2015 01.
Article in English | MEDLINE | ID: mdl-25352543

ABSTRACT

The field of non-coding RNA biology has been hampered by the lack of availability of a comprehensive, up-to-date collection of accessioned RNA sequences. Here we present the first release of RNAcentral, a database that collates and integrates information from an international consortium of established RNA sequence databases. The initial release contains over 8.1 million sequences, including representatives of all major functional classes. A web portal (http://rnacentral.org) provides free access to data, search functionality, cross-references, source code and an integrated genome browser for selected species.


Subject(s)
Nucleic Acid Databases , Untranslated RNA/chemistry , Chromosome Mapping , Humans , Internet , Untranslated RNA/genetics , RNA Sequence Analysis
20.
Nucleic Acids Res ; 42(Database issue): D717-25, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24265222

ABSTRACT

The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the community resource for genomic, gene and protein information about the budding yeast Saccharomyces cerevisiae, containing a variety of functional information about each yeast gene and gene product. We have recently added regulatory information to SGD and present it on a new tabbed section of the Locus Summary entitled 'Regulation'. We are compiling transcriptional regulator-target gene relationships, which are curated from the literature at SGD or imported, with permission, from the YEASTRACT database. For nearly every S. cerevisiae gene, the Regulation page displays a table of annotations showing the regulators of that gene, and a graphical visualization of its regulatory network. For genes whose products act as transcription factors, the Regulation page also shows a table of their target genes, accompanied by a Gene Ontology enrichment analysis of the biological processes in which those genes participate. We additionally synthesize information from the literature for each transcription factor in a free-text Regulation Summary, and provide other information relevant to its regulatory function, such as DNA binding site motifs and protein domains. All of the regulation data are available for querying, analysis and download via YeastMine, the InterMine-based data warehouse system in use at SGD.


Subject(s)
Genetic Databases , Fungal Gene Expression Regulation , Fungal Genome , Saccharomyces cerevisiae/genetics , Binding Sites , Gene Regulatory Networks , Internet , Tertiary Protein Structure , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism , Transcription Factors/chemistry , Transcription Factors/metabolism , Genetic Transcription