Results 1 - 10 of 10
1.
Nucleic Acids Res ; 51(D1): D942-D949, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36420896

ABSTRACT

GENCODE produces high quality gene and transcript annotation for the human and mouse genomes. All GENCODE annotation is supported by experimental data and serves as a reference for genome biology and clinical genomics. The GENCODE consortium generates targeted experimental data, develops bioinformatic tools and carries out analyses that, along with externally produced data and methods, support the identification and annotation of transcript structures and the determination of their function. Here, we present an update on the annotation of human and mouse genes, including developments in the tools, data, analyses and major collaborations which underpin this progress. For example, we report the creation of a set of non-canonical ORFs identified in GENCODE transcripts, the LRGASP collaboration to assess the use of long transcriptomic data to build transcript models, the progress in collaborations with RefSeq and UniProt to increase convergence in the annotation of human and mouse protein-coding genes, the propagation of GENCODE across the human pan-genome and the development of new tools to support annotation of regulatory features by GENCODE. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.


Subject(s)
Computational Biology; Genome, Human; Humans; Animals; Mice; Molecular Sequence Annotation; Computational Biology/methods; Genome, Human/genetics; Transcriptome/genetics; Gene Expression Profiling; Databases, Genetic
2.
Nucleic Acids Res ; 49(D1): D916-D923, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33270111

ABSTRACT

The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.


Subject(s)
COVID-19/prevention & control; Computational Biology/methods; Databases, Genetic; Genomics/methods; Molecular Sequence Annotation/methods; SARS-CoV-2/genetics; Animals; COVID-19/epidemiology; COVID-19/virology; Epidemics; Humans; Internet; Mice; Pseudogenes/genetics; RNA, Long Noncoding/genetics; SARS-CoV-2/metabolism; SARS-CoV-2/physiology; Transcription, Genetic/genetics
3.
Nucleic Acids Res ; 49(D1): D884-D891, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33137190

ABSTRACT

The Ensembl project (https://www.ensembl.org) annotates genomes and disseminates genomic data for vertebrate species. We create detailed and comprehensive annotation of gene structures, regulatory elements and variants, and enable comparative genomics by inferring the evolutionary history of genes and genomes. Our integrated genomic data are made available in a variety of ways, including genome browsers, search interfaces, specialist tools such as the Ensembl Variant Effect Predictor, download files and programmatic interfaces. Here, we present recent Ensembl developments including two new website portals. Ensembl Rapid Release (http://rapid.ensembl.org) is designed to provide core tools and services for genomes as soon as possible and has been deployed to support large biodiversity sequencing projects. Our SARS-CoV-2 genome browser (https://covid-19.ensembl.org) integrates our own annotation with publicly available genomic data from numerous sources to facilitate the use of genomics in the international scientific response to the COVID-19 pandemic. We also report on other updates to our annotation resources, tools and services. All Ensembl data and software are freely available without restriction.


Subject(s)
Computational Biology/methods; Databases, Nucleic Acid; Genomics/methods; SARS-CoV-2/genetics; Vertebrates/genetics; Animals; COVID-19/epidemiology; COVID-19/virology; Humans; Internet; Molecular Sequence Annotation/methods; Pandemics; Vertebrates/classification
4.
Nucleic Acids Res ; 48(D1): D682-D688, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31691826

ABSTRACT

Ensembl (https://www.ensembl.org) is a system for generating and distributing genome annotation such as genes, variation, regulation and comparative genomics across the vertebrate subphylum and key model organisms. The Ensembl annotation pipeline is capable of integrating experimental and reference data from multiple providers into a single integrated resource. Here, we present 94 newly annotated and re-annotated genomes, bringing the total number of genomes offered by Ensembl to 227. This represents the single largest expansion of the resource since its inception. We also detail our continued efforts to improve human annotation, developments in our epigenome analysis and display, a new tool for imputing causal genes from genome-wide association studies and visualisation of variation within a 3D protein model. Finally, we present information on our new website. Both software and data are made available without restriction via our website, online tools platform and programmatic interfaces (available under an Apache 2.0 license), with data updates made available four times a year.


Subject(s)
Computational Biology/methods; Databases, Genetic; Epigenome; Molecular Sequence Annotation; Algorithms; Animals; Computer Graphics; Databases, Protein; Genetic Variation; Genome-Wide Association Study; Genomics; Histones/metabolism; Humans; Imaging, Three-Dimensional; Internet; Ligands; Search Engine; Software; Species Specificity; Transcriptome; User-Computer Interface; Web Browser
5.
Nucleic Acids Res ; 47(D1): D745-D751, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30407521

ABSTRACT

The Ensembl project (https://www.ensembl.org) makes key genomic data sets available to the entire scientific community without restrictions. Ensembl seeks to be a fundamental resource driving scientific progress by creating, maintaining and updating reference genome annotation and comparative genomics resources. This year we describe our new and expanded gene, variant and comparative annotation capabilities, which led to a 50% increase in the number of vertebrate genomes we support. We have also doubled the number of available human variants and added regulatory regions for many mouse cell types and developmental stages. Our data sets and tools are available via the Ensembl website as well as through a RESTful webservice, a Perl application programming interface and as data files for download.


Subject(s)
Databases, Genetic; Genome/genetics; Genomics; Vertebrates/genetics; Animals; Computational Biology/trends; Humans; Mice; Molecular Sequence Annotation; Software
6.
Nucleic Acids Res ; 47(D1): D766-D773, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30357393

ABSTRACT

The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl Biomart, Ensembl Perl and REST APIs as well as https://www.gencodegenes.org.


Subject(s)
Databases, Genetic; Genome, Human/genetics; Genomics; Pseudogenes/genetics; Animals; Computational Biology; Humans; Internet; Mice; Molecular Sequence Annotation; Software
7.
Nat Commun ; 8(1): 1092, 2017 10 23.
Article in English | MEDLINE | ID: mdl-29061983

ABSTRACT

Noncoding regulatory variants play a central role in the genetics of human diseases and in evolution. Here we measure allele-specific transcription factor binding occupancy of three liver-specific transcription factors between crosses of two inbred mouse strains to elucidate the regulatory mechanisms underlying transcription factor binding variations in mammals. Our results highlight the pre-eminence of cis-acting variants on transcription factor occupancy divergence. Transcription factor binding differences linked to cis-acting variants generally exhibit additive inheritance, while those linked to trans-acting variants are most often dominantly inherited. Cis-acting variants lead to local coordination of transcription factor occupancies that decay with distance; distal coordination is also observed and may be modulated by long-range chromatin contacts. Our results reveal the regulatory mechanisms that interplay to drive transcription factor occupancy, chromatin state, and gene expression in complex mammalian cell states.


Subject(s)
Chromatin/metabolism; Transcription Factors/metabolism; Alleles; Animals; Chromatin/genetics; Evolution, Molecular; Gene Expression Regulation, Fungal/genetics; Gene Expression Regulation, Fungal/physiology; Humans; Mice; Protein Binding/genetics; Protein Binding/physiology; Transcription Factors/genetics
8.
PLoS Genet ; 12(5): e1006024, 2016 05.
Article in English | MEDLINE | ID: mdl-27166679

ABSTRACT

Whether codon usage fine-tunes mRNA translation in mammals remains controversial, with recent papers suggesting that production of proteins in specific Gene Ontological (GO) pathways can be regulated by actively modifying the codon and anticodon pools in different cellular conditions. In this work, we compared the sequence content of genes in specific GO categories with the exonic genome background. Although a substantial fraction of variability in codon usage could be explained by random sampling, almost half of GO sets showed more variability in codon usage than expected by chance. Nevertheless, by quantifying translational efficiency in healthy and cancerous tissues in human and mouse, we demonstrated that a given tRNA pool can equally well translate many different sets of mRNAs, irrespective of their cell-type specificity. This disconnect between variations in codon usage and the stability of translational efficiency is best explained by differences in GC content between gene sets. GC variation across the mammalian genome is most likely a result of the interplay between genome repair and gene duplication mechanisms, rather than selective pressures caused by codon-driven translational rates. Consequently, codon usage differences in mammalian transcriptomes are most easily explained by well-understood mutational biases acting on the underlying genome.


Subject(s)
Codon/genetics; Protein Biosynthesis/genetics; Selection, Genetic; Transcriptome/genetics; Animals; Anticodon/genetics; Base Composition/genetics; Gene Expression; Gene Ontology; Genomics; Humans; Mammals; Mice; Models, Genetic; RNA, Messenger/genetics; RNA, Transfer/genetics
9.
Genome Res ; 25(2): 167-78, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25394363

ABSTRACT

To understand the evolutionary dynamics between transcription factor (TF) binding and gene expression in mammals, we compared transcriptional output and the binding intensities for three tissue-specific TFs in livers from four closely related mouse species. For each transcription factor, TF-dependent genes and the TF binding sites most likely to influence mRNA expression were identified by comparing mRNA expression levels between wild-type and TF knockout mice. Independent evolution was observed genome-wide between the rate of change in TF binding and the rate of change in mRNA expression across taxa, with the exception of a small number of TF-dependent genes. We also found that binding intensities are preferentially conserved near genes whose expression is dependent on the TF, and the conservation is shared among binding peaks in close proximity to each other near the TSS. Expression of TF-dependent genes typically showed an increased sensitivity to changes in binding levels as measured by mRNA abundance. Taken together, these results highlight a significant tolerance to evolutionary changes in TF binding intensity in mammalian transcriptional networks and suggest that some TF-dependent genes may be largely regulated by a single TF across evolution.


Subject(s)
Biological Evolution; Gene Expression Regulation; Mammals/genetics; Mammals/metabolism; Transcription Factors/metabolism; Animals; Binding Sites; CCAAT-Enhancer-Binding Proteins/metabolism; Chromatin Immunoprecipitation; Evolution, Molecular; Genetic Variation; Hepatocyte Nuclear Factor 4/metabolism; High-Throughput Nucleotide Sequencing; Mice; Mice, Knockout; Models, Statistical; Protein Binding; Species Specificity; Transcription Initiation Site; Transcription, Genetic
10.
Genome Res ; 24(11): 1797-807, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25122613

ABSTRACT

The genetic code is an abstraction of how mRNA codons and tRNA anticodons molecularly interact during protein synthesis; the stability and regulation of this interaction remain largely unexplored. Here, we characterized the expression of mRNA and tRNA genes quantitatively at multiple time points in two developing mouse tissues. We discovered that mRNA codon pools are highly stable over development and simply reflect the genomic background; in contrast, precise regulation of tRNA gene families is required to create the corresponding tRNA transcriptomes. The dynamic regulation of tRNA genes during development is controlled in order to generate an anticodon pool that closely corresponds to messenger RNAs. Thus, across development, the pools of mRNA codons and tRNA anticodons are invariant and highly correlated, revealing a stable molecular interaction interlocking transcription and translation.


Subject(s)
Brain/metabolism; Gene Expression Regulation, Developmental; Liver/metabolism; RNA, Messenger/genetics; RNA, Transfer/genetics; Transcriptome; Animals; Anticodon/genetics; Base Sequence; Brain/embryology; Chromatin Immunoprecipitation/methods; Codon/genetics; Computer Simulation; Embryo, Mammalian/embryology; Embryo, Mammalian/metabolism; Female; High-Throughput Nucleotide Sequencing/methods; Liver/embryology; Male; Mice, Inbred C57BL; Models, Genetic; Open Reading Frames/genetics; Principal Component Analysis; RNA, Messenger/metabolism; RNA, Transfer/metabolism; Time Factors