1.
Nature ; 604(7905): 310-315, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35388217

ABSTRACT

Comprehensive genome annotation is essential to understand the impact of clinically relevant variants. However, the absence of a standard for clinical reporting and browser display complicates the process of consistent interpretation and reporting. To address these challenges, Ensembl/GENCODE (ref. 1) and RefSeq (ref. 2) launched a joint initiative, the Matched Annotation from NCBI and EMBL-EBI (MANE) collaboration, to converge on human gene and transcript annotation and to jointly define a high-value set of transcripts and corresponding proteins. Here, we describe the MANE transcript sets for use as universal standards for variant reporting and browser display. The MANE Select set identifies a representative transcript for each human protein-coding gene, whereas the MANE Plus Clinical set provides additional transcripts at loci where the Select transcripts alone are not sufficient to report all currently known clinical variants. Each MANE transcript represents an exact match between the exonic sequences of an Ensembl/GENCODE transcript and its counterpart in RefSeq such that the identifiers can be used synonymously. We have now released MANE Select transcripts for 97% of human protein-coding genes, including all American College of Medical Genetics and Genomics Secondary Findings list v3.0 (ref. 3) genes. MANE transcripts are accessible from major genome browsers and key resources. Widespread adoption of these transcript sets will increase the consistency of reporting, facilitate the exchange of data regardless of the annotation source and help to streamline clinical interpretation.


Subject(s)
Computational Biology; Databases, Genetic; Genomics; Genome; Humans; Information Dissemination; Molecular Sequence Annotation; National Library of Medicine (U.S.); United States
2.
Nat Methods ; 21(7): 1349-1363, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38849569

ABSTRACT

The Long-read RNA-Seq Genome Annotation Assessment Project Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. Using different protocols and sequencing platforms, the consortium generated over 427 million long-read sequences from complementary DNA and direct RNA datasets, encompassing human, mouse and manatee species. Developers utilized these data to address challenges in transcript isoform detection, quantification and de novo transcript detection. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. Incorporating additional orthogonal data and replicate samples is advised when aiming to detect rare and novel transcripts or using reference-free approaches. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.


Subject(s)
Gene Expression Profiling; RNA-Seq; Humans; Animals; Mice; RNA-Seq/methods; Gene Expression Profiling/methods; Transcriptome; Sequence Analysis, RNA/methods; Molecular Sequence Annotation/methods
3.
Nucleic Acids Res ; 51(D1): D942-D949, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36420896

ABSTRACT

GENCODE produces high quality gene and transcript annotation for the human and mouse genomes. All GENCODE annotation is supported by experimental data and serves as a reference for genome biology and clinical genomics. The GENCODE consortium generates targeted experimental data, develops bioinformatic tools and carries out analyses that, along with externally produced data and methods, support the identification and annotation of transcript structures and the determination of their function. Here, we present an update on the annotation of human and mouse genes, including developments in the tools, data, analyses and major collaborations which underpin this progress. For example, we report the creation of a set of non-canonical ORFs identified in GENCODE transcripts, the LRGASP collaboration to assess the use of long transcriptomic data to build transcript models, the progress in collaborations with RefSeq and UniProt to increase convergence in the annotation of human and mouse protein-coding genes, the propagation of GENCODE across the human pan-genome and the development of new tools to support annotation of regulatory features by GENCODE. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.


Subject(s)
Computational Biology; Genome, Human; Humans; Animals; Mice; Molecular Sequence Annotation; Computational Biology/methods; Genome, Human/genetics; Transcriptome/genetics; Gene Expression Profiling; Databases, Genetic
4.
Nucleic Acids Res ; 49(D1): D916-D923, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33270111

ABSTRACT

The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.


Subject(s)
COVID-19/prevention & control; Computational Biology/methods; Databases, Genetic; Genomics/methods; Molecular Sequence Annotation/methods; SARS-CoV-2/genetics; Animals; COVID-19/epidemiology; COVID-19/virology; Epidemics; Humans; Internet; Mice; Pseudogenes/genetics; RNA, Long Noncoding/genetics; SARS-CoV-2/metabolism; SARS-CoV-2/physiology; Transcription, Genetic/genetics
5.
Nucleic Acids Res ; 47(D1): D745-D751, 2019 Jan 08.
Article in English | MEDLINE | ID: mdl-30407521

ABSTRACT

The Ensembl project (https://www.ensembl.org) makes key genomic data sets available to the entire scientific community without restrictions. Ensembl seeks to be a fundamental resource driving scientific progress by creating, maintaining and updating reference genome annotation and comparative genomics resources. This year we describe our new and expanded gene, variant and comparative annotation capabilities, which led to a 50% increase in the number of vertebrate genomes we support. We have also doubled the number of available human variants and added regulatory regions for many mouse cell types and developmental stages. Our data sets and tools are available via the Ensembl website as well as through a RESTful web service, a Perl application programming interface and as data files for download.


Subject(s)
Databases, Genetic; Genome/genetics; Genomics; Vertebrates/genetics; Animals; Computational Biology/trends; Humans; Mice; Molecular Sequence Annotation; Software
6.
Nucleic Acids Res ; 47(D1): D766-D773, 2019 Jan 08.
Article in English | MEDLINE | ID: mdl-30357393

ABSTRACT

The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl Biomart, Ensembl Perl and REST APIs as well as https://www.gencodegenes.org.


Subject(s)
Databases, Genetic; Genome, Human/genetics; Genomics; Pseudogenes/genetics; Animals; Computational Biology; Humans; Internet; Mice; Molecular Sequence Annotation; Software
7.
Nucleic Acids Res ; 46(D1): D754-D761, 2018 Jan 04.
Article in English | MEDLINE | ID: mdl-29155950

ABSTRACT

The Ensembl project has been aggregating, processing, integrating and redistributing genomic datasets since the initial releases of the draft human genome, with the aim of accelerating genomics research through rapid open distribution of public data. Large amounts of raw data are thus transformed into knowledge, which is made available via a multitude of channels, in particular our browser (http://www.ensembl.org). Over time, we have expanded in multiple directions. First, our resources describe multiple fields of genomics, in particular gene annotation, comparative genomics, genetics and epigenomics. Second, we cover a growing number of genome assemblies; Ensembl Release 90 contains exactly 100. Third, our databases feed simultaneously into an array of services designed around different use cases, ranging from quick browsing to genome-wide bioinformatic analysis. We present here the latest developments of the Ensembl project, with a focus on managing an increasing number of assemblies, supporting efforts in genome interpretation and improving our browser.


Subject(s)
Databases, Genetic; Datasets as Topic; Genome; Information Dissemination; Animals; Epigenomics; Genome, Human; Genome-Wide Association Study; Genomics; High-Throughput Nucleotide Sequencing; Humans; Molecular Sequence Annotation; Vertebrates/genetics; Web Browser
8.
Nucleic Acids Res ; 46(D1): D221-D228, 2018 Jan 04.
Article in English | MEDLINE | ID: mdl-29126148

ABSTRACT

The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community.


Subject(s)
Consensus Sequence; Databases, Genetic; Open Reading Frames; Animals; Data Curation/methods; Data Curation/standards; Databases, Genetic/standards; Guidelines as Topic; Humans; Mice; Molecular Sequence Annotation; National Library of Medicine (U.S.); United States; User-Computer Interface
9.
Genome Res ; 26(1): 130-9, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26560630

ABSTRACT

We have generated an improved assembly and gene annotation of the pig X Chromosome, and a first draft assembly of the pig Y Chromosome, by sequencing BAC and fosmid clones from Duroc animals and incorporating information from optical mapping and fiber-FISH. The X Chromosome carries 1033 annotated genes, 690 of which are protein coding. Gene order closely matches that found in primates (including humans) and carnivores (including cats and dogs), which is inferred to be ancestral. Nevertheless, several protein-coding genes present on the human X Chromosome were absent from the pig, and 38 pig-specific X-chromosomal genes were annotated, 22 of which were olfactory receptors. The pig Y-specific Chromosome sequence generated here comprises 30 megabases (Mb). A 15-Mb subset of this sequence was assembled, revealing two clusters of male-specific low copy number genes, separated by an ampliconic region including the HSFY gene family, which together make up most of the short arm. Both clusters contain palindromes with high sequence identity, presumably maintained by gene conversion. Many of the ancestral X-related genes previously reported in at least one mammalian Y Chromosome are represented either as active genes or partial sequences. This sequencing project has allowed us to identify genes (both single copy and amplified) on the pig Y Chromosome, to compare the pig X and Y Chromosomes for homologous sequences, and thereby to reveal mechanisms underlying pig X and Y Chromosome evolution.


Subject(s)
Chromosomes, Mammalian/genetics; Evolution, Molecular; Swine/genetics; X Chromosome/genetics; Y Chromosome/genetics; Animals; Base Sequence; Cats/genetics; Dogs/genetics; Female; Gene Conversion; Gene Expression; Gene Library; Gene Order; Humans; Male; Molecular Sequence Data; Sequence Alignment; Sequence Analysis, DNA
10.
Nature ; 496(7446): 498-503, 2013 Apr 25.
Article in English | MEDLINE | ID: mdl-23594743

ABSTRACT

Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.


Subject(s)
Conserved Sequence/genetics; Genome/genetics; Zebrafish/genetics; Animals; Chromosomes/genetics; Evolution, Molecular; Female; Genes/genetics; Genome, Human/genetics; Genomics; Humans; Male; Meiosis/genetics; Molecular Sequence Annotation; Pseudogenes/genetics; Reference Standards; Sex Determination Processes/genetics; Zebrafish Proteins/genetics
11.
J Virol ; 89(1): 428-42, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25320324

ABSTRACT

The alphaherpesvirus pseudorabies virus (PrV) establishes latency primarily in neurons of trigeminal ganglia when only the transcription of the latency-associated transcript (LAT) locus is detected. Eleven microRNAs (miRNAs) cluster within the LAT, suggesting a role in establishment and/or maintenance of latency. We generated a mutant (M) PrV deleted of nine miRNA genes which displayed properties that were almost identical to those of the parental PrV wild type (WT) during propagation in vitro. Fifteen pigs were experimentally infected with either WT or M virus or were mock infected. Similar levels of virus excretion and host antibody response were observed in all infected animals. At 62 days postinfection, trigeminal ganglia were excised and profiled by deep sequencing and quantitative RT-PCR. Latency was established in all infected animals without evidence of viral reactivation, demonstrating that miRNAs are not essential for this process. Lower levels of the large latency transcript (LLT) were found in ganglia infected by M PrV than in those infected by WT PrV. All PrV miRNAs were expressed, with highest expression observed for prv-miR-LLT1, prv-miR-LLT2 (in WT ganglia), and prv-miR-LLT10 (in both WT and M ganglia). No evidence of differentially expressed porcine miRNAs was found. Fifty-four porcine genes were differentially expressed between WT, M, and control ganglia. Both viruses triggered a strong host immune response, but in M ganglia gene upregulation was prevalent. Pathway analyses indicated that several biofunctions, including those related to cell-mediated immune response and the migration of dendritic cells, were impaired in M ganglia. These findings are consistent with a function of the LAT locus in the modulation of host response for maintaining a latent state. IMPORTANCE: This study provides a thorough reference on the establishment of latency by PrV in its natural host, the pig. Our results corroborate the evidence obtained from the study of several LAT mutants of other alphaherpesviruses encoding miRNAs from their LAT regions. Neither PrV miRNA expression nor high LLT expression levels are essential to achieve latency in trigeminal ganglia. Once latency is established by PrV, the only remarkable differences are found in the pattern of host response. This indicates that, as in herpes simplex virus, LAT functions as an immune evasion locus.


Subject(s)
Herpesvirus 1, Suid/physiology; Host-Pathogen Interactions; Pseudorabies/immunology; Pseudorabies/virology; Trigeminal Ganglion/immunology; Trigeminal Ganglion/virology; Virus Latency; Animals; Gene Expression Profiling; Herpesvirus 1, Suid/immunology; Immunity, Cellular; MicroRNAs; Sequence Deletion; Swine; Virus Replication
12.
Nucleic Acids Res ; 42(Database issue): D771-9, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24316575

ABSTRACT

The Vertebrate Genome Annotation (VEGA) database (http://vega.sanger.ac.uk), initially designed as a community resource for browsing manual annotation of the human genome project, now contains five reference genomes (human, mouse, zebrafish, pig and rat). Its introduction pages have been redesigned to enable the user to easily navigate between whole genomes and smaller multi-species haplotypic regions of interest such as the major histocompatibility complex. The VEGA browser is unique in that annotation is updated via the Human And Vertebrate Analysis aNd Annotation (HAVANA) update track every 2 weeks, allowing single gene updates to be made publicly available to the research community quickly. The user can now access different haplotypic subregions more easily, such as those from the non-obese diabetic mouse, and display them in a more intuitive way using the comparative tools. We also highlight how the user can browse manually annotated updated patches from the Genome Reference Consortium (GRC).


Subject(s)
Databases, Genetic; Genome; Molecular Sequence Annotation; Animals; Genome, Human; Genomics; Humans; Internet; Mice; Mice, Inbred NOD; Mice, Knockout; Rats; Swine/genetics; Zebrafish/genetics
13.
Nucleic Acids Res ; 42(Database issue): D865-72, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24217909

ABSTRACT

The Consensus Coding Sequence (CCDS) project (http://www.ncbi.nlm.nih.gov/CCDS/) is a collaborative effort to maintain a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assemblies by the National Center for Biotechnology Information (NCBI) and Ensembl genome annotation pipelines. Identical annotations that pass quality assurance tests are tracked with a stable identifier (CCDS ID). Members of the collaboration, who are from NCBI, the Wellcome Trust Sanger Institute and the University of California Santa Cruz, provide coordinated and continuous review of the dataset to ensure high-quality CCDS representations. We describe here the current status and recent growth in the CCDS dataset, as well as recent changes to the CCDS web and FTP sites. These changes include more explicit reporting about the NCBI and Ensembl annotation releases being compared, new search and display options, the addition of biologically descriptive information and our approach to representing genes for which support evidence is incomplete. We also present a summary of recent and future curation targets.


Subject(s)
Databases, Genetic; Proteins/genetics; Animals; Exons; Genomics; Humans; Internet; Mice; Molecular Sequence Annotation; Sequence Analysis
14.
Brief Bioinform ; 14(5): 528-37, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23803301

ABSTRACT

The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.


Subject(s)
Biological Science Disciplines/education; Computational Biology/education; Curriculum; Data Mining; Database Management Systems; Programming Languages; Software Design; Teaching
15.
Brief Bioinform ; 13(3): 383-9, 2012 May.
Article in English | MEDLINE | ID: mdl-22110242

ABSTRACT

Funding bodies are increasingly recognizing the need to provide graduates and researchers with access to short intensive courses in a variety of disciplines, in order both to improve the general skills base and to provide solid foundations on which researchers may build their careers. In response to the development of 'high-throughput biology', the need for training in the field of bioinformatics, in particular, is seeing a resurgence: it has been defined as a key priority by many institutions and research programmes and is now an important component of many grant proposals. Nevertheless, when it comes to planning and preparing to meet such training needs, tension arises between the reward structures that predominate in the scientific community, which compel individuals to publish or perish, and the time that must be devoted to the design, delivery and maintenance of high-quality training materials. Conversely, there is much relevant teaching material and training expertise available worldwide that, were it properly organized, could be exploited by anyone who needs to provide training or needs to set up a new course. To do this, however, the materials would have to be centralized in a database and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs, and review it with respect to similar initiatives and collections.


Subject(s)
Computational Biology/education; Community Networks; Humans; Research Personnel/education
16.
Bioinformatics ; 29(15): 1919-21, 2013 Aug 01.
Article in English | MEDLINE | ID: mdl-23742982

ABSTRACT

SUMMARY: We present iAnn, an open source community-driven platform for dissemination of life science events, such as courses, conferences and workshops. iAnn allows automatic visualisation and integration of customised event reports. A central repository lies at the core of the platform: curators add submitted events, and these are subsequently accessed via web services. Thus, once an iAnn widget is incorporated into a website, it permanently shows timely relevant information as if it were native to the remote site. At the same time, announcements submitted to the repository are automatically disseminated to all portals that query the system. To facilitate the visualization of announcements, iAnn provides powerful filtering options and views, integrated in Google Maps and Google Calendar. All iAnn widgets are freely available. AVAILABILITY: http://iann.pro/iannviewer CONTACT: manuel.corpas@tgac.ac.uk.


Subject(s)
Biological Science Disciplines; Software; Anniversaries and Special Events; Congresses as Topic; Internet
17.
BMC Genomics ; 14: 332, 2013 May 15.
Article in English | MEDLINE | ID: mdl-23676093

ABSTRACT

BACKGROUND: The domestic pig is known as an excellent model for human immunology and the two species share many pathogens. Susceptibility to infectious disease is one of the major constraints on swine performance, yet the structure and function of genes comprising the pig immunome are not well-characterized. The completion of the pig genome provides the opportunity to annotate the pig immunome, and compare and contrast pig and human immune systems. RESULTS: The Immune Response Annotation Group (IRAG) used computational curation and manual annotation of the swine genome assembly 10.2 (Sscrofa10.2) to refine the currently available automated annotation of 1,369 immunity-related genes through sequence-based comparison to genes in other species. Within these genes, we annotated 3,472 transcripts. Annotation provided evidence for gene expansions in several immune response families, and identified artiodactyl-specific expansions in the cathelicidin and type 1 Interferon families. We found gene duplications for 18 genes, including 13 immune response genes and five non-immune response genes discovered in the annotation process. Manual annotation provided evidence for many new alternative splice variants and 8 gene duplications. Over 1,100 transcripts without porcine sequence evidence were detected using cross-species annotation. We used a functional approach to discover and accurately annotate porcine immune response genes. A co-expression clustering analysis of transcriptomic data from selected experimental infections or immune stimulations of blood, macrophages or lymph nodes identified a large cluster of genes that exhibited a correlated positive response upon infection across multiple pathogens or immune stimuli. Interestingly, this gene cluster (cluster 4) is enriched for known general human immune response genes, yet contains many un-annotated porcine genes. A phylogenetic analysis of the encoded proteins of cluster 4 genes showed that 15% exhibited an accelerated evolution as compared to 4.1% across the entire genome. CONCLUSIONS: This extensive annotation dramatically extends the genome-based knowledge of the molecular genetics and structure of a major portion of the porcine immunome. Our complementary functional approach using co-expression during immune response has provided new putative immune response annotation for over 500 porcine genes. Our phylogenetic analysis of this core immunome cluster confirms rapid evolutionary change in this set of genes, and that, as in other species, such genes are important components of the pig's adaptation to pathogen challenge over evolutionary time. These comprehensive and integrated analyses increase the value of the porcine genome sequence and provide important tools for global analyses and data-mining of the porcine immune response.


Subject(s)
Genomics; Immunity/genetics; Molecular Sequence Annotation; Swine/genetics; Swine/immunology; Animals; Cattle; Evolution, Molecular; Gene Duplication; Humans; Immunoglobulins/genetics; Mice; Models, Molecular; Protein Conformation; Receptors, Antigen, T-Cell/genetics; Receptors, KIR/genetics; Selection, Genetic; Species Specificity
18.
Biochem J ; 442(3): 733-42, 2012 Mar 15.
Article in English | MEDLINE | ID: mdl-22132794

ABSTRACT

The genes for CA1Pase (2-carboxy-D-arabinitol-1-bisphosphate phosphatase) from French bean, wheat, Arabidopsis and tobacco were identified and cloned. The deduced protein sequence included an N-terminal motif identical with the PGM (phosphoglycerate mutase) active site sequence [LIVM]-x-R-H-G-[EQ]-x-x-[WN]. The corresponding gene from wheat coded for an enzyme with the properties published for CA1Pase. The expressed protein lacked PGM activity but rapidly dephosphorylated 2,3-DPG (2,3-diphosphoglycerate) to 2-phosphoglycerate. DTT (dithiothreitol) activation and GSSG inactivation of this enzyme were pH-sensitive, the greatest difference being apparent at pH 8. The presence of the expressed protein during in vitro measurement of Rubisco (ribulose-1,5-bisphosphate carboxylase/oxygenase) activity prevented a progressive decline in Rubisco turnover. This was due to the removal of an inhibitory bisphosphate that was present in the RuBP (ribulose-1,5-bisphosphate) preparation, and was found to be PDBP (D-glycero-2,3-pentodiulose-1,5-bisphosphate). The substrate specificity of the expressed protein indicates a role for CA1Pase in the removal of 'misfire' products of Rubisco.
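
Editorial note: the active-site motif quoted in this abstract, [LIVM]-x-R-H-G-[EQ]-x-x-[WN], is written in PROSITE-style notation. As a minimal sketch (not the authors' method), the Python below converts that notation to a regular expression and scans a sequence. The helper name `prosite_to_regex` and the toy sequence are hypothetical and chosen only for illustration; the converter assumes the pattern uses only single residues, bracketed residue sets and `x` wildcards, and it does not handle repeats such as `x(2)`, exclusions in braces, or anchors.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern to a Python regex string.

    Hypothetical helper: supports only literal residues, [..] residue
    sets and the 'x' wildcard, separated by hyphens.
    """
    parts = []
    for element in pattern.split("-"):
        if element == "x":
            parts.append(".")      # 'x' means any single residue
        else:
            parts.append(element)  # literals and [..] sets are already valid regex
    return "".join(parts)


# Motif exactly as quoted in the abstract.
PGM_MOTIF = "[LIVM]-x-R-H-G-[EQ]-x-x-[WN]"
regex = re.compile(prosite_to_regex(PGM_MOTIF))

# Toy sequence invented for this example; it is not a real CA1Pase sequence.
toy_sequence = "MSKAVLVRHGESEWNKLA"
match = regex.search(toy_sequence)
if match:
    print(f"motif found at {match.start()}-{match.end()}: {match.group()}")
```

The same compiled pattern could be applied to real protein sequences read from a FASTA file; for the full PROSITE syntax a dedicated parser would be the safer choice.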


Subject(s)
Phosphoric Monoester Hydrolases/metabolism; Plant Proteins/metabolism; Ribulose-Bisphosphate Carboxylase/metabolism; Amino Acid Sequence; Arabidopsis/enzymology; Kinetics; Molecular Sequence Data; Pentosephosphates/metabolism; Phaseolus/enzymology; Substrate Specificity; Nicotiana/enzymology; Triticum/enzymology
19.
bioRxiv ; 2023 Jul 27.
Article in English | MEDLINE | ID: mdl-37546854

ABSTRACT

The Long-read RNA-Seq Genome Annotation Assessment Project (LRGASP) Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. The consortium generated over 427 million long-read sequences from cDNA and direct RNA datasets, encompassing human, mouse, and manatee species, using different protocols and sequencing platforms. These data were utilized by developers to address challenges in transcript isoform detection and quantification, as well as de novo transcript isoform identification. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. When aiming to detect rare and novel transcripts or when using reference-free approaches, incorporating additional orthogonal data and replicate samples is advised. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.

20.
Nature ; 440(7087): 1045-9, 2006 Apr 20.
Article in English | MEDLINE | ID: mdl-16625196

ABSTRACT

Chromosome 17 is unusual among the human chromosomes in many respects. It is the largest human autosome with orthology to only a single mouse chromosome, mapping entirely to the distal half of mouse chromosome 11. Chromosome 17 is rich in protein-coding genes, having the second highest gene density in the genome. It is also enriched in segmental duplications, ranking third in density among the autosomes. Here we report a finished sequence for human chromosome 17, as well as a structural comparison with the finished sequence for mouse chromosome 11, the first finished mouse chromosome. Comparison of the orthologous regions reveals striking differences. In contrast to the typical pattern seen in mammalian evolution, the human sequence has undergone extensive intrachromosomal rearrangement, whereas the mouse sequence has been remarkably stable. Moreover, although the human sequence has a high density of segmental duplication, the mouse sequence has a very low density. Notably, these segmental duplications correspond closely to the sites of structural rearrangement, demonstrating a link between duplication and rearrangement. Examination of the main classes of duplicated segments provides insight into the dynamics underlying expansion of chromosome-specific, low-copy repeats in the human genome.


Subject(s)
Chromosomes, Human, Pair 17/genetics; Evolution, Molecular; Animals; Base Composition; Gene Duplication; Humans; Long Interspersed Nucleotide Elements/genetics; Mice; Sequence Analysis, DNA; Short Interspersed Nucleotide Elements/genetics; Synteny/genetics