1.
Nature ; 604(7905): 310-315, 2022 04.
Article in English | MEDLINE | ID: mdl-35388217

ABSTRACT

Comprehensive genome annotation is essential to understand the impact of clinically relevant variants. However, the absence of a standard for clinical reporting and browser display complicates the process of consistent interpretation and reporting. To address these challenges, Ensembl/GENCODE (ref. 1) and RefSeq (ref. 2) launched a joint initiative, the Matched Annotation from NCBI and EMBL-EBI (MANE) collaboration, to converge on human gene and transcript annotation and to jointly define a high-value set of transcripts and corresponding proteins. Here, we describe the MANE transcript sets for use as universal standards for variant reporting and browser display. The MANE Select set identifies a representative transcript for each human protein-coding gene, whereas the MANE Plus Clinical set provides additional transcripts at loci where the Select transcripts alone are not sufficient to report all currently known clinical variants. Each MANE transcript represents an exact match between the exonic sequences of an Ensembl/GENCODE transcript and its counterpart in RefSeq such that the identifiers can be used synonymously. We have now released MANE Select transcripts for 97% of human protein-coding genes, including all American College of Medical Genetics and Genomics Secondary Findings list v3.0 (ref. 3) genes. MANE transcripts are accessible from major genome browsers and key resources. Widespread adoption of these transcript sets will increase the consistency of reporting, facilitate the exchange of data regardless of the annotation source and help to streamline clinical interpretation.


Subject(s)
Computational Biology; Databases, Genetic; Genomics; Genome; Humans; Information Dissemination; Molecular Sequence Annotation; National Library of Medicine (U.S.); United States
2.
Nucleic Acids Res ; 51(D1): D977-D985, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350656

ABSTRACT

The NHGRI-EBI GWAS Catalog (www.ebi.ac.uk/gwas) is a FAIR knowledgebase providing detailed, structured, standardised and interoperable genome-wide association study (GWAS) data to >200 000 users per year from academic research, healthcare and industry. The Catalog contains variant-trait associations and supporting metadata for >45 000 published GWAS across >5000 human traits, and >40 000 full P-value summary statistics datasets. Content is curated from publications or acquired via author submission of prepublication summary statistics through a new submission portal and validation tool. GWAS data volume has vastly increased in recent years. We have updated our software to meet this scaling challenge and to enable rapid release of submitted summary statistics. The scope of the repository has expanded to include additional data types of high interest to the community, including sequencing-based GWAS, gene-based analyses and copy number variation analyses. Community outreach has increased the number of shared datasets from under-represented traits, e.g. cancer, and we continue to contribute to awareness of the lack of population diversity in GWAS. Interoperability of the Catalog has been enhanced through links to other resources including the Polygenic Score Catalog and the International Mouse Phenotyping Consortium, refinements to GWAS trait annotation, and the development of a standard format for GWAS data.


Subject(s)
Genome-Wide Association Study; Knowledge Bases; Animals; Humans; Mice; DNA Copy Number Variations; National Human Genome Research Institute (U.S.); Phenotype; Polymorphism, Single Nucleotide; Software; United States
3.
Nucleic Acids Res ; 51(D1): D942-D949, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36420896

ABSTRACT

GENCODE produces high quality gene and transcript annotation for the human and mouse genomes. All GENCODE annotation is supported by experimental data and serves as a reference for genome biology and clinical genomics. The GENCODE consortium generates targeted experimental data, develops bioinformatic tools and carries out analyses that, along with externally produced data and methods, support the identification and annotation of transcript structures and the determination of their function. Here, we present an update on the annotation of human and mouse genes, including developments in the tools, data, analyses and major collaborations which underpin this progress. For example, we report the creation of a set of non-canonical ORFs identified in GENCODE transcripts, the LRGASP collaboration to assess the use of long transcriptomic data to build transcript models, the progress in collaborations with RefSeq and UniProt to increase convergence in the annotation of human and mouse protein-coding genes, the propagation of GENCODE across the human pan-genome and the development of new tools to support annotation of regulatory features by GENCODE. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.


Subject(s)
Computational Biology; Genome, Human; Humans; Animals; Mice; Molecular Sequence Annotation; Computational Biology/methods; Genome, Human/genetics; Transcriptome/genetics; Gene Expression Profiling; Databases, Genetic
4.
Genet Med ; 26(2): 101029, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37982373

ABSTRACT

PURPOSE: The terminology used for gene-disease curation and variant annotation to describe inheritance, allelic requirement, and both sequence and functional consequences of a variant is currently not standardized. There is considerable discrepancy in the literature and across clinical variant reporting in the derivation and application of terms. Here, we standardize the terminology for the characterization of disease-gene relationships to facilitate harmonized global curation and to support variant classification within the ACMG/AMP framework. METHODS: Terminology for inheritance, allelic requirement, and both structural and functional consequences of a variant used by Gene Curation Coalition members and partner organizations was collated and reviewed. Harmonized terminology with definitions and use examples was created, reviewed, and validated. RESULTS: We present a standardized terminology to describe gene-disease relationships, and to support variant annotation. We demonstrate application of the terminology for classification of variation in the ACMG SF 2.0 genes recommended for reporting of secondary findings. Consensus terms were agreed and formalized in both Sequence Ontology (SO) and Human Phenotype Ontology (HPO) ontologies. Gene Curation Coalition member groups intend to use or map to these terms in their respective resources. CONCLUSION: The terminology standardization presented here will improve harmonization, facilitate the pooling of curation datasets across international curation efforts and, in turn, improve consistency in variant classification and genetic test interpretation.


Subject(s)
Genetic Testing; Genetic Variation; Humans; Alleles; Databases, Genetic
5.
J Med Genet ; 60(8): 810-818, 2023 08.
Article in English | MEDLINE | ID: mdl-36669873

ABSTRACT

BACKGROUND: Genomic variant prioritisation is one of the most significant bottlenecks to mainstream genomic testing in healthcare. Tools to improve precision while ensuring high recall are critical to successful mainstream clinical genomic testing, in particular for whole genome sequencing where millions of variants must be considered for each patient. METHODS: We developed EyeG2P, a publicly available database and web application using the Ensembl Variant Effect Predictor. EyeG2P is tailored for efficient variant prioritisation for individuals with inherited ophthalmic conditions. We assessed the sensitivity of EyeG2P in 1234 individuals with a broad range of eye conditions who had previously received a confirmed molecular diagnosis through routine genomic diagnostic approaches. For a prospective cohort of 83 individuals, we assessed the precision of EyeG2P in comparison with routine diagnostic approaches. For 10 additional individuals, we assessed the utility of EyeG2P for whole genome analysis. RESULTS: EyeG2P had 99.5% sensitivity for genomic variants previously identified as clinically relevant through routine diagnostic analysis (n=1234 individuals). Prospectively, EyeG2P enabled a significant increase in precision (35% on average) in comparison with routine testing strategies (p<0.001). We demonstrate that incorporation of EyeG2P into whole genome sequencing analysis strategies can reduce the number of variants for analysis to six variants, on average, while maintaining high diagnostic yield. CONCLUSION: Automated filtering of genomic variants through EyeG2P can increase the efficiency of diagnostic testing for individuals with a broad range of inherited ophthalmic disorders.


Subject(s)
Databases, Genetic; Eye Diseases; Genetic Testing; Genome, Human; Genomics; Eye Diseases/genetics; Humans; Genetic Variation
6.
Nucleic Acids Res ; 50(D1): D1216-D1220, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34718739

ABSTRACT

The European Variation Archive (EVA; https://www.ebi.ac.uk/eva/) is a resource for sharing all types of genetic variation data (SNPs, indels, and structural variants) for all species. The EVA was created in 2014 to provide FAIR access to genetic variation data and has since grown to be a primary resource for genomic variants hosting >3 billion records. The EVA and dbSNP have established a compatible global system to assign unique identifiers to all submitted genetic variants. The EVA is active within the Global Alliance of Genomics and Health (GA4GH), maintaining, contributing and implementing standards such as VCF, Refget and Variant Representation Specification (VRS). In this article, we describe the submission and permanent accessioning services along with the different ways the data can be retrieved by the scientific community.


Subject(s)
Computational Biology; Databases, Genetic; Genetic Variation/genetics; Software; Animals; Genomic Structural Variation/genetics; Genomics; Humans; INDEL Mutation/genetics; Molecular Sequence Annotation; Polymorphism, Single Nucleotide/genetics
7.
Nucleic Acids Res ; 50(D1): D988-D995, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34791404

ABSTRACT

Ensembl (https://www.ensembl.org) is unique in its flexible infrastructure for access to genomic data and annotation. It has been designed to efficiently deliver annotation at scale for all eukaryotic life, and it also provides deep comprehensive annotation for key species. Genomes representing a greater diversity of species are increasingly being sequenced. In response, we have focussed our recent efforts on expediting the annotation of new assemblies. Here, we report the release of the greatest annual number of newly annotated genomes in the history of Ensembl via our dedicated Ensembl Rapid Release platform (http://rapid.ensembl.org). We have also developed a new method to generate comparative analyses at scale for these assemblies and, for the first time, we have annotated non-vertebrate eukaryotes. Meanwhile, we continually improve, extend and update the annotation for our high-value reference vertebrate genomes and report the details here. We have a range of specific software tools for specific tasks, such as the Ensembl Variant Effect Predictor (VEP) and the newly developed interface for the Variant Recoder. All Ensembl data, software and tools are freely available for download and are accessible programmatically.


Subject(s)
Databases, Genetic; Genome/genetics; Molecular Sequence Annotation; Software; Animals; Computational Biology/classification; Humans
8.
Nucleic Acids Res ; 49(D1): D916-D923, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33270111

ABSTRACT

The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.


Subject(s)
COVID-19/prevention & control; Computational Biology/methods; Databases, Genetic; Genomics/methods; Molecular Sequence Annotation/methods; SARS-CoV-2/genetics; Animals; COVID-19/epidemiology; COVID-19/virology; Epidemics; Humans; Internet; Mice; Pseudogenes/genetics; RNA, Long Noncoding/genetics; SARS-CoV-2/metabolism; SARS-CoV-2/physiology; Transcription, Genetic/genetics
9.
Hum Mutat ; 43(6): 682-697, 2022 06.
Article in English | MEDLINE | ID: mdl-35143074

ABSTRACT

DECIPHER (https://www.deciphergenomics.org) is a free web platform for sharing anonymized phenotype-linked variant data from rare disease patients. Its dynamic interpretation interfaces contextualize genomic and phenotypic data to enable more informed variant interpretation, incorporating international standards for variant classification. DECIPHER supports almost all types of germline and mosaic variation in the nuclear and mitochondrial genome: sequence variants, short tandem repeats, copy-number variants, and large structural variants. Patient phenotypes are deposited using Human Phenotype Ontology (HPO) terms, supplemented by quantitative data, which is aggregated to derive gene-specific phenotypic summaries. It hosts data from >250 projects from ~40 countries, openly sharing >40,000 patient records containing >51,000 variants and >172,000 phenotype terms. The rich phenotype-linked variant data in DECIPHER drives rare disease research and diagnosis by enabling patient matching within DECIPHER and with other resources, and has been cited in >2,600 publications. In this study, we describe the types of data deposited to DECIPHER, the variant interpretation tools, and patient matching interfaces which make DECIPHER an invaluable rare disease resource.


Subject(s)
Databases, Genetic; Rare Diseases; Genomics; Humans; Phenotype; Rare Diseases/diagnosis; Rare Diseases/genetics; Software
10.
Hum Mutat ; 43(8): 986-997, 2022 08.
Article in English | MEDLINE | ID: mdl-34816521

ABSTRACT

The Ensembl Variant Effect Predictor (VEP) is a freely available, open-source tool for the annotation and filtering of genomic variants. It predicts variant molecular consequences using the Ensembl/GENCODE or RefSeq gene sets. It also reports phenotype associations from databases such as ClinVar, allele frequencies from studies including gnomAD, and predictions of deleteriousness from tools such as Sorting Intolerant From Tolerant and Combined Annotation Dependent Depletion. Ensembl VEP includes filtering options to customize variant prioritization. It is well supported and updated roughly quarterly to incorporate the latest gene, variant, and phenotype association information. Ensembl VEP analysis can be performed using a highly configurable, extensible command-line tool, a Representational State Transfer application programming interface, and a user-friendly web interface. These access methods are designed to suit different levels of bioinformatics experience and meet different needs in terms of data size, visualization, and flexibility. In this tutorial, we will describe performing variant annotation using the Ensembl VEP web tool, which enables sophisticated analysis through a simple interface.


Subject(s)
Genomics; Software; Computational Biology; Databases, Genetic; Gene Frequency; Humans; Molecular Sequence Annotation; Phenotype
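
The abstract of record 10 mentions that Ensembl VEP can be reached through a REST application programming interface as well as the web tool. The following is a minimal sketch of that access route, not the authors' tutorial workflow. It assumes the public server at https://rest.ensembl.org and its GET /vep/:species/hgvs/:hgvs_notation endpoint; the helper names (build_vep_url, fetch_vep, most_severe) are hypothetical, and the "most_severe_consequence" field is assumed to be present in each returned record.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the public Ensembl REST server.
ENSEMBL_REST = "https://rest.ensembl.org"


def build_vep_url(hgvs: str, species: str = "human") -> str:
    """Build the URL for GET /vep/:species/hgvs/:hgvs_notation (JSON output)."""
    # Percent-encode characters such as '>' but keep ':' readable.
    encoded = urllib.parse.quote(hgvs, safe=":")
    return f"{ENSEMBL_REST}/vep/{species}/hgvs/{encoded}?content-type=application/json"


def fetch_vep(hgvs: str, species: str = "human", timeout: float = 30.0) -> list:
    """Query the server (requires network access) and return the decoded JSON list."""
    url = build_vep_url(hgvs, species)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def most_severe(records: list) -> list:
    """Collect 'most_severe_consequence' from each record, or None if absent."""
    return [record.get("most_severe_consequence") for record in records]
```

With network access, most_severe(fetch_vep("ENST00000366667:c.803C>T")) would request annotation for a single HGVS-described variant. For large variant sets, the command-line tool described in the abstract is likely the better fit.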
11.
Genet Med ; 24(8): 1732-1742, 2022 08.
Article in English | MEDLINE | ID: mdl-35507016

ABSTRACT

PURPOSE: Several groups and resources provide information that pertains to the validity of gene-disease relationships used in genomic medicine and research; however, universal standards and terminologies to define the evidence base for the role of a gene in disease and a single harmonized resource were lacking. To tackle this issue, the Gene Curation Coalition (GenCC) was formed. METHODS: The GenCC drafted harmonized definitions for differing levels of gene-disease validity on the basis of existing resources, and performed a modified Delphi survey with 3 rounds to narrow the list of terms. The GenCC also developed a unified database to display curated gene-disease validity assertions from its members. RESULTS: On the basis of 241 survey responses from the genetics community, a consensus term set was chosen for grading gene-disease validity and database submissions. As of December 2021, the database contained 15,241 gene-disease assertions on 4569 unique genes from 12 submitters. When comparing submissions to the database from distinct sources, conflicts in assertions of gene-disease validity ranged from 5.3% to 13.4%. CONCLUSION: Terminology standardization, sharing of gene-disease validity classifications, and resolution of curation conflicts will facilitate collaborations across international curation efforts and in turn, improve consistency in genetic testing and variant interpretation.


Subject(s)
Databases, Genetic; Genomics; Genetic Testing; Genetic Variation; Humans
12.
Nucleic Acids Res ; 47(D1): D1005-D1012, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30445434

ABSTRACT

The GWAS Catalog delivers a high-quality curated collection of all published genome-wide association studies enabling investigations to identify causal variants, understand disease mechanisms, and establish targets for novel therapies. The scope of the Catalog has also expanded to targeted and exome arrays with 1000 new associations added for these technologies. As of September 2018, the Catalog contains 5687 GWAS comprising 71673 variant-trait associations from 3567 publications. New content includes 284 full P-value summary statistics datasets for genome-wide and new targeted array studies, representing 6 × 10^9 individual variant-trait statistics. In the last 12 months, the Catalog's user interface was accessed by ∼90000 unique users who viewed >1 million pages. We have improved data access with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database. Summary statistics provision is supported by a new format proposed as a community standard for summary statistics data representation. This format was derived from our experience in standardizing heterogeneous submissions, mapping formats and in harmonizing content. Availability: https://www.ebi.ac.uk/gwas/.


Subject(s)
Databases, Genetic; Genome-Wide Association Study; Disease/genetics; Genetic Variation; Humans; Microarray Analysis; Publications; Software; User-Computer Interface
13.
Nucleic Acids Res ; 47(D1): D766-D773, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30357393

ABSTRACT

The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl Biomart, Ensembl Perl and REST APIs as well as https://www.gencodegenes.org.


Subject(s)
Databases, Genetic; Genome, Human/genetics; Genomics; Pseudogenes/genetics; Animals; Computational Biology; Humans; Internet; Mice; Molecular Sequence Annotation; Software
14.
Nucleic Acids Res ; 47(D1): D745-D751, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30407521

ABSTRACT

The Ensembl project (https://www.ensembl.org) makes key genomic data sets available to the entire scientific community without restrictions. Ensembl seeks to be a fundamental resource driving scientific progress by creating, maintaining and updating reference genome annotation and comparative genomics resources. This year we describe our new and expanded gene, variant and comparative annotation capabilities, which led to a 50% increase in the number of vertebrate genomes we support. We have also doubled the number of available human variants and added regulatory regions for many mouse cell types and developmental stages. Our data sets and tools are available via the Ensembl website as well as through a RESTful web service, a Perl application programming interface and as data files for download.


Subject(s)
Databases, Genetic; Genome/genetics; Genomics; Vertebrates/genetics; Animals; Computational Biology/trends; Humans; Mice; Molecular Sequence Annotation; Software
15.
Int J Mol Sci ; 22(21)2021 Nov 05.
Article in English | MEDLINE | ID: mdl-34769404

ABSTRACT

Age-related macular degeneration (AMD) is a common blinding disease in the western world that is linked to the loss of fenestration in the choriocapillaris that sustains the retinal pigment epithelium and photoreceptors in the back of the eye. Changes in ocular and systemic zinc concentrations have been associated with AMD; therefore, we hypothesized that these changes might be directly involved in fenestrae formation. To test this hypothesis, an endothelial cell (bEND.5) model for fenestrae formation was treated with different concentrations of zinc sulfate (ZnSO4) solution for up to 20 h. Fenestrae were visualized by staining for Plasmalemmal Vesicle Associated Protein-1 (PV-1), the protein that forms the diaphragms of the fenestrated endothelium. Size and distribution were monitored by transmission electron microscopy (TEM). We found that zinc induced the redistribution of PV-1 into areas called sieve plates containing ~70-nm uniform size and typical morphology fenestrae. As AMD is associated with reduced zinc concentrations in the serum and in ocular tissues, and dietary zinc supplementation is recommended to slow disease progression, we propose here that the elevation of zinc concentration may restore choriocapillaris fenestration resulting in improved nutrient flow and clearance of waste material in the retina.


Subject(s)
Choroid/pathology; Endothelial Cells/pathology; Macular Degeneration/pathology; Membrane Proteins/metabolism; Photoreceptor Cells/pathology; Retinal Pigment Epithelium/pathology; Zinc/metabolism; Animals; Cells, Cultured; Choroid/metabolism; Endothelial Cells/metabolism; Macular Degeneration/metabolism; Mice; Microscopy, Electron, Transmission/methods; Photoreceptor Cells/metabolism; Retinal Pigment Epithelium/metabolism
16.
Am J Physiol Cell Physiol ; 317(6): C1093-C1106, 2019 12 01.
Article in English | MEDLINE | ID: mdl-31461344

ABSTRACT

This study explored the mechanism by which Ca2+-activated Cl- channels (CaCCs) encoded by the Tmem16a gene are regulated by calmodulin-dependent protein kinase II (CaMKII) and protein phosphatases 1 (PP1) and 2A (PP2A). Ca2+-activated Cl- currents (IClCa) were recorded from HEK-293 cells expressing mouse TMEM16A. IClCa were evoked using a pipette solution in which free Ca2+ concentration was clamped to 500 nM, in the presence (5 mM) or absence of ATP. With 5 mM ATP, IClCa decayed to <50% of the initial current magnitude within 10 min after seal rupture. IClCa rundown seen with ATP-containing pipette solution was greatly diminished by omitting ATP. IClCa recorded after 20 min of cell dialysis with 0 ATP were more than twofold larger than those recorded with 5 mM ATP. Intracellular application of autocamtide-2-related inhibitory peptide (5 µM) or KN-93 (10 µM), two specific CaMKII inhibitors, produced a similar attenuation of TMEM16A rundown. In contrast, internal application of okadaic acid (30 nM) or cantharidin (100 nM), two nonselective PP1 and PP2A blockers, promoted the rundown of TMEM16A in cells dialyzed with 0 ATP. Mutating serine 528 of TMEM16A to an alanine led to a similar inhibition of TMEM16A rundown to that exerted by either one of the two CaMKII inhibitors tested, which was not observed for three putative CaMKII consensus sites for phosphorylation (T273, T622, and S730). Our results suggest that TMEM16A-mediated CaCCs are regulated by CaMKII and PP1/PP2A. Our data also suggest that serine 528 of TMEM16A is an important contributor to the regulation of IClCa by CaMKII.


Subject(s)
Anoctamin-1/genetics; Calcium-Calmodulin-Dependent Protein Kinase Type 2/genetics; Gene Expression Regulation; Neoplasm Proteins/genetics; Protein Phosphatase 1/genetics; Protein Phosphatase 2/genetics; Adenosine Triphosphate/metabolism; Adenosine Triphosphate/pharmacology; Amino Acid Sequence; Animals; Anoctamin-1/metabolism; Benzylamines/pharmacology; Calcium/metabolism; Calcium-Calmodulin-Dependent Protein Kinase Type 2/antagonists & inhibitors; Calcium-Calmodulin-Dependent Protein Kinase Type 2/metabolism; Cantharidin/pharmacology; Chlorides/metabolism; Evoked Potentials/drug effects; Evoked Potentials/physiology; HEK293 Cells; Humans; Ion Transport/drug effects; Mice; Neoplasm Proteins/metabolism; Okadaic Acid/pharmacology; Patch-Clamp Techniques; Peptides/pharmacology; Phosphorylation/drug effects; Protein Phosphatase 1/antagonists & inhibitors; Protein Phosphatase 1/metabolism; Protein Phosphatase 2/antagonists & inhibitors; Protein Phosphatase 2/metabolism; Sequence Alignment; Sequence Homology, Amino Acid; Signal Transduction; Sulfonamides/pharmacology
18.
Genet Med ; 21(4): 837-849, 2019 04.
Article in English | MEDLINE | ID: mdl-30206421

ABSTRACT

PURPOSE: Variants in IQSEC2, escaping X inactivation, cause X-linked intellectual disability with frequent epilepsy in males and females. We aimed to investigate sex-specific differences. METHODS: We collected the data of 37 unpublished patients (18 males and 19 females) with IQSEC2 pathogenic variants and 5 individuals with variants of unknown significance and reviewed published variants. We compared variant types and phenotypes in males and females and performed an analysis of IQSEC2 isoforms. RESULTS: IQSEC2 pathogenic variants mainly led to premature truncation and were scattered throughout the longest brain-specific isoform, encoding the synaptic IQSEC2/BRAG1 protein. Variants occurred de novo in females but were either de novo (2/3) or inherited (1/3) in males, with missense variants being predominantly inherited. Developmental delay and intellectual disability were overall more severe in males than in females. Likewise, seizures were more frequently observed and intractable, and started earlier in males than in females. No correlation was observed between the age at seizure onset and severity of intellectual disability or resistance to antiepileptic treatments. CONCLUSION: This study provides a comprehensive overview of IQSEC2-related encephalopathy in males and females, and suggests that an accurate dosage of IQSEC2 at the synapse is crucial during normal brain development.


Subject(s)
Brain Diseases/genetics; Guanine Nucleotide Exchange Factors/genetics; Intellectual Disability/genetics; Seizures/genetics; Brain/growth & development; Brain/metabolism; Brain Diseases/epidemiology; Brain Diseases/physiopathology; Female; Humans; Infant; Infant, Newborn; Intellectual Disability/epidemiology; Intellectual Disability/physiopathology; Male; Mutation; Pedigree; Phenotype; Protein Isoforms/genetics; Seizures/epidemiology; Seizures/physiopathology; Sex Characteristics
19.
PLoS Comput Biol ; 14(8): e1006390, 2018 08.
Article in English | MEDLINE | ID: mdl-30102703

ABSTRACT

Manually curating biomedical knowledge from publications is necessary to build a knowledge-based service that provides highly precise and organized information to users. The process of retrieving relevant publications for curation, which is also known as document triage, is usually carried out by querying and reading articles in PubMed. However, this query-based method often obtains unsatisfactory precision and recall on the retrieved results, and it is difficult to manually generate optimal queries. To address this, we propose a machine-learning-assisted triage method. We collect previously curated publications from two databases, UniProtKB/Swiss-Prot and the NHGRI-EBI GWAS Catalog, and use them as a gold-standard dataset for training deep learning models based on convolutional neural networks. We then use the trained models to classify and rank new publications for curation. For evaluation, we apply our method to the real-world manual curation process of UniProtKB/Swiss-Prot and the GWAS Catalog. We demonstrate that our machine-assisted triage method outperforms the current query-based triage methods, improves efficiency, and enriches curated content. Our method achieves a precision 1.81 and 2.99 times higher than that obtained by the current query-based triage methods of UniProtKB/Swiss-Prot and the GWAS Catalog, respectively, without compromising recall. In fact, our method retrieves many additional relevant publications that the query-based method of UniProtKB/Swiss-Prot could not find. As these results show, our machine learning-based method can make the triage process more efficient and is being implemented in production so that human curators can focus on more challenging tasks to improve the quality of knowledge bases.


Subject(s)
Data Curation/methods; Information Storage and Retrieval/methods; Data Curation/statistics & numerical data; Databases, Genetic; Databases, Protein; Deep Learning; Genomics; Knowledge Bases; Machine Learning; Publications
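
The abstract of record 19 compares triage methods by precision and recall. As a reminder of how those two figures are computed, here is a minimal sketch on toy data. The publication identifiers and the two retrieved sets are hypothetical and are not taken from the study; the function name precision_recall is likewise an illustration.

```python
def precision_recall(retrieved: set, relevant: set) -> tuple:
    """Return (precision, recall) for a set of retrieved items.

    precision = relevant retrieved / all retrieved
    recall    = relevant retrieved / all relevant
    """
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical gold standard: publications a curator would accept.
relevant = {"p1", "p2", "p3", "p4"}

# Hypothetical outputs of two triage strategies.
query_based = {"p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8"}
model_based = {"p1", "p2", "p3", "p4", "p5"}

q_precision, q_recall = precision_recall(query_based, relevant)
m_precision, m_recall = precision_recall(model_based, relevant)

# Ratio of precisions, the kind of "times higher" figure the abstract reports.
precision_ratio = m_precision / q_precision
```

In this toy case both strategies find every relevant publication, so recall is unchanged, while the second returns fewer irrelevant ones and therefore has higher precision.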