Search | VHL Search Portal

1.

Database resources of the National Center for Biotechnology Information.

Sayers, Eric W; Beck, Jeff; Bolton, Evan E; Brister, J Rodney; Chan, Jessica; Comeau, Donald C; Connor, Ryan; DiCuccio, Michael; Farrell, Catherine M; Feldgarden, Michael; Fine, Anna M; Funk, Kathryn; Hatcher, Eneida; Hoeppner, Marilu; Kane, Megan; Kannan, Sivakumar; Katz, Kenneth S; Kelly, Christopher; Klimke, William; Kim, Sunghwan; Kimchi, Avi; Landrum, Melissa; Lathrop, Stacy; Lu, Zhiyong; Malheiro, Adriana; Marchler-Bauer, Aron; Murphy, Terence D; Phan, Lon; Prasad, Arjun B; Pujar, Shashikant; Sawyer, Amanda; Schmieder, Erin; Schneider, Valerie A; Schoch, Conrad L; Sharma, Shobha; Thibaud-Nissen, Françoise; Trawick, Barton W; Venkatapathi, Thilakam; Wang, Jiyao; Pruitt, Kim D; Sherry, Stephen T.

Nucleic Acids Res ; 52(D1): D33-D43, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-37994677

ABSTRACT

The National Center for Biotechnology Information (NCBI) provides online information resources for biology, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. NCBI provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for most of these databases. Resources receiving significant updates in the past year include PubMed, PMC, Bookshelf, SciENcv, the NIH Comparative Genomics Resource (CGR), NCBI Virus, SRA, RefSeq, foreign contamination screening tools, Taxonomy, iCn3D, ClinVar, GTR, MedGen, dbSNP, ALFA, ClinicalTrials.gov, Pathogen Detection, antimicrobial resistance resources, and PubChem. These resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov.

Subject(s)

Databases, Genetic; National Library of Medicine (U.S.); Biotechnology/instrumentation; Databases, Nucleic Acid; Internet; United States
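
The abstract above names the E-utilities as the programming interface to most NCBI databases. As a minimal sketch of what a request might look like, the following builds (but does not send) an ESearch URL. The base address, the `esearch.fcgi` endpoint and the `db`, `term`, `retmax` and `retmode` parameters are not given in the abstract; they are assumptions taken from the publicly documented E-utilities, and the function name `build_esearch_url` is purely illustrative.

```python
from urllib.parse import urlencode

# Assumed base address of the E-utilities; verify against current NCBI documentation.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Return an ESearch URL asking one database for record IDs that match a query."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    # urlencode percent-escapes characters such as the brackets used for field tags.
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Illustrative query: PubMed records with "dbSNP" in the title.
url = build_esearch_url("pubmed", "dbSNP[Title]")
print(url)
```

Building the URL separately from sending it keeps the sketch free of network access; a real client would also need to respect NCBI's usage guidelines.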

2.

Database resources of the National Center for Biotechnology Information in 2023.

Sayers, Eric W; Bolton, Evan E; Brister, J Rodney; Canese, Kathi; Chan, Jessica; Comeau, Donald C; Farrell, Catherine M; Feldgarden, Michael; Fine, Anna M; Funk, Kathryn; Hatcher, Eneida; Kannan, Sivakumar; Kelly, Christopher; Kim, Sunghwan; Klimke, William; Landrum, Melissa J; Lathrop, Stacy; Lu, Zhiyong; Madden, Thomas L; Malheiro, Adriana; Marchler-Bauer, Aron; Murphy, Terence D; Phan, Lon; Pujar, Shashikant; Rangwala, Sanjida H; Schneider, Valerie A; Tse, Tony; Wang, Jiyao; Ye, Jian; Trawick, Barton W; Pruitt, Kim D; Sherry, Stephen T.

Nucleic Acids Res ; 51(D1): D29-D38, 2023 Jan 06.

Article in English | MEDLINE | ID: mdl-36370100

ABSTRACT

The National Center for Biotechnology Information (NCBI) provides online information resources for biology, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. NCBI provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for most of these databases. New resources include the Comparative Genome Resource (CGR) and the BLAST ClusteredNR database. Resources receiving significant updates in the past year include PubMed, PMC, Bookshelf, IgBLAST, GDV, RefSeq, NCBI Virus, GenBank type assemblies, iCn3D, ClinVar, GTR, dbGaP, ALFA, ClinicalTrials.gov, Pathogen Detection, antimicrobial resistance resources, and PubChem. These resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov.

Subject(s)

Databases, Genetic; Databases, Nucleic Acid; United States; National Library of Medicine (U.S.); Sequence Alignment; Biotechnology; Internet

3.

Database resources of the national center for biotechnology information.

Sayers, Eric W; Bolton, Evan E; Brister, J Rodney; Canese, Kathi; Chan, Jessica; Comeau, Donald C; Connor, Ryan; Funk, Kathryn; Kelly, Chris; Kim, Sunghwan; Madej, Tom; Marchler-Bauer, Aron; Lanczycki, Christopher; Lathrop, Stacy; Lu, Zhiyong; Thibaud-Nissen, Francoise; Murphy, Terence; Phan, Lon; Skripchenko, Yuri; Tse, Tony; Wang, Jiyao; Williams, Rebecca; Trawick, Barton W; Pruitt, Kim D; Sherry, Stephen T.

Nucleic Acids Res ; 50(D1): D20-D26, 2022 Jan 07.

Article in English | MEDLINE | ID: mdl-34850941

ABSTRACT

The National Center for Biotechnology Information (NCBI) produces a variety of online information resources for biology, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. NCBI provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for most of these databases. Resources receiving significant updates in the past year include PubMed, PMC, Bookshelf, RefSeq, SRA, Virus, dbSNP, dbVar, ClinicalTrials.gov, MMDB, iCn3D and PubChem. These resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov.

Subject(s)

Biotechnology/trends; Databases, Genetic/trends; Databases, Chemical; Databases, Nucleic Acid; Databases, Protein; Humans; Internet; National Library of Medicine (U.S.); PubMed; United States

4.

Database resources of the National Center for Biotechnology Information.

Sayers, Eric W; Beck, Jeffrey; Bolton, Evan E; Bourexis, Devon; Brister, James R; Canese, Kathi; Comeau, Donald C; Funk, Kathryn; Kim, Sunghwan; Klimke, William; Marchler-Bauer, Aron; Landrum, Melissa; Lathrop, Stacy; Lu, Zhiyong; Madden, Thomas L; O'Leary, Nuala; Phan, Lon; Rangwala, Sanjida H; Schneider, Valerie A; Skripchenko, Yuri; Wang, Jiyao; Ye, Jian; Trawick, Barton W; Pruitt, Kim D; Sherry, Stephen T.

Nucleic Acids Res ; 49(D1): D10-D17, 2021 Jan 08.

Article in English | MEDLINE | ID: mdl-33095870

ABSTRACT

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. The Entrez system provides search and retrieval operations for most of these data from 34 distinct databases. The E-utilities serve as the programming interface for the Entrez system. Custom implementations of the BLAST program provide sequence-based searching of many specialized datasets. New resources released in the past year include a new PubMed interface and NCBI datasets. Additional resources that were updated in the past year include PMC, Bookshelf, Genome Data Viewer, SRA, ClinVar, dbSNP, dbVar, Pathogen Detection, BLAST, Primer-BLAST, IgBLAST, iCn3D and PubChem. All of these resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov.

Subject(s)

Databases, Genetic; National Library of Medicine (U.S.); Computational Biology/methods; Databases, Chemical; Databases, Nucleic Acid; Databases, Protein; Genomics/methods; Humans; PubMed; United States

5.

Database resources of the National Center for Biotechnology Information.

Sayers, Eric W; Beck, Jeff; Brister, J Rodney; Bolton, Evan E; Canese, Kathi; Comeau, Donald C; Funk, Kathryn; Ketter, Anne; Kim, Sunghwan; Kimchi, Avi; Kitts, Paul A; Kuznetsov, Anatoliy; Lathrop, Stacy; Lu, Zhiyong; McGarvey, Kelly; Madden, Thomas L; Murphy, Terence D; O'Leary, Nuala; Phan, Lon; Schneider, Valerie A; Thibaud-Nissen, Françoise; Trawick, Bart W; Pruitt, Kim D; Ostell, James.

Nucleic Acids Res ; 48(D1): D9-D16, 2020 Jan 08.

Article in English | MEDLINE | ID: mdl-31602479

ABSTRACT

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts published in life science journals. The Entrez system provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for the Entrez system. Custom implementations of the BLAST program provide sequence-based searching of many specialized datasets. New resources released in the past year include a new PubMed interface, a sequence database search and a gene orthologs page. Additional resources that were updated in the past year include PMC, Bookshelf, My Bibliography, Assembly, RefSeq, viral genomes, the prokaryotic genome annotation pipeline, Genome Workbench, dbSNP, BLAST, Primer-BLAST, IgBLAST and PubChem. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.

Subject(s)

Computational Biology/methods; Computational Biology/organization & administration; Databases, Genetic; National Library of Medicine (U.S.); Databases, Nucleic Acid; Genomics/methods; Humans; PubMed; United States; Web Browser

6.

SPDI: data model for variants and applications at NCBI.

Holmes, J Bradley; Moyer, Eric; Phan, Lon; Maglott, Donna; Kattman, Brandi.

Bioinformatics ; 36(6): 1902-1907, 2020 Mar 01.

Article in English | MEDLINE | ID: mdl-31738401

ABSTRACT

MOTIVATION: Normalizing sequence variants on a reference, projecting them across congruent sequences and aggregating their diverse representations are critical to the elucidation of the genetic basis of disease and biological function. Inconsistent representation of variants among variant callers, local databases and tools results in discrepancies that complicate analysis. NCBI's genetic variation resources, dbSNP and ClinVar, require a robust, scalable set of principles to manage asserted sequence variants. RESULTS: The SPDI data model defines variants as a sequence of four attributes: sequence, position, deletion and insertion, and can be applied to nucleotide and protein variants. NCBI web services convert representations among HGVS, VCF and SPDI and provide two functions to aggregate variants. One, based on the NCBI Variant Overprecision Correction Algorithm, returns a unique, normalized representation termed the 'Contextual Allele'. The SPDI data model, with its four operations, defines exactly the reference subsequence affected by the variant, even in repeat regions, such as homopolymer and other sequence repeats. The second function projects variants across congruent sequences and depends on an alignment dataset of non-assembly NCBI RefSeq sequences (prefixed NM, NR and NG), as well as inter- and intra-assembly-associated genomic sequences (NCs, NTs and NWs), supporting robust projection of variants across congruent sequences and assembly versions. The variant is projected to all congruent Contextual Alleles. One of these Contextual Alleles, typically the allele based on the latest assembly version, represents the entire set, is designated the unique 'Canonical Allele' and is used directly to aggregate variants across congruent sequences. AVAILABILITY AND IMPLEMENTATION: The SPDI services are available for open access at: https://api.ncbi.nlm.nih.gov/variation/v0. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Databases, Genetic; Genomics; Algorithms; Genome; Vocabulary, Controlled
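
The abstract above describes a variant as four attributes: sequence, position, deletion and insertion. A minimal sketch of that record as a data structure follows. The colon-delimited text form, the class name `Spdi`, the helper names and the example variant are assumptions made for illustration; the sketch does not perform the normalization or projection that the NCBI services provide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spdi:
    """The four SPDI attributes named in the abstract."""
    sequence: str   # accession of the reference sequence
    position: int   # where the change starts on that sequence
    deletion: str   # reference bases removed
    insertion: str  # bases inserted in their place


def parse_spdi(text: str) -> Spdi:
    """Parse an assumed 'sequence:position:deletion:insertion' string."""
    parts = text.split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 colon-separated fields, got {len(parts)}")
    sequence, position, deletion, insertion = parts
    return Spdi(sequence, int(position), deletion, insertion)


def format_spdi(variant: Spdi) -> str:
    """Serialize the record back to the same colon-delimited form."""
    return f"{variant.sequence}:{variant.position}:{variant.deletion}:{variant.insertion}"


# Hypothetical single-base substitution, used only to exercise the parser.
example = parse_spdi("NC_000001.11:99:A:G")
print(example)
```

An immutable dataclass is used so that a parsed variant can serve as a dictionary key when grouping equivalent representations.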

7.

iCn3D, a web-based 3D viewer for sharing 1D/2D/3D representations of biomolecular structures.

Wang, Jiyao; Youkharibache, Philippe; Zhang, Dachuan; Lanczycki, Christopher J; Geer, Renata C; Madej, Thomas; Phan, Lon; Ward, Minghong; Lu, Shennan; Marchler, Gabriele H; Wang, Yanli; Bryant, Stephen H; Geer, Lewis Y; Marchler-Bauer, Aron.

Bioinformatics ; 36(1): 131-135, 2020 Jan 01.

Article in English | MEDLINE | ID: mdl-31218344

ABSTRACT

MOTIVATION: Build a web-based 3D molecular structure viewer focusing on interactive structural analysis. RESULTS: iCn3D (I-see-in-3D) can simultaneously show 3D structure, 2D molecular contacts and 1D protein and nucleotide sequences through an integrated sequence/annotation browser. Pre-defined and arbitrary molecular features can be selected in any of the 1D/2D/3D windows as sets of residues and these selections are synchronized dynamically in all displays. Biological annotations such as protein domains, single nucleotide variations, etc. can be shown as tracks in the 1D sequence/annotation browser. These customized displays can be shared with colleagues or publishers via a simple URL. iCn3D can display structure-structure alignments obtained from NCBI's VAST+ service. It can also display the alignment of a sequence with a structure as identified by BLAST, and thus relate 3D structure to a large fraction of all known proteins. iCn3D can also display electron density maps or electron microscopy (EM) density maps, and export files for 3D printing. The following example URL exemplifies some of the 1D/2D/3D representations: https://www.ncbi.nlm.nih.gov/Structure/icn3d/full.html?mmdbid=1TUP&showanno=1&show2d=1&showsets=1. AVAILABILITY AND IMPLEMENTATION: iCn3D is freely available to the public. Its source code is available at https://github.com/ncbi/icn3d. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Base Sequence; Computational Biology; Internet; Models, Molecular; Proteins; Software; Computational Biology/methods; Databases, Genetic; Molecular Conformation; Proteins/chemistry
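
The abstract above notes that customized iCn3D displays can be shared via a simple URL and gives one example. As a rough sketch, such a link can be assembled from its query parameters. The parameter names (`mmdbid`, `showanno`, `show2d`, `showsets`) are copied from that example URL; the function name `icn3d_url` and the on/off flag interpretation are assumptions for illustration, not a description of the full iCn3D URL interface.

```python
from urllib.parse import urlencode

# Viewer address taken from the example URL in the abstract.
ICN3D_VIEWER = "https://www.ncbi.nlm.nih.gov/Structure/icn3d/full.html"


def icn3d_url(mmdbid: str, showanno: bool = True, show2d: bool = True,
              showsets: bool = True) -> str:
    """Build a shareable viewer link; flags are written as 1/0 (assumed convention)."""
    params = {
        "mmdbid": mmdbid,
        "showanno": int(showanno),
        "show2d": int(show2d),
        "showsets": int(showsets),
    }
    return f"{ICN3D_VIEWER}?{urlencode(params)}"


# Reproduces the example link from the abstract for structure 1TUP.
link = icn3d_url("1TUP")
print(link)
```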

8.

Database resources of the National Center for Biotechnology Information.

Sayers, Eric W; Agarwala, Richa; Bolton, Evan E; Brister, J Rodney; Canese, Kathi; Clark, Karen; Connor, Ryan; Fiorini, Nicolas; Funk, Kathryn; Hefferon, Timothy; Holmes, J Bradley; Kim, Sunghwan; Kimchi, Avi; Kitts, Paul A; Lathrop, Stacy; Lu, Zhiyong; Madden, Thomas L; Marchler-Bauer, Aron; Phan, Lon; Schneider, Valerie A; Schoch, Conrad L; Pruitt, Kim D; Ostell, James.

Nucleic Acids Res ; 47(D1): D23-D28, 2019 Jan 08.

Article in English | MEDLINE | ID: mdl-30395293

ABSTRACT

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts published in life science journals. The Entrez system provides search and retrieval operations for most of these data from 38 distinct databases. The E-utilities serve as the programming interface for the Entrez system. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. New resources released in the past year include PubMed Labs and a new sequence database search. Resources that were updated in the past year include PubMed, PMC, Bookshelf, genome data viewer, Assembly, prokaryotic genomes, Genome, BioProject, dbSNP, dbVar, BLAST databases, igBLAST, iCn3D and PubChem. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.

Subject(s)

Biotechnology/organization & administration; Databases, Genetic; Animals; Biotechnology/methods; Databases, Chemical; Humans; Software; United States/epidemiology; Web Browser

9.

LitVar: a semantic search engine for linking genomic variant data in PubMed and PMC.

Allot, Alexis; Peng, Yifan; Wei, Chih-Hsuan; Lee, Kyubum; Phan, Lon; Lu, Zhiyong.

Nucleic Acids Res ; 46(W1): W530-W536, 2018 Jul 02.

Article in English | MEDLINE | ID: mdl-29762787

ABSTRACT

The identification and interpretation of genomic variants play a key role in the diagnosis of genetic diseases and related research. These tasks increasingly rely on accessing relevant manually curated information from domain databases (e.g. SwissProt or ClinVar). However, due to the sheer volume of medical literature and high cost of expert curation, curated variant information in existing databases is often incomplete and out-of-date. In addition, the same genetic variant can be mentioned in publications with various names (e.g. 'A146T' versus 'c.436G>A' versus 'rs121913527'). A search in PubMed using only one name usually cannot retrieve all relevant articles for the variant of interest. Hence, to help scientists, healthcare professionals, and database curators find the most up-to-date published variant research, we have developed LitVar for the search and retrieval of standardized variant information. In addition, LitVar uses advanced text mining techniques to compute and extract relationships between variants and other associated entities such as diseases and chemicals/drugs. LitVar is publicly available at https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/LitVar.

Subject(s)

Data Curation/methods; Data Mining/methods; Polymorphism, Single Nucleotide; Search Engine; User-Computer Interface; Genetics, Medical; Genome, Human; Genomics/methods; Humans; Internet; PubMed; Semantics

10.

tmVar 2.0: integrating genomic variant information from literature with dbSNP and ClinVar for precision medicine.

Wei, Chih-Hsuan; Phan, Lon; Feltz, Juliana; Maiti, Rama; Hefferon, Tim; Lu, Zhiyong.

Bioinformatics ; 34(1): 80-87, 2018 Jan 01.

Article in English | MEDLINE | ID: mdl-28968638

ABSTRACT

Motivation: Despite significant efforts in expert curation, clinical relevance about most of the 154 million dbSNP reference variants (RS) remains unknown. However, a wealth of knowledge about the variant biological function/disease impact is buried in unstructured literature data. Previous studies have attempted to harvest and unlock such information with text-mining techniques but are of limited use because their mutation extraction results are not standardized or integrated with curated data. Results: We propose an automatic method to extract and normalize variant mentions to unique identifiers (dbSNP RSIDs). Our method, in benchmarking results, demonstrates a high F-measure of ~90% and compared favorably to the state of the art. Next, we applied our approach to the entire PubMed and validated the results by verifying that each extracted variant-gene pair matched the dbSNP annotation based on mapped genomic position, and by analyzing variants curated in ClinVar. We then determined which text-mined variants and genes constituted novel discoveries. Our analysis reveals 41 889 RS numbers (associated with 9151 genes) not found in ClinVar. Moreover, we obtained a rich set worth further review: 12 462 rare variants (MAF ≤ 0.01) in 3849 genes which are presumed to be deleterious and not frequently found in the general population. To our knowledge, this is the first large-scale study to analyze and integrate text-mined variant data with curated knowledge in existing databases. Our results suggest that databases can be significantly enriched by text mining and that the combined information can greatly assist human efforts in evaluating/prioritizing variants in genomic research. Availability and implementation: The tmVar 2.0 source code and corpus are freely available at https://www.ncbi.nlm.nih.gov/research/bionlp/Tools/tmvar/. Contact: zhiyong.lu@nih.gov.

Subject(s)

Data Mining/methods; Mutation; Polymorphism, Genetic; Precision Medicine/methods; Software; Data Curation; Databases, Factual; Genetic Predisposition to Disease; Genomics/methods; Humans; Phenotype; PubMed; Publications
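
The abstract above reports benchmark performance as an F-measure. For readers unfamiliar with the metric, the sketch below computes the common balanced form (F1, the harmonic mean of precision and recall) from counts of true positives, false positives and false negatives. That the abstract means this balanced form is an assumption, and the counts used are hypothetical numbers chosen for illustration, not results from the paper.

```python
def f_measure(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    predicted = true_pos + false_pos   # everything the extractor reported
    actual = true_pos + false_neg      # everything it should have reported
    if predicted == 0 or actual == 0 or true_pos == 0:
        return 0.0
    precision = true_pos / predicted
    recall = true_pos / actual
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 90 correct mentions, 10 spurious, 10 missed.
score = f_measure(true_pos=90, false_pos=10, false_neg=10)
print(f"{score:.2f}")
```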

11.

Correspondence on "Comparison of literature mining tools for variant classification: Through the lens of 50 RYR1 variants" by Wermers et al.

Wei, Chih-Hsuan; Phan, Lon; Hefferon, Timothy; Landrum, Melissa; Rehm, Heidi L; Lu, Zhiyong.

Genet Med ; : 101208, 2024 Jul 04.

Article in English | MEDLINE | ID: mdl-38973600

12.

The VAAST Variant Prioritizer (VVP): ultrafast, easy to use whole genome variant prioritization tool.

Flygare, Steven; Hernandez, Edgar Javier; Phan, Lon; Moore, Barry; Li, Man; Fejes, Anthony; Hu, Hao; Eilbeck, Karen; Huff, Chad; Jorde, Lynn; G Reese, Martin; Yandell, Mark.

BMC Bioinformatics ; 19(1): 57, 2018 Feb 20.

Article in English | MEDLINE | ID: mdl-29463208

ABSTRACT

BACKGROUND: Prioritization of sequence variants for diagnosis and discovery of Mendelian diseases is challenging, especially in large collections of whole genome sequences (WGS). Fast, scalable solutions are needed for discovery research, for clinical applications, and for curation of massive public variant repositories such as dbSNP and gnomAD. In response, we have developed VVP, the VAAST Variant Prioritizer. VVP is ultrafast, scales to even the largest variant repositories and genome collections, and its outputs are designed to simplify clinical interpretation of variants of uncertain significance. RESULTS: We show that scoring the entire contents of dbSNP (> 155 million variants) requires only 95 min using a machine with 4 cpus and 16 GB of RAM, and that a 60X WGS can be processed in less than 5 min. We also demonstrate that VVP can score variants anywhere in the genome, regardless of type, effect, or location. It does so by integrating sequence conservation, the type of sequence change, allele frequencies, variant burden, and zygosity. Finally, we also show that VVP scores are consistently accurate, and easily interpreted, traits not shared by many commonly used tools such as SIFT and CADD. CONCLUSIONS: VVP provides rapid and scalable means to prioritize any sequence variant, anywhere in the genome, and its scores are designed to facilitate variant interpretation using ACMG and NHS guidelines. These traits make it well suited for operation on very large collections of WGS sequences.

Subject(s)

Computational Biology/methods; Genetic Variation; Genome, Human; Software; Databases, Genetic; Humans; Polymorphism, Single Nucleotide/genetics; ROC Curve; Time Factors; Whole Genome Sequencing; Zygote/metabolism

13.

Supporting precision medicine by data mining across multi-disciplines: an integrative approach for generating comprehensive linkages between single nucleotide variants (SNVs) and drug-binding sites.

Roy Choudhury, Amrita; Cheng, Tiejun; Phan, Lon; Bryant, Stephen H; Wang, Yanli.

Bioinformatics ; 33(11): 1621-1629, 2017 Jun 01.

Article in English | MEDLINE | ID: mdl-28158543

ABSTRACT

MOTIVATION: Genetic variants in drug targets and metabolizing enzymes often have important functional implications, including altering the efficacy and toxicity of drugs. Identifying single nucleotide variants (SNVs) that contribute to differences in drug response and understanding their underlying mechanisms are fundamental to successful implementation of the precision medicine model. This work reports an effort to collect, classify and analyze SNVs that may affect the optimal response to currently approved drugs. RESULTS: An integrated approach was taken involving data mining across multiple information resources including databases containing drugs, drug targets, chemical structures, protein-ligand structure complexes, genetic and clinical variations as well as protein sequence alignment tools. We obtained 2640 SNVs of interest, most of which occur rarely in populations (minor allele frequency < 0.01). Clinical significance of only 9.56% of the SNVs is known in ClinVar, although 79.02% are predicted as deleterious. The examples here demonstrate that even if the mapped SNVs predicted as deleterious may not result in significant structural modifications, they can plausibly modify the protein-drug interactions, affecting selectivity and drug-binding affinity. Our analysis identifies potentially deleterious SNVs present on drug-binding residues that are relevant for further studies in the context of precision medicine. AVAILABILITY AND IMPLEMENTATION: Data are available from Supplementary information file. CONTACT: yanli.wang@nih.gov. SUPPLEMENTARY INFORMATION: Supplementary Tables S1-S5 are available at Bioinformatics online.

Subject(s)

Data Mining/methods; Polymorphism, Single Nucleotide; Protein Binding/genetics; Sequence Analysis, Protein/methods; Binding Sites; Gene Frequency; Humans; Precision Medicine/methods; Sequence Analysis, DNA/methods

14.

Database resources of the National Center for Biotechnology Information.

Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bolton, Evan; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; Dicuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Krasnov, Sergey; Landsman, David; Lipman, David J; Lu, Zhiyong; Madden, Thomas L; Madej, Tom; Maglott, Donna R; Marchler-Bauer, Aron; Miller, Vadim; Karsch-Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Wang, Yanli; Wilbur, W John; Yaschenko, Eugene; Ye, Jian.

Nucleic Acids Res ; 40(Database issue): D13-25, 2012 Jan.

Article in English | MEDLINE | ID: mdl-22140104

ABSTRACT

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.

Subject(s)

Databases as Topic; Databases, Genetic; Databases, Protein; Gene Expression; Genomics; Internet; Models, Molecular; National Library of Medicine (U.S.); Periodicals as Topic; PubMed; Sequence Alignment; Sequence Analysis, DNA; Sequence Analysis, Protein; Sequence Analysis, RNA; Small Molecule Libraries; United States

15.

Database resources of the National Center for Biotechnology Information.

Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bolton, Evan; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Landsman, David; Lipman, David J; Lu, Zhiyong; Madden, Thomas L; Madej, Tom; Maglott, Donna R; Marchler-Bauer, Aron; Miller, Vadim; Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Wang, Yanli; Wilbur, W John; Yaschenko, Eugene; Ye, Jian.

Nucleic Acids Res ; 39(Database issue): D38-51, 2011 Jan.

Article in English | MEDLINE | ID: mdl-21097890

ABSTRACT

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Electronic PCR, OrfFinder, Splign, ProSplign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), IBIS, Biosystems, Peptidome, OMSSA, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.

Subject(s)

Databases, Genetic; Databases, Protein; Gene Expression; Genomics; National Library of Medicine (U.S.); Protein Structure, Tertiary; PubMed; Sequence Alignment; Sequence Analysis, DNA; Sequence Analysis, RNA; Software; Systems Integration; United States

16.

The completion of the Mammalian Gene Collection (MGC).

Temple, Gary; Gerhard, Daniela S; Rasooly, Rebekah; Feingold, Elise A; Good, Peter J; Robinson, Cristen; Mandich, Allison; Derge, Jeffrey G; Lewis, Jeanne; Shoaf, Debonny; Collins, Francis S; Jang, Wonhee; Wagner, Lukas; Shenmen, Carolyn M; Misquitta, Leonie; Schaefer, Carl F; Buetow, Kenneth H; Bonner, Tom I; Yankie, Linda; Ward, Ming; Phan, Lon; Astashyn, Alex; Brown, Garth; Farrell, Catherine; Hart, Jennifer; Landrum, Melissa; Maidak, Bonnie L; Murphy, Michael; Murphy, Terence; Rajput, Bhanu; Riddick, Lillian; Webb, David; Weber, Janet; Wu, Wendy; Pruitt, Kim D; Maglott, Donna; Siepel, Adam; Brejova, Brona; Diekhans, Mark; Harte, Rachel; Baertsch, Robert; Kent, Jim; Haussler, David; Brent, Michael; Langton, Laura; Comstock, Charles L G; Stevens, Michael; Wei, Chaochun; van Baren, Marijke J; Salehi-Ashtiani, Kourosh.

Genome Res ; 19(12): 2324-33, 2009 Dec.

Article in English | MEDLINE | ID: mdl-19767417

ABSTRACT

Since its start, the Mammalian Gene Collection (MGC) has sought to provide at least one full-protein-coding sequence cDNA clone for every human and mouse gene with a RefSeq transcript, and at least 6200 rat genes. The MGC cloning effort initially relied on random expressed sequence tag screening of cDNA libraries. Here, we summarize our recent progress using directed RT-PCR cloning and DNA synthesis. The MGC now contains clones with the entire protein-coding sequence for 92% of human and 89% of mouse genes with curated RefSeq (NM-accession) transcripts, and for 97% of human and 96% of mouse genes with curated RefSeq transcripts that have one or more PubMed publications, in addition to clones for more than 6300 rat genes. These high-quality MGC clones and their sequences are accessible without restriction to researchers worldwide.

Subject(s)

Cloning, Molecular/methods; Computational Biology/methods; DNA, Complementary/genetics; Gene Library; Genes/genetics; Mammals/genetics; Animals; DNA/biosynthesis; Humans; Mice; National Institutes of Health (U.S.); Rats; Reverse Transcriptase Polymerase Chain Reaction; United States

17.

Tracking genetic variants in the biomedical literature using LitVar 2.0.

Allot, Alexis; Wei, Chih-Hsuan; Phan, Lon; Hefferon, Timothy; Landrum, Melissa; Rehm, Heidi L; Lu, Zhiyong.

Nat Genet ; 55(6): 901-903, 2023 Jun.

Article in English | MEDLINE | ID: mdl-37268776

18.

dbVar structural variant cluster set for data analysis and variant comparison.

Phan, Lon; Hsu, Jeffrey; Tri, Le Quang Minh; Willi, Michaela; Mansour, Tamer; Kai, Yan; Garner, John; Lopez, John; Busby, Ben.

F1000Res ; 5: 673, 2016.

Article in English | MEDLINE | ID: mdl-28357035

ABSTRACT

dbVar houses over 3 million submitted structural variants (SSV) from 120 human studies including copy number variations (CNV), insertions, deletions, inversions, translocations, and complex chromosomal rearrangements. Users can submit multiple SSVs to dbVar that are presumably identical, but were ascertained by different platforms and samples, to calculate whether the variant is rare or common in the population and allow for cross validation. However, because SSV genomic location reporting can vary - including fuzzy locations where the start and/or end points are not precisely known - analysis, comparison, annotation, and reporting of SSVs across studies can be difficult. This project was initiated by the Structural Variant Comparison Group for the purpose of generating a non-redundant set of genomic regions defined by counts of concordance for all human SSVs placed on RefSeq assembly GRCh38 (RefSeq accession GCF_000001405.26). We intend that the availability of these regions, called structural variant clusters (SVCs), will facilitate the analysis, annotation, and exchange of SV data and allow for simplified display in genomic sequence viewers for improved variant interpretation. Sets of SVCs were generated by variant type for each of the 120 studies as well as for a combined set across all studies. Starting from 3.64 million SSVs, 2.5 million and 3.4 million non-redundant SVCs with count >=1 were generated by variant type for each study and across all studies, respectively. In addition, we have developed utilities for annotating, searching, and filtering SVC data in GVF format for computing summary statistics, exporting data for genomic viewers, and annotating the SVC using external data sources.
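
The idea of a non-redundant region set defined by concordance counts can be illustrated with a minimal Python sketch. It rests on simplifying assumptions that are not taken from the abstract: SSVs are treated as concordant only when chromosome, start, end, and variant type match exactly, and the `SSV` record layout and the `cluster_ssvs` function name are hypothetical. The published method may handle fuzzy breakpoints differently.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class SSV(NamedTuple):
    """Hypothetical minimal record for a submitted structural variant."""
    study: str
    chrom: str
    start: int
    end: int
    var_type: str


def cluster_ssvs(ssvs: Iterable[SSV]) -> Counter:
    """Count SSVs per (chrom, start, end, var_type) region.

    Assumption: only exact coordinate and type matches are merged.
    Each distinct key is one non-redundant cluster; its count is the
    number of submitted variants placed on exactly that region.
    """
    return Counter((v.chrom, v.start, v.end, v.var_type) for v in ssvs)


# Toy input: two studies report the same deletion; other records differ.
ssvs = [
    SSV("studyA", "chr1", 1000, 5000, "deletion"),
    SSV("studyB", "chr1", 1000, 5000, "deletion"),
    SSV("studyB", "chr1", 1000, 5000, "duplication"),
    SSV("studyA", "chr2", 200, 900, "inversion"),
]

clusters = cluster_ssvs(ssvs)
for region, count in sorted(clusters.items()):
    print(region, count)
```

Keeping the variant type in the key mirrors the abstract's statement that clusters were generated by variant type; restricting the input to one study's records would give a per-study set instead of a combined one.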

19.

Study of translational control of eukaryotic gene expression using yeast.

Hinnebusch, Alan G; Asano, Katsura; Olsen, Deanne S; Phan, Lon; Nielsen, Klaus H; Valásek, Leos.

Ann N Y Acad Sci ; 1038: 60-74, 2004 Dec.

Article in English | MEDLINE | ID: mdl-15838098

ABSTRACT

Eukaryotic cells respond to starvation by decreasing the rate of general protein synthesis while inducing translation of specific mRNAs encoding transcription factors GCN4 (yeast) or ATF4 (humans). Both responses are elicited by phosphorylation of translation initiation factor 2 (eIF2) and the attendant inhibition of its nucleotide exchange factor eIF2B, decreasing the binding to 40S ribosomes of methionyl initiator tRNA in the ternary complex (TC) with eIF2 and GTP. The reduction in TC levels enables scanning ribosomes to bypass the start codons of upstream open reading frames in the GCN4 mRNA leader and initiate translation at the authentic GCN4 start codon. We exploited the fact that GCN4 translation is a sensitive reporter of defects in TC recruitment to identify the catalytic and regulatory subunits of eIF2B. More recently, we implicated the C-terminal domain of eIF1A in 40S-binding of TC in vivo. Interestingly, we found that TC resides in a multifactor complex (MFC) with eIF3, eIF1, and the GTPase-activating protein for eIF2, known as eIF5. Our biochemical and genetic analyses indicate that physical interactions between MFC components enhance TC binding to 40S subunits and are required for wild-type translational control of GCN4. MFC integrity and eIF3 function also contribute to post-assembly steps in the initiation pathway that impact GCN4 expression. Thus, apart from its critical role in the starvation response, GCN4 regulation is a valuable tool for dissecting the contributions of multiple translation factors in the eukaryotic initiation pathway.

Subject(s)

Gene Expression Regulation, Fungal , Protein Biosynthesis , Saccharomyces cerevisiae , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Eukaryotic Initiation Factor-1/genetics , Eukaryotic Initiation Factor-1/metabolism , Eukaryotic Initiation Factor-2/metabolism , Eukaryotic Initiation Factor-2B/genetics , Eukaryotic Initiation Factor-2B/metabolism , Humans , Macromolecular Substances , Models, Molecular , Protein Binding , Protein Kinases/genetics , Protein Kinases/metabolism , Protein Structure, Tertiary , Ribosomes/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism

20.

Phenotype-Genotype Integrator (PheGenI): synthesizing genome-wide association study (GWAS) data with existing genomic resources.

Ramos, Erin M; Hoffman, Douglas; Junkins, Heather A; Maglott, Donna; Phan, Lon; Sherry, Stephen T; Feolo, Mike; Hindorff, Lucia A.

Eur J Hum Genet ; 22(1): 144-7, 2014 Jan.

Article in English | MEDLINE | ID: mdl-23695286

ABSTRACT

Rapidly accumulating data from genome-wide association studies (GWASs) and other large-scale studies are most useful when synthesized with existing databases. To address this opportunity, we developed the Phenotype-Genotype Integrator (PheGenI), a user-friendly web interface that integrates various National Center for Biotechnology Information (NCBI) genomic databases with association data from the National Human Genome Research Institute GWAS Catalog and supports downloads of search results. Here, we describe the rationale for and development of this resource. Integrating over 66,000 association records with extensive single nucleotide polymorphism (SNP), gene, and expression quantitative trait loci data already available from the NCBI, PheGenI enables deeper investigation and interrogation of SNPs associated with a wide range of traits, facilitating the examination of the relationships between genetic variation and human diseases.

Subject(s)

Genome-Wide Association Study , Genotype , Phenotype , Software , Computational Biology , Databases, Genetic , Genome, Human , Genomics , Humans , Internet , Polymorphism, Single Nucleotide
