Búsqueda | Portal de Búsqueda de la BVS Ecuador

The Human Pangenome Project: a global resource to map genomic diversity.

Wang, Ting; Antonacci-Fulton, Lucinda; Howe, Kerstin; Lawson, Heather A; Lucas, Julian K; Phillippy, Adam M; Popejoy, Alice B; Asri, Mobin; Carson, Caryn; Chaisson, Mark J P; Chang, Xian; Cook-Deegan, Robert; Felsenfeld, Adam L; Fulton, Robert S; Garrison, Erik P; Garrison, Nanibaa' A; Graves-Lindsay, Tina A; Ji, Hanlee; Kenny, Eimear E; Koenig, Barbara A; Li, Daofeng; Marschall, Tobias; McMichael, Joshua F; Novak, Adam M; Purushotham, Deepak; Schneider, Valerie A; Schultz, Baergen I; Smith, Michael W; Sofia, Heidi J; Weissman, Tsachy; Flicek, Paul; Li, Heng; Miga, Karen H; Paten, Benedict; Jarvis, Erich D; Hall, Ira M; Eichler, Evan E; Haussler, David.

Nature ; 604(7906): 437-446, 2022 04.

Artículo en Inglés | MEDLINE | ID: mdl-35444317

RESUMEN

The human reference genome is the most widely used resource in human genetics and is due for a major update. Its current structure is a linear composite of merged haplotypes from more than 20 people, with a single individual comprising most of the sequence. It contains biases and errors within a framework that does not represent global human genomic variation. A high-quality reference with global representation of common variants, including single-nucleotide variants, structural variants and functional elements, is needed. The Human Pangenome Reference Consortium aims to create a more sophisticated and complete human reference genome with a graph-based, telomere-to-telomere representation of global genomic diversity. Here we leverage innovations in technology, study design and global partnerships with the goal of constructing the highest-possible quality human pangenome reference. Our goal is to improve data representation and streamline analyses to enable routine assembly of complete diploid genomes. With attention to ethical frameworks, the human pangenome reference will contain a more accurate and diverse representation of global genomic variation, improve gene-disease association studies across populations, expand the scope of genomics research to the most repetitive and polymorphic regions of the genome, and serve as the ultimate genetic resource for future biomedical research and precision medicine.

Asunto(s)

Genoma Humano , Genómica , Genoma Humano/genética , Haplotipos/genética , Secuenciación de Nucleótidos de Alto Rendimiento , Humanos , Análisis de Secuencia de ADN

A global reference for human genetic variation.

Auton, Adam; Brooks, Lisa D; Durbin, Richard M; Garrison, Erik P; Kang, Hyun Min; Korbel, Jan O; Marchini, Jonathan L; McCarthy, Shane; McVean, Gil A; Abecasis, Gonçalo R.

Nature ; 526(7571): 68-74, 2015 Oct 01.

Artículo en Inglés | MEDLINE | ID: mdl-26432245

RESUMEN

The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

Asunto(s)

Variación Genética/genética , Genética de Población/normas , Genoma Humano/genética , Genómica/normas , Internacionalidad , Conjuntos de Datos como Asunto , Demografía , Susceptibilidad a Enfermedades , Exoma/genética , Genética Médica , Estudio de Asociación del Genoma Completo , Genotipo , Haplotipos/genética , Secuenciación de Nucleótidos de Alto Rendimiento , Humanos , Mutación INDEL/genética , Mapeo Físico de Cromosoma , Polimorfismo de Nucleótido Simple/genética , Sitios de Carácter Cuantitativo/genética , Enfermedades Raras/genética , Estándares de Referencia , Análisis de Secuencia de ADN

SpeedSeq: ultra-fast personal genome analysis and interpretation.

Chiang, Colby; Layer, Ryan M; Faust, Gregory G; Lindberg, Michael R; Rose, David B; Garrison, Erik P; Marth, Gabor T; Quinlan, Aaron R; Hall, Ira M.

Nat Methods ; 12(10): 966-8, 2015 Oct.

Artículo en Inglés | MEDLINE | ID: mdl-26258291

RESUMEN

SpeedSeq is an open-source genome analysis platform that accomplishes alignment, variant detection and functional annotation of a 50× human genome in 13 h on a low-cost server and alleviates a bioinformatics bottleneck that typically demands weeks of computation with extensive hands-on expert involvement. SpeedSeq offers performance competitive with or superior to current methods for detecting germline and somatic single-nucleotide variants, structural variants, insertions and deletions, and it includes novel functionality for streamlined interpretation.

Asunto(s)

Genoma Humano , Secuenciación de Nucleótidos de Alto Rendimiento/métodos , Anotación de Secuencia Molecular/métodos , Programas Informáticos , Variación Genética , Humanos , Neoplasias/genética , Polimorfismo de Nucleótido Simple , Medicina de Precisión/métodos , Flujo de Trabajo

MOSAIK: a hash-based algorithm for accurate next-generation sequencing short-read mapping.

Lee, Wan-Ping; Stromberg, Michael P; Ward, Alistair; Stewart, Chip; Garrison, Erik P; Marth, Gabor T.

PLoS One ; 9(3): e90581, 2014.

Artículo en Inglés | MEDLINE | ID: mdl-24599324

RESUMEN

MOSAIK is a stable, sensitive and open-source program for mapping second and third-generation sequencing reads to a reference genome. Uniquely among current mapping tools, MOSAIK can align reads generated by all the major sequencing technologies, including Illumina, Applied Biosystems SOLiD, Roche 454, Ion Torrent and Pacific BioSciences SMRT. Indeed, MOSAIK was the only aligner to provide consistent mappings for all the generated data (sequencing technologies, low-coverage and exome) in the 1000 Genomes Project. To provide highly accurate alignments, MOSAIK employs a hash clustering strategy coupled with the Smith-Waterman algorithm. This method is well-suited to capture mismatches as well as short insertions and deletions. To support the growing interest in larger structural variant (SV) discovery, MOSAIK provides explicit support for handling known-sequence SVs, e.g. mobile element insertions (MEIs) as well as generating outputs tailored to aid in SV discovery. All variant discovery benefits from an accurate description of the read placement confidence. To this end, MOSAIK uses a neural-network based training scheme to provide well-calibrated mapping quality scores, demonstrated by a correlation coefficient between MOSAIK assigned and actual mapping qualities greater than 0.98. In order to ensure that studies of any genome are supported, a training pipeline is provided to ensure optimal mapping quality scores for the genome under investigation. MOSAIK is multi-threaded, open source, and incorporated into our command and pipeline launcher system GKNO (http://gkno.me).

Asunto(s)

Análisis de Secuencia de ADN , Programas Informáticos , Algoritmos , Mapeo Cromosómico/métodos , Escherichia coli/genética , Secuenciación de Nucleótidos de Alto Rendimiento , Humanos , Mutación INDEL , Secuencias Repetitivas Esparcidas , Redes Neurales de la Computación , Polimorfismo de Nucleótido Simple , Curva ROC

SSW library: an SIMD Smith-Waterman C/C++ library for use in genomic applications.

Zhao, Mengyao; Lee, Wan-Ping; Garrison, Erik P; Marth, Gabor T.

PLoS One ; 8(12): e82138, 2013.

Artículo en Inglés | MEDLINE | ID: mdl-24324759

RESUMEN

BACKGROUND: The Smith-Waterman algorithm, which produces the optimal pairwise alignment between two sequences, is frequently used as a key component of fast heuristic read mapping and variation detection tools for next-generation sequencing data. Though various fast Smith-Waterman implementations are developed, they are either designed as monolithic protein database searching tools, which do not return detailed alignment, or are embedded into other tools. These issues make reusing these efficient Smith-Waterman implementations impractical. RESULTS: To facilitate easy integration of the fast Single-Instruction-Multiple-Data Smith-Waterman algorithm into third-party software, we wrote a C/C++ library, which extends Farrar's Striped Smith-Waterman (SSW) to return alignment information in addition to the optimal Smith-Waterman score. In this library we developed a new method to generate the full optimal alignment results and a suboptimal score in linear space at little cost of efficiency. This improvement makes the fast Single-Instruction-Multiple-Data Smith-Waterman become really useful in genomic applications. SSW is available both as a C/C++ software library, as well as a stand-alone alignment tool at: https://github.com/mengyao/Complete-Striped-Smith-Waterman-Library. CONCLUSIONS: The SSW library has been used in the primary read mapping tool MOSAIK, the split-read mapping program SCISSORS, the MEI detector TANGRAM, and the read-overlap graph generation program RZMBLR. The speeds of the mentioned software are improved significantly by replacing their ordinary Smith-Waterman or banded Smith-Waterman module with the SSW Library.

Asunto(s)

Algoritmos , Genómica/métodos , Lenguajes de Programación , Simulación por Computador , Bases de Datos de Proteínas , Alineación de Secuencia , Factores de Tiempo

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA