Search | BVS Regional Portal

A draft human pangenome reference.

Liao, Wen-Wei; Asri, Mobin; Ebler, Jana; Doerr, Daniel; Haukness, Marina; Hickey, Glenn; Lu, Shuangjia; Lucas, Julian K; Monlong, Jean; Abel, Haley J; Buonaiuto, Silvia; Chang, Xian H; Cheng, Haoyu; Chu, Justin; Colonna, Vincenza; Eizenga, Jordan M; Feng, Xiaowen; Fischer, Christian; Fulton, Robert S; Garg, Shilpa; Groza, Cristian; Guarracino, Andrea; Harvey, William T; Heumos, Simon; Howe, Kerstin; Jain, Miten; Lu, Tsung-Yu; Markello, Charles; Martin, Fergal J; Mitchell, Matthew W; Munson, Katherine M; Mwaniki, Moses Njagi; Novak, Adam M; Olsen, Hugh E; Pesout, Trevor; Porubsky, David; Prins, Pjotr; Sibbesen, Jonas A; Sirén, Jouni; Tomlinson, Chad; Villani, Flavia; Vollger, Mitchell R; Antonacci-Fulton, Lucinda L; Baid, Gunjan; Baker, Carl A; Belyaeva, Anastasiya; Billis, Konstantinos; Carroll, Andrew; Chang, Pi-Chuan; Cody, Sarah.

Nature ; 617(7960): 312-324, 2023 05.

Article in English | MEDLINE | ID: mdl-37165242

ABSTRACT

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.

Subject(s)

Human Genome , Genomics , Humans , Diploidy , Human Genome/genetics , Haplotypes/genetics , DNA Sequence Analysis , Genomics/standards , Reference Standards , Cohort Studies , Alleles , Genetic Variation

DeepConsensus improves the accuracy of sequences with a gap-aware sequence transformer.

Baid, Gunjan; Cook, Daniel E; Shafin, Kishwar; Yun, Taedong; Llinares-López, Felipe; Berthet, Quentin; Belyaeva, Anastasiya; Töpfer, Armin; Wenger, Aaron M; Rowell, William J; Yang, Howard; Kolesnikov, Alexey; Ammar, Waleed; Vert, Jean-Philippe; Vaswani, Ashish; McLean, Cory Y; Nattestad, Maria; Chang, Pi-Chuan; Carroll, Andrew.

Nat Biotechnol ; 41(2): 232-238, 2023 02.

Article in English | MEDLINE | ID: mdl-36050551

ABSTRACT

Circular consensus sequencing with Pacific Biosciences (PacBio) technology generates long (10-25 kilobases), accurate 'HiFi' reads by combining serial observations of a DNA molecule into a consensus sequence. The standard approach to consensus generation, pbccs, uses a hidden Markov model. We introduce DeepConsensus, which uses an alignment-based loss to train a gap-aware transformer-encoder for sequence correction. Compared to pbccs, DeepConsensus reduces read errors by 42%. This increases the yield of PacBio HiFi reads at Q20 by 9%, at Q30 by 27% and at Q40 by 90%. With two SMRT Cells of HG003, reads from DeepConsensus improve hifiasm assembly contiguity (NG50 4.9 megabases (Mb) to 17.2 Mb), increase gene completeness (94% to 97%), reduce the false gene duplication rate (1.1% to 0.5%), improve assembly base accuracy (Q43 to Q45) and reduce variant-calling errors by 24%. DeepConsensus models could be trained to the general problem of analyzing the alignment of other types of sequences, such as unique molecular identifiers or genome assemblies.
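The contiguity figures above are reported as NG50. As a minimal sketch of how that statistic is usually defined (the length of the contig at which the cumulative length of contigs, sorted longest first, reaches half of the estimated genome size), the following may help. The function name and the toy contig lengths are illustrative assumptions and are not data from the paper.

```python
def ng50(contig_lengths, genome_size):
    """Return NG50: the contig length at which contigs, taken from
    longest to shortest, first cover at least half of genome_size.
    Returns 0 if the assembly covers less than half of the genome."""
    half = genome_size / 2
    running_total = 0
    for length in sorted(contig_lengths, reverse=True):
        running_total += length
        if running_total >= half:
            return length
    return 0


# Toy example: made-up contig lengths (in Mb) and a made-up genome size.
toy_contigs = [40, 30, 20, 10, 5]
print(ng50(toy_contigs, genome_size=120))
```

Unlike N50, which uses the total assembly length as the denominator, NG50 uses a fixed genome size, so values from different assemblies of the same genome can be compared directly.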

Subject(s)

High-Throughput Nucleotide Sequencing , DNA Sequence Analysis

PrecisionFDA Truth Challenge V2: Calling variants from short and long reads in difficult-to-map regions.

Olson, Nathan D; Wagner, Justin; McDaniel, Jennifer; Stephens, Sarah H; Westreich, Samuel T; Prasanna, Anish G; Johanson, Elaine; Boja, Emily; Maier, Ezekiel J; Serang, Omar; Jáspez, David; Lorenzo-Salazar, José M; Muñoz-Barrera, Adrián; Rubio-Rodríguez, Luis A; Flores, Carlos; Kyriakidis, Konstantinos; Malousi, Andigoni; Shafin, Kishwar; Pesout, Trevor; Jain, Miten; Paten, Benedict; Chang, Pi-Chuan; Kolesnikov, Alexey; Nattestad, Maria; Baid, Gunjan; Goel, Sidharth; Yang, Howard; Carroll, Andrew; Eveleigh, Robert; Bourgey, Mathieu; Bourque, Guillaume; Li, Gen; Ma, ChouXian; Tang, LinQi; Du, YuanPing; Zhang, ShaoWei; Morata, Jordi; Tonda, Raúl; Parra, Genís; Trotta, Jean-Rémi; Brueffer, Christian; Demirkaya-Budak, Sinem; Kabakci-Zorlu, Duygu; Turgut, Deniz; Kalay, Özem; Budak, Gungor; Narci, Kübra; Arslan, Elif; Brown, Richard; Johnson, Ivan J.

Cell Genom ; 2(5), 2022 May 11.

Article in English | MEDLINE | ID: mdl-35720974

ABSTRACT

The precisionFDA Truth Challenge V2 aimed to assess the state of the art of variant calling in challenging genomic regions. Starting with FASTQs, 20 challenge participants applied their variant-calling pipelines and submitted 64 variant call sets for one or more sequencing technologies (Illumina, PacBio HiFi, and Oxford Nanopore Technologies). Submissions were evaluated following best practices for benchmarking small variants with updated Genome in a Bottle benchmark sets and genome stratifications. Challenge submissions included numerous innovative methods, with graph-based and machine learning methods scoring best for short-read and long-read datasets, respectively. With machine learning approaches, combining multiple sequencing technologies performed particularly well. Recent developments in sequencing and variant calling have enabled benchmarking variants in challenging genomic regions, paving the way for the identification of previously unknown clinically relevant variants.
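Small-variant benchmarking of the kind described above is commonly summarized with precision, recall, and F1 computed from true-positive, false-positive, and false-negative counts. The sketch below assumes that convention; the abstract does not state the exact scoring metric, and the function name and counts are hypothetical. Real benchmarking tools may also count true positives separately for the truth set and the query set, which this simplification ignores.

```python
def benchmark_metrics(tp, fp, fn):
    """Compute precision, recall, and F1 from variant-call counts.

    tp: calls matching the benchmark set
    fp: calls absent from the benchmark set
    fn: benchmark variants that were not called
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for one stratification, not results from the challenge.
precision, recall, f1 = benchmark_metrics(tp=90, fp=10, fn=30)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Computing these metrics per genome stratification (for example, only within difficult-to-map regions) is what allows call sets to be compared in the challenging regions the abstract refers to.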

Accelerated identification of disease-causing variants with ultra-rapid nanopore genome sequencing.

Goenka, Sneha D; Gorzynski, John E; Shafin, Kishwar; Fisk, Dianna G; Pesout, Trevor; Jensen, Tanner D; Monlong, Jean; Chang, Pi-Chuan; Baid, Gunjan; Bernstein, Jonathan A; Christle, Jeffrey W; Dalton, Karen P; Garalde, Daniel R; Grove, Megan E; Guillory, Joseph; Kolesnikov, Alexey; Nattestad, Maria; Ruzhnikov, Maura R Z; Samadi, Mehrzad; Sethia, Ankit; Spiteri, Elizabeth; Wright, Christopher J; Xiong, Katherine; Zhu, Tong; Jain, Miten; Sedlazeck, Fritz J; Carroll, Andrew; Paten, Benedict; Ashley, Euan A.

Nat Biotechnol ; 40(7): 1035-1041, 2022 07.

Article in English | MEDLINE | ID: mdl-35347328

ABSTRACT

Whole-genome sequencing (WGS) can identify variants that cause genetic disease, but the time required for sequencing and analysis has been a barrier to its use in acutely ill patients. In the present study, we develop an approach for ultra-rapid nanopore WGS that combines an optimized sample preparation protocol, distributing sequencing over 48 flow cells, near real-time base calling and alignment, accelerated variant calling and fast variant filtration for efficient manual review. Application to two example clinical cases identified a candidate variant in <8 h from sample preparation to variant identification. We show that this framework provides accurate variant calls and efficient prioritization, and accelerates diagnostic clinical genome sequencing twofold compared with previous approaches.

Subject(s)

Nanopore Sequencing , Nanopores , Chromosome Mapping , High-Throughput Nucleotide Sequencing/methods , Humans , Whole Genome Sequencing/methods

Ultrarapid Nanopore Genome Sequencing in a Critical Care Setting.

Gorzynski, John E; Goenka, Sneha D; Shafin, Kishwar; Jensen, Tanner D; Fisk, Dianna G; Grove, Megan E; Spiteri, Elizabeth; Pesout, Trevor; Monlong, Jean; Baid, Gunjan; Bernstein, Jonathan A; Ceresnak, Scott; Chang, Pi-Chuan; Christle, Jeffrey W; Chubb, Henry; Dalton, Karen P; Dunn, Kyla; Garalde, Daniel R; Guillory, Joseph; Knowles, Joshua W; Kolesnikov, Alexey; Ma, Michael; Moscarello, Tia; Nattestad, Maria; Perez, Marco; Ruzhnikov, Maura R Z; Samadi, Mehrzad; Setia, Ankit; Wright, Chris; Wusthoff, Courtney J; Xiong, Katherine; Zhu, Tong; Jain, Miten; Sedlazeck, Fritz J; Carroll, Andrew; Paten, Benedict; Ashley, Euan A.

N Engl J Med ; 386(7): 700-702, 2022 02 17.

Article in English | MEDLINE | ID: mdl-35020984

Subject(s)

Critical Care , Nanopore Sequencing/methods , Neurodevelopmental Disorders/diagnosis , Adolescent , Preschool Child , Female , Humans , Infant , Newborn Infant , Male , Middle Aged , Mutation , Nanopore Sequencing/economics , Neurodevelopmental Disorders/genetics , DNA Sequence Analysis/methods , Status Epilepticus/genetics

Haplotype-aware variant calling with PEPPER-Margin-DeepVariant enables high accuracy in nanopore long-reads.

Shafin, Kishwar; Pesout, Trevor; Chang, Pi-Chuan; Nattestad, Maria; Kolesnikov, Alexey; Goel, Sidharth; Baid, Gunjan; Kolmogorov, Mikhail; Eizenga, Jordan M; Miga, Karen H; Carnevali, Paolo; Jain, Miten; Carroll, Andrew; Paten, Benedict.

Nat Methods ; 18(11): 1322-1332, 2021 11.

Article in English | MEDLINE | ID: mdl-34725481

ABSTRACT

Long-read sequencing has the potential to transform variant detection by reaching currently difficult-to-map regions and routinely linking together adjacent variations to enable read-based phasing. Third-generation nanopore sequence data have demonstrated a long read length, but current interpretation methods for their novel pore-based signal have unique error profiles, making accurate analysis challenging. Here, we introduce a haplotype-aware variant calling pipeline, PEPPER-Margin-DeepVariant, that produces state-of-the-art variant calling results with nanopore data. We show that our nanopore-based method outperforms the short-read-based single-nucleotide-variant identification method at the whole-genome scale and produces high-quality single-nucleotide variants in segmental duplications and low-mappability regions where short-read-based genotyping fails. We show that our pipeline can provide highly contiguous phase blocks across the genome with nanopore reads, contiguously spanning between 85% and 92% of annotated genes across six samples. We also extend PEPPER-Margin-DeepVariant to PacBio HiFi data, providing an efficient solution with superior performance over the current WhatsHap-DeepVariant standard. Finally, we demonstrate de novo assembly polishing methods that use nanopore and PacBio HiFi reads to produce diploid assemblies with high accuracy (Q35+ nanopore-polished and Q40+ PacBio HiFi-polished).
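The accuracy figures above (Q35+, Q40+) are Phred-scaled quality values, where Q = -10 * log10(p) and p is the per-base error probability. A minimal sketch of the conversion in both directions follows; the function names are illustrative and not part of any tool mentioned here.

```python
import math


def phred_to_error(q):
    """Convert a Phred quality value Q to a per-base error probability."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Convert a per-base error probability p (0 < p <= 1) to Phred Q."""
    return -10 * math.log10(p)


# Error probabilities implied by the quality levels quoted in the abstract.
for q in (35, 40):
    p = phred_to_error(q)
    print(f"Q{q}: about 1 error per {round(1 / p):,} bases")
```

Because the scale is logarithmic, each gain of 10 in Q corresponds to a tenfold reduction in the expected error rate.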

Subject(s)

Genes , Haplotypes , High-Throughput Nucleotide Sequencing/methods , Nanopores , Single Nucleotide Polymorphism , DNA Sequence Analysis/methods , Software , Human Genome , Humans , Molecular Sequence Annotation
