Results 1 - 20 of 22
1.
Genome Res ; 34(3): 454-468, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38627094

ABSTRACT

Reference-free genome phasing is vital for understanding allele inheritance and the impact of single-molecule DNA variation on phenotypes. To achieve thorough phasing across homozygous or repetitive regions of the genome, long-read sequencing technologies are often used to perform phased de novo assembly. As a step toward reducing the cost and complexity of this type of analysis, we describe new methods for accurately phasing Oxford Nanopore Technologies (ONT) sequence data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of ONT PromethION sequencing, including those using proximity ligation, and show that newer, higher accuracy ONT reads substantially improve assembly quality.


Subjects
Nanopores; Humans; Sequence Analysis, DNA/methods; Nanopore Sequencing/methods; High-Throughput Nucleotide Sequencing/methods; Software; Genomics/methods
2.
Nature ; 617(7960): 312-324, 2023 May.
Article in English | MEDLINE | ID: mdl-37165242

ABSTRACT

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals [1]. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.


Subjects
Genome, Human; Genomics; Humans; Diploidy; Genome, Human/genetics; Haplotypes/genetics; Sequence Analysis, DNA; Genomics/standards; Reference Standards; Cohort Studies; Alleles; Genetic Variation
3.
Nature ; 611(7936): 519-531, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36261518

ABSTRACT

The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society [1,2]. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals [3,4]. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome [5]. To address these limitations, the Human Pangenome Reference Consortium formed with the goal of creating high-quality, cost-effective, diploid genome assemblies for a pangenome reference that represents human genetic diversity [6]. Here, in our first scientific report, we determined which combination of current genome sequencing and assembly approaches yields the most complete and accurate diploid genome assembly with minimal manual curation. Approaches that used highly accurate long reads and parent-child data with graph-based haplotype phasing during assembly outperformed those that did not. Developing a combination of the top-performing methods, we generated our first high-quality diploid reference assembly, containing only approximately four gaps per chromosome on average, with most chromosomes within ±1% of the length of CHM13. Nearly 48% of protein-coding genes have non-synonymous amino acid changes between haplotypes, and centromeric regions showed the highest diversity. Our findings serve as a foundation for assembling near-complete diploid human genomes at scale for a pangenome reference to capture global genetic variation from single nucleotides to structural rearrangements.


Subjects
Chromosome Mapping; Diploidy; Genome, Human; Genomics; Humans; Chromosome Mapping/standards; Genome, Human/genetics; Haplotypes/genetics; High-Throughput Nucleotide Sequencing/methods; High-Throughput Nucleotide Sequencing/standards; Sequence Analysis, DNA/methods; Sequence Analysis, DNA/standards; Reference Standards; Genomics/methods; Genomics/standards; Chromosomes, Human/genetics; Genetic Variation/genetics
5.
ACS Nano ; 15(10): 16642-16653, 2021 Oct 26.
Article in English | MEDLINE | ID: mdl-34618430

ABSTRACT

We describe a method for direct tRNA sequencing using the Oxford Nanopore MinION. The principal technical advance is custom adapters that facilitate end-to-end sequencing of individual transfer RNA (tRNA) molecules at subnanometer precision. A second advance is a nanopore sequencing pipeline optimized for tRNA. We tested this method using purified E. coli tRNAfMet, tRNALys, and tRNAPhe samples. 76-92% of individual aligned tRNA sequence reads were full length. As a proof of concept, we showed that nanopore sequencing detected all 43 expected isoacceptors in total E. coli MRE600 tRNA as well as isodecoders that further define that tRNA population. Alignment-based comparisons between the three purified tRNAs and their synthetic controls revealed systematic nucleotide miscalls that were diagnostic of known modifications. Systematic miscalls were also observed proximal to known modifications in total E. coli tRNA alignments, including a highly conserved pseudouridine in the T loop. This work highlights the potential of nanopore direct tRNA sequencing as well as improvements needed to implement tRNA sequencing for human healthcare applications.


Subjects
Nanopore Sequencing; Nanopores; Escherichia coli/genetics; High-Throughput Nucleotide Sequencing; Humans; Nucleotides
6.
Methods Mol Biol ; 2298: 53-74, 2021.
Article in English | MEDLINE | ID: mdl-34085238

ABSTRACT

Historically, RNA has been sequenced as cDNA copies derived from reverse transcription of cellular RNA followed by PCR amplification. Recently, RNA sequencing using nanopores has emerged as an alternative. Using this technology, individual cellular RNA strands are read directly as they are driven through nanoscale pores by an applied voltage. The speed of translocation is regulated by a helicase that is loaded onto each RNA strand by an adapter that also facilitates capture by the nanopore electric field. Here we describe a technique for adapting human ribosomal RNA subunits for nanopore sequencing. Using this strategy, a single Oxford Nanopore MinION run delivered 470,907 sequence reads of which 396,048 aligned to ribosomal RNA, with 28S, 18S, 5.8S, and 5S coverage of 6053, 369,472, 16,058, and 4465 reads, respectively. Example alignments that reveal putative nucleotide modifications are provided.


Subjects
Nanopore Sequencing/methods; Nucleotides/genetics; RNA, Ribosomal/genetics; Sequence Analysis, RNA/methods; Humans; Nanopores; Sequence Analysis, DNA/methods
7.
bioRxiv ; 2021 Apr 06.
Article in English | MEDLINE | ID: mdl-33851162

ABSTRACT

We report a SARS-CoV-2 lineage that shares N501Y, P681H, and other mutations with known variants of concern, such as B.1.1.7. This lineage, which we refer to as B.1.x (COG-UK sometimes references similar samples as B.1.324.1), is present in at least 20 states across the USA and in at least six countries. However, a large deletion causes the sequence to be automatically rejected from repositories, suggesting that the frequency of this new lineage is underestimated using public data. Recent dynamics based on 339 samples obtained in Santa Cruz County, CA, USA suggest that B.1.x may be increasing in frequency at a rate similar to that of B.1.1.7 in Southern California. At present the functional differences between this variant B.1.x and other circulating SARS-CoV-2 variants are unknown, and further studies on secondary attack rates, viral loads, immune evasion and/or disease severity are needed to determine if it poses a public health concern. Nonetheless, given what is known from well-studied circulating variants of concern, it seems unlikely that the lineage could pose larger concerns for human health than many already globally distributed lineages. Our work highlights a need for rapid turnaround time from sequence generation to submission and improved sequence quality control that removes submission bias. We identify promising paths toward this goal.

8.
Nat Biotechnol ; 38(9): 1044-1053, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32686750

ABSTRACT

De novo assembly of a human genome using nanopore long-read sequences has been reported, but it used more than 150,000 CPU hours and weeks of wall-clock time. To enable rapid human genome assembly, we present Shasta, a de novo long-read assembler, and polishing algorithms named MarginPolish and HELEN. Using a single PromethION nanopore sequencer and our toolkit, we assembled 11 highly contiguous human genomes de novo in 9 d. We achieved roughly 63× coverage, 42-kb read N50 values and 6.5× coverage in reads >100 kb using three flow cells per sample. Shasta produced a complete haploid human genome assembly in under 6 h on a single commercial compute node. MarginPolish and HELEN polished haploid assemblies to more than 99.9% identity (Phred quality score QV = 30) with nanopore reads alone. Addition of proximity-ligation sequencing enabled near chromosome-level scaffolds for all 11 genomes. We compare our assembly performance to existing methods for diploid, haploid and trio-binned human samples and report superior accuracy and speed.


Subjects
Genome, Human/genetics; High-Throughput Nucleotide Sequencing/methods; Nanopore Sequencing; Sequence Analysis, DNA/methods; Algorithms; Benchmarking; Chromosomes, Human/genetics; Deep Learning; Genomics; HLA Antigens/genetics; Haploidy; High-Throughput Nucleotide Sequencing/standards; Humans; Sequence Analysis, DNA/standards
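Note on the abstract above: it equates "more than 99.9% identity" with a Phred quality score of QV = 30. The following is a minimal sketch of that standard conversion, QV = -10 · log10(1 - identity). The function names are hypothetical and are not taken from Shasta, MarginPolish, or HELEN.

```python
import math


def identity_to_qv(identity: float) -> float:
    """Convert a fractional identity (e.g. 0.999) to a Phred-scaled QV."""
    error = 1.0 - identity
    if not 0.0 < error <= 1.0:
        raise ValueError("identity must be in the range [0, 1)")
    return -10.0 * math.log10(error)


def qv_to_identity(qv: float) -> float:
    """Convert a Phred-scaled QV back to a fractional identity."""
    return 1.0 - 10.0 ** (-qv / 10.0)


if __name__ == "__main__":
    # 99.9% identity is one error per 1,000 bases.
    print(round(identity_to_qv(0.999), 2))
    print(qv_to_identity(30))
```

Each additional 10 QV units corresponds to a tenfold drop in error rate, which is why assembly accuracy is usually quoted on this logarithmic scale.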
9.
Genome Biol ; 21(1): 83, 2020 Mar 31.
Article in English | MEDLINE | ID: mdl-32234056

ABSTRACT

BACKGROUND: Long non-coding RNAs (lncRNAs) exhibit highly cell type-specific expression and function, making this class of transcript attractive for targeted cancer therapy. However, the vast majority of lncRNAs have not been tested as potential therapeutic targets, particularly in the context of currently used cancer treatments. Malignant glioma is rapidly fatal, and ionizing radiation is part of the current standard-of-care used to slow tumor growth in both adult and pediatric patients. RESULTS: We use CRISPR interference (CRISPRi) to screen 5689 lncRNA loci in human glioblastoma (GBM) cells, identifying 467 hits that modify cell growth in the presence of clinically relevant doses of fractionated radiation. Thirty-three of these lncRNA hits sensitize cells to radiation, and based on their expression in adult and pediatric gliomas, nine of these hits are prioritized as lncRNA Glioma Radiation Sensitizers (lncGRS). Knockdown of lncGRS-1, a primate-conserved, nuclear-enriched lncRNA, inhibits the growth and proliferation of primary adult and pediatric glioma cells, but not the viability of normal brain cells. Using human brain organoids comprised of mature neural cell types as a three-dimensional tissue substrate to model the invasive growth of glioma, we find that antisense oligonucleotides targeting lncGRS-1 selectively decrease tumor growth and sensitize glioma cells to radiation therapy. CONCLUSIONS: These studies identify lncGRS-1 as a glioma-specific therapeutic target and establish a generalizable approach to rapidly identify novel therapeutic targets in the vast non-coding genome to enhance radiation therapy.


Subjects
Brain Neoplasms/therapy; CRISPR-Cas Systems; Glioblastoma/therapy; RNA, Long Noncoding/antagonists & inhibitors; Adult; Astrocytes; Brain; Brain Neoplasms/genetics; Brain Neoplasms/pathology; Brain Neoplasms/radiotherapy; Cell Line, Tumor; Combined Modality Therapy; Glioblastoma/genetics; Glioblastoma/pathology; Glioblastoma/radiotherapy; Humans; Oligonucleotides, Antisense; Organoids; Radiation Tolerance
11.
Nat Methods ; 16(12): 1297-1305, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31740818

ABSTRACT

High-throughput complementary DNA sequencing technologies have advanced our understanding of transcriptome complexity and regulation. However, these methods lose information contained in biological RNA because the copied reads are often short and modifications are not retained. We address these limitations using a native poly(A) RNA sequencing strategy developed by Oxford Nanopore Technologies. Our study generated 9.9 million aligned sequence reads for the human cell line GM12878, using thirty MinION flow cells at six institutions. These native RNA reads had a median length of 771 bases, and a maximum aligned length of over 21,000 bases. Mitochondrial poly(A) reads provided an internal measure of read-length quality. We combined these long nanopore reads with higher accuracy short-reads and annotated GM12878 promoter regions to identify 33,984 plausible RNA isoforms. We describe strategies for assessing 3' poly(A) tail length, base modifications and transcript haplotypes.


Subjects
Nanopore Sequencing/methods; Poly A/genetics; Sequence Analysis, RNA/methods; Transcriptome; Cells, Cultured; Humans
12.
Nat Biotechnol ; 36(4): 321-323, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29553574

ABSTRACT

The human genome reference sequence remains incomplete owing to the challenge of assembling long tracts of near-identical tandem repeats in centromeres. We implemented a nanopore sequencing strategy to generate high-quality reads that span hundreds of kilobases of highly repetitive DNA in a human Y chromosome centromere. Combining these data with short-read variant validation, we assembled and characterized the centromeric region of a human Y chromosome.


Subjects
Centromere/genetics; Chromosomes, Human, Y/genetics; High-Throughput Nucleotide Sequencing; Tandem Repeat Sequences/genetics; Genome, Human/genetics; Humans; Nanopores; Repetitive Sequences, Nucleic Acid/genetics
13.
Nat Biotechnol ; 36(4): 338-345, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29431738

ABSTRACT

We report the sequencing and assembly of a reference genome for the human GM12878 Utah/Ceph cell line using the MinION (Oxford Nanopore Technologies) nanopore sequencer. 91.2 Gb of sequence data, representing ∼30× theoretical coverage, were produced. Reference-based alignment enabled detection of large structural variants and epigenetic modifications. De novo assembly of nanopore reads alone yielded a contiguous assembly (NG50 ∼3 Mb). We developed a protocol to generate ultra-long reads (N50 > 100 kb, read lengths up to 882 kb). Incorporating an additional 5× coverage of these ultra-long reads more than doubled the assembly contiguity (NG50 ∼6.4 Mb). The final assembled genome was 2,867 million bases in size, covering 85.8% of the reference. Assembly accuracy, after incorporating complementary short-read sequencing data, exceeded 99.8%. Ultra-long reads enabled assembly and phasing of the 4-Mb major histocompatibility complex (MHC) locus in its entirety, measurement of telomere repeat length, and closure of gaps in the reference human genome assembly GRCh38.


Subjects
Genome, Human/genetics; Genomics; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Humans; Nanopores
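Note on the abstract above: it reports theoretical coverage (91.2 Gb of sequence for about 30×) along with read N50 and assembly NG50 values. The following is a minimal sketch of how these summary statistics are commonly computed. The function names are hypothetical, and the 3.1 Gb genome size in the example is an assumed round figure that the abstract does not state.

```python
from typing import Optional, Sequence


def nx50(lengths: Sequence[int], genome_size: Optional[int] = None) -> Optional[int]:
    """Return N50, or NG50 when genome_size is given.

    N50 is the length L such that sequences of length >= L contain at
    least half of the total bases. NG50 uses half of an assumed genome
    size as the target instead of half of the summed lengths.
    Returns None if the target is never reached (possible for NG50).
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    target = (genome_size if genome_size is not None else sum(lengths)) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return None


def theoretical_coverage(total_bases: float, genome_size: float) -> float:
    """Mean depth if the bases were spread evenly over the genome."""
    return total_bases / genome_size


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # toy data, not from the paper
    print(nx50(toy_lengths))
    # Assumed 3.1 Gb human genome size (not given in the abstract).
    print(round(theoretical_coverage(91.2e9, 3.1e9), 1))
```

With the assumed genome size, 91.2 Gb works out to roughly 29×, in line with the "∼30×" quoted in the abstract.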
14.
Genome Res ; 28(2): 266-274, 2018 Feb.
Article in English | MEDLINE | ID: mdl-29273626

ABSTRACT

Advances in long-read single molecule sequencing have opened new possibilities for 'benchtop' whole-genome sequencing. The Oxford Nanopore Technologies MinION is a portable device that uses nanopore technology that can directly sequence DNA molecules. MinION single molecule long sequence reads are well suited for de novo assembly of complex genomes as they facilitate the construction of highly contiguous physical genome maps obviating the need for labor-intensive physical genome mapping. Long sequence reads can also be used to delineate complex chromosomal rearrangements, such as those that occur in tumor cells, that can confound analysis using short reads. Here, we assessed MinION long-read-derived sequences for feasibility concerning: (1) the de novo assembly of a large complex genome, and (2) the elucidation of complex rearrangements. The genomes of two Caenorhabditis elegans strains, a wild-type strain and a strain containing two complex rearrangements, were sequenced with MinION. Up to 42-fold coverage was obtained from a single flow cell, and the best pooled data assembly produced a highly contiguous wild-type C. elegans genome containing 48 contigs (N50 contig length = 3.99 Mb) covering >99% of the 100,286,401-base reference genome. Further, the MinION-derived genome assembly expanded the C. elegans reference genome by >2 Mb due to a more accurate determination of repetitive sequence elements and assembled the complete genomes of two co-extracted bacteria. MinION long-read sequence data also facilitated the elucidation of complex rearrangements in a mutagenized strain. The sequence accuracy of the MinION long-read contigs (∼98%) was improved using Illumina-derived sequence data to polish the final genome assembly to 99.8% nucleotide accuracy when compared to the reference assembly.


Subjects
Caenorhabditis elegans/genetics; Genome/genetics; Molecular Sequence Annotation; Animals; Chromosome Mapping; Gene Rearrangement/genetics; High-Throughput Nucleotide Sequencing; Repetitive Sequences, Nucleic Acid/genetics
15.
F1000Res ; 6: 760, 2017.
Article in English | MEDLINE | ID: mdl-28794860

ABSTRACT

BACKGROUND: Long-read sequencing is rapidly evolving and reshaping the suite of opportunities for genomic analysis. For the MinION in particular, as both the platform and chemistry develop, the user community requires reference data to set performance expectations and maximally exploit third-generation sequencing. We performed an analysis of MinION data derived from whole genome sequencing of Escherichia coli K-12 using the R9.0 chemistry, comparing the results with the older R7.3 chemistry. METHODS: We computed the error-rate estimates for insertions, deletions, and mismatches in MinION reads. RESULTS: Run-time characteristics of the flow cell and run scripts for R9.0 were similar to those observed for R7.3 chemistry, but with an 8-fold increase in bases per second (from 30 bps in R7.3 and SQK-MAP005 library preparation, to 250 bps in R9.0) processed by individual nanopores, and less drop-off in yield over time. The 2-dimensional ("2D") N50 read length was unchanged from the prior chemistry. Using the proportion of alignable reads as a measure of base-call accuracy, 99.9% of "pass" template reads from 1-dimensional ("1D") experiments were mappable and ~97% from 2D experiments. The median identity of reads was ~89% for 1D and ~94% for 2D experiments. The total error rate (miscall + insertion + deletion) decreased for 2D "pass" reads from 9.1% in R7.3 to 7.5% in R9.0 and for template "pass" reads from 26.7% in R7.3 to 14.5% in R9.0. CONCLUSIONS: These Phase 2 MinION experiments serve as a baseline by providing estimates for read quality, throughput, and mappability. The datasets further enable the development of bioinformatic tools tailored to the new R9.0 chemistry and the design of novel biological applications for this technology. ABBREVIATIONS: K: thousand, Kb: kilobase (one thousand base pairs), M: million, Mb: megabase (one million base pairs), Gb: gigabase (one billion base pairs).
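Note on the abstract above: it defines the total error rate as miscall + insertion + deletion. The following is a minimal sketch of one way to derive such rates from per-read alignment counts. It assumes the denominator is the number of alignment columns (matches + mismatches + insertions + deletions); the paper may use a different convention, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class AlignmentCounts:
    """Per-read alignment tallies (hypothetical container)."""
    matches: int
    mismatches: int
    insertions: int
    deletions: int


def error_rates(counts: AlignmentCounts) -> Dict[str, float]:
    """Return mismatch, insertion, deletion, total error and identity rates.

    Assumption: rates are fractions of alignment columns, so that
    identity + total error == 1.
    """
    columns = (counts.matches + counts.mismatches
               + counts.insertions + counts.deletions)
    if columns == 0:
        raise ValueError("alignment has no columns")
    rates = {
        "mismatch": counts.mismatches / columns,
        "insertion": counts.insertions / columns,
        "deletion": counts.deletions / columns,
    }
    rates["total"] = rates["mismatch"] + rates["insertion"] + rates["deletion"]
    rates["identity"] = counts.matches / columns
    return rates


if __name__ == "__main__":
    # Toy counts, not taken from the study.
    toy = AlignmentCounts(matches=900, mismatches=40, insertions=30, deletions=30)
    for name, value in error_rates(toy).items():
        print(f"{name}: {value:.3f}")
```

Other common conventions divide by read length or by reference length instead, so rates from different studies are only comparable when the denominator is the same.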

16.
Nat Commun ; 8: 16027, 2017 Jul 19.
Article in English | MEDLINE | ID: mdl-28722025

ABSTRACT

Understanding gene regulation and function requires a genome-wide method capable of capturing both gene expression levels and isoform diversity at the single-cell level. Short-read RNAseq is limited in its ability to resolve complex isoforms because it fails to sequence full-length cDNA copies of RNA molecules. Here, we investigate whether RNAseq using the long-read single-molecule Oxford Nanopore MinION sequencer is able to identify and quantify complex isoforms without sacrificing accurate gene expression quantification. After benchmarking our approach, we analyse individual murine B1a cells using a custom multiplexing strategy. We identify thousands of unannotated transcription start and end sites, as well as hundreds of alternative splicing events in these B1a cells. We also identify hundreds of genes expressed across B1a cells that display multiple complex isoforms, including several B cell-specific surface receptors. Our results show that we can identify and quantify complex isoforms at the single cell level.


Subjects
B-Lymphocytes/metabolism; Gene Expression Profiling/methods; Receptors, Cell Surface/metabolism; Single-Cell Analysis/methods; Animals; Benchmarking; Mice, Inbred C57BL; Protein Isoforms/metabolism; Sequence Analysis, DNA; Sequence Analysis, RNA; Transcriptome
17.
Nat Methods ; 14(4): 411-413, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28218897

ABSTRACT

DNA chemical modifications regulate genomic function. We present a framework for mapping cytosine and adenosine methylation with the Oxford Nanopore Technologies MinION using this nanopore sequencer's ionic current signal. We map three cytosine variants and two adenine variants. The results show that our model is sensitive enough to detect changes in genomic DNA methylation levels as a function of growth phase in Escherichia coli.


Subjects
5-Methylcytosine/metabolism; DNA Methylation; High-Throughput Nucleotide Sequencing/methods; Nanopores; 5-Methylcytosine/analysis; Escherichia coli/genetics; Genome, Bacterial; High-Throughput Nucleotide Sequencing/instrumentation; Markov Chains; Models, Genetic
19.
Genome Biol ; 17(1): 239, 2016 Nov 25.
Article in English | MEDLINE | ID: mdl-27887629

ABSTRACT

Nanopore DNA strand sequencing has emerged as a competitive, portable technology. Reads exceeding 150 kilobases have been achieved, as have in-field detection and analysis of clinical pathogens. We summarize key technical features of the Oxford Nanopore MinION, the dominant platform currently available. We then discuss pioneering applications executed by the genomics community.


Subjects
Algorithms; Computational Biology/methods; Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Nanopores; Aneuploidy; Computational Biology/instrumentation; DNA/analysis; DNA/genetics; Genomics/instrumentation; High-Throughput Nucleotide Sequencing/instrumentation; Humans; Reproducibility of Results
20.
Nat Methods ; 12(4): 351-6, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25686389

ABSTRACT

Speed, single-base sensitivity and long read lengths make nanopores a promising technology for high-throughput sequencing. We evaluated and optimized the performance of the MinION nanopore sequencer using M13 genomic DNA and used expectation maximization to obtain robust maximum-likelihood estimates for insertion, deletion and substitution error rates (4.9%, 7.8% and 5.1%, respectively). Over 99% of high-quality 2D MinION reads mapped to the reference at a mean identity of 85%. We present a single-nucleotide-variant detection tool that uses maximum-likelihood parameter estimates and marginalization over many possible read alignments to achieve precision and recall of up to 99%. By pairing our high-confidence alignment strategy with long MinION reads, we resolved the copy number for a cancer-testis gene family (CT47) within an unresolved region of human chromosome Xq24.


Subjects
High-Throughput Nucleotide Sequencing/methods; Nanopores; Algorithms; Gene Dosage; Humans; Neoplasms/genetics