Results 1 - 20 of 28
1.
Genome Res ; 34(3): 454-468, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38627094

ABSTRACT

Reference-free genome phasing is vital for understanding allele inheritance and the impact of single-molecule DNA variation on phenotypes. To achieve thorough phasing across homozygous or repetitive regions of the genome, long-read sequencing technologies are often used to perform phased de novo assembly. As a step toward reducing the cost and complexity of this type of analysis, we describe new methods for accurately phasing Oxford Nanopore Technologies (ONT) sequence data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of ONT PromethION sequencing, including those using proximity ligation, and show that newer, higher accuracy ONT reads substantially improve assembly quality.


Subjects
Nanopores; Humans; Sequence Analysis, DNA/methods; Nanopore Sequencing/methods; High-Throughput Nucleotide Sequencing/methods; Software; Genomics/methods
2.
Nature ; 617(7960): 312-324, 2023 05.
Article in English | MEDLINE | ID: mdl-37165242

ABSTRACT

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals [1]. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.


Subjects
Genome, Human; Genomics; Humans; Diploidy; Genome, Human/genetics; Haplotypes/genetics; Sequence Analysis, DNA; Genomics/standards; Reference Standards; Cohort Studies; Alleles; Genetic Variation
3.
bioRxiv ; 2023 Feb 22.
Article in English | MEDLINE | ID: mdl-36865218

ABSTRACT

As a step towards simplifying and reducing the cost of haplotype-resolved de novo assembly, we describe new methods for accurately phasing nanopore data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of Oxford Nanopore Technologies' (ONT) PromethION sequencing, including those using proximity ligation, and show that newer, higher accuracy ONT reads substantially improve assembly quality.

5.
Nature ; 611(7936): 519-531, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36261518

ABSTRACT

The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society [1,2]. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals [3,4]. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome [5]. To address these limitations, the Human Pangenome Reference Consortium formed with the goal of creating high-quality, cost-effective, diploid genome assemblies for a pangenome reference that represents human genetic diversity [6]. Here, in our first scientific report, we determined which combination of current genome sequencing and assembly approaches yields the most complete and accurate diploid genome assembly with minimal manual curation. Approaches that used highly accurate long reads and parent-child data with graph-based haplotype phasing during assembly outperformed those that did not. Developing a combination of the top-performing methods, we generated our first high-quality diploid reference assembly, containing only approximately four gaps per chromosome on average, with most chromosomes within ±1% of the length of CHM13. Nearly 48% of protein-coding genes have non-synonymous amino acid changes between haplotypes, and centromeric regions showed the highest diversity. Our findings serve as a foundation for assembling near-complete diploid human genomes at scale for a pangenome reference to capture global genetic variation from single nucleotides to structural rearrangements.


Subjects
Chromosome Mapping; Diploidy; Genome, Human; Genomics; Humans; Chromosome Mapping/standards; Genome, Human/genetics; Haplotypes/genetics; High-Throughput Nucleotide Sequencing/methods; High-Throughput Nucleotide Sequencing/standards; Sequence Analysis, DNA/methods; Sequence Analysis, DNA/standards; Reference Standards; Genomics/methods; Genomics/standards; Chromosomes, Human/genetics; Genetic Variation/genetics
6.
Nucleic Acids Res ; 50(6): 3475-3489, 2022 04 08.
Article in English | MEDLINE | ID: mdl-35244721

ABSTRACT

The SARS-CoV-2 virus has a complex transcriptome characterised by multiple, nested subgenomic RNAs used to express structural and accessory proteins. Long-read sequencing technologies such as nanopore direct RNA sequencing can recover full-length transcripts, greatly simplifying the assembly of structurally complex RNAs. However, these techniques do not detect the 5' cap, thus preventing reliable identification and quantification of full-length, coding transcript models. Here we used Nanopore ReCappable Sequencing (NRCeq), a new technique that can identify capped full-length RNAs, to assemble a complete annotation of SARS-CoV-2 sgRNAs and annotate the location of capping sites across the viral genome. We obtained robust estimates of sgRNA expression across cell lines and viral isolates and identified novel canonical and non-canonical sgRNAs, including one that uses a previously un-annotated leader-to-body junction site. The data generated in this work constitute a useful resource for the scientific community and provide important insights into the mechanisms that regulate the transcription of SARS-CoV-2 sgRNAs.


Subjects
COVID-19; Nanopores; RNA, Guide, Kinetoplastida/chemistry; COVID-19/genetics; Genome, Viral/genetics; Humans; RNA Caps; RNA, Viral/genetics; RNA, Viral/metabolism; SARS-CoV-2/genetics
7.
RNA ; 28(2): 162-176, 2022 02.
Article in English | MEDLINE | ID: mdl-34728536

ABSTRACT

Nanopore sequencing devices read individual RNA strands directly. This facilitates identification of exon linkages and nucleotide modifications; however, using conventional direct RNA nanopore sequencing, the 5' and 3' ends of poly(A) RNA cannot be identified unambiguously. This is due in part to RNA degradation in vivo and in vitro that can obscure transcription start and end sites. In this study, we aimed to identify individual full-length human RNA isoforms among ∼4 million nanopore poly(A)-selected RNA reads. First, to identify RNA strands bearing 5' m7G caps, we exchanged the biological cap for a modified cap attached to a 45-nt oligomer. This oligomer adaptation method improved 5' end sequencing and ensured correct identification of the 5' m7G capped ends. Second, among these 5'-capped nanopore reads, we screened for features consistent with a 3' polyadenylation site. Combining these two steps, we identified 294,107 individual high-confidence full-length RNA scaffolds from human GM12878 cells, most of which (257,721) aligned to protein-coding genes. Of these, 4876 scaffolds indicated unannotated isoforms that were often internal to longer, previously identified RNA isoforms. Orthogonal data for m7G caps and open chromatin, such as CAGE and DNase-HS seq, confirmed the validity of these high-confidence RNA scaffolds.


Subjects
RNA Isoforms/chemistry; RNA, Messenger/chemistry; Cell Line, Tumor; Humans; Nanopore Sequencing/methods; RNA 3' Polyadenylation Signals; RNA Isoforms/genetics; RNA, Messenger/genetics; Transcriptome
8.
ACS Nano ; 15(10): 16642-16653, 2021 10 26.
Article in English | MEDLINE | ID: mdl-34618430

ABSTRACT

We describe a method for direct tRNA sequencing using the Oxford Nanopore MinION. The principal technical advance is custom adapters that facilitate end-to-end sequencing of individual transfer RNA (tRNA) molecules at subnanometer precision. A second advance is a nanopore sequencing pipeline optimized for tRNA. We tested this method using purified E. coli tRNAfMet, tRNALys, and tRNAPhe samples. 76-92% of individual aligned tRNA sequence reads were full length. As a proof of concept, we showed that nanopore sequencing detected all 43 expected isoacceptors in total E. coli MRE600 tRNA as well as isodecoders that further define that tRNA population. Alignment-based comparisons between the three purified tRNAs and their synthetic controls revealed systematic nucleotide miscalls that were diagnostic of known modifications. Systematic miscalls were also observed proximal to known modifications in total E. coli tRNA alignments, including a highly conserved pseudouridine in the T loop. This work highlights the potential of nanopore direct tRNA sequencing as well as improvements needed to implement tRNA sequencing for human healthcare applications.


Subjects
Nanopore Sequencing; Nanopores; Escherichia coli/genetics; High-Throughput Nucleotide Sequencing; Humans; Nucleotides
9.
Methods Mol Biol ; 2298: 53-74, 2021.
Article in English | MEDLINE | ID: mdl-34085238

ABSTRACT

Historically, RNA has been sequenced as cDNA copies derived from reverse transcription of cellular RNA followed by PCR amplification. Recently, RNA sequencing using nanopores has emerged as an alternative. Using this technology, individual cellular RNA strands are read directly as they are driven through nanoscale pores by an applied voltage. The speed of translocation is regulated by a helicase that is loaded onto each RNA strand by an adapter that also facilitates capture by the nanopore electric field. Here we describe a technique for adapting human ribosomal RNA subunits for nanopore sequencing. Using this strategy, a single Oxford Nanopore MinION run delivered 470,907 sequence reads of which 396,048 aligned to ribosomal RNA, with 28S, 18S, 5.8S, and 5S coverage of 6053, 369,472, 16,058, and 4465 reads, respectively. Example alignments that reveal putative nucleotide modifications are provided.


Subjects
Nanopore Sequencing/methods; Nucleotides/genetics; RNA, Ribosomal/genetics; Sequence Analysis, RNA/methods; Humans; Nanopores; Sequence Analysis, DNA/methods
10.
bioRxiv ; 2021 Apr 06.
Article in English | MEDLINE | ID: mdl-33851162

ABSTRACT

We report a SARS-CoV-2 lineage that shares N501Y, P681H, and other mutations with known variants of concern, such as B.1.1.7. This lineage, which we refer to as B.1.x (COG-UK sometimes references similar samples as B.1.324.1), is present in at least 20 states across the USA and in at least six countries. However, a large deletion causes the sequence to be automatically rejected from repositories, suggesting that the frequency of this new lineage is underestimated using public data. Recent dynamics based on 339 samples obtained in Santa Cruz County, CA, USA suggest that B.1.x may be increasing in frequency at a rate similar to that of B.1.1.7 in Southern California. At present the functional differences between this variant B.1.x and other circulating SARS-CoV-2 variants are unknown, and further studies on secondary attack rates, viral loads, immune evasion and/or disease severity are needed to determine if it poses a public health concern. Nonetheless, given what is known from well-studied circulating variants of concern, it seems unlikely that the lineage could pose larger concerns for human health than many already globally distributed lineages. Our work highlights a need for rapid turnaround time from sequence generation to submission and improved sequence quality control that removes submission bias. We identify promising paths toward this goal.

11.
Am J Hum Genet ; 107(4): 654-669, 2020 10 01.
Article in English | MEDLINE | ID: mdl-32937144

ABSTRACT

There is growing recognition that epivariations, most often recognized as promoter hypermethylation events that lead to gene silencing, are associated with a number of human diseases. However, little information exists on the prevalence and distribution of rare epigenetic variation in the human population. In order to address this, we performed a survey of methylation profiles from 23,116 individuals using the Illumina 450k array. Using a robust outlier approach, we identified 4,452 unique autosomal epivariations, including potentially inactivating promoter methylation events at 384 genes linked to human disease. For example, we observed promoter hypermethylation of BRCA1 and LDLR at population frequencies of ∼1 in 3,000 and ∼1 in 6,000, respectively, suggesting that epivariations may underlie a fraction of human disease which would be missed by purely sequence-based approaches. Using expression data, we confirmed that many epivariations are associated with outlier gene expression. Analysis of variation data and monozygous twin pairs suggests that approximately two-thirds of epivariations segregate in the population secondary to underlying sequence mutations, while one-third are likely sporadic events that occur post-zygotically. We identified 25 loci where rare hypermethylation coincided with the presence of an unstable CGG tandem repeat, validated the presence of CGG expansions at several loci, and identified the putative molecular defect underlying most of the known folate-sensitive fragile sites in the genome. Our study provides a catalog of rare epigenetic changes in the human genome, gives insight into the underlying origins and consequences of epivariations, and identifies many hypermethylated CGG repeat expansions.


Subjects
BRCA1 Protein/genetics; Epigenesis, Genetic; Genetic Diseases, Inborn/genetics; Genome, Human; Receptors, LDL/genetics; Trinucleotide Repeat Expansion; BRCA1 Protein/metabolism; DNA Methylation; Female; Folic Acid/metabolism; Gene Silencing; Genetic Diseases, Inborn/diagnosis; Genetic Diseases, Inborn/pathology; Genetic Loci; Genetic Variation; High-Throughput Nucleotide Sequencing; Humans; Male; Promoter Regions, Genetic; Receptors, LDL/metabolism; Twins, Monozygotic
12.
Nat Biotechnol ; 38(9): 1044-1053, 2020 09.
Article in English | MEDLINE | ID: mdl-32686750

ABSTRACT

De novo assembly of a human genome using nanopore long-read sequences has been reported, but it used more than 150,000 CPU hours and weeks of wall-clock time. To enable rapid human genome assembly, we present Shasta, a de novo long-read assembler, and polishing algorithms named MarginPolish and HELEN. Using a single PromethION nanopore sequencer and our toolkit, we assembled 11 highly contiguous human genomes de novo in 9 d. We achieved roughly 63× coverage, 42-kb read N50 values and 6.5× coverage in reads >100 kb using three flow cells per sample. Shasta produced a complete haploid human genome assembly in under 6 h on a single commercial compute node. MarginPolish and HELEN polished haploid assemblies to more than 99.9% identity (Phred quality score QV = 30) with nanopore reads alone. Addition of proximity-ligation sequencing enabled near chromosome-level scaffolds for all 11 genomes. We compare our assembly performance to existing methods for diploid, haploid and trio-binned human samples and report superior accuracy and speed.


Subjects
Genome, Human/genetics; High-Throughput Nucleotide Sequencing/methods; Nanopore Sequencing; Sequence Analysis, DNA/methods; Algorithms; Benchmarking; Chromosomes, Human/genetics; Deep Learning; Genomics; HLA Antigens/genetics; Haploidy; High-Throughput Nucleotide Sequencing/standards; Humans; Sequence Analysis, DNA/standards
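The identity figure quoted in record 12 (more than 99.9% identity, QV = 30) follows from the standard Phred relation QV = -10 * log10(error rate). The Python sketch below only illustrates that conversion; the function names are hypothetical and are not taken from Shasta, MarginPolish, or HELEN.

```python
import math


def identity_to_qv(identity: float) -> float:
    """Convert a fractional identity (e.g. 0.999) to a Phred-scaled quality value."""
    if not 0.0 <= identity < 1.0:
        # identity == 1.0 would mean zero errors, for which QV is unbounded
        raise ValueError("identity must be in the range [0, 1)")
    error_rate = 1.0 - identity
    return -10.0 * math.log10(error_rate)


def qv_to_identity(qv: float) -> float:
    """Convert a Phred-scaled quality value back to a fractional identity."""
    return 1.0 - 10.0 ** (-qv / 10.0)


if __name__ == "__main__":
    # 99.9% identity corresponds to one error per 1,000 bases, i.e. about QV 30
    print(round(identity_to_qv(0.999), 2))
```

Each additional 10 QV units corresponds to a tenfold drop in error rate, so QV 40 would mean 99.99% identity.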
13.
Bioinformatics ; 36(19): 4928-4934, 2020 12 08.
Article in English | MEDLINE | ID: mdl-32597959

ABSTRACT

MOTIVATION: Nucleotide modification status can be decoded from the Oxford Nanopore Technologies nanopore-sequencing ionic current signals. Although various algorithms have been developed for nanopore-sequencing-based modification analysis, more detailed characterizations, such as modification numbers, corresponding signal levels and proportions, are still lacking. RESULTS: We present a framework for the unsupervised determination of the number of nucleotide modifications from nanopore-sequencing readouts. We demonstrate that the approach can effectively recapitulate the number of modifications, the corresponding ionic current signal levels, as well as mixing proportions under both DNA and RNA contexts. We further show, by integrating information from multiple detected modification regions, that the modification status of DNA and RNA molecules can be inferred. This method forms a key step of de novo characterization of nucleotide modifications, shedding light on the interpretation of various biological questions. AVAILABILITY AND IMPLEMENTATION: Modified nanopolish: https://github.com/adbailey4/nanopolish/tree/cigar_output. All other code used to reproduce the results: https://github.com/hd2326/ModificationNumber. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Nanopores; High-Throughput Nucleotide Sequencing; Nucleotides/genetics; Sequence Analysis, DNA; Software
14.
Genome Biol ; 21(1): 83, 2020 03 31.
Article in English | MEDLINE | ID: mdl-32234056

ABSTRACT

BACKGROUND: Long non-coding RNAs (lncRNAs) exhibit highly cell type-specific expression and function, making this class of transcript attractive for targeted cancer therapy. However, the vast majority of lncRNAs have not been tested as potential therapeutic targets, particularly in the context of currently used cancer treatments. Malignant glioma is rapidly fatal, and ionizing radiation is part of the current standard-of-care used to slow tumor growth in both adult and pediatric patients. RESULTS: We use CRISPR interference (CRISPRi) to screen 5689 lncRNA loci in human glioblastoma (GBM) cells, identifying 467 hits that modify cell growth in the presence of clinically relevant doses of fractionated radiation. Thirty-three of these lncRNA hits sensitize cells to radiation, and based on their expression in adult and pediatric gliomas, nine of these hits are prioritized as lncRNA Glioma Radiation Sensitizers (lncGRS). Knockdown of lncGRS-1, a primate-conserved, nuclear-enriched lncRNA, inhibits the growth and proliferation of primary adult and pediatric glioma cells, but not the viability of normal brain cells. Using human brain organoids comprised of mature neural cell types as a three-dimensional tissue substrate to model the invasive growth of glioma, we find that antisense oligonucleotides targeting lncGRS-1 selectively decrease tumor growth and sensitize glioma cells to radiation therapy. CONCLUSIONS: These studies identify lncGRS-1 as a glioma-specific therapeutic target and establish a generalizable approach to rapidly identify novel therapeutic targets in the vast non-coding genome to enhance radiation therapy.


Subjects
Brain Neoplasms/therapy; CRISPR-Cas Systems; Glioblastoma/therapy; RNA, Long Noncoding/antagonists & inhibitors; Adult; Astrocytes; Brain; Brain Neoplasms/genetics; Brain Neoplasms/pathology; Brain Neoplasms/radiotherapy; Cell Line, Tumor; Combined Modality Therapy; Glioblastoma/genetics; Glioblastoma/pathology; Glioblastoma/radiotherapy; Humans; Oligonucleotides, Antisense; Organoids; Radiation Tolerance
16.
Nat Methods ; 16(12): 1297-1305, 2019 12.
Article in English | MEDLINE | ID: mdl-31740818

ABSTRACT

High-throughput complementary DNA sequencing technologies have advanced our understanding of transcriptome complexity and regulation. However, these methods lose information contained in biological RNA because the copied reads are often short and modifications are not retained. We address these limitations using a native poly(A) RNA sequencing strategy developed by Oxford Nanopore Technologies. Our study generated 9.9 million aligned sequence reads for the human cell line GM12878, using thirty MinION flow cells at six institutions. These native RNA reads had a median length of 771 bases, and a maximum aligned length of over 21,000 bases. Mitochondrial poly(A) reads provided an internal measure of read-length quality. We combined these long nanopore reads with higher accuracy short reads and annotated GM12878 promoter regions to identify 33,984 plausible RNA isoforms. We describe strategies for assessing 3' poly(A) tail length, base modifications and transcript haplotypes.


Subjects
Nanopore Sequencing/methods; Poly A/genetics; Sequence Analysis, RNA/methods; Transcriptome; Cells, Cultured; Humans
17.
Microb Genom ; 4(11)2018 11.
Article in English | MEDLINE | ID: mdl-30461375

ABSTRACT

The genome of Bordetella pertussis is complex, with high G+C content and many repeats, each longer than 1000 bp. Long-read sequencing offers the opportunity to produce single-contig B. pertussis assemblies using sequencing reads which are longer than the repetitive sections, with the potential to reveal genomic features which were previously unobservable in multi-contig assemblies produced by short-read sequencing alone. We used an R9.4 MinION flow cell and barcoding to sequence five B. pertussis strains in a single sequencing run. We then trialled combinations of the many nanopore user community-built long-read analysis tools to establish the current optimal assembly pipeline for B. pertussis genome sequences. This pipeline produced closed genome sequences for four strains, allowing visualization of inter-strain genomic rearrangement. Read mapping to the Tohama I reference genome suggests that the remaining strain contains an ultra-long duplicated region (almost 200 kbp), which was not resolved by our pipeline; further investigation also revealed that a second strain that was seemingly resolved by our pipeline may contain an even longer duplication, albeit in a small subset of cells. We have therefore demonstrated the ability to resolve the structure of several B. pertussis strains per single barcoded nanopore flow cell, but the genomes with highest complexity (e.g. very large duplicated regions) remain only partially resolved using the standard library preparation and will require an alternative library preparation method. For full strain characterization, we recommend hybrid assembly of long and short reads together; for comparison of genome arrangement, assembly using long reads alone is sufficient.


Subjects
Bordetella pertussis/genetics; Genome, Bacterial; Sequence Analysis, DNA/methods; Molecular Sequence Annotation; Nanopores
18.
Nat Biotechnol ; 36(4): 321-323, 2018 04.
Article in English | MEDLINE | ID: mdl-29553574

ABSTRACT

The human genome reference sequence remains incomplete owing to the challenge of assembling long tracts of near-identical tandem repeats in centromeres. We implemented a nanopore sequencing strategy to generate high-quality reads that span hundreds of kilobases of highly repetitive DNA in a human Y chromosome centromere. Combining these data with short-read variant validation, we assembled and characterized the centromeric region of a human Y chromosome.


Subjects
Centromere/genetics; Chromosomes, Human, Y/genetics; High-Throughput Nucleotide Sequencing; Tandem Repeat Sequences/genetics; Genome, Human/genetics; Humans; Nanopores; Repetitive Sequences, Nucleic Acid/genetics
19.
Nat Biotechnol ; 36(4): 338-345, 2018 04.
Article in English | MEDLINE | ID: mdl-29431738

ABSTRACT

We report the sequencing and assembly of a reference genome for the human GM12878 Utah/Ceph cell line using the MinION (Oxford Nanopore Technologies) nanopore sequencer. 91.2 Gb of sequence data, representing ∼30× theoretical coverage, were produced. Reference-based alignment enabled detection of large structural variants and epigenetic modifications. De novo assembly of nanopore reads alone yielded a contiguous assembly (NG50 ∼3 Mb). We developed a protocol to generate ultra-long reads (N50 > 100 kb, read lengths up to 882 kb). Incorporating an additional 5× coverage of these ultra-long reads more than doubled the assembly contiguity (NG50 ∼6.4 Mb). The final assembled genome was 2,867 million bases in size, covering 85.8% of the reference. Assembly accuracy, after incorporating complementary short-read sequencing data, exceeded 99.8%. Ultra-long reads enabled assembly and phasing of the 4-Mb major histocompatibility complex (MHC) locus in its entirety, measurement of telomere repeat length, and closure of gaps in the reference human genome assembly GRCh38.


Subjects
Genome, Human/genetics; Genomics; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Humans; Nanopores
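The "theoretical coverage" in record 19 is the total number of sequenced bases divided by the genome size. The Python sketch below reproduces that arithmetic for the 91.2 Gb reported there. The genome size of 3.1 Gb is an assumption made here for illustration, since the abstract does not state which value the authors used, and the function name is hypothetical.

```python
def theoretical_coverage(total_bases: float, genome_size: float) -> float:
    """Return mean depth of coverage: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Assumption: an approximate haploid human genome size of 3.1 Gb.
ASSUMED_HUMAN_GENOME_BASES = 3.1e9

# 91.2 Gb of sequence data, as reported in the abstract.
coverage = theoretical_coverage(91.2e9, ASSUMED_HUMAN_GENOME_BASES)

if __name__ == "__main__":
    print(f"{coverage:.1f}x")
```

Under that assumed genome size the result is a little under 30, consistent with the ∼30× quoted in the abstract.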
20.
Genome Res ; 28(2): 266-274, 2018 02.
Article in English | MEDLINE | ID: mdl-29273626

ABSTRACT

Advances in long-read single molecule sequencing have opened new possibilities for 'benchtop' whole-genome sequencing. The Oxford Nanopore Technologies MinION is a portable device that uses nanopore technology that can directly sequence DNA molecules. MinION single molecule long sequence reads are well suited for de novo assembly of complex genomes as they facilitate the construction of highly contiguous physical genome maps obviating the need for labor-intensive physical genome mapping. Long sequence reads can also be used to delineate complex chromosomal rearrangements, such as those that occur in tumor cells, that can confound analysis using short reads. Here, we assessed MinION long-read-derived sequences for feasibility concerning: (1) the de novo assembly of a large complex genome, and (2) the elucidation of complex rearrangements. The genomes of two Caenorhabditis elegans strains, a wild-type strain and a strain containing two complex rearrangements, were sequenced with MinION. Up to 42-fold coverage was obtained from a single flow cell, and the best pooled data assembly produced a highly contiguous wild-type C. elegans genome containing 48 contigs (N50 contig length = 3.99 Mb) covering >99% of the 100,286,401-base reference genome. Further, the MinION-derived genome assembly expanded the C. elegans reference genome by >2 Mb due to a more accurate determination of repetitive sequence elements and assembled the complete genomes of two co-extracted bacteria. MinION long-read sequence data also facilitated the elucidation of complex rearrangements in a mutagenized strain. The sequence accuracy of the MinION long-read contigs (∼98%) was improved using Illumina-derived sequence data to polish the final genome assembly to 99.8% nucleotide accuracy when compared to the reference assembly.


Subjects
Caenorhabditis elegans/genetics; Genome/genetics; Molecular Sequence Annotation; Animals; Chromosome Mapping; Gene Rearrangement/genetics; High-Throughput Nucleotide Sequencing; Repetitive Sequences, Nucleic Acid/genetics
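Several of the abstracts above report contiguity as N50 or NG50. Under the usual definition, N50 is the length of the contig (or read) at which the running total, taken from longest to shortest, first reaches half of the summed length. NG50 uses half of an assumed genome size instead. The Python sketch below implements that common definition with hypothetical function names; the assemblers and evaluation tools used in the cited studies may handle ties and edge cases differently.

```python
from typing import Iterable, Optional


def nx50(lengths: Iterable[int], reference_size: Optional[int] = None) -> int:
    """Return N50, or NG50 when reference_size (an assumed genome size) is given.

    Returns 0 if the lengths never reach half of the target size.
    """
    ordered = sorted(lengths, reverse=True)
    total = reference_size if reference_size is not None else sum(ordered)
    target = total / 2.0
    running = 0
    for length in ordered:
        running += length
        if running >= target:
            return length
    return 0


def n50(lengths: Iterable[int]) -> int:
    """N50 relative to the summed length of the input."""
    return nx50(lengths)


def ng50(lengths: Iterable[int], genome_size: int) -> int:
    """NG50 relative to an externally supplied genome size."""
    return nx50(lengths, reference_size=genome_size)


if __name__ == "__main__":
    toy_contigs = [2, 3, 4, 5, 6]  # toy lengths, not data from any cited study
    print(n50(toy_contigs), ng50(toy_contigs, 30))
```

Because NG50 is anchored to the genome size rather than the assembly size, it allows assemblies of different total length to be compared on the same footing.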