Search | BVS Enfermagem Research Portal

Efficient and unique cobarcoding of second-generation sequencing reads from long DNA molecules enabling cost-effective and accurate sequencing, haplotyping, and de novo assembly.

Wang, Ou; Chin, Robert; Cheng, Xiaofang; Wu, Michelle Ka Yan; Mao, Qing; Tang, Jingbo; Sun, Yuhui; Anderson, Ellis; Lam, Han K; Chen, Dan; Zhou, Yujun; Wang, Linying; Fan, Fei; Zou, Yan; Xie, Yinlong; Zhang, Rebecca Yu; Drmanac, Snezana; Nguyen, Darlene; Xu, Chongjun; Villarosa, Christian; Gablenz, Scott; Barua, Nina; Nguyen, Staci; Tian, Wenlan; Liu, Jia Sophie; Wang, Jingwan; Liu, Xiao; Qi, Xiaojuan; Chen, Ao; Wang, He; Dong, Yuliang; Zhang, Wenwei; Alexeev, Andrei; Yang, Huanming; Wang, Jian; Kristiansen, Karsten; Xu, Xun; Drmanac, Radoje; Peters, Brock A.

Genome Res; 29(5): 798-808, 2019 May.

Article in English | MEDLINE | ID: mdl-30940689

ABSTRACT

Here, we describe single-tube long fragment read (stLFR), a technology that enables sequencing of data from long DNA molecules using economical second-generation sequencing technology. It is based on adding the same barcode sequence to subfragments of the original long DNA molecule (DNA cobarcoding). To achieve this efficiently, stLFR uses the surface of microbeads to create millions of miniaturized barcoding reactions in a single tube. Using a combinatorial process, up to 3.6 billion unique barcode sequences were generated on beads, enabling practically nonredundant cobarcoding with 50 million barcodes per sample. Using stLFR, we demonstrate efficient unique cobarcoding of more than 8 million 20- to 300-kb genomic DNA fragments. Analysis of the human genome NA12878 with stLFR demonstrated high-quality variant calling and phase block lengths up to N50 34 Mb. We also demonstrate detection of complex structural variants and complete diploid de novo assembly of NA12878. These analyses were all performed using single stLFR libraries, and their construction did not significantly add to the time or cost of whole-genome sequencing (WGS) library preparation. stLFR represents an easily automatable solution that enables high-quality sequencing, phasing, SV detection, scaffolding, cost-effective diploid de novo genome assembly, and other long DNA sequencing applications.

Subjects

High-Throughput Nucleotide Sequencing/methods, Whole Genome Sequencing/methods, Cost-Benefit Analysis, Diploidy, Gene Library, Human Genome, Genomics, Haplotypes/genetics, High-Throughput Nucleotide Sequencing/economics, Humans, Whole Genome Sequencing/economics

Assembly and analysis of the genome of Notholithocarpus densiflorus.

Cai, Ying; Anderson, Ellis; Xue, Wen; Wong, Sylvia; Cui, Luman; Cheng, Xiaofang; Wang, Ou; Mao, Qing; Liu, Sophie Jia; Davis, John T; Magalang, Paulo R; Schmidt, Douglas; Kasuga, Takao; Garbelotto, Matteo; Drmanac, Radoje; Kua, Chai-Shian; Cannon, Charles; Maloof, Julin N; Peters, Brock A.

G3 (Bethesda); 14(5), 2024 May 7.

Article in English | MEDLINE | ID: mdl-38427916

ABSTRACT

Tanoak (Notholithocarpus densiflorus) is an evergreen tree in the Fagaceae family found in California and southern Oregon. Historically, tanoak acorns were an important food source for Native American tribes, and the bark was used extensively in the leather tanning process. Long considered a disjunct relictual element of the Asian stone oaks (Lithocarpus spp.), phylogenetic analysis has determined that the tanoak is an example of convergent evolution. Tanoaks are deeply divergent from oaks (Quercus) of the Pacific Northwest and comprise a new genus with a single species. These trees are highly susceptible to "sudden oak death" (SOD), a plant pathogen (Phytophthora ramorum) that has caused widespread deaths of tanoaks. In this study, we set out to assemble the genome and perform comparative studies among a number of individuals that demonstrated varying levels of susceptibility to SOD. First, we sequenced and de novo assembled a draft reference genome of N. densiflorus using cobarcoded library processing methods and an MGI DNBSEQ-G400 sequencer. To increase the contiguity of the final assembly, we also sequenced Oxford Nanopore long reads to 30× coverage. To our knowledge, the draft genome reported here is one of the more contiguous and complete genomes of a tree species published to date, with a contig N50 of ~1.2 Mb, a scaffold N50 of ~2.1 Mb, and a complete gene score of 95.5% through BUSCO analysis. In addition, we sequenced 11 genetically distinct individuals and mapped these onto the draft reference genome, enabling the discovery of almost 25 million single nucleotide polymorphisms and ~4.4 million small insertions and deletions. Finally, using cobarcoded data, we were able to generate a complete haplotype coverage of all 11 genomes.

Subjects

Fagaceae, Plant Genome, Fagaceae/genetics, Phylogeny, Molecular Sequence Annotation, Genomics/methods, Single Nucleotide Polymorphism
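
The contig N50 and scaffold N50 values quoted in the tanoak abstract above are contiguity statistics: the N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal sketch of that standard definition, not the authors' pipeline; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from any of the records listed here.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Illustrative helper (hypothetical name): sort lengths from longest
    to shortest and return the length at which the running total first
    reaches at least half of the summed length.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional arithmetic.
        if 2 * running >= total:
            return length


# Toy contig lengths, made up for illustration only.
toy_contigs = [100, 200, 300, 400, 500]
print(n50(toy_contigs))
```

Applied to real assemblies, the same function would take the lengths of all contigs (for a contig N50) or all scaffolds (for a scaffold N50) as input.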

Comparison of long-read methods for sequencing and assembly of a plant genome.

Murigneux, Valentine; Rai, Subash Kumar; Furtado, Agnelo; Bruxner, Timothy J C; Tian, Wei; Harliwong, Ivon; Wei, Hanmin; Yang, Bicheng; Ye, Qianyu; Anderson, Ellis; Mao, Qing; Drmanac, Radoje; Wang, Ou; Peters, Brock A; Xu, Mengyang; Wu, Pei; Topp, Bruce; Coin, Lachlan J M; Henry, Robert J.

Gigascience; 9(12), 2020 Dec 21.

Article in English | MEDLINE | ID: mdl-33347571

ABSTRACT

BACKGROUND: Sequencing technologies have advanced to the point where it is possible to generate high-accuracy, haplotype-resolved, chromosome-scale assemblies. Several long-read sequencing technologies are available, and a growing number of algorithms have been developed to assemble the reads generated by those technologies. When starting a new genome project, it is therefore challenging to select the most cost-effective sequencing technology, as well as the most appropriate software for assembly and polishing. It is thus important to benchmark different approaches applied to the same sample. RESULTS: Here, we report a comparison of 3 long-read sequencing technologies applied to the de novo assembly of a plant genome, Macadamia jansenii. We have generated sequencing data using Pacific Biosciences (Sequel I), Oxford Nanopore Technologies (PromethION), and BGI (single-tube Long Fragment Read) technologies for the same sample. Several assemblers were benchmarked in the assembly of Pacific Biosciences and Nanopore reads. Results obtained from combining long-read technologies or short-read and long-read technologies are also presented. The assemblies were compared for contiguity, base accuracy, and completeness, as well as sequencing costs and DNA material requirements. CONCLUSIONS: The 3 long-read technologies produced highly contiguous and complete genome assemblies of M. jansenii. At the time of sequencing, the cost associated with each method was significantly different, but continuous improvements in technologies have resulted in greater accuracy, increased throughput, and reduced costs. We propose updating this comparison regularly with reports on significant iterations of the sequencing technologies.

Subjects

Bacterial Genome, High-Throughput Nucleotide Sequencing, Plant Genome, DNA Sequence Analysis, Software
