Results 1 - 20 of 17,133
1.
Lancet ; 394(10197): 533-540, 2019 Aug 10.
Article in English | MEDLINE | ID: mdl-31395441

ABSTRACT

One of the primary goals of genomic medicine is to improve diagnosis through identification of genomic conditions, which could improve clinical management, prevent complications, and promote health. We explore how genomic medicine is being used to obtain molecular diagnoses for patients with previously undiagnosed diseases in prenatal, paediatric, and adult clinical settings. We focus on the role of clinical genomic sequencing (exome and genome) in aiding patients with conditions that are undiagnosed even after extensive clinical evaluation and testing. In particular, we explore the impact of combining genomic and phenotypic data and integrating multiple data types to improve diagnoses for patients with undiagnosed diseases, and we discuss how these genomic sequencing diagnoses could change clinical management.


Subjects
Rare Diseases/diagnosis; Sequence Analysis, DNA/methods; Adult; Child; Early Diagnosis; Genomics; Humans; Phenotype; Prenatal Diagnosis/methods; Rare Diseases/genetics; Whole Exome Sequencing; Whole Genome Sequencing
2.
Medicine (Baltimore) ; 98(35): e16626, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31464899

ABSTRACT

Gastric cancer (GC) is one of the common malignant tumors in China, with high morbidity and mortality. With the development and application of high-throughput sequencing technologies and metagenomics, a large number of studies have shown that the gastrointestinal microbiota is closely related to digestive system diseases. Although some studies have reported the effect of long-term follow-up after subtotal gastrectomy on intestinal flora changes in patients with GC, the features of the gut microbiota and their shifts in patients with GC during the perioperative period remain unclear. This study was designed to characterize fecal microbiota shifts in patients with GC before and after radical distal gastrectomy (RDG) during their hospital stay. Furthermore, the fecal microbiota was compared between GC patients and healthy individuals. Patients who were diagnosed with advanced gastric adenocarcinoma of the distal stomach were enrolled in the study. The bacterial burden within fecal samples was determined using quantitative polymerase chain reaction. To analyze the diversity and composition of the gut microbiota from fecal DNA of 20 GC patients and 22 healthy controls, amplicons of the 16S rRNA gene from all subjects were pyrosequenced. To study gut microbiota shifts, the fecal microbiota from 6 GC patients before and after RDG was detected and subsequently analyzed. Short-chain fatty acids were also detected by chromatography spectrometer in these 6 GC patients. RDG had a moderate effect on bacterial richness and evenness, but had pronounced effects on the composition of the postoperative gut microbiota compared with the preoperative group. The relative abundances of the genera Akkermansia, Escherichia/Shigella, Lactobacillus, and Dialister were significantly changed in the perioperative period. Remarkably, higher abundances of Escherichia/Shigella, Veillonella, and Clostridium XVIII and lower abundances of Bacteroides were observed in the gut microbiota of GC patients overall compared to healthy controls. This is the first study to characterize the altered gut microbiota within fecal samples from GC patients during the perioperative period, and it provides new insights into such microbial perturbations as a potential effector of the perioperative phenotype. Further research must validate these discoveries and may evaluate targeted microbiota shifts to improve outcomes in GC patients.


Subjects
Bacteria/classification; RNA, Ribosomal, 16S/genetics; Sequence Analysis, DNA/methods; Stomach Neoplasms/surgery; Adult; Bacteria/genetics; Bacteria/isolation & purification; China; DNA, Bacterial/genetics; DNA, Ribosomal/genetics; Female; Gastrectomy; Gastrointestinal Microbiome; Humans; Male; Middle Aged; Perioperative Period; Phylogeny; Stomach Neoplasms/microbiology
3.
Gene ; 713: 143971, 2019 Sep 10.
Article in English | MEDLINE | ID: mdl-31299361

ABSTRACT

An in silico genome analysis of the probiotic Bacillus strain FTC01 was performed. The draft genome comprises 3.9 Mb, with a G + C content of 46.6% and a total of 3941 coding sequences. The species of strain FTC01 was defined as B. velezensis during GenBank genome annotation, following the current nomenclature. Eight gene clusters involved in the synthesis of non-ribosomal lipopeptides, polyketides and bacilysin were found, as well as part of the gene cluster involved in the synthesis of the cyclic lipopeptide locillomycin. The production of the lipopeptides surfactin and iturin by strain FTC01 was confirmed. In addition, a gene encoding a peptidylprolyl isomerase involved in bacterial adhesion to host tissue, twelve genes responsible for acid tolerance, and several hydrolase genes were found. These characteristics may help in host colonization and maintenance and may account for the probiotic properties observed for strain FTC01.


Subjects
Bacillus/genetics; Bacillus/metabolism; Bacterial Proteins/genetics; Genome, Bacterial; Metabolome; Probiotics/metabolism; Bacillus/growth & development; DNA, Bacterial/analysis; Phylogeny; Sequence Analysis, DNA/methods
4.
Genome Biol ; 20(1): 134, 2019 Jul 08.
Article in English | MEDLINE | ID: mdl-31287019

ABSTRACT

We present SMURF-seq, a protocol to efficiently sequence short DNA molecules on a long-read sequencer by randomly ligating them to form long molecules. Applying SMURF-seq using the Oxford Nanopore MinION yields up to 30 fragments per read, providing an average of 6.2 and up to 7.5 million mappable fragments per run, increasing information throughput for read-counting applications. We apply SMURF-seq on the MinION to generate copy number profiles. A comparison with profiles from Illumina sequencing reveals that SMURF-seq attains similar accuracy. More broadly, SMURF-seq expands the utility of long-read sequencers for read-counting applications.


Subjects
DNA Copy Number Variations; Sequence Analysis, DNA/methods; Cell Line, Tumor; Female; Humans
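
The abstract above describes read-counting applications such as copy number profiling. As a minimal sketch of that general idea only, and not the SMURF-seq authors' pipeline, the Python below counts mapped fragment start positions in fixed-width genomic bins and divides each count by the median bin count. The function name `copy_number_profile`, the fixed bin width and the median normalization are all assumptions made for illustration; real analyses typically add steps such as GC correction and segmentation.

```python
from collections import Counter
from statistics import median


def copy_number_profile(positions, bin_size, n_bins):
    """Return per-bin fragment counts divided by the median bin count.

    positions: mapped fragment start coordinates on one chromosome
    bin_size:  width of each fixed genomic bin (hypothetical choice)
    n_bins:    number of bins to report
    """
    # Assign every fragment to a bin by integer division of its position.
    counts = Counter(pos // bin_size for pos in positions)
    per_bin = [counts.get(i, 0) for i in range(n_bins)]
    med = median(per_bin)
    if med == 0:
        raise ValueError("median bin count is zero; use larger bins")
    # A ratio near 1.0 suggests the typical copy number; 2.0 suggests a gain.
    return [count / med for count in per_bin]


# Toy data: the middle bin holds twice as many fragments as its neighbours.
toy_positions = [5, 15, 25, 105, 115, 125, 135, 145, 155, 205, 215, 225]
profile = copy_number_profile(toy_positions, bin_size=100, n_bins=3)
```

Because each ligated fragment is counted independently, more fragments per read translate directly into more counts per bin, which is why fragment throughput matters for this kind of profile.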
5.
Biochim Biophys Acta Rev Cancer ; 1872(1): 122-137, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31265877

ABSTRACT

The rapid evolution of next-generation sequencing (NGS)-based tumor genomic profile detection and the emergence of molecularly targeted therapies have enabled precision oncology. In NGS-based analysis, various types of databases have been developed to perform different functions. However, many problems still exist when using these public databases. Therefore, it is important to better understand the characteristics and limitations of each database and have them complement each other to provide useful clinical evidence for NGS testing. In this review, we elaborate on the important role of databases and their concrete applications in NGS-based somatic mutation detection. We introduce the databases typically used for sequence alignment, variant filtration, and variant interpretation, and compare databases with similar functions. Subsequently, we determine the limitations of each database and provide corresponding solutions. Furthermore, we present an overview diagram to clearly illustrate the databases used in the entire NGS-based somatic mutation detection pipeline.


Subjects
DNA Mutational Analysis; Databases, Genetic; High-Throughput Nucleotide Sequencing/methods; Neoplasms/genetics; Genome, Human/genetics; Humans; Mutation; Precision Medicine; Sequence Analysis, DNA/methods
7.
BMC Bioinformatics ; 20(Suppl 8): 283, 2019 Jun 10.
Article in English | MEDLINE | ID: mdl-31182012

ABSTRACT

BACKGROUND: Numerous essential algorithms and methods, including entropy-based quantitative methods, have been developed over the last decade to analyze complex DNA sequences. Exons and introns are among the most notable components of DNA, and their identification and prediction remain a focus of state-of-the-art research. RESULTS: In this study, we designed an integrated entropy-based analysis approach, which involves modified topological entropy calculation, the genomic signal processing (GSP) method and singular value decomposition (SVD), to investigate exons and introns in DNA sequences. We optimized and implemented the topological entropy and the generalized topological entropy to calculate the complexity of DNA sequences, highlighting the characteristics of repetitive sequences. By comparing the digitized entropy values of exons and introns, we observed that they are significantly different. After converting DNA data to numerical topological entropy values, we applied the SVD method to effectively investigate exon and intron regions on a single gene sequence. Additionally, several genes across five species were used for exon prediction. CONCLUSIONS: Our approach not only helps to explore the complexity of DNA sequences and their functional elements, but also provides an entropy-based GSP method to analyze exon and intron regions. Our work is feasible across different species and extendable to the analysis of other components in both coding and noncoding regions of DNA sequences.


Subjects
Entropy; Exons/genetics; Introns/genetics; Algorithms; Base Sequence; Chromosomes, Human/genetics; DNA/genetics; Genome, Human; Humans; Promoter Regions, Genetic/genetics; ROC Curve; Sequence Analysis, DNA/methods; Signal Processing, Computer-Assisted
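
The abstract above relies on entropy-style complexity measures for DNA. The authors' modified topological entropy is not specified in the abstract, so the following is only a simplified sketch of the underlying idea: count how many distinct subwords of length k occur in a fixed-length prefix and take the base-4 logarithm, scaled by k. The function name `subword_complexity`, the prefix length of 4^k + k - 1 bases and the choice of k are assumptions for illustration and are not taken from the paper.

```python
import math


def subword_complexity(seq, k):
    """Return log4(number of distinct k-mers) / k over a fixed-length prefix.

    The prefix holds exactly 4**k overlapping k-mers, so the value ranges
    from 0.0 (a single repeated k-mer) to 1.0 (every possible k-mer seen).
    """
    window = 4 ** k + k - 1
    if len(seq) < window:
        raise ValueError(f"need at least {window} bases for k={k}")
    prefix = seq[:window]
    distinct = {prefix[i:i + k] for i in range(window - k + 1)}
    return math.log(len(distinct)) / math.log(4) / k


low = subword_complexity("A" * 17, k=2)             # one distinct 2-mer
mid = subword_complexity("ACGTACGTACGTACGTA", k=2)  # four distinct 2-mers
```

Highly repetitive sequence yields values near zero, which is consistent with the abstract's remark that the measure highlights repetitive regions.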
8.
BMC Bioinformatics ; 20(Suppl 11): 276, 2019 Jun 06.
Article in English | MEDLINE | ID: mdl-31167633

ABSTRACT

BACKGROUND: A crucial task in metagenomic analysis is to annotate the function and taxonomy of the sequencing reads generated from a microbiome sample. In general, the reads can either be assembled into contigs and searched against reference databases, or individually searched without assembly. The first approach may suffer from fragmentary and incomplete assembly, while the second is hampered by the reduced functional signal contained in the short reads. To tackle these issues, we have previously developed GRASP (Guided Reference-based Assembly of Short Peptides), which accepts a reference protein sequence as input and aims to assemble its homologs from a database containing fragmentary protein sequences. In addition to being a gene-centric assembly tool, GRASP also serves as a homolog search tool when the assembled protein sequences are used as templates to recruit reads. GRASP has a significantly improved recall rate (60-80% vs. 30-40%) compared to other homolog search tools such as BLAST. However, GRASP is both time- and space-consuming. Subsequently, we developed GRASPx, which is 30X faster than GRASP. Here, we present a completely redesigned algorithm, GRASP2, for this computational problem. RESULTS: GRASP2 utilizes the Burrows-Wheeler Transform (BWT) and FM-index to perform assembly graph generation, and reduces the search space by employing a fast ungapped alignment strategy as a filter. GRASP2 also explicitly generates candidate paths prior to alignment, which effectively uncouples the iterative access of the assembly graph and alignment matrix. This strategy makes the execution of the program more efficient under current computer architecture, and contributes to GRASP2's speedup. GRASP2 is 8-fold faster than GRASPx (and 250-fold faster than GRASP) and uses 8-fold less memory while maintaining the original high recall rate of GRASP. GRASP2 reaches a recall rate of ~80%, compared to ~40% for BLAST, both at a high precision level (> 95%). With such high performance, GRASP2 is only ~3X slower than BLASTP. CONCLUSION: GRASP2 is a high-performance gene-centric and homolog search tool with significant speedup compared to its predecessors, which makes GRASP2 a useful tool for metagenomics data analysis. GRASP2 is implemented in C++ and is freely available from http://www.sourceforge.net/projects/grasp2 .


Subjects
Genes; Metagenomics/methods; Sequence Analysis, DNA/methods; Sequence Homology, Nucleic Acid; Software; Algorithms; Aquatic Organisms/genetics; Microbiota/genetics; ROC Curve; Time Factors
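
The abstract above names the Burrows-Wheeler Transform and the FM-index as the basis of GRASP2's assembly graph generation. As a rough illustration of that general technique only (GRASP2 itself is written in C++ and operates on protein sequences, and none of its code is reproduced here), the following Python sketch builds a BWT from sorted rotations and counts exact pattern occurrences by backward search. The function names `bwt` and `fm_count` are hypothetical, and the naive construction is only suitable for toy inputs.

```python
def bwt(text):
    """Burrows-Wheeler transform via sorted rotations ('$' is the sentinel)."""
    text += "$"
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def fm_count(bwt_str, pattern):
    """Count exact occurrences of pattern using FM-index backward search."""
    alphabet = sorted(set(bwt_str))
    # first[c] = number of characters in the text strictly smaller than c.
    first, total = {}, 0
    for c in alphabet:
        first[c] = total
        total += bwt_str.count(c)
    # occ[c][i] = number of occurrences of c in bwt_str[:i].
    occ = {c: [0] * (len(bwt_str) + 1) for c in alphabet}
    for i, ch in enumerate(bwt_str):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    # Narrow the suffix-array interval one pattern character at a time,
    # scanning the pattern from right to left.
    lo, hi = 0, len(bwt_str)
    for ch in reversed(pattern):
        if ch not in first:
            return 0
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = bwt("GATTACAGATTACA")
hits = fm_count(index, "ATTA")
```

Each backward-search step costs a constant number of table lookups, so query time grows with the pattern length rather than the text length, which is the property that makes this family of indexes attractive for large sequence collections.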
9.
Gene ; 711: 143942, 2019 Aug 30.
Article in English | MEDLINE | ID: mdl-31238090

ABSTRACT

In this work, metagenomic sequencing was conducted to investigate the microbial gene catalogue in two samples of phosphinothricin (PPT)-utilized soils from South China. The gene sets contained an overwhelming majority of prevalent microbial genes, and were largely shared between the two samples. Several genera with high abundance were shared, such as norank_d__Bacteria, Nitrososphaera, Candidatus_Nitrosotalea, Candidatus_Nitrosocosmicus, and Rhodanobacter. Bacitracin resistance genes (61.4%) were the most dominant antibiotic resistance genes in the two samples, followed by multidrug resistance efflux pump genes (12.5%). Many common virulence factors with high abundance were found in the two samples, such as Alginate, Capsule I, ClpC, FbpABC, and HitABC, many of which are used in the iron uptake system. A total of 57 putative PPT acetyltransferases were annotated, and two of them were found to be novel putative acetyltransferases for acetylation and detoxification of PPT. In conclusion, this work revealed the microbial gene catalogue of PPT-utilized soils and found two novel putative PPT acetyltransferases using metagenomics. The work facilitates the understanding of the impact of PPT on the structure and physiology of the complex microbial community in PPT-utilized soils. Moreover, the two annotated PPT acetyltransferases show important potential for the development of transgenic herbicide-resistant crops.


Subjects
Aminobutyrates/metabolism; Bacteria/classification; Bacterial Proteins/genetics; Metagenomics/methods; Bacteria/genetics; Bacteria/metabolism; Drug Resistance, Microbial; Phylogeny; Sequence Analysis, DNA/methods; Soil Microbiology
10.
Genome Biol ; 20(1): 129, 2019 Jun 24.
Article in English | MEDLINE | ID: mdl-31234903

ABSTRACT

BACKGROUND: Basecalling, the computational process of translating raw electrical signal to nucleotide sequence, is of critical importance to the sequencing platforms produced by Oxford Nanopore Technologies (ONT). Here, we examine the performance of different basecalling tools, looking at accuracy at the level of bases within individual reads and at majority-rule consensus basecalls in an assembly. We also investigate some additional aspects of basecalling: training using a taxon-specific dataset, using a larger neural network model and improving consensus basecalls in an assembly by additional signal-level analysis with Nanopolish. RESULTS: Training basecallers on taxon-specific data results in a significant boost in consensus accuracy, mostly due to the reduction of errors in methylation motifs. A larger neural network is able to improve both read and consensus accuracy, but at a cost to speed. Improving consensus sequences ('polishing') with Nanopolish somewhat negates the accuracy differences in basecallers, but pre-polish accuracy does have an effect on post-polish accuracy. CONCLUSIONS: Basecalling accuracy has seen significant improvements over the last 2 years. The current version of ONT's Guppy basecaller performs well overall, with good accuracy and fast performance. If higher accuracy is required, users should consider producing a custom model using a larger neural network and/or training data from the same species.


Subjects
Neural Networks, Computer; Sequence Analysis, DNA/methods; Software; Klebsiella pneumoniae; Nanopores
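
The abstract above distinguishes accuracy within individual reads from accuracy of majority-rule consensus basecalls. The toy Python below sketches how those two measures differ, under strong simplifying assumptions: the reads are already aligned and of equal length, and identity is a plain per-position match fraction. The function names `majority_consensus` and `identity` are hypothetical, and this is not the evaluation code used in the study.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Pick the most frequent base in each column of pre-aligned reads."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_reads)
    )


def identity(called, reference):
    """Fraction of positions where the two sequences agree."""
    matches = sum(a == b for a, b in zip(called, reference))
    return matches / max(len(called), len(reference))


reference = "ACGTAC"
reads = ["ACGTAC", "ACGTTC", "ACCTAC"]  # two reads carry one error each
consensus = majority_consensus(reads)
read_identities = [identity(read, reference) for read in reads]
consensus_identity = identity(consensus, reference)
```

In this toy case the random errors are outvoted and the consensus matches the reference even though two reads do not. Errors that recur at the same position in most reads would survive the vote, which fits the abstract's observation that systematic errors in methylation motifs limit consensus accuracy.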
11.
Arch Virol ; 164(8): 2119-2129, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31147766

ABSTRACT

Rabies is a fatal disease caused by infection with rabies virus (RABV), and human rabies is still a critical public-health concern in China. Although there have been some phylogenetic studies of RABV transmission patterns, the accumulation of more rabies sequences in recent years has created an urgent need to update and clarify the spatial and temporal patterns of RABV circulating in China on a national scale. In this study, we collected all available RABV nucleoprotein gene sequences from China and its neighboring countries and performed comparative analysis. We identified six significant subclades of RABV circulating in China and found that each of them has a specific geographical distribution, reflecting possible physical barriers to gene flow. The phylogeographic analysis revealed minimal viral movement among different geographical locations. An analysis using Bayesian coalescent methods indicated that the current RABV strains in China may have come from a common ancestor about 400 years ago, and that China is currently in the midst of the second RABV population expansion since the 1950s, although the population has since decreased gradually. We did not detect any evidence of recombination in the sequence dataset, nor did we find any evidence for positive selection during the expansion of RABV. Overall, geographic location and neutral genetic drift may be the main factors shaping the phylogeography of RABV transmission in China.


Subjects
Rabies virus/genetics; Rabies/transmission; Animals; Bayes Theorem; China; Evolution, Molecular; Humans; Molecular Epidemiology/methods; Nucleoproteins/genetics; Phylogeny; Phylogeography/methods; RNA, Viral/genetics; Rabies/virology; Sequence Analysis, DNA/methods
12.
Arch Virol ; 164(8): 2159-2164, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31152250

ABSTRACT

Canine enteric coronaviruses (CCoVs) are important enteric pathogens of dogs. CCoVs with different variations are typically pantropic and pathogenic in dogs. In this study, we isolated a CCoV, designated HLJ-073, from a dead 6-week-old male Pekingese with gross lesions and diarrhea. Interestingly, sequence analysis suggested that HLJ-073 contained a 350-nt deletion in ORF3abc compared with reference CCoV isolates, resulting in the loss of portions of ORF3a and ORF3c and the complete loss of ORF3b. Phylogenetic analysis based on the S gene showed that HLJ-073 was more closely related to members of the FCoV II cluster than to members of the CCoV I or CCoV II cluster. Furthermore, recombination analysis suggested that HLJ-073 originated from the recombination of FCoV 79-1683 and CCoV A76, which were both isolated in the United States. Cell tropism experiments suggested that HLJ-073 could effectively replicate in canine macrophages/monocytes and human THP-1 cells. This is the first report of the isolation of strain HLJ-073 in China, and this virus has biological characteristics that are different from those of other reported CCoVs.


Subjects
Coronavirus, Canine/genetics; Sequence Deletion/genetics; Animals; Cells, Cultured; China; Coronavirus Infections/virology; Diarrhea/virology; Dog Diseases/virology; Dogs; Humans; Male; Phylogeny; Sequence Analysis, DNA/methods; Spike Glycoprotein, Coronavirus/genetics; THP-1 Cells
13.
Arch Virol ; 164(8): 2209-2213, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31161389

ABSTRACT

The complete genome of a double-stranded RNA (dsRNA) mycovirus, Phoma matteuccicola partitivirus 1 (PmPV1) was sequenced. It consists of two dsRNA segments, 1664 bp (dsRNA-1) and 1383 bp (dsRNA-2) in length, each containing a single open reading frame (ORF) potentially encoding a 46.78-kDa protein and a 40.92-kDa protein, respectively. dsRNA-1 encodes a putative polypeptide with a conserved RNA-dependent RNA polymerase (RdRp) domain that shows sequence similarity to the corresponding proteins of partitiviruses. The protein encoded by dsRNA-2 has no significant similarity to the typical coat proteins (CPs) of partitiviruses, but structure analysis nevertheless suggested that it might function as a coat protein. Purified viral particles of PmPV1 were isometric and approximately 29 nm in diameter. Phylogenetic analysis showed that PmPV1 is closely related to members of the genus Gammapartitivirus within the family Partitiviridae but forms a separate branch with Colletotrichum acutatum RNA virus 1 and Ustilaginoidea virens partitivirus 2. This is the first report of the full-length nucleotide sequence of a novel virus of the genus Gammapartitivirus infecting P. matteuccicola strain LG915, the causal agent of leaf blight of Curcuma wenyujin.


Subjects
Ascomycota/virology; Fungal Viruses/genetics; Genome, Viral/genetics; Amino Acid Sequence; Base Sequence; Capsid Proteins/genetics; Curcuma/virology; Genomics/methods; Open Reading Frames/genetics; Phylogeny; Plant Diseases/virology; RNA Replicase/genetics; RNA Viruses/genetics; RNA, Double-Stranded/genetics; RNA, Viral/genetics; Sequence Analysis, DNA/methods
14.
Nat Chem ; 11(7): 629-637, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31209299

ABSTRACT

In DNA, the loss of a nucleobase by hydrolysis generates an abasic site. Formed as a result of DNA damage, as well as a key intermediate during the base excision repair pathway, abasic sites are frequent DNA lesions that can lead to mutations and strand breaks. Here we present snAP-seq, a chemical approach that selectively exploits the reactive aldehyde moiety at abasic sites to reveal their location within DNA at single-nucleotide resolution. Importantly, the approach resolves abasic sites from other aldehyde functionalities known to exist in genomic DNA. snAP-seq was validated on synthetic DNA and then applied to two separate genomes. We studied the distribution of thymine modifications in the Leishmania major genome by enzymatically converting these modifications into abasic sites followed by abasic site mapping. We also applied snAP-seq directly to HeLa DNA to provide a map of endogenous abasic sites in the human genome.


Subjects
DNA/genetics; Genome/genetics; Sequence Analysis, DNA/methods; Aldehydes/chemistry; Base Sequence; DNA/chemistry; DNA Damage/genetics; DNA-(Apurinic or Apyrimidinic Site) Lyase/genetics; Gene Knockdown Techniques; HeLa Cells; Humans; Leishmania major/genetics; Molecular Probes/chemical synthesis; Molecular Probes/chemistry; Thymine/chemistry; Uracil-DNA Glycosidase/chemistry
16.
BMC Bioinformatics ; 20(1): 298, 2019 Jun 03.
Article in English | MEDLINE | ID: mdl-31159722

ABSTRACT

BACKGROUND: Several standalone error correction tools have been proposed to correct sequencing errors in Illumina data in order to facilitate de novo genome assembly. However, in a recent survey, we showed that state-of-the-art assemblers often did not benefit from this pre-correction step. We found that many error correction tools introduce new errors in reads that overlap highly repetitive DNA regions such as low-complexity patterns or short homopolymers, ultimately leading to a more fragmented assembly. RESULTS: We propose BrownieCorrector, an error correction tool for Illumina sequencing data that focuses on the correction of only those reads that overlap short DNA patterns that are highly repetitive in the genome. BrownieCorrector extracts all reads that contain such a pattern and clusters them into different groups using a community detection algorithm that takes into account both the sequence similarity between overlapping reads and their respective paired-end reads. Each cluster holds reads that originate from the same genomic region and hence each cluster can be corrected individually, thus providing a consistent correction for all reads within that cluster. CONCLUSIONS: BrownieCorrector is benchmarked using six real Illumina datasets for different eukaryotic genomes. The prior use of BrownieCorrector improves assembly results over the use of uncorrected reads in all cases. In comparison with other error correction tools, BrownieCorrector leads to the best assembly results in most cases even though less than 2% of the reads within a dataset are corrected. Additionally, we investigate the impact of error correction on hybrid assembly where the corrected Illumina reads are supplemented with PacBio data. Our results confirm that BrownieCorrector improves the quality of hybrid genome assembly as well. BrownieCorrector is written in standard C++11 and released under the GPL license. BrownieCorrector relies on multithreading to take advantage of multi-core/multi-CPU systems. The source code is available at https://github.com/biointec/browniecorrector .


Subjects
Algorithms; DNA/genetics; Genome; Repetitive Sequences, Nucleic Acid/genetics; Sequence Analysis, DNA/methods; Animals; Databases, Nucleic Acid; Humans; Sequence Alignment; Time Factors
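
The abstract above says the tool first extracts reads that contain a short, highly repetitive pattern such as a homopolymer, and only then clusters and corrects them. The Python below sketches just that first selection step in the simplest possible form, assuming the pattern of interest is a homopolymer run of at least `min_run` identical bases. The function name `reads_with_homopolymer` and the default threshold are hypothetical; the clustering by community detection described in the abstract is not attempted here, and BrownieCorrector's actual C++ implementation may select patterns differently.

```python
import re


def reads_with_homopolymer(reads, min_run=5):
    """Return the reads that contain a run of min_run or more identical bases."""
    # ([ACGT]) captures one base; \1{n,} requires at least n further copies.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_run - 1))
    return [read for read in reads if pattern.search(read)]


toy_reads = ["ACGTACGT", "ACAAAAAGT", "GGGGGGTT", "ACAAAAGT"]
selected = reads_with_homopolymer(toy_reads, min_run=5)
```

Restricting correction to such a subset is consistent with the abstract's remark that fewer than 2% of reads in a dataset end up being corrected.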
17.
Semin Ophthalmol ; 34(4): 223-231, 2019.
Article in English | MEDLINE | ID: mdl-31170015

ABSTRACT

Purpose: To review the value of next-generation sequencing (NGS) in identifying the pathogens that cause ocular infections, thereby facilitating prompt initiation of treatment with an optimal anti-microbial regimen. Both contemporary and futuristic approaches to identifying pathogens in ocular infections are covered in this brief overview. Methods: Review of the peer-reviewed literature on conventional and advanced methods as applied to the diagnosis of infectious diseases of the eye. Conclusion: NGS is a novel technology for identifying the pathogens responsible for ocular infections, with the potential to improve the accuracy and speed of diagnosis and to hasten the selection of the best therapy.


Subjects
Eye Infections/diagnosis; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Sequence Analysis, RNA/methods; DNA, Bacterial/genetics; DNA, Fungal/genetics; DNA, Ribosomal/genetics; Humans; Polymerase Chain Reaction
18.
BMC Bioinformatics ; 20(1): 232, 2019 May 09.
Article in English | MEDLINE | ID: mdl-31072311

ABSTRACT

BACKGROUND: Draft-quality genomes for a multitude of organisms have become common due to the advancement of genome assemblers using long-read technologies with high error rates. Although current assemblies are substantially more contiguous than assemblies based on short reads, complete chromosomal assemblies are still challenging. Interspersed repeat families with multiple copy versions dominate the contig and scaffold ends of current long-read assemblies for complex genomes. These repeat families generally remain unresolved, as existing algorithmic solutions either do not scale to large copy numbers or cannot handle the current high read error rates. RESULTS: We propose novel repeat resolution methods for large interspersed repeat families and assess their accuracy on simulated data sets with various distinct repeat structures and on Drosophila melanogaster transposons. Additionally, we compare our methods to an existing long-read repeat resolution tool and show the improved accuracy of our method. CONCLUSIONS: Our results demonstrate the applicability of our methods for improving the contiguity of genome assemblies.


Subjects
Databases, Genetic/standards; Genome/genetics; Sequence Analysis, DNA/methods; Algorithms; Humans
19.
BMC Bioinformatics ; 20(1): 234, 2019 May 09.
Article in English | MEDLINE | ID: mdl-31072312

ABSTRACT

BACKGROUND: The Oxford Nanopore Technologies (ONT) MinION portable sequencer makes it possible to use cutting-edge genomic technologies in the field and the academic classroom. RESULTS: We present NanoDJ, a Jupyter notebook integration of tools for simplified manipulation and assembly of DNA sequences produced by ONT devices. It integrates basecalling, read trimming and quality control, simulation and plotting routines with a variety of widely used aligners and assemblers, including procedures for hybrid assembly. CONCLUSIONS: Through Jupyter-facilitated access to self-explanatory application contents, interactive visualization of results, and distribution as a Docker software container, NanoDJ aims to simplify ONT DNA sequence analysis and make it more reproducible. The NanoDJ package code, documentation and installation instructions are freely available at https://github.com/genomicsITER/NanoDJ .


Subjects
Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Nanopores; Sequence Analysis, DNA/methods
20.
BMC Bioinformatics ; 20(1): 236, 2019 May 10.
Article in English | MEDLINE | ID: mdl-31077131

ABSTRACT

BACKGROUND: With the widespread use of multiple amplicon-sequencing (MAS) in genetic variation detection, an efficient tool is required to remove primer sequences from short reads to ensure the reliability of downstream analysis. Although some tools are currently available, their efficiency and accuracy require improvement when trimming large numbers of primers in high-throughput target genome sequencing. This issue is becoming more urgent considering the potential clinical implementation of MAS for processing patient samples. Here we developed pTrimmer, which can handle thousands of primers simultaneously with greatly improved accuracy and performance. RESULTS: pTrimmer combines a k-mer algorithm with the Needleman-Wunsch algorithm, which ensures its accuracy even in the presence of sequencing errors. pTrimmer improves sensitivity by 28.59% and accuracy by 11.87% compared to similar tools. Simulation showed that pTrimmer has an ultra-high sensitivity of 99.96% and an accuracy of 97.38%, compared to cutPrimers (70.85% sensitivity and 58.73% accuracy). pTrimmer is also notably faster: it is about 370 times faster than cutPrimers and 17,000 times faster than cutadapt per thread. Trimming 2158 pairs of primers from 11 million reads (Illumina PE 150 bp) takes only 37 s and consumes no more than 100 MB of memory. CONCLUSIONS: pTrimmer is designed to trim primer sequences from multiplex amplicon sequencing and target sequencing data. It is highly sensitive and specific compared to three other similar tools, which could help users obtain more reliable mutational information for downstream analysis.


Subjects
High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Algorithms; Humans
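
The abstract above states that pTrimmer combines k-mers with the Needleman-Wunsch algorithm so that primers are still recognized when reads contain sequencing errors. The Python below is a minimal sketch of how such a combination could work, not pTrimmer's implementation: a shared k-mer acts as a cheap filter, and a global alignment score between the primer and the read prefix decides whether to trim. The function names, the scoring scheme (+1 match, -1 mismatch, -1 gap), the value of k and the `max_errors` threshold are all assumptions chosen for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of a and b by dynamic programming."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + step,  # align two bases
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


def shares_kmer(a, b, k):
    """True if a and b have at least one k-mer in common."""
    kmers = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[i:i + k] in kmers for i in range(len(b) - k + 1))


def trim_primer(read, primer, k=4, max_errors=1):
    """Remove the primer from the start of read if the prefix aligns well."""
    prefix = read[:len(primer)]
    if not shares_kmer(primer, prefix, k):
        return read  # cheap k-mer filter: no seed, so skip the alignment
    # Each mismatch lowers the score by 2 relative to a perfect match.
    threshold = len(primer) - 2 * max_errors
    if needleman_wunsch(prefix, primer) >= threshold:
        return read[len(primer):]
    return read


trimmed = trim_primer("ACGAACGTTTGGCC", "ACGTACGT")  # one mismatch in primer
untouched = trim_primer("TTTTTTTTGGCC", "ACGTACGT")  # no shared 4-mer
```

The k-mer check avoids running the quadratic-time alignment on reads that clearly do not start with the primer, which is one plausible reason for pairing the two techniques when thousands of primers must be screened.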