Search | BVS Integralidade em Saúde

1.

Alignment of Next-Generation Sequencing Reads.

Reinert, Knut; Langmead, Ben; Weese, David; Evers, Dirk J.

Annu Rev Genomics Hum Genet ; 16: 133-51, 2015.

Article in English | MEDLINE | ID: mdl-25939052

ABSTRACT

High-throughput DNA sequencing has considerably changed the possibilities for conducting biomedical research by measuring billions of short DNA or RNA fragments. A central computational problem, and for many applications a first step, consists of determining where the fragments came from in the original genome. In this article, we review the main techniques for generating the fragments, the main applications, and the main algorithmic ideas for computing a solution to the read alignment problem. In addition, we describe pitfalls and difficulties connected to determining the correct positions of reads.

Subjects

Algorithms , High-Throughput Nucleotide Sequencing/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Genome , Polyploidy , Repetitive Sequences, Nucleic Acid , Software

2.

Phylogenomics of Lophotrochozoa with Consideration of Systematic Error.

Kocot, Kevin M; Struck, Torsten H; Merkel, Julia; Waits, Damien S; Todt, Christiane; Brannock, Pamela M; Weese, David A; Cannon, Johanna T; Moroz, Leonid L; Lieb, Bernhard; Halanych, Kenneth M.

Syst Biol ; 66(2): 256-282, 2017 Mar 01.

Article in English | MEDLINE | ID: mdl-27664188

ABSTRACT

Phylogenomic studies have improved understanding of deep metazoan phylogeny and show promise for resolving incongruences among analyses based on limited numbers of loci. One region of the animal tree that has been especially difficult to resolve, even with phylogenomic approaches, is relationships within Lophotrochozoa (the animal clade that includes molluscs, annelids, and flatworms among others). Lack of resolution in phylogenomic analyses could be due to insufficient phylogenetic signal, limitations in taxon and/or gene sampling, or systematic error. Here, we investigated why lophotrochozoan phylogeny has been such a difficult question to answer by identifying and reducing sources of systematic error. We supplemented existing data with 32 new transcriptomes spanning the diversity of Lophotrochozoa and constructed a new set of Lophotrochozoa-specific core orthologs. Of these, 638 orthologous groups (OGs) passed strict screening for paralogy using a tree-based approach. In order to reduce possible sources of systematic error, we calculated branch-length heterogeneity, evolutionary rate, percent missing data, compositional bias, and saturation for each OG and analyzed increasingly stricter subsets of only the most stringent (best) OGs for these five variables. Principal component analysis of the values for each factor examined for each OG revealed that compositional heterogeneity and average patristic distance contributed most to the variance observed along the first principal component while branch-length heterogeneity and, to a lesser extent, saturation contributed most to the variance observed along the second. Missing data did not strongly contribute to either. Additional sensitivity analyses examined effects of removing taxa with heterogeneous branch lengths, large amounts of missing data, and compositional heterogeneity. 
Although our analyses do not unambiguously resolve lophotrochozoan phylogeny, we advance the field by reducing the list of viable hypotheses. Moreover, our systematic approach for dissection of phylogenomic data can be applied to explore sources of incongruence and poor support in any phylogenomic data set. [Annelida; Brachiopoda; Bryozoa; Entoprocta; Mollusca; Nemertea; Phoronida; Platyzoa; Polyzoa; Spiralia; Trochozoa.].

Subjects

Bryozoa/classification , Bryozoa/genetics , Classification/methods , Genome/genetics , Phylogeny , Animals

3.

Reconstruction of cyclooxygenase evolution in animals suggests variable, lineage-specific duplications, and homologs with low sequence identity.

Havird, Justin C; Kocot, Kevin M; Brannock, Pamela M; Cannon, Johanna T; Waits, Damien S; Weese, David A; Santos, Scott R; Halanych, Kenneth M.

J Mol Evol ; 80(3-4): 193-208, 2015 Apr.

Article in English | MEDLINE | ID: mdl-25758350

ABSTRACT

Cyclooxygenase (COX) enzymatically converts arachidonic acid into prostaglandin G/H in animals and has importance during pregnancy, digestion, and other physiological functions in mammals. COX genes have mainly been described from vertebrates, where gene duplications are common, but few studies have examined COX in invertebrates. Given the increasing ease in generating genomic data, as well as recent, although incomplete descriptions of potential COX sequences in Mollusca, Crustacea, and Insecta, assessing COX evolution across Metazoa is now possible. Here, we recover 40 putative COX orthologs by searching publicly available genomic resources as well as ~250 novel invertebrate transcriptomic datasets. Results suggest the common ancestor of Cnidaria and Bilateria possessed a COX homolog similar to those of vertebrates, although such homologs were not found in poriferan and ctenophore genomes. COX was found in most crustaceans and the majority of molluscs examined, but only specific taxa/lineages within Cnidaria and Annelida. For example, all octocorallians appear to have COX, while no COX homologs were found in hexacorallian datasets. Most species examined had a single homolog, although species-specific COX duplications were found in members of Annelida, Mollusca, and Cnidaria. Additionally, COX genes were not found in Hemichordata, Echinodermata, or Platyhelminthes, and the few previously described COX genes in Insecta lacked appreciable sequence homology (although structural analyses suggest these may still be functional COX enzymes). This analysis provides a benchmark for identifying COX homologs in future genomic and transcriptomic datasets, and identifies lineages for future studies of COX.

Subjects

Evolution, Molecular , Gene Duplication , Prostaglandin-Endoperoxide Synthases/genetics , Animals , Chordata/genetics , Crustacea/genetics , Databases, Genetic , Echinodermata/genetics , Insects/genetics , Molecular Sequence Data , Mollusca/genetics , Phylogeny , Prostaglandin-Endoperoxide Synthases/metabolism , Sequence Alignment

4.

Journaled string tree - a scalable data structure for analyzing thousands of similar genomes on your laptop.

Rahn, René; Weese, David; Reinert, Knut.

Bioinformatics ; 30(24): 3499-505, 2014 Dec 15.

Article in English | MEDLINE | ID: mdl-25028723

ABSTRACT

MOTIVATION: Next-generation sequencing (NGS) has revolutionized biomedical research in the past decade and led to a continuous stream of developments in bioinformatics, addressing the need for fast and space-efficient solutions for analyzing NGS data. Often researchers need to analyze a set of genomic sequences that stem from closely related species or are indeed individuals of the same species. Hence, the analyzed sequences are similar. For analyses where local changes in the examined sequence induce only local changes in the results, it is obviously desirable to examine identical or similar regions not repeatedly. RESULTS: In this work, we provide a datatype that exploits data parallelism inherent in a set of similar sequences by analyzing shared regions only once. In real-world experiments, we show that algorithms that otherwise would scan each reference sequentially can be speeded up by a factor of 115.

Subjects

Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Algorithms , Computers
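
The idea in entry 4 of keeping a set of similar sequences as one shared reference plus per-sequence differences can be illustrated with a small sketch. This is not the journaled string tree itself; the `Delta` record and the `apply_deltas` function are hypothetical names chosen for illustration, and the deltas are assumed to be non-overlapping and given in reference coordinates.

```python
from typing import List, NamedTuple


class Delta(NamedTuple):
    """One edit relative to the reference (hypothetical representation)."""
    pos: int      # 0-based start position in the reference
    ref_len: int  # number of reference characters replaced (0 = pure insertion)
    alt: str      # replacement text ("" = pure deletion)


def apply_deltas(reference: str, deltas: List[Delta]) -> str:
    """Rebuild one sequence from the shared reference and its deltas.

    Assumes the deltas do not overlap; they are sorted by position here.
    """
    parts = []
    cursor = 0
    for d in sorted(deltas):
        parts.append(reference[cursor:d.pos])  # region shared with the reference
        parts.append(d.alt)                    # sequence-specific change
        cursor = d.pos + d.ref_len
    parts.append(reference[cursor:])           # shared tail
    return "".join(parts)


reference = "ACGTACGTAC"
# One substitution, one insertion, and one deletion relative to the reference.
sample = [Delta(2, 1, "T"), Delta(5, 0, "GG"), Delta(8, 2, "")]
rebuilt = apply_deltas(reference, sample)
```

Only the deltas are stored per sequence, so regions without deltas exist once in memory; an algorithm that walks the reference could reuse its results for those regions across all sequences, which is the kind of saving the abstract describes.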

5.

Fiona: a parallel and automatic strategy for read error correction.

Schulz, Marcel H; Weese, David; Holtgrewe, Manuel; Dimitrova, Viktoria; Niu, Sijia; Reinert, Knut; Richard, Hugues.

Bioinformatics ; 30(17): i356-63, 2014 Sep 01.

Article in English | MEDLINE | ID: mdl-25161220

ABSTRACT

MOTIVATION: Automatic error correction of high-throughput sequencing data can have a dramatic impact on the amount of usable base pairs and their quality. It has been shown that the performance of tasks such as de novo genome assembly and SNP calling can be dramatically improved after read error correction. While a large number of methods specialized for correcting substitution errors as found in Illumina data exist, few methods for the correction of indel errors, common to technologies like 454 or Ion Torrent, have been proposed. RESULTS: We present Fiona, a new stand-alone read error-correction method. Fiona provides a new statistical approach for sequencing error detection and optimal error correction and estimates its parameters automatically. Fiona is able to correct substitution, insertion and deletion errors and can be applied to any sequencing technology. It uses an efficient implementation of the partial suffix array to detect read overlaps with different seed lengths in parallel. We tested Fiona on several real datasets from a variety of organisms with different read lengths and compared its performance with state-of-the-art methods. Fiona shows a constantly higher correction accuracy over a broad range of datasets from 454 and Ion Torrent sequencers, without compromise in speed. CONCLUSION: Fiona is an accurate parameter-free read error-correction method that can be run on inexpensive hardware and can make use of multicore parallelization whenever available. Fiona was implemented using the SeqAn library for sequence analysis and is publicly available for download at http://www.seqan.de/projects/fiona. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Algorithms , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , INDEL Mutation

6.

Fast and accurate read mapping with approximate seeds and multiple backtracking.

Siragusa, Enrico; Weese, David; Reinert, Knut.

Nucleic Acids Res ; 41(7): e78, 2013 Apr.

Article in English | MEDLINE | ID: mdl-23358824

ABSTRACT

We present Masai, a read mapper representing the state-of-the-art in terms of speed and accuracy. Our tool is an order of magnitude faster than RazerS 3 and mrFAST, 2-4 times faster and more accurate than Bowtie 2 and BWA. The novelties of our read mapper are filtration with approximate seeds and a method for multiple backtracking. Approximate seeds, compared with exact seeds, increase filtration specificity while preserving sensitivity. Multiple backtracking amortizes the cost of searching a large set of seeds by taking advantage of the repetitiveness of next-generation sequencing data. Combined together, these two methods significantly speed up approximate search on genomic data sets. Masai is implemented in C++ using the SeqAn library. The source code is distributed under the BSD license and binaries for Linux, Mac OS X and Windows can be freely downloaded from http://www.seqan.de/projects/masai.

Subjects

Chromosome Mapping/methods , Software , Algorithms , Animals , Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Escherichia coli/genetics , Genetic Variation , Genomics/methods , Humans
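
The contrast between exact and approximate seeds in entry 6 follows from the pigeonhole principle: if a read matches with at most k errors and is cut into s non-overlapping seeds, at least one seed matches with at most floor(k / s) errors. Exact seeds correspond to s = k + 1; fewer, longer seeds must each tolerate some errors. The sketch below only computes the seeds and the per-seed error budget. The function names are hypothetical, and this is not Masai's implementation.

```python
from typing import List, Tuple


def split_into_seeds(read: str, num_seeds: int) -> List[Tuple[int, str]]:
    """Cut a read into num_seeds contiguous, non-overlapping seeds.

    Returns (offset, seed) pairs; seed lengths differ by at most one.
    Assumes 1 <= num_seeds <= len(read).
    """
    base, extra = divmod(len(read), num_seeds)
    seeds = []
    start = 0
    for i in range(num_seeds):
        length = base + (1 if i < extra else 0)
        seeds.append((start, read[start:start + length]))
        start += length
    return seeds


def seed_error_budget(max_errors: int, num_seeds: int) -> int:
    """Errors each seed must tolerate so that no match with at most
    max_errors errors is missed (pigeonhole principle)."""
    return max_errors // num_seeds


read = "ACGTACGTACGT"
max_errors = 5
exact = split_into_seeds(read, max_errors + 1)  # 6 short seeds, searched exactly
approximate = split_into_seeds(read, 3)         # 3 longer seeds, searched with errors
```

With these toy values, six exact seeds have length 2 and an error budget of 0, while three approximate seeds have length 4 and a budget of 1. Longer seeds occur less often by chance in a genome, which is one way to read the abstract's statement that approximate seeds raise specificity while preserving sensitivity.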

7.

RazerS 3: faster, fully sensitive read mapping.

Weese, David; Holtgrewe, Manuel; Reinert, Knut.

Bioinformatics ; 28(20): 2592-9, 2012 Oct 15.

Article in English | MEDLINE | ID: mdl-22923295

ABSTRACT

MOTIVATION: During the past years, next-generation sequencing has become a key technology for many applications in the biomedical sciences. Throughput continues to increase and new protocols provide longer reads than currently available. In almost all applications, read mapping is a first step. Hence, it is crucial to have algorithms and implementations that perform fast, with high sensitivity, and are able to deal with long reads and a large absolute number of insertions and deletions. RESULTS: RazerS is a read mapping program with adjustable sensitivity based on counting q-grams. In this work, we propose the successor RazerS 3, which now supports shared-memory parallelism, an additional seed-based filter with adjustable sensitivity, a much faster, banded version of the Myers' bit-vector algorithm for verification, memory-saving measures and support for the SAM output format. This leads to a much improved performance for mapping reads, in particular, long reads with many errors. We extensively compare RazerS 3 with other popular read mappers and show that its results are often superior to them in terms of sensitivity while exhibiting practical and often competitive run times. In addition, RazerS 3 works without a pre-computed index. AVAILABILITY AND IMPLEMENTATION: Source code and binaries are freely available for download at http://www.seqan.de/projects/razers. RazerS 3 is implemented in C++ and OpenMP under a GPL license using the SeqAn library and supports Linux, Mac OS X and Windows.

Subjects

Algorithms , Chromosome Mapping/methods , High-Throughput Nucleotide Sequencing/methods
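
Filtering "based on counting q-grams", as mentioned in entry 7, usually relies on the q-gram lemma: a read of length m that matches with at most k errors shares at least m + 1 - q(k + 1) q-grams with its match. The sketch below applies that bound to a single candidate window. It is a simplified illustration with hypothetical function names, not the RazerS filter, which the abstract does not describe in detail.

```python
from collections import Counter


def qgram_threshold(read_len: int, q: int, max_errors: int) -> int:
    """Lower bound on shared q-grams from the q-gram lemma."""
    return read_len + 1 - q * (max_errors + 1)


def shared_qgrams(read: str, window: str, q: int) -> int:
    """Count q-grams common to read and window, respecting multiplicity."""
    read_counts = Counter(read[i:i + q] for i in range(len(read) - q + 1))
    window_counts = Counter(window[i:i + q] for i in range(len(window) - q + 1))
    return sum((read_counts & window_counts).values())


def passes_filter(read: str, window: str, q: int, max_errors: int) -> bool:
    """Keep a window for verification unless the lemma rules it out."""
    threshold = qgram_threshold(len(read), q, max_errors)
    if threshold <= 0:
        return True  # the lemma gives no guarantee, so nothing may be discarded
    return shared_qgrams(read, window, q) >= threshold


read = "ACGTACGT"
one_mismatch = "ACGTTCGT"
unrelated = "TTTTTTTT"
```

Here `passes_filter(read, one_mismatch, 3, 1)` keeps the window (threshold 3, three shared 3-grams), while the unrelated window is discarded without running a full alignment. Windows that pass still need verification, for which the abstract names a banded Myers bit-vector algorithm.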

8.

Detecting genomic indel variants with exact breakpoints in single- and paired-end sequencing data using SplazerS.

Emde, Anne-Katrin; Schulz, Marcel H; Weese, David; Sun, Ruping; Vingron, Martin; Kalscheuer, Vera M; Haas, Stefan A; Reinert, Knut.

Bioinformatics ; 28(5): 619-27, 2012 Mar 01.

Article in English | MEDLINE | ID: mdl-22238266

ABSTRACT

MOTIVATION: The reliable detection of genomic variation in resequencing data is still a major challenge, especially for variants larger than a few base pairs. Sequencing reads crossing boundaries of structural variation carry the potential for their identification, but are difficult to map. RESULTS: Here we present a method for 'split' read mapping, where prefix and suffix match of a read may be interrupted by a longer gap in the read-to-reference alignment. We use this method to accurately detect medium-sized insertions and long deletions with precise breakpoints in genomic resequencing data. Compared with alternative split mapping methods, SplazerS significantly improves sensitivity for detecting large indel events, especially in variant-rich regions. Our method is robust in the presence of sequencing errors as well as alignment errors due to genomic mutations/divergence, and can be used on reads of variable lengths. Our analysis shows that SplazerS is a versatile tool applicable to unanchored or single-end as well as anchored paired-end reads. In addition, application of SplazerS to targeted resequencing data led to the interesting discovery of a complete, possibly functional gene retrocopy variant. AVAILABILITY: SplazerS is available from http://www.seqan.de/projects/splazers. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Genomics/methods , INDEL Mutation , Sequence Analysis, DNA , Algorithms , Humans

9.

RazerS--fast read mapping with sensitivity control.

Weese, David; Emde, Anne-Katrin; Rausch, Tobias; Döring, Andreas; Reinert, Knut.

Genome Res ; 19(9): 1646-54, 2009 Sep.

Article in English | MEDLINE | ID: mdl-19592482

ABSTRACT

Second-generation sequencing technologies deliver DNA sequence data at unprecedented high throughput. Common to most biological applications is a mapping of the reads to an almost identical or highly similar reference genome. Due to the large amounts of data, efficient algorithms and implementations are crucial for this task. We present an efficient read mapping tool called RazerS. It allows the user to align sequencing reads of arbitrary length using either the Hamming distance or the edit distance. Our tool can work either lossless or with a user-defined loss rate at higher speeds. Given the loss rate, we present an approach that guarantees not to lose more reads than specified. This enables the user to adapt to the problem at hand and provides a seamless tradeoff between sensitivity and running time.

Subjects

Chromosome Mapping/methods , Sequence Analysis, DNA/methods , Software , Algorithms , Animals , Drosophila melanogaster/genetics , Genome, Human/genetics , Genome, Insect/genetics , Humans , Sensitivity and Specificity , Sequence Alignment , Time Factors , User-Computer Interface
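
The two distance measures named in entry 9 differ in what counts as an error: Hamming distance allows only mismatches between equal-length strings, while edit distance also allows insertions and deletions. The following sketch uses the textbook dynamic-programming recurrence for edit distance. It is meant only to make the distinction concrete and is not the verification code used by RazerS.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of mismatching positions; defined only for equal lengths."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions, and deletions
    turning a into b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,             # delete x from a
                curr[j - 1] + 1,         # insert y into a
                prev[j - 1] + (x != y),  # match or substitute
            ))
        prev = curr
    return prev[-1]
```

For example, a read `ACGT` against the reference substring `AGGT` has Hamming distance 1, whereas against `AGT` the Hamming distance is undefined but the edit distance is 1 (one deleted base).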

10.

STELLAR: fast and exact local alignments.

Kehr, Birte; Weese, David; Reinert, Knut.

BMC Bioinformatics ; 12 Suppl 9: S15, 2011 Oct 05.

Article in English | MEDLINE | ID: mdl-22151882

ABSTRACT

BACKGROUND: Large-scale comparison of genomic sequences requires reliable tools for the search of local alignments. Practical local aligners are in general fast, but heuristic, and hence sometimes miss significant matches. RESULTS: We present here the local pairwise aligner STELLAR that has full sensitivity for ε-alignments, i.e. guarantees to report all local alignments of a given minimal length and maximal error rate. The aligner is composed of two steps, filtering and verification. We apply the SWIFT algorithm for lossless filtering, and have developed a new verification strategy that we prove to be exact. Our results on simulated and real genomic data confirm and quantify the conjecture that heuristic tools like BLAST or BLAT miss a large percentage of significant local alignments. CONCLUSIONS: STELLAR is very practical and fast on very long sequences which makes it a suitable new tool for finding local alignments between genomic sequences under the edit distance model. Binaries are freely available for Linux, Windows, and Mac OS X at http://www.seqan.de/projects/stellar. The source code is freely distributed with the SeqAn C++ library version 1.3 and later at http://www.seqan.de.

Subjects

Genomics/methods , Sequence Alignment/methods , Software , Algorithms , Animals , Drosophila/genetics
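
The acceptance criterion in entry 10 (a given minimal length and a maximal error rate) can be written as a short check on a finished alignment. The sketch assumes, as a simplification not spelled out in the abstract, that both length and error rate are measured over alignment columns and that gaps are written as `-`. The function name `is_epsilon_match` is hypothetical, and the code does not reproduce STELLAR's filtering or verification.

```python
def is_epsilon_match(row_a: str, row_b: str,
                     min_length: int, max_error_rate: float) -> bool:
    """Check a gapped pairwise alignment against length and error-rate limits.

    row_a and row_b are the two alignment rows, equally long, with '-' for gaps.
    Every column whose two characters differ (mismatch or gap) counts as an error.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have the same number of columns")
    columns = len(row_a)
    errors = sum(x != y for x, y in zip(row_a, row_b))
    return columns >= min_length and errors <= max_error_rate * columns


row_a = "ACGTACGTAC"
row_b = "ACGTAC-TAC"  # one gap column, so one error in ten columns
```

With these rows, a minimal length of 10 and a maximal error rate of 0.1 accept the alignment, while an error rate of 0.05 or a minimal length of 11 reject it.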

11.

A novel and well-defined benchmarking method for second generation read mapping.

Holtgrewe, Manuel; Emde, Anne-Katrin; Weese, David; Reinert, Knut.

BMC Bioinformatics ; 12: 210, 2011 May 26.

Article in English | MEDLINE | ID: mdl-21615913

ABSTRACT

BACKGROUND: Second generation sequencing technologies yield DNA sequence data at ultra high-throughput. Common to most biological applications is a mapping of the reads to an almost identical or highly similar reference genome. The assessment of the quality of read mapping results is not straightforward and has not been formalized so far. Hence, it has not been easy to compare different read mapping approaches in a unified way and to determine which program is the best for what task. RESULTS: We present a new benchmark method, called Rabema (Read Alignment BEnchMArk), for read mappers. It consists of a strict definition of the read mapping problem and of tools to evaluate the result of arbitrary read mappers supporting the SAM output format. CONCLUSIONS: We show the usefulness of the benchmark program by performing a comparison of popular read mappers. The tools supporting the benchmark are licensed under the GPL and available from http://www.seqan.de/projects/rabema.html.

Subjects

Sequence Analysis, DNA/methods , Sequence Analysis, DNA/standards , Algorithms , Animals , High-Throughput Nucleotide Sequencing , Humans
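
Entry 11 evaluates any read mapper that writes SAM, a tab-separated text format whose alignment lines begin with eleven mandatory fields. As a rough illustration of what an evaluation tool has to read, the sketch below parses those fields from one line. The `SamRecord` type and the helper names are hypothetical, optional tags are ignored, and this is not code from Rabema.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    """The eleven mandatory SAM alignment fields."""
    qname: str  # read name
    flag: int   # bitwise flags
    rname: str  # reference sequence name
    pos: int    # 1-based leftmost mapping position
    mapq: int   # mapping quality
    cigar: str  # alignment description
    rnext: str  # reference name of the mate
    pnext: int  # position of the mate
    tlen: int   # observed template length
    seq: str    # read sequence
    qual: str   # base qualities


def parse_sam_line(line: str) -> SamRecord:
    """Parse one alignment line (not a header line starting with '@')."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    return SamRecord(fields[0], int(fields[1]), fields[2], int(fields[3]),
                     int(fields[4]), fields[5], fields[6], int(fields[7]),
                     int(fields[8]), fields[9], fields[10])


def is_unmapped(record: SamRecord) -> bool:
    """Flag bit 0x4 marks a read without a reported alignment."""
    return bool(record.flag & 0x4)


line = "read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\n"
record = parse_sam_line(line)
```

A benchmark would then compare fields such as `rname` and `pos` against the positions it considers correct for each read; how Rabema defines those positions is part of the paper and not shown here.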

12.

MicroRazerS: rapid alignment of small RNA reads.

Emde, Anne-Katrin; Grunert, Marcel; Weese, David; Reinert, Knut; Sperling, Silke R.

Bioinformatics ; 26(1): 123-4, 2010 Jan 01.

Article in English | MEDLINE | ID: mdl-19880369

ABSTRACT

MOTIVATION: Deep sequencing has become the method of choice for determining the small RNA content of a cell. Mapping the sequenced reads onto their reference genome serves as the basis for all further analyses, namely for identification and quantification. A method frequently used is Mega BLAST followed by several filtering steps, even though it is slow and inefficient for this task. Also, none of the currently available short read aligners has established itself for the particular task of small RNA mapping. RESULTS: We present MicroRazerS, a tool optimized for mapping small RNAs onto a reference genome. It is an order of magnitude faster than Mega BLAST and comparable in speed with other short read mapping tools. In addition, it is more sensitive and easy to handle and adjust. AVAILABILITY: MicroRazerS is part of the SeqAn C++ library and can be downloaded from http://www.seqan.de/projects/MicroRazerS.html.

Subjects

Algorithms , MicroRNAs/genetics , Sequence Alignment/methods , Sequence Analysis, RNA/methods , Software , Base Sequence , Molecular Sequence Data

13.

A consistency-based consensus algorithm for de novo and reference-guided sequence assembly of short reads.

Rausch, Tobias; Koren, Sergey; Denisov, Gennady; Weese, David; Emde, Anne-Katrin; Döring, Andreas; Reinert, Knut.

Bioinformatics ; 25(9): 1118-24, 2009 May 01.

Article in English | MEDLINE | ID: mdl-19269990

ABSTRACT

MOTIVATION: Novel high-throughput sequencing technologies pose new algorithmic challenges in handling massive amounts of short-read, high-coverage data. A robust and versatile consensus tool is of particular interest for such data since a sound multi-read alignment is a prerequisite for variation analyses, accurate genome assemblies and insert sequencing. RESULTS: A multi-read alignment algorithm for de novo or reference-guided genome assembly is presented. The program identifies segments shared by multiple reads and then aligns these segments using a consistency-enhanced alignment graph. On real de novo sequencing data obtained from the newly established NCBI Short Read Archive, the program performs similarly in quality to other comparable programs. On more challenging simulated datasets for insert sequencing and variation analyses, our program outperforms the other tools. AVAILABILITY: The consensus program can be downloaded from http://www.seqan.de/projects/consensus.html. It can be used stand-alone or in conjunction with the Celera Assembler. Both application scenarios as well as the usage of the tool are described in the documentation.

Subjects

Algorithms , Sequence Alignment/methods , Base Sequence , Computational Biology/methods , Internet , Molecular Sequence Data , Sequence Analysis, DNA/methods

14.

Phenotypic Comparability from Genotypic Variability among Physically Structured Microbial Consortia.

Hoffman, Stephanie K; Seitz, Kiley W; Havird, Justin C; Weese, David A; Santos, Scott R.

Integr Comp Biol ; 60(2): 288-303, 2020 08 01.

Article in English | MEDLINE | ID: mdl-32353148

ABSTRACT

Microbiomes represent the collective bacteria, archaea, protist, fungi, and virus communities living in or on individual organisms that are typically multicellular eukaryotes. Such consortia have become recognized as having significant impacts on the development, health, and disease status of their hosts. Since understanding the mechanistic connections between an individual's genetic makeup and their complete set of traits (i.e., genome to phenome) requires consideration at different levels of biological organization, this should include interactions with, and the organization of, microbial consortia. To understand microbial consortia organization, we elucidated the genetic constituents among phenotypically similar (and hypothesized functionally-analogous) layers (i.e., top orange, second orange, pink, and green layers) in the unique laminated orange cyanobacterial-bacterial crusts endemic to Hawaii's anchialine ecosystem. High-throughput amplicon sequencing of ribosomal RNA hypervariable regions (i.e., Bacteria-specific V6 and Eukarya-biased V9) revealed microbial richness increasing by crust layer depth, with samples of a given layer more similar to different layers from the same geographic site than to their phenotypically-analogous layer from different sites. Furthermore, samples from sites on the same island were more similar to each other, regardless of which layer they originated from, than to analogous layers from another island. However, cyanobacterial and algal taxa were abundant in all surface and bottom layers, with anaerobic and chemoautotrophic taxa concentrated in the middle two layers, suggesting crust oxygenation from both above and below. Thus, the arrangement of oxygenated vs. anoxygenated niches in these orange crusts is functionally distinct relative to other laminated cyanobacterial-bacterial communities examined to date, with convergent evolution due to similar environmental conditions a likely driver for these phenotypically comparable but genetically distinct microbial consortia.

Subjects

Bacteria/genetics , Genotype , Microbial Consortia/genetics , Phenotype , Cyanobacteria/genetics , Hawaii

15.

Segment-based multiple sequence alignment.

Rausch, Tobias; Emde, Anne-Katrin; Weese, David; Döring, Andreas; Notredame, Cedric; Reinert, Knut.

Bioinformatics ; 24(16): i187-92, 2008 Aug 15.

Article in English | MEDLINE | ID: mdl-18689823

ABSTRACT

MOTIVATION: Many multiple sequence alignment tools have been developed in the past, progressing either in speed or alignment accuracy. Given the importance and wide-spread use of alignment tools, progress in both categories is a contribution to the community and has driven research in the field so far. RESULTS: We introduce a graph-based extension to the consistency-based, progressive alignment strategy. We apply the consistency notion to segments instead of single characters. The main problem we solve in this context is to define segments of the sequences in such a way that a graph-based alignment is possible. We implemented the algorithm using the SeqAn library and report results on amino acid and DNA sequences. The benefit of our approach is threefold: (1) sequences with conserved blocks can be rapidly aligned, (2) the implementation is conceptually easy, generic and fast and (3) the consistency idea can be extended to align multiple genomic sequences. AVAILABILITY: The segment-based multiple sequence alignment tool can be downloaded from http://www.seqan.de/projects/msa.html. A novel version of T-Coffee interfaced with the tool is available from http://www.tcoffee.org. The usage of the tool is described in both documentations.

Subjects

Algorithms , Sequence Alignment/methods , Sequence Analysis/methods , Software

16.

Effects of Predator-Prey Interactions on Predator Traits: Differentiation of Diets and Venoms of a Marine Snail.

Weese, David A; Duda, Thomas F.

Toxins (Basel) ; 11(5), 2019 05 25.

Article in English | MEDLINE | ID: mdl-31130611

ABSTRACT

Species interactions are fundamental ecological forces that can have significant impacts on the evolutionary trajectories of species. Nonetheless, the contribution of predator-prey interactions to genetic and phenotypic divergence remains largely unknown. Predatory marine snails of the family Conidae exhibit specializations for different prey items and intraspecific variation in prey utilization patterns at geographic scales. Because cone snails utilize venom to capture prey and venom peptides are direct gene products, it is feasible to examine the evolution of genes associated with changes in resource utilization. Here, we compared feeding ecologies and venom duct transcriptomes of individuals from three populations of Conus miliaris, a species that exhibits geographic variation in prey utilization and dietary breadth, in order to determine the extent to which dietary differences are correlated with differences in venom composition, and if expanded niche breadth is associated with increased variation in venom composition. While populations showed little to no overlap in resource utilization, taxonomic richness of prey was greatest at Easter Island. Changes in dietary breadth were associated with differences in expression patterns and increased genetic differentiation of toxin-related genes. The Easter Island population also exhibited greater diversity of toxin-related transcripts, but did not show increased variance in expression of these transcripts. These results imply that differences in dietary breadth contribute more to the structural and regulatory differentiation of venoms than differences in diet.

Subjects

Conotoxins/genetics , Conus Snail/physiology , American Samoa , Animals , Conus Snail/genetics , Diet , Feeding Behavior , Guam , Polymorphism, Single Nucleotide , Polynesia , Predatory Behavior , Transcriptome

17.

SeqAn an efficient, generic C++ library for sequence analysis.

Döring, Andreas; Weese, David; Rausch, Tobias; Reinert, Knut.

BMC Bioinformatics ; 9: 11, 2008 Jan 09.

Article in English | MEDLINE | ID: mdl-18184432

ABSTRACT

BACKGROUND: The use of novel algorithmic techniques is pivotal to many important problems in life science. For example the sequencing of the human genome [1] would not have been possible without advanced assembly algorithms. However, owing to the high speed of technological progress and the urgent need for bioinformatics tools, there is a widening gap between state-of-the-art algorithmic techniques and the actual algorithmic components of tools that are in widespread use. RESULTS: To remedy this trend we propose the use of SeqAn, a library of efficient data types and algorithms for sequence analysis in computational biology. SeqAn comprises implementations of existing, practical state-of-the-art algorithmic components to provide a sound basis for algorithm testing and development. In this paper we describe the design and content of SeqAn and demonstrate its use by giving two examples. In the first example we show an application of SeqAn as an experimental platform by comparing different exact string matching algorithms. The second example is a simple version of the well-known MUMmer tool rewritten in SeqAn. Results indicate that our implementation is very efficient and versatile to use. CONCLUSION: We anticipate that SeqAn greatly simplifies the rapid development of new bioinformatics tools by providing a collection of readily usable, well-designed algorithmic components which are fundamental for the field of sequence analysis. This leverages not only the implementation of new algorithms, but also enables a sound analysis and comparison of existing algorithms.

Subjects

Algorithms , Databases, Genetic , Programming Languages , Sequence Alignment/methods , Sequence Analysis/methods , Software , User-Computer Interface , Database Management Systems

18.

The SeqAn C++ template library for efficient sequence analysis: A resource for programmers.

Reinert, Knut; Dadi, Temesgen Hailemariam; Ehrhardt, Marcel; Hauswedell, Hannes; Mehringer, Svenja; Rahn, René; Kim, Jongkyu; Pockrandt, Christopher; Winkler, Jörg; Siragusa, Enrico; Urgese, Gianvito; Weese, David.

J Biotechnol ; 261: 157-168, 2017 Nov 10.

Article in English | MEDLINE | ID: mdl-28888961

ABSTRACT

BACKGROUND: The use of novel algorithmic techniques is pivotal to many important problems in life science. For example the sequencing of the human genome (Venter et al., 2001) would not have been possible without advanced assembly algorithms and the development of practical BWT based read mappers have been instrumental for NGS analysis. However, owing to the high speed of technological progress and the urgent need for bioinformatics tools, there was a widening gap between state-of-the-art algorithmic techniques and the actual algorithmic components of tools that are in widespread use. We previously addressed this by introducing the SeqAn library of efficient data types and algorithms in 2008 (Döring et al., 2008). RESULTS: The SeqAn library has matured considerably since its first publication 9 years ago. In this article we review its status as an established resource for programmers in the field of sequence analysis and its contributions to many analysis tools. CONCLUSIONS: We anticipate that SeqAn will continue to be a valuable resource, especially since it started to actively support various hardware acceleration techniques in a systematic manner.

Subjects

Genetic Databases , Genomics/methods , DNA Sequence Analysis/methods , Software , Algorithms , Human Genome , High-Throughput Nucleotide Sequencing , Humans , Sequence Alignment

19.

Phylogenetic utility, and variability in structure and content, of complete mitochondrial genomes among genetic lineages of the Hawaiian anchialine shrimp Halocaridina rubra Holthuis 1963 (Atyidae:Decapoda).

Justice, Joshua L; Weese, David A; Santos, Scott Ross.

Mitochondrial DNA A DNA Mapp Seq Anal ; 27(4): 2710-8, 2016 07.

Article in English | MEDLINE | ID: mdl-26061341

ABSTRACT

The Atyidae are caridean shrimp possessing hair-like setae on their claws and are important contributors to ecological services in tropical and temperate fresh and brackish water ecosystems. Complete mitochondrial genomes have only been reported from five of the 449 species in the family, thus limiting understanding of mitochondrial genome evolution and the phylogenetic utility of complete mitochondrial sequences in the Atyidae. Here, comparative analyses of complete mitochondrial genomes from eight genetic lineages of Halocaridina rubra, an atyid endemic to the anchialine ecosystem of the Hawaiian Archipelago, are presented. Although gene number, order, and orientation were syntenic among genomes, three regions were identified and further quantified where conservation was substantially lower: (1) high length and sequence variability in the tRNA-Lys and tRNA-Asp intergenic region; (2) a 317-bp insertion between the NAD6 and CytB genes confined to a single lineage and representing a partial duplication of CytB; and (3) the putative control region. Phylogenetic analyses utilizing complete mitochondrial sequences provided new insights into relationships among the H. rubra genetic lineages, with the topology of one clade correlating to the geologic sequence of the islands. However, deeper nodes in the phylogeny lacked bootstrap support. Overall, our results from H. rubra suggest intra-specific mitochondrial genomic diversity could be underestimated across the Metazoa since the vast majority of complete genomes are from just a single individual of a species.

Subjects

Decapoda/genetics , Mitochondrial Genome/genetics , Animals , Decapoda/classification , Ecosystem , Molecular Evolution , Hawaii , Phylogeny , Transfer RNA/genetics , Tandem Repeat Sequences/genetics

20.

CIDANE: comprehensive isoform discovery and abundance estimation.

Canzar, Stefan; Andreotti, Sandro; Weese, David; Reinert, Knut; Klau, Gunnar W.

Genome Biol ; 17: 16, 2016 Jan 30.

Article in English | MEDLINE | ID: mdl-26831908

ABSTRACT

We present CIDANE, a novel framework for genome-based transcript reconstruction and quantification from RNA-seq reads. CIDANE assembles transcripts efficiently with significantly higher sensitivity and precision than existing tools. Its algorithmic core not only reconstructs transcripts ab initio, but also allows the use of the growing annotation of known splice sites, transcription start and end sites, or full-length transcripts, which are available for most model organisms. CIDANE supports the integrated analysis of RNA-seq and additional gene-boundary data and recovers splice junctions that are invisible to other methods. CIDANE is available at http://ccb.jhu.edu/software/cidane/.

Subjects

High-Throughput Nucleotide Sequencing/methods , Protein Isoforms/genetics , RNA/genetics , RNA Sequence Analysis/methods , Algorithms , Gene Expression Profiling , Protein Isoforms/isolation & purification , RNA Splicing/genetics , Software , Transcriptome/genetics
