Results 1 - 20 of 25
1.
Integr Comp Biol ; 60(2): 288-303, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32353148

ABSTRACT

Microbiomes represent the collective bacteria, archaea, protist, fungi, and virus communities living in or on individual organisms that are typically multicellular eukaryotes. Such consortia have become recognized as having significant impacts on the development, health, and disease status of their hosts. Since understanding the mechanistic connections between an individual's genetic makeup and their complete set of traits (i.e., genome to phenome) requires consideration at different levels of biological organization, this should include interactions with, and the organization of, microbial consortia. To understand microbial consortia organization, we elucidated the genetic constituents among phenotypically similar (and hypothesized functionally-analogous) layers (i.e., top orange, second orange, pink, and green layers) in the unique laminated orange cyanobacterial-bacterial crusts endemic to Hawaii's anchialine ecosystem. High-throughput amplicon sequencing of ribosomal RNA hypervariable regions (i.e., Bacteria-specific V6 and Eukarya-biased V9) revealed microbial richness increasing by crust layer depth, with samples of a given layer more similar to different layers from the same geographic site than to their phenotypically-analogous layer from different sites. Furthermore, samples from sites on the same island were more similar to each other, regardless of which layer they originated from, than to analogous layers from another island. However, cyanobacterial and algal taxa were abundant in all surface and bottom layers, with anaerobic and chemoautotrophic taxa concentrated in the middle two layers, suggesting crust oxygenation from both above and below. Thus, the arrangement of oxygenated vs. anoxygenated niches in these orange crusts is functionally distinct relative to other laminated cyanobacterial-bacterial communities examined to date, with convergent evolution due to similar environmental conditions a likely driver for these phenotypically comparable but genetically distinct microbial consortia.


Subject(s)
Bacteria/genetics , Genotype , Microbial Consortia/genetics , Phenotype , Cyanobacteria/genetics , Hawaii
2.
Toxins (Basel) ; 11(5), 2019 05 25.
Article in English | MEDLINE | ID: mdl-31130611

ABSTRACT

Species interactions are fundamental ecological forces that can have significant impacts on the evolutionary trajectories of species. Nonetheless, the contribution of predator-prey interactions to genetic and phenotypic divergence remains largely unknown. Predatory marine snails of the family Conidae exhibit specializations for different prey items and intraspecific variation in prey utilization patterns at geographic scales. Because cone snails utilize venom to capture prey and venom peptides are direct gene products, it is feasible to examine the evolution of genes associated with changes in resource utilization. Here, we compared feeding ecologies and venom duct transcriptomes of individuals from three populations of Conus miliaris, a species that exhibits geographic variation in prey utilization and dietary breadth, in order to determine the extent to which dietary differences are correlated with differences in venom composition, and if expanded niche breadth is associated with increased variation in venom composition. While populations showed little to no overlap in resource utilization, taxonomic richness of prey was greatest at Easter Island. Changes in dietary breadth were associated with differences in expression patterns and increased genetic differentiation of toxin-related genes. The Easter Island population also exhibited greater diversity of toxin-related transcripts, but did not show increased variance in expression of these transcripts. These results imply that differences in dietary breadth contribute more to the structural and regulatory differentiation of venoms than differences in diet.


Subject(s)
Conotoxins/genetics , Conus Snail/physiology , American Samoa , Animals , Conus Snail/genetics , Diet , Feeding Behavior , Guam , Single Nucleotide Polymorphism , Polynesia , Predatory Behavior , Transcriptome
3.
J Biotechnol ; 261: 157-168, 2017 Nov 10.
Article in English | MEDLINE | ID: mdl-28888961

ABSTRACT

BACKGROUND: The use of novel algorithmic techniques is pivotal to many important problems in life science. For example, the sequencing of the human genome (Venter et al., 2001) would not have been possible without advanced assembly algorithms, and the development of practical BWT-based read mappers has been instrumental for NGS analysis. However, owing to the high speed of technological progress and the urgent need for bioinformatics tools, there was a widening gap between state-of-the-art algorithmic techniques and the actual algorithmic components of tools that are in widespread use. We previously addressed this by introducing the SeqAn library of efficient data types and algorithms in 2008 (Döring et al., 2008). RESULTS: The SeqAn library has matured considerably since its first publication 9 years ago. In this article, we review its status as an established resource for programmers in the field of sequence analysis and its contributions to many analysis tools. CONCLUSIONS: We anticipate that SeqAn will continue to be a valuable resource, especially since it has started to actively support various hardware acceleration techniques in a systematic manner.


Subject(s)
Genetic Databases , Genomics/methods , DNA Sequence Analysis/methods , Software , Algorithms , Human Genome , High-Throughput Nucleotide Sequencing , Humans , Sequence Alignment
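
The abstract above credits BWT-based read mappers as instrumental for NGS analysis. As a rough illustration of the underlying idea (not SeqAn's API or any particular mapper's implementation), the following Python sketch builds a toy FM-index over a short text and finds exact pattern occurrences by backward search. The names `build_fm_index` and `backward_search` are made up for this example, and the naive suffix sort is only suitable for tiny inputs.

```python
def build_fm_index(text):
    """Build a toy FM-index: suffix array, C table, and occurrence table."""
    # Sentinel character; assumed absent from the text and smaller than
    # every other symbol (true for "$" versus uppercase DNA letters).
    text += "$"
    # Naive suffix array construction by sorting all suffixes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps around at 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # c[ch] = number of characters in the text strictly smaller than ch.
    c, total = {}, 0
    for ch in alphabet:
        c[ch] = total
        total += text.count(ch)
    # occ[ch][i] = number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, c, occ


def backward_search(pattern, sa, c, occ):
    """Return sorted start positions of exact occurrences of pattern."""
    lo, hi = 0, len(sa)  # half-open interval of suffix array rows
    for ch in reversed(pattern):
        if ch not in c:
            return []
        lo = c[ch] + occ[ch][lo]
        hi = c[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


if __name__ == "__main__":
    sa, c, occ = build_fm_index("ACGTACGATTACGA")
    print(backward_search("ACG", sa, c, occ))
```

Production read mappers replace the naive sort and the dense occurrence table with linear-time construction and sampled tables, and extend backward search to tolerate mismatches and indels.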
4.
Syst Biol ; 66(2): 256-282, 2017 Mar 01.
Article in English | MEDLINE | ID: mdl-27664188

ABSTRACT

Phylogenomic studies have improved understanding of deep metazoan phylogeny and show promise for resolving incongruences among analyses based on limited numbers of loci. One region of the animal tree that has been especially difficult to resolve, even with phylogenomic approaches, is relationships within Lophotrochozoa (the animal clade that includes molluscs, annelids, and flatworms among others). Lack of resolution in phylogenomic analyses could be due to insufficient phylogenetic signal, limitations in taxon and/or gene sampling, or systematic error. Here, we investigated why lophotrochozoan phylogeny has been such a difficult question to answer by identifying and reducing sources of systematic error. We supplemented existing data with 32 new transcriptomes spanning the diversity of Lophotrochozoa and constructed a new set of Lophotrochozoa-specific core orthologs. Of these, 638 orthologous groups (OGs) passed strict screening for paralogy using a tree-based approach. In order to reduce possible sources of systematic error, we calculated branch-length heterogeneity, evolutionary rate, percent missing data, compositional bias, and saturation for each OG and analyzed increasingly stricter subsets of only the most stringent (best) OGs for these five variables. Principal component analysis of the values for each factor examined for each OG revealed that compositional heterogeneity and average patristic distance contributed most to the variance observed along the first principal component while branch-length heterogeneity and, to a lesser extent, saturation contributed most to the variance observed along the second. Missing data did not strongly contribute to either. Additional sensitivity analyses examined effects of removing taxa with heterogeneous branch lengths, large amounts of missing data, and compositional heterogeneity. Although our analyses do not unambiguously resolve lophotrochozoan phylogeny, we advance the field by reducing the list of viable hypotheses. Moreover, our systematic approach for dissection of phylogenomic data can be applied to explore sources of incongruence and poor support in any phylogenomic data set. [Annelida; Brachiopoda; Bryozoa; Entoprocta; Mollusca; Nemertea; Phoronida; Platyzoa; Polyzoa; Spiralia; Trochozoa.].


Subject(s)
Bryozoa/classification , Bryozoa/genetics , Classification/methods , Genome/genetics , Phylogeny , Animals
5.
Genome Biol ; 17: 16, 2016 Jan 30.
Article in English | MEDLINE | ID: mdl-26831908

ABSTRACT

We present CIDANE, a novel framework for genome-based transcript reconstruction and quantification from RNA-seq reads. CIDANE assembles transcripts efficiently with significantly higher sensitivity and precision than existing tools. Its algorithmic core not only reconstructs transcripts ab initio, but also allows the use of the growing annotation of known splice sites, transcription start and end sites, or full-length transcripts, which are available for most model organisms. CIDANE supports the integrated analysis of RNA-seq and additional gene-boundary data and recovers splice junctions that are invisible to other methods. CIDANE is available at http://ccb.jhu.edu/software/cidane/.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Protein Isoforms/genetics , RNA/genetics , RNA Sequence Analysis/methods , Algorithms , Gene Expression Profiling , Protein Isoforms/isolation & purification , RNA Splicing/genetics , Software , Transcriptome/genetics
6.
Mitochondrial DNA A DNA Mapp Seq Anal ; 27(4): 2710-8, 2016 07.
Article in English | MEDLINE | ID: mdl-26061341

ABSTRACT

The Atyidae are caridean shrimp possessing hair-like setae on their claws and are important contributors to ecological services in tropical and temperate fresh and brackish water ecosystems. Complete mitochondrial genomes have only been reported from five of the 449 species in the family, thus limiting understanding of mitochondrial genome evolution and the phylogenetic utility of complete mitochondrial sequences in the Atyidae. Here, comparative analyses of complete mitochondrial genomes from eight genetic lineages of Halocaridina rubra, an atyid endemic to the anchialine ecosystem of the Hawaiian Archipelago, are presented. Although gene number, order, and orientation were syntenic among genomes, three regions were identified and further quantified where conservation was substantially lower: (1) high length and sequence variability in the tRNA-Lys and tRNA-Asp intergenic region; (2) a 317-bp insertion between the NAD6 and CytB genes confined to a single lineage and representing a partial duplication of CytB; and (3) the putative control region. Phylogenetic analyses utilizing complete mitochondrial sequences provided new insights into relationships among the H. rubra genetic lineages, with the topology of one clade correlating to the geologic sequence of the islands. However, deeper nodes in the phylogeny lacked bootstrap support. Overall, our results from H. rubra suggest intra-specific mitochondrial genomic diversity could be underestimated across the Metazoa since the vast majority of complete genomes are from just a single individual of a species.


Subject(s)
Decapoda/genetics , Mitochondrial Genome/genetics , Animals , Decapoda/classification , Ecosystem , Molecular Evolution , Hawaii , Phylogeny , Transfer RNA/genetics , Tandem Repeat Sequences/genetics
7.
Biol Bull ; 229(2): 134-42, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26504154

ABSTRACT

Larvae in aquatic habitats often develop in environments different from those they inhabit as adults. Shrimp in the Atyidae exemplify this trend, as larvae of many species require salt or brackish water for development, while adults are freshwater-adapted. An exception within the Atyidae family is the "anchialine clade," which are euryhaline as adults and endemic to habitats with subterranean fresh and marine water influences. Although the Hawaiian anchialine atyid Halocaridina rubra is a strong osmoregulator, its larvae have never been observed in nature. Moreover, larval development in anchialine species is poorly studied. Here, reproductive trends in laboratory colonies over a 5-y period are presented from seven genetic lineages and one mixed population of H. rubra; larval survivorship under varying salinities is also discussed. The presence and number of larvae differed significantly among lineages, with the mixed population being the most prolific. Statistical differences in reproduction attributable to seasonality also were identified. Larval survivorship was lowest (12% settlement rate) at a salinity approaching fresh water and significantly higher in brackish and seawater (88% and 72%, respectively). Correlated with this finding, identifiable gills capable of ion transport did not develop until metamorphosis into juveniles. Thus, early life stages of H. rubra are apparently excluded from surface waters, which are characterized by lower and fluctuating salinities. Instead, these stages are restricted to the subterranean (where there is higher and more stable salinity) portion of Hawaii's anchialine habitats due to their inability to tolerate low salinities. Taken together, these data contribute to the understudied area of larval ecology in the anchialine ecosystem.


Subject(s)
Decapoda/growth & development , Decapoda/physiology , Animals , Decapoda/genetics , Ecosystem , Female , Fresh Water , Gills/growth & development , Hawaii , Larva/growth & development , Larva/physiology , Male , Biological Metamorphosis , Reproduction/physiology , Salinity , Seasons , Seawater
8.
Annu Rev Genomics Hum Genet ; 16: 133-51, 2015.
Article in English | MEDLINE | ID: mdl-25939052

ABSTRACT

High-throughput DNA sequencing has considerably changed the possibilities for conducting biomedical research by measuring billions of short DNA or RNA fragments. A central computational problem, and for many applications a first step, consists of determining where the fragments came from in the original genome. In this article, we review the main techniques for generating the fragments, the main applications, and the main algorithmic ideas for computing a solution to the read alignment problem. In addition, we describe pitfalls and difficulties connected to determining the correct positions of reads.


Subject(s)
Algorithms , High-Throughput Nucleotide Sequencing/methods , Sequence Alignment/methods , DNA Sequence Analysis/methods , Genome , Polyploidy , Repetitive Nucleic Acid Sequences , Software
9.
J Mol Evol ; 80(3-4): 193-208, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25758350

ABSTRACT

Cyclooxygenase (COX) enzymatically converts arachidonic acid into prostaglandin G/H in animals and has importance during pregnancy, digestion, and other physiological functions in mammals. COX genes have mainly been described from vertebrates, where gene duplications are common, but few studies have examined COX in invertebrates. Given the increasing ease in generating genomic data, as well as recent, although incomplete descriptions of potential COX sequences in Mollusca, Crustacea, and Insecta, assessing COX evolution across Metazoa is now possible. Here, we recover 40 putative COX orthologs by searching publicly available genomic resources as well as ~250 novel invertebrate transcriptomic datasets. Results suggest the common ancestor of Cnidaria and Bilateria possessed a COX homolog similar to those of vertebrates, although such homologs were not found in poriferan and ctenophore genomes. COX was found in most crustaceans and the majority of molluscs examined, but only specific taxa/lineages within Cnidaria and Annelida. For example, all octocorallians appear to have COX, while no COX homologs were found in hexacorallian datasets. Most species examined had a single homolog, although species-specific COX duplications were found in members of Annelida, Mollusca, and Cnidaria. Additionally, COX genes were not found in Hemichordata, Echinodermata, or Platyhelminthes, and the few previously described COX genes in Insecta lacked appreciable sequence homology (although structural analyses suggest these may still be functional COX enzymes). This analysis provides a benchmark for identifying COX homologs in future genomic and transcriptomic datasets, and identifies lineages for future studies of COX.


Subject(s)
Molecular Evolution , Gene Duplication , Prostaglandin-Endoperoxide Synthases/genetics , Animals , Chordata/genetics , Crustacea/genetics , Genetic Databases , Echinodermata/genetics , Insecta/genetics , Molecular Sequence Data , Mollusca/genetics , Phylogeny , Prostaglandin-Endoperoxide Synthases/metabolism , Sequence Alignment
10.
Mol Ecol Resour ; 15(1): 228-9, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25424247

ABSTRACT

This article documents the public availability of (i) transcriptome sequence data, assembly and annotation, and single nucleotide polymorphisms (SNPs) for the cone snail Conus miliaris; (ii) a set of SNP markers for two biotypes from the Culex pipiens mosquito complex; (iii) transcriptome sequence data, assembly and annotation for the mountain fly Drosophila nigrosparsa; (iv) transcriptome sequence data, assembly and annotation and SNPs for the Neotropical toads Rhinella marina and R. schneideri; and (v) partial genomic sequence assembly and annotation for 35 spiny lizard species (Genus Sceloporus).


Subject(s)
Bufonidae/genetics , Conus Snail/genetics , Culex/genetics , Drosophila/genetics , Lizards/genetics , Single Nucleotide Polymorphism , Transcriptome , Animals , Chemical Databases
11.
Curr Biol ; 24(23): 2827-32, 2014 Dec 01.
Article in English | MEDLINE | ID: mdl-25454590

ABSTRACT

Ambulacraria, comprising Hemichordata and Echinodermata, is closely related to Chordata, making it integral to understanding chordate origins and polarizing chordate molecular and morphological characters. Unfortunately, relationships within Hemichordata and Echinodermata have remained unresolved, compromising our ability to extrapolate findings from the most closely related molecular and developmental models outside of Chordata (e.g., the acorn worms Saccoglossus kowalevskii and Ptychodera flava and the sea urchin Strongylocentrotus purpuratus). To resolve long-standing phylogenetic issues within Ambulacraria, we sequenced transcriptomes for 14 hemichordates as well as 8 echinoderms and complemented these with existing data for a total of 33 ambulacrarian operational taxonomic units (OTUs). Examination of leaf stability values revealed rhabdopleurid pterobranchs and the enteropneust Stereobalanus canadensis were unstable in placement; therefore, analyses were also run without these taxa. Analyses of 185 genes resulted in reciprocal monophyly of Enteropneusta and Pterobranchia, placed the deep-sea family Torquaratoridae within Ptychoderidae, and confirmed the position of ophiuroid brittle stars as sister to asteroid sea stars (the Asterozoa hypothesis). These results are consistent with earlier perspectives concerning plesiomorphies of Ambulacraria, including pharyngeal gill slits, a single axocoel, and paired hydrocoels and somatocoels. The resolved ambulacrarian phylogeny will help clarify the early evolution of chordate characteristics and has implications for our understanding of major fossil groups, including graptolites and somasteroideans.


Subject(s)
Nonvertebrate Chordata/genetics , Phylogeny , Animals , Biological Evolution , Chordata/classification , Chordata/genetics , Nonvertebrate Chordata/classification , Likelihood Functions , Transcriptome
12.
Sci Rep ; 4: 6380, 2014 Sep 16.
Article in English | MEDLINE | ID: mdl-25223336

ABSTRACT

Glacial cycles of the Quaternary have heavily influenced the demographic history of various species. To test the evolutionary impact of palaeo-geologic and climatic events on the demographic history of marine taxa from the coastal Western Pacific, we investigated the population structure and demographic history of two economically important fish (Trichiurus japonicus and T. nanhaiensis) that inhabit the continental shelves of the East China and northern South China Seas using the mitochondrial cytochrome b sequences and Bayesian Skyline Plot analyses. A molecular rate of 2.03% per million years, calibrated to the earliest flooding of the East China Sea shelf (70-140 kya), revealed a strong correlation between population sizes and primary production. Furthermore, comparison of the demographic history of T. japonicus populations from the East China and South China Seas provided evidence of the postglacial development of the Changjiang (Yangtze River) Delta. In the South China Sea, interspecific comparisons between T. japonicus and T. nanhaiensis indicated possible evolutionary responses to changes in palaeo-productivity that were influenced by East Asian winter monsoons. This study not only provides insight into the demographic history of cutlassfish but also reveals potential clues regarding the historic productivity and regional oceanographic conditions of the Western Pacific marginal seas.


Subject(s)
Biological Evolution , Climate , Perciformes/physiology , Phylogeography , Animals , Bayes Theorem , China , Molecular Sequence Data , Oceans and Seas , Perciformes/classification
13.
Bioinformatics ; 30(17): i356-63, 2014 Sep 01.
Article in English | MEDLINE | ID: mdl-25161220

ABSTRACT

MOTIVATION: Automatic error correction of high-throughput sequencing data can have a dramatic impact on the amount of usable base pairs and their quality. It has been shown that the performance of tasks such as de novo genome assembly and SNP calling can be dramatically improved after read error correction. While a large number of methods specialized for correcting substitution errors as found in Illumina data exist, few methods for the correction of indel errors, common to technologies like 454 or Ion Torrent, have been proposed. RESULTS: We present Fiona, a new stand-alone read error-correction method. Fiona provides a new statistical approach for sequencing error detection and optimal error correction and estimates its parameters automatically. Fiona is able to correct substitution, insertion and deletion errors and can be applied to any sequencing technology. It uses an efficient implementation of the partial suffix array to detect read overlaps with different seed lengths in parallel. We tested Fiona on several real datasets from a variety of organisms with different read lengths and compared its performance with state-of-the-art methods. Fiona shows a constantly higher correction accuracy over a broad range of datasets from 454 and Ion Torrent sequencers, without compromise in speed. CONCLUSION: Fiona is an accurate parameter-free read error-correction method that can be run on inexpensive hardware and can make use of multicore parallelization whenever available. Fiona was implemented using the SeqAn library for sequence analysis and is publicly available for download at http://www.seqan.de/projects/fiona. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods , INDEL Mutation
14.
Bioinformatics ; 30(24): 3499-505, 2014 Dec 15.
Article in English | MEDLINE | ID: mdl-25028723

ABSTRACT

MOTIVATION: Next-generation sequencing (NGS) has revolutionized biomedical research in the past decade and led to a continuous stream of developments in bioinformatics, addressing the need for fast and space-efficient solutions for analyzing NGS data. Often researchers need to analyze a set of genomic sequences that stem from closely related species or are indeed individuals of the same species. Hence, the analyzed sequences are similar. For analyses where local changes in the examined sequence induce only local changes in the results, it is obviously desirable not to examine identical or similar regions repeatedly. RESULTS: In this work, we provide a datatype that exploits data parallelism inherent in a set of similar sequences by analyzing shared regions only once. In real-world experiments, we show that algorithms that otherwise would scan each reference sequentially can be sped up by a factor of 115.


Subject(s)
Genomics/methods , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods , Algorithms , Computers
15.
Biol Bull ; 225(1): 24-41, 2013 Sep.
Article in English | MEDLINE | ID: mdl-24088794

ABSTRACT

Archipelagos of the Indo-West Pacific are considered to be among the richest in the world in biodiversity, and phylogeographic studies generally support either the center of origin or the center of accumulation hypothesis to explain this pattern. To differentiate between these competing hypotheses for organisms from the Indo-West Pacific anchialine ecosystem, defined as coastal bodies of mixohaline water fluctuating with the tides but having no direct oceanic connections, we investigated the genetic variation, population structure, and evolutionary history of three caridean shrimp species (Antecaridina lauensis, Halocaridinides trigonophthalma, and Metabetaeus minutus) in the Ryukyu Archipelago, Japan. We used two mitochondrial genes--cytochrome c oxidase subunit I (COI) and large ribosomal subunit (16S-rDNA)--complemented with genetic examination of available specimens from the same or closely related species from the Indian and Pacific Oceans. In the Ryukyus, each species encompassed 2-3 divergent (9.52%-19.2% COI p-distance) lineages, each having significant population structure and varying geographic distributions. Phylogenetically, the A. lauensis and M. minutus lineages in the Ryukyus were more closely related to ones from outside the archipelago than to one another. These results, when interpreted in the context of Pacific oceanographic currents and geologic history of the Ryukyus, imply multiple colonizations of the archipelago by the three species, consistent with the center of accumulation hypothesis. While this study contributes toward understanding the biodiversity, ecology, and evolution of organisms in the Ryukyus and the Indo-West Pacific, it also has potential utility in establishing conservation strategies for anchialine fauna of the Pacific Basin in general.


Subject(s)
Animal Distribution , Biodiversity , Decapoda/genetics , Genetic Variation , Animals , Decapoda/classification , Decapoda/physiology , Japan , Phylogeography , DNA Sequence Analysis
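
The abstract above reports lineage divergence as COI p-distance, i.e. the proportion of aligned sites at which two sequences differ. A minimal Python sketch of that calculation follows; the function name `p_distance` and the choice to skip gap columns are assumptions made for illustration, not the authors' pipeline.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumption;
    other gap treatments exist).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore alignment gaps
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


if __name__ == "__main__":
    # One difference over ten compared sites.
    print(p_distance("ACGTACGTAC", "ACGTACGTAA"))
```

Multiplying the returned proportion by 100 gives the percentage form used in the abstract.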
16.
Nucleic Acids Res ; 41(7): e78, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23358824

ABSTRACT

We present Masai, a read mapper representing the state-of-the-art in terms of speed and accuracy. Our tool is an order of magnitude faster than RazerS 3 and mrFAST, 2-4 times faster and more accurate than Bowtie 2 and BWA. The novelties of our read mapper are filtration with approximate seeds and a method for multiple backtracking. Approximate seeds, compared with exact seeds, increase filtration specificity while preserving sensitivity. Multiple backtracking amortizes the cost of searching a large set of seeds by taking advantage of the repetitiveness of next-generation sequencing data. Combined together, these two methods significantly speed up approximate search on genomic data sets. Masai is implemented in C++ using the SeqAn library. The source code is distributed under the BSD license and binaries for Linux, Mac OS X and Windows can be freely downloaded from http://www.seqan.de/projects/masai.


Subject(s)
Chromosome Mapping/methods , Software , Algorithms , Animals , Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Escherichia coli/genetics , Genetic Variation , Genomics/methods , Humans
17.
Bioinformatics ; 28(20): 2592-9, 2012 Oct 15.
Article in English | MEDLINE | ID: mdl-22923295

ABSTRACT

MOTIVATION: During the past years, next-generation sequencing has become a key technology for many applications in the biomedical sciences. Throughput continues to increase and new protocols provide longer reads than currently available. In almost all applications, read mapping is a first step. Hence, it is crucial to have algorithms and implementations that perform fast, with high sensitivity, and are able to deal with long reads and a large absolute number of insertions and deletions. RESULTS: RazerS is a read mapping program with adjustable sensitivity based on counting q-grams. In this work, we propose the successor RazerS 3, which now supports shared-memory parallelism, an additional seed-based filter with adjustable sensitivity, a much faster, banded version of the Myers' bit-vector algorithm for verification, memory-saving measures and support for the SAM output format. This leads to a much improved performance for mapping reads, in particular, long reads with many errors. We extensively compare RazerS 3 with other popular read mappers and show that its results are often superior to them in terms of sensitivity while exhibiting practical and often competitive run times. In addition, RazerS 3 works without a pre-computed index. AVAILABILITY AND IMPLEMENTATION: Source code and binaries are freely available for download at http://www.seqan.de/projects/razers. RazerS 3 is implemented in C++ and OpenMP under a GPL license using the SeqAn library and supports Linux, Mac OS X and Windows.


Subject(s)
Algorithms , Chromosome Mapping/methods , High-Throughput Nucleotide Sequencing/methods
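
The abstract above describes RazerS as a read mapper based on counting q-grams. The general principle behind such filters is the q-gram lemma: a read of length m that matches a reference window with at most k errors shares at least m + 1 - q(k + 1) q-grams with it. The Python sketch below illustrates that lemma only; it is not RazerS code, and the function names are hypothetical.

```python
from collections import Counter


def qgram_threshold(read_len, q, max_errors):
    """Minimum number of shared q-grams guaranteed by the q-gram lemma."""
    return read_len + 1 - q * (max_errors + 1)


def shared_qgrams(read, window, q):
    """Count q-grams common to read and window, with multiplicity."""
    read_counts = Counter(read[i:i + q] for i in range(len(read) - q + 1))
    win_counts = Counter(window[i:i + q] for i in range(len(window) - q + 1))
    return sum((read_counts & win_counts).values())


def passes_filter(read, window, q, max_errors):
    """Return True if the window cannot be ruled out and needs verification."""
    threshold = qgram_threshold(len(read), q, max_errors)
    if threshold <= 0:
        # The lemma gives no guarantee here, so nothing may be discarded.
        return True
    return shared_qgrams(read, window, q) >= threshold


if __name__ == "__main__":
    read = "ACGTACGTTGCAACGTTAGC"
    window = read[:10] + "G" + read[11:]  # one substitution
    print(passes_filter(read, window, q=4, max_errors=1))
```

Windows that pass the filter still have to be verified with a real alignment step, which the abstract says RazerS 3 performs with a banded version of Myers' bit-vector algorithm.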
18.
Bioinformatics ; 28(5): 619-27, 2012 Mar 01.
Article in English | MEDLINE | ID: mdl-22238266

ABSTRACT

MOTIVATION: The reliable detection of genomic variation in resequencing data is still a major challenge, especially for variants larger than a few base pairs. Sequencing reads crossing boundaries of structural variation carry the potential for their identification, but are difficult to map. RESULTS: Here we present a method for 'split' read mapping, where prefix and suffix match of a read may be interrupted by a longer gap in the read-to-reference alignment. We use this method to accurately detect medium-sized insertions and long deletions with precise breakpoints in genomic resequencing data. Compared with alternative split mapping methods, SplazerS significantly improves sensitivity for detecting large indel events, especially in variant-rich regions. Our method is robust in the presence of sequencing errors as well as alignment errors due to genomic mutations/divergence, and can be used on reads of variable lengths. Our analysis shows that SplazerS is a versatile tool applicable to unanchored or single-end as well as anchored paired-end reads. In addition, application of SplazerS to targeted resequencing data led to the interesting discovery of a complete, possibly functional gene retrocopy variant. AVAILABILITY: SplazerS is available from http://www.seqan.de/projects/splazers. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genomics/methods , INDEL Mutation , DNA Sequence Analysis , Algorithms , Humans
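
The abstract above describes split-read mapping, in which a read's prefix and suffix both match the reference but are separated by a longer gap. A deliberately simplified Python sketch of that idea follows. It uses exact matching and reports only the first placement found, so it illustrates the concept rather than the SplazerS algorithm; `split_map` and `min_anchor` are hypothetical names.

```python
def split_map(read, reference, min_anchor=5):
    """Find a prefix/suffix split of the read that spans a reference gap.

    Returns (prefix_start, prefix_len, suffix_start, gap_len), or None if no
    split with both parts at least min_anchor long matches exactly.
    Sequencing errors are not tolerated in this sketch.
    """
    # Try the longest prefix first, then shorten it one base at a time.
    for split in range(len(read) - min_anchor, min_anchor - 1, -1):
        prefix, suffix = read[:split], read[split:]
        p = reference.find(prefix)
        if p < 0:
            continue
        # The suffix must match at or after the end of the prefix match.
        s = reference.find(suffix, p + split)
        if s >= 0:
            return p, split, s, s - (p + split)
    return None


if __name__ == "__main__":
    left = "TTGACCATGCAGGT"
    deleted = "CCCCCAAAAACCCCC"
    right = "GATTACAGGCTTAAC"
    reference = left + deleted + right
    read = left[-8:] + right[:8]  # read from a sample lacking `deleted`
    print(split_map(read, reference))
```

A gap length of zero means the read maps contiguously; a positive gap length suggests a deletion in the sample relative to the reference, with the breakpoint at the end of the prefix match.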
19.
BMC Bioinformatics ; 12 Suppl 9: S15, 2011 Oct 05.
Article in English | MEDLINE | ID: mdl-22151882

ABSTRACT

BACKGROUND: Large-scale comparison of genomic sequences requires reliable tools for the search of local alignments. Practical local aligners are in general fast, but heuristic, and hence sometimes miss significant matches. RESULTS: We present here the local pairwise aligner STELLAR that has full sensitivity for ε-alignments, i.e. guarantees to report all local alignments of a given minimal length and maximal error rate. The aligner is composed of two steps, filtering and verification. We apply the SWIFT algorithm for lossless filtering, and have developed a new verification strategy that we prove to be exact. Our results on simulated and real genomic data confirm and quantify the conjecture that heuristic tools like BLAST or BLAT miss a large percentage of significant local alignments. CONCLUSIONS: STELLAR is very practical and fast on very long sequences which makes it a suitable new tool for finding local alignments between genomic sequences under the edit distance model. Binaries are freely available for Linux, Windows, and Mac OS X at http://www.seqan.de/projects/stellar. The source code is freely distributed with the SeqAn C++ library version 1.3 and later at http://www.seqan.de.


Subject(s)
Genomics/methods , Sequence Alignment/methods , Software , Algorithms , Animals , Drosophila/genetics
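
The abstract above defines ε-alignments as local alignments of a given minimal length and maximal error rate under the edit distance model. The Python sketch below checks that criterion for an alignment that has already been computed. It assumes length is counted in alignment columns and that every mismatch or gap column is one error; `is_epsilon_match` is a hypothetical helper, not part of STELLAR or SeqAn.

```python
def is_epsilon_match(row_a, row_b, min_length, max_error_rate):
    """Check whether a gapped pairwise alignment meets the ε-criterion.

    row_a and row_b are the two alignment rows, padded with '-' for gaps,
    so they must have the same number of columns.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    length = len(row_a)
    # Mismatch columns and gap columns each count as one edit operation.
    errors = sum(1 for a, b in zip(row_a, row_b) if a != b)
    return length >= min_length and errors <= max_error_rate * length


if __name__ == "__main__":
    a = "ACGTACGTACGTACGTACGT"
    b = "ACGTACGTAC-TACGTACGT"  # one gap column in 20 columns
    print(is_epsilon_match(a, b, min_length=20, max_error_rate=0.1))
```

The check only validates a candidate alignment; finding all such alignments is the job of the filtering and verification steps the abstract describes.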
20.
BMC Bioinformatics ; 12: 210, 2011 May 26.
Article in English | MEDLINE | ID: mdl-21615913

ABSTRACT

BACKGROUND: Second generation sequencing technologies yield DNA sequence data at ultra high-throughput. Common to most biological applications is a mapping of the reads to an almost identical or highly similar reference genome. The assessment of the quality of read mapping results is not straightforward and has not been formalized so far. Hence, it has not been easy to compare different read mapping approaches in a unified way and to determine which program is the best for what task. RESULTS: We present a new benchmark method, called Rabema (Read Alignment BEnchMArk), for read mappers. It consists of a strict definition of the read mapping problem and of tools to evaluate the result of arbitrary read mappers supporting the SAM output format. CONCLUSIONS: We show the usefulness of the benchmark program by performing a comparison of popular read mappers. The tools supporting the benchmark are licensed under the GPL and available from http://www.seqan.de/projects/rabema.html.


Subject(s)
DNA Sequence Analysis/methods , DNA Sequence Analysis/standards , Algorithms , Animals , High-Throughput Nucleotide Sequencing , Humans