Results 1 - 13 of 13
1.
IEEE Trans Vis Comput Graph ; 27(2): 1073-1083, 2021 02.
Article in English | MEDLINE | ID: mdl-33095716

ABSTRACT

Data visualizations convert numbers into visual marks so that our visual system can extract data from an image instead of raw numbers. Clearly, the visual system does not compute these values as a computer would, as an arithmetic mean or a correlation. Instead, it extracts these patterns using perceptual proxies: heuristic shortcuts of the visual marks, such as a center of mass or a shape envelope. Understanding which proxies people use would lead to more effective visualizations. We present the results of a series of crowdsourced experiments that measure how powerfully a set of candidate proxies can explain human performance when comparing the mean and range of pairs of data series presented as bar charts. We generated datasets where the correct answer (the series with the larger arithmetic mean or range) was pitted against an "adversarial" series that should be seen as larger if the viewer uses a particular candidate proxy. We used both Bayesian logistic regression models and a robust Bayesian mixed-effects linear model to measure how strongly each adversarial proxy could drive viewers to answer incorrectly and whether different individuals may use different proxies. Finally, we attempt to construct adversarial datasets from scratch, using an iterative crowdsourcing procedure to perform black-box optimization.

2.
IEEE Trans Vis Comput Graph ; 26(1): 1012-1021, 2020 01.
Article in English | MEDLINE | ID: mdl-31443016

ABSTRACT

Perceptual tasks in visualizations often involve comparisons. Of two sets of values depicted in two charts, which set had values that were the highest overall? Which had the widest range? Prior empirical work found that the performance on different visual comparison tasks (e.g., "biggest delta", "biggest correlation") varied widely across different combinations of marks and spatial arrangements. In this paper, we expand upon these combinations in an empirical evaluation of two new comparison tasks: the "biggest mean" and "biggest range" between two sets of values. We used a staircase procedure to titrate the difficulty of the data comparison to assess which arrangements produced the most precise comparisons for each task. We find visual comparisons of biggest mean and biggest range are supported by some chart arrangements more than others, and that this pattern is substantially different from the pattern for other tasks. To synthesize these dissonant findings, we argue that we must understand which features of a visualization are actually used by the human visual system to solve a given task. We call these perceptual proxies. For example, when comparing the means of two bar charts, the visual system might use a "Mean length" proxy that isolates the actual lengths of the bars and then constructs a true average across these lengths. Alternatively, it might use a "Hull Area" proxy that perceives an implied hull bounded by the bars of each chart and then compares the areas of these hulls. We propose a series of potential proxies across different tasks, marks, and spatial arrangements. Simple models of these proxies can be empirically evaluated for their explanatory power by matching their performance to human performance across these marks, arrangements, and tasks. We use this process to highlight candidates for perceptual proxies that might scale more broadly to explain performance in visual comparison.
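
To make the two example proxies concrete, the following is a minimal sketch rather than the authors' model. It assumes adjacent bars of unit width on a zero baseline and interprets "Hull Area" as the area of the convex hull around the bar rectangles; the function names and the example series are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def mean_length(bars: Sequence[float]) -> float:
    """'Mean length' proxy: the true average of the bar lengths."""
    return sum(bars) / len(bars)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(poly: Sequence[Point]) -> float:
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def hull_area(bars: Sequence[float], bar_width: float = 1.0) -> float:
    """'Hull Area' proxy (assumed reading): convex hull of all bar corners."""
    corners: List[Point] = []
    for i, height in enumerate(bars):
        x0, x1 = i * bar_width, (i + 1) * bar_width
        corners += [(x0, 0.0), (x1, 0.0), (x0, float(height)), (x1, float(height))]
    return polygon_area(convex_hull(corners))


# Hypothetical pair on which the two proxies disagree.
flat = [5, 5, 5, 5]
spiky = [9, 1, 1, 8]
print(mean_length(flat), mean_length(spiky))
print(hull_area(flat), hull_area(spiky))
```

In this made-up pair the flat series has the larger mean while the spiky series has the larger hull, so a viewer relying on hull area would be expected to pick the wrong chart in a "biggest mean" task.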


Subjects
Computer Graphics, Visual Perception/physiology, Crowdsourcing, Humans, Biological Models, Task Performance and Analysis
3.
Genome Biol ; 20(1): 232, 2019 11 05.
Article in English | MEDLINE | ID: mdl-31690338

ABSTRACT

The MinHash algorithm has proven effective for rapidly estimating the resemblance of two genomes or metagenomes. However, this method cannot reliably estimate the containment of a genome within a metagenome. Here, we describe an online algorithm capable of measuring the containment of genomes and proteomes within either assembled or unassembled sequencing read sets. We describe several use cases, including contamination screening and retrospective analysis of metagenomes for novel genome discovery. Using this tool, we provide containment estimates for every NCBI RefSeq genome within every SRA metagenome and demonstrate the identification of a novel polyomavirus species from a public metagenome.
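
Containment asks what fraction of a genome's k-mers also occur in a read set, rather than how similar the two sets are overall. The sketch below illustrates one streaming way to estimate it and is not taken from the abstract: it assumes a bottom-s hash sketch of the genome, ignores reverse complements, and uses hypothetical function names and toy sequences.

```python
import hashlib
from typing import Iterable, Iterator, Set


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer: str) -> int:
    """Deterministic 64-bit hash of a k-mer."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def bottom_sketch(seq: str, k: int, s: int) -> Set[int]:
    """Keep the s smallest distinct k-mer hashes of seq."""
    hashes = {hash_kmer(km) for km in kmers(seq, k)}
    return set(sorted(hashes)[:s])


def screen_containment(sketch: Set[int], reads: Iterable[str], k: int) -> float:
    """Stream reads once and report the fraction of sketch hashes observed."""
    if not sketch:
        raise ValueError("sketch is empty; sequence shorter than k?")
    seen: Set[int] = set()
    for read in reads:
        for km in kmers(read, k):
            h = hash_kmer(km)
            if h in sketch:
                seen.add(h)
    return len(seen) / len(sketch)


# Toy example: two overlapping reads that cover the whole genome.
genome = "ACGTTGCAAGGCTTAACCGGTATCGA"
sketch = bottom_sketch(genome, k=5, s=8)
print(screen_containment(sketch, [genome[:15], genome[10:]], k=5))
```

Because only the genome is sketched and the reads are consumed as a stream, the read set never has to be sketched or held in memory, which is why this style of estimate suits unassembled data.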


Subjects
DNA Contamination, High-Throughput Screening Assays, Metagenomics/methods, Algorithms, Humans, Polyomavirus/isolation & purification, Proteome
4.
Microbiome ; 6(1): 197, 2018 11 05.
Article in English | MEDLINE | ID: mdl-30396371

ABSTRACT

The Mid-Atlantic Microbiome Meet-up (M3) organization brings together academic, government, and industry groups to share ideas and develop best practices for microbiome research. In January of 2018, M3 held its fourth meeting, which focused on recent advances in biodefense, specifically those relating to infectious disease, and the use of metagenomic methods for pathogen detection. Presentations highlighted the utility of next-generation sequencing technologies for identifying and tracking microbial community members across space and time. However, they also stressed the current limitations of genomic approaches for biodefense, including insufficient sensitivity to detect low-abundance pathogens and the inability to quantify viable organisms. Participants discussed ways in which the community can improve software usability and shared new computational tools for metagenomic processing, assembly, annotation, and visualization. Looking to the future, they identified the need for better bioinformatics toolkits for longitudinal analyses, improved sample processing approaches for characterizing viruses and fungi, and more consistent maintenance of database resources. Finally, they addressed the necessity of improving data standards to incentivize data sharing. Here, we summarize the presentations and discussions from the meeting, identifying the areas where microbiome analyses have improved our ability to detect and manage biological threats and infectious disease, as well as gaps of knowledge in the field that require future funding and focus.


Subjects
Biological Warfare Agents, Computational Biology/methods, High-Throughput Nucleotide Sequencing/methods, Metagenomics/methods, Humans, Microbiota/physiology, DNA Sequence Analysis/methods
5.
PeerJ ; 6: e4892, 2018.
Article in English | MEDLINE | ID: mdl-29868286

ABSTRACT

When performing bioforensic casework, it is important to be able to reliably detect the presence of a particular organism in a metagenomic sample, even if the organism is only present in a trace amount. For this task, it is common to use a sequence classification program that determines the taxonomic affiliation of individual sequence reads by comparing them to reference database sequences. As metagenomic data sets often consist of millions or billions of reads that need to be compared to reference databases containing millions of sequences, such sequence classification programs typically use search heuristics and databases with reduced sequence diversity to speed up the analysis, which can lead to incorrect assignments. Thus, in a bioforensic setting where correct assignments are paramount, assignments of interest made by "first-pass" classifiers should be confirmed using the most precise methods and comprehensive databases available. In this study we present a BLAST-based method for validating the assignments made by less precise sequence classification programs, with optimal parameters for filtering of BLAST results determined via simulation of sequence reads from genomes of interest, and we apply the method to the detection of four pathogenic organisms. The software implementing the method is open source and freely available.
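
As an illustration of what filtering BLAST results can look like, here is a minimal sketch that is not the published software. It assumes hits in BLAST's default 12-column tabular output, and the identity and coverage thresholds, the function names, and the example hit are all hypothetical; the abstract states that the actual parameters were determined by simulation.

```python
from typing import Dict, Union

Hit = Dict[str, Union[str, int, float]]


def parse_tabular_hit(line: str) -> Hit:
    """Parse one line of default BLAST tabular output (12 tab-separated columns)."""
    f = line.rstrip("\n").split("\t")
    return {
        "qseqid": f[0],
        "sseqid": f[1],
        "pident": float(f[2]),    # percent identity
        "length": int(f[3]),      # alignment length, gaps included
        "evalue": float(f[10]),
        "bitscore": float(f[11]),
    }


def passes_filter(hit: Hit, read_length: int,
                  min_identity: float, min_coverage: float) -> bool:
    """Keep a hit only if it is both similar enough and spans enough of the read.

    Coverage is approximated as alignment length / read length.
    """
    coverage = hit["length"] / read_length
    return hit["pident"] >= min_identity and coverage >= min_coverage


# Made-up hit for a 100 bp read.
line = "read1\trefA\t98.00\t100\t2\t0\t1\t100\t5\t104\t1e-45\t180"
hit = parse_tabular_hit(line)
print(passes_filter(hit, read_length=100, min_identity=97.0, min_coverage=0.9))
```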

6.
Genome Biol ; 17(1): 132, 2016 06 20.
Article in English | MEDLINE | ID: mdl-27323842

ABSTRACT

Mash extends the MinHash dimensionality-reduction technique to include a pairwise mutation distance and P value significance test, enabling the efficient clustering and search of massive sequence collections. Mash reduces large sequences and sequence sets to small, representative sketches, from which global mutation distances can be rapidly estimated. We demonstrate several use cases, including the clustering of all 54,118 NCBI RefSeq genomes in 33 CPU h; real-time database search using assembled or unassembled Illumina, Pacific Biosciences, and Oxford Nanopore data; and the scalable clustering of hundreds of metagenomic samples by composition. Mash is freely released under a BSD license (https://github.com/marbl/mash).
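
The step from sketches to a mutation distance can be sketched as follows. The abstract does not give the formula; the one used here, D = -(1/k) ln(2j / (1 + j)) with j the Jaccard estimate and k the k-mer size, is the distance described in the Mash paper. Sketches are assumed to be sets of integer hashes that have already been computed, and the function names and toy values are hypothetical.

```python
import math
from typing import Set


def jaccard_from_sketches(a: Set[int], b: Set[int], s: int) -> float:
    """Estimate the Jaccard index from two bottom-s sketches.

    Take the s smallest hashes of the merged sketches and count how many
    of them occur in both inputs.
    """
    union_bottom = sorted(a | b)[:s]
    shared = sum(1 for h in union_bottom if h in a and h in b)
    return shared / len(union_bottom)


def mash_distance(j: float, k: int) -> float:
    """Convert a Jaccard estimate into a mutation distance in [0, 1]."""
    if j <= 0.0:
        return 1.0  # no shared hashes: report the maximum distance
    return max(0.0, -math.log(2.0 * j / (1.0 + j)) / k)


# Toy sketches made of small integers instead of real k-mer hashes.
sketch_a = {1, 2, 3, 4}
sketch_b = {3, 4, 5, 6}
j = jaccard_from_sketches(sketch_a, sketch_b, s=4)
print(j, mash_distance(j, k=21))
```

Since only the small sketches are compared, the cost of each pairwise distance does not depend on genome length, which is what makes all-against-all clustering of large collections practical.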


Subjects
Molecular Evolution, Genome, Genomics/methods, Metagenome, Metagenomics/methods, Software, Cluster Analysis, Nucleic Acid Databases, Phylogeny
7.
Genome Biol ; 15(11): 524, 2014.
Article in English | MEDLINE | ID: mdl-25410596

ABSTRACT

Whole-genome sequences are now available for many microbial species and clades; however, existing whole-genome alignment methods are limited in their ability to perform sequence comparisons of multiple sequences simultaneously. Here we present the Harvest suite of core-genome alignment and visualization tools for the rapid and simultaneous analysis of thousands of intraspecific microbial strains. Harvest includes Parsnp, a fast core-genome multi-aligner, and Gingr, a dynamic visual platform. Together they provide interactive core-genome alignments, variant calls, recombination detection, and phylogenetic trees. Using simulated and real data we demonstrate that our approach exhibits unrivaled speed while maintaining the accuracy of existing methods. The Harvest suite is open-source and freely available from: http://github.com/marbl/harvest.


Subjects
Bacteria/genetics, Bacterial Genome/genetics, Phylogeny, Sequence Alignment, Algorithms, Single Nucleotide Polymorphism, DNA Sequence Analysis, Software
8.
Genome Announc ; 1(1), 2013 Jan.
Article in English | MEDLINE | ID: mdl-23405332

ABSTRACT

The Bacillus anthracis Carbosap genome, which includes the pXO1 and pXO2 plasmids, has been shown to encode the major B. anthracis virulence factors, yet this strain's attenuation has not yet been explained. Here we report the draft genome sequence of this strain, and a comparison to fully virulent B. anthracis.

9.
PLoS One ; 7(8): e43350, 2012.
Article in English | MEDLINE | ID: mdl-22937038

ABSTRACT

BACKGROUND: Although genome-wide transcriptional analysis has been used for many years to study bacterial gene expression, many aspects of the bacterial transcriptome remain undefined. One example is antisense transcription, which has been observed in a number of bacteria, though the function of antisense transcripts, and their distribution across the bacterial genome, is still unclear. METHODOLOGY/PRINCIPAL FINDINGS: Single-stranded RNA-seq results revealed a widespread and non-random pattern of antisense transcription covering more than two thirds of the B. anthracis genome. Our analysis revealed a variety of antisense structural patterns, suggesting multiple mechanisms of antisense transcription. The data revealed several instances of sense and antisense expression changes in different growth conditions, suggesting that antisense transcription may play a role in the ways in which B. anthracis responds to its environment. Significantly, genome-wide antisense expression occurred at consistently higher levels on the lagging strand, while the leading strand showed very little antisense activity. Intrasample gene expression comparisons revealed a gene dosage effect in all growth conditions, where genes farthest from the origin showed the lowest overall range of expression for both sense and antisense directed transcription. Additionally, transcription from both strands was verified using a novel strand-specific assay. The variety of structural patterns we observed in antisense transcription suggests multiple mechanisms for this phenomenon, suggesting that some antisense transcription may play a role in regulating the expression of key genes, while some may be due to chromosome replication dynamics and transcriptional noise. 
CONCLUSIONS/SIGNIFICANCE: Although the variety of structural patterns we observed in antisense transcription suggests multiple mechanisms for antisense expression, our data also clearly indicate that antisense transcription may play a genome-wide role in regulating the expression of key genes in Bacillus species. This study illustrates the surprising complexity of prokaryotic RNA abundance for both strands of a bacterial chromosome.


Subjects
Bacillus anthracis/genetics, Antisense RNA/genetics, RNA/genetics, Oligonucleotide Array Sequence Analysis
10.
BMC Bioinformatics ; 12: 385, 2011 Sep 30.
Article in English | MEDLINE | ID: mdl-21961884

ABSTRACT

BACKGROUND: A critical output of metagenomic studies is the estimation of abundances of taxonomical or functional groups. The inherent uncertainty in assignments to these groups makes it important to consider both their hierarchical contexts and their prediction confidence. The current tools for visualizing metagenomic data, however, omit or distort quantitative hierarchical relationships and lack the facility for displaying secondary variables. RESULTS: Here we present Krona, a new visualization tool that allows intuitive exploration of relative abundances and confidences within the complex hierarchies of metagenomic classifications. Krona combines a variant of radial, space-filling displays with parametric coloring and interactive polar-coordinate zooming. The HTML5 and JavaScript implementation enables fully interactive charts that can be explored with any modern Web browser, without the need for installed software or plug-ins. This Web-based architecture also allows each chart to be an independent document, making them easy to share via e-mail or post to a standard Web server. To illustrate Krona's utility, we describe its application to various metagenomic data sets and its compatibility with popular metagenomic analysis tools. CONCLUSIONS: Krona is both a powerful metagenomic visualization tool and a demonstration of the potential of HTML5 for highly accessible bioinformatic visualizations. Its rich and interactive displays facilitate more informed interpretations of metagenomic analyses, while its implementation as a browser-based application makes it extremely portable and easily adopted into existing analysis packages. Both the Krona rendering code and conversion tools are freely available under a BSD open-source license, and available from: http://krona.sourceforge.net.
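
The quantitative hierarchical relationships that such a chart displays come from rolling leaf-level counts up a taxonomy. The following sketch shows that roll-up only and is not Krona's code or input format: lineages are assumed to be tuples of names from root to leaf, and the function names, taxa, and counts are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Lineage = Tuple[str, ...]


def roll_up(assignments: Iterable[Tuple[Lineage, float]]) -> Dict[Lineage, float]:
    """Sum each count into every ancestor node along its lineage."""
    totals: Dict[Lineage, float] = defaultdict(float)
    for lineage, count in assignments:
        for depth in range(1, len(lineage) + 1):
            totals[lineage[:depth]] += count
    return dict(totals)


def relative_to_parent(totals: Dict[Lineage, float], node: Lineage) -> float:
    """Abundance of a node as a fraction of its parent (or of all roots)."""
    parent = node[:-1]
    if parent:
        denominator = totals[parent]
    else:
        denominator = sum(v for key, v in totals.items() if len(key) == 1)
    return totals[node] / denominator


# Made-up classification counts.
counts = [
    (("Bacteria", "Firmicutes", "Bacillus"), 30),
    (("Bacteria", "Firmicutes", "Clostridium"), 10),
    (("Bacteria", "Proteobacteria"), 60),
]
totals = roll_up(counts)
print(totals[("Bacteria", "Firmicutes")])
print(relative_to_parent(totals, ("Bacteria", "Firmicutes")))
```

In a radial, space-filling display each node's angular extent would be proportional to these rolled-up totals, so a parent wedge always spans exactly the sum of its children plus any reads assigned directly to it.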


Subjects
Internet, Metagenomics/methods, Software, Computational Biology, Gastrointestinal Tract/microbiology, Humans
11.
Bioinformatics ; 26(15): 1901-2, 2010 Aug 01.
Article in English | MEDLINE | ID: mdl-20562417

ABSTRACT

SUMMARY: Bisulfite sequencing allows cytosine methylation, an important epigenetic marker, to be detected via nucleotide substitutions. Since the Applied Biosystems SOLiD System uses a unique di-base encoding that increases confidence in the detection of nucleotide substitutions, it is a potentially advantageous platform for this application. However, the di-base encoding also makes reads with many nucleotide substitutions difficult to align to a reference sequence with existing tools, preventing the platform's potential utility for bisulfite sequencing from being realized. Here, we present SOCS-B, a reference-based, un-gapped alignment algorithm for the SOLiD System that is tolerant of both bisulfite-induced nucleotide substitutions and a parametric number of sequencing errors, facilitating bisulfite sequencing on this platform. An implementation of the algorithm has been integrated with the previously reported SOCS alignment tool, and was used to align CpG methylation-enriched Arabidopsis thaliana bisulfite sequence data, exhibiting a 2-fold increase in sensitivity compared to existing methods for aligning SOLiD bisulfite data. AVAILABILITY: Executables, source code, and sample data are available at http://solidsoftwaretools.com/gf/project/socs/
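
The idea of tolerating bisulfite-induced substitutions while still bounding sequencing errors can be illustrated with the simplified sketch below. It is not SOCS-B: it works in ordinary nucleotide space rather than di-base color space, considers only the forward strand, and uses brute-force scanning; the function names and sequences are hypothetical.

```python
from typing import List, Tuple


def bisulfite_mismatches(ref_window: str, read: str) -> int:
    """Count mismatches, ignoring reference C read as T.

    Bisulfite treatment converts unmethylated cytosine so that it is
    sequenced as T; that substitution is therefore not counted as an error.
    """
    mismatches = 0
    for ref_base, read_base in zip(ref_window, read):
        if ref_base == read_base:
            continue
        if ref_base == "C" and read_base == "T":
            continue  # expected bisulfite conversion
        mismatches += 1
    return mismatches


def align_ungapped(reference: str, read: str,
                   max_errors: int) -> List[Tuple[int, int]]:
    """Return (position, mismatches) for every ungapped placement within budget."""
    hits = []
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mm = bisulfite_mismatches(window, read)
        if mm <= max_errors:
            hits.append((pos, mm))
    return hits


# Toy example: the read is ACCGTCA with every C converted to T.
reference = "GGACCGTCAATT"
read = "ATTGTTA"
print(align_ungapped(reference, read, max_errors=0))
```

A plain mismatch count would charge this read three errors at its true position, which is why a conventional aligner with a small error budget would discard it.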


Subjects
Algorithms, Sequence Alignment/methods, DNA Sequence Analysis/methods, Sulfites, Arabidopsis/genetics, Sequence Alignment/instrumentation, DNA Sequence Analysis/instrumentation
12.
J Bacteriol ; 191(10): 3203-11, 2009 May.
Article in English | MEDLINE | ID: mdl-19304856

ABSTRACT

Although gene expression has been studied in bacteria for decades, many aspects of the bacterial transcriptome remain poorly understood. Transcript structure, operon linkages, and information on absolute abundance all provide valuable insights into gene function and regulation, but none has ever been determined on a genome-wide scale for any bacterium. Indeed, these aspects of the prokaryotic transcriptome have been explored on a large scale in only a few instances, and consequently little is known about the absolute composition of the mRNA population within a bacterial cell. Here we report the use of a high-throughput sequencing-based approach in assembling the first comprehensive, single-nucleotide resolution view of a bacterial transcriptome. We sampled the Bacillus anthracis transcriptome under a variety of growth conditions and showed that the data provide an accurate and high-resolution map of transcript start sites and operon structure throughout the genome. Further, the sequence data identified previously nonannotated regions with significant transcriptional activity and enhanced the accuracy of existing genome annotations. Finally, our data provide estimates of absolute transcript abundance and suggest that there is significant transcriptional heterogeneity within a clonal, synchronized bacterial population. Overall, our results offer an unprecedented view of gene expression and regulation in a bacterial cell.


Subjects
Bacillus anthracis/genetics, Computational Biology, Gene Expression Profiling/methods, Bacterial Gene Expression Regulation/genetics, Molecular Sequence Data, Operon/genetics, Messenger RNA/genetics, Reverse Transcriptase Polymerase Chain Reaction, DNA Sequence Analysis
13.
Bioinformatics ; 24(23): 2776-7, 2008 Dec 01.
Article in English | MEDLINE | ID: mdl-18842598

ABSTRACT

Here, we report the development of SOCS (short oligonucleotide color space), a program designed for efficient and flexible mapping of Applied Biosystems SOLiD sequence data onto a reference genome. SOCS performs its mapping within the context of 'color space', and it maximizes usable data by allowing a user-specified number of mismatches. Sequence census functions facilitate a variety of functional genomics applications, including transcriptome mapping and profiling, as well as ChIP-Seq. AVAILABILITY: Executables, source code, and sample data are available at http://socs.biology.gatech.edu/
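
Mapping in color space with a mismatch budget can be sketched as follows. This is a brute-force illustration, not the SOCS implementation: it uses the standard SOLiD di-base code, in which each color is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3), and the function names and sequences are hypothetical.

```python
from typing import List, Tuple

_BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def to_color_space(seq: str) -> List[int]:
    """Encode each adjacent base pair as one of four colors (0-3)."""
    return [_BASE_CODE[a] ^ _BASE_CODE[b] for a, b in zip(seq, seq[1:])]


def map_read(ref_colors: List[int], read_colors: List[int],
             max_mismatches: int) -> List[Tuple[int, int]]:
    """Return (position, color mismatches) for placements within the budget."""
    hits = []
    n = len(read_colors)
    for pos in range(len(ref_colors) - n + 1):
        mm = sum(1 for r, q in zip(ref_colors[pos:pos + n], read_colors) if r != q)
        if mm <= max_mismatches:
            hits.append((pos, mm))
    return hits


# Toy example: a single base change alters two adjacent colors.
print(to_color_space("ACGT"), to_color_space("ATGT"))

ref_colors = to_color_space("TTACGTGG")
read_colors = to_color_space("ACGT")
print(map_read(ref_colors, read_colors, max_mismatches=0))
```

Because a true nucleotide substitution changes two neighboring colors while an isolated measurement error changes only one, comparing reads and reference in color space keeps that distinction available to the mapper.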


Subjects
Genome, Genomics/methods, Oligonucleotide Array Sequence Analysis/methods, Genetic Databases, DNA Sequence Analysis