ABSTRACT
MOTIVATION: The genomic surveillance of viral pathogens such as SARS-CoV-2 and HIV-1 has been critical to modern epidemiology and public health, but sequence analysis pipelines require computational expertise, and web-based platforms require sending potentially sensitive raw sequence data to remote servers. RESULTS: We introduce ViralWasm, a user-friendly graphical web application suite for viral genomics. All ViralWasm tools use WebAssembly to execute the original command line tools client-side, directly in the web browser and without any user setup, at a cost of only a 2-3x slowdown relative to their command line counterparts. AVAILABILITY AND IMPLEMENTATION: The ViralWasm tool suite can be accessed at https://niema-lab.github.io/ViralWasm.
Subjects
Genomics , Software , Humans , Genomics/methods , Web Browser , Viral Genome
ABSTRACT
SUMMARY: Ribbon is an alignment visualization tool that shows how alignments are positioned within both the reference and read contexts, giving an intuitive view that enables a better understanding of structural variants and the read evidence supporting them. Ribbon was born out of a need to curate complex structural variant calls and determine whether each was well supported by long-read evidence, and it uses the same intuitive visualization method to shed light on contig alignments from genome-to-genome comparisons. AVAILABILITY AND IMPLEMENTATION: Ribbon is freely available online at http://genomeribbon.com/ and is open-source at https://github.com/marianattestad/ribbon. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Genomics , Software , Genome
ABSTRACT
We present Ginkgo (http://qb.cshl.edu/ginkgo), a user-friendly, open-source web platform for the analysis of single-cell copy-number variations (CNVs). Ginkgo automatically constructs copy-number profiles of cells from mapped reads and constructs phylogenetic trees of related cells. We validated Ginkgo by reproducing the results of five major studies. After comparing three commonly used single-cell amplification techniques, we concluded that degenerate oligonucleotide-primed PCR is the most consistent for CNV analysis.
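As a rough illustration of what building a copy-number profile from mapped reads involves, the sketch below bins read start positions and scales each bin by the median bin count. It is a minimal sketch under stated assumptions, not Ginkgo's pipeline: the function names (`bin_counts`, `copy_number_profile`), the fixed-width bins, the median normalization, and the default ploidy of 2 are all hypothetical choices made for this example, and it omits the bias correction and segmentation a real tool would need.

```python
from statistics import median


def bin_counts(read_starts, chrom_length, bin_size):
    """Count mapped read start positions in fixed-width bins (0-based coordinates)."""
    n_bins = (chrom_length + bin_size - 1) // bin_size  # ceiling division
    counts = [0] * n_bins
    for start in read_starts:
        if 0 <= start < chrom_length:  # ignore positions outside the chromosome
            counts[start // bin_size] += 1
    return counts


def copy_number_profile(counts, ploidy=2):
    """Scale bin counts so the median bin equals `ploidy`, then round to integers.

    Assumption for this sketch: most bins sit at the baseline copy number,
    so the median bin count represents `ploidy` copies.
    """
    baseline = median(counts)
    if baseline == 0:
        raise ValueError("median bin count is zero; coverage too sparse to normalize")
    return [round(ploidy * c / baseline) for c in counts]
```

For example, `copy_number_profile([10, 10, 10, 10, 10, 20, 20, 5])` marks the two bins with doubled coverage as a gain and the last bin as a loss relative to the baseline of 2.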
Subjects
Computational Biology , DNA Copy Number Variations , Human Genome , Oligonucleotides/genetics , Algorithms , Animals , Automation , Cluster Analysis , Drosophila , Female , Gene Dosage , Genome , Humans , Internet , Lung Neoplasms/diagnosis , Lung Neoplasms/genetics , Male , Mice , Pan troglodytes , Phylogeny , Polymerase Chain Reaction , Rats , Reproducibility of Results , Sex Chromosomes , Small Cell Lung Carcinoma/diagnosis , Small Cell Lung Carcinoma/genetics , Software
ABSTRACT
BACKGROUND: The evolutionary pressures that underlie the large-scale functional organization of the genome are not well understood in eukaryotes. Recent evidence suggests that functionally similar genes may colocalize (cluster) in the eukaryotic genome, suggesting the role of chromatin-level gene regulation in shaping the physical distribution of coordinated genes. However, few of the bioinformatic tools currently available allow for a systematic study of gene colocalization across several, evolutionarily distant species. Furthermore, most tools require the user to input manually curated lists of gene position information, DNA sequence or gene homology relations between species. With the growing number of sequenced genomes, there is a need to provide new comparative genomics tools that can address the analysis of multi-species gene colocalization. RESULTS: Kerfuffle is a web tool designed to help discover, visualize, and quantify the physical organization of genomes by identifying significant gene colocalization and conservation across the assembled genomes of available species (currently up to 47, from humans to worms). Kerfuffle only requires the user to specify a list of human genes and the names of other species of interest. Without further input from the user, the software queries the e!Ensembl BioMart server to obtain positional information and discovers homology relations in all genes and species specified. Using this information, Kerfuffle performs a multi-species clustering analysis, presents downloadable lists of clustered genes, performs Monte Carlo statistical significance calculations, estimates how conserved gene clusters are across species, plots histograms and interactive graphs, allows users to save their queries, and generates a downloadable visualization of the clusters using the Circos software. 
These analyses may be used to further explore the functional roles of gene clusters by interrogating the enriched molecular pathways associated with each cluster. CONCLUSIONS: Kerfuffle is a new, easy-to-use and publicly available tool to aid our understanding of functional genomics and comparative genomics. This software allows for flexibility and quick investigations of a user-defined set of genes, and the results may be saved online for further analysis. Kerfuffle is freely available at http://atwallab.org/kerfuffle, is implemented in JavaScript (using jQuery and jsCharts libraries) and PHP 5.2, runs on an Apache server, and stores data in flat files and an SQLite database.
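The Monte Carlo significance calculation mentioned in this abstract can be illustrated with a small permutation-style sketch. This is not Kerfuffle's code: the clustering rule (a gene counts as clustered if another query gene lies within `window` base pairs), the single-chromosome simplification, and the names `count_clustered` and `monte_carlo_p_value` are assumptions made for illustration only.

```python
import random


def count_clustered(positions, window):
    """Count genes that have at least one other gene within `window` bp."""
    pos = sorted(positions)
    clustered = 0
    for i, p in enumerate(pos):
        near_left = i > 0 and p - pos[i - 1] <= window
        near_right = i < len(pos) - 1 and pos[i + 1] - p <= window
        if near_left or near_right:
            clustered += 1
    return clustered


def monte_carlo_p_value(query_positions, background_positions, window,
                        n_trials=1000, seed=0):
    """Estimate how often random gene sets cluster at least as much as the query.

    Each trial draws as many genes as the query from the background
    (without replacement) and recomputes the clustering statistic.
    Returns (observed_statistic, empirical_p_value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = count_clustered(query_positions, window)
    k = len(query_positions)
    hits = 0
    for _ in range(n_trials):
        sample = rng.sample(background_positions, k)
        if count_clustered(sample, window) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_trials + 1)
```

With a background of evenly spaced genes and a query of five adjacent ones, the returned p-value should be small, since random draws of five genes rarely land next to each other.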
Subjects
Genes , Software , Base Sequence , Cluster Analysis , Genomics/methods , Humans , Internet , Synteny
ABSTRACT
A distinction between indolent and aggressive disease is a major challenge in diagnostics of prostate cancer. As genetic heterogeneity and complexity may influence clinical outcome, we have initiated studies on single tumor cell genomics. In this study, we demonstrate that sparse DNA sequencing of single-cell nuclei from prostate core biopsies is a rich source of quantitative parameters for evaluating neoplastic growth and aggressiveness. These include the presence of clonal populations, the phylogenetic structure of those populations, the degree of the complexity of copy-number changes in those populations, and measures of the proportion of cells with clonal copy-number signatures. The parameters all showed good correlation to the measure of prostatic malignancy, the Gleason score, derived from individual prostate biopsy tissue cores. Remarkably, a more accurate histopathologic measure of malignancy, the surgical Gleason score, agrees better with these genomic parameters of diagnostic biopsy than it does with the diagnostic Gleason score and related measures of diagnostic histopathology. This is highly relevant because primary treatment decisions are dependent upon the biopsy and not the surgical specimen. Thus, single-cell analysis has the potential to augment traditional core histopathology, improving both the objectivity and accuracy of risk assessment and informing treatment decisions. Significance: Genomic analysis of multiple individual cells harvested from prostate biopsies provides an in-depth view of cell populations comprising a prostate neoplasm, yielding novel genomic measures with the potential to improve the accuracy of diagnosis and prognosis in prostate cancer. Cancer Res; 78(2); 348-58. ©2017 AACR.
Subjects
Tumor Biomarkers/genetics , Genomics/methods , Prostatic Neoplasms/diagnosis , Single-Cell Analysis/methods , Aged , Aged, 80 and over , Humans , Male , Middle Aged , Neoplasm Grading , Neoplasm Staging , Phylogeny , Prostatectomy , Prostatic Neoplasms/genetics , Prostatic Neoplasms/surgery , Risk Assessment
ABSTRACT
A central challenge in sequencing single-cell genomes is the accurate determination of point mutations, phasing of these mutations, and identifying copy number variations with few assumptions. Ideally, this is accomplished under as low sequencing coverage as possible. Here we report our attempt to meet these goals with a novel library construction and library amplification methodology. In our approach, single-cell genomic DNA is first fragmented with saturated transposition to make a primary library that uniformly covers the whole genome by short fragments. The library is then amplified by a carefully optimized PCR protocol in a uniform and synchronized fashion for next-generation sequencing. Each step of the protocol can be quantitatively characterized. Our shallow sequencing data show that the library is tightly distributed and is useful for the determination of copy number variations.
Subjects
Gene Library , Human Genome , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods , DNA Copy Number Variations , Humans
ABSTRACT
BACKGROUND: Parrots belong to a group of behaviorally advanced vertebrates and have an advanced ability of vocal learning relative to other vocal-learning birds. They can imitate human speech, synchronize their body movements to a rhythmic beat, and understand complex concepts of referential meaning to sounds. However, little is known about the genetics of these traits. Elucidating the genetic bases would require whole genome sequencing and a robust assembly of a parrot genome. FINDINGS: We present a genomic resource for the budgerigar, an Australian Parakeet (Melopsittacus undulatus) -- the most widely studied parrot species in neuroscience and behavior. We present genomic sequence data that includes over 300× raw read coverage from multiple sequencing technologies and chromosome optical maps from a single male animal. The reads and optical maps were used to create three hybrid assemblies representing some of the largest genomic scaffolds to date for a bird; two of which were annotated based on similarities to reference sets of non-redundant human, zebra finch and chicken proteins, and budgerigar transcriptome sequence assemblies. The sequence reads for this project were in part generated and used for both the Assemblathon 2 competition and the first de novo assembly of a giga-scale vertebrate genome utilizing PacBio single-molecule sequencing. CONCLUSIONS: Across several quality metrics, these budgerigar assemblies are comparable to or better than the chicken and zebra finch genome assemblies built from traditional Sanger sequencing reads, and are sufficient to analyze regions that are difficult to sequence and assemble, including those not yet assembled in prior bird genomes, and promoter regions of genes differentially regulated in vocal learning brain regions. This work provides valuable data and material for genome technology development and for investigating the genomics of complex behavioral traits.