1.
Bioinformatics ; 40(Supplement_1): i287-i296, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940135

ABSTRACT

SUMMARY: Improvements in nanopore sequencing necessitate efficient classification methods, including pre-filtering and adaptive sampling algorithms that enrich for reads of interest. Signal-based approaches circumvent the computational bottleneck of basecalling. But past methods for signal-based classification do not scale efficiently to large, repetitive references like pangenomes, limiting their utility to partial references or individual genomes. We introduce Sigmoni: a rapid, multiclass classification method based on the r-index that scales to references of hundreds of Gbps. Sigmoni quantizes nanopore signal into a discrete alphabet of picoamp ranges. It performs rapid, approximate matching using matching statistics, classifying reads based on distributions of picoamp matching statistics and co-linearity statistics, all in linear query time without the need for seed-chain-extend. Sigmoni is 10-100× faster than previous methods for adaptive sampling in host depletion experiments with improved accuracy, and can query reads against large microbial or human pangenomes. Sigmoni is the first signal-based tool to scale to a complete human genome and pangenome while remaining fast enough for adaptive sampling applications. AVAILABILITY AND IMPLEMENTATION: Sigmoni is implemented in Python, and is available open-source at https://github.com/vshiv18/sigmoni.


Subjects
Algorithms; Humans; Nanopore Sequencing/methods; Software; Nanopores; Genome, Human; Genomics/methods; Sequence Analysis, DNA/methods
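
The abstract above describes quantizing nanopore signal into a discrete alphabet of picoamp ranges. A minimal sketch of that idea follows. It is not Sigmoni's actual binning scheme: the function name `quantize_signal`, the z-score normalization, the equal-width bins, and the default of 6 bins are all assumptions chosen for illustration.

```python
import numpy as np

def quantize_signal(signal_pa, n_bins=6, lo=-3.0, hi=3.0):
    """Map a current trace (picoamps) to integer symbols 0..n_bins-1.

    Hypothetical helper: normalization and bin edges are illustrative
    assumptions, not taken from the Sigmoni source code.
    """
    x = np.asarray(signal_pa, dtype=float)
    std = x.std()
    # Z-score normalize so that reads with different offsets are comparable.
    z = (x - x.mean()) / std if std > 0 else np.zeros_like(x)
    # Clip outliers so every sample falls inside the binned range.
    z = np.clip(z, lo, hi)
    # Interior edges of n_bins equal-width bins spanning [lo, hi].
    edges = np.linspace(lo, hi, n_bins + 1)[1:-1]
    return np.digitize(z, edges)

symbols = quantize_signal([80.0, 82.0, 95.0, 110.0, 111.0, 79.0])
```

Each sample becomes a small integer, so a read can be treated as a string over a fixed alphabet and queried against a text index.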
2.
Bioinformatics ; 38(8): 2358-2360, 2022 Apr 12.
Article in English | MEDLINE | ID: mdl-35157051

ABSTRACT

MOTIVATION: Ribosome profiling, or Ribo-seq, is the state-of-the-art method for quantifying protein synthesis in living cells. Computational analysis of Ribo-seq data remains challenging due to the complexity of the procedure, as well as variations introduced for specific organisms or specialized analyses. RESULTS: We present riboviz 2, an updated riboviz package, for the comprehensive transcript-centric analysis and visualization of Ribo-seq data. riboviz 2 includes an analysis workflow built on the Nextflow workflow management system for end-to-end processing of Ribo-seq data. riboviz 2 has been extensively tested on diverse species and library preparation strategies, including multiplexed samples. riboviz 2 is flexible and uses open, documented file formats, allowing users to integrate new analyses with the pipeline. AVAILABILITY AND IMPLEMENTATION: riboviz 2 is freely available at github.com/riboviz/riboviz.


Subjects
Ribosome Profiling; Ribosomes; Ribosomes/genetics; Ribosomes/metabolism; Workflow; RNA, Messenger/metabolism; Data Analysis; Sequence Analysis, RNA/methods
3.
BMC Bioinformatics ; 20(1): 171, 2019 Apr 03.
Article in English | MEDLINE | ID: mdl-30943891

ABSTRACT

BACKGROUND: Molecular simulations are used to provide insight into protein structure and dynamics, and have the potential to provide important context when predicting the impact of sequence variation on protein function. In addition to understanding molecular mechanisms and interactions on the atomic scale, translational applications of those approaches include drug screening, development of novel molecular therapies, and targeted treatment planning. Supporting the continued development of these applications, we have developed the SNP2SIM workflow that generates reproducible molecular dynamics and molecular docking simulations for downstream functional variant analysis. The Python workflow utilizes molecular dynamics software (NAMD (Phillips et al., J Comput Chem 26(16):1781-802, 2005), VMD (Humphrey et al., J Mol Graph 14(1):33-8, 27-8, 1996)) to generate variant specific scaffolds for simulated small molecule docking (AutoDock Vina (Trott and Olson, J Comput Chem 31(2):455-61, 2010)). RESULTS: SNP2SIM is composed of three independent modules that can be used sequentially to generate the variant scaffolds of missense protein variants from the wildtype protein structure. The workflow first generates the mutant structure and configuration files required to execute molecular dynamics simulations of solvated protein variant structures. The resulting trajectories are clustered based on the structural diversity of residues involved in ligand binding to produce one or more variant scaffolds of the protein structure. Finally, these unique structural conformations are bound to small molecule ligand libraries to predict variant induced changes to drug binding relative to the wildtype protein structure. CONCLUSIONS: SNP2SIM provides a platform to apply molecular simulation based functional analysis of sequence variation in the protein targets of small molecule therapies. In addition to simplifying the simulation of variant specific drug interactions, the workflow enables large scale computational mutagenesis by controlling the parameterization of molecular simulations across multiple users or distributed computing infrastructures. This enables the parallelization of the computationally intensive molecular simulations to be aggregated for downstream functional analysis, and facilitates comparing various simulation options, such as the specific residues used to define structural variant clusters. The Python scripts that implement the SNP2SIM workflow are available (SNP2SIM Repository, https://github.com/mccoymd/SNP2SIM, accessed February 2019), and individual SNP2SIM modules are available as apps on the Seven Bridges Cancer Genomics Cloud (Lau et al., Cancer Res 77(21):e3-e6, 2017; Cancer Genomics Cloud [www.cancergenomicscloud.org; accessed November 2018]).


Subjects
Molecular Docking Simulation/methods; Mutant Proteins/chemistry; Humans; Ligands; Molecular Dynamics Simulation; Mutation, Missense; Protein Conformation; Software; Workflow
4.
Mol Phylogenet Evol ; 117: 135-140, 2017 Dec.
Article in English | MEDLINE | ID: mdl-27965082

ABSTRACT

The Clauseneae (Aurantioideae, Rutaceae) is a tribe in the Citrus family that, although economically important as it contains the culinary and medicinally-useful curry tree (Bergera koenigii), has been relatively understudied. Due to the recent significant taxonomic changes made to this tribe, a closer inspection of the genetic relationships among its genera has been warranted. Whole genome skimming was used to generate chloroplast genomes from six species, representing each of the four genera (Bergera, Clausena, Glycosmis, Micromelum) in the Clauseneae tribe plus one closely related outgroup (Merrillia), using the published plastome sequence of Citrus sinensis as a reference. Phylogenetically informative character (PIC) data were analyzed using a genome alignment of the seven species, and variability frequency among the species was recorded for each coding and non-coding region, with the regions of highest variability identified for future phylogenetic studies. Non-coding regions exhibited a higher percentage of variable characters as expected, and the phylogenetic markers ycf1, matK, rpoC2, ndhF, trnS-trnG spacer, and trnH-psbA spacer proved to be among the most variable regions. Other markers that are frequently used in phylogenetic studies, e.g. rps16, atpB-rbcL, rps4-trnT, and trnL-trnF, proved to be far less variable. Phylogenetic analyses of the aligned sequences were conducted using Bayesian inference (MrBayes) and Maximum Likelihood (RAxML), yielding highly supported divisions among the four genera.


Subjects
Genome, Chloroplast/genetics; Phylogeny; Rutaceae/classification; Rutaceae/genetics; Bayes Theorem; Citrus/genetics; Genome, Plant/genetics; Likelihood Functions; Murraya/genetics; Sequence Alignment
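
The abstract above reports the percentage of variable characters per coding and non-coding region of a plastome alignment. The sketch below shows one way such a tally could be computed for a single aligned region. It is an illustration only: the function name `site_variability` is hypothetical, gaps and `N` are ignored by assumption, and the parsimony-informative rule used here (at least two states, each seen in at least two sequences) is a common textbook definition that may differ from the paper's PIC criterion.

```python
from collections import Counter

def site_variability(aligned_seqs):
    """Return (variable, informative, length) for one aligned region.

    Hypothetical helper; gap ('-') and 'N' characters are ignored.
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    variable = informative = 0
    for column in zip(*aligned_seqs):
        counts = Counter(b.upper() for b in column if b not in "-Nn")
        if len(counts) > 1:
            variable += 1  # more than one distinct base in this column
        # Informative: at least two states each shared by >= 2 sequences.
        if sum(1 for c in counts.values() if c >= 2) >= 2:
            informative += 1
    return variable, informative, length

var, inf, n = site_variability(["ACGT-A", "ACGTTA", "ATGTTA", "ATGATA"])
percent_variable = 100.0 * var / n
```

Running the same tally over each region of the alignment would give a per-region variability table from which the most variable markers can be ranked.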
5.
bioRxiv ; 2024 Mar 10.
Article in English | MEDLINE | ID: mdl-38496646

ABSTRACT

Nanopore signal analysis enables detection of nucleotide modifications from native DNA and RNA sequencing, providing both accurate genetic/transcriptomic and epigenetic information without additional library preparation. Presently, only a limited set of modifications can be directly basecalled (e.g. 5-methylcytosine), while most others require exploratory methods that often begin with alignment of nanopore signal to a nucleotide reference. We present Uncalled4, a toolkit for nanopore signal alignment, analysis, and visualization. Uncalled4 features an efficient banded signal alignment algorithm, BAM signal alignment file format, statistics for comparing signal alignment methods, and a reproducible de novo training method for k-mer-based pore models, revealing potential errors in ONT's state-of-the-art DNA model. We apply Uncalled4 to RNA 6-methyladenine (m6A) detection in seven human cell lines, identifying 26% more modifications than Nanopolish using m6Anet, including in several genes where m6A has known implications in cancer. Uncalled4 is available open-source at github.com/skovaka/uncalled4.
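
The abstract above mentions an efficient banded signal alignment algorithm. As a rough illustration of what "banded" means, the sketch below runs generic dynamic time warping restricted to a band around the diagonal. This is not Uncalled4's algorithm or API: the function `banded_dtw`, the absolute-difference cost, and the fixed band centered on the scaled diagonal are assumptions made for the example.

```python
import math

def banded_dtw(signal, reference, band=2):
    """Banded dynamic time warping cost between two numeric sequences.

    Hypothetical sketch: only cells within `band` columns of the scaled
    diagonal are filled. Returns math.inf if the band is too narrow for
    any warping path to reach the final cell.
    """
    n, m = len(signal), len(reference)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        center = (i * m) // n  # expected reference position for sample i
        lo = max(1, center - band)
        hi = min(m, center + band)
        for j in range(lo, hi + 1):
            d = abs(signal[i - 1] - reference[j - 1])
            # Stay on the same reference value, skip one, or advance both.
            best = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = d + best
    return cost[n][m]

# Several signal samples may map to each reference level.
total = banded_dtw([1.0, 1.0, 2.0, 3.0, 3.0], [1.0, 2.0, 3.0], band=2)
```

Restricting the fill to a band reduces the work from the full n × m table to roughly n × (2·band + 1) cells, which is the general motivation for banded alignment.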

6.
bioRxiv ; 2023 Aug 30.
Article in English | MEDLINE | ID: mdl-37645873

ABSTRACT

Improvements in nanopore sequencing necessitate efficient classification methods, including pre-filtering and adaptive sampling algorithms that enrich for reads of interest. Signal-based approaches circumvent the computational bottleneck of basecalling. But past methods for signal-based classification do not scale efficiently to large, repetitive references like pangenomes, limiting their utility to partial references or individual genomes. We introduce Sigmoni: a rapid, multiclass classification method based on the r-index that scales to references of hundreds of Gbps. Sigmoni quantizes nanopore signal into a discrete alphabet of picoamp ranges. It performs rapid, approximate matching using matching statistics, classifying reads based on distributions of picoamp matching statistics and co-linearity statistics. Sigmoni is 10-100× faster than previous methods for adaptive sampling in host depletion experiments with improved accuracy, and can query reads against large microbial or human pangenomes.
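
The abstract above classifies reads using matching statistics. To make the term concrete: the matching statistic at query position i is the length of the longest prefix of the query suffix starting at i that occurs somewhere in the reference. The naive sketch below computes that definition directly on strings. It is for illustration only; the function name is hypothetical, and an r-index based tool would compute these values in compressed space rather than by repeated substring search.

```python
def matching_statistics(query, reference):
    """Naive matching statistics of `query` against `reference`.

    ms[i] is the length of the longest prefix of query[i:] that occurs
    as a substring of reference. Quadratic or worse; definition only.
    """
    ms = []
    for i in range(len(query)):
        length = 0
        # Extend the match one symbol at a time while it still occurs.
        while (i + length < len(query)
               and query[i:i + length + 1] in reference):
            length += 1
        ms.append(length)
    return ms

ms = matching_statistics("abcab", "xabcx")  # → [3, 2, 1, 2, 1]
```

A read drawn from the indexed reference tends to produce long matching statistics, while an unrelated read produces short ones, so the distribution of these lengths can serve as a classification signal.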

7.
Sci Rep ; 9(1): 4230, 2019 Mar 12.
Article in English | MEDLINE | ID: mdl-30862864

ABSTRACT

The curry tree (Bergera koenigii L.) is a widely cultivated plant used in South Asian cooking. Next-generation sequencing was used to generate the transcriptome of the curry leaf to detect changes in gene expression during leaf development, such as those genes involved in the production of oils which lend the leaf its characteristic taste, aroma, and medicinal properties. Using abundance estimation (RSEM) and differential expression analysis, genes that were significantly differentially expressed were identified. The transcriptome was annotated with BLASTx using the non-redundant (nr) protein database, and Gene Ontology (GO) terms were assigned based on the top BLAST hit using Blast2GO. Lastly, functional enrichment of the assigned GO terms was analyzed for genes that were significantly differentially expressed. Of the most enriched GO categories, pathways involved in cell wall, membrane, and lignin synthesis were found to be most upregulated in immature leaf tissue, possibly due to the growth and expansion of the leaf tissue. Terpene synthases, which synthesize monoterpenes and sesquiterpenes, which comprise much of the curry essential oil, were found to be significantly upregulated in mature leaf tissue, suggesting that oil production increases later in leaf development. Enzymes involved in pigment production were also significantly upregulated in mature leaves. The findings were based on computational estimates of gene expression from RNA-seq data, and further study is warranted to validate these results using targeted techniques, such as quantitative PCR.


Subjects
Gene Expression Profiling; Gene Expression Regulation, Plant/physiology; Plant Leaves/growth & development; Rutaceae/metabolism; Transcriptome/physiology; Plant Leaves/genetics; Rutaceae/genetics
8.
Genes (Basel) ; 10(5), 2019 May 22.
Article in English | MEDLINE | ID: mdl-31121954

ABSTRACT

Plants in the Burseraceae are globally recognized for producing resins and essential oils with medicinal properties and have economic value. In addition, most of the aromatic and non-aromatic components of Burseraceae resins are derived from a variety of terpene and terpenoid chemicals. Although terpene genes have been identified in model plant crops (e.g., Citrus, Arabidopsis), very few genomic resources are available for non-model groups, including the highly diverse Burseraceae family. Here we report the assembly of a leaf transcriptome of Protium copal, an aromatic tree that has a large distribution in Central America, describe the functional annotation of putative terpene biosynthetic genes and compare terpene biosynthetic genes found in P. copal with those identified in other Burseraceae taxa. The genomic resources of Protium copal can be used to generate novel sequencing markers for population genetics and comparative phylogenetic studies, and to investigate the diversity and evolution of terpene genes in the Burseraceae.


Subjects
Burseraceae/genetics; Plant Leaves/genetics; Terpenes/metabolism; Transcriptome/genetics; Burseraceae/metabolism; Genomics; Molecular Sequence Annotation; Oils, Volatile/metabolism; Plant Extracts/genetics; Plant Extracts/metabolism; Plant Leaves/metabolism