ABSTRACT
Several reduced-representation bisulfite sequencing methods have been developed in recent years to determine cytosine methylation de novo in nonmodel species. Here, we present epiGBS2, a laboratory protocol based on epiGBS with a revised bioinformatics pipeline for a wide range of species with or without a reference genome. epiGBS2 is cost- and time-efficient, and the computational workflow is designed to be user-friendly and reproducible. The library protocol allows a flexible choice of restriction enzymes and a double digest. The bioinformatics pipeline is integrated in the Snakemake workflow management system, which makes it modular and easy to execute and keeps the parameter settings for important computational steps flexible. We implemented Bismark for alignment and methylation analysis, and we preprocessed alignment files by double masking to enable single nucleotide polymorphism calling with Freebayes (epiFreebayes). The performance of several critical steps in epiGBS2 was evaluated against baseline data sets from Arabidopsis thaliana and great tit (Parus major), which confirmed its overall good performance. We provide a detailed description of the laboratory protocol and an extensive manual of the bioinformatics pipeline, which is publicly accessible on GitHub (https://github.com/nioo-knaw/epiGBS2) and Zenodo (https://doi.org/10.5281/zenodo.4764652).
Subjects
Software, Sulfites, DNA Methylation, High-Throughput Nucleotide Sequencing/methods, Sequence Analysis, DNA/methods
ABSTRACT
Microsatellites are ubiquitously distributed, polymorphic repeat sequences that are valuable for association studies, selection, population structure analysis and identification. They can be mined by constructing genomic libraries, probe hybridization and sequencing of selected clones, but this approach has many limitations, such as biased hybridization and preferential selection of larger repeats. In silico mining of polymorphic markers using data from various genotypes can be rapid and economical. Each of the available tools falls short in one or more respects: targeted, user-defined primer generation; polymorphism discovery using multiple sequences; limits on the size and number of input sequences; primer generation and e-PCR evaluation; transferability; complete automation; and user-friendliness. They also offer no way to evaluate published primers in e-PCR mode, which would generate additional allelic data from re-sequenced data of various genotypes and so allow judicious reuse of previously generated data. We developed a tool, PolyMorphPredict, using Perl, R and Java; it is hosted on an Apache server and available at http://webtom.cabgrid.res.in/polypred/. It mines microsatellite loci and computes primers from genome or transcriptome data of any species. It can perform e-PCR using published primers for polymorphism discovery and to assess the cross-species transferability of microsatellite loci. The tool has been evaluated using five species of different genome sizes, comprising 21 genotypes. Although the server is equipped with genomic data of three species for test runs with gel simulation, it can be used for any species. Further, polymorphism predictability has been validated using in silico and in vitro PCR of four rice genotypes. This tool can accelerate in silico microsatellite polymorphism discovery in re-sequencing projects of any plant or animal species, for diversity estimation as well as variety/breed identification, population structure, MAS, QTL and gene discovery, traceability, parentage testing, fungal diagnostics and genome finishing.
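As a rough illustration of what mining microsatellite loci involves, the sketch below finds perfect tandem repeats with a regular expression. It is not the PolyMorphPredict implementation: the function name `find_microsatellites`, the unit-length range (2 to 6 bases) and the minimum of four repeats are assumptions chosen only for the example.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for each perfect tandem repeat.

    Hypothetical helper for illustration; the thresholds are assumptions,
    not values taken from the abstract.
    """
    seq = seq.upper()
    # Group 1 lazily captures the shortest candidate repeat unit;
    # \1{n,} then requires at least (min_repeats - 1) further copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:
            # Skip homopolymer runs such as AAAAAAAA reported as (AA)n.
            continue
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Toy sequence: an (AG)6 repeat flanked by non-repetitive bases.
toy = "GGCTA" + "AG" * 6 + "TTCGA"
print(find_microsatellites(toy))
```

Running the same search on the corresponding region from several genotypes and comparing the repeat counts is one simple way to flag a locus as potentially polymorphic.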
ABSTRACT
Microbial diseases in fish, plants, animals and humans are rising constantly; thus, discovering remedies for them is imperative. The use of antibiotics in aquaculture further compounds the problem through the development of resistance and the consequent risk to consumer health from bio-magnification. Antimicrobial peptides (AMPs) are highly promising as a natural alternative to chemical antibiotics. Although AMPs are molecules of the innate immune defense of all advanced eukaryotic organisms, fish depend heavily on their innate immune defense and have therefore been a good source of AMPs with much wider applicability. A machine learning-based prediction method trained on wet laboratory-validated fish AMPs can accelerate AMP discovery using available fish genomic and proteomic data. Earlier AMP prediction servers are based on multi-phyla/species data, and we report here the world's first AMP prediction server for fishes. It is freely accessible at http://webapp.cabgrid.res.in/fishamp/. A total of 151 fish-related AMPs collected from various databases and the published literature were used in this study. For model development and prediction, N-terminus residues, C-terminus residues and full sequences were considered. The best models used the polynomial-2, linear and radial basis function kernels, with accuracies of 97, 99 and 97%, respectively. We found that the performance of support vector machine-based models is superior to that of artificial neural networks. This in silico approach can drastically reduce the time and cost of AMP discovery. Accelerated discovery of lead AMP molecules can be of great importance to industry, given their potential applications in diverse areas such as fish and human health (as substitutes for antibiotics, immunomodulators, antitumor agents, and vaccine adjuvants and inactivators) and packaged food.
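The three inputs mentioned above (N-terminus residues, C-terminus residues and full sequences) can be pictured as three views of each peptide, each turned into a numeric vector before model training. The sketch below is only an illustration: the abstract does not say which features or window length were used, so amino acid composition, the window size `k` and the helper names `peptide_views` and `composition` are all assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def peptide_views(seq, k=15):
    """Split a peptide into N-terminal, C-terminal and full-length views.

    k is an assumed window size (must be > 0); the abstract does not give one.
    """
    return {"n_term": seq[:k], "c_term": seq[-k:], "full": seq}


def composition(seq):
    """Fraction of each standard amino acid in seq (a 20-element list)."""
    seq = seq.upper()
    n = len(seq)
    return [seq.count(aa) / n if n else 0.0 for aa in AMINO_ACIDS]


# Toy peptide used only to show the shapes involved.
peptide = "GIGKFLHSAKKFGKAFVGEIMNS"
features = {
    name: composition(view)
    for name, view in peptide_views(peptide, k=5).items()
}
print(features["n_term"])
```

Vectors like these could then be passed to any support vector machine library with linear, polynomial or radial basis function kernels, one model per view.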
Subjects
Algorithms, Aquaculture/instrumentation, Aquaculture/methods, Animals, Anti-Infective Agents/analysis, Humans, Machine Learning, Models, Theoretical, Peptides, Proteomics
ABSTRACT
Karnal bunt disease caused by the fungus Tilletia indica Mitra is a serious concern due to strict quarantines affecting international trade of wheat. We announce here the first draft assembly of two monosporidial lines, PSWKBGH-1 and -2, of this fungus, having approximate sizes of 37.46 and 37.21 Mbp, respectively.