ABSTRACT
The genus Vigna, represented by more than 100 species, is a source of nutritious edible seeds and sprouts that are rich in protein and dietary supplements. It is also valued for its therapeutic attributes, which stem from its antioxidant and anti-diabetic properties. Its highly diverse species, which occupy a broad range of ecological niches, can be valuable genomic resources for productivity enhancement. Vigna is among the most underutilized crops for food security and animal feed. Despite this species diversity, only three Vigna species have been sequenced; thus, molecular markers are needed for the remaining species. Computational discovery of microsatellite markers, combined with evaluation of polymorphism using the available genomic data of different genotypes, can be a quick and economical approach to genomic resource development. Cross-species transferability tested by e-PCR over the available genomes can further prioritize potential SSR markers, which could then be used for genetic diversity and population differentiation studies of the remaining species, saving cost and time. We present VigSatDB, the world's first comprehensive microsatellite database of the genus Vigna, containing >875 K putative microsatellite markers, with 772 354 simple and 103 865 compound markers mined from six genome assemblies of three Vigna species, namely Vigna radiata (mung bean), Vigna angularis (adzuki bean) and Vigna unguiculata (cowpea). It also contains 1976 validated published markers. Markers can be selected by chromosome or location, and primers can be generated using the Primer3 core tool integrated at the back end. The efficacy of VigSatDB for microsatellite loci genotyping has been evaluated with 15 markers over a panel of 10 diverse genotypes of V. radiata.
Our web genomic resources can be used in diversity analysis, population and varietal differentiation, discovery of quantitative trait loci/genes, and marker-assisted varietal improvement, in support of Vigna crop productivity and management.
Subject(s)
Plant DNA/genetics, Nucleic Acid Databases, Microsatellite Repeats, Vigna/genetics, Species Specificity, Vigna/classification
ABSTRACT
MicroRNAs (miRNAs) are 20-24 nt, non-coding, single-stranded molecules that regulate traits and stress responses. Their tissue- and time-specific expression limits detection, which is a major challenge in their discovery. Wheat has only 119 miRNAs in miRBase because conservation-based methods exclude both old and new miRNA genes. This follows from the origin of hexaploid wheat through three successive hybridizations, involving the older AA and BB subgenomes and the younger DD subgenome. Species-specific miRNA prediction (the SMIRP concept), based on 152 thermodynamic features of the training dataset and a support vector machine learning approach, has improved prediction accuracy to 97.7%. This has been implemented in TamiRPred ( http://webtom.cabgrid.res.in/tamirpred ). We also report the highest number of putative wheat miRNA genes (4464), predicted from the whole genome sequence and populated in a database developed in PHP and MySQL. TamiRPred predicted 2092 (>45.10%) additional miRNAs that were not predicted by miRLocator. Predicted miRNAs have been validated against miRBase, small RNA libraries, secondary structure, degradome datasets, star miRNAs and binding sites in wheat coding regions. This tool can accelerate the discovery of miRNA polymorphisms for use in wheat trait improvement. Because it predicts miRNA genes chromosome-wise with their physical locations, these genes can be transferred using linked SSR markers. This prediction approach can serve as a model for other polyploid crops.
Subject(s)
Computational Biology/methods, MicroRNAs/genetics, Plant RNA/genetics, Software, Triticum/genetics, Plant Chromosomes, Genetic Databases, Plant Genome, Machine Learning, MicroRNAs/chemistry, Genetic Models, Reproducibility of Results, Support Vector Machine, User-Computer Interface
ABSTRACT
Microsatellites are ubiquitously distributed, polymorphic repeat sequences that are valuable for association, selection, population structure and identification studies. They can be mined by constructing a genomic library, hybridizing probes and sequencing selected clones. This approach has many limitations, such as biased hybridization and selection of larger repeats. In silico mining of polymorphic markers using data from various genotypes can be rapid and economical. Available tools fall short in one aspect or another: targeted, user-defined primer generation; polymorphism discovery using multiple sequences; size and number limits on input sequences; missing options for primer generation and e-PCR evaluation; transferability; and a lack of complete automation and user-friendliness. They also lack a provision to evaluate published primers in e-PCR mode to generate additional allelic data from re-sequenced data of various genotypes, which would allow judicious use of previously generated data. We developed the tool PolyMorphPredict using Perl, R and Java and deployed it on an Apache server; it is available at http://webtom.cabgrid.res.in/polypred/. It mines microsatellite loci and computes primers from genome/transcriptome data of any species. It can perform e-PCR using published primers for polymorphism discovery and for cross-species transferability of microsatellite loci. The tool has been evaluated using five species of different genome sizes comprising 21 genotypes. Although the server is equipped with genomic data of three species for test runs with gel simulation, it can be used for any species. Further, polymorphism predictability has been validated using in silico and in vitro PCR of four rice genotypes.
This tool can accelerate in silico microsatellite polymorphism discovery in re-sequencing projects of any plant or animal species, supporting diversity estimation along with variety/breed identification, population structure, MAS, QTL and gene discovery, traceability, parentage testing, fungal diagnostics and genome finishing.
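The e-PCR step described in this abstract can be illustrated with a minimal sketch. It is not PolyMorphPredict's implementation: the function names, the exact-match primer search and the `max_size` cut-off are all assumptions made for illustration. A locus is treated as polymorphic when the predicted amplicon size differs between genotypes.

```python
from typing import Dict, List


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def epcr_amplicon_sizes(template: str, forward: str, reverse: str,
                        max_size: int = 1000) -> List[int]:
    """Predict amplicon sizes by exact primer matching (hypothetical helper).

    The forward primer is searched on the given strand; the reverse primer
    is searched as its reverse complement downstream of the forward site.
    """
    template = template.upper()
    forward = forward.upper()
    reverse_site = reverse_complement(reverse)
    sizes = []
    start = template.find(forward)
    while start != -1:
        # Nearest reverse-primer site downstream of this forward site.
        end = template.find(reverse_site, start + len(forward))
        if end != -1:
            size = end + len(reverse_site) - start
            if size <= max_size:  # assumed upper bound on product length
                sizes.append(size)
        start = template.find(forward, start + 1)
    return sizes


def is_polymorphic(genotypes: Dict[str, str], forward: str, reverse: str) -> bool:
    """Call a locus polymorphic when genotypes give different amplicon sizes."""
    observed = set()
    for sequence in genotypes.values():
        sizes = epcr_amplicon_sizes(sequence, forward, reverse)
        if sizes:
            observed.add(sizes[0])  # first predicted product per genotype
    return len(observed) > 1
```

Real e-PCR tools usually tolerate mismatches and gaps in the primer sites and search both strands; exact matching is used here only to keep the sketch short.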
ABSTRACT
Wheat fulfills 20% of the global caloric requirement. The world needs 60% more wheat for a population of 9 billion by 2050, but climate change with increasing temperature is projected to affect wheat productivity adversely. Trait improvement and management of wheat germplasm require genomic resources. Simple Sequence Repeats (SSRs), being highly polymorphic and ubiquitously distributed in the genome, can be the marker of choice, but there is no structured marker database with options to generate primer pairs for genotyping at a desired chromosome or physical location. Markers previously associated with different wheat traits are also not available in any database. The limitations of in vitro SSR discovery can be overcome by genome-wide in silico mining of SSRs. The Triticum aestivum SSR database (TaSSRDb) is an integrated online database with a three-tier architecture, developed using PHP and MySQL and accessible at http://webtom.cabgrid.res.in/wheatssr/. For genotyping, standalone Primer3 code computes primers on user request. Chromosome-wise SSR calling for all three subgenomes, with a choice of motif types, is provided in addition to primer generation for the desired marker. We report here a database with the highest number of SSRs (476,169) from the complex, hexaploid wheat genome (~17 Gb), along with 268 previously reported SSR markers associated with 11 traits. The highest (116.93 SSRs/Mb) and lowest (74.57 SSRs/Mb) SSR densities were found on chromosomes 2D and 3A, respectively. To obtain homozygous loci, e-PCR was performed. Thirty such loci were randomly selected for PCR validation in a panel of 18 wheat Advance Varietal Trial (AVT) lines. TaSSRDb can be a valuable genomic resource for linkage mapping, gene/QTL (quantitative trait locus) discovery, diversity analysis, traceability and variety identification.
Variety-specific profiling and differentiation can supplement DUS (Distinctiveness, Uniformity and Stability) testing, the resolution of EDV (Essentially Derived Variety)/IV (Initial Variety) disputes, seed purity testing and hybrid wheat testing. All of these are required in germplasm management as well as in efforts to improve wheat productivity.
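Genome-wide in silico SSR mining and the SSRs/Mb density figure quoted above can be sketched as follows. This is an illustrative assumption, not the pipeline behind TaSSRDb: the minimum repeat counts in `MIN_REPEATS` are assumed MISA-style thresholds that the abstract does not state, and `find_ssrs` and `ssr_density_per_mb` are hypothetical names.

```python
import re
from typing import List, Tuple

# Assumed minimum repeat counts per motif length (mono- to hexanucleotide).
# These thresholds are illustrative; the abstract does not give them.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_redundant(motif: str) -> bool:
    """True when a motif is itself a repeat of a shorter unit, e.g. 'AA' or 'ATAT'."""
    k = len(motif)
    return any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def find_ssrs(sequence: str) -> List[Tuple[int, str, int]]:
    """Return (1-based start, motif, repeat count) for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for k, min_count in MIN_REPEATS.items():
        # A k-mer followed by at least (min_count - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_count - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_redundant(motif):
                continue  # already reported under its shorter unit
            hits.append((match.start() + 1, motif, len(match.group(0)) // k))
    return sorted(hits)


def ssr_density_per_mb(n_ssrs: int, length_bp: int) -> float:
    """SSR density expressed as SSRs per megabase of sequence."""
    return n_ssrs / (length_bp / 1_000_000)
```

For example, `find_ssrs("TTC" + "AG" * 7 + "CCT")` reports a single dinucleotide repeat, `(4, "AG", 7)`. Compound and imperfect repeats are not handled in this sketch.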
ABSTRACT
Microbial diseases in fish, plants, animals and humans are rising constantly; thus, the discovery of remedies is imperative. The use of antibiotics in aquaculture compounds the problem through the development of resistance and the consequent consumer health risk from bio-magnification. Antimicrobial peptides (AMPs) are highly promising natural alternatives to chemical antibiotics. Although AMPs are part of the innate immune defense of all advanced eukaryotic organisms, fish, which depend heavily on their innate immune defense, are a good source of AMPs with much wider applicability. A machine learning-based prediction method trained on wet laboratory-validated fish AMPs can accelerate AMP discovery using available fish genomic and proteomic data. Earlier AMP prediction servers are based on multi-phyla/species data, and we report here the world's first AMP prediction server for fishes. It is freely accessible at http://webapp.cabgrid.res.in/fishamp/ . A total of 151 fish-related AMPs collected from various databases and the published literature were used in this study. For model development and prediction, N-terminus residues, C-terminus residues and full sequences were considered. The best models used polynomial-2, linear and radial basis function kernels, with accuracies of 97, 99 and 97%, respectively. We found that the performance of support vector machine-based models is superior to that of artificial neural networks. This in silico approach can drastically reduce the time and cost of AMP discovery. The accelerated discovery of lead AMP molecules, with potential applications in diverse areas such as fish and human health (as substitutes for antibiotics, immunomodulators, antitumor agents, vaccine adjuvants and inactivators) and packaged food, can be of great importance to industry.
Subject(s)
Algorithms, Aquaculture/instrumentation, Aquaculture/methods, Animals, Anti-Infective Agents/analysis, Humans, Machine Learning, Theoretical Models, Peptides, Proteomics
ABSTRACT
DNA markers are valuable tools for increasing crop productivity: they help explain genetic variation and link Quantitative Trait Loci (QTL) to beneficial traits. Earlier approaches to developing Short Tandem Repeat (STR) markers were time consuming and inefficient. Recent methods that develop STR markers from whole genome or transcriptome data have gained wide importance, with immense potential for breeding and cultivar improvement. The availability of whole genome sequences and in silico approaches has revolutionized bulk marker discovery. We report the world's first whole genome marker discovery in sugarbeet, comprising 145 K markers along with 5 K functional domain markers, unified on a common platform using MySQL, Apache and PHP in SBMDb. Embedded markers and their location information can be selected by chromosome and location/interval, and primers can be generated using the Primer3 core integrated at the back end. Our analyses revealed an abundance of 'mono' repeats (76.82%) over 'di' repeats (13.68%). The highest density (671.05 markers/Mb) was found on chromosome 1 and the lowest (341.27 markers/Mb) on chromosome 6. The marker density found in the sugarbeet genome has direct implications for increasing the marker density of maps. It will allow the present linkage map, with a marker distance of ~2 cM, to be refined from 200 to 2.6 Kb, thus facilitating QTL/gene mapping. We also report e-PCR-based detection of 2027 polymorphic markers in a panel of five genotypes. These markers can be used for DUS testing in variety identification and for MAS/GAS in variety improvement programs. The database offers a wide source of potential markers for developing and implementing new molecular breeding approaches needed to accelerate the industrial use of this crop, especially for sugar, health care products, medicines and color dye.
The identified markers will also help to improve bioenergy traits for bioethanol and biogas production, and to exploit the crop's efficiency in terms of low water and carbon footprint, especially in an era of climate change. Database URL: http://webapp.cabgrid.res.in/sbmdb/.
Subject(s)
Beta vulgaris/genetics, Biofuels, Genetic Databases, Plant Genome, Industry, Microsatellite Repeats/genetics, Plant Chromosomes/genetics, Genetic Markers, Nucleotide Motifs/genetics
ABSTRACT
BACKGROUND: Identifying true-to-breed-type animals is imperative for conservation. Breed dilution is one of the major problems for sustainability, except in cases of commercial crossbreeding under controlled conditions. Breed descriptors have been developed to identify breeds, but such descriptors cover only "pure breed" or true-to-breed-type animals and exclude undefined or admixed populations. Moreover, for semen, ova, embryos and breed products, the breed cannot be identified because visible phenotypic descriptors are lacking. The advent of molecular markers such as microsatellites and SNPs has revolutionized breed identification, even from small amounts of biological tissue or germplasm. Microsatellite DNA marker-based breed assignment has been reported in various domestic animals. Such methods have limitations, notably that allele data are not available in the public domain, so all reference breeds have to be genotyped each time, which is neither logical nor economical. Even when such data are available, the computational methods require expertise in data analysis and interpretation. RESULTS: We found Bayesian Networks to be the best classifier, with the highest accuracy of 98.7%, using 51850 reference allele data generated from 25 microsatellite loci on 22 goat breed populations of India. The FST values in the study were low, ranging from 0.051 to 0.297, with an overall genetic differentiation of 13.8%, suggesting that more loci are needed for higher accuracy. We report here the world's first model web server for breed identification using microsatellite DNA markers, freely accessible at http://cabin.iasri.res.in/gomi/. CONCLUSION: A higher number of loci is required because the populations are less differentiable and a large number of breeds were included in this study. This server will reduce cost and ease computation. The methodology can serve as a model for other domestic animal species, as a valuable tool for conservation and breed improvement programmes.
Subject(s)
Breeding, Computational Biology, Genetic Markers, Internet, Microsatellite Repeats, Animals, Bayes Theorem, Goats/genetics, India, Software
ABSTRACT
Heliothis virescens, a polyphagous insect, is one of the most destructive pests of many crops and vegetables. Agriculturalists use various insecticides and pesticides to stop its growth and development. RNA interference is a new approach to pest management that works by inhibiting growth-related RNAs. It requires the identification and characterization of miRNAs. In the present study, a computational approach is applied to predict putative miRNA candidates, along with their possible targets, in Heliothis virescens. A total of 63,662 ESTs were downloaded from the dbEST database and processed, trimmed and masked with EGassembler. The H. virescens contig database obtained after assembly was then searched for putative miRNA candidates by running a local BLAST against the insect miRNAs retrieved from miRBase, that is, by homology search against all reported insect miRNAs. These putative miRNA candidates were further validated and filtered on different features. In addition, we attempted to predict the putative targets of the filtered miRNAs using the 3' untranslated regions of mRNAs from B. mori. These miRNAs and their targets in H. virescens will improve understanding of the molecular mechanisms of miRNAs and support the development of novel, more precise techniques for studying post-transcriptional gene silencing.
ABSTRACT
An elucidated genome of the river buffalo, a domestic livestock species, will contribute enormously to the economy and to a better understanding of genome evolution. Here we attempt to obtain genomic information on buffalo from all Expressed Sequence Tags (ESTs) of Bubalus bubalis available in the public domain. These ESTs were annotated and classified into 15 functional categories based on their homology to known proteins. Interestingly, 41.79% of the contigs were found to be buffalo-specific novel ESTs with respect to the other species used in the analysis, which needs further study. In addition, 224 pSNPs (putative Single Nucleotide Polymorphisms) were detected. This study provides a base for further genomic and comparative studies of buffalo and a starting point for the genome annotation of the organism. Supplementary materials are available for this article online.
Subject(s)
Buffaloes/genetics, Genetic Databases, Expressed Sequence Tags, Genetic Markers/genetics, Animals, Gene Frequency, Genome, Genomics, Single Nucleotide Polymorphism, Species Specificity
ABSTRACT
Molecular markers play a significant role in improving desirable crop characteristics, such as high yield, disease resistance and others that benefit the crop in the long term. Pigeonpea (Cajanus cajan L.) is a legume recently sequenced by a global consortium led by ICRISAT (Hyderabad, India) and has been analysed for gene prediction, synteny maps, markers, etc. We present the PIgeonPEa Microsatellite DataBase (PIPEMicroDB), with an automated primer design tool for the pigeonpea genome based on chromosome-wise as well as location-wise search of primers. A total of 123 387 Short Tandem Repeats (STRs) were extracted from the publicly available pigeonpea genome using the MIcroSAtellite tool (MISA). The database is an online relational database based on a 'three-tier architecture' that catalogues microsatellite information in MySQL, with a user-friendly interface developed in PHP. The search for STRs can be customized by limiting their location on the chromosome as well as the number of markers in that range. This is a novel approach that has not been implemented in any existing marker database. The database has been further coupled with Primer3 for primer design on selected markers, with left and right flanking regions of up to 500 bp. This enables researchers to select markers of choice at a desired interval over the chromosome. Furthermore, individual STRs of a targeted chromosomal region can be used to narrow down the location of a gene of interest or of linked Quantitative Trait Loci (QTLs). Although this is an in silico approach, marker search based on the characteristics and location of STRs is expected to be beneficial for researchers. Database URL: http://cabindb.iasri.res.in/pigeonpea/
Subject(s)
Cajanus/genetics, DNA Primers/metabolism, Genetic Databases, Plant Genome/genetics, Microsatellite Repeats/genetics, Base Composition/genetics, Plant Chromosomes/genetics, Genetic Markers, Reproducibility of Results
ABSTRACT
BACKGROUND: Although India has sequenced the water buffalo genome, its draft assembly is based on the cattle genome BTau 4.0; thus, de novo chromosome-wise assembly remains a major pending issue for the global community. The existing radiation hybrid map of buffalo and the STRs reported here can be used in the final gap plugging and "finishing" expected in de novo genome assembly. QTL and gene mapping require mining of putative STRs from the buffalo genome at equal intervals on every chromosome. Such markers have a potential role in improving desirable characteristics such as high milk yield, disease resistance and high growth rate. STR mining from the whole genome and the development of a user-friendly database have yet to be done to reap the benefit of the whole genome sequence. DESCRIPTION: By in silico microsatellite mining of the whole genome, we have developed the first STR database of water buffalo, BuffSatDb (Buffalo MicroSatellite Database, http://cabindb.iasri.res.in/buffsatdb/), a web-based relational database of 910529 microsatellite markers developed using PHP and MySQL. The microsatellite markers were generated using the MIcroSAtellite tool. The database offers a simple and systematic web-based search for customised retrieval of chromosome-wise and genome-wide microsatellites. Search is enabled by chromosome, motif type (mono- to hexa-), repeat motif and repeat kind (simple and composite). The search may be customised by limiting the location of STRs on the chromosome as well as the number of markers in that range. This is a novel approach that has not been implemented in any existing marker database. The database has been further coupled with Primer3 for primer design on the selected markers, enabling researchers to select markers of choice at a desired interval over the chromosome. A unique add-on for degenerate bases further helps in handling the degenerate bases present in the current buffalo assembly.
CONCLUSION: As the first buffalo STR database in the world, BuffSatDb would not only pave the way to resolving the current assembly problem but should also be of immense use to the global community in QTL/gene mapping, which is critically required to increase buffalo productivity, especially in third world countries where the rural economy depends significantly on buffalo.
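The customised search described above (by chromosome, motif type, repeat kind, location range and number of markers) can be sketched as a simple filter. BuffSatDb itself is built with PHP and MySQL; the `Marker` record, its field names and `search_markers` below are hypothetical and only illustrate the kind of query the abstract describes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Marker:
    """Hypothetical marker record; field names are assumptions."""
    chromosome: str
    start: int   # 1-based start position on the chromosome
    end: int     # inclusive end position
    motif: str   # repeat unit, e.g. "AT"
    kind: str    # "simple" or "composite"


def search_markers(markers: List[Marker], chromosome: str, start: int, end: int,
                   motif_length: Optional[int] = None,
                   kind: Optional[str] = None,
                   limit: Optional[int] = None) -> List[Marker]:
    """Return markers lying fully inside [start, end] on one chromosome.

    Optional filters narrow by motif length (1 = mono ... 6 = hexa) and by
    repeat kind; `limit` caps the number of markers returned.
    """
    hits = [
        m for m in markers
        if m.chromosome == chromosome
        and m.start >= start and m.end <= end
        and (motif_length is None or len(m.motif) == motif_length)
        and (kind is None or m.kind == kind)
    ]
    hits.sort(key=lambda m: m.start)  # report in chromosomal order
    return hits if limit is None else hits[:limit]
```

In a relational database the same selection would typically be a `WHERE` clause on chromosome and position with a `LIMIT`; the in-memory list is used here only to keep the example self-contained.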