1.
Plant Genome ; : e20447, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38628142

ABSTRACT

Sesame (Sesamum indicum L.) is an ancient oilseed crop of the family Pedaliaceae, cultivated globally for oil and food. In this study, 2496 sesame accessions conserved at the National Genebank of the ICAR-National Bureau of Plant Genetic Resources (NBPGR) were genotyped using a double-digest restriction site-associated DNA sequencing (ddRAD-seq) approach. A total of 64,910 filtered single-nucleotide polymorphisms (SNPs) were used to assess genome-scale diversity. Applications of this genome-scale information (a reduced representation obtained using restriction enzymes) are demonstrated through the development of a molecular core collection (CC) representing maximal SNP diversity. The information is also applied in developing a mid-density panel (MDP) comprising 2515 hyper-variable SNPs that represent genic and non-genic regions almost equally. The sesame CC, a representative set of 384 accessions with maximal diversity, was identified using multiple criteria: k-mer (a subsequence of length "k" in a sequence read) diversity, observed heterozygosity, CoreHunter3, GenoCore, and genetic differentiation. The coreset constituted around 15% of the total accessions studied, yet this small subset captured >60% of the SNP diversity of the entire population. Admixture analysis of the coreset shows reduced genetic complexity and increased nucleotide diversity (π), and the CC germplasm is geographically distributed without redundancy. Within the CC, accessions originating from India exhibit higher diversity (as expected under the center of diversity concept) than accessions procured from other countries. The identified CC set and the MDP will be a valuable resource for genomics-assisted, accelerated sesame improvement programs.
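Two of the statistics named in this abstract, observed heterozygosity and nucleotide diversity (π), can be illustrated with a minimal sketch. It assumes biallelic SNPs in diploid samples, coded as the alternate-allele count (0, 1, 2) with None for a missing call. The function names and toy genotypes are hypothetical and are not taken from the study, whose actual pipeline is not described here.

```python
from typing import List, Optional

Genotype = Optional[int]  # 0, 1, 2 = alternate-allele count; None = missing call


def observed_heterozygosity(calls: List[Genotype]) -> float:
    """Fraction of one sample's non-missing calls that are heterozygous."""
    called = [g for g in calls if g is not None]
    if not called:
        return float("nan")
    return sum(1 for g in called if g == 1) / len(called)


def site_pi(calls: List[Genotype]) -> float:
    """Unbiased per-site nucleotide diversity for one biallelic SNP."""
    called = [g for g in calls if g is not None]
    n = 2 * len(called)      # chromosomes sampled (diploid assumption)
    if n < 2:
        return 0.0
    p = sum(called) / n      # alternate-allele frequency
    return (n / (n - 1)) * 2 * p * (1 - p)


# Rows are samples, columns are SNPs (toy data for illustration only).
genotypes = [
    [0, 1, 2, None],
    [0, 1, 0, 1],
    [2, 0, 0, 1],
]

per_sample_ho = [observed_heterozygosity(row) for row in genotypes]
per_site_pi = [site_pi(list(col)) for col in zip(*genotypes)]
```

Averaging `per_site_pi` over all sites gives a per-site π for a set of accessions, which is one way a candidate core set could be compared with the full collection.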

2.
PLoS One ; 18(6): e0286599, 2023.
Article in English | MEDLINE | ID: mdl-37267340

ABSTRACT

To reduce genome sequence representation, restriction site-associated DNA sequencing (RAD-seq) protocols are widely used with either single-digest or double-digest methods. In this study, we genotyped a sesame population (48 samples) at pilot scale to compare the single-digest and double-digest RAD-seq (sdRAD-seq and ddRAD-seq) methods. We analysed the short-read data generated by both protocols and assessed, using various parameters, how their performance affects downstream analysis. The distinct k-mer count and gene presence-absence variation (PAV) differed significantly among the sesame samples studied. Additionally, variant calling differed significantly between the two datasets (sdRAD-seq and ddRAD-seq). The combined variants from both datasets helped to identify the most diverse samples and possible sub-groups in the sesame population. The most diverse samples identified in each analysis (k-mer, gene PAV, SNP count, heterozygosity, neighbour-joining (NJ) and principal component analysis (PCA)) may be representative samples holding the major diversity of the small sesame population used in this study. We discuss the best possible strategies, with suggested modifications, for using RAD-seq efficiently on large datasets of thousands of samples intended for molecular analyses such as diversity, population structure and core development studies.
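The distinct k-mer count used above as a per-sample diversity measure can be sketched as follows. This is an illustrative assumption about how such a count might be computed, not the tool used in the study: k-mers are collapsed with their reverse complements (canonical form), and windows containing ambiguous bases are skipped.

```python
from typing import Iterable, Set

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_BASES = set("ACGT")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def distinct_kmers(reads: Iterable[str], k: int) -> Set[str]:
    """Collect the distinct canonical k-mers observed across a sample's reads."""
    seen: Set[str] = set()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= _BASES:  # skip windows containing N or other symbols
                seen.add(canonical(kmer))
    return seen
```

Calling `len(distinct_kmers(reads, k))` for each sample gives one number per sample that can be ranked. For real read sets a dedicated k-mer counter would be needed, since a Python set does not scale to millions of reads.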


Subjects
Sesamum; Sesamum/genetics; Genome; Genotype; Sequence Analysis, DNA/methods; Base Sequence
3.
Front Plant Sci ; 13: 904392, 2022.
Article in English | MEDLINE | ID: mdl-35720556

ABSTRACT

Heat stress is one of the significant constraints affecting wheat production worldwide. To ensure food security for an ever-increasing world population, wheat must be improved for heat stress tolerance under the presently shifting climatic conditions. At the molecular level, heat stress tolerance in wheat is governed by a complex interplay of various heat stress-associated genes. We used a comparative transcriptome sequencing approach to study the effect of heat stress (5°C above the ambient threshold temperature of 20°C) during the grain filling stages in wheat genotype K7903 (Halna). At 7 DPA (days post-anthesis), heat stress treatment was given at four stages: 0, 24, 48, and 120 h. In total, 115,656 wheat genes were identified, including 309 differentially expressed genes (DEGs) involved in many critical processes, such as signal transduction, the starch synthesis pathway, and the antioxidant pathway, as well as heat stress-responsive conserved and uncharacterized putative genes that play an essential role in maintaining the grain filling rate at high temperature. A total of 98,412 simple sequence repeats (SSRs) were identified from the de novo transcriptome assembly of wheat and validated. miRNA target prediction for the differentially expressed genes was performed with the psRNATarget server against 119 mature miRNAs. Further, 107,107 variants, including 80,936 single-nucleotide polymorphisms (SNPs) and 26,171 insertions/deletions (indels), were identified in the de novo transcriptome assembly of wheat and the wheat genome (Ensembl version 31). The present study enriches our understanding of known heat response mechanisms during the grain filling stage, supported by the discovery of novel transcripts, microsatellite markers, putative miRNA targets, and genetic variants. This improves knowledge of gene functions and regulators, paving the way for improved heat tolerance in wheat varieties and making them more suitable for production under the current climate change scenario.

4.
Front Plant Sci ; 13: 846937, 2022.
Article in English | MEDLINE | ID: mdl-35712605

ABSTRACT

Black pepper (Piper nigrum), the "King of Spices," is an economically important spice in India, known for its medicinal and cultural value. SSRs, tandem repeats of short DNA sequences, are often polymorphic and have diverse applications. For population structure analysis, QTL/gene discovery, marker-assisted selection (MAS), and diversity analysis, their chromosomal locations must be known. The existing PinigSSRdb catalogs ~70K putative SSR markers, but these are anonymous (unknown chromosomal location) because they are based on 916 scaffolds rather than the 26 chromosomes. In this study, we generated ddRAD sequence data for 29 black pepper genotypes from across India, ddRAD being a low-cost and efficient technique for identifying polymorphic markers. The major limitation of ddRAD, its compromised/non-uniform coverage, was overcome by taking advantage of the available chromosome-wise data. The latest black pepper genome assembly was used to extract genome-wide SSRs. A total of 276,230 genomic SSRs distributed over the 26 chromosomes were mined, with a relative density of 362.88 SSRs/Mb and an average distance of 2.76 Kb between two SSRs. The same assembly was used to find polymorphic SSRs in the generated GBS data of the 29 black pepper genotypes, a rapid and cost-effective approach that yielded 3,176 polymorphic SSRs, of which 2,015 were hypervariable. The resulting web-genomic resource, BlackP2MSATdb (http://webtom.cabgrid.res.in/blackp2msatdb/), is the first reported and largest web resource for genomic and polymorphic SSRs of black pepper. It is useful for developing varietal signatures, coresets, and physical maps, and for QTL/gene identification and MAS in support of black pepper production.
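Genome-wide SSR mining of the kind described above can be sketched with regular expressions. The abstract does not name the mining tool or its thresholds, so the minimum repeat counts in `MIN_REPEATS` are hypothetical, and only perfect repeats with motifs of 1 to 6 bp are reported. The reported average spacing of 2.76 Kb is consistent with the reported density, since 1,000 Kb / 362.88 ≈ 2.76 Kb.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical thresholds: motif length (bp) -> minimum number of tandem copies.
MIN_REPEATS: Dict[int, int] = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq: str) -> List[Tuple[int, str, int]]:
    """Return (0-based start, motif, copy number) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # One motif of `size` bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


def density_per_mb(n_ssrs: int, sequence_bp: int) -> float:
    """SSR density expressed as SSRs per megabase of sequence."""
    return n_ssrs / (sequence_bp / 1_000_000)
```

For example, `find_ssrs("GG" + "AT" * 7 + "CC" + "A" * 12 + "GTC")` reports one (AT)7 repeat and one (A)12 repeat. The primitive-motif check prevents the poly-A run from being reported a second time as an (AA)6 dinucleotide repeat.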

5.
Bioinformatics ; 38(2): 318-324, 2022 01 03.
Article in English | MEDLINE | ID: mdl-34601584

ABSTRACT

MOTIVATION: Tea is a cross-pollinated woody perennial plant, which limits the application of conventional breeding for its genetic improvement. Moreover, the lack of genome-wide high-density SNP markers and genome-wide haplotype information has greatly hampered the utilization of tea genetic resources in fast-track tea breeding programs. To address this challenge, we have generated a first-generation haplotype map of tea (Tea HapMap-1). The out-crossing and highly heterozygous nature of tea plants makes DNA-level variant discovery more complicated. RESULTS: In this study, whole genome re-sequencing data of 369 tea genotypes were used to generate 2,334,564 biallelic SNPs and 1,447,985 InDels. Around 2928.04 million paired-end reads were used, with an average mapping depth of ∼0.31× per accession. The polymorphic sites identified in this study will be useful in mapping the genomic regions responsible for important traits of tea. These resources lay the foundation for future research to understand the genetic diversity within tea germplasm and to utilize genes that determine tea quality. This will further facilitate the understanding of tea genome evolution and tea metabolite pathways, and thus offers effective germplasm utilization for breeding tea varieties. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Camellia sinensis; Camellia sinensis/genetics; Haplotypes; HapMap Project; Plant Breeding; Tea; Genome, Plant
6.
Front Vet Sci ; 8: 593871, 2021.
Article in English | MEDLINE | ID: mdl-34222390

ABSTRACT

Water buffalo (Bubalus bubalis) are an important animal resource that contributes milk, meat, leather, dairy products, and power for plowing and transport. However, mastitis, a bacterial disease affecting milk production and reproduction efficiency, is most prevalent in populations under intensive selection for higher milk yield, especially where the inbreeding level is also high. Climate change and poor hygiene management practices further complicate the issue. The management of this disease faces major challenges, such as antibiotic resistance, maximum residue levels, horizontal gene transfer, and limited success in resistance breeding. Bovine mastitis genome-wide association studies have had limited success due to breed differences, sample sizes, and minor allele frequency, which lower the power to detect disease-associated SNPs. In this work, we focused on the application of targeted gene panels (TGPs) in screening for candidate gene association analysis, and on how this approach overcomes the limitations of genome-wide association studies. This work will facilitate the targeted sequencing of buffalo genomic regions at the high depth of coverage required to mine extremely rare variants potentially associated with buffalo mastitis. Although the whole genome assembly of water buffalo is available, mastitis genes have not been predicted and no TGP is available as a web-genomic resource for future variant mining and association studies. Of the 129 mastitis-associated genes of cattle, 101 were completely mapped on the buffalo genome to make the TGP. This further helped in identifying rare variants in water buffalo. Eighty-five genes were validated in the buffalo gene expression atlas, using RNA-Seq data from 50 tissues. The functions of 97 genes were predicted, revealing 225 pathways. The mastitis proteins were used for protein-protein interaction network analysis to obtain additional cross-talking proteins.
A total of 1,306 SNPs and 152 indels were identified from the 101 genes. Water Buffalo-MSTdb was developed with a 3-tier architecture to retrieve mastitis-associated genes, with genomic coordinates and chromosomal details, for TGP sequencing and mining of minor alleles for further association studies. Lastly, a web-genomic resource was made available to mine variants of targeted gene panels in buffalo for mastitis resistance breeding, in an endeavor to ensure improved productivity and reproductive efficiency of water buffalo.

7.
J Fungi (Basel) ; 7(4)2021 Apr 12.
Article in English | MEDLINE | ID: mdl-33921243

ABSTRACT

Identification and diversity analysis of fungi is highly challenging. Although DNA fingerprinting based on the internal transcribed spacer (ITS) region serves as a "gold standard" for most fungal species groups, it cannot differentiate between all groups and cryptic species. Therefore, it is of paramount importance to find an alternative approach for strain differentiation. The availability of whole genome sequence data for nearly 2000 fungal species is a promising solution to this requirement. We present the world's largest whole genome sequence-based microsatellite database, FungSatDB, which has >19M loci obtained from >1900 fungal species/strains using >4000 assemblies from across the globe. The genotyping efficacy of FungSatDB has been evaluated by both in silico and in vitro PCR. By in silico PCR, 66 strains from 8 countries representing four continents were successfully differentiated. Genotyping efficacy was also evaluated by in vitro PCR in four fungal species. This approach overcomes the limitations of ITS in species and strain signature and diversity analysis. It can accelerate fungal genomic research endeavors in agricultural, industrial, and environmental management.

8.
IEEE/ACM Trans Comput Biol Bioinform ; 18(4): 1361-1368, 2021.
Article in English | MEDLINE | ID: mdl-31494554

ABSTRACT

Alignment and comparison of protein 3D structures is an important and fundamental task in structural biology for studying evolutionary, functional and structural relatedness among proteins. For two decades, research on protein structure alignment has been a priority and numerous research articles have been published. Each method advances incrementally over previous efforts, and methods continue to improve over time, yet this remains an open problem in structural biology. A novel methodology has been developed for comparing protein 3D structures: a pair of protein 3D structures is converted into 2D graphs (undirected weighted graphs), the 2D graphs are partitioned into sub-graphs, the sub-graphs are matched with the main graphs, and finally the sub-graph matches are used to calculate the similarity between the pair of proteins. The proposed method has been implemented in MATLAB and as an R package. The performance of the developed methodology was tested against four of the best existing methods, CE, jFATCAT, TM_Align and Dali, on a benchmark dataset of 100 proteins with the SCOP database. The proposed method is efficient in terms of time complexity, accuracy, and grouping of proteins into relevant structural groups. It also provides additional information on non-bonded interactions, and its sub-graphs indicate the dominance of secondary structure.
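The first step of the method, turning a 3D structure into an undirected weighted graph, can be sketched as below. The abstract does not say how nodes and edges are defined, so this sketch assumes one node per residue (for example its Cα coordinate), an edge between any two residues within a distance cutoff, and the distance as the edge weight. The 8 Å default is a common convention for residue contact maps, not a value taken from the paper, and `contact_graph` is a hypothetical name.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]
Edge = Tuple[int, int]


def contact_graph(coords: List[Coord], cutoff: float = 8.0) -> Dict[Edge, float]:
    """Map residue index pairs (i < j) to their distance when within the cutoff."""
    edges: Dict[Edge, float] = {}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            distance = math.dist(coords[i], coords[j])
            if distance <= cutoff:
                edges[(i, j)] = distance  # undirected: each pair is stored once
    return edges
```

Storing each pair once with `i < j` keeps the graph undirected. The sub-graph partitioning and matching steps would operate on this edge dictionary and are not sketched here because the abstract gives no detail about them.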


Subjects
Computational Biology/methods; Imaging, Three-Dimensional; Models, Molecular; Protein Conformation; Proteins/chemistry; Algorithms; Markov Chains
9.
Front Plant Sci ; 11: 748, 2020.
Article in English | MEDLINE | ID: mdl-32582265

ABSTRACT

Among several important wheat foliar diseases, stripe rust (YR), leaf rust (LR), and stem rust (SR) have always been a concern to farmers and wheat breeders. The evolution of virulent pathotypes of these rusts poses a frequent threat of epidemics. Pyramiding rust resistance genes is the most economical and environment-friendly approach to postponing this inevitable threat. To achieve durable, long-term resistance against the three rusts, this study searched for novel sources of resistance alleles in a panel of 483 spring wheat genotypes. This is a unique and comprehensive study in which a diverse panel, comprising wheat germplasm from various categories and adapted to different wheat agro-climatic zones, was challenged with 18 pathotypes of the three rusts and simultaneously screened under field conditions. The panel was genotyped using a 35K SNP array and evaluated for each rust at two locations for two consecutive crop seasons. High heritability estimates of disease response were observed between environments for each rust type. A significant effect of population structure in the panel was visible in the disease response. Using a compressed mixed linear model approach, 25 genomic regions were found to be associated with resistance to at least two rusts. Of these, seven, on chromosome groups 1 and 6 along with 2B, were associated with all three rusts. For resistance against YR, LR, and SR, 16, 18, and 27 QTL (quantitative trait loci), respectively, were identified, each associated in at least two of the four environments. Several of these regions were annotated with resistance-associated genes, viz. NB-LRR, E3-ubiquitin protein ligase, ABC transporter protein, etc. Alien introgressed (on 1B and 3D) and pleiotropic (on 7D) resistance genes were captured in seedling and adult plant disease responses, respectively.
The present study demonstrates the use of genome-wide association for identification of a large number of favorable alleles for leaf, stripe, and stem rust resistance for broadening the genetic base. Quick conversion of these QTL into user-friendly markers will accelerate the deployment of these resistance loci in wheat breeding programs.

10.
Front Plant Sci ; 8: 2009, 2017.
Article in English | MEDLINE | ID: mdl-29234333

ABSTRACT

Wheat fulfills 20% of the global caloric requirement. The world needs 60% more wheat for a population of 9 billion by 2050, but climate change with increasing temperature is projected to affect wheat productivity adversely. Trait improvement and management of wheat germplasm require genomic resources. Simple sequence repeats (SSRs), being highly polymorphic and ubiquitously distributed in the genome, can be a marker of choice, but there is no structured marker database with options to generate primer pairs for genotyping at a desired chromosome/physical location. Markers previously associated with different wheat traits are also not available in any database. The limitations of in vitro SSR discovery can be overcome by genome-wide in silico mining of SSRs. The Triticum aestivum SSR database (TaSSRDb) is an integrated online database with a three-tier architecture, developed using PHP and MySQL and accessible at http://webtom.cabgrid.res.in/wheatssr/. For genotyping, standalone Primer3 code computes primers on user request. Chromosome-wise SSR calling for all three sub-genomes, with a choice of motif types, is provided in addition to primer generation for the desired marker. We report here a database of the highest number of SSRs (476,169) from the complex, hexaploid wheat genome (~17 Gb), along with 268 previously reported SSR markers associated with 11 traits. The highest (116.93 SSRs/Mb) and lowest (74.57 SSRs/Mb) SSR densities were found on chromosomes 2D and 3A, respectively. To obtain homozygous loci, e-PCR was performed. Thirty such loci were randomly selected for PCR validation in a panel of 18 wheat Advance Varietal Trial (AVT) lines. TaSSRDb can be a valuable genomic resource tool for linkage mapping, gene/QTL (quantitative trait locus) discovery, diversity analysis, traceability and variety identification.
Variety-specific profiling and differentiation can supplement DUS (Distinctness, Uniformity, and Stability) testing, EDV (Essentially Derived Variety)/IV (Initial Variety) disputes, seed purity and hybrid wheat testing. All of these are required in germplasm management as well as in the endeavor to improve wheat productivity.

11.
Article in English | MEDLINE | ID: mdl-21844638

ABSTRACT

One of the major research directions in bioinformatics is that of assigning superfamily classification to a given set of proteins. The classification reflects the structural, evolutionary, and functional relatedness. These relationships are embodied in a hierarchical classification, such as the Structural Classification of Protein (SCOP), which is mostly manually curated. Such a classification is essential for the structural and functional analyses of proteins. Yet a large number of proteins remain unclassified. In this study, we have proposed an unsupervised machine learning approach to classify and assign a given set of proteins to SCOP superfamilies. In the method, we have constructed a database and similarity matrix using P-values obtained from an all-against-all BLAST run and trained the network with the ART2 unsupervised learning algorithm using the rows of the similarity matrix as input vectors, enabling the trained network to classify the proteins from 0.82 to 0.97 f-measure accuracy. The performance of ART2 has been compared with that of spectral clustering, Random forest, SVM, and HHpred. ART2 performs better than the others except HHpred. HHpred performs better than ART2 and the sum of errors is smaller than that of the other methods evaluated.
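The f-measure quoted above is the harmonic mean of precision and recall. A minimal sketch for one superfamily follows. It assumes that the unsupervised cluster labels have already been mapped to SCOP superfamily names, a step the abstract does not describe, and `f_measure` is a hypothetical helper name.

```python
from typing import Sequence


def f_measure(true_labels: Sequence[str], predicted: Sequence[str], target: str) -> float:
    """F-measure (harmonic mean of precision and recall) for one target class."""
    pairs = list(zip(true_labels, predicted))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    if tp == 0:
        return 0.0  # no correct assignment: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With true labels `["a", "a", "a", "b", "b"]` and predictions `["a", "a", "b", "b", "a"]`, precision and recall for class `"a"` are both 2/3, so the f-measure is 2/3.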


Subjects
Computational Biology/methods; Pattern Recognition, Automated/methods; Proteins/chemistry; Proteins/classification; Sequence Analysis, Protein/methods; Algorithms; Cluster Analysis; Models, Statistical; Neural Networks, Computer; Protein Structure, Tertiary
12.
J Bioinform Comput Biol ; 8(5): 825-41, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20981890

ABSTRACT

One of the major research directions in bioinformatics is that of predicting the protein superfamily in large databases and classifying a given set of protein domains into superfamilies. The classification reflects the structural, evolutionary and functional relatedness. These relationships are embodied in hierarchical classification such as Structural Classification of Protein (SCOP), which is manually curated. Such classification is essential for the structural and functional analysis of proteins. Yet, a large number of proteins remain unclassified. We have proposed an unsupervised machine-learning FuzzyART neural network algorithm to classify a given set of proteins into SCOP superfamilies. The proposed method is fast learning and uses an atypical non-linear pattern recognition technique. In this approach, we have constructed a similarity matrix from p-values of BLAST all-against-all, trained the network with FuzzyART unsupervised learning algorithm using the similarity matrix as input vectors and finally the trained network offers SCOP superfamily level classification. In this experiment, we have evaluated the performance of our method with existing techniques on six different datasets. We have shown that the trained network is able to classify a given similarity matrix of a set of sequences into SCOP superfamilies at high classification accuracy.
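The FuzzyART training loop can be sketched in its textbook form: complement coding, a choice function, a vigilance test, and a learning update. This is a generic illustration rather than the authors' implementation. The parameter values are hypothetical, and the inputs are assumed to be rows of the similarity matrix already scaled to [0, 1].

```python
from typing import List


def complement_code(x: List[float]) -> List[float]:
    """Append 1 - x_i for every feature so that the input norm is constant."""
    return x + [1.0 - v for v in x]


class FuzzyART:
    def __init__(self, rho: float = 0.75, alpha: float = 0.001, beta: float = 1.0):
        self.rho = rho      # vigilance: higher values create more, tighter categories
        self.alpha = alpha  # choice parameter
        self.beta = beta    # learning rate; 1.0 is "fast learning"
        self.weights: List[List[float]] = []

    def train_one(self, x: List[float]) -> int:
        """Present one input in [0, 1]^M and return the index of its category."""
        i_vec = complement_code(x)
        norm_i = sum(i_vec)
        scored = []
        for j, w in enumerate(self.weights):
            overlap = sum(min(a, b) for a, b in zip(i_vec, w))  # |I ^ w_j|
            scored.append((overlap / (self.alpha + sum(w)), j, overlap))
        # Try categories from highest to lowest choice value.
        for _, j, overlap in sorted(scored, key=lambda s: (-s[0], s[1])):
            if overlap / norm_i >= self.rho:  # vigilance test passed: resonance
                w = self.weights[j]
                self.weights[j] = [
                    self.beta * min(a, b) + (1.0 - self.beta) * b
                    for a, b in zip(i_vec, w)
                ]
                return j
        self.weights.append(list(i_vec))  # no category matched: create a new one
        return len(self.weights) - 1
```

Presenting `[0.9, 0.1]`, `[0.1, 0.9]` and `[0.85, 0.15]` in that order with the default vigilance places the first and third inputs in one category and the second in another. Raising `rho` yields more, finer-grained categories, so the vigilance setting controls how finely the proteins are grouped.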


Subjects
Algorithms; Neural Networks, Computer; Proteins/classification; Artificial Intelligence; Computational Biology; Databases, Protein; Markov Chains; Pattern Recognition, Automated; Protein Structure, Tertiary; Proteins/chemistry; Proteins/genetics; Sequence Alignment/statistics & numerical data