1.
Mol Biol Evol ; 40(6), 2023 06 01.
Article in English | MEDLINE | ID: mdl-37159511

ABSTRACT

According to archaeological records, chickpea (Cicer arietinum) was first domesticated in the Fertile Crescent about 10,000 years BP. Its subsequent diversification in the Middle East, South Asia, Ethiopia, and the Western Mediterranean, however, remains obscure and cannot be resolved using only archaeological and historical evidence. Moreover, chickpea has two market types, "desi" and "kabuli," whose geographic origin is a matter of debate. To decipher chickpea history, we took the genetic data from 421 chickpea landraces unaffected by the green revolution and tested complex historical hypotheses of chickpea migration and admixture on two hierarchical spatial levels: within and between major regions of cultivation. For chickpea migration within regions, we developed popdisp, a Bayesian model of population dispersal from a regional representative center toward the sampling sites that considers geographical proximities between sites. This method confirmed that chickpea spreads within each geographical region along optimal geographical routes rather than by simple diffusion, and it estimated representative allele frequencies for each region. For chickpea migration between regions, we developed another model, migadmi, that takes allele frequencies of populations and evaluates multiple and nested admixture events. Applying this model to desi populations, we found both Indian and Middle Eastern traces in Ethiopian chickpea, suggesting the presence of a seaway from South Asia to Ethiopia. As for the origin of kabuli chickpeas, we found significant evidence for an origin in Turkey rather than Central Asia.


Subject(s)
Cicer , Cicer/genetics , Polymorphism, Single Nucleotide , Bayes Theorem , Gene Frequency , Genomics
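The abstract above describes evaluating admixture events from population allele frequencies. As a rough illustration of the underlying idea only, and not of migadmi itself (whose interface and statistical model are not given in this record), the sketch below assumes that an admixed population's allele frequencies are a linear mixture of two source populations and recovers the mixing weight by least squares. The function name `admixture_weight` and all frequencies are hypothetical.

```python
def admixture_weight(p_admixed, p_source_a, p_source_b):
    """Least-squares estimate of w in p_admixed ~ w*p_a + (1 - w)*p_b.

    Simplifying assumption: a single admixture event between two sources,
    with no drift or sampling noise model. The estimate is clipped to [0, 1].
    """
    num = 0.0
    den = 0.0
    for pm, pa, pb in zip(p_admixed, p_source_a, p_source_b):
        d = pa - pb
        num += (pm - pb) * d
        den += d * d
    if den == 0.0:
        raise ValueError("source populations have identical allele frequencies")
    return min(1.0, max(0.0, num / den))


# Hypothetical allele frequencies at five SNPs (illustrative numbers only).
source_a = [0.10, 0.80, 0.35, 0.60, 0.05]
source_b = [0.70, 0.20, 0.55, 0.10, 0.45]
true_w = 0.3
admixed = [true_w * a + (1 - true_w) * b for a, b in zip(source_a, source_b)]

w = admixture_weight(admixed, source_a, source_b)
print(round(w, 3))
```

With noise-free synthetic input the estimate matches the weight used to build the admixed population; real data would call for a model that accounts for drift and for the nested events the abstract mentions.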
2.
Front Plant Sci ; 13: 1026943, 2022.
Article in English | MEDLINE | ID: mdl-36388581

ABSTRACT

Nodule bacteria (rhizobia), N2-fixing symbionts of leguminous plants, represent an excellent model to study the fundamental issues of evolutionary biology, including the tradeoff between microevolution, speciation, and macroevolution, which remains poorly understood for free-living organisms. Taxonomically, rhizobia are extremely diverse: they are represented by nearly a dozen families of α-proteobacteria (Rhizobiales) and by some β-proteobacteria. Their genomes are composed of core parts, including house-keeping genes (hkg), and of accessory parts, including symbiotically specialized (sym) genes. In multipartite genomes of evolutionarily advanced fast-growing species (Rhizobiaceae), sym genes are clustered on extra-chromosomal replicons (megaplasmids, chromids), facilitating gene transfer in plant-associated microbial communities. In this review, we demonstrate that in rhizobia, microevolution and speciation involve different genomic and ecological mechanisms: the first is based on the diversification of sym genes occurring under the impacts of host-induced natural selection (including its disruptive, frequency-dependent and group forms); the second, on the diversification of hkgs under the impacts of unknown factors. By contrast, macroevolution represents the polyphyletic origin of super-species taxa, which are dependent on the transfer of sym genes from rhizobia to various soil-borne bacteria. Since the expression of newly acquired sym genes on foreign genomic backgrounds is usually restricted, conversion of the resulting recombinants into novel rhizobia species involves post-transfer genetic changes. They are presumably supported by host-induced selective processes resulting in the sequential derepression of nod genes responsible for nodulation and of nif/fix genes responsible for symbiotic N2 fixation.

3.
PeerJ ; 10: e13888, 2022.
Article in English | MEDLINE | ID: mdl-36061756

ABSTRACT

High-throughput sequencing of amplicon libraries is the most widespread and one of the most effective ways to study the taxonomic structure of microbial communities, despite the growing accessibility of whole metagenome sequencing. Due to the targeted amplification, the method provides unparalleled resolution of communities, but at the same time perturbs the initial community structure, thereby reducing data robustness and compromising downstream analyses. Experimental research on the perturbations is largely limited to comparative studies of different PCR protocols, without considering other sources of experimental variation related to characteristics of the initial microbial composition itself. Here we analyse these sources and demonstrate how dramatically they affect the relative abundances of taxa during the PCR cycles. We developed a mathematical model of PCR amplification assuming heterogeneity of amplification efficiencies and considering the compositional nature of the data. We designed an experiment (five consecutive amplicon cycles, 22-26, with 12 replicates for one real human stool microbial sample) and estimated the dynamics of the microbial community in line with the model. We found high heterogeneity in the amplicon efficiencies of taxa, which leads to non-linear and substantial (up to fivefold) changes in relative abundances during PCR. The analysis of possible sources of heterogeneity revealed a significant association between amplicon efficiencies and the energy of secondary structures of the DNA templates. The result of our work highlights non-trivial changes in the dynamics of real-life microbial communities due to their compositional nature. The effects obtained apply not only to amplicon libraries but also to any studies of metagenome dynamics.


Subject(s)
Bacteria , Nucleic Acid Amplification Techniques , Humans , Alleles , RNA, Ribosomal, 16S/genetics , Polymerase Chain Reaction/methods
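To make the compositional effect described in the abstract above concrete, here is a minimal sketch that assumes the textbook exponential PCR model, in which taxon i grows by a constant factor (1 + e_i) per cycle. This is an assumption for illustration and not necessarily the authors' model; the function names, starting amounts, and efficiencies are hypothetical.

```python
def amplify(amounts, efficiencies, cycles):
    """Template amount per taxon after `cycles` PCR cycles.

    Assumes each cycle multiplies taxon i by (1 + e_i), with a constant
    per-taxon efficiency e_i between 0 and 1 (no plateau phase).
    """
    return [n * (1.0 + e) ** cycles for n, e in zip(amounts, efficiencies)]


def relative_abundances(amounts):
    """Close the data to a composition: divide every amount by the total."""
    total = sum(amounts)
    return [a / total for a in amounts]


# Three taxa that start at equal abundance but amplify with
# slightly different (hypothetical) efficiencies.
initial = [100.0, 100.0, 100.0]
eff = [0.95, 0.85, 0.75]

for cycles in (0, 22, 26):
    rel = relative_abundances(amplify(initial, eff, cycles))
    print(cycles, [round(r, 3) for r in rel])
```

Because relative abundances must sum to one, the gain of the most efficiently amplified taxon is necessarily a loss for the others, so the observed composition drifts further from the starting one with every additional cycle.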
4.
Elife ; 11, 2022 05 05.
Article in English | MEDLINE | ID: mdl-35510622

ABSTRACT

Studies of protein fitness landscapes reveal biophysical constraints guiding protein evolution and empower prediction of functional proteins. However, generalisation of these findings is limited by the scarcity of systematic data on fitness landscapes of proteins with a defined evolutionary relationship. We characterized the fitness peaks of four orthologous fluorescent proteins with a broad range of sequence divergence. While two of the four studied fitness peaks were sharp, the other two were considerably flatter, being almost entirely free of epistatic interactions. Mutationally robust proteins, characterized by a flat fitness peak, were not optimal templates for machine-learning-driven protein design; instead, predictions were more accurate for fragile proteins with epistatic landscapes. Our work provides insights for the practical application of fitness landscape heterogeneity in protein engineering.


Subject(s)
Genetic Fitness , Models, Genetic , Mutation , Proteins/genetics
5.
Cancer Cell ; 39(11): 1464-1478.e8, 2021 11 08.
Article in English | MEDLINE | ID: mdl-34719426

ABSTRACT

Bone metastases are devastating complications of cancer. They are particularly common in prostate cancer (PCa), represent incurable disease, and are refractory to immunotherapy. We seek to define distinct features of the bone marrow (BM) microenvironment by analyzing single cells from bone metastatic prostate tumors, involved BM, uninvolved BM, and BM from cancer-free, orthopedic patients, and healthy individuals. Metastatic PCa is associated with multifaceted immune distortion, specifically exhaustion of distinct T cell subsets and the appearance of macrophages with states specific to PCa bone metastases. The chemokine CCL20 is notably overexpressed by myeloid cells, as is its cognate CCR6 receptor on T cells. Disruption of the CCL20-CCR6 axis in mice with syngeneic PCa bone metastases restores T cell reactivity and significantly prolongs animal survival. Comparative high-resolution analysis of PCa bone metastases shows a targeted approach for relieving local immunosuppression for therapeutic effect.


Subject(s)
Bone Neoplasms/pathology , Bone Neoplasms/secondary , Chemokine CCL20/genetics , Prostatic Neoplasms/pathology , Receptors, CCR6/genetics , Up-Regulation , Animals , Bone Neoplasms/genetics , Bone Neoplasms/immunology , Case-Control Studies , Cell Line, Tumor , Chemokine CCL20/metabolism , Gene Expression Regulation, Neoplastic , Humans , Macrophages/immunology , Male , Mice , Myeloid Cells/immunology , Prostatic Neoplasms/genetics , Prostatic Neoplasms/immunology , Receptors, CCR6/metabolism , Single-Cell Analysis , T-Lymphocytes/immunology , Tumor Microenvironment
6.
Front Plant Sci ; 12: 642591, 2021.
Article in English | MEDLINE | ID: mdl-34025691

ABSTRACT

The difference in symbiotic specificity between peas of Afghanistan and European phenotypes was investigated using molecular modeling. Considering segregating amino acid polymorphism, we examined interactions of pea LykX-Sym10 receptor heterodimers with four forms of Nodulation factor (NF) that varied in natural decorations (acetylation and length of the glucosamine chain). First, we showed the stability of the LykX-Sym10 dimer during molecular dynamics (MD) in solvent and in the presence of a membrane. Then, four NFs were separately docked to one European and two Afghanistan dimers, and the results of these interactions were in line with corresponding pea symbiotic phenotypes. The European variant of the LykX-Sym10 dimer effectively interacts with both acetylated and non-acetylated forms of NF, while the Afghanistan variants successfully interact with the acetylated form only. We additionally demonstrated that the length of the NF glucosamine chain contributes to controlling the effectiveness of the symbiotic interaction. The obtained results support a recent hypothesis that the LykX gene is a suitable candidate for the unidentified Sym2 allele, the determinant of pea specificity toward Rhizobium leguminosarum bv. viciae strains producing NFs with or without an acetylation decoration. The developed modeling methodology demonstrated its power in multiple searches for genetic determinants, when experimental detection of such determinants has proven extremely difficult.

7.
BMC Genomics ; 21(Suppl 8): 490, 2020 Jul 28.
Article in English | MEDLINE | ID: mdl-32723302

ABSTRACT

BACKGROUND: There is a plethora of methods for genome-wide association studies. However, only a few of them may be classified as multi-trait and multi-locus, i.e. consider the influence of multiple genetic variants on several correlated phenotypes. RESULTS: We propose a multi-trait multi-locus model which employs structural equation modeling (SEM) to describe complex associations between SNPs and traits: multi-trait multi-locus SEM (mtmlSEM). The structure of our model makes it possible to discriminate pleiotropic and single-trait SNPs with direct and indirect effects. We also propose an automatic procedure to construct the model using factor analysis and the maximum likelihood method. For estimating a large number of parameters in the model, we performed Bayesian inference and implemented Gibbs sampling. An important feature of the model is that it correctly copes with non-normally distributed variables, such as some traits and variants. CONCLUSIONS: We applied the model to Vavilov's collection of 404 chickpea (Cicer arietinum L.) accessions with 20-fold cross-validation. We analyzed 16 phenotypic traits which we organized into five groups and found around 230 SNPs associated with traits, 60 of which had pleiotropic effects. The model demonstrated high accuracy in predicting trait values.


Subject(s)
Genome-Wide Association Study/statistics & numerical data , Latent Class Analysis , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Bayes Theorem , Genotype , Humans , Likelihood Functions
8.
Int J Mol Sci ; 21(11), 2020 May 31.
Article in English | MEDLINE | ID: mdl-32486400

ABSTRACT

A defining challenge of the 21st century is meeting the nutritional demands of the growing human population, under a scenario of limited land and water resources and under the specter of climate change. The Vavilov seed bank contains numerous landraces collected nearly a hundred years ago, and thus may contain 'genetic gems' with the potential to enhance modern breeding efforts. Here, we analyze 407 landraces, sampled from major historic centers of chickpea cultivation and secondary diversification. Genome-Wide Association Studies (GWAS) conducted on both phenotypic traits and bioclimatic variables at landrace sampling sites as extended phenotypes resulted in 84 GWAS hits associated with various regions. The novel haploblock-based test identified haploblocks enriched for single nucleotide polymorphisms (SNPs) associated with phenotypes and bioclimatic variables. Subsequent bi-clustering of traits sharing enriched haploblocks underscored both the non-random distribution of SNPs among several haploblocks and their association with multiple traits. We hypothesize that these clusters of pleiotropic SNPs represent genetic complexes co-adapted to a range of environmental conditions that chickpea experienced during domestication and subsequent geographic radiation. Linking genetic variation to phenotypic data and a wealth of historic information preserved in historic seed banks are the keys for genome-based and environment-informed breeding intensification.


Subject(s)
Cicer/genetics , Crops, Agricultural/genetics , Plant Breeding , Seeds , Biodiversity , Climate , Cluster Analysis , Conservation of Natural Resources , Genetic Association Studies , Genetic Markers , Genetic Variation , Genome, Plant , Genotype , Geography , Haplotypes , History, 20th Century , History, 21st Century , Likelihood Functions , Linkage Disequilibrium , Phenotype , Polymorphism, Single Nucleotide , Seed Bank/history , Seed Bank/organization & administration
9.
Ecol Evol ; 9(18): 10377-10386, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31624556

ABSTRACT

We hypothesized that population diversities of partners in nitrogen-fixing rhizobium-legume symbiosis can be matched for "interplaying" genes. We tested this hypothesis using data on nucleotide polymorphism of symbiotic genes encoding two components of the plant-bacteria signaling system: (a) the rhizobial nodA acyltransferase involved in the fatty acid tail decoration of the Nod factor (signaling molecule); (b) the plant NFR5 receptor required for Nod factor binding. We collected three wild-growing legume species together with soil samples adjacent to the roots from one large 25-year fallow: Vicia sativa, Lathyrus pratensis, and Trifolium hybridum nodulated by one of the two Rhizobium leguminosarum biovars (viciae and trifolii). For each plant species, we prepared three pools for DNA extraction and further sequencing: the plant pool (30 individual plants), the nodule pool (90 nodules), and the soil pool (30 samples). We obtained the following statistically significant results: (a) a monotonic relationship between the diversity in the plant NFR5 gene pools and the nodule rhizobial nodA gene pools; (b) higher topological similarity of the NFR5 gene tree with the nodA gene tree of the nodule pool than with the nodA gene tree of the soil pool. Both nonsynonymous diversity and Tajima's D were increased in the nodule pools compared with the soil pools, consistent with relaxation of negative selection and/or admixture of balancing selection. We propose that the observed genetic concordance between NFR5 gene pools and nodule nodA gene pools arises from the selection of particular genotypes of the nodA gene by the host plant.
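For readers unfamiliar with the Tajima's D statistic mentioned in the abstract above, the following is a minimal sketch of the standard estimator computed from a small set of aligned sequences. It is not the authors' pipeline: their data come from pooled sequencing, which would typically need estimators adapted to pools. The function name `tajimas_d` and the toy alignments are hypothetical.

```python
import math
from collections import Counter


def tajimas_d(sequences):
    """Standard Tajima's D for equal-length aligned sequences.

    Compares two estimates of diversity: pi (mean pairwise differences)
    and S / a1 (from the number of segregating sites). Returns NaN when
    there are no segregating sites.
    """
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least 4 sequences (variance is zero below that)")
    if any(len(s) != len(sequences[0]) for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    seg_sites = 0
    pairwise_diff_sum = 0.0
    for column in zip(*sequences):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            # Number of sequence pairs that differ at this site.
            pairwise_diff_sum += (n * n - sum(c * c for c in counts.values())) / 2
    if seg_sites == 0:
        return math.nan

    pi = pairwise_diff_sum / (n * (n - 1) / 2)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


# Toy alignments: rare variants only versus variants at intermediate frequency.
singletons = ["AAAA", "TAAA", "ATAA", "AATA"]
balanced = ["AAA", "AAA", "TTT", "TTT"]
print(tajimas_d(singletons) < 0, tajimas_d(balanced) > 0)
```

An excess of rare variants pushes D below zero, while variants held at intermediate frequencies push it above zero, which is why an increase in D is read in the abstract as compatible with relaxed negative selection or balancing selection.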

10.
Cells ; 8(9), 2019 09 05.
Article in English | MEDLINE | ID: mdl-31491936

ABSTRACT

BACKGROUND: Transposons are selfish genetic elements that self-reproduce in host DNA. They were active during evolutionary history and now occupy almost half of mammalian genomes. Close insertions of transposons considerably reshaped the structure and regulation of many genes. Co-evolution of transposons and host DNA frequently results in the formation of new regulatory regions. Previously we published a concept that the proportion of functional features held by transposons positively correlates with the rate of regulatory evolution of the respective genes. METHODS: We ranked human genes and molecular pathways according to their regulatory evolution rates based on high throughput genome-wide data on five histone modifications (H3K4me3, H3K9ac, H3K27ac, H3K27me3, H3K9me3) linked with transposons for five human cell lines. RESULTS: Based on a total of approximately 1.5 million histone tags, we ranked regulatory evolution rates for 25075 human genes and 3121 molecular pathways and identified groups of molecular processes that showed signs of either fast or slow regulatory evolution. However, histone tags showed different regulatory patterns and formed two distinct clusters: promoter/active chromatin tags (H3K4me3, H3K9ac, H3K27ac) vs. heterochromatin tags (H3K27me3, H3K9me3). CONCLUSION: In humans, transposon-linked histone marks evolved in a coordinated way depending on their functional roles.


Subject(s)
Chromatin/genetics , Evolution, Molecular , Histone Code , Histones/genetics , Chromatin/chemistry , Chromatin Assembly and Disassembly , DNA Transposable Elements , Histones/chemistry , Humans , Models, Genetic
11.
Front Plant Sci ; 9: 1734, 2018.
Article in English | MEDLINE | ID: mdl-30546376

ABSTRACT

The impact of deleterious variation on both plant fitness and crop productivity is not completely understood and is a topic of active debate. Deleterious mutations in plants have been predicted solely using sequence conservation methods rather than function-based classifiers, due to the lack of well-annotated mutational datasets in these organisms. Here, we developed a machine learning classifier based on a dataset of deleterious and neutral mutations in Arabidopsis thaliana by extracting 18 informative features that discriminate deleterious mutations from neutral ones, including 9 novel features not used in previous studies. We examined linear SVM, Gaussian SVM, and Random Forest classifiers, with the latter performing best. Random Forest classifiers exhibited a markedly higher accuracy than the popular PolyPhen-2 tool on the Arabidopsis dataset. Additionally, we tested whether the Random Forest, trained on the Arabidopsis dataset, accurately predicts deleterious mutations in Oryza sativa and Pisum sativum and observed satisfactory levels of accuracy (87% and 93%, respectively), higher than those obtained with PolyPhen-2. Application of transfer learning in classifiers did not improve their performance. To additionally test the performance of the Random Forest classifier across different angiosperm species, we applied it to annotate deleterious mutations in Cicer arietinum and validated them using population frequency data. Overall, we devised a classifier with the potential to improve the annotation of putative functional mutations in QTL and GWAS hit regions, as well as the evolutionary analysis of the proliferation of deleterious mutations during plant domestication, thus optimizing breeding improvement and development of new cultivars.

12.
Front Mol Neurosci ; 11: 192, 2018.
Article in English | MEDLINE | ID: mdl-29942251

ABSTRACT

Schizophrenia (SCZ) is a psychiatric disorder of unknown etiology. There is evidence suggesting that aberrations in neurodevelopment are a significant attribute of schizophrenia pathogenesis and progression. To identify biologically relevant molecular abnormalities affecting neurodevelopment in SCZ we used cultured neural progenitor cells derived from olfactory neuroepithelium (CNON cells). Here, we tested the hypothesis that variance in gene expression differs between individuals from SCZ and control groups. In CNON cells, variance in gene expression was significantly higher in SCZ samples in comparison with control samples. Variance in gene expression was enriched in five molecular pathways: serine biosynthesis, PI3K-Akt, MAPK, neurotrophin and focal adhesion. More than 14% of variance in disease status was explained within the logistic regression model (C-value = 0.70) by predictors accounting for gene expression in 69 genes from these five pathways. Structural equation modeling (SEM) was applied to explore how the structure of these five pathways was altered between SCZ patients and controls. Four out of five pathways showed differences in the estimated relationships among genes: between KRAS and NF1, and KRAS and SOS1 in the MAPK pathway; between PSPH and SHMT2 in serine biosynthesis; between AKT3 and TSC2 in the PI3K-Akt signaling pathway; and between CRK and RAPGEF1 in the focal adhesion pathway. Our analysis provides evidence that variance in gene expression is an important characteristic of SCZ, and SEM is a promising method for uncovering altered relationships between specific genes, thus suggesting affected gene regulation associated with the disease. We identified altered gene-gene interactions in pathways enriched for genes with increased variance in expression in SCZ. These pathways and loci were previously implicated in SCZ, providing further support for the hypothesis that gene expression variance plays an important role in the etiology of SCZ.
