Results 1 - 20 of 143
1.
mLife ; 3(2): 277-290, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38948139

ABSTRACT

Most in silico evolutionary studies assume that core genes are essential for cellular function, while accessory genes are dispensable, particularly in nutrient-rich environments. However, this assumption is seldom tested genetically within the pangenome context. In this study, we conducted a robust pangenomic Tn-seq analysis of fitness genes in a nutrient-rich medium for Sinorhizobium strains with a canonical open pangenome. To evaluate the robustness of fitness category assignment, Tn-seq data for three independent mutant libraries per strain were analyzed by three methods. The Hidden Markov Model (HMM)-based method proved most robust to variations between mutant libraries and insensitive to data size, outperforming the Bayesian and Monte Carlo simulation-based methods. Consequently, the HMM method was used to classify the fitness category. Fitness genes, categorized as essential (ES), advantage (GA), and disadvantage (GD) genes for growth, are enriched in core genes, while nonessential genes (NE) are over-represented in accessory genes. Accessory ES/GA genes showed a lower fitness effect than core ES/GA genes. Connectivity degrees in the cofitness network decrease in the order of ES, GD, and GA/NE. In addition to accessory genes, 1599 out of 3284 core genes display differential essentiality across test strains. Within the pangenome core, both shared quasi-essential (ES and GA) and strain-dependent fitness genes are enriched in similar functional categories. Our analysis demonstrates a considerable fuzzy essential zone determined by cofitness connectivity degrees in Sinorhizobium pangenome and highlights the power of the cofitness network in understanding the genetic basis of ever-increasing prokaryotic pangenome data.
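
The enrichment of fitness genes among core genes can be illustrated with a one-sided hypergeometric test. This is only a sketch: the abstract does not say which statistical test the authors used, and the function name and toy counts below are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, successes, draws, observed):
    """Return P(X >= observed) for X ~ Hypergeometric(population, successes, draws).

    population: total number of genes
    successes:  number of core genes in the population
    draws:      number of fitness genes
    observed:   number of fitness genes that are also core genes
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical toy numbers (not from the paper):
# 10 genes, 5 of them core, 4 fitness genes, all 4 of which are core.
p_value = hypergeom_enrichment_p(population=10, successes=5, draws=4, observed=4)
print(round(p_value, 4))
```

A small p-value here would indicate that core genes are over-represented among fitness genes relative to chance.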

2.
mSystems ; 9(6): e0138523, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38752789

ABSTRACT

A dysfunction of human host genes and proteins in coronavirus infectious disease 2019 (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a key factor impacting clinical symptoms and outcomes. Yet, a detailed understanding of human host immune responses is still incomplete. Here, we applied RNA sequencing to 94 samples of COVID-19 patients with and without hematological tumors as well as COVID-19 uninfected non-tumor individuals to obtain a comprehensive transcriptome landscape of both hematological tumor patients and non-tumor individuals. In our analysis, we further accounted for the human-SARS-CoV-2 protein interactome, human protein interactome, and human protein complex subnetworks to understand the mechanisms of SARS-CoV-2 infection and host immune responses. Our data sets enabled us to identify important SARS-CoV-2 (non-)targeted differentially expressed genes and complexes post-SARS-CoV-2 infection in both hematological tumor and non-tumor individuals. We found several unique differentially expressed genes, complexes, and functions/pathways such as blood coagulation (APOE, SERPINE1, SERPINE2, and TFPI), lipoprotein particle remodeling (APOC2, APOE, and CETP), and pro-B cell differentiation (IGHM, VPREB1, and IGLL1) during COVID-19 infection in patients with hematological tumors. In particular, APOE, a gene that is associated with both blood coagulation and lipoprotein particle remodeling, is not only upregulated in hematological tumor patients post-SARS-CoV-2 infection but also significantly expressed in acute dead patients with hematological tumors, providing clues for the design of future therapeutic strategies specifically targeting COVID-19 in patients with hematological tumors. Our data provide a rich resource for understanding the specific pathogenesis of COVID-19 in immunocompromised patients, such as those with hematological malignancies, and developing effective therapeutics for COVID-19. 
IMPORTANCE: A majority of previous studies focused on the characterization of coronavirus infectious disease 2019 (COVID-19) disease severity in people with normal immunity, while the characterization of COVID-19 in immunocompromised populations is still limited. Our study profiles changes in the transcriptome landscape post-severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection in hematological tumor patients and non-tumor individuals. Furthermore, our integrative and comparative systems biology analysis of the interactome, complexome, and transcriptome provides new insights into the tumor-specific pathogenesis of COVID-19. Our findings confirm that SARS-CoV-2 potentially tends to target more non-functional host proteins to indirectly affect host immune responses in hematological tumor patients. The identified unique genes, complexes, functions/pathways, and expression patterns post-SARS-CoV-2 infection in patients with hematological tumors increase our understanding of how SARS-CoV-2 manipulates the host molecular mechanism. Our observed differential genes/complexes and clinical indicators of normal/long infection and deceased COVID-19 patients provide clues for understanding the mechanism of COVID-19 progression in hematological tumors. Finally, our study provides an important data resource that supports the increasing value of the application of publicly accessible data sets to public health.


Subject(s)
COVID-19 , Immunocompromised Host , SARS-CoV-2 , Transcriptome , Humans , COVID-19/genetics , COVID-19/immunology , COVID-19/virology , Transcriptome/genetics , SARS-CoV-2/genetics , Hematologic Neoplasms/genetics , Hematologic Neoplasms/immunology , Male , Female , Protein Interaction Maps/genetics , Middle Aged , Gene Expression Profiling/methods
3.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38279649

ABSTRACT

The identification of human-herpesvirus protein-protein interactions (PPIs) is an essential and important entry point to understand the mechanisms of viral infection, especially in malignant tumor patients with common herpesvirus infection. While natural language processing (NLP)-based embedding techniques have emerged as powerful approaches, the application of multi-modal embedding feature fusion to predict human-herpesvirus PPIs is still limited. Here, we established a multi-modal embedding feature fusion-based LightGBM method to predict human-herpesvirus PPIs. In particular, we applied document and graph embedding approaches to represent sequence, network and function modal features of human and herpesviral proteins. Training our LightGBM models through our compiled non-rigorous and rigorous benchmarking datasets, we obtained significantly better performance compared to individual-modal features. Furthermore, our model outperformed traditional feature encodings-based machine learning methods and state-of-the-art deep learning-based methods using various benchmarking datasets. In a transfer learning step, we show that our model that was trained on human-herpesvirus PPI dataset without cytomegalovirus data can reliably predict human-cytomegalovirus PPIs, indicating that our method can comprehensively capture multi-modal fusion features of protein interactions across various herpesvirus subtypes. The implementation of our method is available at https://github.com/XiaodiYangpku/MultimodalPPI/.


Subject(s)
Benchmarking , Cytomegalovirus , Humans , Machine Learning , Natural Language Processing
4.
mSystems ; 9(1): e0100423, 2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38078741

ABSTRACT

Oomycetes are fungus-like eukaryotic microorganisms which can cause catastrophic diseases in many plants. Successful infection of oomycetes depends highly on their effector proteins that are secreted into plant cells to subvert plant immunity. Thus, systematic identification of effectors from the oomycete proteomes remains an initial but crucial step in understanding plant-pathogen relationships. However, the number of experimentally identified oomycete effectors is still limited. Currently, only a few bioinformatics predictors exist to detect potential effectors, and their prediction performance needs to be improved. Here, we used the sequence embeddings from a pre-trained large protein language model (ProtTrans) as input and developed a support vector machine-based method called POOE for predicting oomycete effectors. POOE could achieve a highly accurate performance with an area under the precision-recall curve of 0.804 (area under the receiver operating characteristic curve = 0.893, accuracy = 0.874, precision = 0.777, recall = 0.684, and specificity = 0.936) in the fivefold cross-validation, considerably outperforming various combinations of popular machine learning algorithms and other commonly used sequence encoding schemes. A similar prediction performance was also observed in the independent test. Compared with the existing oomycete effector prediction methods, POOE provided very competitive and promising performance, suggesting that ProtTrans effectively captures rich protein semantic information and dramatically improves the prediction task. We anticipate that POOE can accelerate the identification of oomycete effectors and provide new hints to systematically understand the functional roles of effectors in plant-pathogen interactions. The web server of POOE is freely accessible at http://zzdlab.com/pooe/index.php. 
The corresponding source codes and data sets are also available at https://github.com/zzdlabzm/POOE.
IMPORTANCE: In this work, we use the sequence representations from a pre-trained large protein language model (ProtTrans) as input and develop a Support Vector Machine-based method called POOE for predicting oomycete effectors. POOE could achieve a highly accurate performance in the independent test set, considerably outperforming existing oomycete effector prediction methods. We expect that this new bioinformatics tool will accelerate the identification of oomycete effectors and further guide the experimental efforts to interrogate the functional roles of effectors in plant-pathogen interaction.
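
The threshold-based metrics quoted in the abstract (accuracy, precision, recall, specificity) all derive from a binary confusion matrix. The sketch below shows the standard definitions; the function name and the counts are hypothetical and are not taken from the POOE paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),      # fraction of predicted effectors that are real
        "recall": tp / (tp + fn),         # fraction of real effectors that are found
        "specificity": tn / (tn + fp),    # fraction of non-effectors correctly rejected
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=30, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With imbalanced data (few effectors, many non-effectors), specificity and accuracy can look high while recall stays modest, which is one reason the precision-recall curve is often reported alongside them.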


Subject(s)
Oomycetes , Oomycetes/metabolism , Fungal Proteins/genetics , Software , Plants/metabolism , Language
5.
J Proteome Res ; 23(1): 494-499, 2024 01 05.
Article in English | MEDLINE | ID: mdl-38069805

ABSTRACT

Plant-pathogen protein-protein interactions (PPIs) play crucial roles in the arms race between plants and pathogens. Therefore, the identification of these interspecies PPIs is very important for the mechanistic understanding of pathogen infection and plant immunity. Computational prediction methods can complement experimental efforts, but their predictive performance still needs to be improved. Motivated by the rapid development of natural language processing and its successful applications in the field of protein bioinformatics, here we present an improved XGBoost-based plant-pathogen PPI predictor (i.e., AraPathogen2.0), in which sequence encodings from the pretrained protein language model ESM2 and Arabidopsis PPI network-related node representations from the graph embedding technique struc2vec are used as input. Stringent benchmark experiments showed that AraPathogen2.0 could achieve a better performance than its predecessor, especially for processing the test data set with novel proteins unseen in the training data.


Subject(s)
Arabidopsis , Protein Interaction Mapping , Protein Interaction Mapping/methods , Natural Language Processing , Plants , Proteins/metabolism , Arabidopsis/metabolism
6.
Plant Methods ; 19(1): 141, 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38062445

ABSTRACT

BACKGROUND: Protein-protein interactions (PPIs) are heavily involved in many biological processes. Consequently, the identification of PPIs in the model plant Arabidopsis is of great significance to deeply understand plant growth and development, and then to promote the basic research of crop improvement. Although many experimental Arabidopsis PPIs have been determined currently, the known interactomic data of Arabidopsis is far from complete. In this context, developing effective machine learning models from existing PPI data to predict unknown Arabidopsis PPIs conveniently and rapidly is still urgently needed. RESULTS: We used a large-scale pre-trained protein language model (pLM) called ESM-1b to convert protein sequences into high-dimensional vectors and then used them as the input of multilayer perceptron (MLP). To avoid the performance overestimation frequently occurring in PPI prediction, we employed stringent datasets to train and evaluate the predictive model. The results showed that the combination of ESM-1b and MLP (i.e., ESMAraPPI) achieved more accurate performance than the predictive models inferred from other pLMs or baseline sequence encoding schemes. In particular, the proposed ESMAraPPI yielded an AUPR value of 0.810 when tested on an independent test set where both proteins in each protein pair are unseen in the training dataset, suggesting its strong generalization and extrapolating ability. Moreover, the proposed ESMAraPPI model performed better than several state-of-the-art generic or plant-specific PPI predictors. CONCLUSION: Protein sequence embeddings from the pre-trained model ESM-1b contain rich protein semantic information. By combining with the MLP algorithm, ESM-1b revealed excellent performance in predicting Arabidopsis PPIs. We anticipate that the proposed predictive model (ESMAraPPI) can serve as a very competitive tool to accelerate the identification of Arabidopsis interactome.
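
The stringent evaluation described above, in which both proteins of every test pair are absent from the training data, can be sketched as a simple filter. The function name and the toy protein identifiers are hypothetical; the paper's actual data-splitting procedure may differ.

```python
def strictly_unseen_pairs(test_pairs, train_pairs):
    """Keep only test pairs in which neither protein occurs in any training pair."""
    training_proteins = {protein for pair in train_pairs for protein in pair}
    return [
        pair for pair in test_pairs
        if not any(protein in training_proteins for protein in pair)
    ]


# Hypothetical identifiers for illustration only.
train = [("A", "B"), ("B", "C")]
test = [("A", "D"), ("D", "E"), ("C", "F")]

unseen = strictly_unseen_pairs(test, train)
print(unseen)
```

Pairs such as ("A", "D") are dropped because one partner was seen during training, which would otherwise let a model score well by memorizing individual proteins.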

7.
Commun Biol ; 6(1): 1233, 2023 12 06.
Article in English | MEDLINE | ID: mdl-38057566

ABSTRACT

A set of high-quality pan-genomes would help identify important genes that are still hidden/incomplete in bird reference genomes. In an attempt to address these issues, we have assembled a de novo chromosome-level reference genome of the Silkie (Gallus gallus domesticus), which is an important avian model for unique traits, like fibromelanosis, with unclear genetic foundation. This Silkie genome includes the complete genomic sequences of well-known, but unresolved, evolutionarily, endocrinologically, and immunologically important genes, including leptin, ovocleidin-17, and tumor-necrosis factor-α. The gap-less and manually annotated MHC (major histocompatibility complex) region possesses 38 recently identified genes, with differentially regulated genes recovered in response to pathogen challenges. We also provide whole-genome methylation and genetic variation maps, and resolve a complex genetic region that may contribute to fibromelanosis in these animals. Finally, we experimentally show leptin binding to the identified leptin receptor in chicken, confirming an active leptin ligand-receptor system. The Silkie genome assembly not only provides a rich data resource for avian genome studies, but also lays a foundation for further functional validation of resolved genes.


Subject(s)
Chickens , Leptin , Animals , Chickens/genetics , Leptin/genetics , Genome , Genomics , Chromosomes
8.
Methods Mol Biol ; 2690: 385-399, 2023.
Article in English | MEDLINE | ID: mdl-37450161

ABSTRACT

Proteome-wide characterization of protein-protein interactions (PPIs) is crucial to understand the functional roles of protein machinery within cells systematically. With the accumulation of PPI data in different plants, the interaction details of binary PPIs, such as the three-dimensional (3D) structural contexts of interaction sites/interfaces, are urgently demanded. To meet this requirement, we have developed a comprehensive and easy-to-use database called PlaPPISite (http://zzdlab.com/plappisite/index.php) to present interaction details for 13 plant interactomes. Here, we provide a clear guide on how to search and view protein interaction details through the PlaPPISite database. Firstly, the running environment of our database is introduced. Secondly, the input file format is briefly introduced. Moreover, we discussed which information related to interaction sites can be achieved through several examples. In addition, some notes about PlaPPISite are also provided. More importantly, we would like to emphasize the importance of interaction site information in plant systems biology through this user guide of PlaPPISite. In particular, the easily accessible 3D structures of PPIs in the coming post-AlphaFold2 era will definitely boost the application of plant interactome to decipher the molecular mechanisms of many fundamental biological issues.


Subject(s)
Plants , Protein Interaction Mapping , Protein Interaction Mapping/methods , Databases, Protein , Plants/metabolism , Proteome/metabolism , Plant Proteins
9.
BMC Genomics ; 24(1): 301, 2023 Jun 03.
Article in English | MEDLINE | ID: mdl-37270481

ABSTRACT

BACKGROUND: The behaviors and ontogeny of Aedes aegypti are closely related to the spread of diseases caused by dengue (DENV), chikungunya (CHIKV), Zika (ZIKV), and yellow fever (YFV) viruses. During the life cycle, Ae. aegypti undergoes drastic morphological, metabolic, and functional changes triggered by gene regulation and other molecular mechanisms. Some essential regulatory factors that regulate insect ontogeny have been revealed in other species, but their roles are still poorly investigated in the mosquito. RESULTS: Our study identified 6 gene modules and their intramodular hub genes that were highly associated with the ontogeny of Ae. aegypti in the constructed network. Those modules were found to be enriched in functional roles related to cuticle development, ATP generation, digestion, immunity, pupation control, lectins, and spermatogenesis. Additionally, digestion-related pathways were activated in the larvae and adult females but suppressed in the pupae. The integrated protein‒protein network also identified cilium-related genes. In addition, we verified that the 6 intramodular hub genes encoding proteins such as EcKinase regulating larval molt were only expressed in the larval stage. Quantitative RT‒PCR of the intramodular hub genes gave similar results as the RNA-Seq expression profile, and most hub genes were ontogeny-specifically expressed. CONCLUSIONS: The constructed gene coexpression network provides a useful resource for network-based data mining to identify candidate genes for functional studies. Ultimately, these findings will be key in identifying potential molecular targets for disease control.


Subject(s)
Aedes , Dengue , Yellow Fever , Zika Virus Infection , Zika Virus , Male , Animals , Female , Yellow Fever/genetics , Zika Virus/genetics , Gene Regulatory Networks , Mosquito Vectors , Proteins/genetics , Larva
10.
BMC Plant Biol ; 23(1): 225, 2023 Apr 27.
Article in English | MEDLINE | ID: mdl-37106367

ABSTRACT

BACKGROUND: Alternative splicing (AS) is a co-transcriptional regulatory mechanism of plants in response to environmental stress. However, the role of AS in biotic and abiotic stress responses remains largely unknown. To speed up our understanding of plant AS patterns under different stress responses, development of informative and comprehensive plant AS databases is highly demanded. DESCRIPTION: In this study, we first collected 3,255 RNA-seq data under biotic and abiotic stresses from two important model plants (Arabidopsis and rice). Then, we conducted AS event detection and gene expression analysis, and established a user-friendly plant AS database termed PlaASDB. By using representative samples from this highly integrated database resource, we compared AS patterns between Arabidopsis and rice under abiotic and biotic stresses, and further investigated the corresponding difference between AS and gene expression. Specifically, we found that differentially spliced genes (DSGs) and differentially expressed genes (DEGs) show very limited overlap under all kinds of stresses, suggesting that gene expression regulation and AS seemed to play independent roles in response to stresses. Compared with gene expression, Arabidopsis and rice were more inclined to have conserved AS patterns under stress conditions. CONCLUSION: PlaASDB is a comprehensive plant-specific AS database that mainly integrates the AS and gene expression data of Arabidopsis and rice in stress response. Through large-scale comparative analyses, the global landscape of AS events in Arabidopsis and rice was observed. We believe that PlaASDB could help researchers understand the regulatory mechanisms of AS in plants under stresses more conveniently. PlaASDB is freely accessible at http://zzdlab.com/PlaASDB/ASDB/index.html.
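
One simple way to quantify the overlap between differentially spliced and differentially expressed genes is the Jaccard index of the two gene sets. This is a minimal sketch; the abstract does not state which overlap measure was used, and the gene identifiers below are hypothetical.

```python
def overlap_summary(dsgs, degs):
    """Return the shared genes and the Jaccard index of two gene sets."""
    dsgs, degs = set(dsgs), set(degs)
    shared = dsgs & degs
    union = dsgs | degs
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, jaccard


# Hypothetical gene identifiers for illustration only.
dsgs = {"g1", "g2", "g3", "g4"}   # differentially spliced genes
degs = {"g4", "g5", "g6", "g7"}   # differentially expressed genes

shared, jaccard = overlap_summary(dsgs, degs)
print(sorted(shared), round(jaccard, 3))
```

A Jaccard index near zero corresponds to the situation the abstract describes, where the two regulatory layers affect largely different genes.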


Subject(s)
Arabidopsis , Oryza , Alternative Splicing , Arabidopsis/metabolism , Plants/genetics , Gene Expression Profiling , Stress, Physiological/genetics , Gene Expression Regulation, Plant , Oryza/metabolism , Plant Proteins/genetics
11.
Plant J ; 114(4): 984-994, 2023 05.
Article in English | MEDLINE | ID: mdl-36919205

ABSTRACT

Currently, the experimentally identified interactome of Arabidopsis (Arabidopsis thaliana) is still far from complete, suggesting that computational prediction methods can complement experimental techniques. Motivated by the prosperity and success of deep learning algorithms and natural language processing techniques, we introduce an integrative deep learning framework, DeepAraPPI, allowing us to predict protein-protein interactions (PPIs) of Arabidopsis utilizing sequence, domain and Gene Ontology (GO) information. Our current DeepAraPPI comprises: (i) a word2vec encoding-based Siamese recurrent convolutional neural network (RCNN) model; (ii) a Domain2vec encoding-based multiple-layer perceptron (MLP) model; and (iii) a GO2vec encoding-based MLP model. Finally, DeepAraPPI combines the prediction results of the three individual predictors through a logistic regression model. Compiling high-quality positive and negative training and test samples by applying strict filtering strategies, DeepAraPPI shows superior performance compared with existing state-of-the-art Arabidopsis PPI prediction methods. DeepAraPPI also provides better cross-species predictive ability in rice (Oryza sativa) than traditional machine learning methods, although the overall performance in cross-species prediction remains to be improved. DeepAraPPI is freely accessible at http://zzdlab.com/deeparappi/. In the meantime, we have also made the source code and data sets of DeepAraPPI available at https://github.com/zjy1125/DeepAraPPI.
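
The final step described above, combining the outputs of three individual predictors through a logistic regression model, can be sketched as follows. The weights and bias are hypothetical placeholders; in practice they would be fitted on held-out predictions, and this is not the authors' implementation.

```python
import math


def combine_scores(scores, weights, bias):
    """Logistic-regression combination of per-model scores into one probability."""
    z = bias + sum(w * s for w, s in zip(weights, scores))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical parameters for illustration only (not fitted values).
weights = [1.0, 1.0, 1.0]   # one weight each for the sequence, domain and GO models
bias = -1.5

p_balanced = combine_scores([0.5, 0.5, 0.5], weights, bias)   # z = 0
p_confident = combine_scores([0.9, 0.9, 0.9], weights, bias)  # z = 1.2
print(p_balanced, round(p_confident, 3))
```

Learning the weights lets the combined model rely more heavily on whichever individual predictor is most informative for a given data set.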


Subject(s)
Arabidopsis , Deep Learning , Arabidopsis/genetics , Algorithms , Software , Machine Learning , Computational Biology/methods
12.
EMBO J ; 42(8): e112401, 2023 04 17.
Article in English | MEDLINE | ID: mdl-36811145

ABSTRACT

The maintenance of sodium/potassium (Na+/K+) homeostasis in plant cells is essential for salt tolerance. Plants export excess Na+ out of cells mainly through the Salt Overly Sensitive (SOS) pathway, activated by a calcium signal; however, it is unknown whether other signals regulate the SOS pathway and how K+ uptake is regulated under salt stress. Phosphatidic acid (PA) is emerging as a lipid signaling molecule that modulates cellular processes in development and the response to stimuli. Here, we show that PA binds to the residue Lys57 in SOS2, a core member of the SOS pathway, under salt stress, promoting the activity and plasma membrane localization of SOS2, which activates the Na+/H+ antiporter SOS1 to promote the Na+ efflux. In addition, we reveal that PA promotes the phosphorylation of SOS3-like calcium-binding protein 8 (SCaBP8) by SOS2 under salt stress, which attenuates the SCaBP8-mediated inhibition of Arabidopsis K+ transporter 1 (AKT1), an inward-rectifying K+ channel. These findings suggest that PA regulates the SOS pathway and AKT1 activity under salt stress, promoting Na+ efflux and K+ influx to maintain Na+/K+ homeostasis.


Subject(s)
Arabidopsis Proteins , Arabidopsis , Protein Serine-Threonine Kinases , Salt Stress , Arabidopsis/metabolism , Arabidopsis Proteins/metabolism , Homeostasis , Phosphatidic Acids/metabolism , Potassium/metabolism , Protein Serine-Threonine Kinases/metabolism , Salt Stress/genetics , Sodium/metabolism
13.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36682013

ABSTRACT

While deep learning (DL)-based models have emerged as powerful approaches to predict protein-protein interactions (PPIs), the reliance on explicit similarity measures (e.g. sequence similarity and network neighborhood) to known interacting proteins makes these methods ineffective in dealing with novel proteins. The advent of AlphaFold2 presents a significant opportunity and also a challenge to predict PPIs in a straightforward way based on monomer structures while controlling bias from protein sequences. In this work, we established Structure and Graph-based Predictions of Protein Interactions (SGPPI), a structure-based DL framework for predicting PPIs, using the graph convolutional network. In particular, SGPPI focused on protein patches on the protein-protein binding interfaces and extracted the structural, geometric and evolutionary features from the residue contact map to predict PPIs. We demonstrated that our model outperforms traditional machine learning methods and state-of-the-art DL-based methods using non-representation-bias benchmark datasets. Moreover, our model trained on human dataset can be reliably transferred to predict yeast PPIs, indicating that SGPPI can capture converging structural features of protein interactions across various species. The implementation of SGPPI is available at https://github.com/emerson106/SGPPI.
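
A residue contact map, the structure from which the abstract says features are extracted, is commonly defined by thresholding pairwise distances between residue coordinates. The sketch below assumes one 3D coordinate per residue and an 8 Å cutoff; both choices are illustrative assumptions, not details confirmed by the abstract.

```python
from math import dist


def contact_map(coords, cutoff=8.0):
    """Return a symmetric boolean matrix: True where two residues lie within `cutoff`."""
    n = len(coords)
    return [
        [dist(coords[i], coords[j]) <= cutoff for j in range(n)]
        for i in range(n)
    ]


# Hypothetical coordinates (in angstroms) for three residues.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (20.0, 0.0, 0.0)]

cmap = contact_map(coords)
for row in cmap:
    print(row)
```

Such a matrix can be read as the adjacency matrix of a residue graph, which is the kind of input a graph convolutional network operates on.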


Subject(s)
Machine Learning , Proteins , Humans , Proteins/chemistry , Protein Binding , Amino Acid Sequence , Saccharomyces cerevisiae/metabolism
14.
ISME J ; 17(3): 417-431, 2023 03.
Article in English | MEDLINE | ID: mdl-36627434

ABSTRACT

Migration from rhizosphere to rhizoplane is a key selecting process in root microbiome assembly, but is not fully understood. Rhizobiales members are overrepresented in the core root microbiome of terrestrial plants, and here we report a genome-wide transposon-sequencing of rhizoplane fitness genes of beneficial Sinorhizobium fredii on wild soybean, cultivated soybean, rice, and maize. There were few genes involved in broad-host-range rhizoplane colonization. The fadL mutant lacking a fatty acid transporter exhibited high colonization rates, while mutations in exoFQP (encoding membrane proteins directing exopolysaccharide polymerization and secretion), but not those in exo genes essential for exopolysaccharide biosynthesis, led to severely impaired colonization rates. This variation was not explainable by their rhizosphere and rhizoplane survivability, and associated biofilm and exopolysaccharide production, but consistent with their migration ability toward rhizoplane, and associated surface motility and the mixture of quorum-sensing AHLs (N-acylated-L-homoserine lactones). Genetic and physiological evidence suggested that FadL mediated long-chain AHL uptake while ExoF mediated the secretion of short-chain AHLs which negatively affected long-chain AHL biosynthesis. The fadL and exoF mutants had elevated and depleted extracellular long-chain AHLs, respectively. A synthetic mixture of long-chain AHLs mimicking that of the fadL mutant can improve rhizobial surface motility. When this AHL mixture was spotted into rhizosphere, the migration toward roots and rhizoplane colonization of S. fredii were enhanced in a diffusible way. This work adds novel parts managing extracellular AHLs, which modulate bacterial migration toward rhizoplane. The FadL-ExoFQP system is conserved in Alphaproteobacteria and may shape the "home life" of diverse keystone rhizobacteria.


Subject(s)
Rhizobium , Bacteria/genetics , Quorum Sensing , Biofilms , Fatty Acids , Acyl-Butyrolactones/metabolism
15.
Genome Biol ; 23(1): 264, 2022 12 22.
Article in English | MEDLINE | ID: mdl-36550554

ABSTRACT

BACKGROUND: Heterosis is widely used in agriculture. However, its molecular mechanisms are still unclear in plants. Here, we develop, sequence, and record the phenotypes of 418 hybrids from crosses between two testers and 265 rice varieties from a mini-core collection. RESULTS: Phenotypic analysis shows that heterosis is dependent on genetic backgrounds and environments. By genome-wide association study of 418 hybrids and their parents, we find that nonadditive QTLs are the main genetic contributors to heterosis. We show that nonadditive QTLs are more sensitive to the genetic background and environment than additive ones. Further simulations and experimental analysis support a novel mechanism, homo-insufficiency under insufficient background (HoIIB), underlying heterosis. We propose heterosis in most cases is not due to heterozygote advantage but homozygote disadvantage under the insufficient genetic background. CONCLUSION: The HoIIB model elucidates that genetic background insufficiency is the intrinsic mechanism of background dependence, and also the core mechanism of nonadditive effects and heterosis. This model can explain most known hypotheses and phenomena about heterosis, and thus provides a novel theory for hybrid rice breeding in future.


Subject(s)
Hybrid Vigor , Oryza , Oryza/genetics , Genome-Wide Association Study , Transcriptome , Plant Breeding , Genomics
16.
Int J Mol Sci ; 23(14)2022 Jul 10.
Article in English | MEDLINE | ID: mdl-35886965

ABSTRACT

The protozoan pathogen Cryptosporidium parvum infects intestinal epithelial cells and causes diarrhea in humans and young animals. Among the more than 20 genes encoding insulinase-like metalloproteinases (INS), two are paralogs with high sequence identity. In this study, one of them, INS-16 encoded by the cgd3_4270 gene, was expressed and characterized in a comparative study of its sibling, INS-15 encoded by the cgd3_4260 gene. A full-length INS-16 protein and its active domain I were expressed in Escherichia coli, and antibodies against the domain I and an INS-16-specific peptide were produced in rabbits. In the analysis of the crude extract of oocysts, a ~60 kDa fragment of INS-16 rather than the full protein was recognized by polyclonal antibodies against the specific peptide, indicating that INS-16 undergoes proteolytic cleavage before maturation. The expression of the ins-16 gene peaked at the invasion phase of in vitro C. parvum culture, with the documented expression of the protein in both sporozoites and merozoites. Localization studies with antibodies showed significant differences in the distribution of the native INS-15 and INS-16 proteins in sporozoites and merozoites. INS-16 was identified as a dense granule protein in sporozoites and macrogamonts but was mostly expressed at the apical end of merozoites. We screened 48 candidate INS-16 inhibitors from the molecular docking of INS-16. Among them, two inhibited the growth of C. parvum in vitro (EC50 = 1.058 µM and 2.089 µM). The results of this study suggest that INS-16 may have important roles in the development of C. parvum and could be a valid target for the development of effective treatments.


Subject(s)
Cryptosporidium parvum , Insulysin , Metalloproteases , Protozoan Proteins , Animals , Cryptosporidiosis/metabolism , Cryptosporidium/metabolism , Cryptosporidium parvum/metabolism , Insulysin/metabolism , Metalloproteases/metabolism , Molecular Docking Simulation , Protozoan Proteins/metabolism , Rabbits , Sporozoites/metabolism
17.
Mol Biol Evol ; 39(7)2022 07 02.
Article in English | MEDLINE | ID: mdl-35776423

ABSTRACT

Genetic recombination plays a critical role in the emergence of pathogens with phenotypes such as drug resistance, virulence, and host adaptation. Here, we tested the hypothesis that recombination between sympatric ancestral populations leads to the emergence of divergent variants of the zoonotic parasite Cryptosporidium parvum with modified host ranges. Comparative genomic analyses of 101 isolates have identified seven subpopulations isolated by distance. They appear to be descendants of two ancestral populations, IIa in northwestern Europe and IId from southwestern Asia. Sympatric recombination in areas with both ancestral subtypes and subsequent selective sweeps have led to the emergence of new subpopulations with mosaic genomes and modified host preference. Subtelomeric genes could be involved in the adaptive selection of subpopulations, while copy number variations of genes encoding invasion-associated proteins are potentially associated with modified host ranges. These observations reveal ancestral origins of zoonotic C. parvum and suggest that pathogen import through modern animal farming might promote the emergence of divergent subpopulations of C. parvum with modified host preference.


Subject(s)
Cryptosporidiosis , Cryptosporidium parvum , Cryptosporidium , Animals , Cryptosporidiosis/parasitology , Cryptosporidium/genetics , Cryptosporidium parvum/genetics , DNA Copy Number Variations , Recombination, Genetic
18.
Front Microbiol ; 13: 883674, 2022.
Article in English | MEDLINE | ID: mdl-35558125

ABSTRACT

Calcium-dependent protein kinases (CDPKs) are important in calcium influx, triggering several biological processes in Cryptosporidium spp. As they are not present in mammals, CDPKs are considered promising drug targets. Recent studies have characterized CpCDPK1, CpCDPK3, CpCDPK4, CpCDPK5, CpCDPK6, and CpCDPK9, but the role of CpCDPK2A remains unclear. In this work, we expressed recombinant CpCDPK2A, encoded by the cgd2_1060 gene, in Escherichia coli and characterized its biological functions using qRT-PCR, immunofluorescence microscopy, immuno-electron microscopy, and in vitro neutralization. The results revealed that the CpCDPK2A protein was highly expressed in the apical region of sporozoites and merozoites and in macrogamonts. Monoclonal or polyclonal antibodies against CpCDPK2A failed to block the invasion of host cells. Among the 44 candidate inhibitors from molecular docking of CpCDPK2A, one was identified as having a potential effect on both Cryptosporidium parvum growth and CpCDPK2A enzyme activity. These data suggest that CpCDPK2A may play a role during the development of C. parvum and might be a potential drug target against cryptosporidiosis.

19.
Comput Struct Biotechnol J ; 20: 2322-2331, 2022.
Article in English | MEDLINE | ID: mdl-35615014

ABSTRACT

Cryptosporidium parvum (C. parvum), one of the most studied apicomplexan parasites of the genus Cryptosporidium, causes cryptosporidiosis, a serious diarrheal disease found worldwide that can be deadly to immunodeficient individuals, newborn children, and animals. Proteome-wide identification of protein-protein interactions (PPIs) has proven valuable in the systematic understanding of the genome-phenome relationship. However, the PPIs of C. parvum are largely unknown because few experimental studies have been carried out. Therefore, we combined three bioinformatics methods, i.e., interolog mapping (IM), domain-domain interaction (DDI)-based inference, and a machine learning (ML) method, to jointly predict PPIs of C. parvum. Because experimental PPIs of C. parvum are lacking, we used the PPI data of Plasmodium falciparum (P. falciparum), which has the largest number of PPIs in Apicomplexa, to train an ML model to infer C. parvum PPIs. We took the consistent results of these three methods as the predicted high-confidence PPI network, which contains 4,578 PPIs covering 554 proteins. To further explore the biological significance of the constructed PPI network, we also conducted essential network and protein functional analyses, focusing mainly on hub proteins and functional modules. We anticipate that the constructed PPI network can become an important data resource to accelerate functional genomics studies of C. parvum and offer new hints for target discovery in developing drugs and vaccines.
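
The consensus step described in this abstract (keeping only interactions predicted by all three methods) can be illustrated with a minimal Python sketch. It assumes that "consistent results" means the set intersection of the three prediction lists and that interactions are undirected; the function name `consensus_ppis` and the protein identifiers are hypothetical and do not come from the article.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# An undirected interaction: (A, B) and (B, A) are treated as the same pair.
Pair = FrozenSet[str]


def normalize(pairs: Iterable[Tuple[str, str]]) -> Set[Pair]:
    """Convert ordered (protein, protein) tuples into unordered pairs."""
    return {frozenset(pair) for pair in pairs}


def consensus_ppis(*prediction_sets: Iterable[Tuple[str, str]]) -> Set[Pair]:
    """Return the interactions predicted by every method (set intersection).

    Assumption: agreement across methods is modeled as a plain intersection;
    the article may apply additional score thresholds not shown here.
    """
    normalized = [normalize(pairs) for pairs in prediction_sets]
    if not normalized:
        return set()
    return set.intersection(*normalized)


# Hypothetical predictions from the three methods (placeholder protein IDs).
im_pred = [("protA", "protB"), ("protB", "protC"), ("protC", "protD")]
ddi_pred = [("protB", "protA"), ("protC", "protB")]
ml_pred = [("protA", "protB"), ("protD", "protC"), ("protB", "protC")]

high_confidence = consensus_ppis(im_pred, ddi_pred, ml_pred)
proteins_covered = set().union(*high_confidence) if high_confidence else set()
print(len(high_confidence), "PPIs covering", len(proteins_covered), "proteins")
```

Using unordered pairs keeps a prediction such as (protB, protA) from being counted as different from (protA, protB) when the methods report partners in different orders.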

20.
Front Microbiol ; 13: 842976, 2022.
Article in English | MEDLINE | ID: mdl-35495666

ABSTRACT

Identifying human-virus protein-protein interactions (PPIs) is an essential step for understanding viral infection mechanisms and the antiviral response of the human host. Recent advances in high-throughput experimental techniques have enabled a substantial accumulation of human-virus PPI data, which has further fueled the development of machine learning-based human-virus PPI prediction methods. Emerging as a very promising method to predict human-virus PPIs, deep learning shows a powerful ability to integrate large-scale datasets, learn complex sequence-structure relationships of proteins, and convert the learned patterns into final prediction models with high accuracy. Focusing on recent progress in deep learning-powered human-virus PPI prediction, we review technical details of these newly developed methods, including dataset preparation, deep learning architectures, feature engineering, and performance assessment. Moreover, we discuss the current challenges and potential solutions and provide future perspectives on human-virus PPI prediction in the coming post-AlphaFold2 era.
