ABSTRACT
Protein complexes are key functional units in cellular processes. High-throughput techniques, such as co-fractionation coupled with mass spectrometry (CF-MS), have advanced protein complex studies by enabling global interactome inference. However, dealing with complex fractionation characteristics to define true interactions is not a simple task, since CF-MS is prone to false positives due to the co-elution of non-interacting proteins by chance. Several computational methods have been designed to analyze CF-MS data and construct probabilistic protein-protein interaction (PPI) networks. Current methods usually first infer PPIs based on handcrafted CF-MS features, and then use clustering algorithms to form potential protein complexes. While powerful, these methods have two weaknesses: handcrafted features based on domain knowledge might introduce bias, and the severely imbalanced PPI data make the models prone to overfitting. To address these issues, we present a balanced end-to-end learning architecture, Software for Prediction of Interactome with Feature-extraction Free Elution Data (SPIFFED), which integrates feature representation from raw CF-MS data and interactome prediction with a convolutional neural network. SPIFFED outperforms the state-of-the-art methods in predicting PPIs under conventional imbalanced training. When trained with balanced data, SPIFFED shows greatly improved sensitivity for true PPIs. Moreover, the ensemble SPIFFED model provides different voting schemes to integrate predicted PPIs from multiple CF-MS datasets. Using the clustering software ClusterONE, SPIFFED allows users to infer high-confidence protein complexes depending on the CF-MS experimental designs. The source code of SPIFFED is freely available at: https://github.com/bio-it-station/SPIFFED.
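The abstract mentions voting schemes for integrating PPI predictions from several CF-MS experiments but does not specify them. The following is a minimal Python sketch of what such voting could look like, not SPIFFED's actual implementation: the function name `vote`, the scheme names `any`, `majority` and `all`, the 0.5 score threshold, and the example scores are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Set

Pair = FrozenSet[str]  # an unordered protein pair, e.g. frozenset({"A", "B"})


def vote(predictions: List[Dict[Pair, float]],
         threshold: float = 0.5,
         scheme: str = "majority") -> Set[Pair]:
    """Combine per-experiment PPI scores into one set of accepted pairs.

    Hypothetical schemes (assumed for illustration, not taken from SPIFFED):
      "any"      - pair accepted if at least one experiment supports it
      "majority" - pair accepted if more than half of the experiments support it
      "all"      - pair accepted only if every experiment supports it
    """
    n = len(predictions)
    if n == 0:
        raise ValueError("at least one experiment is required")

    # Count, for each pair, how many experiments score it at or above threshold.
    support: Dict[Pair, int] = {}
    for scores in predictions:
        for pair, score in scores.items():
            if score >= threshold:
                support[pair] = support.get(pair, 0) + 1

    if scheme == "any":
        needed = 1
    elif scheme == "majority":
        needed = n // 2 + 1
    elif scheme == "all":
        needed = n
    else:
        raise ValueError(f"unknown voting scheme: {scheme}")

    return {pair for pair, count in support.items() if count >= needed}


# Made-up scores for three experiments and three candidate pairs.
AB, AC, BC = frozenset({"A", "B"}), frozenset({"A", "C"}), frozenset({"B", "C"})
experiments = [
    {AB: 0.9, AC: 0.6, BC: 0.2},
    {AB: 0.8, AC: 0.3, BC: 0.1},
    {AB: 0.7, AC: 0.7, BC: 0.6},
]

for name in ("any", "majority", "all"):
    accepted = vote(experiments, scheme=name)
    print(name, sorted("-".join(sorted(p)) for p in accepted))
```

Stricter schemes trade sensitivity for confidence: `any` keeps every pair supported at least once, while `all` keeps only pairs supported in every experiment. The accepted pairs could then be passed to a clustering tool to form candidate complexes.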
Subjects
Protein Interaction Mapping , Proteins , Protein Interaction Mapping/methods , Proteins/chemistry , Algorithms , Protein Interaction Maps , Software
ABSTRACT
BACKGROUND: Ciliates are an ancient and diverse eukaryotic group found in various environments. A unique feature of ciliates is their nuclear dimorphism, by which two types of nuclei, the diploid germline micronucleus (MIC) and the polyploid somatic macronucleus (MAC), are present in the same cytoplasm and serve different functions. During each sexual cycle, ciliates develop a new macronucleus in which newly fused genomes are extensively rearranged to generate functional minichromosomes. Interestingly, each ciliate species seems to have its own way of processing genomes, providing a diversity of resources for studying genome plasticity and its regulation. Here, we sequenced and analyzed the macronuclear genome of different strains of Paramecium bursaria, a highly divergent species of the genus Paramecium which can stably establish endosymbioses with green algae. RESULTS: We assembled a high-quality macronuclear genome of P. bursaria and further refined genome annotation by comparing population genomic data. We identified several species-specific expansions in protein families and gene lineages that are potentially associated with endosymbiosis. Moreover, we observed an intensive chromosome breakage pattern that occurred during or shortly after sexual reproduction and contributed to highly variable gene dosage throughout the genome. However, patterns of copy number variation were highly correlated among genetically divergent strains, suggesting that copy number is adjusted by some regulatory mechanisms or natural selection. Further analysis showed that genes with low copy number variation among populations tended to function in basic cellular pathways, whereas highly variable genes were enriched in environmental response pathways. CONCLUSIONS: We report programmed DNA rearrangements in the P. bursaria macronuclear genome that allow cells to adjust gene copy number globally according to individual gene functions.
Our results suggest that large-scale gene copy number variation may represent an ancient mechanism for cells to adapt to different environments.
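The abstract reports that copy number patterns were highly correlated among strains without stating how the correlation was measured. As a hedged illustration only, the sketch below computes a Pearson correlation between per-gene copy number estimates for two strains. The function name `pearson`, the choice of Pearson's coefficient, and the toy values are assumptions, not the authors' analysis.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy per-gene copy number estimates (made-up values, same gene order in both strains).
strain_a = [1.0, 2.0, 4.0, 8.0]
strain_b = [2.0, 4.0, 8.0, 16.0]

r = pearson(strain_a, strain_b)
print(f"r = {r:.3f}")
```

A coefficient near 1 across many genes would be consistent with copy number being similarly adjusted in different strains. Real data would typically require normalizing read depth before such a comparison.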
Subjects
Genome, Protozoan , Paramecium/genetics , Macronucleus/genetics , Metagenomics
ABSTRACT
Protein complexes are fundamental to all cellular processes, so understanding their evolutionary history and assembly processes is important. Gene duplication followed by divergence is considered a primary mechanism for diversifying protein complexes. Nonetheless, to what extent assembly of present-day paralogous complexes has been constrained by their long evolutionary pathways and how cross-complex interference is avoided remain unanswered questions. Subunits of protein complexes are often stabilized upon complex formation, whereas unincorporated subunits are degraded. How such cooperative stability influences protein complex assembly also remains unclear. Here, we demonstrate that subcomplexes determined by cooperative stabilization interactions serve as building blocks for protein complex assembly. We further develop a protein stability-guided method to compare the assembly processes of paralogous complexes in cellulo. Our findings support the idea that the oligomeric state and structural organization of paralogous complexes can be maintained even if their assembly processes are rearranged. Our results indicate that divergent assembly processes of paralogous complexes not only enable the complexes to evolve new functions, but also reinforce their segregation by establishing incompatibility against deleterious hybrid assemblies.
Subjects
Multiprotein Complexes , Multiprotein Complexes/metabolism , Multiprotein Complexes/chemistry , Multiprotein Complexes/genetics , Protein Stability , Evolution, Molecular , Protein Subunits/metabolism , Protein Subunits/chemistry , Protein Multimerization , Protein Binding , Gene Duplication
ABSTRACT
Dobzhansky-Muller incompatibilities represent a major driver of reproductive isolation between species. They are caused when interacting components encoded by alleles from different species cannot function properly when mixed. At incipient stages of speciation, complex incompatibilities involving multiple genetic loci with weak effects are frequently observed, but the underlying mechanisms remain elusive. Here we show perturbed proteostasis leading to compromised mitosis and meiosis in Saccharomyces cerevisiae hybrid lines carrying one or two chromosomes from Saccharomyces bayanus var. uvarum. Levels of proteotoxicity are correlated with the number of protein complexes on replaced chromosomes. Proteomic approaches reveal that multi-protein complexes with subunits encoded by replaced chromosomes tend to be unstable. Furthermore, hybrid defects can be alleviated or aggravated, respectively, by up- or down-regulating the ubiquitin-proteasomal degradation machinery, suggesting that destabilized complex subunits overburden the proteostasis machinery and compromise hybrid fitness. Our findings reveal the general role of impaired protein complex assembly in complex incompatibilities.