ABSTRACT
Identifying bacterial protein-protein interactions and predicting the structures of the complexes could aid in understanding pathogenicity mechanisms and developing treatments for infectious diseases. Here, we developed a deep learning-based pipeline that leverages residue-residue coevolution and protein structure prediction to systematically identify and structurally characterize protein-protein interactions at proteome-wide scale. Using this pipeline, we searched through 78 million pairs of proteins across 19 human bacterial pathogens and identified 1923 confidently predicted complexes involving essential genes and 256 involving virulence factors. Many of these complexes were not previously known; we experimentally tested 12 such predictions, and half of them were validated. The predicted interactions span core metabolic and virulence pathways ranging from post-transcriptional modification to acid neutralization to outer membrane machinery and should contribute to our understanding of the biology of these important pathogens and the design of drugs to combat them.
ABSTRACT
Identifying bacterial protein-protein interactions and predicting the structures of these complexes could aid in understanding pathogenicity mechanisms and developing treatments for infectious diseases. Here we developed RoseTTAFold2-Lite, a rapid deep learning model that leverages residue-residue coevolution and protein structure prediction to systematically identify and structurally characterize protein-protein interactions at proteome-wide scale. Using this pipeline, we searched through 78 million pairs of proteins across 19 human bacterial pathogens and identified 1,923 confidently predicted complexes involving essential genes and 256 involving virulence factors. Many of these complexes were not previously known; we experimentally tested 12 such predictions, and half of them were validated. The predicted interactions span core metabolic and virulence pathways ranging from post-transcriptional modification to acid neutralization to outer-membrane machinery and should contribute to our understanding of the biology of these important pathogens and the design of drugs to combat them.
Subject(s)
Bacteria, Bacterial Proteins, Deep Learning, Virulence Factors, Humans, Bacterial Proteins/metabolism, Bacterial Proteins/genetics, Bacterial Proteins/chemistry, Virulence Factors/metabolism, Virulence Factors/genetics, Bacteria/metabolism, Bacteria/genetics, Bacteria/pathogenicity, Proteome/metabolism, Protein Interaction Mapping, Protein Binding, Essential Genes
ABSTRACT
Protein-protein interactions play critical roles in biology, but the structures of many eukaryotic protein complexes are unknown, and there are likely many interactions not yet identified. We take advantage of advances in proteome-wide amino acid coevolution analysis and deep-learning-based structure modeling to systematically identify and build accurate models of core eukaryotic protein complexes within the Saccharomyces cerevisiae proteome. We use a combination of RoseTTAFold and AlphaFold to screen through paired multiple sequence alignments for 8.3 million pairs of yeast proteins, identify 1505 likely to interact, and build structure models for 106 previously unidentified assemblies and 806 that have not been structurally characterized. These complexes, which have as many as five subunits, play roles in almost all key processes in eukaryotic cells and provide broad insights into biological function.