ABSTRACT
Homorepeats are motifs formed by reiterations of the same amino acid. They are prevalent in proteins associated with diverse physiological functions but are also linked to several pathologies. Structural characterization of homorepeats has remained largely elusive, primarily because they generally occur in the disordered regions of proteins. Here, we address this subject by combining structures derived from machine learning with conformational sampling through physics-based simulations. We find that hydrophobic homorepeats tend to fold into regular secondary structures, while hydrophilic ones predominantly exist in unstructured states. Our data show that, besides chemical character, the flexibility conferred by disorder is a critical factor driving homorepeat composition toward hydrophilicity. The formation of regular secondary structures also influences solubility, as pathologically relevant homorepeats display a direct correlation between repeat expansion, induction of helicity, and self-assembly. Our study provides critical insights into the conformational landscape of protein homorepeats and their structure-activity relationship.