ABSTRACT
Eukaryotic precursor mRNA splicing is carried out by a highly complex RNA-protein machinery. Serine/arginine-rich (SR) proteins play essential roles in both constitutive and alternative splicing of precursor mRNA and have been suggested to be crucial in plant-specific forms of developmental regulation and environmental adaptation. Despite their functional importance, little is known about their origin and evolutionary history. SR splicing factors have a modular organization featuring at least one RNA recognition motif (RRM) domain and a carboxyl-terminal region enriched in serine/arginine dipeptides. To investigate the evolution of SR proteins, we infer phylogenies for more than 12,000 RRM domains representing more than 200 broadly sampled organisms. Our analyses reveal that the RRM domain is not restricted to eukaryotes and that all prototypical SR proteins, including the plant-specific SR45 protein, share a single ancient origin. Based on these findings, we propose a scenario for their diversification into four natural families, each corresponding to a main SR architecture, and a dozen subfamilies, for which we profile both sequence conservation and composition. Finally, using operational criteria for computational discovery and classification, we catalog SR proteins in 20 model organisms, with a focus on green algae and land plants. Altogether, our study confirms the homogeneity and antiquity of SR splicing factors while establishing robust phylogenetic relationships between animal and plant proteins, which should enable functional analyses of less well-characterized SR family members, especially in green plants.