Proteome-wide structural analysis quantifies structural conservation across distant species.
Genome Res
; 2023 Nov 22.
Article
en En
| MEDLINE
| ID: mdl-37993136
ABSTRACT
Traditional evolutionary biology research mainly relies on sequence information to infer evolutionary relationships between genes or proteins. In contrast, protein structural information has long been overlooked, although structures are more conserved and closely linked to the functions than the sequences. To address this gap, we conducted a proteome-wide structural analysis using experimental and computed protein structures for organisms from the three distinct domains, including Homo sapiens (eukarya), Escherichia coli (bacteria), and Methanocaldococcus jannaschii (archaea). We reveal the distribution of structural similarity and sequence identity at the genomic level and characterize the twilight zone, where signals obtained from sequence alignment are blurred and evolutionary relationships cannot be inferred unambiguously. We find that structurally similar homologous protein pairs in the twilight zone account for â¼0.004%-0.021% of all possible protein pair combinations, which translates to â¼8%-32% of the protein-coding genes, depending on the species under comparison. In addition, by comparing the structural homologs, we show that human proteins involved in the energy supply are more similar to their E. coli homologs, whereas proteins relating to the central dogma are more similar to their M. jannaschii homologs. We also identify a bacterial GPCR homolog in the E. coli proteome that displays distinctive domain architecture. Our results shed light on the characteristics of the twilight zone and the origin of different pathways from a protein structure perspective, highlighting an exciting new frontier in evolutionary biology.
Texto completo:
1
Colección:
01-internacional
Banco de datos:
MEDLINE
Idioma:
En
Revista:
Genome Res
Asunto de la revista:
BIOLOGIA MOLECULAR
/
GENETICA
Año:
2023
Tipo del documento:
Article
País de afiliación:
China