Proteins can have various degrees of similarity. If two proteins have highly similar amino acid sequences, it is generally assumed that they are closely related evolutionarily. As the evolutionary distance increases, the degree of similarity usually drops. Even if the sequence similarity is low, proteins may have similar functions and 3D structures. Detecting remote similarities, a core structural bioinformatics technique, is important in the study of functional and evolutionary relationships between protein families.
The RCSB PDB offers tools that quickly identify the 3D structural neighbors of a protein. For each PDB entry, the 3D Similarity tab lists structurally similar chains found with the jFATCAT-rigid algorithm among representative chains drawn from clusters at 40% sequence identity. As an example, look at the 3D Similarity tab for entry 1q6z. Representative protein chains are used because a true all vs. all comparison of every chain in the archive would require a great amount of CPU time. A detailed description of the procedure used is available.
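To illustrate why representative chains are used, the following minimal sketch counts the pairwise alignments needed for an all vs. all comparison. The chain counts are hypothetical values chosen only to show how the cost scales; they are not actual RCSB PDB statistics, and the function name is an assumption made for this example.

```python
from math import comb


def pairwise_comparisons(n_chains: int) -> int:
    """Number of unordered chain pairs among n_chains: n * (n - 1) / 2."""
    return comb(n_chains, 2)


# Hypothetical sizes (assumptions for illustration, not real archive counts).
all_chains = 500_000
representatives = 50_000

full = pairwise_comparisons(all_chains)
reduced = pairwise_comparisons(representatives)

print(f"all vs. all:      {full:,} alignments")
print(f"representatives:  {reduced:,} alignments")
print(f"reduction factor: {full / reduced:.1f}x")
```

Because the number of pairs grows quadratically, comparing one tenth as many chains requires roughly one hundredth as many alignments.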
Analyzing structural alignments can reveal novel domain architectures and unexpected structural similarities. For example, one of the top-ranking structural neighbors of 1q6z (chain A) is entry 3hww. Clicking "view" in the Structure Similarity table shows a summary view of the alignment.
The alignment of 3hww with 1q6z has an RMSD of 3 Å over the aligned Cα positions, although the two protein chains are only 14% identical in sequence. 1q6z is a benzoylformate decarboxylase (EC number 4.1.1.7), while 3hww is a 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexene-1-carboxylate synthase (EC number 2.2.1.9). Despite the low sequence identity and the divergence in function, the high structural similarity suggests that both proteins evolved from a common ancestor.
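The two measures quoted above, Cα RMSD and percent sequence identity, can be sketched as follows. This is a simplified illustration, not the jFATCAT implementation: it assumes the two chains have already been aligned and superimposed, so that equivalent Cα atoms are paired in order. The function names and the toy coordinates and sequences are hypothetical. Identity is computed here over aligned non-gap columns, which is only one of several conventions in use.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD over paired C-alpha coordinates that are already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned (non-gap) columns that hold identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Toy data (not taken from 1q6z or 3hww): every atom pair is 3 Å apart.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0), (2.0, 0.0, 3.0)]

print(f"RMSD: {ca_rmsd(toy_a, toy_b):.2f} Å")
print(f"Identity: {percent_identity('AC-DE', 'ACFDG'):.1f}%")
```

A real comparison would first have to determine which residues are equivalent and find the optimal superposition, which is the part that a structural alignment algorithm such as jFATCAT performs.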