Structure alignment

12 important questions on Structure alignment

Explain the difference between structural superposition and structural alignment.

Structural superposition:
  • input = 2 proteins with their atomic coordinates (2 PDB files) + a mapping indicating which residues correspond (an alignment)
  • output = 2 superimposed proteins (typically by providing a rotation and translation of coordinate frame)

Structural alignment:
  • input = 2 proteins with their atomic coordinates (2 PDB files)
  • output = An alignment between two protein structures, based on the structure alone

What is the goal in superposition? And how is this achieved?

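You want to obtain the minimal RMSD (root-mean-square deviation) between the corresponding atoms of the two proteins.

This is achieved by translating and rotating one protein onto the other:
  • Translate so that the centers of mass fall onto each other.
  • Find the rotation that minimizes the RMSD (an eigenvalue problem, solved with e.g. the Jacobi algorithm).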
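
The translate-then-rotate recipe can be sketched in a few lines of NumPy. Note that this sketch swaps in the SVD-based Kabsch method for the rotation step instead of the Jacobi eigenvalue approach named above, and it uses the unweighted centroid as a stand-in for the center of mass. The function name `superpose` and its arguments are illustrative assumptions.

```python
import numpy as np

def superpose(P, Q):
    """Superpose mobile coordinates P onto reference Q.

    P and Q are N x 3 arrays whose rows are already paired by an alignment.
    Returns the fitted copy of P, the rotation matrix, and the RMSD.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Step 1: translate both sets so that their centroids coincide at the origin
    # (assumption: unweighted centroid instead of a mass-weighted center).
    p_center = P.mean(axis=0)
    q_center = Q.mean(axis=0)
    P0 = P - p_center
    Q0 = Q - q_center
    # Step 2: rotation that minimizes the RMSD, via SVD of the covariance matrix.
    H = P0.T @ Q0
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a proper rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Apply the rotation and move P onto the centroid of Q.
    P_fit = (R @ P0.T).T + q_center
    rmsd = np.sqrt(np.mean(np.sum((P_fit - Q) ** 2, axis=1)))
    return P_fit, R, rmsd
```

If P is just a rotated and shifted copy of Q, the returned RMSD should be zero up to rounding error.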

What if we want to superimpose two proteins with different sequences?

  • We 'only' need an alignment between the two structures.
  • Problem: find an optimal alignment of residues, using the structures (coordinates) of two proteins.

What is the complexity of superposition, and of structural alignment?

  • Superposition: O(N^p), i.e. polynomial in the number of residues N
  • Structural alignment: NP-hard (no polynomial-time algorithm is known)

For structural alignment, what are the problems of representation, optimization and scoring?

  • Representation: How to represent the input structures in a coordinate-independent space suitable for alignment.
  • Optimization: How to sample the space of possible alignment solutions between the structures.
  • Scoring: How to score a given alignment and determine its statistical significance.

For sequence alignment, what are the solutions to representation, optimization and scoring?

  • Representation: Sequence (+ scoring matrix indicating sequence similarity)
  • Optimization (finding the maximal alignment score): dynamic programming (Needleman-Wunsch / Smith-Waterman)
  • Scoring: scoring matrix → alignment score → E-value (BLAST)
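
As an illustration of the optimization step, here is a minimal Needleman-Wunsch sketch for global alignment. The simple match/mismatch scores and the linear gap penalty are assumptions made for brevity. A real application would use a substitution matrix such as BLOSUM and usually affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of strings a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # F[i][j] = best score for aligning the prefixes a[:i] and b[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Take the maximum over the three possible predecessors.
            F[i][j] = max(F[i - 1][j - 1] + s, F[i - 1][j] + gap, F[i][j - 1] + gap)
    # Traceback from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return F[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

For example, `needleman_wunsch("ACGT", "AGT")` gives a score of 2: three matches and one gap.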

Explain the concept of SSAP.

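  • Residues are represented by C-beta vectors: the vectors from one residue's C-beta atom to the C-beta atoms of the other residues
  • SSAP expresses these vectors in a local reference frame defined by the residue's backbone
  • Unlike plain distances, this also adds directional information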
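
The idea of backbone-frame vectors can be sketched as follows. The exact frame definition used by SSAP is not given here, so the construction below (x along CA→C, z perpendicular to the N-CA-C plane) is an assumption chosen only to show why such vectors are independent of the global coordinate system. The function names are hypothetical.

```python
import numpy as np

def local_frame(n, ca, c):
    """Orthonormal axes (as rows) built from one residue's N, CA and C atoms.

    Assumed construction, not necessarily SSAP's own definition.
    """
    x = (c - ca) / np.linalg.norm(c - ca)
    z = np.cross(x, n - ca)
    z = z / np.linalg.norm(z)
    y = np.cross(z, x)
    return np.stack([x, y, z])

def view_vectors(N, CA, C, CB):
    """views[i, k] = vector from CB of residue i to CB of residue k,
    expressed in the backbone frame of residue i. All inputs are L x 3 arrays."""
    L = len(CA)
    views = np.zeros((L, L, 3))
    for i in range(L):
        axes = local_frame(N[i], CA[i], C[i])
        # Project every CB(i) -> CB(k) vector onto the local axes of residue i.
        views[i] = (CB - CB[i]) @ axes.T
    return views
```

Because each vector is measured in the residue's own frame, rotating or translating the whole protein leaves `views` unchanged, which is what makes the representation suitable for comparing two structures before they are superimposed.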

Explain the concept of double dynamic programming.

  • High-level matrix: its elements are the resulting scores of the low-level DP runs
  • Low-level matrix: keep one pair of residues (i, j) fixed as equivalent and run DP over their structural environments
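
A heavily simplified sketch of the two levels is given below. It assumes two hypothetical input arrays `A` (n x n x 3) and `B` (m x m x 3), where `A[i, k]` is the vector from residue i to residue k as seen from residue i. The similarity function `a / (d² + b)`, the zero gap penalty, and storing only the total low-level score in the high-level matrix are simplifying assumptions. The actual SSAP procedure accumulates scores along the low-level paths and is more involved.

```python
import numpy as np

def dp_matrix(S, gap=0.0):
    """Needleman-Wunsch recurrence over a score matrix S; returns the full DP table."""
    n, m = S.shape
    F = np.zeros((n + 1, m + 1))
    F[:, 0] = gap * np.arange(n + 1)
    F[0, :] = gap * np.arange(m + 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i, j] = max(F[i - 1, j - 1] + S[i - 1, j - 1],
                          F[i - 1, j] + gap, F[i, j - 1] + gap)
    return F

def double_dp(A, B, a=1.0, b=1.0, gap=0.0):
    """Return (aligned residue pairs, high-level matrix) for view arrays A and B."""
    n, m = A.shape[0], B.shape[0]
    high = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            # Low level: similarity of the environment seen from i and from j.
            diff = A[i][:, None, :] - B[j][None, :, :]
            S = a / (np.sum(diff ** 2, axis=2) + b)
            # Keep the pair (i, j) fixed: align what comes before and after it.
            before = dp_matrix(S[:i, :j], gap)[-1, -1]
            after = dp_matrix(S[i + 1:, j + 1:], gap)[-1, -1]
            high[i, j] = before + S[i, j] + after
    # High level: ordinary DP over the matrix of low-level scores.
    F = dp_matrix(high, gap)
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if np.isclose(F[i, j], F[i - 1, j - 1] + high[i - 1, j - 1]):
            pairs.append((i - 1, j - 1)); i -= 1; j -= 1
        elif np.isclose(F[i, j], F[i - 1, j] + gap):
            i -= 1
        else:
            j -= 1
    return pairs[::-1], high
```

The nested loops make this O(n²m²), which is acceptable for a toy example but far too slow for real proteins without further pruning.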

How is scoring significance determined?

  • Extreme value distribution
  • Typically one needs a p-value or z-score to indicate the relevance of a structural alignment
  • A z-score indicates how many standard deviations an element is from the mean
  • Note: this is the same as what BLAST does
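
The z-score definition above translates directly into code. The list of background scores is a hypothetical input, for instance scores obtained by aligning the query against unrelated structures. This sketch makes no attempt to fit an extreme value distribution, which would be needed to turn the score into a p-value.

```python
import statistics

def z_score(score, background_scores):
    """Number of standard deviations that `score` lies above the background mean."""
    mean = statistics.mean(background_scores)
    sd = statistics.stdev(background_scores)  # sample standard deviation
    return (score - mean) / sd
```

With a background of `[10, 12, 14, 16, 18]` (mean 14, sample standard deviation √10), an alignment score of 24 has a z-score of about 3.16.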

Structural Alignments are often used as “gold” standard for sequence alignment. Is this problematic?

It could be, as structural alignment is a problem of its own: the true alignment is not known, and it is not necessarily clear what the best alignment is.

How does multiple structure alignment work?

It starts with pairwise alignments, followed by clustering and iterative alignment (just like MSA).

Why is sequence alignment easier?

  • Take the maximum at each step
  • Why are we allowed to do this?
  • If a point B lies on the optimal path between A and C, then the optimal subpath A-B is part of that same optimal path A-C.


In structural alignment this does not hold: realigning residues that are close together in sequence may affect the alignment score of pairs much further along the sequence, since such residues may be close in space. We need to try all (or an exponentially large number of) possible combinations to find the optimal alignment.
