Structure alignment

12 important questions on Structure alignment

Explain the difference between structural superposition and structural alignment.

Structural superposition:

input = proteins with their atomic coordinates (2 PDB files) + a mapping, indicating which are corresponding residues (an alignment)
output = 2 superimposed proteins (typically by providing a rotation and translation of coordinate frame)

Structural alignment:

input = 2 proteins with their atomic coordinates (2 PDB files)
output = An alignment between two protein structures, based on the structure alone

What is the goal in superposition? And how is this achieved?

The goal is to obtain the minimal RMSD between the two proteins.

This is achieved by translating and rotating one structure onto the other:
Translate so that the centers of mass fall onto each other.
Find the rotation that minimizes the RMSD (an eigenvalue problem, solved with e.g. the Jacobi algorithm).
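
The two steps can be sketched in a few lines of Python. This is a minimal sketch, not the method from the study material: it assumes NumPy is available, it treats all atoms as having equal mass (so the center of mass is the plain centroid), and it finds the rotation with the SVD-based Kabsch procedure rather than the Jacobi eigenvalue route named above. The function name `superpose` is made up for this example.

```python
import numpy as np

def superpose(mobile, target):
    """Fit `mobile` onto `target`; both are (N, 3) arrays of paired atoms.

    Returns (rotation, translation, rmsd). Equal atom masses are assumed.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    # Step 1: translate both sets so that their centroids sit at the origin.
    mob_center = mobile.mean(axis=0)
    tgt_center = target.mean(axis=0)
    P = mobile - mob_center
    Q = target - tgt_center
    # Step 2: optimal rotation from the SVD of the 3x3 covariance matrix.
    U, _, Vt = np.linalg.svd(P.T @ Q)
    # Flip one axis if needed so the result is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_center - R @ mob_center
    # RMSD over the paired atoms after applying rotation and translation.
    fitted = mobile @ R.T + t
    rmsd = np.sqrt(((fitted - target) ** 2).sum(axis=1).mean())
    return R, t, rmsd
```

The mapping between residues is implicit here: row i of `mobile` is paired with row i of `target`, which is exactly the alignment that superposition needs as input.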

What if we want to superimpose two proteins with different sequences?

We 'only' need an alignment between the two structures.
Problem: find an optimal alignment of residues, using the structures (coordinates) of two proteins.

What is the complexity of superposition, and of structural alignment?

Superposition: O(N^p), i.e. polynomial time
Structural alignment: NP-hard (no polynomial-time algorithm is known)

For structural alignment, what are the problems of representation, optimization and scoring?

Representation: How to represent the input structures in a coordinate-independent space suitable for alignment.
Optimization: How to sample the space of possible alignment solutions between the structures.
Scoring: How to score a given alignment and determine its statistical significance.

For sequence alignment, what are the solutions to representation, optimization and scoring?

Representation: the sequence (+ a scoring matrix indicating sequence similarity)
Optimization (finding the maximal alignment score): dynamic programming (Needleman-Wunsch / Smith-Waterman)
Scoring: scoring matrix → alignment score → E-value (BLAST)
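
As an illustration of the optimization step, here is a minimal Needleman-Wunsch sketch that returns only the global alignment score. The match, mismatch and gap values are arbitrary placeholders chosen for this example; a real application would use a substitution matrix such as BLOSUM and usually affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score of strings a and b with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and first row: a prefix aligned against gaps only.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Each cell takes the best of: substitution, gap in b, gap in a.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    return score[-1][-1]
```

For example, `needleman_wunsch("GATTACA", "GATACA")` aligns six matching letters and one gap, giving 6 - 2 = 4.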

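Explain the concept of SSAP.

SSAP represents each residue by its C-beta atom.
It uses the vectors between C-beta atoms, expressed in a local reference frame defined by the backbone.
Compared with distances alone, this also adds directional information.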
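
A rough sketch of such a representation follows. The exact frame definition below (axes built from the N, CA and C atoms of each residue) is an assumption made for this example and is not taken from the SSAP paper or the study material; the helper names are hypothetical too. The point it shows is that vectors expressed in a per-residue backbone frame do not change when the whole protein is rotated or translated.

```python
import math

def sub(u, v):
    return [u[k] - v[k] for k in range(3)]

def dot(u, v):
    return sum(u[k] * v[k] for k in range(3))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def unit(u):
    norm = math.sqrt(dot(u, u))
    return [c / norm for c in u]

def local_frame(n, ca, c):
    """Orthonormal axes from one residue's backbone atoms (assumed convention)."""
    x = unit(sub(c, ca))
    z = unit(cross(x, sub(n, ca)))
    y = cross(z, x)
    return x, y, z

def view_vectors(residues):
    """For each residue i: C-beta to C-beta vectors to all other residues,
    expressed in residue i's own backbone frame.

    `residues` is a list of dicts with keys "N", "CA", "C", "CB".
    """
    views = []
    for i, ri in enumerate(residues):
        axes = local_frame(ri["N"], ri["CA"], ri["C"])
        view = []
        for j, rj in enumerate(residues):
            if i == j:
                continue
            d = sub(rj["CB"], ri["CB"])
            # Project the vector onto the local axes.
            view.append([dot(d, axis) for axis in axes])
        views.append(view)
    return views
```

Because every vector is measured relative to the residue's own backbone, two copies of the same protein in different orientations yield identical views, which is what makes the representation coordinate-independent.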

Explain the concept of double dynamic programming.

High-level matrix: each element is the resulting score of a low-level DP.
Low-level matrix: computed while keeping one pair of residues fixed (assumed to be aligned).
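
The two levels can be sketched as follows. This is a simplified illustration under stated assumptions, not SSAP itself: residues are compared by inter-residue distances instead of vectors, the similarity function 1 / (1 + difference) and the gap penalty are arbitrary choices, and the high-level matrix stores the low-level score directly, as described above. All function names are hypothetical.

```python
import math

def dp_score(score_fn, rows, cols, gap):
    """Global DP score over index ranges rows x cols using score_fn(k, l)."""
    best = [[0.0] * (len(cols) + 1) for _ in range(len(rows) + 1)]
    for r in range(1, len(rows) + 1):
        best[r][0] = r * gap
    for c in range(1, len(cols) + 1):
        best[0][c] = c * gap
    for r, k in enumerate(rows, start=1):
        for c, l in enumerate(cols, start=1):
            best[r][c] = max(best[r - 1][c - 1] + score_fn(k, l),
                             best[r - 1][c] + gap,
                             best[r][c - 1] + gap)
    return best[-1][-1]

def distance_matrix(coords):
    return [[math.dist(p, q) for q in coords] for p in coords]

def double_dp(coords_a, coords_b, gap=-0.5):
    """Return (high-level matrix, final alignment score)."""
    da, db = distance_matrix(coords_a), distance_matrix(coords_b)
    n, m = len(coords_a), len(coords_b)
    high = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # Low level: residues i and j are fixed as aligned; compare how
            # every other residue looks from i with how it looks from j.
            def sim(k, l, i=i, j=j):
                return 1.0 / (1.0 + abs(da[i][k] - db[j][l]))
            before = dp_score(sim, range(i), range(j), gap)
            after = dp_score(sim, range(i + 1, n), range(j + 1, m), gap)
            high[i][j] = before + sim(i, j) + after
    # High level: ordinary DP over the low-level scores.
    total = dp_score(lambda k, l: high[k][l], range(n), range(m), gap)
    return high, total
```

Running one low-level DP for every residue pair makes this sketch cost on the order of N^2 M^2 operations for structures of N and M residues, which is why practical implementations restrict the set of pairs that are evaluated.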

How is scoring significance determined?

The scores of random alignments follow an extreme value distribution.
Typically one needs a p-value or z-score to indicate the relevance of a structural alignment.
A z-score indicates how many standard deviations a score is from the mean.
Note: BLAST assesses significance in the same way.
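
Both quantities can be computed from a set of background scores, as in the sketch below. It assumes that such a background is available (for example scores of alignments between unrelated structures; how it is obtained is not specified in the notes). The p-value fits a Gumbel (extreme value) distribution by the method of moments, which is a crude choice made here for brevity; the function names are hypothetical.

```python
import math
import statistics

def z_score(observed, background):
    """Number of standard deviations `observed` lies above the background mean."""
    mean = statistics.mean(background)
    sd = statistics.stdev(background)
    return (observed - mean) / sd

def gumbel_p_value(observed, background):
    """P(score >= observed) under a Gumbel distribution fitted by moments."""
    sd = statistics.stdev(background)
    # Gumbel: variance = (pi * beta)^2 / 6, mean = mu + gamma * beta.
    beta = sd * math.sqrt(6) / math.pi
    mu = statistics.mean(background) - 0.5772156649 * beta
    return 1.0 - math.exp(-math.exp(-(observed - mu) / beta))
```

A high z-score corresponds to a small p-value: the further the observed score lies in the upper tail of the background distribution, the less likely it is to arise by chance.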

Structural alignments are often used as the "gold standard" for sequence alignment. Is this problematic?

It could be. Structural alignment is a hard problem of its own: the true alignment is not known, and it is not necessarily clear which alignment is the best one.

How does multiple structure alignment work?

It starts with pairwise alignments, then clusters the structures, and then aligns them iteratively (just like multiple sequence alignment, MSA).

Why is sequence alignment easier?

In sequence alignment we can take the maximum at each step of the dynamic programming.
Why are we allowed to do this?
If a point B lies on the optimal path between A and C, then the optimal subpath A-B is part of that same optimal alignment between A and C.

This does not hold for structural alignment. Realigning residues that are close together in sequence may affect the alignment score of pairs much further along the sequence, since such residues may be close in space. We need to try all (or an exponentially large number of) possible combinations to find the optimal alignment.

The questions on this page originate from the summary of the following study material:

Structural Bioinformatics
