
Intro + Pairwise alignment

10 important questions on Intro + Pairwise alignment

If we want to find the function of a newly sequenced gene with a 'lazy approach' (only bioinformatics, no biological experiments), how would we do this?

Find a set of protein sequences similar to the unknown sequence.
Identify similarities and differences.
For long protein sequences: first identify domains and then use corresponding subsequences.

Name 3 things we look at for reconstructing evolutionary and functional relationships.

Based on sequence

identity (simplest)
similarity

Homology (ultimate goal)
Other information such as 3D structure
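
To make the difference between identity and similarity concrete, here is a minimal sketch that computes both over the gap-free columns of an existing pairwise alignment. The grouping of "similar" amino acids in `SIMILAR_GROUPS` is a hypothetical simplification chosen for illustration; in practice similarity is usually read from a substitution matrix.

```python
# Hypothetical grouping of amino acids with similar properties
# (an assumption for illustration, not taken from the notes).
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def identity_and_similarity(a, b):
    """Return (identity, similarity) as fractions of the gap-free aligned columns."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap.
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("alignment has no gap-free columns")
    identical = sum(x == y for x, y in columns)
    similar = sum(
        x == y or any(x in group and y in group for group in SIMILAR_GROUPS)
        for x, y in columns
    )
    return identical / len(columns), similar / len(columns)


ident, simil = identity_and_similarity("MKV-LE", "MRVALD")
print(f"identity={ident:.2f} similarity={simil:.2f}")
```

Neither number by itself proves homology: homology is an inference about common ancestry, for which identity and similarity are only evidence.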

What did a study on 3D structure and protein evolution show?

The distance from the active site determines the rate of evolution. Residues close to the active site evolve slowly; residues far from it evolve fast.

What is a frame shift mutation?

An insertion or deletion of a number of nucleotides that is not a multiple of three, leading to a different reading frame and shifting all downstream codons. It often results in a shortened protein, which is often nonfunctional.
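
A small sketch of the effect, using a made-up DNA sequence (the sequence and the inserted base are assumptions chosen so that the shifted frame happens to contain an early stop codon):

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(dna):
    """Split a DNA string into consecutive triplets, dropping an incomplete tail."""
    return [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]


def first_stop(dna):
    """Index of the first in-frame stop codon, or None if there is none."""
    for index, codon in enumerate(codons(dna)):
        if codon in STOP_CODONS:
            return index
    return None


original = "ATGGCTGAACGTTAA"                   # ATG GCT GAA CGT TAA
mutated = original[:3] + "A" + original[3:]    # one inserted base after the start codon

print(codons(original), first_stop(original))
print(codons(mutated), first_stop(mutated))
```

In this example every codon after the insertion changes, and the stop codon moves from the fifth to the third position, so the translated product would be shorter.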

What is a DNA expression mutation?

A mutation that does not change the protein itself but its expression, e.g. where the protein is made and how much of it is made. It can lead to proteins being made at the wrong time or in the wrong cell type, or to under- or overproduction.

What can you encounter when reconstructing "evolution" with sequences?

See slide.

Name conditions for aligning sequences.

Sequences should be related through divergent evolution

so they should be homologous
and preferably orthologous:
paralogous sequences can become too distant for correct alignment (think of BAD)

Analogous sequences should not be aligned!

Sometimes a short functional motif can be detected.

What should an alignment scoring method do? How is the alignment score defined?

Produce reasonable alignments
Must assign scores to:

substitutions (match/mismatch)

DNA
Proteins

Gap penalties

linear
affine
concave

Alignment score is defined as the summed score of all alignment columns.
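
As a sketch of this definition, the function below sums the column scores of a given pairwise alignment. The match, mismatch, and gap values are hypothetical defaults, and the affine convention assumed here is that the first position of a gap costs `gap_open` and every further position costs `gap_extend`; setting both to the same value gives a linear gap penalty.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Sum the column scores of a pairwise alignment with an affine gap penalty."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    previous_gap = None  # which sequence had a gap in the previous column, if any
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            gap_in = "a" if x == "-" else "b"
            # Continuing a gap in the same sequence is cheaper than opening one.
            score += gap_extend if gap_in == previous_gap else gap_open
            previous_gap = gap_in
        else:
            score += match if x == y else mismatch
            previous_gap = None
    return score


affine = alignment_score("ACGT--A", "ACCTGGA")
linear = alignment_score("ACGT--A", "ACCTGGA", gap_open=-1, gap_extend=-1)
print(affine, linear)
```

For proteins, the fixed match and mismatch values would be replaced by a lookup in a substitution matrix.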

Explain the concept of combinatorial explosion and the solution we use.

1 gap in 1 seq: n+1 possibilities for alignment
2 gaps in 1 seq: (n + 1)n
3 gaps in 1 seq: (n + 1)n(n - 1)
*check formula later

explodes!
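
One way to see the explosion numerically is to count all global alignments of two sequences. The sketch below uses a different counting convention from the formulas above (an assumption for illustration): every distinct sequence of columns, each being an aligned pair or a gap in one of the two sequences, counts as a separate alignment.

```python
def count_alignments(m, n):
    """Count global alignments of two sequences of lengths m and n.

    Each alignment ends in an aligned pair, a gap in the first sequence,
    or a gap in the second sequence, which gives a three-term recurrence.
    """
    # Aligning against an empty sequence can be done in exactly one way.
    table = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = table[i - 1][j] + table[i][j - 1] + table[i - 1][j - 1]
    return table[m][n]


for length in (1, 2, 3, 4, 10, 20):
    print(length, count_alignments(length, length))
```

Even for two sequences of length 4 there are already 321 alignments under this convention, which is why enumerating all of them is not feasible for realistic sequence lengths.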

Solution = dynamic programming:

Breaks the alignment problem up into smaller subproblems and solves them iteratively.
Alignment is simulated as a Markov process. All sequence positions are seen as independent and identically distributed.
Chances of sequence events are independent

Therefore probabilities per aligned position are multiplied
Amino acid (AA) substitution matrices contain log-odds scores, so instead of multiplying probabilities the scores are summed
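
A minimal sketch of the dynamic programming idea for global alignment (in the style of Needleman-Wunsch), assuming a simple match/mismatch score and a linear gap penalty; the parameter values are hypothetical. Each cell holds the best score for aligning two prefixes and is computed from three smaller subproblems. Because column scores add up, which is what log-odds scores allow, the best total is the best predecessor plus the score of the last column.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of a and b with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per position.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diagonal, up, left)
    return score[-1][-1]


print(global_alignment_score("GATTACA", "GATTACA"))
print(global_alignment_score("ACGT", "AGT"))
```

The table has (len(a) + 1) * (len(b) + 1) cells, so the work grows with the product of the sequence lengths instead of with the number of possible alignments.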

Name 2 alternative alignment methods (so not global, semiglobal, or local).

De Novo sequencing

Tracks overlaps between millions of short sequence reads coming from a sequencing experiment. If N is the number of reads, on the order of N^2 pairwise overlap comparisons are required.

Reference based sequencing

Aligns short reads against a reference genome.

These algorithms are not based on evolutionary considerations per se, but match (near-)identical fragments.
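
Both ideas can be sketched naively in a few lines. The function names, the minimum overlap length, and the mismatch threshold below are assumptions for illustration; real assemblers and read mappers use indexed data structures precisely to avoid this brute-force cost.

```python
from itertools import permutations


def overlap(a, b, min_length=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if too short)."""
    for k in range(min(len(a), len(b)), min_length - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def all_overlaps(reads, min_length=3):
    """De novo style: compare every ordered pair of reads, N * (N - 1) comparisons."""
    found = {}
    for a, b in permutations(reads, 2):
        k = overlap(a, b, min_length)
        if k:
            found[(a, b)] = k
    return found


def map_read(reference, read, max_mismatches=1):
    """Reference-based style: start positions where the read fits with few mismatches."""
    hits = []
    for start in range(len(reference) - len(read) + 1):
        window = reference[start:start + len(read)]
        if sum(x != y for x, y in zip(window, read)) <= max_mismatches:
            hits.append(start)
    return hits


reads = ["ACGTAC", "TACGGA", "GGATTC"]
print(all_overlaps(reads))
print(map_read("ACGTTGCAACGT", "TTGA"))
```

Neither function uses a substitution model or gap penalties: they only look for identical or nearly identical fragments.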

The questions on this page originate from the summary of the following study material:

Algorithms in Sequence Analysis


