Functional site prediction

7 important questions on Functional site prediction

What is a clue that you have an enzyme?

A conserved region (visible in the MSA) located in a cleft.
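As a rough illustration of how conservation can be read off an MSA, the sketch below scores each alignment column with a normalized Shannon entropy. The function name `column_conservation`, the toy alignment, and the choice of entropy as the measure are assumptions made for this example; the notes do not prescribe a particular conservation score.

```python
import math
from collections import Counter

def column_conservation(msa):
    """Return one conservation score per alignment column, in [0, 1].

    Score = 1 - (Shannon entropy / log2(20)); gap characters are ignored.
    A score of 1.0 means the column is perfectly conserved.
    """
    n_cols = len(msa[0])
    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for col in range(n_cols):
        residues = [seq[col] for seq in msa if seq[col] != "-"]
        if not residues:
            scores.append(0.0)  # all-gap column: treat as not conserved
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores

# Hypothetical toy alignment: columns 2-4 (D, S, G) are invariant.
toy_msa = [
    "GDSGA",
    "GDSGL",
    "GDSGV",
    "ADSGK",
]
scores = column_conservation(toy_msa)
for position, score in enumerate(scores, start=1):
    print(position, round(score, 2))
```

High-scoring columns are only candidates: conservation has to be combined with structural evidence (here, location in a cleft) before it suggests an active site.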

Name some types of functional sites. Why focus on active site prediction?

  • Catalytic residues (in enzyme active sites).
  • Specificity determinants
  • Allosteric sites
  • Protein-protein interaction interfaces
  • Post-translational modification sites (e.g., phosphorylation, glycosylation, etc.)


Active sites are the focus because a lot of manually curated data is available for them.

Why do you need positive and negative training data?

  • Positive = a residue labeled as catalytic
  • Negative = a residue labeled as definitely not catalytic
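A minimal sketch of what such labeled data might look like, assuming 1-based residue positions and a hypothetical helper `label_residues`. Residues with no label at all are left out instead of being treated as negatives, which is one reading of "definitely not catalytic".

```python
def label_residues(sequence, catalytic_positions, noncatalytic_positions):
    """Build (position, amino_acid, label) training examples.

    label 1 = annotated as catalytic (positive)
    label 0 = annotated as definitely not catalytic (negative)
    Positions in neither set are skipped, since their status is unknown.
    """
    overlap = set(catalytic_positions) & set(noncatalytic_positions)
    if overlap:
        raise ValueError(f"positions labeled both ways: {sorted(overlap)}")
    examples = []
    for pos, aa in enumerate(sequence, start=1):  # 1-based positions
        if pos in catalytic_positions:
            examples.append((pos, aa, 1))
        elif pos in noncatalytic_positions:
            examples.append((pos, aa, 0))
    return examples

# Hypothetical toy protein: H3 and S5 are catalytic; 1, 2, 7 are known negatives.
examples = label_residues("MKHDSGA", {3, 5}, {1, 2, 7})
print(examples)
```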

What should you be careful about when k-fold cross-validation is not used?

  • The examples that end up in the test set could be unusually easy
  • or they could be related to examples in the training set
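Both risks can be reduced by rotating the test set over k folds and by keeping related proteins together. The sketch below is one way to do that in plain Python, assuming each protein has already been assigned to a family; the function `grouped_k_fold`, the protein IDs, and the family names are hypothetical.

```python
from collections import defaultdict

def grouped_k_fold(protein_to_family, k):
    """Yield (train, test) lists of protein IDs for k folds.

    Every protein appears in exactly one test fold, and all members of a
    family are placed in the same fold, so no family is split between
    training and test data.
    """
    families = defaultdict(list)
    for protein, family in protein_to_family.items():
        families[family].append(protein)

    # Greedy balancing: place the largest families first, each into the
    # fold that currently holds the fewest proteins.
    folds = [[] for _ in range(k)]
    for family in sorted(families, key=lambda f: (-len(families[f]), f)):
        smallest = min(range(k), key=lambda i: len(folds[i]))
        folds[smallest].extend(families[family])

    for i in range(k):
        test = folds[i]
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, test

# Hypothetical toy data set.
toy_families = {
    "P1": "famA", "P2": "famA",
    "P3": "famB",
    "P4": "famC", "P5": "famC",
    "P6": "famD",
}
splits = list(grouped_k_fold(toy_families, k=3))
for train, test in splits:
    print("train:", train, "test:", test)
```

Because every fold serves as the test set once, a single lucky (easy) split cannot dominate the reported performance.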

Describe the original CSA dataset.

  • Original hand-annotated entries, derived from the primary literature. References for these entries are given.
  • Homologous entries, found by PSI-BLAST to one of the original entries (using an E-value cutoff of 0.00005).
    • The equivalent residues, which align in sequence to the catalytic residues found in the original entry, are documented. (Note: they check for agreement at these sites.)
  • KS comments: the second type of entry is a form of annotation transfer (you should understand the fundamental assumption underlying this approach).
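The annotation-transfer step for homologous entries can be sketched as follows. This is not the CSA pipeline itself: it assumes a pairwise alignment is already available as two gapped strings, uses 1-based positions, and treats "agreement" as identical residue types, which is an assumption of this example. The function name and sequences are hypothetical.

```python
def transfer_catalytic_residues(aligned_ref, aligned_hom, ref_catalytic_positions):
    """Map catalytic positions from a reference entry onto a homolog.

    aligned_ref / aligned_hom: equal-length aligned sequences, '-' for gaps.
    ref_catalytic_positions: 1-based positions in the ungapped reference.
    Returns tuples (ref_pos, hom_pos, ref_aa, hom_aa, agrees); hom_pos and
    hom_aa are None when the homolog has a gap at that column.
    """
    if len(aligned_ref) != len(aligned_hom):
        raise ValueError("aligned sequences must have the same length")
    results = []
    ref_pos = hom_pos = 0
    for ref_aa, hom_aa in zip(aligned_ref, aligned_hom):
        if ref_aa != "-":
            ref_pos += 1
        if hom_aa != "-":
            hom_pos += 1
        if ref_aa != "-" and ref_pos in ref_catalytic_positions:
            if hom_aa == "-":
                results.append((ref_pos, None, ref_aa, None, False))
            else:
                results.append((ref_pos, hom_pos, ref_aa, hom_aa, ref_aa == hom_aa))
    return results

# Hypothetical example: reference residues H3 and S5 are annotated as catalytic.
mapping = transfer_catalytic_residues("MKH-DSGA", "M-HQDAGA", {3, 5})
for row in mapping:
    print(row)
```

In this toy case H3 maps to an identical H2 in the homolog, while S5 aligns to an alanine, so only the first site agrees. The underlying assumption is that residues aligned to catalytic residues in a homolog play the same role.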

What information is known on active sites from 3D?

  • Most often in pockets/clefts
  • Solvent accessible
    • Must be somewhat accessible (otherwise can’t interact with a substrate!)
    • Surprisingly, they are not always highly accessible (see the Bartlett review with results of B-factor analysis)
  • Secondary structure: More often on loop regions

What is the motivation for INTREPID?

  • Biologists are typically interested in a single protein, not the whole family
  • Not all proteins in large diverse families use all positions identically
  • Active sites may be perfectly conserved, but other key positions are more likely to vary across subtypes
  • Structural divergence and alignment errors will contribute noise
