Decision Tree Induction
6 important questions on Decision Tree Induction
Given: suppose (according to slide 8) we have an information source that generates 4 symbols with probabilities, respectively, p1 = 1/2, p2 = 1/4, p3 = 1/8, p4 = 1/8.
Using the (‘directly decodable’) code s1 ~ 0, s2 ~ 10, s3 ~ 110, s4 ~ 111,
a) calculate the expected code length per symbol (note that each code length here equals the symbol's self-information),
b) compare the result of a) with the value of the entropy, and
c) conclude about what you have discovered.
a. Expected code length = 1/2·1 + 1/4·2 + 1/8·3 + 1/8·3 = 1.75 bits per symbol.
b. Info(S) = 1/2·I(s1) + 1/4·I(s2) + 1/8·I(s3) + 1/8·I(s4) = 1/2·1 + 1/4·2 + 1/8·3 + 1/8·3 = 1.75 bits, i.e. the entropy equals the expected code length.
c. “expected information” = “weighted sum (average) of self-information”
•“Weights” are the probabilities!
•Because every pi here is a power of 1/2, each code length matches the self-information exactly, so the code achieves the entropy: it is an optimal code for this source.
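To make the arithmetic explicit, here is a minimal Python sketch (an addition, not from the slides) that reproduces a) and b) for this source:

import math

# Symbol probabilities and the lengths of the prefix code
# s1 ~ 0, s2 ~ 10, s3 ~ 110, s4 ~ 111.
probs = [1/2, 1/4, 1/8, 1/8]
code_lengths = [1, 2, 3, 3]

# a) expected code length: probability-weighted average of the code lengths
expected_length = sum(p * l for p, l in zip(probs, code_lengths))

# b) entropy: probability-weighted average of the self-information -log2(p)
entropy = sum(-p * math.log2(p) for p in probs)

print(expected_length)  # 1.75
print(entropy)          # 1.75 -> the expected code length equals the entropy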
What is H and what is its maximum value?
H is the entropy: the expected information of a probability distribution, H = −Σ pi·log2 pi.
Note that H is a function of a probability distribution!
Which means H can also be applied to conditional probabilities (more later).
For n possible outcomes the maximum is H = log2 n, reached when all outcomes are equally probable.
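A quick numerical check (illustrative, not from the slides) of the range of H for n = 4 outcomes:

import math

def H(probs):
    # Entropy of a discrete distribution, in bits.
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(H([1/2, 1/4, 1/8, 1/8]))  # 1.75
print(H([1/4, 1/4, 1/4, 1/4]))  # 2.0 -> the maximum, log2(4), for the uniform distribution
print(H([1.0, 0.0, 0.0, 0.0]))  # 0.0 -> a certain outcome carries no information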
When does an attribute A have the highest information gain, and will that attribute be chosen?
The attribute whose split yields the largest reduction in entropy has the highest information gain, and yes, that attribute is chosen as the split attribute at the current node (see the sketch below).
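A minimal Python sketch of this selection rule, using made-up attribute names and toy data (the helper functions below are illustrative, not the course's code):

import math

def entropy(labels):
    # Entropy of the class-label distribution in a node, in bits.
    n = len(labels)
    return sum(-(labels.count(c) / n) * math.log2(labels.count(c) / n)
               for c in set(labels))

def information_gain(rows, labels, attribute):
    # Parent entropy minus the weighted entropy of the children
    # obtained by splitting on `attribute` (a key into each row dict).
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    weighted_children = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - weighted_children

rows = [{'outlook': 'sunny', 'windy': False},
        {'outlook': 'sunny', 'windy': True},
        {'outlook': 'rain',  'windy': False},
        {'outlook': 'rain',  'windy': True}]
labels = ['no', 'no', 'yes', 'yes']

print(information_gain(rows, labels, 'outlook'))  # 1.0 -> highest gain, chosen for the split
print(information_gain(rows, labels, 'windy'))    # 0.0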
What is a fundamental problem of all learning algorithms?
Overfitting: the learned hypothesis fits the training data (including its noise) so closely that it generalises poorly to unseen data.
How would you counteract overfitting?
Use a training set to learn/induce a (hypothesis) tree
Use a separate validation set to select the ‘best performing’ DT:
•e.g., measure performance on validation set while expanding the DT
•stop expansion of the DT if performance starts to decrease (!)
•Use a separate test set to estimate the ‘true performance’ of the DT selected (see the sketch below)…
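A rough Python sketch of this train/validation/test recipe, assuming scikit-learn and a synthetic dataset (controlling tree size via max_depth stands in for ‘expanding the DT’ step by step):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Three disjoint sets: train (60%), validation (20%), test (20%).
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

best_depth, best_val = None, -1.0
for depth in range(1, 21):                      # grow ('expand') deeper and deeper trees
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)                  # learn/induce on the training set only
    val_acc = tree.score(X_val, y_val)          # select the best tree on the validation set
    if val_acc > best_val:
        best_depth, best_val = depth, val_acc

final = DecisionTreeClassifier(max_depth=best_depth, random_state=0).fit(X_train, y_train)
print('chosen depth:', best_depth)
print('test accuracy:', final.score(X_test, y_test))  # estimate of the 'true performance'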
Why is there a need for three sets, namely a training set, a validation set and a test set?
Each set plays a different role: the training set is used to induce the tree, the validation set is used to choose between candidate trees (so its performance estimate becomes optimistically biased by that choice), and the test set is kept untouched so that it can still give an unbiased estimate of the true performance.