Dataset complexity

15 important questions on Dataset complexity

The complexity of your problem consists mainly of two parts:

- Inherent complexity: do you expect a complex decision boundary?
- Representativeness: is your dataset representative enough of your problem?

What do you check to see if your problem is high dimensional?

Check whether the number of features (variables) is significantly larger than the number of samples

What can we say about the dimensionality when you have more features than samples?

A linear classifier can then perfectly separate your training data: one feature per sample is already sufficient to fit every label. Hence the problem is high dimensional = complex
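The claim above can be checked with a quick sketch: with more features than samples, a linear classifier separates even random labels perfectly. The synthetic data and `LinearSVC` settings here are illustrative assumptions.

```python
# Sketch: p > n implies (almost surely) linear separability of random labels.
# Data and classifier settings are illustrative assumptions.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 100))   # 30 samples, 100 features (p > n)
y = rng.integers(0, 2, 30)       # random labels with no real structure

# Large C ~ hard-margin: the classifier tries to fit the training set exactly.
clf = LinearSVC(C=1e6, max_iter=100_000).fit(X, y)
print("training accuracy:", clf.score(X, y))
```

Perfect training accuracy on structureless labels is exactly why a high feature-to-sample ratio signals complexity: the training fit tells you almost nothing about generalization.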

How do we measure complexity?

Learning curve

What do learning curves tell us?

Complex classifiers perform well when you have a sufficient number of training objects.
When only a small number of training objects is available, you overfit (overtrain).
Use a simple classifier when you don't have many training examples.
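A learning curve like this can be computed with scikit-learn's `learning_curve` helper; the dataset and classifier chosen here are illustrative assumptions.

```python
# Sketch: evaluate a classifier at increasing training-set sizes.
# Dataset (digits) and classifier (RBF SVM) are illustrative choices.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import learning_curve
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

sizes, train_scores, test_scores = learning_curve(
    SVC(kernel="rbf"), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

for n, tr, te in zip(sizes, train_scores.mean(axis=1), test_scores.mean(axis=1)):
    print(f"n={n:4d}  train={tr:.3f}  test={te:.3f}")
```

The gap between training and test score at small `n` is the overfitting region; as `n` grows the test score rises and the gap closes.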

Is there something more general to quantify the complexity? Learning curves are specific per dataset?

The bias-variance decomposition: mean squared error = squared bias + variance (+ irreducible noise)
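The decomposition can be estimated empirically by refitting a model on many resampled training sets; the true function, noise level, and polynomial degrees below are illustrative assumptions.

```python
# Sketch: estimate bias^2 and variance of polynomial fits by resampling.
# True function, noise level, and degrees are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
true_f = lambda x: np.sin(2 * np.pi * x)
x_test = np.linspace(0, 1, 50)

def experiment(degree, n_repeats=200, n_samples=30, noise=0.3):
    preds = []
    for _ in range(n_repeats):
        x = rng.uniform(0, 1, n_samples)
        y = true_f(x) + rng.normal(0, noise, n_samples)
        coeffs = np.polyfit(x, y, degree)        # least-squares polynomial fit
        preds.append(np.polyval(coeffs, x_test))
    preds = np.array(preds)
    bias2 = np.mean((preds.mean(axis=0) - true_f(x_test)) ** 2)
    variance = np.mean(preds.var(axis=0))
    return bias2, variance

for d in (1, 3, 9):
    b2, var = experiment(d)
    print(f"degree={d}  bias^2={b2:.3f}  variance={var:.3f}")
```

A rigid degree-1 model shows high bias and low variance; a flexible degree-9 model shows the opposite, matching the two definitions below.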

What does a high bias mean?

Classifier favors a specific solution and will consistently produce this solution

What does high variance mean?

Classifier is more flexible, thus the solution boundary may differ

What is L2 regularization?

Adding the sum of squared weights as a penalty to the loss function. Used if we want small changes in the input to not influence the output too much (it keeps the weights small)

What does L2 do to the weights?

Encourages weight values to decay towards 0, unless supported by the data. Also known as parameter shrinkage

What is Ridge regression?

The combination of L2 and linear regression
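The shrinkage effect of Ridge regression can be seen by comparing its weights with plain least squares; the synthetic data and `alpha` value below are illustrative assumptions.

```python
# Sketch: Ridge (L2-penalized linear regression) shrinks the weights
# compared with plain least squares. Data and alpha are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))            # more features than samples
w_true = np.zeros(50)
w_true[:5] = 1.0
y = X @ w_true + rng.normal(0, 0.1, 20)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)       # alpha scales the L2 penalty

# The L2 penalty decays the weight vector toward 0 (parameter shrinkage).
print("||w_ols|| =", np.linalg.norm(ols.coef_))
print("||w_ridge|| =", np.linalg.norm(ridge.coef_))
```

Increasing `alpha` shrinks the weights further; `alpha=0` recovers ordinary linear regression.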

What does L1 do to the weights?

Encourages weights to become exactly 0 = sparse model. Especially useful if the number of features is much larger than the number of samples
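The sparsity effect can be demonstrated with scikit-learn's `Lasso` (L1-penalized linear regression); the synthetic data and `alpha` value are illustrative assumptions.

```python
# Sketch: L1 regularization (Lasso) drives most weights exactly to zero.
# Synthetic data and alpha are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 200))           # far more features than samples
w_true = np.zeros(200)
w_true[:3] = 2.0                         # only 3 features truly matter
y = X @ w_true + rng.normal(0, 0.1, 40)

lasso = Lasso(alpha=0.1).fit(X, y)

# Most of the 200 weights are exactly zero; only a few survive.
print("non-zero weights:", np.sum(lasso.coef_ != 0))
```

Unlike the L2 penalty, which only shrinks weights toward zero, the L1 penalty sets many of them to exactly zero, effectively performing feature selection.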

Which hyperparameters to optimize in SVM?

- Kernel type
- Kernel parameters
- Slack

How to tune hyperparameters?

- Grid search
- Randomized search
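Both search strategies are available in scikit-learn; a grid search over the three SVM hyperparameters listed above might look like this (the grid values and dataset are illustrative assumptions).

```python
# Sketch: grid search over SVM kernel type, kernel parameter, and slack.
# Parameter-grid values and dataset are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

param_grid = {
    "kernel": ["linear", "rbf"],   # kernel type
    "gamma": [0.01, 0.1, 1.0],     # kernel parameter (used by rbf)
    "C": [0.1, 1.0, 10.0],         # slack penalty
}

search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

`RandomizedSearchCV` has the same interface but samples a fixed number of parameter settings instead of trying every combination, which scales better to large grids.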

What are the issues with hyperparameter optimization?

- No guarantees (best option may not be in samples)
- Computationally expensive
- Randomness
- Overfitting
