Random Forest
10 important questions on Random Forest
How can we make decision trees (DT) more robust to overfitting and more efficient?
Why does using a single classifier have issues?
Different classifiers have different strengths and weaknesses, so picking one involves a difficult trade-off
How does an ensemble reduce overfitting?
- Averaging the votes of many diverse trees reduces variance
- Individual trees may overfit in different ways, so their errors partly cancel out in the combined (majority-vote) prediction
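The variance-reduction effect can be sketched in plain Python. The "models" below are just noisy predictors, a hypothetical stand-in for overfit trees; the point is only that averaging many of them produces a far more stable prediction than any single one:

```python
import random

random.seed(0)

def noisy_prediction(true_value=1.0, noise=1.0):
    # One overfit model: right on average, but with high variance.
    return true_value + random.gauss(0, noise)

def ensemble_prediction(n_models=100):
    # Averaging many independent noisy models shrinks the variance of
    # the combined prediction (roughly by 1/n when errors are uncorrelated).
    return sum(noisy_prediction() for _ in range(n_models)) / n_models

single = [noisy_prediction() for _ in range(1000)]
ensemble = [ensemble_prediction() for _ in range(1000)]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

print(variance(single), variance(ensemble))  # ensemble varies far less
```

In a real forest the trees' errors are correlated (they see overlapping data), so the reduction is smaller than this idealized sketch, which is exactly why RF also injects extra randomness.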
How can we introduce variability in an ensemble of classifiers?
- Change the dataset each classifier is trained on (this is what RF does, via bootstrap sampling and random feature selection)
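Bootstrap sampling, the dataset-changing mechanism behind bagging, can be sketched in a few lines of plain Python (the function name `bootstrap_sample` is illustrative):

```python
import random

random.seed(42)

def bootstrap_sample(dataset):
    # Draw len(dataset) examples *with replacement*: each tree in the
    # forest sees a slightly different dataset, which is what introduces
    # variability into the ensemble.
    n = len(dataset)
    return [dataset[random.randrange(n)] for _ in range(n)]

data = list(range(10))
sample = bootstrap_sample(data)

# On average ~63% of the original examples appear in a given sample;
# the rest are "out-of-bag" and can be used for validation.
out_of_bag = set(data) - set(sample)
print(sample, out_of_bag)
```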
Why is Bagging or Bootstrap Aggregating needed?
Why is random feature selection needed?
Which hyperparameter is always tuned in RF?
Which hyperparameters can you tune in RF?
- Ensemble method
- Bagging and random feature selection parameters
- Minimum samples for node split
- Minimum samples for a node leaf
- Number of trees
- Maximum features to consider for split
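The hyperparameters listed above map directly onto scikit-learn's `RandomForestClassifier`. A minimal sketch, assuming scikit-learn is available (the synthetic dataset and the specific values are illustrative, not tuning advice):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification data, just for demonstration.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(
    n_estimators=200,      # number of trees
    max_features="sqrt",   # maximum features considered per split
    min_samples_split=4,   # minimum samples to split a node
    min_samples_leaf=2,    # minimum samples in a leaf
    bootstrap=True,        # bagging: sample the training set with replacement
    random_state=0,
)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```

Note that `n_estimators` mainly trades computation for stability (more trees rarely hurt), while `max_features` and the `min_samples_*` parameters control how much each individual tree can overfit.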
On what is feature importance based?
Why is feature importance not straightforward?
- A high importance score does not imply a clear correlation with the class
- Impurity-based importance is biased toward features with more categories (distinct values)
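Both caveats can be illustrated with scikit-learn, assuming it is available: `feature_importances_` gives the fast impurity-based scores, while `permutation_importance` gives a less biased alternative (the synthetic dataset is illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(
    n_samples=300, n_features=6, n_informative=3, random_state=0
)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importance: fast (computed during training), but
# biased toward features with many distinct values / categories.
print(clf.feature_importances_)

# Permutation importance: shuffles one feature at a time and measures
# the drop in score; less biased, but requires extra evaluations.
result = permutation_importance(clf, X, y, n_repeats=5, random_state=0)
print(result.importances_mean)
```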