Learn Random Forest algorithms
- Brush up on Decision Trees
- Understanding Random Forest
- Random Forest algorithm - video
- How to Develop a Random Forest Ensemble in Python - good in-depth introduction and code examples
- Bagging and Random Forest Ensemble Algorithms for Machine Learning
- How to Calculate Feature Importance With Python - focus on the RF section
- Explaining Feature Importance by example of a Random Forest
- Feature Selection Using Random Forest
- Random Forests - some good theory
- Section 8.2, "Bagging, Random Forests, Boosting", in An Introduction to Statistical Learning
- Random Forest Algorithm with Python and Scikit-Learn
- scikit-learn documentation
- Understanding Random Forests Classifiers in Python
- What problems can RF solve: classification, regression, or both?
- Which issues with DT does RF solve?
- What are the strengths and weaknesses of RF?
- What are the tuning parameters for RF? Which one is the most important?
- How do we calculate feature importance from RF?
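As a starting point for the feature importance question, here is a minimal sketch using scikit-learn's impurity-based `feature_importances_` attribute. The synthetic data and all parameter values are assumptions chosen for illustration; permutation importance is another common approach and is not shown here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption for illustration): 8 features, of which
# only the first 3 are informative. shuffle=False keeps them in columns 0-2.
X, y = make_classification(
    n_samples=500,
    n_features=8,
    n_informative=3,
    n_redundant=0,
    shuffle=False,
    random_state=0,
)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# Impurity-based importances: one value per feature, normalized to sum to 1.
importances = forest.feature_importances_

# Print features from most to least important.
for idx in importances.argsort()[::-1]:
    print(f"feature {idx}: {importances[idx]:.3f}")
```

The informative columns would be expected to rank near the top, but exact values depend on the data and the random seed.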
We will use RF on the same exercises we did in the Decision Trees section.
★☆☆ - Easy
★★☆ - Medium
★★★ - Challenging
★★★★ - Bonus
Use scikit-learn's make_blobs or make_classification to generate some sample data.
Try to separate the classes using RF.
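One possible way to approach this exercise is sketched below. The number of samples, the number of centers, and the train/test split are arbitrary choices, not requirements of the exercise.

```python
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Generate three 2-D clusters (sizes and seed are illustrative choices).
X, y = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=42)

# Hold out a quarter of the points to check how well the classes separate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# Mean accuracy on the held-out points.
accuracy = clf.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

Increasing `cluster_std` makes the blobs overlap more, which is a simple way to see how accuracy degrades on harder data.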
- Here is Bank marketing dataset
- You may want to encode variables
- Use RF to predict the yes/no binary decision
- Visualize one of the trees in the forest
- Create a confusion matrix
- What is the accuracy of the model?
- Run Cross Validation to gauge the accuracy of this model
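The steps above could be sketched roughly as follows. The DataFrame here is a small synthetic stand-in, not the real bank marketing data: the column names (`age`, `job`, `balance`, `y`) and the way the target is generated are assumptions made only so the example runs on its own. With the real dataset, load it with pandas and encode whichever columns are categorical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split

# Hypothetical stand-in data; column names and values are illustrative only.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "age": rng.integers(18, 80, size=n),
    "job": rng.choice(["admin", "technician", "retired"], size=n),
    "balance": rng.normal(1000, 500, size=n),
})
# Synthetic yes/no target loosely tied to balance.
noisy_balance = df["balance"] + rng.normal(0, 200, size=n)
df["y"] = np.where(noisy_balance > 1000, "yes", "no")

# One-hot encode the categorical column and map the target to 0/1.
X = pd.get_dummies(df.drop(columns="y"), columns=["job"])
y = (df["y"] == "yes").astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Confusion matrix: rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
acc = accuracy_score(y_test, y_pred)
print(cm)
print(f"Hold-out accuracy: {acc:.3f}")

# 5-fold cross validation gives a less split-dependent accuracy estimate.
cv_scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```

On imbalanced data, accuracy alone can be misleading, so it is worth reading the confusion matrix alongside it.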
Use scikit-learn's make_regression to generate some sample data.
Fit a RandomForestRegressor to it.
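A minimal sketch of this exercise might look like the following. The sample count, number of features, and noise level are assumptions picked for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic regression problem (all parameter values are illustrative).
X, y = make_regression(
    n_samples=500, n_features=5, n_informative=3, noise=10.0, random_state=0
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

reg = RandomForestRegressor(n_estimators=200, random_state=0)
reg.fit(X_train, y_train)

# For regressors, score() returns the R^2 on the given data.
r2 = reg.score(X_test, y_test)
print(f"Test R^2: {r2:.3f}")
```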
- Use Bike sharing data
- Use RandomForestRegressor to predict bike demand
- Visualize one of the trees in the forest
- Use RMSE and R² to evaluate the model
- Use Cross Validation to thoroughly test the model performance
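The evaluation steps above could be sketched as below. Synthetic data stands in for the bike sharing dataset, and the feature names (`temp`, `humidity`, `windspeed`, `hour`) are hypothetical labels used only to make the printed tree readable. The `max_depth=6` setting is likewise an arbitrary choice.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import export_text

# Synthetic stand-in for the bike sharing data (not the real dataset).
X, y = make_regression(n_samples=400, n_features=4, noise=15.0, random_state=1)
feature_names = ["temp", "humidity", "windspeed", "hour"]  # hypothetical names

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1
)

reg = RandomForestRegressor(n_estimators=100, max_depth=6, random_state=1)
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)

# RMSE is the square root of the mean squared error.
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2 = r2_score(y_test, y_pred)
print(f"RMSE: {rmse:.2f}, R^2: {r2:.3f}")

# scikit-learn scorers follow "higher is better", so RMSE comes back negated.
cv_rmse = -cross_val_score(
    reg, X, y, cv=5, scoring="neg_root_mean_squared_error"
)
print(f"CV RMSE: {cv_rmse.mean():.2f} +/- {cv_rmse.std():.2f}")

# A forest holds many trees; print the top levels of the first one as text.
tree_text = export_text(
    reg.estimators_[0], feature_names=feature_names, max_depth=2
)
print(tree_text)
```

`sklearn.tree.plot_tree` can draw the same tree graphically if matplotlib is available.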