Trees 🌲🌲🌲
1 min read
Regression tree
Recursive binary splitting
Consider all predictors and all possible cutpoint values, and choose the split that yields the lowest RSS
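A rough sketch of a single recursive-binary-splitting step (the helper name `best_split` and the data layout are illustrative, not from the text): scan every predictor and every observed cutpoint, and keep the split with the lowest total RSS.

```python
import numpy as np

def best_split(X, y):
    """Find the (predictor, cutpoint) pair whose binary split gives the lowest RSS.

    X: (n, p) array of predictors, y: (n,) response vector.
    Returns (best_j, best_s, best_rss).
    """
    best_j, best_s, best_rss = None, None, np.inf
    n, p = X.shape
    for j in range(p):                       # consider every predictor
        for s in np.unique(X[:, j]):         # and every observed cutpoint
            left, right = y[X[:, j] < s], y[X[:, j] >= s]
            if len(left) == 0 or len(right) == 0:
                continue
            # RSS of each half-plane, using the region mean as the prediction
            rss = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
            if rss < best_rss:
                best_j, best_s, best_rss = j, s, rss
    return best_j, best_s, best_rss
```

Growing the tree then means applying this step recursively to each resulting region until a stopping rule is hit.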
Tree pruning
Cost complexity pruning
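A cost-complexity (weakest-link) pruning sketch with scikit-learn; the data set and the cross-validation choice are illustrative. `cost_complexity_pruning_path` returns the sequence of effective alphas for the grown tree, and refitting with a chosen `ccp_alpha` gives the corresponding pruned subtree.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Effective alphas along the cost-complexity pruning path of a fully grown tree
path = DecisionTreeRegressor(random_state=0).cost_complexity_pruning_path(X, y)

# Larger alpha -> heavier penalty on tree size -> smaller subtree.
# Pick the alpha with the best cross-validated score.
scores = [cross_val_score(DecisionTreeRegressor(ccp_alpha=a, random_state=0), X, y).mean()
          for a in path.ccp_alphas]
best_alpha = path.ccp_alphas[int(np.argmax(scores))]

pruned_tree = DecisionTreeRegressor(ccp_alpha=best_alpha, random_state=0).fit(X, y)
```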
Classification trees
Make recursive splits using the classification error rate, the Gini index, or cross-entropy
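In standard notation, with p̂_mk the proportion of class k observations in region m, the latter two purity measures are:

$$
G = \sum_{k=1}^{K} \hat{p}_{mk}\,(1 - \hat{p}_{mk}),
\qquad
D = -\sum_{k=1}^{K} \hat{p}_{mk}\,\log \hat{p}_{mk}
$$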
Categorical variables are fine as-is; no need for dummy variables
Trees are easy to explain, display, and interpret
Tree prediction performance can be improved by bagging and boosting, which aggregate many decision trees
Bagging
Trees have high variance, and bagging is a way to reduce that variance
Bootstrap aggregation
Create many data sets from the same data by sampling with replacement
Create many trees and aggregate them
To estimate the test error, there is no need for CV or a validation set; instead use
 out-of-bag (OOB) observations
 out-of-bag (OOB) error
Bagging improves prediction accuracy but decreases interpretability
Important predictors can be identified by averaging the decrease in RSS (regression trees) or in the Gini index (classification trees) due to splits on each predictor
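A minimal bagging sketch with scikit-learn (the data and settings are illustrative). With oob_score=True, each tree is evaluated on the observations left out of its bootstrap sample, giving the OOB estimate of test performance mentioned above.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)

bag = BaggingRegressor(
    n_estimators=200,   # number of bootstrap samples / bagged trees
    oob_score=True,     # score each tree on its out-of-bag observations
    random_state=0,
).fit(X, y)             # the default base learner is a regression tree

print(bag.oob_score_)   # OOB R^2: an estimate of test-set performance
```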
Boosting
Just as we learn from previous mistakes, models can learn from previous models

Sequential training

AdaBoost gives more weight to incorrectly classified samples in the next iteration
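A hand-rolled AdaBoost-style loop (illustrative data, 50 rounds) to show the reweighting: after each stump, the misclassified samples get larger weights, so the next stump concentrates on them. In practice sklearn.ensemble.AdaBoostClassifier does all of this for you.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
y_pm = np.where(y == 1, 1, -1)            # recode labels as +1 / -1

n = len(y)
w = np.full(n, 1.0 / n)                   # start with uniform sample weights
stumps, alphas = [], []

for _ in range(50):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y_pm, sample_weight=w)
    miss = stump.predict(X) != y_pm
    err = np.clip(w[miss].sum() / w.sum(), 1e-10, 1 - 1e-10)
    alpha = 0.5 * np.log((1 - err) / err)          # vote weight of this stump
    w *= np.exp(np.where(miss, alpha, -alpha))     # upweight the mistakes
    w /= w.sum()
    stumps.append(stump)
    alphas.append(alpha)

# Final classifier: sign of the weighted vote over all stumps
votes = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
print((np.sign(votes) == y_pm).mean())    # training accuracy
```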

Gradient boosting
 XGBoost: extreme gradient boosting
 Since boosting is sequential, it's slow; XGBoost speeds it up by parallelizing the tree construction within each boosting round
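A gradient-boosting sketch with scikit-learn (hyperparameters are illustrative); xgboost's XGBRegressor exposes a similar fit/predict interface while parallelizing the per-round tree construction.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=10, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

gbm = GradientBoostingRegressor(
    n_estimators=500,    # number of sequentially fitted trees
    learning_rate=0.05,  # shrinkage applied to each tree's contribution
    max_depth=3,         # small trees; each one nudges the previous fit toward the residuals
    random_state=0,
).fit(X_train, y_train)

print(gbm.score(X_test, y_test))  # test-set R^2
```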
Random forests 🌳🌴🌲
At each split, select a random sample of m predictors as split candidates (typically m ≈ √p)
This decorrelates the trees
As measures of node purity, the Gini index and cross-entropy can be used.
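A random-forest sketch (illustrative data): max_features sets the size m of the random predictor subset tried at each split, which is what decorrelates the trees, and criterion selects the node-purity measure.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, n_informative=4, random_state=0)

rf = RandomForestClassifier(
    n_estimators=300,      # number of bootstrapped trees
    max_features="sqrt",   # m ≈ sqrt(p) predictors considered at each split
    criterion="gini",      # node purity measure ("entropy" for cross-entropy)
    oob_score=True,        # out-of-bag estimate of test accuracy
    random_state=0,
).fit(X, y)

print(rf.oob_score_)            # OOB accuracy
print(rf.feature_importances_)  # mean impurity decrease per predictor
```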
January 16, 2020