XGBoost is so efficient and powerful for Kaggle competitions that it deserves a post of its own. As an extension of the classic gradient boosting machine (GBM), XGBoost (extreme gradient boosting) is optimized to be highly scalable, efficient, and portable.
It’s primarily developed by Tianqi Chen at the University of Washington, with the R package authored by Tong He:
- Intro by Tong He: Introduction to XGBoost R Package.
- Intro by Tianqi Chen: Introduction to Boosted Trees
XGBoost uses the same model (tree ensembles) as a random forest; the difference is in how the model is trained. XGBoost learns the trees with an additive strategy: it keeps the trees it has already learned fixed and adds one new tree at a time.
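
To make the additive strategy concrete, here is a minimal sketch in plain Python. It is not XGBoost's implementation or API: it assumes squared-error loss, one-dimensional inputs, and one-split trees (stumps), and it leaves out the regularization and second-order gradient information that XGBoost uses. The names `fit_stump`, `predict`, and `boost` are hypothetical helpers made up for this illustration. Each round leaves the existing trees untouched and fits one new stump to the current residuals.

```python
# Illustrative sketch of additive training (assumptions: squared-error loss,
# 1-D inputs, stumps as trees). All function names are hypothetical.

def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals.

    Returns (threshold, left_value, right_value).
    """
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must put points on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # No valid split (all x identical): fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return (xs[0], mean, mean)
    return best[1:]


def predict(trees, x, learning_rate):
    """The ensemble prediction is the (scaled) sum of all trees' outputs."""
    return sum(learning_rate * (left if x <= threshold else right)
               for threshold, left, right in trees)


def boost(xs, ys, n_trees=20, learning_rate=0.5):
    """Add one stump per round; trees learned earlier are never modified."""
    trees, losses = [], []
    for _ in range(n_trees):
        preds = [predict(trees, x, learning_rate) for x in xs]
        residuals = [y - p for y, p in zip(ys, preds)]
        # Mean squared error of the ensemble before this round's tree is added.
        losses.append(sum(r * r for r in residuals) / len(xs))
        trees.append(fit_stump(xs, residuals))
    return trees, losses


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5, 6]
    ys = [1.0, 1.2, 3.1, 2.9, 5.0, 5.2]
    trees, losses = boost(xs, ys)
    print("loss before first tree:", losses[0])
    print("loss before last tree: ", losses[-1])
```

A random forest, by contrast, would train each tree independently on a resampled copy of the data and average them; here every new tree depends on the errors left by the trees before it.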