XGBoost is typically the most powerful ML option whenever you’re working with structured data. In this SuperDataScience episode, our Chief Data Scientist, Jon Krohn, talks to world-leading XGBoost expert Matt Harrison about how it works and how to make the most of it.
Matt:
• Is the author of seven best-selling books on Python and Machine Learning.
• Published his most recent book, “Effective XGBoost”, in March.
• Teaches “Exploratory Data Analysis with Python” at Stanford University.
• Has taught Python at leading global organizations like NASA, Netflix, and Qualcomm through his consultancy, MetaSnake.
• Previously worked as a CTO and Software Engineer.
• Holds a degree in Computer Science from Stanford.
This episode will appeal primarily to practicing data scientists who are keen to learn XGBoost, or to deepen their existing expertise in it, from a world-leading educator on the library.
In this episode, Matt details:
• Why XGBoost is the go-to library for attaining the highest accuracy when building a classification model.
• Modeling situations where XGBoost should not be your first choice.
• The XGBoost hyperparameters to adjust to squeeze every bit of juice out of your tabular training data, plus his recommended library for automating hyperparameter selection.
• His top Python libraries for other XGBoost-related tasks such as data preprocessing, visualizing model performance, and model explainability.
• Languages beyond Python that have convenient wrappers for applying XGBoost.
• Best practices for communicating XGBoost results to non-technical stakeholders.
The SuperDataScience podcast is available on all major podcasting platforms, on YouTube, and at SuperDataScience.com.