Day 1: Data wrangling
\- Advanced course on Pandas
\- Tidy data
\- Lab on MovieLens dataset
\- Challenge and getting started with RAMP
Day 2: ML Pipelines and hyperparameter search
\- Column transformer and pipelines
\- Bayesian optimization and hyperparameter search
\- Learning curves
Day 3: Metrics and dealing with imbalanced data
\- Overview of the different ML metrics
\- Pitfalls of common metrics on imbalanced data
\- ML approaches for dealing with imbalanced data
Day 4: Ensemble methods and feature engineering
\- Gradient boosting
\- Stacking
\- Feature engineering
Day 5: Model inspection
\- Partial dependence plots
\- Feature importance
Challenges
In addition, the students will compete in a data challenge throughout the week.