While reading the classic textbook An Introduction to Statistical Learning, I found it beneficial to reproduce the book's graphs, tables, and labs in Python. I also used this opportunity to create a literate document using Org mode in Emacs.
I hope you will find this project useful. You are welcome to send me comments and feedback on any aspect of it (e.g., Python code; use of Org mode and Emacs; organization of code, data, and text).
The table below lists, for each topic, the R package used in the book and the Python package used in this project.
Chapter | Topic | R package | Python package |
---|---|---|---|
All | Graphs | graphics | matplotlib |
All | Dataframes | base | pandas |
All | Matrix calculations | base | NumPy |
3 Linear Regression | Linear models | stats | StatsModels |
4 Classification | Generalized linear models | stats | StatsModels |
4 Classification | Linear/quadratic discriminant analysis | MASS | scikit-learn |
4 Classification | K nearest neighbors | class | scikit-learn |
6 Linear Model Selection | Ridge Regression | glmnet | scikit-learn |
6 Linear Model Selection | Lasso | glmnet | scikit-learn |
6 Linear Model Selection | Principal Component Regression | pls | scikit-learn |
6 Linear Model Selection | Partial Least Squares | pls | scikit-learn |
8 Tree-Based Methods | Trees | tree | scikit-learn |
8 Tree-Based Methods | Bagging and Random Forests | randomForest | scikit-learn |
8 Tree-Based Methods | Boosting | gbm | scikit-learn |
9 Support Vector Machines | Support Vector Classifiers | e1071 | scikit-learn |
9 Support Vector Machines | Support Vector Machines | e1071 | scikit-learn |
10 Unsupervised Learning | Principal Component Analysis | stats | scikit-learn |
10 Unsupervised Learning | Clustering Methods | stats | scikit-learn |
10 Unsupervised Learning | Dendrograms | stats | SciPy |
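
To check whether the Python packages in the table are installed before running the code, something like the following minimal sketch may help. It is not part of the project itself: the `REQUIRED` mapping and the `missing_packages` helper are hypothetical names introduced here for illustration, and the sketch only tests that each package can be found, not which version is installed.

```python
from importlib.util import find_spec

# Hypothetical mapping from the package names in the table to their
# import names. Note that scikit-learn is imported as `sklearn`.
REQUIRED = {
    "matplotlib": "matplotlib",
    "pandas": "pandas",
    "NumPy": "numpy",
    "StatsModels": "statsmodels",
    "scikit-learn": "sklearn",
    "SciPy": "scipy",
}


def missing_packages(required=REQUIRED):
    """Return the display names of packages that cannot be found."""
    return [
        name
        for name, module in required.items()
        if find_spec(module) is None  # None means the module is not installed
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All packages in the table are available.")
```

Using `find_spec` rather than importing each package keeps the check fast, since none of the libraries is actually loaded.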