Code to generate a submission for the Walmart Trip Type Classification Kaggle competition.
This submission placed 56th on the private leaderboard out of a total of 1061 participants.
The data for the competition can be downloaded from https://www.kaggle.com/c/walmart-recruiting-trip-type-classification/data
The code expects train.csv and test.csv to be present in the same directory as the code.
Execute run_all.py to generate the features, build the models, and produce the submission.
My solution was an ensemble of 3 Neural Network models and 2 XGBoost models
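The blending step can be pictured with a minimal sketch. It assumes the five models' class probabilities are combined by (optionally weighted) averaging, which is one common approach and not necessarily the exact scheme used here. The function name `average_ensemble`, the weights, and the toy predictions are all hypothetical.

```python
import numpy as np


def average_ensemble(prob_list, weights=None):
    """Blend per-model class-probability matrices by (weighted) averaging.

    prob_list: list of arrays, each of shape (n_samples, n_classes).
    weights:   optional per-model weights; equal weights if omitted (assumption).
    """
    probs = np.stack(prob_list, axis=0)  # (n_models, n_samples, n_classes)
    if weights is None:
        weights = np.ones(len(prob_list))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # make the weights sum to 1
    blended = np.tensordot(weights, probs, axes=1)  # (n_samples, n_classes)
    # Renormalise each row so it remains a valid probability distribution.
    return blended / blended.sum(axis=1, keepdims=True)


# Toy stand-ins for the outputs of 3 NN models and 2 XGBoost models.
rng = np.random.default_rng(0)
toy_preds = [rng.dirichlet(np.ones(4), size=3) for _ in range(5)]
blended = average_ensemble(toy_preds)
```

Averaging probabilities rather than hard labels suits a competition scored on predicted class probabilities, since it keeps each model's uncertainty in the blend.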
What did I do differently compared to others in the competition?
Feature Aggregation
While others were using 5000+ features to get their scores, I used feature aggregation to reduce the number of features used for modelling.
The NN models use ~400 features, while the XGBoost models use roughly 800.
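As a rough illustration of the idea, the sketch below collapses item-level rows into one row per visit, using per-department totals and a few summary counts instead of one column per fine-grained item code. The column names (`VisitNumber`, `DepartmentDescription`, `FinelineNumber`, `ScanCount`) are assumed from the competition data, and the specific aggregates and the helper `aggregate_visits` are hypothetical, not the exact features built by this code.

```python
import pandas as pd

# Toy rows shaped like the competition's train.csv (column names are assumed).
toy = pd.DataFrame({
    "VisitNumber": [1, 1, 2, 2, 2],
    "DepartmentDescription": ["DAIRY", "PRODUCE", "DAIRY", "DAIRY", "TOYS"],
    "FinelineNumber": [100, 200, 100, 101, 300],
    "ScanCount": [2, 1, 1, 3, -1],
})


def aggregate_visits(df):
    """Build one feature row per visit from item-level rows (illustrative)."""
    # One column per department: net items scanned in that department.
    dept = df.pivot_table(
        index="VisitNumber",
        columns="DepartmentDescription",
        values="ScanCount",
        aggfunc="sum",
        fill_value=0,
    )
    dept.columns = ["dept_" + str(c) for c in dept.columns]
    # Compact per-visit summaries in place of one column per fineline number.
    summary = df.groupby("VisitNumber").agg(
        total_items=("ScanCount", "sum"),
        n_finelines=("FinelineNumber", "nunique"),
        n_returns=("ScanCount", lambda s: int((s < 0).sum())),
    )
    return dept.join(summary)


features = aggregate_visits(toy)
```

Here the number of columns grows with the number of departments rather than with the number of distinct item codes, which is what keeps the feature count in the hundreds.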
Dependencies:
numpy
scipy
pandas
sklearn
lasagne
nolearn
xgboost