Fast-Iron

Kaggle Blue Book for Bulldozers Competition

Environment: Windows 7 64-bit on an Intel quad-core machine with 12 GB RAM; Python 2.7 with Pandas, NumPy, and Scikit-Learn 0.13.1.

## How to train your model

### How to make predictions on a new test set

Before training or making predictions, the data needs to be pre-processed. Follow the steps below:

1. Place the training, appendix and test data in the `Data` folder.
2. Edit `prepare_data.py` and change the following lines to the names of the training, test and appendix data:
   - `trainData = "Data\TrainAndValid.csv"`
   - `testData = "Data\Test.csv"`
   - `appendixData = "Data\Machine_Appendix.csv"`
3. Run the script. This creates four files in `DataProcessed`. This step takes about 10-15 minutes, depending on the machine and file sizes. On request, it could be optimized considerably, down to a few minutes (possibly seconds) for the test set.
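The path settings in step 2 use backslashes inside ordinary string literals, which only works as long as no segment happens to form an escape sequence such as `\t`. As a minimal sketch (not the code in `prepare_data.py`), the same three paths could be built with `os.path.join` and checked before the long preprocessing run starts. The helper names `input_paths` and `missing_inputs` are hypothetical; only the variable names and file names come from the steps above.

```python
import os

# Hypothetical helpers, not part of prepare_data.py.
DATA_DIR = "Data"


def input_paths(train="TrainAndValid.csv", test="Test.csv",
                appendix="Machine_Appendix.csv", data_dir=DATA_DIR):
    """Return the three input paths, keyed by the variable names used in step 2."""
    return {
        "trainData": os.path.join(data_dir, train),
        "testData": os.path.join(data_dir, test),
        "appendixData": os.path.join(data_dir, appendix),
    }


def missing_inputs(paths):
    """Return the sorted names of inputs whose files do not exist on disk."""
    return sorted(name for name, path in paths.items()
                  if not os.path.isfile(path))


if __name__ == "__main__":
    missing = missing_inputs(input_paths())
    if missing:
        print("Missing input files: " + ", ".join(missing))
    else:
        print("All input files found.")
```

Checking for missing files up front avoids discovering a typo in a file name several minutes into preprocessing.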

PREDICT on Test.csv data

Simply run `train_and_predict.py`. It creates the output file `current_prediction.csv`, and the script is already configured to recreate that output. The gradient boosting regressors are already trained and serialized. The random forests need to be re-trained, because the trained models are too big to attach. Training the random forests takes 102 minutes.

TRAIN on new data

Edit `train_and_predict.py`:

- To train the GB models, change `trainGB_models` to `True`.
- To train the RF models, change `trainRF_models` to `True`.
- To save the models, change `dumpModels` to `True`.
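One way flags like these can gate training versus loading is sketched below. This is an illustration under assumptions, not the logic in `train_and_predict.py`: the flag names come from the text above, while `get_model`, its parameters, and the use of `pickle` as the serialization format are hypothetical.

```python
import os
import pickle

# Flag names as described above; the values here are only examples.
trainGB_models = False   # True: refit the gradient boosting models
trainRF_models = True    # True: refit the random forests
dumpModels = False       # True: write freshly trained models to disk


def get_model(path, train, dump, fit_fn):
    """Fit a model with fit_fn(), or load a previously pickled one from path.

    Hypothetical helper: the real script may serialize models differently.
    """
    if train:
        model = fit_fn()
        if dump:
            with open(path, "wb") as fh:
                pickle.dump(model, fh, protocol=2)
        return model
    if not os.path.isfile(path):
        raise IOError("No serialized model at %s; set the train flag to True"
                      % path)
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

With scikit-learn, `fit_fn` could be something like `lambda: GradientBoostingRegressor().fit(X, y)`, where `X` and `y` stand for the pre-processed features and target. Raising an error when no serialized model exists makes it explicit that the random forests have to be re-trained before predicting.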

RichGit101/blue-book-bulldozers-2nd