Kaggle_CrowdFlower

1st Place Solution for the Search Results Relevance Competition on Kaggle (https://www.kaggle.com/c/crowdflower-search-relevance)

The best single model we obtained during the competition was an XGBoost model with a linear booster, which scored 0.69322 on the Public LB and 0.70768 on the Private LB. Our final winning submission was a median ensemble of our 35 best Public LB submissions; it scored 0.70807 on the Public LB and 0.72189 on the Private LB.
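For illustration, here is a minimal sketch of training an XGBoost model with a linear booster for this kind of relevance regression. The feature matrices, labels, and hyperparameter values below are placeholders, not the exact setup from the competition; the tuned parameters live in the training logs under ./Output/Log.

```python
# Minimal sketch: XGBoost with a linear booster ("gblinear") for relevance
# regression. X_train/y_train/X_valid are placeholders; the real features come
# from ./Code/Feat/run_all.py and the hyperparameters were tuned with hyperopt.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X_train = rng.random((100, 20))                  # placeholder feature matrix
y_train = rng.integers(1, 5, 100).astype(float)  # placeholder relevance labels (1-4)
X_valid = rng.random((20, 20))

params = {
    "booster": "gblinear",           # linear booster, as in the best single model
    "objective": "reg:squarederror",
    "eta": 0.1,                      # learning rate (placeholder value)
    "lambda": 0.0,                   # L2 regularisation (placeholder value)
    "alpha": 0.0,                    # L1 regularisation (placeholder value)
}

dtrain = xgb.DMatrix(X_train, label=y_train)
model = xgb.train(params, dtrain, num_boost_round=200)

# Continuous scores; in the actual pipeline these are later mapped back to
# the integer relevance scale 1-4.
pred = model.predict(xgb.DMatrix(X_valid))
```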

FlowChart

[Figure: flow chart of the solution pipeline]

Documentation

See ./Doc/Kaggle_CrowdFlower_ChenglongChen.pdf for documentation.

Instructions

  • Download the data from the competition website and put it all into the folder ./Data.
  • Run python ./Code/Feat/run_all.py to generate features. This will take a few hours.
  • Run python ./Code/Model/generate_best_single_model.py to generate the best single model submission. In our experience, it takes only a few trials to obtain a model with the best, or similar, performance. See the training log in ./Output/Log/[Pre@solution]_[Feat@svd100_and_bow_Jun27]_[Model@reg_xgb_linear]_hyperopt.log for an example.
  • Run python ./Code/Model/generate_model_library.py to generate the model library. This is quite time-consuming, but you don't have to wait for the script to finish: you can run the next step once some models have been trained.
  • Run python ./Code/Model/generate_ensemble_submission.py to generate a submission via ensemble selection (a minimal median-ensemble sketch follows this list).
  • If you don't want to run the code, just submit the file in ./Output/Subm.
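For reference, a minimal sketch of what a median ensemble over a set of submission files could look like. The directory, glob pattern, and column names (id, prediction) are assumptions for illustration and are not guaranteed to match the repository's actual file format.

```python
# Minimal sketch of a median ensemble over Kaggle-style submission CSVs.
# The glob pattern and the "id"/"prediction" column names are assumptions.
import glob
import numpy as np
import pandas as pd

files = sorted(glob.glob("./Output/Subm/*.csv"))
subs = [pd.read_csv(f) for f in files]

# Stack predictions column-wise: one column per submission file.
preds = np.column_stack([s["prediction"].to_numpy() for s in subs])

ensemble = subs[0][["id"]].copy()
# Element-wise median across submissions, rounded and clipped to the
# competition's integer relevance scale (1-4).
ensemble["prediction"] = np.clip(np.round(np.median(preds, axis=1)), 1, 4).astype(int)
ensemble.to_csv("median_ensemble_submission.csv", index=False)
```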
