kaggle-google-quest

Code for training the transformer models of the 6th place solution in the Google QUEST Q&A Labeling Kaggle competition.

A detailed description is posted in the Kaggle discussion section.

Running the code requires:

The competition data, placed in a directory named data. It can be downloaded from https://www.kaggle.com/c/google-quest-challenge/data.
The packages listed in the requirements.txt file, which need to be installed.
By default, training runs on a GPU (a single RTX 2080Ti), so CUDA must be available along with about 10GB of GPU memory. To address CUDA out-of-memory errors, lower the batch size to 1 and raise gradient accumulation to 8 inside the train.py script.
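
The batch size and gradient accumulation trade-off mentioned above can be illustrated with a small sketch. This is not the code in train.py: it uses a hypothetical one-parameter model in plain Python, and the names grad_mse and accumulated_grad are made up for illustration. It only shows why processing 8 samples one at a time and averaging their gradients gives the same update direction as one batch of 8, assuming a mean-reduced loss and equally sized micro-batches.

```python
import math


def grad_mse(w, batch):
    """Gradient w.r.t. w of the mean squared error of y = w * x over `batch`."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, batch, micro_batch_size):
    """Gradient obtained by splitting `batch` into micro-batches and accumulating."""
    chunks = [batch[i:i + micro_batch_size]
              for i in range(0, len(batch), micro_batch_size)]
    total = 0.0
    for chunk in chunks:
        # Divide by the number of accumulation steps so the sum is a mean.
        total += grad_mse(w, chunk) / len(chunks)
    return total


# Eight toy samples (hypothetical data, not the competition data).
data = [(float(x), 3.0 * x + 1.0) for x in range(1, 9)]

full_batch = grad_mse(0.5, data)                                 # one batch of 8
accumulated = accumulated_grad(0.5, data, micro_batch_size=1)    # 8 steps of 1

print(math.isclose(full_batch, accumulated))
```

Lowering the batch size reduces peak GPU memory, while raising the number of accumulation steps keeps the number of samples per optimizer update unchanged.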

To reproduce all 4 transformer models, run the following commands:

python train.py -model_name=siamese_roberta && python finetune.py -model_name=siamese_roberta
python train.py -model_name=siamese_bert && python finetune.py -model_name=siamese_bert
python train.py -model_name=siamese_xlnet && python finetune.py -model_name=siamese_xlnet
python train.py -model_name=double_albert && python finetune.py -model_name=double_albert
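
The same four command pairs can also be launched from a single script. The following is a minimal sketch, assuming it is run from the repository root; the helper names build_commands and run_all are hypothetical and not part of this repository. Only the script names and the -model_name flag are taken from the commands above.

```python
import subprocess
import sys

MODEL_NAMES = ["siamese_roberta", "siamese_bert", "siamese_xlnet", "double_albert"]


def build_commands(model_names):
    """Return the train and finetune commands for each model, in order."""
    commands = []
    for name in model_names:
        for script in ("train.py", "finetune.py"):
            commands.append([sys.executable, script, f"-model_name={name}"])
    return commands


def run_all(model_names=MODEL_NAMES):
    """Run every command in sequence, stopping at the first failure."""
    for cmd in build_commands(model_names):
        # check=True raises CalledProcessError on a non-zero exit code,
        # so finetune.py never runs after a failed train.py (like `&&`).
        subprocess.run(cmd, check=True)


# Print the commands without launching anything; call run_all() to train.
for cmd in build_commands(MODEL_NAMES):
    print(" ".join(cmd[1:]))
```

Unlike the four separate shell lines, run_all stops entirely at the first failing command instead of moving on to the next model.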

The notebooks folder contains two notebooks: stacking.ipynb implements our weighted ensembling and post-processing grid search, and oof_cvs.ipynb shows the CV scores of our models under various settings (i.e. ignoring hard targets or ignoring duplicate question rows).
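
To give a rough idea of what a weighted-ensembling grid search looks like, here is a minimal sketch in plain Python. It is not the code in stacking.ipynb: it blends only two hypothetical prediction lists for a single target, leaves out the post-processing step, and assumes Spearman rank correlation as the score. All function names and the toy numbers are made up for illustration.

```python
def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation; assumes neither input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def best_weight(pred_a, pred_b, target, steps=10):
    """Grid search over w in [0, 1] for the blend w * pred_a + (1 - w) * pred_b."""
    best_w, best_score = None, float("-inf")
    for s in range(steps + 1):
        w = s / steps
        blend = [w * x + (1 - w) * y for x, y in zip(pred_a, pred_b)]
        score = spearman(blend, target)
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score


# Toy out-of-fold predictions for one target column (hypothetical numbers).
target = [0.1, 0.4, 0.3, 0.9, 0.7, 0.2]
pred_a = [0.2, 0.3, 0.5, 0.8, 0.6, 0.4]
pred_b = [0.3, 0.5, 0.2, 0.6, 0.9, 0.1]

w, score = best_weight(pred_a, pred_b, target)
print(w, score)
```

Because the grid includes w = 0 and w = 1, the selected blend never scores below either individual model on the data used for the search.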


piyushbhuwalka/kaggle-google-quest
