zh4ngx/allen-ai-challenge

Tackling Kaggle's Allen AI Challenge

Progress on my blog

Run Examples

Both examples assume you are running everything from the current directory. Models and submissions are timestamped automatically and written to the --model and --project directories, respectively.
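As a rough illustration of the timestamping, a script could build its output path along these lines. This is only a sketch: the helper name `timestamped_path` and the `%Y%m%d-%H%M%S` format are assumptions made for illustration, not taken from the scripts in this repository.

```python
import os
from datetime import datetime


def timestamped_path(directory, extension, now=None):
    """Return <directory>/<timestamp><extension> (hypothetical helper)."""
    now = now or datetime.now()
    # Assumed timestamp format; the real scripts may name files differently.
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return os.path.join(directory, stamp + extension)


if __name__ == "__main__":
    # Prints a path such as model/lsa/<timestamp>.model
    print(timestamped_path("model/lsa", ".model"))
```

Whatever the actual format is, the file name it produces is what goes in place of `timestamp` in the `--model model/lsa/timestamp.model` arguments below.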

LSA

$ python -m gensim.scripts.make_wiki enwiki-latest-pages-articles.xml.bz2 wiki_en

$ python lsa_train.py \
    --dictionary wiki_en_wordids.txt \
    --corpus wiki_en_tfidf.mm \
    --model model/lsa

$ python lsa_evaluate.py \
    --model model/lsa/timestamp.model \
    --data training.tsv
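For orientation, an evaluation step for multiple-choice questions can compare the question against each candidate answer and count how often the best match is the labelled answer. The sketch below shows that idea with plain cosine similarity over toy vectors. The function names, the vector inputs, and the data layout are all assumptions for illustration; this is not the contents of `lsa_evaluate.py`.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pick_answer(question_vec, answer_vecs):
    """Return the label of the answer whose vector is most similar to the question."""
    return max(answer_vecs, key=lambda label: cosine(question_vec, answer_vecs[label]))


def accuracy(examples):
    """Fraction of examples answered correctly.

    examples: iterable of (question_vec, {label: answer_vec}, correct_label).
    """
    examples = list(examples)
    if not examples:
        return 0.0
    hits = sum(1 for q, answers, correct in examples
               if pick_answer(q, answers) == correct)
    return hits / float(len(examples))
```

In a real run the vectors would come from the trained model (LSA topic vectors or averaged word vectors) rather than being written by hand.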

$ python lsa_submit.py \
    --model model/lsa/timestamp.model \
    --project $(pwd)

Word2Vec

$ wget https://word2vec.googlecode.com/svn/trunk/questions-words.txt

$ python word2vec_train.py \
    --articles enwiki-latest-pages-articles.xml.bz2 \
    --model model/word2vec \
    --demo questions-words.txt \
    --lines wiki-lines.txt

$ python word2vec_evaluate.py \
    --model model/word2vec/timestamp.model \
    --data training.tsv

$ python word2vec_submit.py \
    --model model/word2vec/timestamp.model \
    --project $(pwd)
