GitHub - philip30/tacl2015: Scripts and data to reproduce the experiment.

philip30 / tacl2015 Public

Notifications You must be signed in to change notification settings
Fork 0
Star 0

Scripts and data to reproduce the experiment.

Notifications

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
config		config
metadata		metadata
script		script
stem		stem
.gitignore		.gitignore
README		README
closed-shuf.q.sh		closed-shuf.q.sh
config.ini		config.ini
human-eval.gz		human-eval.gz
id.tar.gz		id.tar.gz

Repository files navigation

Running Geoquery Experiment:

All the configuration should be configured in config.ini
- Locate all software dependencies there
- TRAIN, TUNE, TUNE_TEST (test the tune), TEST = true for do, empty for not.

Experiment parameters are located in:
script/experiment/part/parameters.sh

Modifying how-to-run the experiment:
script/experiment/part/execute.sh

Order running the experiment:
1. - Configure everything, make sure every dependencies are linked (inside config.ini).
- Configure mysql database (config/mysql.config), this experiment uses mysql to cache the query being fired to geoquery.
Basically you don't need to edit any file inside config directory except mysql.config.

2. Extract id.tar.gz

$ tar xzf id.tar.gz
This will generate the id/wong-split and id/andreas-split directories.
The important thing is to also generate the 10-folds directory for tuning.
The experiment is 10-folds, for each fold, another 10-folds are done against it for tuning.
You can decide whether you want to keep or remove id.tar.gz after that

3. Generate the data
This will map the id, to the real geoquery data.

$ script/prepare/data.sh
This will prepare the full-question data.
The full-question data is used for baseline experiment.

$ script/prepare/data-kwshuf-real.sh
This will prepare the shuffled keyword data

4. Generate Language model
$ script/prepare/prepare-lm.sh

This process will take so long, it's better to run the process inside that prepare-lm.sh in separate computer.

5. Run the experiment.
All the experiment are located in "script/experiment" directories.

baseline.sh -> Baseline Question-MR.
pipeline.sh -> Running Pipelined experiment
keyword-shuf.sh -> Baseline KW-MR.
kw-shuf.sh -> 3Sync system without language model
kw-shuf-news.sh -> 3Sync system with language model trained on WMT NEWS-2009 data
kw-shuf-q.sh -> 3Sync system with language model trained on the question data.
kw-shuf-geo.sh -> 3Sync system with language model trained on the NL of training data.

para-augment.sh -> 3Sync system with questions language model + paraphrasing features

* All experiments included NNLM that trained on the training data.
* Every script will do 2 experiments in one time (one without NNLM and one with NNLM)
* To configure -Empty experiment, uncomment the line "source config/empty.config" in the config.ini

6. Results
The Results for the experiment can be seen inside the working directory.
The test-nnlm.result is the result +NNLM.
The test.result is the result -NNLM.
Each fold has this results, and in the working directory there is also an averaged result named with the same file.
The human evaluation can be found in human-eval/result.txt after unpackaging the package.

About

Scripts and data to reproduce the experiment.

Readme

Activity

0 stars

2 watching

0 forks

Report repository

Releases

No releases published

Packages

No packages published

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

config

config

metadata

metadata

script

script

stem

stem

.gitignore

.gitignore

README

README

closed-shuf.q.sh

closed-shuf.q.sh

config.ini

config.ini

human-eval.gz

human-eval.gz

id.tar.gz

id.tar.gz

Repository files navigation

About

Releases

Packages

Languages

philip30/tacl2015

Folders and files

Latest commit

History

Repository files navigation

About

Resources

Stars

Watchers

Forks

Languages