Skip to content

philip30/tacl2015

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Running Geoquery Experiment:

All the configuration should be configured in config.ini
- Locate all software dependencies there 
- TRAIN, TUNE, TUNE_TEST (test the tune), TEST = true for do, empty for not.

Experiment parameters are located in:
script/experiment/part/parameters.sh

Modifying how-to-run the experiment:
script/experiment/part/execute.sh

Order running the experiment:
1. - Configure everything, make sure every dependencies are linked (inside config.ini).
   - Configure mysql database (config/mysql.config), this experiment uses mysql to cache the query being fired to geoquery.  
     Basically you don't need to edit any file inside config directory except mysql.config. 

2. Extract id.tar.gz
   
   $ tar xzf id.tar.gz 
   This will generate the id/wong-split and id/andreas-split directories.
   The important thing is to also generate the 10-folds directory for tuning.
   The experiment is 10-folds, for each fold, another 10-folds are done against it for tuning.
   You can decide whether you want to keep or remove id.tar.gz after that

3. Generate the data 
   This will map the id, to the real geoquery data.

   $ script/prepare/data.sh
   This will prepare the full-question data. 
   The full-question data is used for baseline experiment.

   $ script/prepare/data-kwshuf-real.sh
   This will prepare the shuffled keyword data

4. Generate Language model
   $ script/prepare/prepare-lm.sh

   This process will take so long, it's better to run the process inside that prepare-lm.sh in separate computer.

5. Run the experiment.
   All the experiment are located in "script/experiment" directories.

   baseline.sh -> Baseline Question-MR.
   pipeline.sh -> Running Pipelined experiment
   keyword-shuf.sh -> Baseline KW-MR.
   kw-shuf.sh -> 3Sync system without language model
   kw-shuf-news.sh -> 3Sync system with language model trained on WMT NEWS-2009 data
   kw-shuf-q.sh -> 3Sync system with language model trained on the question data.
   kw-shuf-geo.sh -> 3Sync system with language model trained on the NL of training data.
   
   para-augment.sh -> 3Sync system with questions language model + paraphrasing features

   * All experiments included NNLM that trained on the training data.
   * Every script will do 2 experiments in one time (one without NNLM and one with NNLM)
   * To configure -Empty experiment, uncomment the line "source config/empty.config" in the config.ini

6. Results
    The Results for the experiment can be seen inside the working directory.
    The test-nnlm.result is the result +NNLM.
    The test.result is the result -NNLM.
    Each fold has this results, and in the working directory there is also an averaged result named with the same file.
    The human evaluation can be found in human-eval/result.txt after unpackaging the package.

About

Scripts and data to reproduce the experiment.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published