HRG Parser

Recommended Environment

Linux
Java 8
Python 3.6
GCC 7.0
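
As a quick sanity check before starting, the environment above could be verified with a small script like the following. This is only a sketch: the executable names java and gcc are assumptions about how the tools appear on PATH, the function name check_environment is made up for illustration, and the script does not check the Java or GCC versions.

```python
import shutil
import sys

# Tools named in the recommended environment. The executable names
# are assumptions about how they appear on PATH.
REQUIRED_TOOLS = ("java", "gcc")
RECOMMENDED_PYTHON = (3, 6)


def check_environment(tools=REQUIRED_TOOLS, min_python=RECOMMENDED_PYTHON):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    found = sys.version_info[:2]
    if found < min_python:
        problems.append(
            "Python {}.{} is recommended, found {}.{}".format(*(min_python + found))
        )
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append("{} was not found on PATH".format(tool))
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("warning:", problem)
```

An empty result only means the tools were found, not that their versions match the ones listed above.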

Data Preparation

You need to obtain the DeepBank 1.1 data first.

Split the punctuation from the lexicon

Compile the Java project "pseud" and run: java jigsaw.treebank.RedwoodsTreeReader <"profile" directory of DeepBank 1.1>

The outputs are written to the out directory. Manually split them into training, development, and test sets, place these in the directories java_out_train, java_out_dev, and java_out_test respectively, and put all three directories under a common directory, main_dir_base.
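
The manual split described above could be scripted along the following lines. This is a minimal sketch, not part of the repository: it assumes the out directory contains one regular file per item, and the function split_outputs and its assign_split argument are hypothetical names. The README does not say which items belong to which set, so the rule is left to the caller.

```python
import shutil
from pathlib import Path

# Directory names taken from the step above.
SPLIT_DIRS = {
    "train": "java_out_train",
    "dev": "java_out_dev",
    "test": "java_out_test",
}


def split_outputs(out_dir, main_dir_base, assign_split):
    """Copy each file in out_dir into the directory for its split.

    assign_split is a caller-supplied function that maps a file name to
    "train", "dev", "test", or None (to skip the file). It is hypothetical:
    the split criteria are not specified in this README.
    """
    out_dir = Path(out_dir)
    main_dir_base = Path(main_dir_base)
    for dir_name in SPLIT_DIRS.values():
        (main_dir_base / dir_name).mkdir(parents=True, exist_ok=True)

    counts = {split: 0 for split in SPLIT_DIRS}
    for path in sorted(out_dir.iterdir()):
        if not path.is_file():
            continue  # assumption: outputs are regular files
        split = assign_split(path.name)
        if split is None:
            continue
        shutil.copy2(str(path), str(main_dir_base / SPLIT_DIRS[split] / path.name))
        counts[split] += 1
    return counts
```

For example, with a dictionary that maps each file name to its set, passing the dictionary's get method as assign_split copies the listed files and skips everything else. The files are copied rather than moved, so the out directory is left unchanged.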

Install Python dependencies

Run pip3 install -r requirements.txt

Extract grammar

Set deepbank_export_path and main_dir_base in extract_sync_grammar.py, then run the script. The output trees, derivations, and grammar will be written to the deepbank-preprocessed directory.
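
Before running the extraction, it may help to confirm that the two paths point at populated directories. The sketch below assumes that deepbank_export_path is a directory and that main_dir_base holds the three java_out_* directories from the earlier step. The helper missing_inputs is hypothetical and is not part of extract_sync_grammar.py.

```python
from pathlib import Path

# Sub-directories that the punctuation-splitting step is expected to produce.
EXPECTED_SUBDIRS = ("java_out_train", "java_out_dev", "java_out_test")


def missing_inputs(deepbank_export_path, main_dir_base):
    """Return the expected input directories that are missing or empty."""
    required = [Path(deepbank_export_path)]
    required += [Path(main_dir_base) / name for name in EXPECTED_SUBDIRS]
    missing = []
    for path in required:
        # A directory counts as usable only if it exists and has entries.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(str(path))
    return missing
```

An empty list means every expected directory exists and contains at least one entry. It says nothing about whether the contents are in the format the script expects.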

Train a POS Tagger

Modify paths in scripts/train_leaftag.py and run it.

Train a Phrase Structure Parser

Modify paths in scripts/train_span_lite.py and run it.

Predict with Trained Phrase Structure Parser

Run main.py span predict and follow the instructions.

Train an HRG Parser

Modify paths in scripts/train_udef_lite.py and run it.

Predict with Trained HRG Parser

Run main.py hrg predict and follow the instructions.
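
The training and prediction steps above, in the order they are listed, can be summarized as a small driver. This is a sketch under assumptions: it supposes that each script is launched with the Python interpreter from the repository root, and the names PIPELINE, build_commands, and run_pipeline are made up for illustration. The predict commands are shown without arguments because the README does not list any, so by default the driver only prints the commands.

```python
import subprocess
import sys

# Steps in the order this README lists them. Launching each script as
# "<python> <script>" from the repository root is an assumption.
PIPELINE = [
    ("Train POS tagger", ["scripts/train_leaftag.py"]),
    ("Train phrase structure parser", ["scripts/train_span_lite.py"]),
    ("Predict with phrase structure parser", ["main.py", "span", "predict"]),
    ("Train HRG parser", ["scripts/train_udef_lite.py"]),
    ("Predict with HRG parser", ["main.py", "hrg", "predict"]),
]


def build_commands(python=sys.executable):
    """Return (label, argv) pairs for every step, in order."""
    return [(label, [python] + args) for label, args in PIPELINE]


def run_pipeline(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for label, command in build_commands():
        print("{}: {}".format(label, " ".join(command)))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

The paths inside each training script still have to be edited by hand before a real run, as described in the steps above.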
