cmput497_a3_yonael_zichun3

Prerequisites

Java
Python3
virtualenv

How to run

Setup

# Setup python virtual environment
$ virtualenv venv --python=python3
$ source venv/bin/activate

# Install python dependencies
$ pip install -r requirements.txt

Stanford POS Tagger

# transform data files to use "_" as the separator, with one sentence per line
# (or run `make format-dev` to create a subset of the training file
# for development)
$ make format

# train models
# NOTE to marker: the models have been pre-trained and are attached as part of the submission,
# so you may skip this step and test directly
$ make train

# test models
# tagged sentences are saved under `output/` directory
$ make test

# run error analysis
# the output consists of accuracy, a confusion matrix, precision/recall, and other statistics
$ python stanford_post_analysis.py > test-stanford-output.txt
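
As an illustration of the `make format` step above, the conversion to `_`-separated tokens with one sentence per line could look roughly like the sketch below. This is not the project's actual implementation: it assumes the raw input has one whitespace-separated `word TAG` pair per line with blank lines between sentences, and the function name `to_stanford_format` is hypothetical.

```python
def to_stanford_format(lines, sep="_"):
    """Join word/tag pairs into one sentence per line, e.g. 'The_DT dog_NN'.

    Assumption: each non-blank input line holds exactly 'word TAG',
    and a blank line marks a sentence boundary.
    """
    sentences, current = [], []
    for line in lines:
        parts = line.split()
        if not parts:
            # Blank line: close the sentence collected so far, if any.
            if current:
                sentences.append(" ".join(current))
                current = []
            continue
        if len(parts) != 2:
            raise ValueError(f"expected 'word TAG', got: {line!r}")
        word, tag = parts
        current.append(f"{word}{sep}{tag}")
    # Flush the last sentence when the input does not end with a blank line.
    if current:
        sentences.append(" ".join(current))
    return sentences


raw = ["The DT", "dog NN", "barks VBZ", "", "It PRP", "runs VBZ"]
for sentence in to_stanford_format(raw):
    print(sentence)
```

Each printed line is then one tagged sentence in the `word_TAG` form that the training and test steps expect.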

HMM and Brill

# Train two HMM models and evaluate each on its own (in-domain) test set and on the other (cross-domain) test set
# the output consists of accuracy, a confusion matrix, precision/recall, and other statistics
$ make test-hmm > test-hmm-output.txt

# Train two Brill models and evaluate each on its own (in-domain) test set and on the other (cross-domain) test set
# the output consists of accuracy, a confusion matrix, precision/recall, and other statistics
$ make test-brill > test-brill-output.txt
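
The analysis output for all three taggers reports accuracy, confusion counts, and precision/recall. The sketch below shows one way such figures can be computed from aligned gold and predicted tag sequences. It is a minimal illustration using only the standard library, not the code used in this repository, and the function name `evaluate` is hypothetical.

```python
from collections import Counter


def evaluate(gold, predicted):
    """Return (accuracy, confusion counts, per-tag (precision, recall))."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # confusion[(gold_tag, predicted_tag)] = number of tokens with that pairing
    confusion = Counter(zip(gold, predicted))
    correct = sum(n for (g, p), n in confusion.items() if g == p)
    accuracy = correct / len(gold) if gold else 0.0

    per_tag = {}
    for tag in sorted(set(gold) | set(predicted)):
        true_pos = confusion[(tag, tag)]
        predicted_total = sum(n for (g, p), n in confusion.items() if p == tag)
        gold_total = sum(n for (g, p), n in confusion.items() if g == tag)
        # Report 0.0 when a tag was never predicted or never appears in gold.
        precision = true_pos / predicted_total if predicted_total else 0.0
        recall = true_pos / gold_total if gold_total else 0.0
        per_tag[tag] = (precision, recall)
    return accuracy, confusion, per_tag


gold = ["DT", "NN", "VBZ", "NN"]
pred = ["DT", "NN", "NN", "NN"]
accuracy, confusion, per_tag = evaluate(gold, pred)
print(f"accuracy: {accuracy:.2f}")
for tag, (precision, recall) in per_tag.items():
    print(f"{tag}: precision={precision:.2f} recall={recall:.2f}")
```

In this toy example one `VBZ` token is mis-tagged as `NN`, which lowers the precision of `NN` and the recall of `VBZ` while leaving `DT` unaffected.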

Output

Tagged output sentences are available under the output/ directory, named according to the convention <tagger_name>.<test_file>-tagged.<train_file>.txt.
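
To make the naming convention concrete, the sketch below builds the file names for every train/test pairing, covering both the in-domain and the cross-domain runs. The tagger name `hmm` and the file labels `Domain1` and `Domain2` are placeholders chosen for illustration; the exact strings used for `<tagger_name>`, `<test_file>`, and `<train_file>` in this repository may differ.

```python
from itertools import product


def tagged_output_name(tagger_name, test_file, train_file):
    """Build a name of the form <tagger_name>.<test_file>-tagged.<train_file>.txt."""
    return f"{tagger_name}.{test_file}-tagged.{train_file}.txt"


def all_output_names(tagger_name, files):
    """List names for every (train, test) pairing, in-domain and cross-domain."""
    return [
        tagged_output_name(tagger_name, test_file, train_file)
        for train_file, test_file in product(files, repeat=2)
    ]


# Hypothetical labels; substitute the real tagger and file names.
for name in all_output_names("hmm", ["Domain1", "Domain2"]):
    print("output/" + name)
```

With two files this yields four names: two where the test and train files match and two where they differ.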

Authors

Yonael Bekele
Michael Lin
