HashtagMaster: Segmentation tool for hashtags

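This repository contains the code and resources from the paper "Multi-task Pairwise Neural Ranking for Hashtag Segmentation" (ACL 2019); see the Citation section for the full reference.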

Repo Structure:

word_breaker: Code for word-breaker beam search.
neural_ranker: Code for our neural pairwise ranker models (4 variants).
data: Task datasets and other feature files. All feature files used in the experiments are included except the full language models, for which we provide only a small sample. Please email us for the complete language models.
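
To give a feel for what a word-breaker beam search does, here is a minimal sketch. It is not the code in word_breaker: the toy unigram counts, the unknown-word penalty, and the names `TOY_COUNTS`, `word_logprob`, and `beam_search_segment` are all assumptions made for illustration, whereas the real tool scores candidates with the language model passed via --lm.

```python
import math

# Toy unigram counts standing in for a real language model (hypothetical values).
TOY_COUNTS = {"no": 50, "now": 40, "here": 30, "where": 20, "nowhere": 5, "w": 1, "ere": 1}
TOTAL = sum(TOY_COUNTS.values())


def word_logprob(word):
    """Log-probability of a word under the toy unigram model."""
    count = TOY_COUNTS.get(word)
    if count is None:
        # Assumption: unknown words get a penalty that grows with their length.
        return -10.0 * len(word)
    return math.log(count / TOTAL)


def beam_search_segment(hashtag, k=3, beam_width=10):
    """Return up to k (segmentation, score) pairs, best first."""
    # beams[i] holds the best partial segmentations that cover hashtag[:i].
    beams = [[] for _ in range(len(hashtag) + 1)]
    beams[0] = [(0.0, [])]
    for end in range(1, len(hashtag) + 1):
        candidates = []
        for start in range(end):
            word = hashtag[start:end]
            # Extend every surviving hypothesis that ends at `start` with `word`.
            for score, words in beams[start]:
                candidates.append((score + word_logprob(word), words + [word]))
        # Prune: keep only the highest-scoring hypotheses at this position.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams[end] = candidates[:beam_width]
    return [(" ".join(words), score) for score, words in beams[-1][:k]]


if __name__ == "__main__":
    for segmentation, score in beam_search_segment("nowhere"):
        print(f"{score:8.3f}  {segmentation}")
```

The top-k list produced this way is only a set of candidates. A frequency-based score can easily prefer the wrong split, which is why a separate reranking step follows.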

Instructions:

First, run the "Word Breaker" to get the top-k candidates:

python word_breaker/main.py --k 10 --lm data/small_gt.bin --out train_topk.tsv --input data/our_dataset/train_corrected.tsv

python word_breaker/main.py --k 10 --lm data/small_gt.bin --out test_topk.tsv --input data/our_dataset/test_corrected.tsv

Then, rerank the top-k candidates:

python3 neural_ranker/main.py --train data/our_dataset/train_corrected.tsv --train_topk train_topk.tsv --test data/our_dataset/test_corrected.tsv --test_topk test_topk.tsv --out output.tsv
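
As a rough illustration of pairwise reranking, the sketch below compares every candidate against every other one and orders the candidates by their summed pairwise scores. It is not the code in neural_ranker: `toy_pair_score` is a hand-written heuristic standing in for a trained neural model, and both the heuristic and the sum-based aggregation in `rerank` are assumptions made for this example.

```python
def toy_pair_score(a, b):
    """Hypothetical stand-in for a learned pairwise model.

    Returns a positive number when segmentation `a` looks better than `b`.
    The heuristic penalises single-character tokens and long token lists.
    """
    def badness(segmentation):
        tokens = segmentation.split()
        singles = sum(len(t) == 1 for t in tokens)
        return 2 * singles + len(tokens)

    return badness(b) - badness(a)


def rerank(candidates, pair_score=toy_pair_score):
    """Order candidates by the sum of their pairwise scores against all others."""
    totals = {
        c: sum(pair_score(c, other) for other in candidates if other != c)
        for c in candidates
    }
    return sorted(candidates, key=lambda c: totals[c], reverse=True)


if __name__ == "__main__":
    for candidate in rerank(["n o w here", "now here", "no w here"]):
        print(candidate)
```

Swapping `toy_pair_score` for a trained model's prediction would leave `rerank` unchanged, which is the main appeal of separating the pairwise scorer from the aggregation.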

Citation

Please cite the following paper if you use these resources in your research:

@InProceedings{ACL-2019-Maddela,
  author = 	"Maddela, Mounica and Xu, Wei and Preoţiuc-Pietro, Daniel",
  title = 	"Multi-task Pairwise Neural Ranking for Hashtag Segmentation",
  booktitle = 	"Proceedings of the Association for Computational Linguistics (ACL)",
  year = 	"2019",
}

Please use Python 3 to run the code.
