Leco

This is the repository for the SIGIR 2020 paper "Enhancing Text Classification via Discovering Additional Semantic Clues from Logograms".

By leveraging the cross-linguistic variation of two types of writing systems, Leco utilizes logograms to capture reliable clues for the text classification of phonographic languages, especially for low-resource ones.

Overview

code/ contains the source code (Leco Classifier and Gaussian Embedding).
data/ contains example datasets used for evaluation.

Requirements:

Python (≥3.0)
PyTorch (≥1.0)
BERT-Base: Please initialize a pretrained BERT model (self.bert in class TextEmbedding) to obtain BERT embeddings.
Hyperparameters are in _public.py.
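
The BERT-Base requirement above can be met in several ways. The following is a minimal sketch only, assuming the Hugging Face `transformers` package (not listed in the requirements) and the `bert-base-uncased` checkpoint (the README does not say which BERT-Base variant is intended). The `TextEmbedding` class shown here is a hypothetical stand-in for the one in code/, and `load_pretrained_bert` is an illustrative helper name, not part of this repository.

```python
def load_pretrained_bert(model_name="bert-base-uncased"):
    """Load a pretrained BERT encoder (assumption: Hugging Face transformers)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import BertModel

    model = BertModel.from_pretrained(model_name)
    model.eval()  # inference mode: disables dropout when extracting embeddings
    return model


class TextEmbedding:
    """Hypothetical stand-in; the real class in code/ may look different."""

    def __init__(self, bert=None, model_name="bert-base-uncased"):
        # Use a caller-supplied encoder if given, otherwise load one by name.
        self.bert = bert if bert is not None else load_pretrained_bert(model_name)
```

Calling `TextEmbedding()` with no arguments downloads the checkpoint on first use. Passing an already loaded model through `bert=` avoids reloading it when several components share one encoder.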

Citation

If you find this work helpful or relevant, please consider citing it as:

@inproceedings{Leco,
  title = {Enhancing Text Classification via Discovering Additional Semantic Clues from Logograms},
  author = {Chen Qian and Fuli Feng and Lijie Wen and Li Lin and Tat-Seng Chua},
  booktitle = {Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR)},
  year = {2020},
  pages = {1201--1210}
}
