Auto-correction-for-transliterated-queries

A query correction system designed for transliterated queries.

This project is part of my transaction paper on Auto Correction and Sense Disambiguation in Transliterated Queries, which is currently under review. The project is also inspired by my earlier papers.

Key features of the model:

Can be retrained on a new dataset of correctly spelled queries in mixed languages such as Hindi-English, English-French, Hindi-Bengali, etc.
No annotated dataset is needed; only correctly spelled queries are required.
Can be trained on a small dataset (~10K queries) and still give reasonable performance.
A model trained on an English corpus is provided; the queries are taken from Yahoo Webscope (150K questions).
The model is tested on a training corpus of 12K queries in English-Hindi mixed script (collected manually). The dataset will be made publicly available as soon as the paper is accepted.
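
This README does not describe how the model learns from unannotated data. One common approach, shown here purely as an assumption, is to corrupt correctly spelled queries with random character deletions and spacing errors, producing (noisy, clean) training pairs. The function `add_noise` and its parameters are hypothetical and are not taken from this repository.

```python
import random


def add_noise(query: str, rate: float = 0.1, seed: int = 0) -> str:
    """Return a corrupted copy of `query` (hypothetical helper, not from this repo).

    Each character is, with probability `rate`, either deleted or followed
    by a stray space. Deleting a space merges two words.
    """
    rng = random.Random(seed)
    out = []
    for ch in query:
        if rng.random() < rate:
            if rng.choice(["drop", "space"]) == "drop":
                continue  # delete the character, e.g. "learn" -> "lern"
            out.append(ch)
            out.append(" ")  # insert a stray space, e.g. "easily" -> "eas ily"
        else:
            out.append(ch)
    return "".join(out)


# Build (noisy, clean) pairs from clean queries; a different seed per query
# gives each query a different corruption pattern.
clean = ["how to learn python", "how do i find an out of print book"]
pairs = [(add_noise(q, rate=0.15, seed=i), q) for i, q in enumerate(clean)]
```

The noise types here mirror the errors in the usage example (missing letters, merged words, split words); a real system would likely need a richer error model for transliterated text.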

Usage:

obj = auto_correct()
obj.run()

enter a query
hw to lrn pythn anddeeplearning eas ily
how to learn python and deep learning easily    11.2134873867

Parameters of the model

The auto-corrector takes two parameters, retrain and data:

obj = auto_correct(retrain=..., data=...)

To retrain the model, set retrain=True and pass the queries as the data argument. The queries must be given as a list of strings, as in the following example:

queries = [
    'how to handle a 1.5 year old when hitting',
    'how can i avoid getting sick in china',
    'how do male penguins survive without eating for four months',
    'how do i remove candle wax from a polar fleece jacket',
    'how do i find an out of print book',
]

obj = auto_correct(retrain=True, data=queries)
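
If the queries are stored in a plain text file with one query per line, a small helper can build the list in the format shown above. This is a minimal sketch: `load_queries` and the file layout are assumptions and are not part of this repository, and lowercasing is assumed only because the example queries are lowercase.

```python
from pathlib import Path
from typing import List


def load_queries(path: str) -> List[str]:
    """Read one query per line and return a list of cleaned strings.

    Hypothetical helper, not part of this repository.
    """
    queries = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        # Collapse repeated whitespace and lowercase to match the examples.
        query = " ".join(line.split()).lower()
        if query:  # skip blank lines
            queries.append(query)
    return queries
```

The resulting list can then be passed as the data argument, for example auto_correct(retrain=True, data=load_queries('my_queries.txt')), where my_queries.txt is a placeholder file name.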

uditsharma7/Auto-correction-for-transliterated-queries
