To facilitate communication with relevant researchers, both code and data have been uploaded.

Two datasets are used to validate the model: the SIGHAN Bake-off dataset and a logistics dataset. The logistics dataset is private data provided by a logistics company, so it has not been uploaded. main_word2vecPublic.py trains and tests the model. process_word2vecPublic.py computes the decision weight matrix and processes the data. The word_vector directory contains the code for training the word vector model.
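As a rough illustration of what training a word vector model can look like, here is a minimal sketch. It is not the code in the word_vector directory. It assumes character-level tokens (a common choice for Chinese spelling check), the third-party gensim library (version 4 or later), and hypothetical helper names `to_char_sentences` and `train_char_vectors`. The hyperparameters are placeholders, not the values used in this repository.

```python
def to_char_sentences(lines):
    """Split each non-empty line into a list of characters, dropping whitespace.

    Character-level tokenization is an assumption made for this sketch.
    """
    return [
        [ch for ch in line if not ch.isspace()]
        for line in lines
        if line.strip()
    ]


def train_char_vectors(lines, vector_size=100, window=5, min_count=1):
    """Train a skip-gram word2vec model on character sequences.

    gensim is imported lazily so the tokenizer above can be used without it.
    All hyperparameter defaults are illustrative placeholders.
    """
    from gensim.models import Word2Vec  # third-party dependency (gensim >= 4)

    sentences = to_char_sentences(lines)
    return Word2Vec(
        sentences=sentences,
        vector_size=vector_size,
        window=window,
        min_count=min_count,
        sg=1,  # 1 selects skip-gram, 0 selects CBOW
    )
```

With gensim installed, `model = train_char_vectors(corpus_lines)` returns a model whose vectors can be looked up with `model.wv[character]`.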

Steps to use the files above:

1. Pre-train the word2vec model. The code for this step is in the word_vector directory.
2. Build the decision weight matrix and prepare the input data for the model with process_word2vecPublic.py.
3. Train and test the model with main_word2vecPublic.py.
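The order of steps 2 and 3 can be sketched as a small driver script. This is only a sketch: it assumes both scripts run without command-line arguments and are launched from the repository root, neither of which the README confirms. The helper names `build_commands` and `run_pipeline` are hypothetical. Step 1 is not launched here because the README does not name the entry-point script inside word_vector.

```python
import subprocess
import sys
from pathlib import Path

# Script names come from the README. Running them with no arguments
# is an assumption of this sketch.
PIPELINE = [
    "process_word2vecPublic.py",  # step 2: decision weights and input data
    "main_word2vecPublic.py",     # step 3: train and test the model
]


def build_commands(repo_root, python=sys.executable):
    """Return one command (as an argument list) per script, in run order."""
    root = Path(repo_root)
    return [[python, str(root / script)] for script in PIPELINE]


def run_pipeline(repo_root):
    """Run steps 2 and 3 in order, stopping at the first failure."""
    root = Path(repo_root)
    # Step 1 (word2vec pre-training) must already have been done by hand.
    if not (root / "word_vector").is_dir():
        raise FileNotFoundError("word_vector directory not found under " + str(root))
    for cmd in build_commands(root):
        subprocess.run(cmd, check=True, cwd=root)
```

Calling `run_pipeline(".")` from the repository root would run the two scripts in sequence. `check=True` stops the run if the data preparation step fails, so training never starts on incomplete input.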

For the baselines, see https://github.com/shibing624/pycorrector

hmfighting/spelling_check
