xraywu/WebClassifier

WebClassifier

A Chinese term classifier based on web search results

The files in this repository include:

  1. Scripts to retrieve term features from search engines, cleanse the raw feature lists, and sample the data
  2. The dictionary used for Chinese text segmentation
  3. Sample data sets, including:
    • Input term sets (file names beginning with drugList-);
    • Raw term-feature matrices generated from different search engines and term sets (file names beginning with drugFeature-);
    • The exact testing and training sets used for this study (file names ending in -TestTrain).
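To illustrate what a term-feature matrix and a train/test sample look like, here is a minimal Python sketch. It is not the code from this repository: the function names `build_term_feature_matrix` and `split_train_test`, the toy terms and tokens, the 25% test fraction, and the use of raw token counts as features are all assumptions made for the example. It also assumes the search result text has already been segmented into tokens.

```python
import random
from collections import Counter


def build_term_feature_matrix(tokens_by_term):
    """Build a count matrix: one row per term, one column per token.

    tokens_by_term maps each term to the list of tokens found in its
    (already segmented) search result text. Hypothetical helper.
    """
    # The vocabulary is every distinct token seen for any term.
    vocabulary = sorted({tok for toks in tokens_by_term.values() for tok in toks})
    matrix = {}
    for term, toks in tokens_by_term.items():
        counts = Counter(toks)
        # Counter returns 0 for tokens this term never produced.
        matrix[term] = [counts[tok] for tok in vocabulary]
    return vocabulary, matrix


def split_train_test(terms, test_fraction=0.25, seed=0):
    """Shuffle the terms reproducibly and split them into (train, test)."""
    shuffled = sorted(terms)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Toy data, invented for this example only.
toy_tokens = {
    "阿司匹林": ["药", "镇痛", "药"],
    "青霉素": ["药", "抗生素"],
    "苹果": ["水果", "公司"],
    "香蕉": ["水果"],
}

vocabulary, matrix = build_term_feature_matrix(toy_tokens)
train_terms, test_terms = split_train_test(toy_tokens.keys())
```

Fixing the random seed keeps the split reproducible, which matters when the exact training and testing sets need to be shared alongside the results.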

You will need 7-Zip (http://www.7-zip.org/) to decompress the files.
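If you prefer to decompress from a script, the sketch below calls the 7-Zip command-line tool from Python. It assumes the executable is available on your PATH as `7z` (the name can differ by platform or package), and the archive name `sample-data.7z` and the output directory `data` are placeholders, not actual file names from this repository.

```python
import shutil
import subprocess


def build_extract_command(archive_path, output_dir):
    """Return the 7-Zip command line for extracting one archive."""
    # "x" extracts with full paths, "-o<dir>" sets the output directory
    # (no space after -o), and "-y" answers yes to any prompts.
    return ["7z", "x", archive_path, f"-o{output_dir}", "-y"]


def extract_archive(archive_path, output_dir):
    """Extract the archive, failing early if 7-Zip is not installed."""
    if shutil.which("7z") is None:
        raise RuntimeError("7z executable not found on PATH")
    subprocess.run(build_extract_command(archive_path, output_dir), check=True)


# Placeholder names; replace them with a real archive from the repository.
command = build_extract_command("sample-data.7z", "data")
```

Calling `extract_archive("sample-data.7z", "data")` runs the command and raises an error if 7-Zip reports a failure.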
