jaindeepali/Adler

Adler

Adler generates a ready-to-use text classification corpus from the TechTC-300 Test Collection. The final dataset uses chi-squared feature selection and tf-idf feature weighting. Classification is done using a Decision Jungle classifier in AzureML.
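To illustrate the two preprocessing steps named above, here is a minimal standard-library sketch of chi-squared feature selection followed by tf-idf weighting on a toy binary corpus. It is not taken from Adler's source: the function names (`chi2_scores`, `select_top_k`, `tfidf_vectors`), the toy documents, and the specific formulas (a 2x2 contingency chi-squared on document presence, and unsmoothed `log(N / df)` for idf) are assumptions chosen for clarity.

```python
import math
from collections import Counter


def chi2_scores(docs, labels):
    """Chi-squared score of each term against a binary label (1 or 0),
    computed from per-class document-presence counts."""
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    # Number of documents containing each term, split by class.
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        (df_pos if y else df_neg).update(set(tokens))
    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # positive docs containing the term
        b = df_neg[term]   # negative docs containing the term
        c = n_pos - a      # positive docs without the term
        d = n_neg - b      # negative docs without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def tfidf_vectors(docs, vocab):
    """Weight each document over `vocab` with raw tf times log(N / df)."""
    n = len(docs)
    vocab_set = set(vocab)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens) & vocab_set)
    idf = {t: math.log(n / df[t]) for t in vocab if df[t]}
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vectors.append([tf[t] * idf.get(t, 0.0) for t in vocab])
    return vectors


# Toy corpus (hypothetical): label 1 = programming, label 0 = gardening.
docs = [
    ["python", "code", "bug"],
    ["python", "compiler", "code"],
    ["garden", "soil", "plant"],
    ["plant", "garden", "code"],
]
labels = [1, 1, 0, 0]

scores = chi2_scores(docs, labels)
vocab = select_top_k(scores, 3)
vectors = tfidf_vectors(docs, vocab)
```

Terms that appear in only one class score highest here, while a term such as `code`, which appears in both classes, scores lower and is dropped before weighting.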

Setup

  • Clone the repo: git clone https://github.com/jaindeepali/Adler
  • Create the config file from the sample: cp Adler/config/sample.config.json Adler/config/config.json
  • Open config.json and edit the path to the data directory
  • Create a Python virtual environment: virtualenv .venv
  • Activate the virtual environment: source .venv/bin/activate
  • Install the Adler package: python setup.py install
  • Run the script to generate the dataset: /scripts/generate.py
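Before running the generation script, it can help to check that the edited config points at a real directory. The sketch below is one way to do that. The key name `data_dir` is a hypothetical placeholder, since the actual key names in sample.config.json are not shown here, and `load_data_dir` is not part of the Adler package.

```python
import json
import os


def load_data_dir(config_path, key="data_dir"):
    """Read a JSON config file and return the directory stored under `key`.

    `key` is an assumed name; replace it with the key used in config.json.
    """
    with open(config_path) as fh:
        config = json.load(fh)
    if key not in config:
        raise KeyError(
            f"{key!r} not found in {config_path}; available keys: {sorted(config)}"
        )
    data_dir = os.path.expanduser(config[key])
    if not os.path.isdir(data_dir):
        raise FileNotFoundError(f"data directory does not exist: {data_dir}")
    return data_dir
```

For example, `load_data_dir("Adler/config/config.json")` either returns the configured path or raises an error naming what is missing.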
