sridif/attribution

attribution

Source folder -

pth.py: holds the paths to the training and test files. The repository stores only the tokenized dictionaries, not the full XML files provided on the website.

pan_util.py: contains utilities for mining the XML training files. For example, given a conversation id, get all the conversations.
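A minimal sketch of the kind of lookup described above, not the actual code in pan_util.py. The XML layout (a `conversations` root, `conversation` elements with an `id` attribute, `message` children with `author` and `text`), the sample data, and the function name `get_messages` are all assumptions made for illustration; the real training files may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real training files may use another layout.
SAMPLE_XML = """
<conversations>
  <conversation id="c1">
    <message line="1"><author>a1</author><text>hello</text></message>
    <message line="2"><author>a2</author><text>hi there</text></message>
  </conversation>
  <conversation id="c2">
    <message line="1"><author>a3</author><text>good morning</text></message>
  </conversation>
</conversations>
"""


def get_messages(xml_text, conversation_id):
    """Return (author, text) pairs for the conversation with the given id."""
    root = ET.fromstring(xml_text.strip())
    for conversation in root.iter("conversation"):
        if conversation.get("id") == conversation_id:
            return [
                (msg.findtext("author", default=""), msg.findtext("text", default=""))
                for msg in conversation.iter("message")
            ]
    return []  # no conversation with that id


if __name__ == "__main__":
    for author, text in get_messages(SAMPLE_XML, "c1"):
        print(author, text)
```

For large files, `ET.iterparse` would avoid loading the whole document into memory; the in-memory version is used here only to keep the sketch short.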

pan_alg.py: creates a custom-designed feature set with 4 parameters. These parameters were cross-validated and are hard-coded in the file. The features are then used to train a naive Bayes classifier.
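To illustrate the last step, here is a small naive Bayes classifier over binary (present/absent) features, written with the standard library only. It is a sketch under assumptions and not the code in pan_alg.py: the feature names, the labels, the function names `train_bernoulli_nb` and `predict_bernoulli`, and the choice of a Bernoulli model with Laplace smoothing are all hypothetical, and the 4 cross-validated parameters are not modelled.

```python
import math
from collections import defaultdict


def train_bernoulli_nb(samples, labels, alpha=1.0):
    """samples: list of sets of feature names; labels: one class label per sample."""
    vocab = set().union(*samples)
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for features, label in zip(samples, labels):
        class_counts[label] += 1
        for feature in features:
            feature_counts[label][feature] += 1

    model = {}
    for label, n in class_counts.items():
        log_prior = math.log(n / len(labels))
        # Laplace-smoothed estimate of P(feature present | class).
        probs = {
            f: (feature_counts[label][f] + alpha) / (n + 2 * alpha) for f in vocab
        }
        model[label] = (log_prior, probs)
    return model


def predict_bernoulli(model, features):
    """Return the label with the highest log posterior for a set of features."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, probs) in model.items():
        score = log_prior
        for f, p in probs.items():
            # Absent features contribute evidence too, via (1 - p).
            score += math.log(p if f in features else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    # Hypothetical feature sets and labels, for illustration only.
    samples = [{"a", "b"}, {"a"}, {"c"}, {"c", "d"}]
    labels = ["pos", "pos", "neg", "neg"]
    model = train_bernoulli_nb(samples, labels)
    print(predict_bernoulli(model, {"a"}))
```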

Results folder -

     The best result so far is 93 % true positive and 25 % false positive.
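For reference, this is how such figures are usually computed, assuming they are the true positive rate TP / (TP + FN) and the false positive rate FP / (FP + TN). That reading is an assumption; the function name `rates` and the toy labels below are made up for illustration and are not the project's results.

```python
def rates(y_true, y_pred, positive=1):
    """Return (true positive rate, false positive rate) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


if __name__ == "__main__":
    # Toy labels for illustration only.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
    print(rates(y_true, y_pred))
```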


     Scope for improvement: make the features take frequency into account when the Bayes classifier is constructed.
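One possible way to take frequency into account is a multinomial naive Bayes model, where each token contributes once per occurrence instead of once per presence. The sketch below is an assumption about what the improvement could look like, not something implemented in this repository; the function names `train_multinomial_nb` and `predict_multinomial` and the toy tokens are hypothetical.

```python
import math
from collections import Counter


def train_multinomial_nb(docs, labels, alpha=1.0):
    """docs: list of token lists; labels: one class label per document."""
    vocab = {tok for doc in docs for tok in doc}
    class_counts = Counter(labels)
    token_counts = {}
    for doc, label in zip(docs, labels):
        token_counts.setdefault(label, Counter()).update(doc)

    model = {}
    for label, n_docs in class_counts.items():
        total = sum(token_counts[label].values())
        denom = total + alpha * len(vocab)
        # Laplace-smoothed log P(token | class), based on token frequencies.
        log_probs = {
            tok: math.log((token_counts[label][tok] + alpha) / denom)
            for tok in vocab
        }
        model[label] = (math.log(n_docs / len(labels)), log_probs)
    return model


def predict_multinomial(model, doc):
    """Return the most likely label; unseen tokens are ignored."""
    counts = Counter(doc)

    def score(label):
        log_prior, log_probs = model[label]
        return log_prior + sum(
            c * log_probs[tok] for tok, c in counts.items() if tok in log_probs
        )

    return max(model, key=score)


if __name__ == "__main__":
    # Toy data: both classes contain both tokens, only the frequencies differ.
    docs = [["x", "x", "y"], ["x"], ["y", "y", "y"], ["y", "x"]]
    labels = ["a", "a", "b", "b"]
    model = train_multinomial_nb(docs, labels)
    print(predict_multinomial(model, ["x", "x", "x", "y"]))
```

In the toy data both tokens appear in both classes, so a purely presence-based feature would carry little signal; the counts are what separate the two labels.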

For any doubts, contact elango at kth dot se
