data_proj2

INTRO

Data mining project 2: an SVM trained with stochastic gradient descent (SGD), with the kernel approximated by the sklearn classes RBFSampler, Nystroem, AdditiveChi2Sampler, and SkewedChi2Sampler. The final version uses AdditiveChi2Sampler.
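
As a rough illustration of the approach, here is a minimal sketch that maps features with AdditiveChi2Sampler and trains a hinge-loss SGD classifier on the result. The data is synthetic and the parameter choices (sample_steps=2, the random seeds, the label rule) are assumptions for the example, not the settings used in mapper.py.

```python
import numpy as np
from sklearn.kernel_approximation import AdditiveChi2Sampler
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)
n_samples, n_features = 200, 5

# Synthetic non-negative features (AdditiveChi2Sampler rejects negative input).
X = rng.rand(n_samples, n_features)
# Hypothetical labels in {-1, +1}, chosen only so there is something to learn.
y = np.where(X[:, 0] > X[:, 1], 1, -1)

# Map the features into an approximate additive chi-squared kernel space.
# With sample_steps=2 each input feature becomes 3 output features.
sampler = AdditiveChi2Sampler(sample_steps=2)
X_mapped = sampler.fit_transform(X)

# Hinge loss makes SGDClassifier a linear SVM trained with SGD.
clf = SGDClassifier(loss="hinge", random_state=0)
clf.fit(X_mapped, y)
train_accuracy = clf.score(X_mapped, y)

print(X_mapped.shape, train_accuracy)
```

The classifier itself stays linear; the non-linearity comes entirely from the feature map, which is what makes it cheap to swap in a different sampler.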

HOW TO RUN

Download the repository to your local drive and cd into the folder data_proj2.
Copy your previous data files into the folder: training_set.txt, test_data.txt, test_label.txt
To compute the weights: cat training_set.txt | python mapper.py | python reducer.py > weights.txt
As an example, a precomputed weights file is already in the folder: weights_4.txt
To compute the accuracy: python evaluate.py weights_4.txt test_data.txt test_label.txt
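
The contents of mapper.py and reducer.py are not reproduced in this README, so the following is only a sketch of one common way a mapper/reducer pipeline like the one above can combine results. It assumes each mapper prints one whitespace-separated weight vector per line and the reducer averages them; the function name average_weights and the line format are hypothetical.

```python
def average_weights(lines):
    """Average whitespace-separated weight vectors, one vector per line."""
    total = None
    count = 0
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        vec = [float(p) for p in parts]
        if total is None:
            total = vec
        else:
            if len(vec) != len(total):
                raise ValueError("weight vectors differ in length")
            # Accumulate the element-wise sum of all vectors seen so far.
            total = [t + v for t, v in zip(total, vec)]
        count += 1
    if count == 0:
        return []
    return [t / count for t in total]


# Hypothetical mapper output: two weight vectors and a trailing blank line.
mapper_output = ["1.0 2.0 3.0", "3.0 2.0 1.0", ""]
print(" ".join(str(w) for w in average_weights(mapper_output)))
```

In a real reducer the lines would come from sys.stdin rather than a list, which is what lets the step sit at the end of the shell pipe.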

RESULT

The current method, AdditiveChi2Sampler, has accuracy = 0.80463
SkewedChi2Sampler(skewedness=1.0, n_components=100, random_state=1): accuracy = 0.703116
Tuning gamma and n_components of RBFSampler to look for better accuracy:
- RBFSampler(gamma=0.1, n_components=100): accuracy = 0.687130
- RBFSampler(gamma=0.1, n_components=200): accuracy = 0.681268
- RBFSampler(gamma=1, n_components=100): accuracy = 0.714028
Without any kernel transform: accuracy = 0.74747
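
A comparison like the RBFSampler runs above can be sketched as a small grid loop. This is an assumed setup on synthetic data, not the script that produced the numbers in this section; the data, label rule, train/test split, and SGDClassifier settings are all illustrative.

```python
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.randn(400, 10)
# Hypothetical non-linear labels: +1 outside a sphere, -1 inside.
y = np.where((X ** 2).sum(axis=1) > 10.0, 1, -1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

results = {}
for gamma in (0.1, 1.0):
    for n_components in (100, 200):
        # Fit the random feature map on the training data only,
        # then reuse it to transform the test data.
        sampler = RBFSampler(gamma=gamma, n_components=n_components, random_state=1)
        clf = SGDClassifier(loss="hinge", random_state=0)
        clf.fit(sampler.fit_transform(X_train), y_train)
        results[(gamma, n_components)] = clf.score(sampler.transform(X_test), y_test)

for params, acc in sorted(results.items()):
    print(params, round(acc, 4))
```

Fixing random_state on the sampler keeps runs comparable, since RBFSampler draws its random features at fit time.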
