zihaow21/cnntweets

 
 

Deep Ensemble for Sentiment Analysis

Installation

  • Clone the repository
git clone git@github.com:bgshin/cnntweets.git
  • Create a virtual environment (mkvirtualenv is provided by virtualenvwrapper)
mkvirtualenv sent
  • Install the dependencies
pip install -r requirements.txt
  • Python version: 2.7
  • Requirements
    • boto==2.40.0
    • bz2file==0.98
    • gensim==0.12.4
    • numpy==1.11.0
    • protobuf==3.0.0b2
    • requests==2.10.0
    • scipy==0.17.1
    • six==1.10.0
    • smart-open==1.3.3
    • tensorflow==0.8.0

Usage

  • WITHOUT pre-trained w2v

    • Train

       cd cnn
       nohup python cnn_train.py > out.txt &
    • Test

      • Set savepath in cnn/cnn_test.py to the saved model you want to evaluate

         savepath = 'model_path/model-xxxx'
      • Run test script

         cd cnn
         python cnn_test.py
  • WITH pre-trained w2v

    • Download the compressed file and extract it to obtain the pre-trained w2v bin file

    • Set model_path in w2v_cnn/cnn_train.py to the extracted bin file

       model_path = 'path_to_w2v_bin/word2vec_twitter_model.bin'
    • Train

       cd w2v_cnn
       nohup python cnn_train.py > out.txt &
    • Test

      • Set savepath in w2v_cnn/cnn_test.py to the saved model you want to evaluate

         savepath = 'model_path/model-xxxx'
      • Run test script

         cd w2v_cnn
         python cnn_test.py
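Training with the pre-trained vectors only starts to fail once the script tries to load model_path, so it can help to sanity-check the bin file first. The sketch below assumes the file uses the standard word2vec binary layout, whose first line is an ASCII header of the form "<vocab_size> <dim>". The function name read_w2v_header is hypothetical and is not part of this repository.

```python
import io


def read_w2v_header(path):
    """Return (vocab_size, vector_dim) read from a word2vec binary file.

    Assumption: the file starts with an ASCII header line
    "<vocab_size> <dim>\n", as written by the original word2vec tool.
    """
    with io.open(path, "rb") as f:
        header = f.readline()
    # Raises ValueError if the header does not hold exactly two integers.
    vocab_size, dim = header.decode("ascii").split()
    return int(vocab_size), int(dim)
```

Calling read_w2v_header with the same path used for model_path should return two positive integers. A ValueError or UnicodeDecodeError suggests the path points at the compressed archive or at a file in a different format.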

Dataset

SemEval 2016

  • Dev (semeval16_T4A_devtest_npo)

    • number of examples: 1588
  • Test (semeval16_T4A_test_npo)

    • number of examples: 20632
  • Train (semeval13_T2B_16T4A_train_dev_npo)

    • number of examples: 15385
  • Data files

    • semeval13_T2B_16T4A_train_dev_devtest_npo = 16973 (1588 + 15385)
    • semeval16_T4A_devtest_npo = 1588
    • semeval13_T2B_16T4A_train_dev_npo = 15385
    • semeval16_T4A_dev_npo = 1595
    • semeval16_T4A_test_npo = 20632
    • semeval16_T4A_train_npo = 4796
  • Data format (three TAB-separated columns per line)

    no sentiment sentences
    1 objective I may be the ...
    2 positive TGIF folks! ...
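The TAB-separated layout above can be read with a few lines of Python. This is a minimal sketch, not the repository's own loader: the names Example and read_examples are hypothetical, and it assumes the files are UTF-8 encoded and contain only data rows (no header row).

```python
import io
from collections import namedtuple

# Column names follow the format description above.
Example = namedtuple("Example", ["no", "sentiment", "sentence"])


def read_examples(path):
    """Read a TAB-separated data file into a list of Example tuples."""
    examples = []
    # Assumption: files are UTF-8 encoded.
    with io.open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\r\n")
            if not line:
                continue  # ignore blank lines
            # Split on the first two tabs only, so a tab inside a tweet
            # stays part of the sentence.
            parts = line.split("\t", 2)
            if len(parts) != 3:
                continue  # skip rows that do not have three columns
            examples.append(Example(*parts))
    return examples
```

Each returned item exposes the columns by name, for example rows[0].sentiment, and all fields are kept as strings.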

Preprocessing

  • Label definition (label: one-hot vector, class index)
    • 'objective': [0, 1, 0], 1
    • 'positive': [0, 0, 1], 2
    • 'negative': [1, 0, 0], 0
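The label definition above amounts to a lookup from the sentiment string to a class index, followed by one-hot encoding. A minimal sketch of that mapping is shown here. The names LABEL_TO_INDEX and one_hot are hypothetical and may differ from what the repository's preprocessing code uses.

```python
# Class indices copied from the label definition above.
LABEL_TO_INDEX = {"negative": 0, "objective": 1, "positive": 2}


def one_hot(label):
    """Return the one-hot vector for a sentiment label as a list of ints."""
    vec = [0] * len(LABEL_TO_INDEX)
    # Raises KeyError for any label outside the three defined classes.
    vec[LABEL_TO_INDEX[label]] = 1
    return vec
```

With this mapping, one_hot("objective") gives [0, 1, 0], matching the table above.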

Reference

Pre-trained Word2vec model by Fréderic Godin
