Skip to content

jdbnc/Twap

 
 

Repository files navigation

########
Overview
########

Twap is a very hacky app that uses the tweetstream library and django
to query the twitter streaming api for tweets filtered by terms
in a searchlist.

Primarily we used this to trag tweets based on hashtags related to
occupy wall street so we can archive these for some kind of analysis
later.

Requirements
============

Additional libraries required to run the application are listed in
requirements.txt.

I recommend using Pip to install the requirements in a virtualenv.

Setup
==========

* Copy localsettings-dist.py to localsettings.py and configure as
  needed.
* In localsettings.py add your TWITTER_USER and TWITTER_PASSWORD information.
* Add a list of terms to filter tweets by in settings.SEARCHLIST
* Syncdb and follow other normal Django setup processes.  Note this does NOT
  need to be setup on apache.  It mostly runs via manage.py

Starting & Stopping Harvest
===========================

** Stopping Harvest **

I made twap to be hard to kill though it seems to lock up ever few days.  I
believe this is because the MySQL log fills up but I'm not sure.  Anyway,
if you need to restart the app you will have to *manually kill the app*
by using the kill command.  To do this you'll need the PID of the thread
which is easily obtained by typing::

    > ps aux | grep twitterharvest

This will list all the jobs running with that name.  I often run this as
'SCREEN' (see below) and here is an example of the output you will get::

    sturnbu   2196  0.0  0.0   5248   528 ?        Ss   Oct25   0:07 SCREEN python manage.py twitterharvest
    sturnbu   2197  1.0 61.1 3096908 2438712 pts/1 Ss+  Oct25  29:04 python manage.py twitterharvest
    sturnbu  18738  0.0  0.0   4156   876 pts/3    S+   12:41   0:00 grep --color=auto twitterharvest

The PID is the number right after the user name and in this example I will
want to kill PID '2196' but ** that value will differ in each case ** so
use the command above to get the current PID.

If in doubt just kill any PID with 'twitterharvest' in it but you should only
have to kill the version with 'SCREEN' in it's process.

To Kill a process you only need to type::

    > kill <PID>

In this example that would be::

    > kill 2196

You can ensure the thread was killed by running the ps command again listed
above but this time you should only see the line with 'grep' in it, which is
the thread you created by running the ps command, and that is expected.

** Starting Harvest **

Harvest is run vi a manage.py command and if the application is installed and
configured right you should only need to run the 'twitterharvest' command from
the applicaiton root directory like so::

    > screen python manage.py twitterharvest

By using the screen command you should be able to close out the window and the
harvest will continue to run even after you log out.



. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . _________
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ./ It’s a twap! \
. . . . . . . . . . . . . . . . _,,,--~~~~~~~~--,_ . . . .\ ._________/
. . . . . . . . . . . . . . ,-‘ : : : :::: :::: :: : : : : :º ‘-, . . \/. . . . . . . . . .
. . . . . . . . . . . . .,-‘ :: : : :::: :::: :::: :::: : : :o : ‘-, . . . . . . . . . .
. . . . . . . . . . . ,-‘ :: ::: :: : : :: :::: :::: :: : : : : :O ‘-, . . . . . . . . .
. . . . . . . . . .,-‘ : :: :: :: :: :: : : : : : , : : :º :::: :::: ::’; . . . . . . . .
. . . . . . . . .,-‘ / / : :: :: :: :: : : :::: :::-, ;; ;; ;; ;; ;; ;; ;\ . . . . . . . .
. . . . . . . . /,-‘,’ :: : : : : : : : : :: :: :: : ‘-, ;; ;; ;; ;; ;; ;;| . . . . . . .
. . . . . . . /,’,-‘ :: :: :: :: :: :: :: : ::_,-~~,_’-, ;; ;; ;; ;; | . . . . . . .
. . . . . _/ :,’ :/ :: :: :: : : :: :: _,-‘/ : ,-‘;’-‘’’’’~-, ;; ;; ;;,’ . . . . . . . .
. . . ,-‘ / : : : : : : ,-‘’’ : : :,--‘’ :|| /,-‘-‘--‘’’__,’’’ \ ;; ;,-‘ . . . . . . . .
. . . \ :/,, : : : _,-‘ --,,_ : : \ :\ ||/ /,-‘-‘x### ::\ \ ;;/ . . . . . . . . . .
. . . . \/ /---‘’’’ : \ #\ : :\ : : \ :\ \| | : (O##º : :/ /-‘’ . . . . . . . . . . .
. . . . /,’____ : :\ ‘-#\ : \, : :\ :\ \ \ : ‘-,___,-‘,-`-,, . . . . . . . . . . .
. . . . ‘ ) : : : :’’’’--,,--,,,,,,¯ \ \ :: ::--,,_’’-,,’’’¯ :’- :’-, . . . . . . . . .
. . . . .) : : : : : : ,, : ‘’’’~~~~’ \ :: :: :: :’’’’’¯ :: ,-‘ :,/\ . . . . . . . . .
. . . . .\,/ /|\\| | :/ / : : : : : : : ,’-, :: :: :: :: ::,--‘’ :,-‘ \ \ . . . . . . . .
. . . . .\\’|\\ \|/ ‘/ / :: :_--,, : , | )’; :: :: :: :,-‘’ : ,-‘ : : :\ \, . . . . . . .
. . . ./¯ :| \ |\ : |/\ :: ::----, :\/ :|/ :: :: ,-‘’ : :,-‘ : : : : : : ‘’-,,_ . . . .
. . ..| : : :/ ‘’-(, :: :: :: ‘’’’’~,,,,,’’ :: ,-‘’ : :,-‘ : : : : : : : : :,-‘’’\\ . . . .
. ,-‘ : : : | : : ‘’) : : :¯’’’’~-,: : ,--‘’’ : :,-‘’ : : : : : : : : : ,-‘ :¯’’’’’-,_ .
./ : : : : :’-, :: | :: :: :: _,,-‘’’’¯ : ,--‘’ : : : : : : : : : : : / : : : : : : :’’-,
/ : : : : : -, :¯’’’’’’’’’’’¯ : : _,,-~’’ : : : : : : : : : : : : : :| : : : : : : : : :
 : : : : : : :¯’’~~~~~~’’’ : : : : : : : : : : : : : : : : : : | : : : : : : : : :

About

A small app to harvest tweets via the twitter streaming api and archive them.

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Python 48.7%
  • CSS 43.5%
  • HTML 7.8%