Skip to content

A server that continually collects and stores tweets from the streaming api

Notifications You must be signed in to change notification settings

cbeach/twitter_vamp

Repository files navigation

Twitter_vamp - Twitter archiving tool
-------------------------------------------------------------------------------
Copyright © 2011 Casey Beach

Developed by Casey Beach to be used as a data collection tool for research 
into natural language processing, machine learning, and data analysis.

Special thanks to Derek Z Hurley for his work on clubot.

TODO:
-------------------------------------------------------------------------------
* Create IRC bot for control.
* Use a config file
* Write database transfer tool.


Release Notes
-------------------------------------------------------------------------------
0.01 - collect_data.py will connect with the twitter api useing a hardcoded
user name and password.  The twitter streaming api is dumped straight to the
terminal.  The data base initialization is written but not yet tested in any
way.


AMQP Structure
-------------------------------------------------------------------------------
Exchanges:
    *direct.raw
    *direct.text
    *direct.mentions
    *direct.hashtags
    *direct.urls
    *direct.user
    *direct.place
    *direct.delete

Structure:
                                     * ----------------*
                                     * ----> text      *
                                     * ----> user      *
*------------*      *----------*     * ----> mentions  *
*  Raw Feed  *----> *  Parser  *---->* ----> hashtags  *
*------------*      *----------*     * ----> urls      *
                                     * ----> place     *
                                     * ----> deletes   *
                                     *-----------------*


Database Description
-------------------------------------------------------------------------------
I'm changing the way the Database works, stay tuned...

About

A server that continually collects and stores tweets from the streaming api

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published