Quorum

Your go-to ETL for online data

Architecture

Quorum has four main components:

kafka: An Apache Kafka server for streaming data to and from the other services.
scheduler: A lightweight scheduler that plans and manages what to scrape and when. Based on schedule.
storage: Consumes data from a predefined Kafka topic and stores it in a persistent volume.
quorum: Content scrapers that produce to predefined Kafka topics. So far we have incorporated the following scrapers:
- Twitter
- Facebook
- Reddit
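
As a rough illustration of how a scraper and the storage service might meet at a Kafka topic, here is a minimal sketch. It is not the project's actual code: the function names and the JSON Lines file format are assumptions made for this example. The producer is expected to offer a send(topic, value=...) method, as KafkaProducer from the kafka-python package does, and each consumed message is expected to expose its payload as bytes on .value, as the records yielded by a KafkaConsumer do.

```python
import json


def publish_items(producer, topic, items):
    """Send each scraped item to a Kafka topic as UTF-8 encoded JSON.

    `producer` is any object with a send(topic, value=...) method.
    Returns the number of items sent.
    """
    count = 0
    for item in items:
        producer.send(topic, value=json.dumps(item).encode("utf-8"))
        count += 1
    return count


def store_messages(messages, path):
    """Append the payload of each consumed message to a file, one per line.

    `messages` is any iterable of objects whose .value attribute is bytes.
    """
    with open(path, "ab") as handle:
        for message in messages:
            handle.write(message.value + b"\n")
```

Passing the producer and the message iterable in as arguments keeps both functions easy to test without a running broker; in a real deployment they would be a Kafka producer and consumer connected to the kafka service.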

Usage

The scrapers are implemented using the official APIs for each platform, so you will need the proper credentials for each service you want to use.

You also need to specify which accounts, pages, and subreddits to scrape.

In the top-level directory, run the command below.

  $ cp config_example.py config.py

Edit config.py with the needed service credentials.
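
For orientation only, a config.py could look something like the sketch below. Every name in it (CREDENTIALS, TARGETS, the individual keys, and the missing_credentials helper) is hypothetical; the real settings are the ones defined in config_example.py, which should be treated as the reference.

```python
# Hypothetical layout, for illustration only.
# The actual setting names are defined in config_example.py.
CREDENTIALS = {
    "twitter": {"api_key": "", "api_secret": ""},
    "facebook": {"access_token": ""},
    "reddit": {"client_id": "", "client_secret": ""},
}

# Hypothetical: what to scrape on each platform.
TARGETS = {
    "twitter": [],   # account names
    "facebook": [],  # page names
    "reddit": [],    # subreddit names
}


def missing_credentials(credentials):
    """Return the 'service.field' names whose value is still empty."""
    return sorted(
        f"{service}.{field}"
        for service, fields in credentials.items()
        for field, value in fields.items()
        if not value
    )
```

A check like missing_credentials(CREDENTIALS) before starting the services would catch placeholders that were never filled in.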

To spin up your own instance of Quorum, run the following command from the top-level directory of this project:

  $ ./start.sh

Tests

$ python -m unittest discover quorum/facebook

Todo: add a way to run the tests in all folders with a single script or command. Dependencies will also need to be installed beforehand for this to work.

Working on the documentation :-)
