Distributed System for large scale data management - Ensimag 2018/2019

ygouzerh/smack-it

Smack-it

Distributed System for large scale data management

Architecture

automation/ : Bash scripts that automate the deployment of the infrastructure and the application

config/ : Kubernetes and AWS configuration files

src/ : Source code of our application

  • app/ : Source code of the web application and its Dockerfile
  • aws/ : Source code of the AWS manager (performs actions on AWS)
  • cassandra/ : Database creation code and its Dockerfile
  • kafka/ : Source code of the Kafka producer that generates tweets, and the Dockerfile for Kafka
  • utils/ : Config manager for the project's variables

Getting started

Run the solution

To run the solution automatically, follow these steps:

  1. Configure your AWS credentials:

Follow this guide to create the ~/.aws/credentials and ~/.aws/config files (the config file must at least set the region): https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html

  2. Install the Python 3 packages:

pip3 install -r requirements.txt

You can use virtualenv to isolate this project from your other projects: https://virtualenv.pypa.io/en/latest/

  3. Configure your deployment (optional):

You can configure the deployment by editing the config/instances.ini file.

  4. Go to the automation folder:

cd automation

  5. Launch the deployment script and wait while the infrastructure is created:

./deploy.sh
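Before launching the deployment, it can help to confirm that the AWS files from step 1 contain what is needed. The sketch below is not part of the repository: the function name `missing_aws_settings` and the choice of the `default` profile are assumptions made for illustration. It only checks that the keys exist, not that the credentials are valid.

```python
import configparser
from pathlib import Path

# Standard location of the AWS CLI files in the user's home directory.
AWS_DIR = Path.home() / ".aws"


def missing_aws_settings(credentials_path=AWS_DIR / "credentials",
                         config_path=AWS_DIR / "config",
                         profile="default"):
    """Return a list of problems; an empty list means the files look usable.

    Hypothetical helper, not part of smack-it.
    """
    problems = []

    credentials = configparser.ConfigParser()
    credentials.read(credentials_path)  # a missing file is silently skipped
    if not credentials.has_section(profile):
        problems.append(f"no [{profile}] section in {credentials_path}")
    else:
        for key in ("aws_access_key_id", "aws_secret_access_key"):
            if not credentials.get(profile, key, fallback=""):
                problems.append(f"{key} missing in {credentials_path}")

    config = configparser.ConfigParser()
    config.read(config_path)
    # In the config file, named profiles use a "profile NAME" section header.
    section = profile if profile == "default" else f"profile {profile}"
    if not config.get(section, "region", fallback=""):
        problems.append(f"region missing in {config_path}")

    return problems
```

Calling `missing_aws_settings()` with no arguments inspects the files under ~/.aws and returns one message per missing item, so an empty list suggests the deployment script has the settings it expects.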

Connect to the master and control the Kubernetes cluster (Optional)

To connect to the master and control the cluster, follow these steps:

  1. Go back to the root folder:

cd ..

  2. Get the master's public IP address:

./manage.py read type get-master-public-ip

  3. Connect to the master over SSH:

ssh -i ssh/Smackey ubuntu@MASTER_PUBLIC_IP

  4. Use kubectl commands to inspect and control the cluster, for example:

kubectl get pods

At startup, some pods may show the status 'Error': the cluster needs some time to reach a consistent state.
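Rather than re-running kubectl get pods by hand until the cluster settles, the check can be scripted. The following is a minimal sketch, assuming Python 3 is available on the machine where kubectl runs; the function names `pods_not_ready` and `wait_for_pods`, the timeout, and the polling interval are illustrative choices, not part of the project.

```python
import subprocess
import time


def pods_not_ready(kubectl_output):
    """Return (name, status) pairs for pods that are neither Running nor Completed.

    Expects the text printed by `kubectl get pods --no-headers`, whose
    columns are NAME, READY, STATUS, RESTARTS, AGE.
    """
    pending = []
    for line in kubectl_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or unexpected lines
        name, status = fields[0], fields[2]
        if status not in ("Running", "Completed"):
            pending.append((name, status))
    return pending


def wait_for_pods(timeout_s=300, interval_s=10):
    """Poll kubectl until every listed pod is Running or Completed.

    Returns True on success and False if the timeout expires first.
    An empty pod list also counts as success.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = subprocess.run(
            ["kubectl", "get", "pods", "--no-headers"],
            capture_output=True, text=True, check=True,
        )
        if not pods_not_ready(result.stdout):
            return True
        time.sleep(interval_s)
    return False
```

The parsing is kept separate from the polling so that it can be tried on captured output without a cluster. A pod in the 'Error' state simply keeps the loop waiting until it recovers or the timeout expires.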

See the application running

  1. Get a worker's public IP address:

./manage.py read type get-workers-public-ip

  2. Open a browser at the address below:

http://ONE_WORKER_PUBLIC_IP:32222
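The same check can be done without a browser. This sketch builds the URL from a worker IP and probes it over HTTP. The helper names `app_url` and `is_reachable` are hypothetical, and it assumes the IP is passed in by hand, since the output format of manage.py is not described here.

```python
import ipaddress
import urllib.error
import urllib.request

APP_NODE_PORT = 32222  # port given in the instructions above


def app_url(worker_ip, port=APP_NODE_PORT):
    """Build the application URL for one worker's public IPv4 address."""
    ipaddress.IPv4Address(worker_ip)  # raises ValueError on a malformed address
    return f"http://{worker_ip}:{port}"


def is_reachable(url, timeout_s=5):
    """Return True if the URL answers with a successful HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False
```

For example, `is_reachable(app_url(ip))`, with `ip` set to one of the addresses returned in step 1, reports whether the web application is answering yet. A False result shortly after deployment may only mean the pods are still starting.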

Contributing

All contributions are welcome.

Please read CONTRIBUTING.md before contributing to this project.