vittorio-nardone/web-scraping-at-edge

web-scraping-at-edge

How to use web scraping @edge with a Raspberry Pi, AWS Kinesis Data Firehose and AWS Glue.

About this project:

  • Run a Docker container on a Raspberry Pi to scrape the web interface of a Paradox IP150 and read the status of its motion detectors
  • Push the captured data to an AWS Kinesis Data Firehose stream
  • Perform ETL with an AWS Glue job
  • Use a notebook to view the detected Events and Vectors
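
The "push captured data" step can be sketched in Python as follows. This is a minimal illustration, not the code the project actually uses: the payload fields (`time`, `zones`), the zone names and the helper names `build_record` and `push_status` are assumptions made for the example. The only AWS call involved is the Firehose client's `put_record`, which takes a `DeliveryStreamName` and a `Record` holding the bytes to deliver.

```python
import json
from datetime import datetime, timezone


def build_record(zones, captured_at=None):
    """Serialize one scrape result as a newline-terminated JSON record.

    `zones` is a hypothetical mapping of detector name to status.
    """
    captured_at = captured_at or datetime.now(timezone.utc)
    payload = {"time": captured_at.isoformat(), "zones": zones}
    # Firehose concatenates records as-is, so the trailing newline keeps
    # one JSON object per line in the delivered files.
    return {"Data": (json.dumps(payload) + "\n").encode("utf-8")}


def push_status(client, stream_name, zones, captured_at=None):
    """Send one record to the delivery stream through the given Firehose client."""
    return client.put_record(
        DeliveryStreamName=stream_name,
        Record=build_record(zones, captured_at),
    )
```

With boto3 installed and credentials configured, a call such as `push_status(boto3.client("firehose"), "paradox-stream", {"Living room": "open"})` would send one record. Passing the client in as an argument keeps the function easy to try out with a stand-in object before touching AWS.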

Please read the blog post at https://www.vittorionardone.it/en/digital-transformation-blog/

NOTICE: scraping has been tested on the Italian version of the IP150 UI. To add support for other languages, please edit "paradox.py".

Docker setup (on Raspberry PI)

Please install Docker and Docker Compose on your Raspberry Pi first.

  1. Create a ".env" file and provide these variables:
PARADOX_IPADDRESS=192.168.1.x
PARADOX_USERCODE=xxxxxx
PARADOX_PASSWORD=yyyyyyyyyy
KINESIS_STREAM=paradox-stream
KEYPRESS_CHECK=1
  2. Create a ".aws-credentials" file to provide your access key:
[default]
aws_access_key_id=AAAAAAAAAAAAAAAAA
aws_secret_access_key=XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  3. Build the image and run the container:
docker-compose up
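
For readers who want to see how a Python process inside the container could pick up the ".env" variables, here is a hedged sketch. It only assumes that Docker Compose passes the five variables listed above into the container environment; the `Settings` class and `load_settings` function are hypothetical names, not taken from "paradox.py". `KEYPRESS_CHECK` is kept as the raw string because its meaning is defined by the project.

```python
import os
from dataclasses import dataclass

# Variable names as listed in the ".env" file.
REQUIRED = (
    "PARADOX_IPADDRESS",
    "PARADOX_USERCODE",
    "PARADOX_PASSWORD",
    "KINESIS_STREAM",
    "KEYPRESS_CHECK",
)


@dataclass(frozen=True)
class Settings:
    ip_address: str
    user_code: str
    password: str
    stream: str
    keypress_check: str  # raw value, not interpreted in this sketch


def load_settings(env=os.environ):
    """Read the settings from a mapping, failing early if any variable is unset."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("missing variables: " + ", ".join(missing))
    return Settings(
        ip_address=env["PARADOX_IPADDRESS"],
        user_code=env["PARADOX_USERCODE"],
        password=env["PARADOX_PASSWORD"],
        stream=env["KINESIS_STREAM"],
        keypress_check=env["KEYPRESS_CHECK"],
    )
```

Failing at startup with the list of missing names tends to be easier to debug on a headless Raspberry Pi than a login error from the IP150 later on.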

Vectors detection

Heatmap

Local GLUE

You can easily run the notebook with AWS Glue integration locally on your laptop, using Docker.

docker run -itd -p 8888:8888 -p 4040:4040 \
           -v ~/.aws:/root/.aws:ro \
           --name glue_jupyter \
           amazon/aws-glue-libs:glue_libs_1.0.0_image_01 \
           /home/jupyter/jupyter_start.sh
