JensGe/owi_Scheduler

owi Scheduler Overview

A distributed web crawling scheduler component that distributes URLs from the Frontier to multiple Fetchers.

The distributed Fetchers request new URL lists via REST API calls.
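
The hand-off described above can be pictured with a small in-memory sketch. It is not the project's implementation: the `Frontier` class, its `next_url_list` method, and the `fetcher_id` / `amount` parameters are hypothetical names chosen for illustration, and the real scheduler serves such lists through its REST API rather than through a direct method call.

```python
from collections import deque
from typing import Deque, Dict, Iterable, List


class Frontier:
    """Hypothetical in-memory frontier; stands in for the real URL store."""

    def __init__(self, urls: Iterable[str]) -> None:
        # URLs waiting to be crawled, in first-in-first-out order.
        self._pending: Deque[str] = deque(urls)
        # Which fetcher was handed which URLs (assumed bookkeeping).
        self.assigned: Dict[str, List[str]] = {}

    def next_url_list(self, fetcher_id: str, amount: int) -> List[str]:
        """Hand out up to `amount` pending URLs to a single fetcher."""
        count = min(max(amount, 0), len(self._pending))
        batch = [self._pending.popleft() for _ in range(count)]
        self.assigned.setdefault(fetcher_id, []).extend(batch)
        return batch


if __name__ == "__main__":
    frontier = Frontier(f"https://example.com/page/{i}" for i in range(5))
    print(frontier.next_url_list("fetcher-a", 2))
    print(frontier.next_url_list("fetcher-b", 2))
```

Because each URL is removed from the queue before it is returned, two fetchers never receive the same URL in this sketch.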

Specification

Python Packages

The project is built on the Python package FastAPI (MIT licensed) (https://fastapi.tiangolo.com/). FastAPI itself is built on top of Starlette (for the web parts) and Pydantic (for the data parts).

The project also imports parts of the following libraries and frameworks:

Docker Image

The Docker image provided by FastAPI is used as well:

  • tiangolo/uvicorn-gunicorn-fastapi:latest

Deployment

The project is deployed on an AWS EC2 Ubuntu Machine.

Online docs: http://ec2-18-185-96-23.eu-central-1.compute.amazonaws.com/docs

Commands

Re-run the local Docker image (Windows PowerShell)

docker ps -q | % { docker stop $_ }
docker pull dockerjens23/websch
docker build -t websch .
docker run -d -p 80:80 websch

Re-run the remote Docker image (Ubuntu)

sudo docker stop $(sudo docker ps -q)
sudo docker pull dockerjens23/websch
sudo docker run -d -p 80:80 dockerjens23/websch

Get log output of the running container

sudo docker logs --follow $(sudo docker ps -q)

Linux Server Admin Commands

# disk free (human-readable)
df -h
# list all Docker containers (including stopped ones)
sudo docker ps -a

Start Docker with PostgreSQL Credentials as Environment Variables

sudo docker run --env-file ./env.list -p 80:80 dockerjens23/websch

Environment Variables file

POSTGRES_ENV_USER=...
POSTGRES_ENV_PW=...
POSTGRES_ENV_URI=...
POSTGRES_ENV_DB=...
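
Inside the container, the application can read these variables from its environment. The sketch below shows one way to do that in Python. The helper name `postgres_dsn` is hypothetical, and the code assumes that `POSTGRES_ENV_URI` holds the database host (optionally with a port), which the variable name suggests but this README does not state.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote

# Variable names taken from the environment file above.
REQUIRED_VARS = (
    "POSTGRES_ENV_USER",
    "POSTGRES_ENV_PW",
    "POSTGRES_ENV_URI",
    "POSTGRES_ENV_DB",
)


def postgres_dsn(env: Optional[Mapping[str, str]] = None) -> str:
    """Build a PostgreSQL connection URL from the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    # Fail early with a clear message if a variable is unset or empty.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Percent-encode credentials so characters such as '@' or '/' stay valid.
    user = quote(env["POSTGRES_ENV_USER"], safe="")
    password = quote(env["POSTGRES_ENV_PW"], safe="")
    # Assumption: POSTGRES_ENV_URI is the host, optionally followed by ':port'.
    host = env["POSTGRES_ENV_URI"]
    database = env["POSTGRES_ENV_DB"]
    return f"postgresql://{user}:{password}@{host}/{database}"
```

Passing a plain dictionary instead of relying on `os.environ` makes the helper easy to try out without starting a container.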
