Simpals. All is simple :)
An attempt at the Simpals test task: scraping the 999 user adverts API, storing the adverts in a database, and indexing all data into Elasticsearch in real time :)
- I decided I wanted this to be fully distributed, so I split the logic into several worker processes that coordinate via Redis queues, orchestrated with docker-compose
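The queue handoff between workers can be sketched as a Redis list used as a FIFO queue (`LPUSH` on one side, `BRPOP` on the other). This is a minimal illustration, not the project's actual code: the queue name is made up, and a tiny in-memory stub stands in for `redis.Redis()` so the sketch runs without a server.

```python
import json
from collections import deque

def enqueue(client, queue_name, task):
    # Producer side: LPUSH a JSON-encoded task onto the left of the list.
    client.lpush(queue_name, json.dumps(task))

def dequeue(client, queue_name, timeout=5):
    # Consumer side: BRPOP blocks until a task arrives (or timeout expires).
    item = client.brpop(queue_name, timeout=timeout)
    if item is None:
        return None
    _, payload = item  # brpop returns (queue_name, payload)
    return json.loads(payload)

# In the real stack `client` would be redis.Redis(); this stub only exists
# so the example is self-contained.
class FakeRedis:
    def __init__(self):
        self.queues = {}
    def lpush(self, name, value):
        self.queues.setdefault(name, deque()).appendleft(value)
    def brpop(self, name, timeout=0):
        q = self.queues.get(name)
        return (name, q.pop()) if q else None

r = FakeRedis()
enqueue(r, "adverts:raw", {"advert_id": 62733250})
task = dequeue(r, "adverts:raw")
```

Left-push plus right-pop gives FIFO ordering, and `BRPOP`'s blocking behavior lets idle workers wait without polling.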
- I decided to ride the bleeding edge and refactored everything from Tornado to asyncio/uvloop
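The asyncio/uvloop wiring is just a drop-in event-loop swap. A minimal sketch (the `main` coroutine is a placeholder, not the project's actual entry point); the `try/except` keeps it working even where uvloop isn't installed:

```python
import asyncio

try:
    import uvloop  # drop-in, faster event loop implementation
    uvloop.install()  # replace the default asyncio event loop policy
except ImportError:
    pass  # fall back to the stdlib event loop

async def main():
    # placeholder for real worker logic
    await asyncio.sleep(0)
    return "ok"

result = asyncio.run(main())
```

Because uvloop is API-compatible with asyncio, the rest of the code doesn't change at all.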
- Rather than converting EUR to MDL on every request, I also decided to store all BNM exchange rates in MongoDB
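With the rates cached, conversion is a lookup plus a multiplication. A sketch assuming the cache is a currency-code→MDL-rate mapping; the rate values below are illustrative, not real BNM data:

```python
def convert(amount, currency, rates, target="MDL"):
    """Convert `amount` in `currency` to `target` using cached BNM rates.

    `rates` maps currency code -> value of one unit in MDL
    (e.g. loaded from the MongoDB rates collection).
    """
    if currency == target:
        return amount
    return round(amount * rates[currency], 2)

# Illustrative rates only, not real BNM values.
rates = {"EUR": 19.25, "USD": 17.80}
price_mdl = convert(100, "EUR", rates)
```

Storing the rates means a BNM outage or rate-limit doesn't block advert processing.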
- To index all needed data into Elasticsearch in real time, I used monstache. Its main requirement is that MongoDB run as a replica set (I decided to simply switch the standalone MongoDB server into replica-set mode)
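A monstache setup like this boils down to a small TOML file pointing at both stores. A hypothetical sketch, not this project's actual config — the database and collection names are assumptions:

```toml
# Hypothetical monstache.toml sketch; db/collection names are assumptions.
mongo-url = "mongodb://db:27017"
elasticsearch-urls = ["http://elasticsearch:9200"]

# MongoDB namespaces (db.collection) to mirror into Elasticsearch
direct-read-namespaces = ["simpals.adverts"]
change-stream-namespaces = ["simpals.adverts"]

# resume from the last synced point after a restart
resume = true
```

Change streams are what make the "live" indexing work, and they are exactly the feature that requires the replica set mentioned above.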
- Add web interface for metrics
- Add unit tests
- Add integration tests
- `import.py` is a raw adverts importer
- `metrics.py` keeps tabs on various stats and pushes them out every few seconds
- `fetcher.py` fetches adverts, converts all needed data, and stores it in MongoDB
- `web.py` provides a simple web front-end for live status updates via SSE
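The SSE side of `web.py` ultimately emits messages in the `text/event-stream` wire format: `event:`/`data:` lines terminated by a blank line. A small formatting helper as a sketch (the payload and event name are made up):

```python
import json

def sse_event(data, event=None):
    """Serialize a payload as one Server-Sent Events message.

    The browser's EventSource parses `event:`/`data:` lines; a blank
    line terminates each message.
    """
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"

msg = sse_event({"fetched": 42}, event="metrics")
```

SSE keeps the front-end one-directional and dead simple compared to WebSockets, which fits a status page well.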
Processes are written to leverage asyncio/uvloop and interact via Redis.
A docker-compose file is supplied for running the entire stack locally.
1. `docker-compose up --no-start && docker-compose start`
2. `docker-compose ps`
3. Find the name of the db container (for example, `simp_als_db_1`)
4. `docker exec -it $name_of_container_found_in_step_3 mongo`
5. `rs.initiate()` — run inside the mongo shell; this turns the standalone server into a single-node replica set, which monstache needs
6. Check that Elasticsearch is up: `curl -X GET "localhost:9200/_cat/nodes?v&pretty"`
URI Parameters:
- `page`: page number, default: 1
- `per_page`: number of elements per page, default: 50

Example: `curl -X GET "http://localhost:8000/raw_adverts?page=1&per_page=1"`

- Get all detailed adverts: `curl -X GET "http://localhost:8000/adverts"`
- Get all raw adverts: `curl -X GET "http://localhost:8000/raw_adverts"`
- Get a specific raw advert: `curl -X GET "http://localhost:8000/raw_adverts/62733250?page=1&per_page=1"`
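Server-side, `page`/`per_page` translate into MongoDB's skip/limit. A sketch of that translation; the `max_per_page` cap is a hypothetical safety limit, not something the API above documents:

```python
def pagination(page=1, per_page=50, max_per_page=200):
    """Turn page/per_page query params into MongoDB skip/limit values.

    Defaults mirror the API's (page=1, per_page=50); out-of-range
    inputs are clamped rather than rejected.
    """
    page = max(1, int(page))
    per_page = min(max(1, int(per_page)), max_per_page)
    return (page - 1) * per_page, per_page

skip, limit = pagination(page=3, per_page=20)
```

The resulting pair feeds straight into a cursor, e.g. `collection.find().skip(skip).limit(limit)`.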
Connect to the MongoDB database:

`docker exec -it simp-als_db_1 mongo`

Then run the desired query :))