This is a simple activity service for saving and retrieving historical events. This service has just two endpoints /heath
and /events
. The /events
is the main endpoint while the /health
is miscellaneous. Schema for the endpoints are as below;
-
GET /heath
- This endpoint is provided for pinging the service to check if it is up or not. response:{ "status": "healthy" }
-
GET /events
- This returns a list of historical events based on the query params supplied. When there is no query params supplied for filtering, it returns all events in the database.request:
/events/?email=ade%40gmail.com&environment=production&component=order&message=failed&from_date=04-16-2021
response:
{ "events": [ { "component": "order", "data": { "age": 20 }, "email": "ade@gmail.com", "environment": "production", "message": "order failed", "id": "ed24df5o903-ea47ce8a-3ea7df9", "created_at": "12411411.23" } ] }
-
POST /events
- This is used to save events.request:
{ "component": "order", "data": { "age": 20 }, "email": "ade@gmail.com", "environment": "production", "message": "order failed" }
response:
{ "component": "order", "data": { "age": 20 }, "email": "ade@gmail.com", "environment": "production", "message": "order failed", "id": "ed24df5o903-ea47ce8a-3ea7df9", "created_at": "12411411.23" }
For easy distribution and ease of starting up, I have dockerized the app and the postgres db used for it.
I used flask, flask-restplus, flask-migrate and other related flask libraries to build the app.
Postgresql is the choice of database used for storing events while using flask-SQLAlchemy as the ORM. Any Postgres database can be used with the app, we just need to save the credentials in the .env
file.
Clone the repository using either https
or ssh
-
Install and launch Docker. Ensure it's running.
-
Ensure no other service is running on port
5000
. -
On a terminal,
cd
into the root directory. -
Create a
.env
file and update it with the postgres and other environments' details in.env.sample
file in the directory. For the purpose of testing, we will keep these values so that we can use the postgres db within docker. You can also replace the values in there with your choice. -
Run the command below. This should build and run the app.
docker-compose up
-
After the containers have been successfully built and running, we can access the endpoints at
http://localhost:5000/api/health
andhttp://localhost:5000/api/events
.
-
Run your local or remote postgresql then get the credentials.
-
On a terminal,
cd
into the root directory. -
Create a
.env
file and update it with the postgres and other environments' details in.env.sample
file in the directory. Replace the postgre variables with your database credentials. -
Run the command below to start the app;
flask run
-
Open another terminal tab,
cd
into the root directory and run the command below to set up the database.flask db upgrade
-
You can now access the endpoints at
http://localhost:5000/api/health
andhttp://localhost:5000/api/events
.
This app is structured in a MVC format such that there can be separation of responsibilities between the different layers of the app. The folder structure is displayed below. Each resource has a model, controller and corresponding view.
root directory
.
+-- models
| +-- model_a.py
| +-- model_b.py
+-- restapi
| +-- endpoints
| | +-- resource_a.py
| | +-- resource_b.py
+-- services
| +-- service_a.py
| +-- service_b.py
+-- other files
This is the data layer where the type of data to be stored in the database is declared. Here is where data saving and retrieving happens. Each resource will typically have a corresponding file for its model. In this app, we have only one resource which is event
, so we have just one file in the models folder.
id
- A unique hexadecimal string to identify each event data. I used a UUID4 (universally unique identifier - version 4) generator to automatically generate this unique id because;- The UUID4 uses a random number to generate unique id such that the probability of generating two same one at different times is so low that it can be ignored.
- An auto-generated integer incrementing id would have guaranteed non-collision in id but that is not safe because guessing a number has a high probability that a record exists that matches it, which makes it a lot easier for an attacker to get unauthorized data.
component
- Required string type.created_at
- A datetime field generated when the event data is saved.data
- Json data type with no specific schema.email
- Email address saved as string.environment
- Required string type telling where the event occured.message
- Required string type which describes the event in a way.
This layer represents the views in our MVC structure as this is where input data are collected and also responsible for formatting and sending out responses to the user's input. User's input and responses are also validated on this layer using expect
and marshalling
from flask-restplus.
This is an intermediary between the data layer and the presentation layer. It holds all the business logic
which means decisions about the data regarding the business can be implemented here. Each resource will have a corresponding service module that holds business logic associated with that resource.
For business logic that cuts across multiple resources, this can be placed inside an util
module such that it is accessible to different services. And if in the future, the business logic becomes complex and large, it could be extracted into a microservice of its own.
I would write both integration and unit tests.
These are tests to confirm the end-to-end functionality of the app. This means actual data are used for testing and real results are expected. I would set up a different docker container with a test database which actual data can be written to when testing and the data and table will be removed after testing.
This will be done by calling the endpoint with different data combinations and confirming the response matches the expected output.
Different scenarios that will be tested for POST
are;
- All request data are valid.
- All request data are empty.
- Invalid email.
- Invalid json data.
- Empty component data.
- Empty message data.
- Empty environment data.
Different scenarios that will be tested for GET
are;
- Request with no query params.
- Request with all query params.
- Request with invalid
from_date
query param.
This is carried out by passing both valid and invalid data to the data layer and trying to save it to confirm if the operation is successful when data is valid and error thrown when invalid.
These are tests majorly written for functions executing the business logic. A set of different combinations of arguments are passed to the functions and the outputs are confirmed if they match the expected behaviour. In the case of activity service, we will be passing different data values to the functions in the event service module while mocking calls to the database. The returned values are asserted and the reactions (error raised or otherwise) are confirmed to meet the expected behaviour.
When there are over a hundred activities happening per second on the database, this could cause deadlock of the database. There are a couple of ways to mitigate this which include caching, queue system, load balancing and use of NoSQL.
A cache can be implemented for read activity to the database using Redis. In the case of events, when a query is made to get events from the database for the tiem, the result is cached in redis such that subsequent request with the same query params does not go to the database but get the result from the cache thereby reducing the number of operations going to the database.
A queue system can be placed between the controller and the data layer such that every post request goes into the queue and when one is not done, the next one does not get executed. This way, the database only gets one request at a time.
Using a non-relational database in place of postgresql would be another option to consider if the requests are just over 100 since they have a tendency of handling high volume of multiple requests due to their nature.