Web service for shortening URLs, using the Tornado Python framework.

jmd-dk/url-shortener

URL shortener

This Python project implements URL shortening as a web service, using the Tornado framework.

To help the service scale, it uses concurrency at both the asyncio and multiprocessing levels.

The service maps arbitrary URLs to short URLs, which are kept in a sqlite3 database. The domain of the short URLs is that of the server, so that it can handle the redirection as well.

Design

  • Web framework: Tornado was chosen as the web framework and HTTP server, as it makes it easy to use asyncio. The even simpler built-in threaded HTTP server in Python was rejected due to poor performance under high load.
  • Concurrency: The use of asyncio means that a single process can serve many requests concurrently. Furthermore, any number of processes may be added to the pool of web servers, through the use of a multiprocessing feature of Tornado. On top of this, a separate process handles the database.
  • Database: We need a persistent mapping from long URLs to short URLs, as well as the reverse (for later redirection). For this, two key-value data stores are ideal. This project uses the sqlitedict Python package, which provides such key-value stores on top of SQLite.
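
The two-way mapping described in the Database bullet can be sketched with the standard-library sqlite3 module. This is not the project's code, which uses sqlitedict; the class name TwoWayStore and the table and column names are assumptions made purely for illustration.

```python
import sqlite3


class TwoWayStore:
    """Hypothetical stand-in for two key-value stores kept in one SQLite file."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the sketch self-contained; pass a file path to persist.
        self.conn = sqlite3.connect(path)
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS long_to_short "
                "(long_url TEXT PRIMARY KEY, short_key TEXT NOT NULL)"
            )
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS short_to_long "
                "(short_key TEXT PRIMARY KEY, long_url TEXT NOT NULL)"
            )

    def add(self, long_url, short_key):
        """Record the pair in both directions within a single transaction."""
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO long_to_short VALUES (?, ?)",
                (long_url, short_key),
            )
            self.conn.execute(
                "INSERT OR REPLACE INTO short_to_long VALUES (?, ?)",
                (short_key, long_url),
            )

    def short_for(self, long_url):
        """Return the short key for a long URL, or None if it is unknown."""
        row = self.conn.execute(
            "SELECT short_key FROM long_to_short WHERE long_url = ?", (long_url,)
        ).fetchone()
        return row[0] if row else None

    def long_for(self, short_key):
        """Return the long URL for a short key, or None if it is unknown."""
        row = self.conn.execute(
            "SELECT long_url FROM short_to_long WHERE short_key = ?", (short_key,)
        ).fetchone()
        return row[0] if row else None
```

Shortening a URL would look up short_for first and only call add when nothing is found; redirection would use long_for.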

A key problem is how to generate the short URLs. Instead of relying on hashing (and dealing with collisions), the program simply generates every¹ possible alphanumeric string in order.
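
One way to enumerate alphanumeric strings in order is to count upwards and write each counter value in base 62. The sketch below takes that approach; it is an illustration, not necessarily how url_shortener.py does it. The alphabet ordering (digits, then lowercase, then uppercase), the starting value 0, and the names ALPHABET, encode and short_keys are assumptions. Because "0" is the first symbol, no generated string has a leading zero, in line with the footnote about left 0-padded strings.

```python
import string
from itertools import count

# Assumed symbol order: 0-9, a-z, A-Z (62 symbols in total).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode(n):
    """Write the non-negative integer n in base len(ALPHABET)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    base = len(ALPHABET)
    chars = []
    while n:
        n, remainder = divmod(n, base)
        chars.append(ALPHABET[remainder])
    # Digits were produced least significant first, so reverse them.
    return "".join(reversed(chars))


def short_keys(start=0):
    """Yield short keys in order, one per counter value, without repeats."""
    for n in count(start):
        yield encode(n)
```

Since each counter value maps to a distinct string, no collision handling is needed; the only state to persist is the last counter value used.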

To sidestep the problem of concurrent access to the database, a separate process is dedicated to database access. It communicates with the server processes using queues. Communication through these queues is rather slow, making it the bottleneck of the program. On top of this, the database process does not make use of asyncio or threads.
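
The single-owner pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the message format (worker_id, action, value), the use of one reply queue per server process, the None shutdown sentinel, and the names database_worker and start_database_process are all hypothetical. Plain dicts and a decimal counter stand in for the real stores and key generator.

```python
import multiprocessing
from itertools import count


def database_worker(requests, replies):
    """Serve requests one at a time until a None sentinel arrives.

    requests: queue of (worker_id, action, value) tuples (assumed format).
    replies:  sequence of queues, one per server process (assumed layout).
    """
    long_to_short = {}  # stand-in for the persistent long -> short store
    short_to_long = {}  # stand-in for the persistent short -> long store
    counter = count(1)  # placeholder key generator
    while True:
        request = requests.get()  # blocks; requests are handled strictly in order
        if request is None:
            break
        worker_id, action, value = request
        if action == "shorten":
            result = long_to_short.get(value)
            if result is None:
                result = str(next(counter))
                long_to_short[value] = result
                short_to_long[result] = value
        elif action == "expand":
            result = short_to_long.get(value)
        else:
            result = None
        replies[worker_id].put(result)


def start_database_process(n_servers):
    """Start the worker in its own process and return it with its queues."""
    requests = multiprocessing.Queue()
    replies = [multiprocessing.Queue() for _ in range(n_servers)]
    process = multiprocessing.Process(
        target=database_worker, args=(requests, replies), daemon=True
    )
    process.start()
    return process, requests, replies
```

Because only the worker touches the stores, no locking is needed, but every request pays for a round trip through two queues, which is consistent with the bottleneck noted above.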

How to run locally

To run the code, you will need² Python 3.6+, along with the Tornado, sqlitedict and url-normalize third-party packages. The code also uses various standard shell utilities, which have only been tested on a Linux system (i.e. using GNU utilities).

The program url_shortener.py can either be run as a script or imported as a Python module. To run it as a script, do

    python url_shortener.py

The web service will start up on localhost, using port=8000, as displayed on start-up. Now visit http://localhost:8000/ in a browser and you will be presented with a self-explanatory interface.

To change the settings shown at start-up, set the corresponding environment variables. Consider

    export port=8888  # Or supply it as below
    nprocs=6 verbosity=2 python url_shortener.py

which starts url_shortener.py on port 8888, using 6 server processes and a verbosity level of 2.
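
As a rough illustration of how such environment variables might be turned into settings, consider the sketch below. Only the variable names and the default port of 8000 come from this README; the defaults for nprocs and verbosity, and the names DEFAULTS and read_settings, are assumptions and may differ from what url_shortener.py actually does.

```python
import os

# port=8000 is the documented default; the other two defaults are assumed.
DEFAULTS = {"port": 8000, "nprocs": 1, "verbosity": 1}


def read_settings(environ=None):
    """Return integer settings, preferring environment values over defaults."""
    if environ is None:
        environ = os.environ
    settings = {}
    for name, default in DEFAULTS.items():
        raw = environ.get(name)
        # Environment values are strings, so convert them to int.
        settings[name] = default if raw is None else int(raw)
    return settings
```

Passing a plain dict instead of os.environ makes the function easy to try out, e.g. read_settings({"port": "8888"}).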

To run the service from within another Python module, include code like so:

    import url_shortener

    # Non-blocking example
    killer = url_shortener.start_service(block=False)
    # Do computation
    ...
    # When done, shut down the service like so
    killer()

    # Blocking example
    url_shortener.start_service(port=8888, nprocs=6)
    # Code below this point is not reached until the service is killed,
    # e.g. via Ctrl+C

Testing

A test suite consisting of correctness and stress tests can be found in the test bash script. As with the Python script, the various settings may be set through environment variables. Importantly, the python variable, which stores the path to the Python interpreter, should also be set (this may also be done permanently in the test source).

To run the test script, simply execute it:

    nprocs=4 ./test

If the service is already running on the matching address and port, that instance will be used for the tests. Otherwise, a new service will be spun up. As the service emits some warnings during the tests, it is nicer to run it in a terminal window separate from the tests.
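
To check by hand whether something is already listening before starting the tests, a TCP connection attempt is usually enough. The helper below is a hypothetical sketch using only the standard library; the name service_is_running is made up here, and the test script may well detect a running service in a different way.

```python
import socket


def service_is_running(host="localhost", port=8000, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Note that this only shows that some process accepts connections on the port, not that it is the URL shortener.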

¹ With the exception of left 0-padded strings.

² Python 3.7+ is needed to run without constant TypeErrors.
