This repository has been archived by the owner on Oct 7, 2021. It is now read-only.




A classic multi-zoo enterprise architecture:

  • Backoffice: The crawler populates the database via the forum's XML API
  • Backend: A JSON-API-to-RDBMS connector
  • Frontend: Displays backend data as tables, graphs etc. and allows user interaction
  • High Performance Offline Transactional Analytics: potstats2-analytics

Le Stack

  • Database: PostgreSQL (10+)
  • Database connector: sqlalchemy/psycopg2
  • HTTP adapter: flask
  • Frontend: Angular
  • Search server: ElasticSearch (7+)


  1. Set up postgres

    For Arch Linux:

    • Install postgres package.
    • su -l postgres
    • [postgres]$ initdb --locale $LANG -E UTF8 -D '/var/lib/postgres/data' --data-checksums
    • # systemctl enable postgresql; systemctl start postgresql
    • [postgres]$ createuser --superuser $YOUR_LOGIN
    • [$YOUR_LOGIN]$ createdb potstats2

    This should work with the default database URL (postgresql://localhost/potstats2).

  2. Create Python environment (possibly not conforming to recent-est PyPA guidelines):

    virtualenv --python=python3 _venv
    . _venv/bin/activate
    pip install [-e,--editable] .
  3. Create DB schema (probably use alembic later):

    potstats2-db create_schema
  4. Load a database dump or run the crawler (it currently runs against some random subforum no one ever cared about, so this should take only about a minute):
  • Crawler: potstats2-worldeater

Configuration (src/potstats2/

  • Optional config file (~/.config/potstats2.ini):

    # See
    db = postgresql://scott:tiger@localhost/mydatabase
    some_flask_setting = 1234
  • Environment variables override config file:

    # No post-mortem debugger
    export POTSTATS2_DEBUG=0
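
The precedence described above (environment variables win over the config file) can be sketched as follows. This is only an illustration, not the project's actual loader: the function name `load_config`, the dummy `[potstats2]` section, and the rule that `POTSTATS2_<NAME>` maps to the lower-cased key `<name>` are all assumptions made for the example.

```python
import configparser
import os

# Prefix taken from the POTSTATS2_DEBUG example; treating every variable
# with this prefix as an override is an assumption of this sketch.
ENV_PREFIX = "POTSTATS2_"


def load_config(ini_text="", environ=None):
    """Merge INI-style settings with environment overrides (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    parser = configparser.ConfigParser(interpolation=None)
    # The sample file has no [section] header, so a dummy one is prepended.
    parser.read_string("[potstats2]\n" + ini_text)
    config = dict(parser["potstats2"])
    # Environment variables override values read from the file.
    for name, value in environ.items():
        if name.startswith(ENV_PREFIX):
            config[name[len(ENV_PREFIX):].lower()] = value
    return config
```

With a file containing `debug = 1` and `POTSTATS2_DEBUG=0` in the environment, the merged result would hold `debug` as `"0"`.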


Instead of complex error handling (beyond, for example, retries within the API abstraction), we can use checkpointing, since all data is stored in an SQL database. The design itself already implies checkpoints, and some extra commits are already in place. Phases that take a long time in practice (thread, update and post discovery) should probably get time-based checkpoints as well.
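
A time-based checkpoint could look roughly like the sketch below. It is not taken from the crawler; the class name `TimedCheckpoint`, the 60-second default and the injected `clock` are assumptions. The only requirement on `session` is a `commit()` method, which an SQLAlchemy session provides.

```python
import time


class TimedCheckpoint:
    """Commit the session once a minimum interval has elapsed (hypothetical helper)."""

    def __init__(self, session, interval_seconds=60.0, clock=time.monotonic):
        self.session = session
        self.interval = interval_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self.last_commit = clock()

    def maybe_commit(self):
        """Commit and return True if the interval has passed, else return False."""
        now = self.clock()
        if now - self.last_commit >= self.interval:
            self.session.commit()
            self.last_commit = now
            return True
        return False
```

A long-running discovery loop would call `maybe_commit()` after each processed item, so that a crash loses at most roughly one interval of work.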


Run potstats2-backend-dev for the usual Flask dev server.


Optional API caching and statistics

Redis can be used to cache API requests. Since there is no time-based expiry, setting a memory limit and an eviction policy is recommended. The redis and blinker packages are required for this functionality (pip install redis blinker or pip install <this package>[cache], i.e. pip install .[cache]).
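
One way to apply such limits from Python is sketched below. The helper name `configure_cache_limits` and the values `256mb` and `allkeys-lru` are assumptions chosen for illustration, not recommendations from this project. `client` is expected to be a redis-py client, whose `config_set` method issues `CONFIG SET`.

```python
def configure_cache_limits(client, max_memory="256mb", policy="allkeys-lru"):
    """Set a memory cap and an eviction policy on a Redis server (hypothetical helper)."""
    # Upper bound on memory used for data; the value here is only an example.
    client.config_set("maxmemory", max_memory)
    # allkeys-lru evicts the least recently used keys once the cap is reached.
    client.config_set("maxmemory-policy", policy)


# Possible usage, assuming REDIS_URL is set as described in this README:
#   import os, redis
#   configure_cache_limits(redis.Redis.from_url(os.environ["REDIS_URL"]))
```

Settings applied with `CONFIG SET` do not survive a server restart unless they are also written to the Redis configuration file.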

After setting redis up, simply set REDIS_URL (for a local installation this would usually be redis:// The cache is automatically invalidated by

  • running database analytics,
  • running worldeater,
  • changing software version.

Manual invalidation is provided by simply flushing the redis database, or alternatively for lazy folks, potstats2-cache invalidate.

The Redis DB used for caching is 0; DB 1 contains some basic statistics.

Statistics are available through the /api/backend-stats endpoint; a recent Firefox version formats the JSON quite nicely by itself. Alternatively, potstats2-cache stats does pretty much the same thing sans HTTP.
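
For scripted access, the endpoint can also be queried with the standard library alone, as in the sketch below. The function names are hypothetical, and the base URL is an assumption: `http://127.0.0.1:5000` is the Flask development server's default, which may differ from an actual deployment. The structure of the returned JSON is not documented here, so the sketch simply returns whatever the endpoint delivers.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def stats_url(base_url):
    """Build the statistics URL from a backend base URL."""
    return urljoin(base_url.rstrip("/") + "/", "api/backend-stats")


def fetch_backend_stats(base_url, timeout=10):
    """Fetch and decode the JSON document served by /api/backend-stats."""
    with urlopen(stats_url(base_url), timeout=timeout) as response:
        return json.load(response)


# Possible usage against a local dev server (assumed address):
#   print(json.dumps(fetch_backend_stats("http://127.0.0.1:5000"), indent=2))
```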


Go to the src/potstats2-frontend directory and npm install it. The script creates a tarball for deployment at dist/potstats2-frontend.tar.gz. The tarball includes pre-compressed files compatible with nginx's gzip_static module.

Search frontend

This is plain JavaScript in an HTML file. No build tools/steps are required.