Cool Open-Source Robust Nice and Easy Tightening Tool Online

Free, open-source tool to create and manage static versions of any website.

Available for Linux.



  • Generate static versions from any dynamic website
  • Manage the previously generated static versions
    • Deploy them to your production git repository (your web server should pull that repository to update your static website)
    • Visualize them


Cornetto requires Python 3.4 or later.

The source of the application is divided into two folders

  • back - Python (Flask) source code for the API
  • front - JavaScript (React) source code for the user interface

If you want to contribute, feel free to fork the repository and open pull requests.

Backend API

The backend API is built with Python. It uses the Flask framework and can be launched as a WSGI application. You will find sample configurations for both Cornetto and Apache httpd in the back/cfg folder.

  • Install the required package dependencies. If you want to use Apache to serve the API, you will also need libapache2-mod-wsgi-py3.
# apt install libxml2 libxslt1.1 python3-lxml libimage-exiftool-perl libapache2-mod-wsgi-py3
  • Configure the virtual env and install Cornetto and its dependencies
/back/ $ python3 -m venv venv
/back/ $ . venv/bin/activate
/back/ $ pip3 install -e .
  • Copy back/cfg/ to the directory of your choice
    • We advise that it be /etc/cornetto/
    • Ensure that the path in the create_app(/etc/cornetto/ call matches that directory
/back/ # mkdir /etc/cornetto
/back/ # cp -r cfg/. /etc/cornetto/
  • Adapt the settings to your needs (see Configuring Cornetto)

  • Configure Apache to serve the API or use Gunicorn

    • Example : launching the API with Gunicorn
/back/$ gunicorn -w 4 wsgi:application
[2019-04-12 17:18:57 +0200] [6439] [INFO] Starting gunicorn 19.7.1
[2019-04-12 17:18:57 +0200] [6439] [INFO] Listening at: (6439)
[2019-04-12 17:18:57 +0200] [6439] [INFO] Using worker: sync
[2019-04-12 17:18:57 +0200] [6442] [INFO] Booting worker with pid: 6442
[2019-04-12 17:18:57 +0200] [6443] [INFO] Booting worker with pid: 6443
[2019-04-12 17:18:57 +0200] [6446] [INFO] Booting worker with pid: 6446
[2019-04-12 17:18:57 +0200] [6447] [INFO] Booting worker with pid: 6447
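The wsgi:application argument in the command above follows Gunicorn's module:callable convention: Gunicorn imports the module wsgi and serves the WSGI callable named application. The sketch below shows what such a callable looks like in general. It is not Cornetto's actual wsgi.py; the response body and the call_once helper are illustrative assumptions.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Hypothetical stand-in for the callable Gunicorn looks up as wsgi:application."""
    body = b"ok"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


def call_once(app):
    """Invoke a WSGI app with a synthetic request and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal, valid WSGI environ
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

Calling call_once(application) exercises the callable without starting a server, which can help when checking that a wsgi module imports cleanly before handing it to Gunicorn.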


The web interface is built with React and uses react-scripts.

  • In the front/ directory, install node dependencies by running

    /front/ $ npm install
  • Create a production build by running

    /front/ $ npm run build
  • It will create a build/ folder that you can rename and copy to your webserver root directory. For example with the webserver root directory /var/www

    /front/ $ mv build/ cornetto_front/
    /front/ $ cp -r cornetto_front/ /var/www/
  • If you only want to launch the front in development mode, run

    /front/ $ npm start
    • In development mode, make sure to launch the API first and to proxy API calls in package.json

        "proxy": "http://localhost:8000",
    • The default proxy set in package.json is intended to work with an API started with Gunicorn on port 8000
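Before starting the development server, it can be useful to confirm which proxy target package.json actually contains. The following is a minimal sketch, assuming the file is standard JSON with a top-level "proxy" key as shown above; read_proxy is a hypothetical helper, not part of Cornetto.

```python
import json
from pathlib import Path


def read_proxy(package_json_path):
    """Return the "proxy" value from a package.json file, or None if absent."""
    data = json.loads(Path(package_json_path).read_text(encoding="utf-8"))
    return data.get("proxy")
```

For the default setup described above, read_proxy("front/package.json") would be expected to return the Gunicorn address on port 8000.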

Frontend Customization

You might want to modify frontend parameters before building.

In the file /front/src/strings.js, edit lines 159 to 162 to adapt the parameters to your needs.

url: {
  site_static: '',
  site_visualize: '',
  site_prod: '',
  name: ''
}

At line 594 of the file /front/src/sagas/statifications.js, a condition checks the number of HTML errors and Scrapy errors against a maximum. The limit defaults to 500; you might want to adjust this number to your needs.

593     // maximum number of errors accepted
594     if (result && (result.html_errors.length > 500 || result.scrapy_errors.length > 500)) {
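To make the threshold logic explicit, here is a Python mirror of the quoted JavaScript condition. It is only an illustration of the check: the function name and the limit parameter are assumptions, and the real logic lives in the JavaScript file.

```python
def too_many_errors(result, limit=500):
    """Return True when either error list holds more than `limit` entries."""
    if not result:
        # Same short-circuit as `result && (...)` in the JavaScript source.
        return False
    return (len(result["html_errors"]) > limit
            or len(result["scrapy_errors"]) > limit)
```

Note that the comparison is strict: exactly 500 errors in a list is still accepted, and the condition only triggers from 501 onwards.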

Installation walkthrough

This installation example assumes that you are using Apache on a Debian-based server.

  • Install the dependencies

    # apt install libxml2 libxslt1.1 python3-lxml libimage-exiftool-perl libapache2-mod-wsgi-py3
  • back/cfg/ contains all the settings of Cornetto. It should be placed in /etc/cornetto/

  • The source of the API inside the back folder can be copied to /opt/cornetto

The files you will need to create

You will need to set up two git repositories (git init):

  • /opt/cornetto/git_static - The static versions of your website created so far
  • /opt/cornetto/git_visualize - Previous versions, made available for visualization on demand

Those two folders will be served by a web server like Apache, so it is necessary that they have the correct permissions (example: chown -R www-data:www-data).

└── cornetto/
    ├── git_static/
    |   ├── .git/
    |   └── ... (website source)
    └── git_visualize/
        ├── .git/
        └── ... (website source)
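A quick way to confirm that both repositories exist and have been initialized is to look for their .git directories. The following sketch assumes the layout shown above; missing_repositories is a hypothetical helper and only inspects the filesystem, it does not run git.

```python
from pathlib import Path

# Folder names taken from the layout above.
REPOSITORIES = ("git_static", "git_visualize")


def missing_repositories(base="/opt/cornetto"):
    """Return the repository folders under `base` that lack a .git directory."""
    return [name for name in REPOSITORIES
            if not (Path(base) / name / ".git").is_dir()]
```

An empty list means both folders look like initialized repositories; any name returned still needs a git init.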

Then you will need to copy the frontend build to an Apache document root folder. We usually also create two symbolic links there, pointing to the git repositories /opt/cornetto/git_static and /opt/cornetto/git_visualize, so that everything is accessible in the same place.

├── web server root folder (example: /var/www/)
   ├── static -> /opt/cornetto/git_static
   ├── visualize -> /opt/cornetto/git_visualize
   └── frontend/
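The two symbolic links from the tree above can be created by hand with ln -s, or scripted. Here is a minimal sketch, assuming the link names static and visualize and the repository folders shown above; link_repositories is a hypothetical helper and needs write permission on the web root.

```python
import os
from pathlib import Path


def link_repositories(web_root, cornetto_dir="/opt/cornetto"):
    """Create the static and visualize symlinks in web_root, skipping existing ones."""
    targets = {"static": "git_static", "visualize": "git_visualize"}
    created = []
    for link_name, repo in targets.items():
        link = Path(web_root) / link_name
        if link.is_symlink() or link.exists():
            continue  # leave anything already present untouched
        os.symlink(str(Path(cornetto_dir) / repo), str(link))
        created.append(str(link))
    return created
```

The function returns the links it created, so running it a second time returns an empty list instead of failing.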

Then you will need to create the configuration file for each site.

sites-enabled configuration folder, example for Apache:
└── sites-enabled/

There are two configuration files, one for HTTP requests (port 80) and one for HTTPS requests (port 443). The HTTP configuration file should redirect to HTTPS.

There is one pair of configuration files for each site that should be served:

  • One for the repository that contains the source of the latest static version
  • One for the repository that contains the source of the visualized static version
  • One for the folder that contains the frontend of Cornetto


Configuring Cornetto

The file /etc/cornetto/ contains all the configuration of Cornetto. You need to adapt the content to your setup.
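The settings listed below are plain Python assignments with uppercase names, the form Flask accepts in a Python configuration file (Flask only keeps uppercase names when it loads one). To inspect such a file outside the application, a minimal sketch could look like the following. It assumes the file contains only simple assignments; load_settings is a hypothetical helper, and how Cornetto itself loads the file is not shown here.

```python
import runpy


def load_settings(path):
    """Execute a Python settings file and keep only its UPPERCASE names."""
    namespace = runpy.run_path(path)
    return {key: value for key, value in namespace.items() if key.isupper()}
```

Because the file is executed as Python, only run this on configuration files you trust.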


Enable or disable Flask debug mode. Should be False in production.

  • DEBUG = False


Set the log level. Set 'DEBUG' for maximum verbosity.



Set the path to the directory where the crawler log file will be stored.

  • LOGDIR = '/opt/cornetto/log/'


Set the path to the API log file.

  • API_LOGFILE = '/opt/cornetto/log/api.log'


Set the path to the site-packages directory of the virtual env. It is used to start Scrapy as a subprocess.

  • PYTHONPATH = '/opt/cornetto/venv/lib/python3.6/site-packages/'
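To illustrate why a subprocess needs this path, the sketch below starts a child process with the PYTHONPATH environment variable set, so that the child can import packages installed in the virtual env. This is a generic pattern, not Cornetto's actual launch code; run_with_pythonpath is a hypothetical helper.

```python
import os
import subprocess


def run_with_pythonpath(args, pythonpath):
    """Run a command with PYTHONPATH set and return its standard output as text."""
    # Copy the current environment and override only PYTHONPATH.
    env = dict(os.environ, PYTHONPATH=pythonpath)
    completed = subprocess.run(args, env=env, check=True,
                               stdout=subprocess.PIPE,
                               universal_newlines=True)
    return completed.stdout
```

If the path does not match the Python version of your virtual env (python3.6 in the example value), the child process will not find Scrapy.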


Set the path to the crawler log file. This file will be created when a process is started. It is renamed when the static version has been committed.

  • LOGFILE = '/opt/cornetto/log/statif.log'


Set the path to the API directory.

  • PROJECT_DIRECTORY = '/opt/cornetto/'


Set the path to the file that will store the statification process pid.

  • PIDFILE = '/opt/cornetto/'


Set the path to the file that locks all other operations once a request has been made.

  • LOCKFILE = '/opt/cornetto/.lockRoute'


Set the path to the file that will store the crawler progress counter.

  • CRAWLER_PROGESS_COUNTER_FILE = '/opt/cornetto/crawlerProgressCounterFile.txt'


Set the path to the file that will store information about the status of the API. This file gives information about any running process.

  • STATUS_BACKGROUND = '/opt/cornetto/statusBackground.json'


Set the SQLAlchemy parameter. Should be False.



Set the URI of the database.

  • DATABASE_URI = 'sqlite:////opt/cornetto/cornetto.db'
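The four slashes in the DATABASE_URI value are intentional: in SQLAlchemy's SQLite URL form, sqlite:/// is the prefix and whatever follows is the file path, so a fourth slash marks an absolute path. The small sketch below makes this visible; sqlite_file_path is a hypothetical helper and does not use SQLAlchemy itself.

```python
def sqlite_file_path(uri):
    """Return the file path encoded in a sqlite:/// URI (SQLAlchemy convention)."""
    prefix = "sqlite:///"
    if not uri.startswith(prefix):
        raise ValueError("not a file-based SQLite URI: %r" % uri)
    # Everything after the three-slash prefix is the path; a leading "/"
    # (the fourth slash) makes it absolute.
    return uri[len(prefix):]
```

With the example value, the function returns /opt/cornetto/cornetto.db, while a three-slash URI such as sqlite:///cornetto.db yields a path relative to the working directory.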


Set the URL of the git repository to which static versions are pushed.

  • URL_GIT = 'ssh://'


Set the URL of the git directory to which a static version is pushed for publication. This should be the git repository served by your internet-facing web server.

  • URL_GIT_PROD = ['/opt/cornetto/git_prod']


Set the path to the folder containing the git repository used to create new static versions.

  • STATIC_REPOSITORY = '/opt/cornetto/git_static'


Set the path to the folder containing the git repository to visualize a static version.

  • VISUALIZE_REPOSITORY = '/opt/cornetto/git_visualize'


Set the URL(s) from which the crawler will start (comma-separated if there are several).

  • URLS = ''


Set the domain(s) that the crawler is authorized to crawl (comma-separated if there are several).

  • DOMAINS = ','
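Both URLS and DOMAINS are single strings holding comma-separated values. As an illustration of that format, the sketch below splits such a string into a clean list. The example domains are placeholders, split_setting is a hypothetical helper, and Cornetto's own parsing may differ, for instance in how it treats spaces.

```python
def split_setting(value):
    """Split a comma-separated setting into a list, dropping blanks and spaces."""
    return [item.strip() for item in value.split(",") if item.strip()]


# Placeholder values for illustration only.
example_domains = split_setting("www.example.org, static.example.org")
```

Here example_domains holds the two domain names without surrounding spaces, and an empty string or a lone comma yields an empty list.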


Set a regular expression that matches the actual URL of the dynamic website, so that it is replaced with URL_REPLACEMENT.

  • URL_REGEX = '(https?://)?web('


Set the URL that should replace the URL matched by URL_REGEX.
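The interaction between URL_REGEX and URL_REPLACEMENT can be pictured as a regular-expression substitution over the crawled content. The sketch below uses made-up values (backend.example.org and www.example.org are placeholders, not Cornetto defaults) and a hypothetical rewrite_urls helper; the crawler's actual replacement code is not shown in this document.

```python
import re

# Hypothetical values, not taken from the Cornetto settings file.
URL_REGEX = r"(https?://)?backend\.example\.org"
URL_REPLACEMENT = "https://www.example.org"


def rewrite_urls(html, pattern=URL_REGEX, replacement=URL_REPLACEMENT):
    """Replace every match of `pattern` in `html` with `replacement`."""
    return re.sub(pattern, replacement, html)
```

Escaping the dots in the pattern matters: an unescaped "." matches any character, so the expression could rewrite unrelated host names.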



Set the comma-separated list of file paths that need to be deleted at the end of the statification process.



Set the comma-separated list of directories that need to be deleted at the end of the statification process.



Copyright (C) 2018–2019 ANSSI

Contributors: 2018–2019

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see

