Webrecorder Project

This is the official repository of the Webrecorder web archiving platform: https://webrecorder.io/

Webrecorder provides an integrated platform for creating high-fidelity web archives while browsing, sharing, and disseminating archived content.

Users may try the service anonymously or log in and create a permanent online archive.

Webrecorder will support multiple backends and will integrate with existing preservation systems.

For now, Webrecorder is still in a beta prototype stage, and this deployment is recommended for advanced users only.

For the best experience, please try Webrecorder at https://webrecorder.io/

Running Locally

Webrecorder can be run using Docker and Docker Compose. See Docker Installation for details on installing Docker. See Docker Compose Installation for installing Compose.

1. git clone https://github.com/webrecorder/webrecorder

2. cd webrecorder; bash init-default.sh

3. docker-compose build

4. docker-compose up -d

(init-default.sh is a convenience script that copies webrecorder/webrecorder_sample.env to webrecorder/webrecorder.env and creates keys for session encryption.)

Point your browser to http://<DOCKER HOST>:8089/ to view Webrecorder.

Configuration

Webrecorder is configured through webrecorder/config.yaml, which contains the full settings for the application.

Archived data (WARCs) are stored locally under the ./data/ directory, and all metadata and user info is stored in a persistent Redis instance.

Useful environment and deployment settings are loaded from webrecorder/webrecorder.env and can be overridden per deployment.
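As a rough illustration of per-deployment overrides, the sketch below parses KEY=VALUE lines in the style of webrecorder.env and lets deployment-specific values take precedence. The parse_env helper is hypothetical, the parsing rules (blank lines and # comments ignored) are an assumption rather than Webrecorder's own loader, and the value "local" for DEFAULT_STORAGE is a placeholder, since this README does not state the exact default value.

```python
def parse_env(text):
    """Parse KEY=VALUE lines into a dict, skipping blanks and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Sample content; only setting names mentioned in this README are used,
# and the values are placeholders (assumptions).
sample = """
# storage backend
DEFAULT_STORAGE=local
REQUIRE_INVITES=false
"""

defaults = parse_env(sample)
overrides = {"REQUIRE_INVITES": "true"}  # hypothetical per-deployment override
effective = {**defaults, **overrides}    # later values win
print(effective)
```

Merging the dictionaries last-wins keeps the defaults intact while making the override explicit.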

The following are a few of these settings:

Storage

The DEFAULT_STORAGE option in webrecorder.env selects the storage backend. The default is the local file system.

Currently, S3 is also supported. To use S3, set DEFAULT_STORAGE=s3 and fill in the additional auth settings in webrecorder.env.

With default local storage, archived data is kept in the ./data/accounts directory only.
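To check what has been archived under local storage, a minimal sketch like the following walks a data directory and lists WARC files with their sizes. The list_warcs helper is hypothetical, and both the file suffixes (.warc, .warc.gz) and the layout beneath ./data/accounts are assumptions not documented in this README.

```python
from pathlib import Path


def list_warcs(data_dir="./data/accounts"):
    """Return sorted (relative path, size in bytes) pairs for WARC files."""
    root = Path(data_dir)
    if not root.is_dir():
        return []  # nothing recorded yet, or a different storage backend
    found = []
    for path in root.rglob("*"):
        # Assumed suffixes; adjust if the deployment names files differently.
        if path.is_file() and path.name.endswith((".warc", ".warc.gz")):
            found.append((path.relative_to(root).as_posix(), path.stat().st_size))
    return sorted(found)


for name, size in list_warcs():
    print(f"{size:>12}  {name}")
```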

Mail

Webrecorder sends invitation, confirmation, and lost-password emails. By default, a local SMTP server is run in Docker. This can be changed to use a remote server by setting EMAIL_SMTP_URL and EMAIL_SMTP_SENDER.

Invites

By default, Webrecorder allows anyone with access to the web site to register for an account. However, users may wish to limit registration to specifically invited users. The https://webrecorder.io/ deployment uses this feature at this time.

To require invites, set REQUIRE_INVITES=true in webrecorder.env.
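Changing a setting such as REQUIRE_INVITES amounts to editing one line of the env file. The sketch below shows one way to do that on the file's text; set_env_value is a hypothetical helper, not part of Webrecorder, and it assumes plain KEY=VALUE lines.

```python
def set_env_value(text, key, value):
    """Return env-file text with key set to value, replacing or appending."""
    lines = text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        name, sep, _ = line.partition("=")
        if sep and name.strip() == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Placeholder file content (assumed values).
original = "DEFAULT_STORAGE=local\nREQUIRE_INVITES=false\n"
updated = set_env_value(original, "REQUIRE_INVITES", "true")
print(updated, end="")
```

To apply this to a real deployment, the text would be read from and written back to webrecorder/webrecorder.env, after which the containers presumably need to be restarted for the change to take effect.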

Updating Deployment

After making changes to Webrecorder, run docker-compose build; docker-compose up -d to rebuild and restart all of the containers.

To restart only the Webrecorder container, use the ./rebuild.sh script.

Architecture

Webrecorder is built using a variety of open-source tools and uses pywb, warcprox, Redis, and Nginx. It is written in Python and uses the Bottle, Cork, and Beaker frameworks.

Contact

Webrecorder is a project of Rhizome, created by Ilya Kreymer.

For any questions/concerns regarding the project or https://webrecorder.io/ you can:

License

Webrecorder is licensed under the Apache 2.0 License. See NOTICE and LICENSE for details.
