LAKEsuperior is an alternative Fedora Repository implementation.
LAKEsuperior aims at being an uncomplicated, efficient Fedora 4 implementation.
Its main goals are:
- Reliability: Based on solid technologies with stability in mind.
- Efficiency: Small memory and CPU footprint, high scalability.
- Ease of management: Tools to perform monitoring and maintenance included.
- Simplicity of design: Straightforward architecture, robustness over features.
- Drop-in replacement for Fedora 4 (with some caveats); currently being tested with Hyrax 2.
- Very stable persistence layer based on LMDB and filesystem. Fully ACID-compliant writes guarantee consistency of data.
- Term-based search (planned) and SPARQL Query API + UI
- No performance penalty for storing many resources under the same container; no kudzu pairtree segmentation 1
- Extensible provenance metadata tracking
- Multi-modal access: HTTP (REST), command line interface and native Python API.
- Fits in a pocket: you can carry 50M triples on an 8 GB memory stick.
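To make the pairtree point in the list above concrete: some repository layouts split the head of an identifier into short path segments so that no single container holds too many children. The following is a minimal sketch of that segmentation next to a flat layout. The function names, the segment count and the segment width are illustrative assumptions, not code taken from LAKEsuperior, Fedora or Hyrax.

```python
def pairtree_path(uid, segments=4, width=2):
    """Prefix an identifier with short path segments cut from its head.

    Illustrative only: the segment count and width are assumptions.
    """
    head = uid[:segments * width]
    parts = [head[i:i + width] for i in range(0, len(head), width)]
    return "/".join(parts + [uid])


def flat_path(uid):
    """Flat layout: the identifier is used as-is, with no extra segments."""
    return uid


if __name__ == "__main__":
    uid = "a1b2c3d4e5"
    print(pairtree_path(uid))  # segmented form
    print(flat_path(uid))      # flat form
```

With a flat layout the client addresses the resource directly under its container, which is the case the feature list describes as carrying no performance penalty.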
Implementation of the official Fedora API specs (Fedora 5.x and beyond) is not foreseen in the short term; however, it would be a natural evolution of this project if it gains support.
Please make sure you read the Delta document for divergences from the official Fedora 4 implementation.
LAKEsuperior is for anybody who cares about preserving data in the long term.
Less vaguely, LAKEsuperior is targeted at those who need to store large quantities of highly linked metadata and documents.
Its Python/C environment and API make it particularly well suited for academic and scientific environments, which can embed it in a Python application as a library or extend it via plug-ins.
LAKEsuperior can be exposed to the Web as a Linked Data Platform server. It also acts as a read-only SPARQL query endpoint; however, it is not meant to be used as a full-fledged triplestore at the moment.
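As a rough illustration of the two HTTP roles described above, the sketch below builds (without sending) one LDP-style request that stores a Turtle document and one SPARQL query request. The base URL, the `/ldp` and `/query/sparql` paths, and the function names are assumptions made for this example; check your own deployment for the actual endpoints. The content types follow the Turtle and SPARQL 1.1 Protocol media types.

```python
import urllib.request

# Assumed base URL of a locally running instance; adjust to your setup.
BASE_URL = "http://localhost:8000"


def ldp_put_request(path, turtle, base_url=BASE_URL):
    """Build (but do not send) a PUT request carrying an RDF payload."""
    # The "/ldp" prefix is an assumption for this sketch.
    url = base_url.rstrip("/") + "/ldp/" + path.lstrip("/")
    return urllib.request.Request(
        url,
        data=turtle.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/turtle"},
    )


def sparql_query_request(query, base_url=BASE_URL):
    """Build (but do not send) a read-only SPARQL query as a direct POST."""
    # The "/query/sparql" path is an assumption for this sketch.
    url = base_url.rstrip("/") + "/query/sparql"
    return urllib.request.Request(
        url,
        data=query.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/sparql-query",
            "Accept": "application/sparql-results+json",
        },
    )
```

Either request can be sent with `urllib.request.urlopen(request)` once a server is running. Building the request separately from sending it keeps the example easy to inspect without a live instance.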
In its current status, LAKEsuperior is aimed at developers and hands-on managers who are interested in evaluating this project.
Note: These instructions have been tested on Linux. They may work on Darwin with little or no modification, and possibly on Windows with some modifications. Feedback is welcome.
- Python 3.5 or greater.
- A message broker supporting the STOMP protocol. For testing and evaluation purposes, CoilMQ is included with the dependencies and should be automatically installed.
- Create a virtualenv in a project folder:
virtualenv -p <python 3.5+ exec path> <virtualenv folder>
- Activate the virtualenv:
source <path_to_virtualenv>/bin/activate
- Clone this repo and cd into the repo folder.
- Install dependencies:
pip install -r requirements.txt
- Start your STOMP broker, e.g.:
coilmq &
If you have another queue manager listening on port 61613, you can either configure a different port in the application configuration or use the existing message queue.
- Run
./lsup_admin bootstrap
to initialize the binary and graph stores.
- Run
./fcrepo
to start the server.
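Before starting the broker, it can help to confirm the two preconditions these steps rely on: a Python 3.5+ interpreter and a free (or intentionally shared) port 61613. The following is a minimal sketch using only the standard library; the helper names are hypothetical and not part of LAKEsuperior.

```python
import socket
import sys

STOMP_PORT = 61613  # port mentioned in the installation steps


def python_ok(minimum=(3, 5)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("Python 3.5+:", python_ok())
    print("Port", STOMP_PORT, "in use:", port_in_use(STOMP_PORT))
```

If the port is reported as in use, another queue manager is probably already listening there, and you can either reuse it or configure a different port.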
The app should run for testing and evaluation purposes without any further
configuration. All the application data are stored by default in the data
directory.
To change the default configuration you should:
- Copy the etc.skeleton folder to a separate location.
- Set the configuration folder location in the environment:
export FCREPO_CONFIG_DIR=<your config dir location>
(you can add this line at the end of your virtualenv activate script)
- Configure the application.
- Bootstrap the app, or copy the original data folders to the new location if any location options changed.
- (Re)start the server:
./fcrepo
The configuration options are documented in the configuration files themselves.
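The environment-variable step can be pictured as a simple lookup with a fallback. The sketch below is an assumption about how such a lookup might work, not the application's actual code: the function name is hypothetical, and falling back to `etc.skeleton` when `FCREPO_CONFIG_DIR` is unset or empty is a guess made for illustration.

```python
import os


def resolve_config_dir(environ=None, default="etc.skeleton"):
    """Return the absolute path of the configuration folder to use.

    Assumption: fall back to `default` when FCREPO_CONFIG_DIR is unset
    or empty. The fallback used by the application itself may differ.
    """
    environ = os.environ if environ is None else environ
    chosen = environ.get("FCREPO_CONFIG_DIR") or default
    return os.path.abspath(os.path.expanduser(chosen))


if __name__ == "__main__":
    print(resolve_config_dir())
```

Running this inside the activated virtualenv is a quick way to confirm that the `export` line took effect.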
Note: test.yml must specify a different location for the graph and for
the binary stores than the default one, otherwise running a test suite will
destroy your main data store. The application will issue an error message and
refuse to start if these locations overlap.
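The overlap condition in the note above can be checked by hand before running a test suite. The following is a minimal sketch of one way to define "overlap" (the two paths are equal, or one contains the other); the function name and this definition are assumptions, and the application's own check may be stricter or looser.

```python
import os


def paths_overlap(path_a, path_b):
    """Return True if the paths are equal or one contains the other."""
    a = os.path.realpath(path_a)
    b = os.path.realpath(path_b)
    try:
        common = os.path.commonpath([a, b])
    except ValueError:
        # Raised e.g. for paths on different drives: they cannot overlap.
        return False
    return common in (a, b)


if __name__ == "__main__":
    # Hypothetical locations for the main and test stores.
    print(paths_overlap("/data/main", "/data/test"))
    print(paths_overlap("/data", "/data/test"))
```

Comparing whole path components through `os.path.commonpath` avoids the false positive a plain string-prefix test would give for siblings such as `/data/main` and `/data/main_test`.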
If you like fried repositories for lunch, deploy before 11AM.
LAKEsuperior is in alpha status. Please see the project issues list for a rudimentary road map.
So far this has been a single person's off-hours project (with much input from several sides). To turn into anything close to a beta release, and eventually into a production-ready implementation, it needs some community love.
Contributions are welcome in all forms, including ideas, issue reports, or even just spinning up the software and providing some feedback. LAKEsuperior is meant to live as a community project.
1 However, if your client splits pairtrees upstream, as Hyrax does, that obviously needs to change to get rid of the path segments.