Browse Stack Exchange offline using the Stack Exchange Data Dump.
It is heavily based on Solr-Jetty-Maven for the Solr instance, and on Stackdump for the Python front-end and indexing.
You should have the following installed and ready to use:
- Java JDK (>= 1.7)
- Maven (>= 3.2.0)
- Python (>= 2.5)
- System packages: python-virtualenv, python-pip, python-dev
- Python packages: see requirements.txt
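Before going further, it can help to confirm that the tools listed above are on the PATH. The snippet below is a minimal sketch, not part of this repository: the helper name check_tool is hypothetical, and it only checks that each command exists, not that its version meets the minimum listed above.

```shell
#!/bin/sh
# check_tool NAME: reports whether the command NAME can be found on the PATH.
# (Hypothetical helper for illustration; it does not verify version numbers.)
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found:   $1"
    else
        echo "missing: $1"
    fi
}

# Commands assumed to correspond to the requirements listed above.
for tool in java mvn python virtualenv pip; do
    check_tool "$tool"
done
```

Any tool reported as missing should be installed before creating the virtualenv.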
To create and activate a virtualenv with the dependencies installed, run:
virtualenv ~/.virtualenv/offstack
source ~/.virtualenv/offstack/bin/activate
pip install -r requirements.txt
You need to download the Stack Exchange Data Dump. I recommend using the provided torrent file.
To start Solr embedded in Jetty, use either of the following:
# Option 1: run the start_solr.sh script
./start_solr.sh
# Option 2: run the Maven command directly
mvn install jetty:run
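Solr can take a while to come up, and an import started too early will fail to connect. The following is a rough sketch of a readiness check using curl. The function name wait_for_url is hypothetical, and the URL to poll is deliberately left as an argument because the address Jetty binds to depends on this project's configuration.

```shell
#!/bin/sh
# wait_for_url URL [TRIES] [DELAY]
# Polls URL with curl until it answers with a success status.
# Returns 0 once the URL responds, 1 if every attempt fails.
# (Hypothetical helper for illustration; not part of this repository.)
wait_for_url() {
    url="$1"
    tries="${2:-30}"   # number of attempts, default 30
    delay="${3:-2}"    # seconds to sleep between attempts, default 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        # --fail makes curl exit non-zero on HTTP error statuses.
        if curl --silent --fail --output /dev/null --max-time 5 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, wait_for_url http://localhost:8983/solr/ 30 2 would poll for about a minute. The host, port, and path in that call are assumptions, so substitute the address your Solr instance actually listens on.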
To index the sites you want, you have two options:
Run the import_all.sh script:
./import_all.sh <Path to Stack Exchange Data Dump>
Run a single import:
./manage.sh import_site <Path to XML files>
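The data dump is distributed as 7z archives, so a single import usually means extracting one archive first. The sketch below shows one possible way to do that, assuming the 7z command is installed and that the archive contains the XML files at its top level. The function names extract_dir_for and extract_and_import are hypothetical and not provided by this repository.

```shell
#!/bin/sh
# extract_dir_for ARCHIVE: prints the directory name to extract into,
# which is the archive file name without its .7z extension.
extract_dir_for() {
    basename "$1" .7z
}

# extract_and_import ARCHIVE WORKDIR
# Extracts ARCHIVE into WORKDIR/<site name> and runs a single import on it.
# (Hypothetical helper; assumes it is run from the repository root.)
extract_and_import() {
    archive="$1"
    target="$2/$(extract_dir_for "$archive")"
    mkdir -p "$target" || return 1
    # 7z takes the output directory directly after -o, with no space.
    7z x "$archive" -o"$target" -y || return 1
    ./manage.sh import_site "$target"
}
```

A call would look like extract_and_import <Path to a site archive> <Path to a working directory>.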
Once the data is indexed, all that remains is to start the Python front-end with the start_web.sh script.