This is our exclusive badass headless browser, aimed at what we do best: scraping.
We need to scrape websites after their CSS and JS have been processed, so we need a full browser stack, and yet we need it to run headless.
Qt has something called the Qt Platform Abstraction (QPA), which defines the platforms Qt can run on: cocoa, windows, X11.
The platform can be selected through the environment variable QT_QPA_PLATFORM.
When running sleepyhollow on a production server, we want Qt to work the way it does on embedded systems, which is not much different from what we have in our Amazon instances.
Make sure to export QT_QPA_PLATFORM as minimal, like below:
export QT_QPA_PLATFORM=minimal
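In a wrapper or deployment script, the same setting can be applied as a default that does not clobber a value someone has already chosen. This is a minimal sketch, not part of sleepyhollow itself; the function name ensure_minimal_qpa is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper (not part of sleepyhollow): default Qt's platform
# plugin to "minimal" unless QT_QPA_PLATFORM is already set and non-empty.
ensure_minimal_qpa() {
    # ":=" assigns the default only when the variable is unset or empty
    : "${QT_QPA_PLATFORM:=minimal}"
    export QT_QPA_PLATFORM
}
```

Calling ensure_minimal_qpa before starting the browser leaves an existing value untouched (say, one picked for local debugging on a desktop) and falls back to minimal everywhere else.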
When running in minimal (see "QPA Platform" above), Qt doesn't know where to find fonts, and therefore text rendering won't work.
As it turns out, the minimal backend will look for fonts in /path-to-qt5-installation/lib/fonts.
All it requires is a basic set of fonts. Most of that need can be covered by a single font file that contains glyphs for a very wide range of languages; one can be downloaded from the Unifoundry website.
Alternatively, we have our own font package, ready to be unpacked into the Qt font installation dir. It's available at: yipit-software-packages/qt5/qt-fonts.tar.bz2
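Unpacking the font package into place could look roughly like the sketch below. It assumes the archive has already been downloaded to the local machine and that the font files sit at the top level of the archive; neither is confirmed here, and the function name install_qt_fonts is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: unpack a font archive into Qt's font directory.
#   $1: path to a locally downloaded qt-fonts.tar.bz2
#   $2: Qt5 installation prefix (the /path-to-qt5-installation part)
# Assumption: the font files are at the top level of the archive.
install_qt_fonts() {
    archive=$1
    qt_prefix=$2
    font_dir="$qt_prefix/lib/fonts"

    # Create the font directory if the Qt install does not ship one
    mkdir -p "$font_dir" || return 1

    # -x extract, -j bzip2, -f archive file, -C target directory
    tar -xjf "$archive" -C "$font_dir" || return 1

    echo "Fonts unpacked into $font_dir"
}
```

A call would then be install_qt_fonts qt-fonts.tar.bz2 /path-to-qt5-installation, with the real Qt5 prefix substituted. If the archive turns out to wrap the fonts in a subdirectory, listing it first with tar -tjf shows the layout before anything is extracted.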
1. Install Qt5, the core library for the browser
brew upgrade
brew tap yipit/yipit
brew install yipit/yipit/qt5
brew install autogen automake autoconf
mkvirtualenv sleepy-hollow
pip install -r requirements.pip
./hollow test
Or, if you want to debug the output, use the --verbose mode:
./hollow -v test
./hollow --verbose test
./hollow release
The Sleepy Hollow test suite comes with a built-in Tornado server. You can run it with:
./hollow server 4000