Garrent

Garrent is a personal project of mine to scrape and analyse data from HKEX, including the stock list, CCASS, and the SHHK/SZHK Stock Connect data. The scripts worked fine in late 2018; however, due to changes made by HKEX since then, some of them might no longer work.

This project was a precursor of a WeChat Mini Program developed internally by "YY港股圈" (YY Hong Kong Stock Circle, WeChat ID: Victoria-hk-stocks). However, due to rapid changes in market conditions and sentiment, the project was never launched.

This personal project is released for two reasons:

As a part of my personal portfolio
For interested parties who would like to work in this area without having to start from the ground up

Background

Garrent was originally the data backend of a bigger project, so it does not come with a front-end GUI. Its output is a large amount of processed data stored in a database. I might consider releasing the front end, an unfinished conversational UI to the data, sometime in the future.

Although Garrent does not handle trading data (i.e. stock OHLCV data), the volume of data it takes in is considerable for a personal project. Approximately 200-300MB of data is collected per day. Historical data is available for a year, so the initial database size would be at least 10GB, not counting the indices. Knowledge of MySQL database optimization is therefore essential; I recommend O'Reilly's High Performance MySQL: Optimization, Backups, and Replication as a start.

To use this code, an understanding of the HKEX data structure is as important as coding skills. The project was intended to cover scraping of the following data:

Stock List*
CCASS Participant List
Daily shortsell data (from http://www.analystz.hk)
CCASS Holding List
Southbound top 10 list
Disclosure of Interests

As of the date of writing, due to website layout changes, the data scraping marked with (*) no longer works.

Furthermore, since the volume of data being scraped is large, Garrent uses RQ to manage the queue of scraping tasks.
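As a rough illustration of the RQ approach, the sketch below queues one scraping job per stock code. It is not taken from Garrent's source: the task function scrape_ccass_holding, the queue name "garrent", and the helper names are all hypothetical, and a Redis server on the default local port is assumed.

```python
from typing import Iterable, List, Tuple


def scrape_ccass_holding(stock_code: str, date: str) -> None:
    """Hypothetical task body; a real task would fetch and store one page."""
    raise NotImplementedError("placeholder for the actual scraping logic")


def build_jobs(stock_codes: Iterable[str], date: str) -> List[Tuple[str, str]]:
    """Return one (stock_code, date) argument pair per job to be queued."""
    return [(code, date) for code in stock_codes]


def enqueue_all(jobs: List[Tuple[str, str]], queue_name: str = "garrent") -> list:
    """Push every job onto an RQ queue backed by a local Redis server."""
    # Imported lazily so the pure helpers above work without Redis installed.
    from redis import Redis
    from rq import Queue

    queue = Queue(queue_name, connection=Redis())
    return [queue.enqueue(scrape_ccass_holding, code, date) for code, date in jobs]
```

Under these assumptions, enqueue_all(build_jobs(["00001", "00700"], "2018-12-03")) would queue two jobs, and workers started separately (for example with "rq worker garrent") would pick them up. The task function has to live in a module the workers can import.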

Prerequisites

Python 3
A MySQL-compatible database (Percona is recommended)
Redis
PhantomJS

Configuration

To get Garrent working, separate cron jobs are required for data scraping, data cleansing and data mining. This part walks through the setup step by step.

1. Database Path

Both ./garrent/database.py and ./garrent/pw_models.py have to be modified to reflect the database path. pymysql and peewee are both in use because of legacy development issues. However, peewee is preferred for future development, if any.
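One way to avoid editing the path in two places is to read the connection settings from a single source and hand them to both libraries. The sketch below is only an illustration of that idea, not how Garrent's modules are actually written: the GARRENT_DB_* environment variable names, the default values, and the helper functions are all assumptions.

```python
import os
from typing import Mapping


def db_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect connection settings from (hypothetical) environment variables."""
    return {
        "database": env.get("GARRENT_DB_NAME", "garrent"),
        "host": env.get("GARRENT_DB_HOST", "127.0.0.1"),
        "port": int(env.get("GARRENT_DB_PORT", "3306")),
        "user": env.get("GARRENT_DB_USER", "garrent"),
        "password": env.get("GARRENT_DB_PASSWORD", ""),
    }


def peewee_database(settings: dict):
    """Build a peewee MySQLDatabase; the database name is positional."""
    from peewee import MySQLDatabase  # lazy import: third-party dependency

    params = dict(settings)
    return MySQLDatabase(params.pop("database"), **params)


def pymysql_connection(settings: dict):
    """Open a raw pymysql connection using the same settings."""
    import pymysql  # lazy import: third-party dependency

    return pymysql.connect(**settings)
```

With something like this, both modules could call db_settings() instead of hard-coding their own copy of the path.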

2. Command-line script

run.py is the core command-line script for most functions. It uses click for a simple command-line interface. General usage is:

./run.py [command] [options]

The supported commands are listed below.

status
initdb
cleanup
stock
ccassplayer
buyback
q_buyback
q_shareholder
q_ccass
sbtop10
sbstock
failed
sbholding
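If run.py needs to be driven from another Python program (a scheduler, for instance), a thin wrapper can validate the command name before spawning the process. This is a minimal sketch under that assumption: the command names come from the list above, while the helper names are hypothetical and no particular options are assumed, since they are not documented here.

```python
import subprocess
from typing import List, Sequence

# Command names as listed in this README.
SUPPORTED_COMMANDS = frozenset({
    "status", "initdb", "cleanup", "stock", "ccassplayer", "buyback",
    "q_buyback", "q_shareholder", "q_ccass", "sbtop10", "sbstock",
    "failed", "sbholding",
})


def build_argv(command: str, options: Sequence[str] = ()) -> List[str]:
    """Return the argv list for './run.py [command] [options]'."""
    if command not in SUPPORTED_COMMANDS:
        raise ValueError(f"unknown run.py command: {command!r}")
    return ["./run.py", command, *options]


def run_command(command: str, options: Sequence[str] = ()) -> subprocess.CompletedProcess:
    """Run the command from the repository root, raising on a non-zero exit."""
    return subprocess.run(build_argv(command, options), check=True)
```

For example, run_command("status") would be equivalent to typing ./run.py status in the repository root.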

3. Cron Jobs

There are 3 shell scripts that do the data scraping and cleansing:

./garrent/scripts/ud_equity.sh
- Update equity list
./garrent/scripts/ud_sbflow.sh
- Update Southbound Top 10 stock lists
./garrent/scripts/ud_ccass.sh
- Update CCASS players and holdings

The scripts should be run one by one between 00:00:00 HKT and 06:00:00 HKT after a trading day to avoid trading-hour lag. The number of RQ workers should be set to no more than 3, or you might face an IP ban from HKEX.
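The timing rule can be expressed as a small guard around the three scripts. The sketch below is an illustration only: it treats the window as half-open (00:00 inclusive, 06:00 exclusive), runs the scripts in the order they are listed above (an assumption), and does not check whether the previous day was actually a trading day.

```python
import subprocess
from datetime import datetime, time, timedelta, timezone
from typing import Optional

HKT = timezone(timedelta(hours=8), "HKT")  # Hong Kong observes no daylight saving
WINDOW_START = time(0, 0)
WINDOW_END = time(6, 0)

SCRIPTS = [
    "./garrent/scripts/ud_equity.sh",
    "./garrent/scripts/ud_sbflow.sh",
    "./garrent/scripts/ud_ccass.sh",
]


def in_update_window(now: datetime) -> bool:
    """True if the timezone-aware `now` falls within 00:00-06:00 HKT."""
    local = now.astimezone(HKT).time()
    return WINDOW_START <= local < WINDOW_END


def run_updates(now: Optional[datetime] = None) -> None:
    """Run the scripts one after another, refusing to start outside the window."""
    now = now or datetime.now(timezone.utc)
    if not in_update_window(now):
        raise RuntimeError("outside the 00:00-06:00 HKT update window")
    for script in SCRIPTS:
        # check=True stops the sequence if a script exits with an error.
        subprocess.run([script], check=True)
```

A single cron entry calling run_updates() shortly after midnight HKT would then cover all three scripts in sequence.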

Contributing

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.

Versioning

We use SemVer for versioning. For the versions available, see the tags on this repository.

Authors

Henry Fong - Initial work - foongsy

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License. See the LICENSE.md file for details.

Acknowledgments

Hat tip to anyone whose code was used
