The idea here is pretty simple:
- ingest time-series data
- create forecast classifier(s)
- predict on current data
- profit!
sudo apt-get install libatlas-base-dev python-dev gfortran pkg-config libfreetype6-dev
python3 -m venv ./venv
source venv/bin/activate
pip install -r requirements.txt
luigid # to start the scheduler
python CryptoForecast/run_luigi.py MyTaskName
"""
Configuration options
"""
# Common
data_dir = "data-dir"
plot_dir = "plot-dir"
fig_size = (12, 8)
tradeAmount = 1
fidelity = 86400 # in seconds
# Backtesting
assets = {'usd': 0, 'btc': 10, 'eth': 0}
# Bollinger strategy
ewmInterval = 86400 * 20 # in seconds
stdK = 1.2
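
To make the Bollinger options concrete, here is a minimal sketch of how ewmInterval, fidelity and stdK could combine into buy/sell signals. The function name `bollinger_signals` and the recursive variance update are assumptions made for illustration, not the project's actual strategy code, and the numbers will differ slightly from what pandas' `ewm()` produces.

```python
import math

# Values copied from the configuration above.
fidelity = 86400             # seconds per sample
ewmInterval = 86400 * 20     # seconds covered by the moving average
stdK = 1.2                   # band half-width, in standard deviations


def bollinger_signals(prices, span, k):
    """Return 'buy', 'sell' or 'hold' for each price (hypothetical helper).

    The bands are built from an exponentially weighted mean and variance of
    the samples seen *before* the current one, so no signal peeks ahead.
    """
    if not prices:
        return []
    alpha = 2.0 / (span + 1)     # usual span -> smoothing factor mapping
    mean = prices[0]
    var = 0.0
    signals = []
    for price in prices:
        std = math.sqrt(var)
        if price > mean + k * std:
            signals.append('sell')   # above the upper band
        elif price < mean - k * std:
            signals.append('buy')    # below the lower band
        else:
            signals.append('hold')
        # Incremental update of the exponentially weighted mean and variance.
        delta = price - mean
        mean += alpha * delta
        var = (1 - alpha) * (var + alpha * delta * delta)
    return signals


# Number of samples spanned by the moving average (20 with the defaults).
span = ewmInterval // fidelity
example = bollinger_signals([100, 100, 100, 100, 130, 100, 70], span, stdK)
```

Note that a perfectly flat history gives zero-width bands, so the first move away from it always triggers a signal; a real strategy would probably want a warm-up period.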
- pick a primary model
- reproduce model results on test set
- attempt to model on (manually curated) new data
- implement ingestion script once model fits reasonably well
- Subdirs of CryptoForecast each represent different data sources.
- Each subdir contains a hierarchy organized by pipeline stage. The stages are:
- ingest : incoming data downloads
- preprocess : steps taken to massage data into proper format
- analyze : standard data summary methods to inform model selection
- model : creation & testing of various models
- forecast : use of models to predict future values
- action-plan : use of forecasts to plan actions (buy/sell)
- Common task base/abstract classes are in ./common/
- Tasks which extend common tasks should be named {DATA_SOURCE}{PARENT_TASK}; example: BTC + Seasonal = BTCSeasonal.
- LocalTarget outputs should have file names similar to their task classes.
- All LocalTargets shall be placed under ./data
- Time-series shall be sampled or interpolated to daily frequency.
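
As a rough illustration of the daily-frequency rule, the sketch below linearly interpolates irregular samples onto a midnight (UTC) grid. The helper name `to_daily` and the (timestamp, value) input layout are assumptions for the example; with pandas the same idea is usually a resample followed by an interpolate.

```python
from bisect import bisect_left

SECONDS_PER_DAY = 86400


def to_daily(samples):
    """Interpolate (timestamp, value) pairs onto a daily grid (hypothetical helper).

    Assumes `samples` is sorted by timestamp, that timestamps are integer
    seconds since the epoch, and that there are no duplicate timestamps.
    """
    if not samples:
        return []
    times = [t for t, _ in samples]
    values = [v for _, v in samples]
    # First midnight at or after the first sample (integer ceiling division).
    start = -(-times[0] // SECONDS_PER_DAY) * SECONDS_PER_DAY
    daily = []
    for day in range(start, times[-1] + 1, SECONDS_PER_DAY):
        i = bisect_left(times, day)      # first sample at or after `day`
        if times[i] == day:
            daily.append((day, values[i]))
        else:
            # Linear interpolation between the two neighbouring samples.
            t0, t1 = times[i - 1], times[i]
            weight = (day - t0) / (t1 - t0)
            daily.append((day, values[i - 1] + weight * (values[i] - values[i - 1])))
    return daily
```

No extrapolation is attempted: days before the first sample or after the last one are simply left out.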
legend
- ⌛ in-progress
- ✅ done
- 🚫 fail
Hmm... what data to ingest... How about:
- ⌛ historical self-values (e.g. autoregression)
- http://api.bitcoincharts.com/v1/csv/
- more suggestions on this SO answer
- ⌛ historical values of other crypto-currencies (CCF might be useful here if one lags the other)
- ⌛ google trends data using
- sentiment analysis
- twitter ingest (NLP sold separately)
- stack overflow activity (ethereum / monero communities or question volumes on s.o. itself)
- crypto-mining hardware release schedules
- crypto-mining profitability recommendation calculators
- like coinwarz profit ratio charts
- mean/median transaction fees like bitinfocharts
- Hash rates & difficulties like this from bitcoinwisdom
- granger causality test (and similar)
- ⌛ cross-correlation function
- USD values could be adjusted using CPI values from bls.gov data
- generate stationary series for modeling
- ⌛ differencing
- ✅ frequency analysis
- ✅ FFT
- ✅ seasonal decomposition
- ⌛ ACF & PACF
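
Two of the items above, differencing and the cross-correlation function, are small enough to sketch in plain Python. The function names are made up for this example, and `cross_correlation` is the Pearson correlation of the overlapping segments, which is a simplification of the textbook CCF estimator (that one uses full-series means).

```python
import math


def difference(series, lag=1):
    """Return series[t] - series[t - lag]; lag=1 removes a linear trend."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def cross_correlation(x, y, lag):
    """Correlation between x[t] and y[t + lag] for lag >= 0.

    A high value at some lag suggests that y follows x by that many samples.
    Raises ZeroDivisionError if either overlapping segment is constant.
    """
    n = len(x) - lag
    a, b = x[:n], y[lag:lag + n]
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def best_lag(x, y, max_lag):
    """Lag in 0..max_lag at which x best predicts y."""
    return max(range(max_lag + 1), key=lambda k: cross_correlation(x, y, k))
```

Calling `cross_correlation(x, x, lag)` gives a quick autocorrelation estimate in the same simplified sense.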
Time series models... For these I like:
- Long short-term memory NN (LSTM NN) built on
- keras:
- theano:
- custom:
- Fuzzy Time Series Predictions
- ⌛ good ol' fashioned autoregressors
- ✅ my old behavAR project
- also see this script
- ⌛ ARIMA / ARIMAX
- ⌛ statsmodels ARIMAX (SARIMAX?)
- fb prophet ARIMA / exp smoothing
- ✅ my old behavAR project
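
For reference, the simplest "good ol' fashioned autoregressor" is an AR(1) model fitted by ordinary least squares. This is only a sketch with made-up function names; it is not the behavAR code, and a real run would more likely go through statsmodels.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxy = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    phi = sxy / sxx                    # slope of curr regressed on prev
    c = mean_curr - phi * mean_prev    # intercept
    return c, phi


def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted recurrence `steps` times past the last observation."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        forecasts.append(value)
    return forecasts
```

The series should be made stationary (e.g. by differencing) before fitting, otherwise phi tends to land near 1 and the forecast just extends the last value.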
- backtesting
- exchange api
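
A backtest over the assets dict from the configuration could look roughly like the sketch below. It assumes that tradeAmount is expressed in units of the coin, that signals are the strings 'buy', 'sell' and 'hold', and it ignores fees and slippage; the `backtest` function itself is hypothetical.

```python
def backtest(prices, signals, assets, trade_amount=1, coin='btc'):
    """Replay signals over a price series and return the final holdings.

    Trades that cannot be funded (not enough USD or coin) are skipped.
    The input `assets` dict is left untouched.
    """
    holdings = dict(assets)
    for price, signal in zip(prices, signals):
        cost = price * trade_amount
        if signal == 'buy' and holdings['usd'] >= cost:
            holdings['usd'] -= cost
            holdings[coin] += trade_amount
        elif signal == 'sell' and holdings[coin] >= trade_amount:
            holdings[coin] -= trade_amount
            holdings['usd'] += cost
    return holdings


# Starting portfolio copied from the configuration above.
assets = {'usd': 0, 'btc': 10, 'eth': 0}
final = backtest([100, 120, 90], ['buy', 'sell', 'buy'], assets)
```

In this run the first buy is skipped because there is no USD yet; the later sell and buy go through.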