EECE5645 Parallel Processing Data Analytics Final Project with Professor Ioannidis

blakermchale/parallel-stocks

parallel-stocks

EECE5645 Parallel Processing Data Analytics Final Project
Professor Ioannidis

Installing

Log in to the Discovery cluster and, with a compute node checked out, run the following commands:

conda create --name final python=3.7
conda activate final
conda install tensorflow-gpu
pip3 install tensorflow scikit-learn --user
pip3 install -e elephas/ --user
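
After the install finishes, a quick sanity check can confirm that the packages are importable from the final environment. The following is a minimal sketch and not part of the repository: the helper name check_modules is hypothetical, and the module names passed to it (tensorflow, sklearn, elephas) are assumed from the install commands above.

```python
import importlib.util


def check_modules(names):
    """Return {module_name: bool} telling whether each top-level module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Module names are assumed from the install commands; adjust as needed.
    for name, found in check_modules(["tensorflow", "sklearn", "elephas"]).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Running this inside the activated environment prints one line per module. A MISSING entry suggests the corresponding install step did not complete in that environment.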

Data

Bitcoin Dataset

Download the Bitcoin dataset, extract the zip, and place the CSV in the data/raw folder. Rename the file to bitstampUSD.csv, then run:

python src/make_dataset.py
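
The placement and rename step described above can also be scripted. This is a sketch under assumptions: the helper place_dataset is hypothetical and not part of the repository. Only the data/raw folder and the bitstampUSD.csv name come from the instructions above, and the name of the extracted CSV is left to the caller because it is not stated here.

```python
import shutil
from pathlib import Path


def place_dataset(extracted_csv, repo_root="."):
    """Copy the extracted CSV to <repo_root>/data/raw/bitstampUSD.csv and return that path."""
    source = Path(extracted_csv)
    if not source.is_file():
        raise FileNotFoundError(f"extracted CSV not found: {source}")
    raw_dir = Path(repo_root) / "data" / "raw"
    raw_dir.mkdir(parents=True, exist_ok=True)  # create data/raw if it is missing
    target = raw_dir / "bitstampUSD.csv"
    shutil.copyfile(source, target)  # copy rather than move, so the download is kept
    return target
```

Calling place_dataset from the repository root with the path to the extracted file leaves the CSV at data/raw/bitstampUSD.csv, the location the instructions above specify.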

Resources

PySpark Discovery Keras
Discovery GPU

Running

Run src/train_random_forest_ml_lib.py with the following command (it uses the bare filename, so run it from inside src):

spark-submit --master local[40] --executor-memory 100G --driver-memory 100G train_random_forest_ml_lib.py

The same command also works for train_gradient_boost_mllib.py; substitute the script name.
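
Because both scripts share the same invocation, the command can be assembled programmatically. The sketch below is illustrative only: build_spark_submit is a hypothetical helper, and its defaults simply mirror the command above. The local[40] setting asks Spark to run locally with 40 worker threads, and the two memory flags set executor and driver memory.

```python
import subprocess


def build_spark_submit(script, threads=40, memory="100G"):
    """Build the spark-submit argument list shown in this README for a given script."""
    return [
        "spark-submit",
        "--master", f"local[{threads}]",
        "--executor-memory", memory,
        "--driver-memory", memory,
        script,
    ]


if __name__ == "__main__":
    for script in ("train_random_forest_ml_lib.py", "train_gradient_boost_mllib.py"):
        cmd = build_spark_submit(script)
        print(" ".join(cmd))
        # Uncomment to actually launch the job (requires spark-submit on PATH):
        # subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string keeps the square brackets in local[40] from being interpreted by the shell.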
