dariooliveira/oa_lesson_4_classification

Supervised Learning and Classification

Datasets and example code for Lesson 4 of Oracle Academy's Data Science Bootcamp. The repository contains the following scripts and datasets:

  1. ncdc_parse.hql and ncdc_parser.py provide a HiveQL script and a Python script for parsing the NCDC data in the data folder
  2. tree_building.R provides a script for building a decision tree in R
  3. weather_ooze provides a set of Hive and Pig+Weka scripts for deploying an Oozie workflow for model evaluation
  4. olh provides a loading script for Oracle Loader for Hadoop
  5. pmml provides the complete source for deploying a model saved as PMML via Cascading (requires Gradle and Cascading to build)
  6. data provides 3 years of weather station data for California
  7. weather_sample and weather_sample2 provide samples for tree_building.R
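
For readers new to decision trees, the core step of tree building (choosing the split that best separates the classes) can be sketched in a few lines. This is a minimal illustration in plain Python, not the contents of tree_building.R. The Gini criterion, the function names, and the toy temperature/humidity rows are all assumptions made for the example and are not taken from the repository's data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) for the best binary split.

    Every observed value of every feature is tried as a threshold; rows with
    value <= threshold go left, the rest go right.
    """
    best = (None, None, gini(labels))  # start from the unsplit impurity
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must put rows on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (f, t, score)
    return best


# Hypothetical toy data: (temperature_c, humidity_pct) -> "rain" or "dry"
rows = [(12, 90), (14, 85), (25, 40), (28, 35), (18, 80), (30, 30)]
labels = ["rain", "rain", "dry", "dry", "rain", "dry"]

feature, threshold, impurity = best_split(rows, labels)
print(feature, threshold, impurity)
```

A full tree applies the same search recursively to the left and right subsets until the nodes are pure or a depth limit is reached.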

About

Lesson 4 in Oracle Academy's Data Science Bootcamp: Supervised Learning for Classification and Prediction
