
Kuan-Ru-Chiou/analysis

 
 


analysis


This repo is the data science (DS) application demo of the "Daas (Data as a service) repo". It mainly uses Jupyter notebooks to show step-by-step analysis and ML/DL approaches on various data science subjects. The idea is to demonstrate how a data scientist deals with a new dataset: pre-process the data, do exploratory data analysis (EDA), then run a suitable model and offer suggestions that are feasible for the business and carry acceptable statistical error (i.e. the DS workflow: business understanding -> data preprocessing -> EDA -> data understanding -> analysis/modeling). Main focus of this project: 1) statistics/ML analysis, 2) ML theory/algorithm explanations, 3) Spark ops/ML demos.
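The workflow above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not code from the notebooks: the toy dataset, the column meaning (ad spend vs. sales) and the helper names `preprocess`, `explore`, `fit` and `rmse` are all assumptions made up for this example.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical toy dataset of (ad_spend, sales) pairs; None marks a missing value.
raw_rows = [(1.0, 2.1), (2.0, 3.9), (None, 5.0), (3.0, 6.2), (4.0, 8.1), (5.0, None)]


def preprocess(rows):
    """Data preprocessing: drop rows that contain a missing value."""
    return [row for row in rows if None not in row]


def explore(rows):
    """EDA: a few summary statistics to get a feel for the data."""
    xs, ys = zip(*rows)
    return {"n": len(rows), "mean_x": mean(xs), "mean_y": mean(ys), "stdev_y": stdev(ys)}


def fit(rows):
    """Modeling: ordinary least squares for y = slope * x + intercept."""
    xs, ys = zip(*rows)
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in rows)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


def rmse(rows, slope, intercept):
    """Statistical error: root mean squared error of the fitted line."""
    return sqrt(mean((y - (slope * x + intercept)) ** 2 for x, y in rows))


clean = preprocess(raw_rows)
summary = explore(clean)
slope, intercept = fit(clean)
error = rmse(clean, slope, intercept)

print(summary)
print(f"slope={slope:.3f} intercept={intercept:.3f} rmse={error:.3f}")
```

Whether the reported error is acceptable is a business decision; in this sketch it is just a number to compare against whatever threshold the use case tolerates.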

Main Projects

Machine Learning

Tensorflow Demo

Statistics

Spark

spark op intro

  • Pyspark Basic 1 - Basic Spark ops (transformations & actions): RDD, map, flatMap, reduce, filter, distinct, intersection
  • Pyspark Basic 2 - Basic Spark ops: load CSV, DataFrame, Spark SQL, transformations in [RDD, DataFrame, Spark SQL]
  • Pyspark Basic 3 - Basic Spark ops: Spark DataFrame ops

spark ML intro

spark APP

Other Projects

  • dev

Quick start

Quick_start.md

About

Repo for practical approaches to data science problems, including notebook demos and working scripts


Languages

  • Jupyter Notebook 89.8%
  • TSQL 7.6%
  • HTML 1.7%
  • Python 0.8%
  • PLpgSQL 0.1%
  • R 0.0%