ethen8181/programming

programming

Contents and code snippets kept for personal reference only.

Linux Command Lines

Books and Courses

Python Machine Learning

2016.3.20 | Walking through the book Python Machine Learning.

Chapter 2 : Training Machine Learning Algorithms for Classification

  • Coding up perceptron, batch / stochastic gradient descent.
  • View [nbviewer]
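As a rough illustration of the perceptron update rule this chapter covers, here is a minimal pure-Python sketch. It is not the notebook's code: the names `train_perceptron` and `predict`, the learning rate `eta`, and the toy data are all assumptions made for this example.

```python
def predict(weights, bias, x):
    """Return +1 or -1 depending on the sign of the net input."""
    net_input = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if net_input >= 0.0 else -1


def train_perceptron(X, y, eta=0.1, n_iter=10):
    """Fit weights with the perceptron rule: w += eta * (target - prediction) * x."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(n_iter):
        for xi, target in zip(X, y):
            # The update is zero whenever the sample is already classified correctly.
            update = eta * (target - predict(weights, bias, xi))
            weights = [w + update * v for w, v in zip(weights, xi)]
            bias += update
    return weights, bias


# Hypothetical, linearly separable toy data (not from the book).
X = [[2.0, 1.0], [3.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]]
y = [1, 1, -1, -1]
weights, bias = train_perceptron(X, y)
predictions = [predict(weights, bias, xi) for xi in X]
```

Because the weights change after every sample, this corresponds to the online form of the rule; a batch variant would accumulate the updates over the whole data set before applying them.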

Chapter 3 : A Tour of Machine Learning Classifiers Using Scikit-Learn

  • Using some of Scikit-Learn’s classification algorithms, including logistic regression, SVM, decision tree and KNN.
  • View [nbviewer]

Chapter 4 : Building Good Training Sets – Data Pre-Processing

  • Preprocessing. Filling in missing values and label encoding categorical variables.
  • Coding up sequential backward selection.
  • Accessing random forest variable importance.
  • View [nbviewer]

Chapter 5 : Compressing Data via Dimensionality Reduction ( TODO : lda, kernel )

  • Principal Component Analysis. Matching a from-scratch implementation against Scikit-Learn's.
  • View [nbviewer]

Chapter 6 : Learning Best Practices for Model Evaluation and Hyperparameter Optimization

  • Scikit-Learn’s Pipeline, Learning and Validation Curve.
  • K-Fold, Grid Search and ROC curve.
  • View [nbviewer]

Chapter 7 : Combining Different Models for Ensemble Learning

  • Coding up majority voting, using Scikit-Learn’s version and combining it with Grid Search.
  • View [nbviewer]
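The idea behind majority voting can be sketched in a few lines of plain Python. This is an illustrative stand-in, not the notebook's implementation or Scikit-Learn's; the function name `majority_vote` and the classifier outputs are made up for the example.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among one sample's classifier predictions."""
    # Counter.most_common(1) yields [(label, count)] for the top label.
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical outputs of three classifiers for four samples.
clf_outputs = [
    [0, 1, 1, 0],  # classifier A
    [0, 1, 0, 0],  # classifier B
    [1, 1, 1, 0],  # classifier C
]
# zip(*...) regroups the predictions per sample instead of per classifier.
ensemble = [majority_vote(votes) for votes in zip(*clf_outputs)]
```

With an odd number of binary classifiers there are no ties; a weighted variant would sum per-classifier weights instead of counting votes.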

Python3 Object-Oriented Programming

2016.3.20 | Walking through the book Python3 Object-Oriented Programming.

Chapter 2 : Python OOP Basics

  • Naming conventions for public, protected, private methods.
  • Setting up a python package.
  • Explanation of if __name__ == '__main__'.
  • View [nbviewer]
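A small sketch of the naming conventions and the `__name__` check listed above. The `Account` class and its attributes are hypothetical and only serve to show the conventions.

```python
class Account:
    def __init__(self, owner, balance):
        self.owner = owner        # public: part of the intended interface
        self._balance = balance   # single underscore: "protected" by convention only
        self.__pin = "0000"       # double underscore: name-mangled to _Account__pin

    def balance(self):
        return self._balance


def main():
    account = Account("alice", 100)
    print(account.owner, account.balance())


# True only when the file is run as a script, not when it is imported.
if __name__ == "__main__":
    main()
```

None of these conventions is enforced by the interpreter: `_balance` stays fully accessible, and name mangling only makes accidental access to `__pin` less likely.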

Chapter 4 : Exceptions

  • Raising exceptions and overriding the Exception class to define our own.
  • Using hashlib to hash strings.
  • View [nbviewer]
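The two topics above can be combined in one short sketch: a custom exception derived from `Exception`, and `hashlib` applied to a string. The names `InvalidPassword` and `hash_password` and the minimum length are assumptions for illustration, not taken from the notebook.

```python
import hashlib


class InvalidPassword(Exception):
    """Hypothetical custom exception raised when a password is too short."""

    def __init__(self, password, min_length):
        super().__init__(f"password must have at least {min_length} characters")
        self.password_length = len(password)


def hash_password(password, min_length=6):
    if len(password) < min_length:
        raise InvalidPassword(password, min_length)
    # hashlib works on bytes, so the string is encoded first.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


try:
    hash_password("abc")
except InvalidPassword as err:
    message = str(err)

digest = hash_password("correct horse")
```

A bare SHA-256 digest is used here only to show the `hashlib` call; real password storage would normally rely on a salted key-derivation function.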

Chapter 5 : When to Use OOP

  • Using @property to cache expensive values.
  • Explanations of EAFP (easier to ask for forgiveness than permission), and when to use hasattr versus try .. except.
  • Use of the zipfile, os and shutil modules to unzip files and remove directories.
  • Use of __str__ to format printing.
  • Examples of defining methods in subclass.
  • View [nbviewer]
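The caching and EAFP items above fit together naturally, as in this minimal sketch. The `Report` class and its `compute_calls` counter are hypothetical, added only to make the caching visible.

```python
class Report:
    """Hypothetical class whose `total` stands in for an expensive value."""

    def __init__(self, values):
        self.values = values
        self.compute_calls = 0

    @property
    def total(self):
        # EAFP: try to use the cached value and recover if it is missing,
        # instead of checking with hasattr first.
        try:
            return self._total
        except AttributeError:
            self.compute_calls += 1
            self._total = sum(self.values)
            return self._total

    def __str__(self):
        # Controls what print(report) displays.
        return f"Report(total={self.total})"


report = Report([1, 2, 3])
first = report.total
second = report.total
text = str(report)
```

The second access to `report.total` returns the stored `_total` without recomputing it, so `compute_calls` stays at 1.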

Chapter 6 : Data Structures

  • Data structures: tuples, namedtuples, dictionaries, sets.
  • Examples of __lt__ (to make classes sortable), __repr__, and __add__ and __radd__ (to make classes summable).
  • Using re.compile to find all links in a webpage.
  • View [nbviewer]
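A short sketch of the special methods mentioned above, using a hypothetical `Money` class (not from the notebook) to show why `__lt__` is enough for sorting and why `sum` needs `__radd__`.

```python
from collections import namedtuple

# A namedtuple is a tuple whose fields can also be read by name.
Point = namedtuple("Point", ["x", "y"])


class Money:
    def __init__(self, amount):
        self.amount = amount

    def __repr__(self):
        return f"Money({self.amount})"

    def __lt__(self, other):
        # sorted() and list.sort() only need __lt__.
        return self.amount < other.amount

    def __add__(self, other):
        other_amount = other.amount if isinstance(other, Money) else other
        return Money(self.amount + other_amount)

    # sum() starts from the int 0, so 0 + Money(...) falls back to __radd__.
    __radd__ = __add__


wallet = [Money(5), Money(1), Money(3)]
ordered = sorted(wallet)
total = sum(wallet)
origin = Point(x=0, y=0)
```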

Chapter 7 : OOP Shortcuts

  • Unpacking lists or dictionaries with * and **; update method for dictionaries.
  • Generator comprehension.
  • __getitem__ for dictionary-like indexing syntax on classes.
  • View [nbviewer]
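The shortcuts listed above, gathered in one hedged sketch. `Inventory` and `describe` are hypothetical names chosen for the example.

```python
class Inventory:
    """Hypothetical container that supports dictionary-like indexing."""

    def __init__(self, **items):
        # ** in a signature collects keyword arguments into a dict.
        self._items = dict(items)

    def __getitem__(self, name):
        # Called for inventory[name].
        return self._items[name]


def describe(name, count):
    return f"{name}: {count}"


defaults = {"apples": 1, "pears": 2}
defaults.update({"pears": 5})        # update overwrites existing keys
inventory = Inventory(**defaults)    # ** unpacks a dict into keyword arguments
# * unpacks a list into positional arguments.
line = describe(*["apples", inventory["apples"]])

# Generator expression: values are produced lazily, one at a time.
total = sum(count for count in defaults.values())
```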

Managing Big Data with MySQL

2016.2.18 | Walking through the Coursera course Managing Big Data with MySQL.

Important things to note:

  1. The results are not reproducible, since all the notebooks were connected to the MySQL server provided during the course. The database's data, including six separate tables, is currently stored as separate csv files in the dognition data folder.

  2. If you wish to view the documentation, downloading the whole folder and viewing the notebooks in your local ipython notebook is strongly recommended, since they may not be that visually appealing when viewed directly on the web. To explain what I mean, consider the screenshot below. Even though the output consists of 480 rows in total, the local ipython notebook displays only the first few rows and provides a scroller for you to scroll down, instead of printing out the whole thing. If you view it on the web, all the rows will be printed out.

truncate_rows

Notebooks:

  • Looking at your data. [nbviewer]
  • Selecting data subsets using WHERE. [nbviewer]
  • Formatting selected data ( AS, DISTINCT, exporting data to csv file ). [nbviewer]
  • Summarizing your data. [nbviewer]
  • Summarizing your data by groups. [nbviewer]
  • Common pitfalls of grouped queries. [nbviewer]
  • Inner Joins. [nbviewer]
  • Outer Joins. [nbviewer]
  • Subqueries and derived tables. [nbviewer]
  • Useful logical functions ( IF, CASE ). [nbviewer]
  • Working on the dataset part 1. [nbviewer]
  • Working on the dataset part 2. [nbviewer]
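Since the course's MySQL server is no longer reachable, the difference between inner and outer joins in grouped queries can still be tried locally. The sketch below swaps in Python's built-in `sqlite3` module for MySQL, and the `dogs` and `tests` tables with their rows are invented for the example; they are not the course's dognition tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dogs (dog_id INTEGER PRIMARY KEY, breed TEXT);
    CREATE TABLE tests (dog_id INTEGER, test_name TEXT);
    INSERT INTO dogs VALUES (1, 'Beagle'), (2, 'Poodle'), (3, 'Boxer');
    INSERT INTO tests VALUES (1, 'Yawn'), (1, 'Eye Contact'), (2, 'Yawn');
""")

# Inner join: only dogs that have at least one matching test row.
inner = conn.execute("""
    SELECT d.breed, COUNT(t.test_name) AS num_tests
    FROM dogs AS d
    INNER JOIN tests AS t ON d.dog_id = t.dog_id
    GROUP BY d.breed
    ORDER BY d.breed
""").fetchall()

# Left outer join: every dog is kept; COUNT ignores the NULLs of unmatched rows.
outer = conn.execute("""
    SELECT d.breed, COUNT(t.test_name) AS num_tests
    FROM dogs AS d
    LEFT JOIN tests AS t ON d.dog_id = t.dog_id
    GROUP BY d.breed
    ORDER BY d.breed
""").fetchall()
conn.close()
```

The Boxer has no test rows, so it disappears from the inner join but shows up with a count of 0 in the left join. Counting `t.test_name` rather than `*` matters here, because `COUNT(*)` would report 1 for the unmatched row.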