machine learning, regression, classification, data wrangling, some SQL stuff, and more

yorktronic/data_science

Data Science!

This repo is focused exclusively on my adventure learning data science while enrolled in the Thinkful.com data science program, and on the tools and techniques necessary to perform data-science-related tasks. This includes, but is not limited to: Python, SQLite, pandas, NumPy, SciPy, Dato's GraphLab Create and SFrame, time series analysis, statistical analysis and plots, regression and classification, random forests, decision trees, k nearest neighbors, etc.

There are other data sciency things in the root folder of my GitHub, such as my <a href="https://github.com/yorktronic/hots-comp-calc" target="_blank">Heroes of the Storm team comp calculator</a>, a project that uses optimization techniques to plan food quantities for my wedding, and my work for Coursera's machine learning specialization.

My data science blog can be found here.

Selected Projects / Techniques Used

  1. Predicting body position of smartphone users based on accelerometer data. Decision trees, random forest, black box analysis, Dato, GraphLab Create.

  2. Predict class of flower based on sepal measurements. K nearest neighbors, GraphLab Create, pandas.

  3. Weather analysis of major US cities. API calls, pandas, requests, SQLite, histograms, Q-Q plots.

  4. Determined factors correlated with interest rate offerings from Lending Club. Linear regression, pandas, matplotlib.

  5. Cross validation of Lending Club linear regression. pandas, statsmodels, scikit-learn, KFold.

  6. How New Yorkers bike using the CitiBike public bike program. Time series data, pandas, matplotlib, SQLite.

  7. Document retrieval using Wikipedia data. Text analysis, vectors, nearest neighbors.

  8. Sentiment analysis of Amazon.com product reviews. Natural language processing, classification.
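
As a rough illustration of the k nearest neighbors idea behind project 2, here is a minimal pure-Python sketch. It is not the code from this repo (that project uses GraphLab Create); the `knn_predict` helper and the toy sepal measurements are made up for illustration, and it assumes Python 3.8+ for `math.dist`.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points closest to query.

    train is a list of (features, label) pairs; features and query are
    equal-length sequences of numbers. Hypothetical helper, for illustration.
    """
    # Sort training points by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Made-up (sepal length, sepal width) measurements in cm, not real data.
toy_flowers = [
    ((5.0, 3.5), "setosa"),
    ((4.8, 3.1), "setosa"),
    ((5.2, 3.4), "setosa"),
    ((6.4, 2.9), "versicolor"),
    ((6.1, 2.8), "versicolor"),
    ((6.6, 3.0), "versicolor"),
]

print(knn_predict(toy_flowers, (5.1, 3.3)))
print(knn_predict(toy_flowers, (6.3, 2.9)))
```

A real analysis would hold out part of the data to measure accuracy and to choose `k`, rather than predicting on hand-picked points.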
