john-science/data_science_by_example

Data Science by Example

This repository is a collection of examples using different data analysis libraries, tools, and techniques.

CSV-parsing: Python, NumPy, Pandas

Reading a CSV file line by line is a common task with many possible solutions. Which is fastest?
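As a rough illustration of the kind of comparison this example makes, here is a minimal sketch that times two pure-Python parsers on synthetic in-memory data. The function names (`make_csv`, `parse_with_split`, `parse_with_csv_module`) and the data are made up for this sketch and are not taken from the repository; the NumPy and Pandas readers are not covered here.

```python
import csv
import io
import timeit


def make_csv(n_rows=10_000, n_cols=5):
    """Build synthetic CSV text in memory (made-up integer data)."""
    lines = [
        ",".join(str(r * n_cols + c) for c in range(n_cols))
        for r in range(n_rows)
    ]
    return "\n".join(lines) + "\n"


def parse_with_split(text):
    """Naive parser: split each line on commas (no quoting support)."""
    return [line.split(",") for line in text.splitlines()]


def parse_with_csv_module(text):
    """Standard-library parser: handles quoted fields and embedded commas."""
    return list(csv.reader(io.StringIO(text)))


text = make_csv()

# Time each parser over a few repetitions; results vary by machine.
for parser in (parse_with_split, parse_with_csv_module):
    seconds = timeit.timeit(lambda: parser(text), number=5)
    print(f"{parser.__name__}: {seconds:.4f} s for 5 runs")
```

The naive split is only safe when no field contains a quoted comma, so a speed difference alone does not settle which approach to use.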

outlier identification: IPython, SciPy, matplotlib

We'll implement the generalized ESD test to automate the task of finding bad data points.
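For orientation, the following is a minimal sketch of the generalized ESD test in its usual textbook formulation (Rosner's procedure), assuming roughly normally distributed data and an upper bound on the number of outliers. The function name `generalized_esd` and its parameters are hypothetical and may differ from the notebook's implementation.

```python
import numpy as np
from scipy import stats


def generalized_esd(values, max_outliers, alpha=0.05):
    """Return the original indices of points flagged as outliers."""
    x = np.asarray(values, dtype=float)
    n = x.size
    if max_outliers > n - 2:
        raise ValueError("max_outliers must be at most n - 2")

    remaining = np.arange(n)  # original indices still in the sample
    removed = []              # indices in the order they were removed
    n_outliers = 0

    for i in range(1, max_outliers + 1):
        sample = x[remaining]
        std = sample.std(ddof=1)
        if std == 0:
            break  # all remaining values are identical

        # Test statistic: largest absolute deviation in units of std.
        deviations = np.abs(sample - sample.mean())
        j = int(np.argmax(deviations))
        r_i = deviations[j] / std

        # Critical value for step i, from the t distribution.
        p = 1 - alpha / (2 * (n - i + 1))
        t = stats.t.ppf(p, n - i - 1)
        lam = (n - i) * t / np.sqrt((n - i - 1 + t**2) * (n - i + 1))

        removed.append(int(remaining[j]))
        remaining = np.delete(remaining, j)

        # The outlier count is the largest i with r_i above its critical value.
        if r_i > lam:
            n_outliers = i

    return removed[:n_outliers]
```

Because the count is taken from the largest step that exceeds its critical value, the procedure can still flag an early point whose own statistic fell short, which is how it copes with outliers that mask one another.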

machine learning, classification: IPython, scikit-learn

Learning about supervised and unsupervised classification with scikit-learn.

web scraping, data analysis: IPython, requests, BeautifulSoup, Pandas

Discworld is one of the longest book series ever written, at 41 books. If you haven't read any of them, where should you start? Let's use Goodreads to learn more.

web scraping, data analysis: IPython, Pandas

Everyone knows their favorite actors. Only movie buffs know their favorite directors. But no one knows their favorite movie writers. Let's use IMDb to find out more.

web scraping: IPython, BeautifulSoup, requests

Randall Munroe posited that if you start on any Wikipedia article, take the first link, then take the first link on that article, and keep repeating, you always end up on the Philosophy page. Let's automate that.
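The link-following loop itself can be sketched independently of the scraping. In the sketch below, `follow_first_links` is a hypothetical helper and `first_link` is a caller-supplied function that returns the title of the first link on a page (in the real example it would fetch and parse the article with requests and BeautifulSoup). A made-up toy link graph stands in for Wikipedia so the snippet runs offline.

```python
def follow_first_links(start, first_link, target="Philosophy", max_steps=100):
    """Follow first links from `start`; return (path, reached_target)."""
    path = [start]
    seen = {start}
    while path[-1] != target and len(path) <= max_steps:
        nxt = first_link(path[-1])
        if nxt is None or nxt in seen:
            break  # dead end, or a loop that never reaches the target
        path.append(nxt)
        seen.add(nxt)
    return path, path[-1] == target


# Toy link graph (made up for illustration, not real Wikipedia data).
toy_links = {
    "Cat": "Animal",
    "Animal": "Biology",
    "Biology": "Science",
    "Science": "Philosophy",
    "A": "B",
    "B": "A",
}

path, reached = follow_first_links("Cat", toy_links.get)
print(" -> ".join(path), "| reached target:", reached)
```

Tracking visited titles matters because some chains fall into a cycle; without the `seen` set, the loop would only stop at `max_steps`.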
