Data Science by Example

This repository is a collection of examples using different data analysis libraries, tools, and techniques.

6. Reading CSVs Fast

CSV parsing: Python, NumPy, Pandas

Reading a CSV file line by line is a common task with many solutions. Which is fastest?
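As a rough illustration of how such a comparison can be set up, here is a minimal sketch using only the standard library. The helper names (`make_csv`, `read_with_split`, `read_with_csv_module`, `time_readers`) are made up for this sketch and are not taken from the example itself. NumPy or Pandas readers could be added to the list of candidates in the same way.

```python
import csv
import io
import timeit


def make_csv(rows=1000, cols=5):
    """Build an in-memory CSV of integers so no file on disk is needed."""
    lines = [",".join(str(r * cols + c) for c in range(cols)) for r in range(rows)]
    return "\n".join(lines) + "\n"


def read_with_split(text):
    """Naive reader: split each line on commas (no quoting support)."""
    return [line.split(",") for line in text.splitlines()]


def read_with_csv_module(text):
    """Reader built on the standard-library csv module."""
    return list(csv.reader(io.StringIO(text)))


def time_readers(text, readers, repeat=5):
    """Return the best wall-clock time in seconds for each reader."""
    return {
        reader.__name__: min(timeit.repeat(lambda: reader(text), number=1, repeat=repeat))
        for reader in readers
    }


if __name__ == "__main__":
    sample = make_csv()
    for name, seconds in time_readers(sample, [read_with_split, read_with_csv_module]).items():
        print(f"{name}: {seconds:.6f} s")
```

Timings depend heavily on the machine and on the shape of the data, so treat any single run as indicative only.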

5. Finding Outliers

outlier identification: IPython, SciPy, matplotlib

We'll implement the generalized ESD test to automate the task of finding bad data points.
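For orientation, the generalized ESD test can be sketched as follows, assuming the usual formulation (Rosner's test statistic and critical values) with NumPy and SciPy. The function name `generalized_esd` and its parameters are chosen for this sketch and may differ from the code in the example.

```python
import numpy as np
from scipy import stats


def generalized_esd(values, max_outliers, alpha=0.05):
    """Return indices of outliers found by a generalized ESD test.

    Assumes roughly normal data and max_outliers <= len(values) - 2.
    """
    x = np.asarray(values, dtype=float)
    n = x.size
    remaining = np.arange(n)  # indices still under consideration
    removed = []              # indices removed so far, in removal order
    n_outliers = 0

    for i in range(1, max_outliers + 1):
        subset = x[remaining]
        deviations = np.abs(subset - subset.mean())
        worst = int(np.argmax(deviations))

        # Test statistic: largest deviation in units of the sample std.
        r_i = deviations[worst] / subset.std(ddof=1)

        # Critical value from the t distribution with n - i - 1 degrees of freedom.
        p = 1 - alpha / (2 * (n - i + 1))
        t = stats.t.ppf(p, n - i - 1)
        lam = (n - i) * t / np.sqrt((n - i - 1 + t ** 2) * (n - i + 1))

        removed.append(int(remaining[worst]))
        remaining = np.delete(remaining, worst)

        # The outlier count is the largest i whose statistic exceeds its critical value.
        if r_i > lam:
            n_outliers = i

    return removed[:n_outliers]


if __name__ == "__main__":
    data = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 10.3, 9.7, 10.1, 50.0]
    print(generalized_esd(data, max_outliers=3))
```

Note that the test removes the most extreme point at every step regardless of the comparison, and only afterwards decides how many of the removed points count as outliers. This is what lets it see past one outlier masking another.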

4. Classifying Irises

machine learning, classification: IPython, scikit-learn

Learning about supervised and unsupervised classification with scikit-learn.
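A minimal sketch of both approaches on the iris dataset bundled with scikit-learn is shown below. The choice of k-nearest neighbors for the supervised case and k-means for the unsupervised case is an assumption made for illustration; the example itself may use different models.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()

# Supervised: fit on labeled training data, then score on held-out data.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=0, stratify=iris.target
)
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
accuracy = knn.score(X_test, y_test)

# Unsupervised: group the same measurements into three clusters without labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(iris.data)

if __name__ == "__main__":
    print(f"k-NN test accuracy: {accuracy:.2f}")
    print(f"k-means cluster sizes: {sorted((cluster_labels == c).sum() for c in range(3))}")
```

The cluster numbers that k-means assigns are arbitrary, so they have to be matched to species names before they can be compared with the true labels.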

3. Parsing Goodreads for the Discworld Series

web scraping, data analysis: IPython, requests, BeautifulSoup, Pandas

Discworld is one of the longest book series ever written, with 41 novels. If you haven't read any of them, where should you start? Let's use Goodreads to learn more.

2. Parsing IMDb for the Best Writers & Directors

web scraping, data analysis: IPython, Pandas

Everyone knows their favorite actors. Only movie buffs know their favorite directors. But no one knows their favorite movie writers. Let's use IMDb to find out more.

1. The Philosophy of Wikipedia

web scraping: IPython, BeautifulSoup, requests

Randall Munroe posited that if you start on any Wikipedia article, take the first link, then take the first link on that article, and keep repeating, you eventually end up on the Philosophy page. Let's automate that.
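The link-following loop itself can be sketched independently of the scraping. In the sketch below, `follow_first_links` and its `get_first_link` argument are hypothetical names: `get_first_link` stands in for whatever function downloads an article and extracts its first link (for instance with requests and BeautifulSoup), and is replaced here by a small hand-written dictionary so the loop can be run offline.

```python
def follow_first_links(start, get_first_link, target="Philosophy", max_steps=100):
    """Follow first links from `start` and return the list of titles visited.

    `get_first_link` is a caller-supplied function mapping an article title
    to the title of its first link, or None if the article has no link.
    The walk stops at the target, at a dead end, at a loop, or after
    `max_steps` links.
    """
    path = [start]
    seen = {start}
    while path[-1] != target and len(path) <= max_steps:
        next_title = get_first_link(path[-1])
        if next_title is None or next_title in seen:
            break  # dead end, or a cycle that would never reach the target
        path.append(next_title)
        seen.add(next_title)
    return path


if __name__ == "__main__":
    # Hand-written stand-in for real pages, for illustration only.
    fake_wiki = {
        "Cat": "Mammal",
        "Mammal": "Biology",
        "Biology": "Science",
        "Science": "Philosophy",
    }
    print(" -> ".join(follow_first_links("Cat", fake_wiki.get)))
```

Tracking the titles already seen matters because some chains of first links loop back on themselves and would otherwise run forever.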
