##Udacity Data Analyst Nano-Degree Projects
###Author: Zach Farmer
####Repository for Udacity Data Analyst Nano-Degree Projects
Purpose:
This GitHub repository was created as a portfolio into which I intend to place my projects from Udacity's Nano-Degree program.
If your goal is to learn the material yourself, I suggest looking into Udacity's program. If you are interested in using any of the code, be aware that it was written in the context of learning and may not be suitable for production. Furthermore, the skeleton code and data files were provided by Udacity and are covered by their use license (Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License).
To view the projects, navigate to the relevant directory; project filenames include the project number for reference. Final versions of the projects have been converted to HTML. See below for project titles.
##Requirements
I suggest the Continuum Analytics Anaconda Python distribution (http://continuum.io/downloads)
for science, math, engineering, and data analysis. This distribution will allow you to run the IPython notebooks and most of the
modules used within them.
Note:
Further packages may be necessary; these are made clear in the IPython notebook file for each project. I did not set up the notebooks to download missing packages automatically, so if you are missing any of the required modules I suggest using Anaconda's `conda` binary package manager to install or update them. E.g., in a terminal, run `conda install {ggplot}` (insert the relevant package name, without the braces) or `conda update {ggplot}`.
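To find out which modules are missing before opening a notebook, a short check along the following lines may help. This is only a sketch, assuming Python 3.4 or later; the `missing_modules` helper and the `required` list are hypothetical and not part of this repository, so replace the list with the modules named in the notebook you want to run.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list: replace with the modules named in the notebook.
required = ["numpy", "pandas", "ggplot"]

for name in missing_modules(required):
    # Assumes the conda package name matches the import name,
    # which is not true for every package.
    print("conda install {}".format(name))
```

Each printed line is a command you can paste into a terminal. If conda cannot find a package under that name, check the notebook for the correct package name.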
Installation instructions for Anaconda's Python distribution can be found here: [http://docs.continuum.io/anaconda/install.html](http://docs.continuum.io/anaconda/install.html)
For any R code in the .Rmd markdown files that you wish to run on your own machine, I suggest installing RStudio (http://www.rstudio.com/products/rstudio/download/).
###Format
Each project has its own folder, and many are self-contained. With the above installations you should be able to run many of the IPython notebooks on your own machine, installing specialized packages and downloading datasets where necessary. Some of the notebooks, and specifically the R markdown file, may not work as-is because they rely on my file system hierarchy to load files; those paths would have to be changed to match your local system.
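One way to adapt the file paths is to define the data location once at the top of a notebook and build every file path from it. The following is a minimal sketch of that idea, not how the notebooks are currently written; `DATA_DIR`, `data_path`, the folder layout, and the file name are all hypothetical.

```python
import os

# Hypothetical location: point this at the folder where you saved a project's data.
DATA_DIR = os.path.join(os.path.expanduser("~"), "udacity_projects", "Project_2", "data")


def data_path(filename, base=DATA_DIR):
    """Build the full path to a data file stored under `base`."""
    return os.path.join(base, filename)


# Illustrative file name only; use the file names the notebook actually loads.
print(data_path("turnstile_weather.csv"))
```

With this in place, only `DATA_DIR` needs to change when the notebook moves to another machine.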
Project_1
: Test a Perceptual Phenomenon -- Stroop Effect
Project_2
: Investigate a Dataset -- NYC Subway Turnstile data with Weather statistics
Project_3
: Wrangle Open Street Map Data
Project_4
: Explore and Summarize Data -- Prosper Loans
Project_5
: Identify Fraud From Enron Email -- Machine Learning
___Project_6___
Project_7
: Design an A/B Test
Each folder contains a README which will describe the nature of the project and list out all of the files related to the project found within project sub-directories.
##Credit and Reference Information
Udacity Home Page
Primary Resources:
Udacity Data Analyst Nano Degree Program
- Statistics
- Intro to Data Science
- Data Wrangling with MongoDB
- Data Analysis with R
- Intro to Machine Learning
- A/B Testing
Secondary Resources:
*(work in progress... will continue to expand upon)
*(To do: add license info, add contact info)