
Doko - A smart tour guide for world travelers. Explore and travel without planning. This is a sample of the work from a 14-day capstone project for the data science immersive program at Galvanize - SOMA (a.k.a. Zipfian Academy).


EmmaNguyen/doko


DOKO: Galvanize Capstone Project

Notice: This work is at an early stage and more updates will follow. Stay tuned!

Doko is a smart tour guide for world travelers: explore and travel without planning. This repository is a sample of the work from a 14-day capstone project for the data science immersive program at Galvanize.

Overview

The ultimate goal is to create a web app that helps people explore new places. I apply unsupervised learning models, including K-means clustering, to segment users into groups with similar behavior and to recommend businesses (by business ID) within each group. The data comes from the Yelp Dataset Challenge and is used under its academic/educational terms.

This is just a simple demo of K-means clustering that runs on a local machine with 1,000 users. I am testing more models on Spark on AWS with around 2 TB of data. More work will be added later.
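The cluster-then-recommend idea described above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the code from this repository: the user IDs, the two features per user, the business IDs, the `liked` mapping, and the `recommend` helper are all hypothetical, and the real demo works on Yelp data rather than this toy input.

```python
import random
from collections import Counter


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm on tuples of floats; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as starting centroids
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


def recommend(user_id, labels_by_user, liked, top_n=3):
    """Rank businesses liked by other users in the same cluster (hypothetical helper)."""
    cluster = labels_by_user[user_id]
    seen = set(liked[user_id])
    counts = Counter(
        business
        for other, label in labels_by_user.items()
        if other != user_id and label == cluster
        for business in liked[other]
        if business not in seen
    )
    return [business for business, _ in counts.most_common(top_n)]


# Toy, made-up data: (average star rating given, share of reviews about food).
users = {
    "u1": (4.5, 0.90), "u2": (4.7, 0.80), "u3": (4.4, 0.85),
    "u4": (2.0, 0.10), "u5": (1.8, 0.20), "u6": (2.2, 0.15),
}
liked = {
    "u1": ["b_cafe", "b_sushi"], "u2": ["b_sushi", "b_ramen"], "u3": ["b_sushi"],
    "u4": ["b_museum"], "u5": ["b_museum", "b_park"], "u6": ["b_park"],
}

user_ids = list(users)
centroids, labels = kmeans([users[u] for u in user_ids], k=2)
labels_by_user = dict(zip(user_ids, labels))
print(recommend("u1", labels_by_user, liked))
```

Recommending what other members of the same cluster liked is one simple way to turn user segments into business IDs. On the full data set the same two steps would more likely run through a distributed K-means implementation than through this single-machine loop.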

Future work

My next step is to apply topological data analysis (computational topology), implemented on Spark, to study the shape of the data and to generate a better similarity matrix from the clustered network. It is a very promising approach for very high-dimensional, complex data, but the field still needs a lot of work.
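For context, the kind of similarity matrix that such a method would try to improve on can be sketched as a plain cosine similarity between user vectors. This is a baseline illustration only and is not topological data analysis; the rating vectors and function names below are made up for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Pairwise cosine similarities as a square list of lists."""
    n = len(vectors)
    return [[cosine_similarity(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical star ratings by three users for the same three businesses
# (0 means the user has not rated that business).
ratings = [
    (5, 0, 3),
    (4, 0, 2),
    (0, 5, 0),
]
sim = similarity_matrix(ratings)
for row in sim:
    print([round(value, 3) for value in row])
```

The first two users rate the same businesses in similar proportions, so their similarity is close to 1, while the third user shares no rated business with them and gets a similarity of 0.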

Acknowledgement

I would like to thank Galvanize for my time in the data science immersive program, which gave me a great deal of experience on my journey of seeking knowledge.

References

  1. “NEO-PI-R - Manual.” Accessed October 20, 2016. http://www.unifr.ch/ztd/HTS/inftest/WEB-Informationssystem/en/4en001/d590668ef5a34f17908121d3edf2d1dc/hb.htm
  2. Yelp Inc. “Yelp Dataset Challenge.” Accessed October 20, 2016. https://www.yelp.com/dataset_challenge
  3. Wikipedia, s.v. “Maven.” Wikimedia Foundation, 2016. Accessed October 20, 2016. https://en.wikipedia.org/wiki/Maven
  4. Wikipedia, s.v. “Topological data analysis.” Wikimedia Foundation, 2016. Accessed October 20, 2016. https://en.wikipedia.org/wiki/Topological_data_analysis
