Apache_Spark-MlLiB-Titanic-Kaggle-Competition

Introduction

This is a simple example showing how to use Spark to read some data, run it through an algorithm, and store the results in Elasticsearch via the elasticsearch-hadoop connector.

The data comes from the Titanic dataset on the Kaggle website (www.kaggle.com). A RandomForest classifier is run on it to predict which Titanic passengers survived, based on factors such as age, gender, and fare.
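The pipeline described above could look roughly like the sketch below. It is not the code in spark.py. It assumes the standard Kaggle train.csv column order (PassengerId, Survived, Pclass, Name, Sex, Age, ...), fills missing ages with a placeholder value, and writes to a hypothetical Elasticsearch resource named titanic/predictions on localhost. The names parse_passenger, es_write_conf, and run_pipeline are made up for illustration.

```python
import csv


def parse_passenger(line, default_age=30.0):
    """Turn one CSV line into (passenger_id, label, [sex, age, fare]).

    Assumes the Kaggle train.csv column order. default_age is an
    arbitrary placeholder for rows where Age is blank.
    """
    # csv.reader handles the quoted Name column, which contains commas.
    row = next(csv.reader([line]))
    passenger_id = int(row[0])
    label = float(row[1])                      # Survived: 0 or 1
    sex = 1.0 if row[4] == "female" else 0.0   # encode gender as 0/1
    age = float(row[5]) if row[5] else default_age
    fare = float(row[9]) if row[9] else 0.0
    return passenger_id, label, [sex, age, fare]


def es_write_conf(resource, nodes="localhost", port="9200"):
    """Build the elasticsearch-hadoop settings used when saving an RDD."""
    return {"es.nodes": nodes, "es.port": port, "es.resource": resource}


def run_pipeline(csv_path, es_resource="titanic/predictions"):
    """Read the CSV, train a random forest, and store predictions in Elasticsearch."""
    # Imported here so the helpers above work without Spark installed.
    from pyspark import SparkContext
    from pyspark.mllib.regression import LabeledPoint
    from pyspark.mllib.tree import RandomForest

    sc = SparkContext(appName="titanic-random-forest")
    lines = sc.textFile(csv_path)
    header = lines.first()
    rows = lines.filter(lambda l: l != header).map(parse_passenger).cache()

    # Feature 0 (sex) is categorical with two values; the rest are continuous.
    training = rows.map(lambda r: LabeledPoint(r[1], r[2]))
    model = RandomForest.trainClassifier(
        training, numClasses=2, categoricalFeaturesInfo={0: 2},
        numTrees=10, seed=42)

    # In PySpark the model predicts on a whole RDD, so zip the results back.
    # For simplicity this predicts on the same rows the model was trained on.
    predictions = model.predict(rows.map(lambda r: r[2]))
    docs = rows.map(lambda r: r[0]).zip(predictions).map(
        lambda p: ("key", {"passenger_id": p[0], "survived": int(p[1])}))

    # The output format ignores the key; each dict becomes one document.
    docs.saveAsNewAPIHadoopFile(
        path="-",
        outputFormatClass="org.elasticsearch.hadoop.mr.EsOutputFormat",
        keyClass="org.apache.hadoop.io.NullWritable",
        valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
        conf=es_write_conf(es_resource))
    sc.stop()
```

Calling run_pipeline needs a Spark installation with the elasticsearch-hadoop jar on the classpath (for example via the --jars option of spark-submit) and a reachable Elasticsearch node. The two helper functions are plain Python and can be tried on their own.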

Credits

Kaggle (for the Titanic dataset)

PranavGoel/Apache_Spark-MlLiB-Titanic-Kaggle-Competition
