Introduction
This is a simple example showing how to use Spark to read some data, run it through an algorithm, and store the results in Elasticsearch via the elasticsearch-hadoop connector.
The data comes from the Titanic dataset on the Kaggle website (www.kaggle.com). A RandomForest classifier is run on it to predict which Titanic passengers survived, based on factors such as age, gender, and fare.
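The read, classify, and store flow described above could look roughly like the following PySpark sketch. It is an illustration under stated assumptions, not the project's actual code. The column names (`Survived`, `Sex`, `Age`, `Fare`) follow Kaggle's Titanic CSV. The helper names, the index name `titanic`, the host and port defaults, and the 80/20 split are hypothetical. The elasticsearch-hadoop jar is assumed to be on the Spark classpath.

```python
from typing import Dict


def es_write_options(nodes: str = "localhost", port: int = 9200) -> Dict[str, str]:
    """Build elasticsearch-hadoop connection settings (hypothetical helper).

    "es.nodes" and "es.port" are connector configuration keys; the default
    values here are assumptions for a local cluster.
    """
    return {"es.nodes": nodes, "es.port": str(port)}


def run_pipeline(csv_path: str, index: str, options: Dict[str, str]) -> None:
    """Read the Titanic CSV, train a random forest, and write predictions to ES."""
    # Imported lazily so the helper above can be used without Spark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import RandomForestClassifier
    from pyspark.ml.feature import StringIndexer, VectorAssembler
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("titanic-rf").getOrCreate()

    # Read the data and keep only rows where every selected column is present.
    df = (
        spark.read.csv(csv_path, header=True, inferSchema=True)
        .select("Survived", "Sex", "Age", "Fare")
        .dropna()
    )
    train, test = df.randomSplit([0.8, 0.2], seed=42)

    # Encode gender as a number, assemble the feature vector, then classify.
    indexer = StringIndexer(inputCol="Sex", outputCol="SexIndex")
    assembler = VectorAssembler(
        inputCols=["SexIndex", "Age", "Fare"], outputCol="features"
    )
    forest = RandomForestClassifier(
        labelCol="Survived", featuresCol="features", numTrees=20
    )
    model = Pipeline(stages=[indexer, assembler, forest]).fit(train)

    # Keep plain columns only; the feature and probability vector columns
    # are dropped before the rows are sent to Elasticsearch.
    predictions = model.transform(test).select(
        "Sex", "Age", "Fare", "Survived", "prediction"
    )

    # Store the results through the elasticsearch-hadoop Spark SQL data source.
    (
        predictions.write.format("org.elasticsearch.spark.sql")
        .options(**options)
        .mode("append")
        .save(index)
    )
    spark.stop()
```

A call such as `run_pipeline("train.csv", "titanic", es_write_options())` would then run the whole flow, assuming a local Elasticsearch node and a copy of the Kaggle training file named `train.csv`.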
Credits
Kaggle (for the Titanic dataset)