MUG-Nantes-Demo-Hadoop

Demo of the MongoDB Connector for Hadoop

Slides: http://fr.slideshare.net/BrunoBonnin/mug-nantes-mongodb-et-son-connecteur-pour-hadoop

Step 0 - Build

Java

cd spark/java
mvn clean package assembly:single

Step 1 - Import data

Clean (mongo shell)

use marketdata
db.stock_prices.drop()

Source: see http://www.barchartmarketdata.com/data-samples/mstf.csv
Import

mongoimport file_name.csv --type csv --headerline -d marketdata -c stock_prices
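
To illustrate what `--type csv --headerline` implies, here is a minimal Python sketch: the first line of the file supplies the field names and every following line becomes one document. The helper name `rows_to_documents`, the sample columns, and the sample values are hypothetical; the real field names come from the header of the downloaded file, and mongoimport's own type handling may differ from the simple numeric conversion shown here.

```python
import csv
import io


def rows_to_documents(csv_text):
    """Turn CSV text with a header line into a list of dict documents."""
    # DictReader uses the first line as the field names
    reader = csv.DictReader(io.StringIO(csv_text))
    documents = []
    for row in reader:
        doc = {}
        for field, value in row.items():
            # Store numeric-looking values as numbers, leave the rest as strings
            try:
                doc[field] = float(value)
            except ValueError:
                doc[field] = value
        documents.append(doc)
    return documents


# Hypothetical sample, not taken from the real data file
sample = "Symbol,Day,High,Low\nABC,2024-01-02,12.5,11.0\nABC,2024-01-03,13.0,12.0\n"
docs = rows_to_documents(sample)
print(docs[0])
```

Each dictionary corresponds to one document that would end up in the `stock_prices` collection.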

Company data (text file put into HDFS)

data/put-hdfs.sh

Step 2 - Hive demo

Create the companies table

hive -f hive/0-create-company.sql

Create the external table

hive -f hive/1-create-stock-prices.sql

Select from the new table

hive -f hive/2-select-from-stock-prices.sql

Create the max/min table

hive -f hive/3-create-max-min-prices.sql

Insert data into the max/min table

hive -f hive/4-insert-max-min-prices.sql

Select from the max/min table

hive -f hive/5-select-max-min-prices.sql
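
The SQL files themselves are not reproduced here. Assuming the max/min table holds, for each day, the highest and lowest price found in `stock_prices`, the computation amounts to a group-by on the day. The following Python sketch shows that idea on hypothetical `(day, price)` pairs; the function name and the sample values are illustrative only and do not come from the Hive scripts.

```python
from collections import defaultdict


def max_min_by_day(rows):
    """Return {day: (max_price, min_price)} for an iterable of (day, price) pairs."""
    prices = defaultdict(list)
    for day, price in rows:
        # Group every price under its day, like a GROUP BY on the day column
        prices[day].append(price)
    return {day: (max(values), min(values)) for day, values in prices.items()}


# Hypothetical sample rows
rows = [("2024-01-02", 10.0), ("2024-01-02", 12.5), ("2024-01-03", 9.0)]
result = max_min_by_day(rows)
for day in sorted(result):
    print(day, result[day])
```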

Step 3 - Spark demo

Clean (mongo shell)

use marketdata
db.max_min_prices.drop()

Launch the Spark job

spark/run-java-connector-demo.sh
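
The job's source is not shown in this README. One common way to express a per-day max/min in Spark is to map each record to a `(day, (price, price))` pair and merge the pairs with `reduceByKey`. The sketch below emulates that pattern in plain Python so it runs without a cluster; `reduce_by_key`, `combine_max_min`, and the sample records are hypothetical and are not taken from the demo code.

```python
def reduce_by_key(pairs, combine):
    """Emulate the semantics of Spark's reduceByKey on a list of (key, value) pairs."""
    merged = {}
    for key, value in pairs:
        # First value for a key is kept as is, later ones are merged pairwise
        merged[key] = combine(merged[key], value) if key in merged else value
    return merged


def combine_max_min(a, b):
    """Merge two (max, min) tuples; the operation is associative and commutative."""
    return (max(a[0], b[0]), min(a[1], b[1]))


# Hypothetical records: (day, price)
records = [("2024-01-02", 10.0), ("2024-01-02", 12.5), ("2024-01-03", 9.0)]
# Map step: each price starts out as both the max and the min for its day
pairs = [(day, (price, price)) for day, price in records]
result = reduce_by_key(pairs, combine_max_min)
print(result)
```

Because the merge function works on two partial results at a time, Spark can apply it within each partition before combining partitions, which is why it needs to be associative and commutative.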

Check data (mongo shell)

use marketdata
db.max_min_prices.find().sort({"Day":1})

Step 3 - Alternative: Spark demo in Python

Clean HDFS

hdfs dfs -rm -r data/spark_result

Launch the Spark job

spark/run-py-connector-demo.sh

Check data

hdfs dfs -cat data/spark_result/part-00000
