The project has three parts:
->We created our own cluster in our lab with 7-8 computers, in which one acted as the NameNode and the others as DataNodes.
- We first set up HDFS and uploaded a file of around 4 GB to the cluster.
- We then ran MapReduce: a word-count program over that file.
- We set up Hive, in which we created a database and performed certain operations on it.
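The HDFS upload, word-count job, and Hive step above might look something like the sketch below. All paths, file names, the database name, and the examples-jar location are assumptions, so the cluster commands are shown as comments; the runnable part is a local pipeline that mimics what the word-count job computes (map: split into words, shuffle: sort, reduce: count).

```shell
# On the cluster (hypothetical paths and names):
#   hdfs dfs -mkdir -p /user/demo
#   hdfs dfs -put big-4gb-file.txt /user/demo/
#   hadoop jar "$HADOOP_HOME"/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
#       wordcount /user/demo/big-4gb-file.txt /user/demo/wc-out
#   hive -e 'CREATE DATABASE IF NOT EXISTS demo_db;'   # Hive step; db name assumed

# Local stand-in for the word-count logic:
printf 'to be or not to be\n' |
  tr -s ' ' '\n' |   # one word per line (the "map" step)
  sort |             # group identical words together (the "shuffle" step)
  uniq -c            # count each group (the "reduce" step)
```

The real job distributes the same three steps across the DataNodes; the pipeline is only the single-machine equivalent.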
->We created the cluster using AWS EC2 instances. At run time, the script asks the user for the number of instances to launch, then makes one of those instances the NameNode and the others DataNodes.
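A minimal sketch of that launch flow, assuming the AWS CLI is available; the AMI ID and instance type are placeholders, so the `run-instances` call is commented out and the runnable part is only the role assignment (first instance becomes the NameNode, the rest DataNodes):

```shell
#!/bin/sh
# role N -> "namenode" for the first instance, "datanode" for every other one
role() {
  if [ "$1" -eq 1 ]; then echo namenode; else echo datanode; fi
}

n=3   # in the real script this comes from user input, e.g. `read -r n`

# aws ec2 run-instances --image-id ami-0123456789abcdef0 \
#     --count "$n" --instance-type t2.micro   # AMI and type are placeholders

i=1
while [ "$i" -le "$n" ]; do
  echo "instance $i -> $(role "$i")"
  i=$((i + 1))
done
```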
->We created the cluster using Docker.
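On Docker, the same topology can be sketched as one container per daemon on a user-defined bridge network. The image name `my-hadoop-image` is a placeholder (the source does not say which image was used), so the `docker` commands are commented out and the runnable part only prints the plan:

```shell
# docker network create hadoop-net
# docker run -d --name namenode --network hadoop-net my-hadoop-image
echo "would start: namenode"
for i in 1 2 3; do
  # docker run -d --name "datanode$i" --network hadoop-net my-hadoop-image
  echo "would start: datanode$i"
done
```

Putting all containers on one named network lets the DataNodes reach the NameNode by container name instead of by IP address.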