- Please download and unzip: https://goo.gl/PXhGRK
- Download winutils from https://bit.ly/2NbXns4 and place it in C:\username\Downloads\hadoop\bin
- Open the Anaconda Prompt (from the Start Menu)
- Create the folder C:\tmp\hive
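- If the folder does not already exist, one way to create it from the same prompt (a standard cmd command) is:
> mkdir C:\tmp\hive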
- Run the following commands:
> set HADOOP_HOME=C:\username\Downloads\hadoop
> set PATH=%HADOOP_HOME%\bin;%PATH%
> winutils.exe chmod -R 777 C:\tmp\hive
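- To confirm the permissions took effect, winutils also provides an ls subcommand (assuming the same prompt session and PATH as above); the entry should show as drwxrwxrwx:
> winutils.exe ls C:\tmp\hive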
- Locate the Spark folder (suppose it is C:\username\Downloads\spark), and run:
> set SPARK_HOME=C:\username\Downloads\spark
> set PATH=%SPARK_HOME%\bin;%PATH%
> set PYSPARK_DRIVER_PYTHON=jupyter
> set PYSPARK_DRIVER_PYTHON_OPTS=notebook
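- Note that set only affects the current Anaconda Prompt session. To verify the variables are in place, you can echo them back:
> echo %SPARK_HOME%
> echo %PYSPARK_DRIVER_PYTHON%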
- Download the tutorial material from: https://bitbucket.org/jaidevd/ipec-fdp
- Unzip the downloaded archive to C:\username\Downloads\ipec-fdp
- Navigate to that folder from the Anaconda Prompt
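- Assuming the default location used above, this is:
> cd C:\username\Downloads\ipec-fdp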
- Run
> conda install --file requirements.txt
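- If conda cannot resolve some packages, a pip fallback from the same environment usually works (assuming the requirements file lists standard PyPI package names):
> pip install -r requirements.txt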
- Run
> pyspark
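- This should launch a Jupyter notebook server with PySpark available. If nothing opens, a quick sanity check is to confirm the Spark binaries resolve from PATH; the following only prints version information and exits:
> spark-submit --version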