
regression-api

Flask API for training regression models and displaying a Plotly dashboard.

Code of conduct

Where resources were used for specific literature or code, they are cited in the respective implementation file. The code provided here is not to be replicated for homework, assignments, or any other programming work. You are welcome to take inspiration, but please cite this repository if it is used for learning or inspiration. Please refer to the code of conduct.

regression-api-demo

This is a repository for regression algorithms using scikit-learn, with results displayed as Plotly/Dash interactive plots and a dashboard.
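
For orientation, here is a minimal sketch of the two pieces involved: fitting a scikit-learn regression model and rendering its predictions as an interactive Plotly figure. The data and names are toy illustrations, not the repository's actual code.

    # Sketch: fit a scikit-learn regression model and plot it with Plotly.
    # Toy data and names for illustration only.
    import numpy as np
    import plotly.graph_objects as go
    from sklearn.linear_model import LinearRegression

    # Toy 1-D regression data
    X = np.linspace(0, 10, 50).reshape(-1, 1)
    y = 2.5 * X.ravel() + np.random.normal(0, 1.0, 50)

    model = LinearRegression().fit(X, y)
    y_pred = model.predict(X)

    # Interactive Plotly figure: scatter of the data plus the fitted line
    fig = go.Figure()
    fig.add_trace(go.Scatter(x=X.ravel(), y=y, mode="markers", name="data"))
    fig.add_trace(go.Scatter(x=X.ravel(), y=y_pred, mode="lines", name="fit"))
    fig.show()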

Steps to install

Clone the repository:

git clone https://github.com/sushmaakoju/regression-api.git
cd regression-api

Start the API (if pre-requisites are set up):

Assuming you have already installed Python 3.8.x (optional pre-requisites: Java and a Spark/Hadoop setup), install the following Python requirements.
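
Once the requirements are installed, starting a Flask API of this kind usually amounts to running its entry point. A minimal sketch follows, assuming a hypothetical app.py; the module name and route are illustrative, not the repository's actual layout.

    # Hypothetical app.py: minimal Flask entry point for a regression API.
    # The route name is an assumption for illustration.
    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/health")
    def health():
        # Simple liveness check before wiring up training endpoints
        return jsonify(status="ok")

    if __name__ == "__main__":
        # Flask's built-in development server; not for production use
        app.run(host="127.0.0.1", port=5000, debug=True)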

Pre-requisites to Start the API

Install Spark 3.0.1

Pre-requisites:

  • Java version 8+
  • Python 3.8.x
Install Spark on Windows 10 Operating System
  • Check the Java version:

    java -version
    
  • If you don't have Java installed, download and install it from here: Java 8.x

  • Download Spark 3.0.1 (pre-built for Hadoop 2.7.4) from Spark

  • Verify checksum for Spark:

    certutil -hashfile complete-path-to-downloaded-spark-targz-file SHA512
    
  • Compare the checksum against the one listed in the checksums section for Spark 3.0.1 with Hadoop 2.7.4; it should match the value displayed by the certutil command (a cross-platform Python alternative is sketched after this list).

  • To install, create a Spark folder

        mkdir Spark 
        cd Spark 
    
  • Extract the downloaded Spark archive (spark-3.0.1-bin-hadoop2.7.tgz) to C:\Spark

  • Download winutils.exe for Hadoop 2.7.4 from winutils

  • Create a hadoop folder with a bin subfolder, and copy winutils.exe into the hadoop\bin folder.

        mkdir hadoop 
        cd hadoop 
        mkdir bin 
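
If certutil is unavailable, the same SHA-512 check can be done from Python. The sketch below is illustrative; the archive path is a placeholder to replace with the actual download location.

    # Sketch: compute the SHA-512 checksum of the downloaded Spark archive.
    # The path below is a placeholder; substitute the actual download location.
    import hashlib

    path = r"C:\path\to\spark-3.0.1-bin-hadoop2.7.tgz"  # placeholder

    h = hashlib.sha512()
    with open(path, "rb") as f:
        # Read in chunks so large archives don't need to fit in memory
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)

    print(h.hexdigest())  # compare against the published checksum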
    
Create Environment variables:
  • Go to Environment Variables from Control Panel -> System -> Advanced System Settings. Go to the "User variables for <username>" section.

  • For Spark, click on New and enter Variable name as SPARK_HOME and Variable value as:

    C:\Spark\spark-3.0.1-bin-hadoop2.7 
    
  • For Hadoop, click on New and enter Variable name as HADOOP_HOME and Variable value as:

    C:\hadoop 
    
  • For Java, click on New and enter Variable name as JAVA_HOME and Variable value as:

    C:\Program Files\Java\jre1.8.0_xxx 
    
  • Select the Path variable in the User variables section, click on Edit, and add the following 3 entries:

    %SPARK_HOME%\bin
    %HADOOP_HOME%\bin
    %JAVA_HOME%\bin

  • Save all the settings.
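
To verify the variables are visible to newly started processes, a quick check from Python (run it from a new command prompt so the updated environment is picked up):

    # Quick check that the environment variables from the previous steps are set.
    import os

    for name in ("SPARK_HOME", "HADOOP_HOME", "JAVA_HOME"):
        print(name, "=", os.environ.get(name, "<not set>"))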

Launch and test Spark Installation
  • Open a command prompt, navigate to C:, and type:

    C:\Spark\spark-3.0.1-bin-hadoop2.7\bin\spark-shell
    

    or, to test whether the environment variables set earlier work, just type:

    spark-shell
    
  • The Scala prompt should launch.

  • Navigate to http://localhost:4040/ in a browser to see the Apache Spark UI.

  • Create a file named "test" without any extension in C: and add a few lines with line breaks, for example the placeholder text from here.

  • From the Scala prompt, type the following commands:

    val x = sc.textFile("test")
    x.take(2).foreach(println)
    
  • Press Ctrl-D to exit the Spark shell.
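
Since this repository is Python-based, the same smoke test can also be run with PySpark instead of the Scala shell. A sketch, assuming the pyspark package is installed (for example via pip install pyspark):

    # Sketch: the same smoke test from Python using PySpark.
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "smoke-test")
    lines = sc.textFile("test")   # the extension-less file created above
    for line in lines.take(2):    # print the first two lines, as in the Scala example
        print(line)
    sc.stop()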

Install Spark on Ubuntu/Linux/macOS
Install Apache Spark on Google Cloud

How to configure Dataproc on Google Cloud and Apache Spark/Hadoop
