Skip to content

beljun/zeppelin-plotly

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Overview

Apache Zeppelin comes with basic charts out of the box and even offers the ability to call into the underlying NVD3 library. Sometimes, though, you just want some Plot.ly goodness. While this awesome visualization library can be imported directly within Jupyter, having it work on Zeppelin needs a bit more of work. This repository is one way of making that happen.

It does two things:

  • The Zeppelin web app sources plotly-latest.min.js from the CDN.
  • It overrides Plot.ly's offline module with the code contained here such that calls to it will render via Zeppelin's display system instead of Jupyter's.

Setup

These steps apply to the node that runs Zeppelin.

  1. Install the following Python libraries:

  2. Clone this repository in a commonly accessible path.

  3. Update index.html in zeppelin-web such that it sources plotly-latest.min.js during startup. The cleaner way would be to modify Zeppelin's code and then compile. But I assume you have already have Zeppelin set up and you just want to get Plot.ly running on it as fast as possible. In this case, we just extract index.html from zeppelin-web-*.war, insert the script tag for plotly.js, and then add index.html back into the same WAR file.

  4. Find the zeppelin-web-*.war file inside the root folder of your Zeppelin install (i.e. zeppelin-web-0.6.0-incubating-SNAPSHOT.war). Back it up somewhere so you can restore the original copy if you need to.

  5. Extract index.html

    jar xvf zeppelin-web-0.6.0-incubating-SNAPSHOT.war index.html
    
  6. Edit index.html to insert plotly-latest.min.js script tag just before the <body> tag.

    <link rel="stylesheet" ng-href="assets/styles/looknfeel/{{looknfeel}}.css"> <link rel="stylesheet" href="assets/styles/fontImport.css"> <script src="https://cdn.plot.ly/plotly-latest.min.js"></script> <body ng-class="{'bodyAsIframe': asIframe}">
    
  7. Insert this modified index.html back into the WAR file.

    jar uvf zeppelin-web-0.6.0-incubating-SNAPSHOT.war index.html
    

Your modifications to the setup are now done and you can just start or restart your Zeppelin instance. In case it fails to start, you probably mistyped something when editing index.html. You can always restore the original WAR file you backed up if all else fails.

Usage

  1. At the top of a Zeppelin notebook, copy and execute the following code. Replace with the actual path where you cloned the repository.

    %pyspark
    # Load this module from a non-standard path (non-installed package)
    import sys
    sys.path.insert(0, "PATH_TO_YOUR_REPO/zeppelin-plotly")
    
    import offline
    
    # Override Plotly's module loading to use our patch instead.
    sys.modules["plotly"].offline = offline
    sys.modules["plotly.offline"] = offline
    
    # Import cufflinks (Pandas, etc) like usual.
    import cufflinks as cf
    
    # We need to use Cufflinks in offline mode.
    cf.go_offline()
    
  2. You can then use Plotly and Cufflinks (offline mode) as usual in the suceeding notebook paragraphs.

    %pyspark
    import pandas as pd
    import numpy as np
    
    df = pd.DataFrame(np.random.randn(1000, 4), columns=['a', 'b', 'c', 'd'])
    df.scatter_matrix()
    

Check out the showcase notebook included inside the sample folder if you want to see the different visualizations available on Plot.ly. Just import the json file like how you would any other Zeppelin notebookand then point the sys.path.insert bit to the actual location of your cloned repo.

TODO

  • Maybe contribute this back to the Plot.ly and/or Zeppelin projects so that integration becomes even easier.
  • Write tests

About

One way of using Plot.ly on Zeppelin notebooks

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages