Overview

Apache Zeppelin is a web-based notebook that enables interactive data analytics. You can make beautiful data-driven, interactive, and collaborative documents with SQL, Scala and more.

As a Multi-purpose Notebook, Apache Zeppelin is the place for interactive:

  • Data Ingestion
  • Data Discovery
  • Data Analytics
  • Data Visualization & Collaboration

Usage

This is a subordinate charm that requires the apache-spark interface. This means that you will need to deploy a base Apache Spark cluster to use Zeppelin. An easy way to deploy the recommended environment is to use the apache-hadoop-spark-zeppelin bundle. This will deploy the Apache Hadoop platform with an Apache Spark + Zeppelin unit that communicates with the cluster by relating to the apache-hadoop-plugin subordinate charm:

juju-quickstart apache-hadoop-spark-zeppelin
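
If the juju-quickstart tool is not already installed on your client machine, it has typically been available as an Ubuntu package; as a sketch (assuming an Ubuntu client with the package in the archive or the Juju stable PPA), it can be installed with:

sudo apt-get install juju-quickstart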

Alternatively, you may manually deploy the recommended environment as follows:

juju deploy apache-hadoop-hdfs-master hdfs-master
juju deploy apache-hadoop-yarn-master yarn-master
juju deploy apache-hadoop-compute-slave compute-slave
juju deploy apache-hadoop-plugin plugin
juju deploy apache-spark spark
juju deploy apache-zeppelin zeppelin

juju add-relation yarn-master hdfs-master
juju add-relation compute-slave yarn-master
juju add-relation compute-slave hdfs-master
juju add-relation plugin yarn-master
juju add-relation plugin hdfs-master
juju add-relation spark plugin
juju add-relation zeppelin spark
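
The full stack can take several minutes to deploy and relate. As a rough sketch (assuming a standard shell with the watch utility available), you can monitor progress until all units report a started state:

watch -n 10 juju status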

Once deployment is complete, expose the zeppelin service:

juju expose zeppelin

You may now access the web interface at http://{spark_unit_ip_address}:9090. The IP address can be found by running juju status spark | grep public-address.
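
If you prefer to script the lookup, the following sketch (assuming the default YAML output of juju status, where the address appears after a public-address: key; the SPARK_IP variable name is only illustrative) captures the address and prints the Zeppelin URL:

SPARK_IP=$(juju status spark | grep public-address | awk '{print $2}')
echo "Zeppelin web interface: http://${SPARK_IP}:9090"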

Testing the deployment

By default, this deployment uses Spark in YARN mode and supports storing job data in HDFS. To test this, access the Zeppelin web interface at http://{spark_unit_ip_address}:9090 (see above for how to find the IP address), then:

  • Verify there is a green icon in the upper-right corner that says "Connected"
  • Click the Zeppelin HDFS Tutorial link
  • Click the Save button to bind the tutorial to the supported interpreters
  • Click the Play button (arrow at the top of the page)
  • Click OK when prompted to run all paragraphs

The tutorial may take 5-10 minutes to run as it retrieves sample data, processes jobs, and stores results in HDFS. When successful, each paragraph will report FINISHED in its upper-right corner.
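
To confirm from the command line that HDFS itself is reachable, you can open a shell on the HDFS master unit and list the filesystem root. This is only a sketch: it assumes the hdfs command is on the default PATH for that unit, and the exact path the tutorial writes to depends on the notebook.

juju ssh hdfs-master/0
hdfs dfs -ls /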

Contact Information

Help
