Running Zeppelin on CDH

Download and Build Zeppelin

Go to the download page and get the latest source package.

Untar the source package and create a git repo to make bower happy:

$ tar zxvf zeppelin-0.5.6-incubating.tgz
$ cd zeppelin-0.5.6-incubating
$ git init

Before building from source, first determine the Hadoop version by running the following command on the edge node:

$ hadoop version
Hadoop 2.6.0-cdh5.4.8
This command was run using /opt/cloudera/parcels/CDH-5.4.8-1.cdh5.4.8.p0.4/lib/hadoop/hadoop-common-2.6.0-cdh5.4.8.jar

Build Zeppelin with YARN support enabled using the Maven profile corresponding to the Hadoop version found above:

$ mvn clean package -Pbuild-distr -Pyarn -Pspark-1.5 -Dspark.version=1.5.2 \
    -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh5.4.8 -DskipTests -Pvendor-repo

Note: this assumes you are using a custom Spark version, as described in our previous post.
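
If you build for more than one cluster, the version lookup and the Maven invocation can be scripted. The following is only a sketch: the function names parse_hadoop_version and zeppelin_build_command are made up for illustration, and it assumes the Maven profiles are named after the major.minor version (hadoop-2.6, spark-1.5), as in the command above. Check that a matching profile actually exists in your Zeppelin source tree before relying on it.

```shell
# Hypothetical helpers, not part of Zeppelin or Hadoop.

# Reads `hadoop version` output on stdin and prints the version string
# from the first line, e.g. 2.6.0-cdh5.4.8
parse_hadoop_version() {
  awk 'NR == 1 && $1 == "Hadoop" { print $2 }'
}

# Prints (does not run) the Maven build command.
#   $1: full Hadoop version, e.g. 2.6.0-cdh5.4.8
#   $2: full Spark version, e.g. 1.5.2
# Assumption: profile names follow the major.minor pattern.
zeppelin_build_command() {
  hadoop_minor="$(printf '%s\n' "$1" | cut -d. -f1,2)"
  spark_minor="$(printf '%s\n' "$2" | cut -d. -f1,2)"
  printf 'mvn clean package -Pbuild-distr -Pyarn -Pspark-%s -Dspark.version=%s -Phadoop-%s -Dhadoop.version=%s -DskipTests -Pvendor-repo\n' \
    "$spark_minor" "$2" "$hadoop_minor" "$1"
}
```

On the edge node you could then run zeppelin_build_command "$(hadoop version | parse_hadoop_version)" 1.5.2 and review the printed command before executing it.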

Installing Zeppelin on the Edge Node

Copy the distribution to the edge node:

$ scp zeppelin-distribution/target/zeppelin-x.y.z-incubating.tar.gz edge-node:

SSH to the edge node, extract the tarball and cd to the Zeppelin installation directory:

$ tar zxvf /path/to/zeppelin-x.y.z-incubating.tar.gz
$ cd zeppelin-x.y.z-incubating/

Configure Zeppelin by creating and editing conf/

$ cp conf/{.template,}

It should contain the following variables:

export SPARK_HOME="$HOME/spark-x.y.z-bin-cdhx.y.z" # Assuming you are using a custom Spark version
export MASTER=yarn-client
export ZEPPELIN_JAVA_OPTS="-Dspark.yarn.jar=$HOME/spark-x.y.z-bin-cdhx.y.z/lib/spark-assembly-x.y.z-hadoopx.y.z-cdhx.y.z.jar"

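export DEFAULT_HADOOP_HOME=/opt/cloudera/parcels/CDH-x.y.z-1.cdhx.y.z.p0.11/lib/hadoop
# Assumed line (not in the original): fall back to the parcel path
export HADOOP_HOME=${HADOOP_HOME:-$DEFAULT_HADOOP_HOME}

if [ -n "$HADOOP_HOME" ]; then
  # Assumed body (missing in the original): expose Hadoop's native libraries
  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${HADOOP_HOME}/lib/native
fi

export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/etc/hadoop/conf}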
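
Before starting the server it can help to confirm that the variables above are set and point at real directories. The following is a minimal sketch, assuming a POSIX shell; check_zeppelin_env is a hypothetical helper, not something shipped with Zeppelin.

```shell
# Hypothetical sanity check for the variables listed above.
# Returns 0 if all are set and the paths exist, 1 otherwise.
check_zeppelin_env() {
  status=0
  # Every variable must be set and non-empty.
  for name in SPARK_HOME MASTER HADOOP_CONF_DIR; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "not set: $name" >&2
      status=1
    fi
  done
  # The two path variables must point at existing directories.
  for name in SPARK_HOME HADOOP_CONF_DIR; do
    eval "value=\${$name:-}"
    if [ -n "$value" ] && [ ! -d "$value" ]; then
      echo "not a directory: $name=$value" >&2
      status=1
    fi
  done
  return "$status"
}
```

Source the configuration file in your shell first, then run check_zeppelin_env; any problems are reported on stderr.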

Manage the Zeppelin Server

To start the server run:

 $ bin/ start

To stop it:

 $ bin/ stop