Wednesday, November 21, 2012

Setting up a ZooKeeper Cluster

ZooKeeper is a distributed, open-source, high-performance coordination service for distributed applications. A running ZooKeeper cluster is a prerequisite for many other installations.
This post explains how to set up a ZooKeeper cluster.

 

Prerequisites

A Java JDK must be installed on every node of the cluster.

 

Setting up the cluster

The following installation steps have been tested on Ubuntu 10.04, 10.10, 11.04 and 12.04.

1. Download the ZooKeeper setup and extract it to a location of your choice on each node. The setup can be downloaded from:
http://hadoop.apache.org/zookeeper/releases.html

2. Create a file named zoo.cfg in the conf folder of the extracted setup (bin/zkServer.sh reads conf/zoo.cfg by default) and write the following in it:


tickTime=2000
dataDir=/var/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888

Each server line has the form server.id=host:port1:port2. The id must be a number between 1 and 255; a name such as server1 (as in server.server1=zoo1:2888:3888) fails when the configuration is parsed. The host names (zoo1, zoo2, zoo3) can be chosen freely. 2888 (used by followers to connect to the leader) and 3888 (used for leader election) are the conventional ports, but they can be changed as long as the chosen ports are free and reachable between the nodes. tickTime is the basic time unit in milliseconds; initLimit and syncLimit are expressed as multiples of it.

Next, create a file named "myid" in the directory specified by dataDir on each machine. The file contains a single line holding that machine's server id and nothing else.

To make this clearer, suppose we are using 3 systems with the IPs 192.192.192.191, 192.192.192.192 and 192.192.192.193, where zoo1 designates 192.192.192.191, zoo2 designates 192.192.192.192 and zoo3 designates 192.192.192.193.
Then the machine 192.192.192.191 should contain a file called myid at /var/zookeeper/ (or whatever value of dataDir is specified in zoo.cfg) containing the following entry:
1
Similarly, the myid files on machines 192.192.192.192 and 192.192.192.193 should contain 2 and 3 respectively.
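The myid step can be scripted. The following is only a minimal sketch, assuming numeric server ids (server.1, server.2, ...) as ZooKeeper requires; write_myid is a hypothetical helper name, not something shipped with ZooKeeper.

```shell
#!/bin/sh
# Sketch: write the numeric server id into <dataDir>/myid.
# write_myid is a hypothetical helper, not part of the ZooKeeper setup.
write_myid() {
    data_dir="$1"
    server_id="$2"
    # The id must consist of digits only (no "server1"-style names).
    case "$server_id" in
        ''|*[!0-9]*)
            echo "server id must be a number, got: $server_id" >&2
            return 1
            ;;
    esac
    # ZooKeeper expects ids in the range 1..255.
    if [ "$server_id" -lt 1 ] || [ "$server_id" -gt 255 ]; then
        echo "server id must be between 1 and 255" >&2
        return 1
    fi
    mkdir -p "$data_dir" || return 1
    # The file holds a single line with the id and nothing else.
    printf '%s\n' "$server_id" > "$data_dir/myid"
}

# Example for the first machine (zoo1); writing to /var/zookeeper
# may require root privileges:
#   write_myid /var/zookeeper 1
```

Run it with a different id on each machine, matching the number after "server." in that machine's zoo.cfg entry.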

3. Update the /etc/hosts file on each machine to add the host names used in the ZooKeeper configuration, so that zoo1, zoo2 and zoo3 resolve to the right systems.
After the update, the /etc/hosts file on every system in the cluster should contain a set of entries like:

192.192.192.191   zoo1
192.192.192.192   zoo2                                                                     
192.192.192.193   zoo3
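If the hosts file is updated by a script, it helps to make the update safe to run more than once. The sketch below is one possible way to do that; add_host_entry is a hypothetical helper, and the file path is passed in as a parameter so the function can be tried on a copy before touching the real /etc/hosts.

```shell
#!/bin/sh
# Sketch: append "ip name" to a hosts file unless that mapping already exists.
# add_host_entry is a hypothetical helper name.
add_host_entry() {
    hosts_file="$1"
    ip="$2"
    name="$3"
    # Look for a line whose first field is the IP and that lists the name.
    if awk -v ip="$ip" -v name="$name" '
            $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$hosts_file" 2>/dev/null; then
        return 0    # mapping already present, nothing to do
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
}

# Example (editing /etc/hosts requires root privileges):
#   add_host_entry /etc/hosts 192.192.192.191 zoo1
#   add_host_entry /etc/hosts 192.192.192.192 zoo2
#   add_host_entry /etc/hosts 192.192.192.193 zoo3
```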

4. This completes the configuration. Next, cd to the ZooKeeper home directory on each system and start the server by running the following command:

bin/zkServer.sh start
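To check the result, bin/zkServer.sh status reports whether a server is running as leader or follower, and bin/zkCli.sh -server accepts a comma-separated list of host:port pairs for a client session. The sketch below builds that list from the host names and the clientPort used in this post; zk_connect_string is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: build a ZooKeeper connect string such as "zoo1:2181,zoo2:2181".
# zk_connect_string is a hypothetical helper; the first argument is the
# clientPort, the remaining arguments are the server host names.
zk_connect_string() {
    port="$1"
    shift
    result=""
    for host in "$@"; do
        # Separate entries with commas, but not before the first one.
        result="${result:+$result,}${host}:${port}"
    done
    printf '%s\n' "$result"
}

# On each server, report whether it is running as leader or follower:
#   bin/zkServer.sh status
# From any machine that has the ZooKeeper setup, open a client session:
#   bin/zkCli.sh -server "$(zk_connect_string 2181 zoo1 zoo2 zoo3)"
```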
You now have a running ZooKeeper cluster at your disposal.
Good luck!

3 comments:

  1. Hi,

    Nicely done.
    One correction.
    Do not put server1, server2, ... in the myid file.
    Just put "1", "2", etc. in the myid file (without the quotes, of course); otherwise it will result in a config error.

    -Vishal

    1. Actually,
      just "1" would be put in myid if in zoo.cfg you are entering like below:

      server.1=zoo1:2888:3888

      _________________

      But, she's entered like below in zoo.cfg

      server.server1=zoo1:2888:3888

      so, in myid, it would be "server1"


      Thanks
      Parvinder

  2. I have followed these steps, but I am not sure how to connect the client to these servers.
    I ran zkCli.sh but got the error described here:

    http://stackoverflow.com/questions/7755525/why-this-error-is-coming-in-zookeeper
