Passionate about data

Data and its implications on software design and development.

10 Node Riak Cluster on a Single Machine

When evaluating NoSQL databases, it's usually better to try them out, and to try them in a multi-node configuration rather than on a single node: a cluster in Riak, a replica set in MongoDB, maybe even a sharded setup. On our project we evaluated a 10 node Riak cluster so that we could experiment with N, R and W values and decide which values were optimal for us. In Riak, here is what N, R and W mean:

N = Number of Riak nodes to which the data is replicated
R = Number of Riak nodes that have to return results for a read to be considered successful
W = Number of Riak nodes that have to acknowledge a write before the write is considered successful

These N, R and W settings give us the ability to tune our CAP trade-offs, so they need to be carefully considered when architecting the system. And what better way is there to test the assumptions we make than trying them out with some code against running Riak nodes? To experiment with our assumptions, we built a script that downloads Riak 1.3.1 and creates 10 nodes, giving each node its own pb_port, http port and handoff_port in app.config and its own -name in vm.args.
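Once a cluster is up, these values can be exercised directly over Riak's HTTP API. The sketch below is only an illustration: it assumes the first node created by the script further down (HTTP port 8101), and the bucket and key names are made up.

# Set the replication factor (N) as a bucket property
curl -X PUT http://127.0.0.1:8101/buckets/test/props \
     -H "Content-Type: application/json" \
     -d '{"props":{"n_val":5}}'

# Write a value, requiring 3 of the 5 replicas to acknowledge it (W=3)
curl -X PUT "http://127.0.0.1:8101/buckets/test/keys/demo?w=3" \
     -H "Content-Type: text/plain" \
     -d 'hello riak'

# Read it back, requiring 2 replicas to respond (R=2)
curl "http://127.0.0.1:8101/buckets/test/keys/demo?r=2"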

The first node is used as a master node which all the other nodes join after they are started, using riak-admin cluster join master_node_name@127.0.0.1. Once all the nodes have been started and staged, we can review the cluster plan with riak-admin cluster plan and then commit it with riak-admin cluster commit. This applies the staged changes and makes all the nodes part of one cluster. We can then check the state of the cluster with riak-admin status, or list the nodes in the ring as shown below.

Check status of cluster
riak-admin status | grep 'ring_members'
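Another way to inspect membership, assuming the member-status and ring-status commands that ship with Riak 1.3, is:

riak-admin member-status
riak-admin ring-status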
Script to create 10 node cluster of Riak
#!/bin/bash -e
working_dir=`pwd`
master_node_id='1'
node_directory='riak-node'
node_name_prefix='node'
master_node_name=$node_name_prefix$master_node_id
echo "Checking if Riak is downloaded"
# Note: the glob is left unquoted so it can match an already-downloaded tarball
if [ ! -f riak-1.3.*.tar ]; then
  rm -rf riak-1.3.*.tar
  wget http://s3.amazonaws.com/downloads.basho.com/riak/1.3/1.3.1/osx/10.6/riak-1.3.1-osx-x86_64.tar.gz
fi

echo "Unzip and Set up Riak"
gunzip riak-1.3.1-osx-x86_64.tar.gz
tar -xf riak-1.3.1-osx-x86_64.tar

echo "Setting up 10 nodes"
for i in {1..10}
do
  protocolbuffer_port=`expr 8000 + $i`
  http_port=`expr 8100 + $i`
  handoff_port=`expr 8200 + $i`
  cd $working_dir
  nodeid=$node_directory$i
  cp -r riak-1.3.1 $nodeid
  # Give each node its own protocol buffers, HTTP and handoff ports (the defaults are 8087, 8098 and 8099)
  sed -e s/8087/$protocolbuffer_port/g -i '' $nodeid/etc/app.config
  sed -e s/8098/$http_port/g -i '' $nodeid/etc/app.config
  sed -e s/8099/$handoff_port/g -i '' $nodeid/etc/app.config
  # Give each node a unique Erlang node name in vm.args
  sed -e s/riak@/$node_name_prefix$i@/g -i '' $nodeid/etc/vm.args
  cd $nodeid/bin
  ./riak start
  if [ $i -ne $master_node_id ]; then
    # Every node except the master stages a join to the master node
    ./riak-admin cluster join $master_node_name@127.0.0.1
  fi
done
cd $working_dir/$node_directory$master_node_id/bin/
./riak-admin cluster plan
./riak-admin cluster commit
./riak-admin status | grep 'node.@'
echo "10 node Riak cluster setup"
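To use it, save the script (the filename below is just an example), make it executable and run it from an empty working directory:

chmod +x create-riak-cluster.sh
./create-riak-cluster.sh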

These shell scripts work on my Mac and have not been tested on anything other than OS X 10.7+.
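The main porting concern is the in-place sed invocation: GNU sed on Linux takes -i without the separate empty-suffix argument that the BSD sed shipped with OS X requires. A minimal sketch of the equivalent edit on Linux:

# GNU sed (Linux) equivalent of the OS X in-place edits above
sed -i -e "s/8087/$protocolbuffer_port/g" $nodeid/etc/app.config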

Once experimenting and prototyping with the Riak cluster is done, we can gracefully shut down the cluster and clean up all the folders using this shell script.

Cleanup Riak cluster and the nodes
#!/bin/bash -e
working_dir=`pwd`
master_node_id='1'
node_directory='riak-node'
echo "Cleaning up riak cluster"
for i in {1..10}
do
  echo "Shutdown node:"$i
  nodeid=$node_directory$i
  cd $working_dir
  cd $nodeid/bin
  ./riak stop
done
cd $working_dir
# Kill the Erlang port mapper daemon that the nodes started
./riak-node1/erts-5.9.1/bin/epmd -kill
# Remove the per-node directories, the downloaded tarball and the extracted Riak directory
rm -rf $node_directory*
rm -f riak-1.3.*.tar
rm -rf riak-1.3.1

These scripts and this approach can be modified to work across platforms.
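For example, the download step could select a platform-appropriate build; in the sketch below only the OS X URL from the script above is real, and the Linux URL is a placeholder to fill in from downloads.basho.com.

# Sketch: pick a Riak 1.3.1 build per platform (the Linux URL is a placeholder)
case "$(uname -s)" in
  Darwin) riak_url="http://s3.amazonaws.com/downloads.basho.com/riak/1.3/1.3.1/osx/10.6/riak-1.3.1-osx-x86_64.tar.gz" ;;
  Linux)  riak_url="<linux-build-url-from-downloads.basho.com>" ;;
  *)      echo "unsupported platform"; exit 1 ;;
esac
wget "$riak_url"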