When evaluating NoSQL databases, it's usually better to try them out. While trying them out, it's better to use multi-node configurations instead of a single node, such as a cluster in Riak or a replica set in MongoDB, maybe even a sharded setup. On our project we evaluated a 10-node Riak cluster so that we could experiment with N, R and W values and decide which values were optimal for us. In Riak, here is what N, R and W mean.
N = Number of Riak nodes to which data will be replicated
R = Number of Riak nodes which have to return results for the read to be considered successful
W = Number of Riak nodes which have to return a write success before the write is considered successful
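A common rule of thumb for stores that expose these three values is that the replicas answering a read overlap with the replicas that acknowledged a write when R + W > N. The sketch below only does that arithmetic; the function name quorum_overlaps is made up for illustration and is not part of Riak, and it ignores failure scenarios where the rule is weaker.

```shell
#!/bin/bash
# Hypothetical helper (not part of Riak): succeeds when the read and write
# quorums must share at least one replica, i.e. R + W > N.
quorum_overlaps() {
    local n=$1 r=$2 w=$3
    [ $((r + w)) -gt "$n" ]
}

# Try a few R/W combinations for N=3
for rw in "1 1" "2 2" "1 3" "3 1"; do
    read -r r w <<< "$rw"
    if quorum_overlaps 3 "$r" "$w"; then
        echo "N=3 R=$r W=$w: read and write sets overlap"
    else
        echo "N=3 R=$r W=$w: a read may miss the latest write"
    fi
done
```

Lower R and W values favour latency and availability, higher values favour consistency, which is the trade-off worth measuring on a real cluster.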
These N, R and W settings give us the ability to tune our CAP trade-offs, so they need to be carefully considered when architecting the system. What better way is there to test the assumptions we make than trying them out with some code and running Riak nodes? To experiment with our assumptions, we built a script that downloads Riak 1.3.1 and creates 10 nodes, giving each node its own pb_port, HTTP port and handoff_port by changing them in app.config, and its own -name in the vm.args file.
The first node is used as a master node, which all other nodes join after they are started, using the riak-admin cluster join master_node_name@127.0.0.1 command. When all the nodes have been started, we can look at the cluster plan using riak-admin cluster plan and then commit it using riak-admin cluster commit. This commits the cluster changes and makes all the nodes part of one cluster. We can view the status of the cluster using riak-admin status, or see the nodes of the cluster using
riak-admin status | grep 'ring_members'
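The ring_members line can also be checked from a script rather than by eye. This is a minimal sketch that counts the node names on that line; count_ring_members is a hypothetical helper, and the line format shown in the comment is an assumption about what riak-admin status prints, so adjust the pattern if your output differs.

```shell
#!/bin/bash
# Hypothetical helper: counts the quoted node names on the ring_members
# line read from stdin. Assumes the line looks like
#   ring_members : ['node1@127.0.0.1','node2@127.0.0.1']
count_ring_members() {
    grep 'ring_members' | grep -o "'[^']*@[^']*'" | wc -l | tr -d ' '
}

# Example with a canned line instead of a live cluster
sample="ring_members : ['node1@127.0.0.1','node2@127.0.0.1','node3@127.0.0.1']"
echo "$sample" | count_ring_members
```

Against a running node, the call would be ./riak-admin status | count_ring_members from that node's bin directory.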
#!/bin/bash -e
working_dir=`pwd`
master_node_id='1'
node_directory='riak-node'
node_name_prefix='node'
master_node_name=$node_name_prefix$master_node_id
tarball='riak-1.3.1-osx-x86_64.tar'
echo "Checking if Riak is downloaded"
# Download and decompress only if the tarball is not already present
if [ ! -f "$tarball" ]; then
    rm -f "$tarball.gz"
    wget http://s3.amazonaws.com/downloads.basho.com/riak/1.3/1.3.1/osx/10.6/riak-1.3.1-osx-x86_64.tar.gz
    gunzip "$tarball.gz"
fi
echo "Unpack and set up Riak"
tar -xf "$tarball"
echo "Setting up 10 nodes"
for i in {1..10}
do
    # Give every node its own ports, offset by the node number
    protocolbuffer_port=$((8000 + i))
    http_port=$((8100 + i))
    handoff_port=$((8200 + i))
    cd "$working_dir"
    nodeid=$node_directory$i
    cp -r riak-1.3.1 "$nodeid"
    # In-place edits of the default ports and node name (BSD sed syntax, as on OS X)
    sed -e "s/8087/$protocolbuffer_port/g" -i '' "$nodeid/etc/app.config"
    sed -e "s/8098/$http_port/g" -i '' "$nodeid/etc/app.config"
    sed -e "s/8099/$handoff_port/g" -i '' "$nodeid/etc/app.config"
    sed -e "s/riak@/$node_name_prefix$i@/g" -i '' "$nodeid/etc/vm.args"
    cd "$nodeid/bin"
    ./riak start
    # Every node except the master joins the master's cluster
    if [ "$i" -ne "$master_node_id" ]; then
        ./riak-admin cluster join "$master_node_name@127.0.0.1"
    fi
done
cd $working_dir/$node_directory$master_node_id/bin/
./riak-admin cluster plan
./riak-admin cluster commit
./riak-admin status | grep 'node.@'
echo "10 node Riak cluster setup"
These shell scripts work on my Mac and have not been tested on anything other than OS X 10.7+.
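With the cluster running, R and W can be varied per request through Riak's HTTP interface using the r and w query parameters. The following is a rough sketch that only prints the curl commands; node_key_url is a hypothetical helper, the bucket test and key greeting are placeholders, and the port is derived from the 8100 + node number scheme used in the setup script.

```shell
#!/bin/bash
# Hypothetical helper: builds the HTTP URL for a key on node $1, using the
# 8100 + node number port scheme from the setup script.
node_key_url() {
    local node=$1 bucket=$2 key=$3
    echo "http://127.0.0.1:$((8100 + node))/buckets/$bucket/keys/$key"
}

# A write through node 1 with W=2 and a read through node 5 with R=1.
# The commands are only printed; run them by hand against a live cluster.
echo "curl -X PUT -H 'Content-Type: text/plain' -d 'hello' '$(node_key_url 1 test greeting)?w=2'"
echo "curl '$(node_key_url 5 test greeting)?r=1'"
```

Stopping a few nodes and repeating the requests with different r and w values is one way to see how the settings behave under failure.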
Once we are done experimenting and prototyping with the Riak cluster, we can gracefully shut it down and clean up all the folders using this shell script:
#!/bin/bash -e
working_dir=`pwd`
master_node_id='1'
node_directory='riak-node'
echo "Cleaning up riak cluster"
for i in {1..10}
do
    echo "Shutdown node:$i"
    nodeid=$node_directory$i
    cd "$working_dir"
    cd "$nodeid/bin"
    ./riak stop
done
cd $working_dir
./riak-node1/erts-5.9.1/bin/epmd -kill
rm -rf riak-node*
rm -f riak-1.3.*.tar
rm -rf riak-1.3.1
These scripts and this approach can be modified to work on other platforms.
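One platform-specific detail is the sed -i '' form used in the setup script, which is BSD sed syntax and is not accepted by GNU sed. A possible way around it, sketched here under the assumption that a temporary file is acceptable, is to avoid -i altogether; replace_in_file is a hypothetical helper, not something from the original scripts. The download URL would also need to point at a package for the target platform.

```shell
#!/bin/bash
# Hypothetical helper: replaces every occurrence of $1 with $2 in file $3
# without sed -i, whose syntax differs between BSD and GNU sed.
# $1 is treated as a sed regex, so neither argument may contain "/".
replace_in_file() {
    local from=$1 to=$2 file=$3 tmp
    tmp=$(mktemp "${TMPDIR:-/tmp}/replace.XXXXXX") || return 1
    # Write through cat so the original file keeps its permissions
    sed "s/$from/$to/g" "$file" > "$tmp" && cat "$tmp" > "$file"
    local status=$?
    rm -f "$tmp"
    return $status
}

# Demo on a throwaway file instead of a real app.config
demo=$(mktemp "${TMPDIR:-/tmp}/demo.XXXXXX")
echo '{pb_port, 8087}' > "$demo"
replace_in_file 8087 8001 "$demo"
cat "$demo"
rm -f "$demo"
```

In the setup script, a call such as replace_in_file 8087 "$protocolbuffer_port" "$nodeid/etc/app.config" would stand in for the corresponding sed line.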