Wednesday, January 4, 2012

Replication for HA and DR

Disaster Recovery (DR) is an essential part of any storage solution. There are many aspects of this: local backups, remote backups, and cross data center replication.
I will tackle backups in another blog post, here I will write about replication.

Since I prefer to use built-in features rather than rolling our own internal add-ons, over the past few months I have added various features to HBase that I believe are needed for DR with HBase. These are:
  • HBASE-4071 minimum versions with TTL
  • HBASE-4536 option to keep deleted cells and markers
  • HBASE-2195 cyclic replication - master <-> master is a special case of that)
  • HBASE-2196 (replication to multiple slave clusters)
With these in place it is possible to set up slave cluster(s) in other data centers and use them for DR purposes.

Disaster here means:
  • The data center sinks into the ocean.
    Note that replication is asynchronous. If an entire data center - along with all local backups - is lost, there will be a window during which some data is lost.
But once you have a replica it can be used for the following scenarios as well:
  • The customer deleted some data by accident (which means performing and confirming soft deletion and performing and confirming hard deletion afterwards... Still it happens and did happen).
  • The application software deletes some data by accident.

Replication can help as follows:
  • Setup up the main cluster as usual.
  • Setup an identical slave cluster (same tables, same config)
  • Enable replication between the two
This would be a nice way to just allow the slave to take over if the master fails completely. In that case it might be prudent to setup master-master replication between the two.

Or maybe you want be able to restore the state of (say) the past 48 hours without expensive and time consuming restores from (gasp) tape:

  • Setup up the main cluster as usual.
  • Setup a slave in the same way, but set the TTL  for all tables to 48h and minimum versions to 1 (HBASE-4071), along with a reasonably large setting for maximum versions (MAX_INT to be correct).
  • Enable replication.
Now the slave cluster will retain the data for at least 48h, and always keep one version of the data around. If any accidental modification is detected within 48h the data can be recovered (although there is no way to stop the clock).

Or say you wanted to guard against accidental deletes as well.
  • Setup up the main cluster as usual.
  • As before setup TTL, min/max versions in the slave cluster's table.
  • Also enable keep deleted cells (see HBASE-4536) on all slave tables
  • Enable replication.
Now the slave cluster will also keep deleted cells (and all needed delete markers) around for 48h. So that full point-in-time queries are possible to look at the state of a table as of a specific time, guarding against accidental modifications and deletes.

Maybe you want even multiple slave clusters, one to take over in case of failure, and one to guard against accidental deletes, one with many cheap disks, etc.

2 comments:

  1. Nice Post Lars :)
    I am also interested in different replication policies of HBase. In particular I am looking forward to do some part of my PhD work by implementing new replication schemes for NoSQL systems like HBase. Could you kindly able to guide me through few initial steps from where I can able to hook up with the HBase replication mechanism (I mean the actual code, documentation, various deployment scenarios, etc.)? I am very new to HBase, Hadoop, HDFS and Zookeeper so it would be also great to point out few good end-to-end workaround deployment guides as well :) Thanks again for the nice post. Now heading towards your other posts :)

    ReplyDelete
  2. Hi Joarder,

    glad it was useful.
    You can find some information here:
    http://hbase.apache.org/replication.html

    and here:
    http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/replication/package-summary.html#requirements

    The best way to start is probably by looking at the code.
    You could start with ReplicationSource.java and ReplicationSink.java and work your way out from here.

    ReplyDelete