Most of us love ZooKeeper, and whenever we have to pick a coordination service it has become the default choice for almost all of us.
ZooKeeper looks a lot like a file system (though it is not one), with a hierarchical structure. Every node in the ZooKeeper tree is called a znode (pronounced zee-node). A znode holds data and can also have children.
Here are some key characteristics of ZooKeeper that all of us should know before using it.
1. File API without partial read/writes
The goal of ZooKeeper is to provide a very simple file API, so the number of operations supported on files is small. It provides only these simple methods: create, delete, exists, get data, set data, get children and sync.
When you read or write, you have to read all of the data or write all of the data. There is no option for partial reads or writes.
2. Focus on Read dominant workloads
ZooKeeper is designed for read-dominant systems, i.e. when you need a lot of very fast read operations and comparatively few write operations. It performs best when the ratio of reads to writes is around 10:1. If you have a write-intensive use case, don't expect it to perform better no matter how many nodes you add to a ZooKeeper quorum: every write has to be agreed on by a majority of the servers, so adding servers helps read throughput, not write throughput.
3. Ordered updates and strong persistence guarantees
When you write to ZooKeeper, the update is persisted and ordered, i.e. every update is stamped with a number (the zxid, or ZooKeeper transaction ID) that reflects its position in the overall order. Everyone sees updates in that order, and when a write succeeds it is guaranteed that a majority of the servers have recorded the change. Also, requests from a given client are processed in the order in which they were sent.
4. Conditional updates
This allows you to change data with optimistic locking. Optimistic locking needs a version for each record, and ZooKeeper provides one in the following way:
Stat stat = new Stat(); // org.apache.zookeeper.data.Stat, filled in by getData
byte[] zookeeperData = zk.getData("/zkpath", null, stat); // null = no watcher
This reads the data and its version information in one call. The stat object stores the version of the data. You can then pass that version (the one you got during the read) when writing the data back:
zk.setData("/zkpath", data, stat.getVersion());
If there is a version mismatch, the method throws KeeperException.BadVersionException. This is what gives you optimistic locking: the write succeeds only if nobody else has changed the node since you read it.
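In practice the read and the conditional write are usually wrapped in a retry loop. The following is a minimal sketch of that loop against a small in-memory stand-in rather than a live ensemble, so it runs without a server. VersionedNode, Snapshot, StaleVersionException and incrementCounter are hypothetical names made up for illustration. With the real client, read() would correspond to zk.getData(path, null, stat), write() to zk.setData(path, data, stat.getVersion()), and the stale-version error to KeeperException.BadVersionException.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical in-memory stand-in for a single znode; NOT the ZooKeeper client API.
class OptimisticUpdateSketch {

    // Plays the role of KeeperException.BadVersionException in this model.
    static class StaleVersionException extends RuntimeException {
        StaleVersionException(String message) { super(message); }
    }

    // Data plus the version it was read at, like getData() filling in a Stat.
    static class Snapshot {
        final byte[] data;
        final int version;
        Snapshot(byte[] data, int version) { this.data = data; this.version = version; }
    }

    static class VersionedNode {
        private byte[] data;
        private int version = 0; // incremented on every successful write

        VersionedNode(byte[] initial) { this.data = initial.clone(); }

        synchronized Snapshot read() { return new Snapshot(data.clone(), version); }

        // Succeeds only if the caller's expected version is still the current one.
        synchronized void write(byte[] newData, int expectedVersion) {
            if (expectedVersion != version) {
                throw new StaleVersionException(
                        "expected version " + expectedVersion + " but node is at " + version);
            }
            data = newData.clone();
            version++;
        }
    }

    // Read-modify-write with retry: re-read and try again if another writer got in first.
    static int incrementCounter(VersionedNode node, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            Snapshot snap = node.read();
            int next = Integer.parseInt(new String(snap.data, StandardCharsets.UTF_8)) + 1;
            try {
                node.write(Integer.toString(next).getBytes(StandardCharsets.UTF_8), snap.version);
                return next;
            } catch (StaleVersionException e) {
                if (attempt >= maxAttempts) throw e; // give up after maxAttempts tries
            }
        }
    }
}
```

The important design point is that the new value is always computed from the snapshot that supplied the version, so a rejected write never overwrites someone else's change; it simply triggers a fresh read.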
5. Batch updates
If you want to operate on multiple nodes together, you can use the batch update feature of ZooKeeper (the multi operation). It lets you modify multiple nodes atomically: either all of the operations in the batch are applied or none of them is.
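With the Java client, batches are submitted through ZooKeeper.multi(...), which takes a list of Op objects built with Op.create, Op.setData, Op.delete or Op.check. The sketch below is not that API. It is a toy in-memory model, under assumed names (AtomicBatchSketch and its create, delete and multi methods are all hypothetical), meant only to show what "all or nothing" means for a batch.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy model of all-or-nothing batches; hypothetical names, NOT the ZooKeeper client API.
class AtomicBatchSketch {
    private Map<String, String> nodes = new HashMap<>();

    // Each operation mutates a working copy and throws if it cannot be applied.
    static Consumer<Map<String, String>> create(String path, String data) {
        return tree -> {
            if (tree.containsKey(path)) throw new IllegalStateException("node exists: " + path);
            tree.put(path, data);
        };
    }

    static Consumer<Map<String, String>> delete(String path) {
        return tree -> {
            if (!tree.containsKey(path)) throw new IllegalStateException("no node: " + path);
            tree.remove(path);
        };
    }

    // Applies every operation to a copy; the live tree is replaced only if all succeed.
    synchronized boolean multi(List<Consumer<Map<String, String>>> ops) {
        Map<String, String> working = new HashMap<>(nodes);
        try {
            for (Consumer<Map<String, String>> op : ops) {
                op.accept(working);
            }
        } catch (IllegalStateException e) {
            return false; // nothing from this batch becomes visible
        }
        nodes = working;
        return true;
    }

    synchronized String get(String path) { return nodes.get(path); }
}
```

For example, a batch that creates /c and then tries to delete a node that does not exist returns false, and /c is not there afterwards.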
6. Watches for data changes
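This is a very useful feature: you can set watches on data. You don't have to poll the data to see if it has changed; you get notified when it changes. Note that a standard watch is a one-time trigger, so you have to set it again after it fires if you want further notifications.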
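One detail that often surprises people is that a standard watch fires once and is then gone, and the notification itself does not carry the new data. The sketch below models that behaviour in memory so it can run without a server. OneShotWatchSketch and its Watcher interface are hypothetical names, not the client API; in the real client a watch is left by passing a Watcher (or true) to calls such as getData, exists or getChildren.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of one-time watches on a single node; hypothetical names, NOT the ZooKeeper client API.
class OneShotWatchSketch {
    interface Watcher { void changed(String path); }

    private final String path;
    private String data;
    private List<Watcher> watchers = new ArrayList<>();

    OneShotWatchSketch(String path, String data) {
        this.path = path;
        this.data = data;
    }

    // Reads the data and optionally leaves a watch behind; pass null for no watch.
    synchronized String getData(Watcher watcher) {
        if (watcher != null) watchers.add(watcher);
        return data;
    }

    // A write fires every pending watch exactly once, then forgets them.
    void setData(String newData) {
        List<Watcher> toNotify;
        synchronized (this) {
            data = newData;
            toNotify = watchers;
            watchers = new ArrayList<>();
        }
        for (Watcher w : toNotify) {
            w.changed(path); // the event names the path only; re-read to see the new data
        }
    }
}
```

A watcher that wants continuous updates therefore calls getData again from inside its callback, passing itself as the watcher, which both fetches the new value and re-arms the watch.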
7. No Renames
ZooKeeper does not let you rename nodes. You also can't delete a parent node while it still has children; the children have to be deleted first.
8. Ephemeral nodes
You can create ephemeral nodes in ZooKeeper. These nodes are temporary and exist only as long as the session that created them is alive. Once the session ends or expires, the nodes are deleted automatically.
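With the Java client, a node is made ephemeral by passing CreateMode.EPHEMERAL to zk.create(...). The lifecycle itself can be illustrated with a small in-memory model, shown below. EphemeralNodeSketch and its methods are hypothetical names for illustration, not the client API. The model borrows one convention from the real Stat object, where an ephemeral owner of 0 means the node is not ephemeral, and it assumes real session ids are never 0.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of ephemeral nodes; hypothetical names, NOT the ZooKeeper client API.
class EphemeralNodeSketch {
    // path -> id of the owning session; 0 means persistent (no owner)
    private final Map<String, Long> owners = new HashMap<>();

    synchronized void createPersistent(String path) {
        owners.put(path, 0L);
    }

    // Assumes sessionId is non-zero, since 0 is reserved for persistent nodes here.
    synchronized void createEphemeral(String path, long sessionId) {
        owners.put(path, sessionId);
    }

    synchronized boolean exists(String path) {
        return owners.containsKey(path);
    }

    // Called when a session closes or expires: drop every node that session owns.
    synchronized void endSession(long sessionId) {
        owners.values().removeIf(owner -> owner == sessionId);
    }
}
```

This is why ephemeral nodes are a natural fit for membership lists: each worker registers a node under a shared parent, and a crashed worker disappears from the list once its session expires, without anyone cleaning up.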
9. Data is in-memory
All the data is served from memory. The file system is used as a backup store for reliability, so that if anything goes wrong you can recover from it. One thing to keep in mind is that memory is not unlimited: the data has to fit within whatever heap you assign to the ZooKeeper process or service.