by July 1, 2011 0 comments

Hrishikesh Dewan,Consultant, SIEMENS

In the last two issues, we discussed about REDIS – the remote data structure server. Continuing with the same spirit, in this and the next issue, we will discuss an interesting distributed data storage platform known as CouchDB. It’s a distributed document oriented store with several interesting principles of distributed, parallel and data store/retrieval algorithms that have been applied in its design. The designers of CouchDB proclaim it as the easiest software to use and maintain and hence they used the word ‘couch’ implicitly meaning ‘to relax’.

[image_library_tag 370/67370, border=”0″ align=”right” hspace=”4″ vspace=”4″ ,default]

Applies to:
Cloud developers
USP: Learn how this document oriented database helps develop scalable applications
Related articles: REDIS cloud DB:

CouchDB is one variety of NoSQL database among several others like MongoDB, RavenDB, Cassandra etc and primarily is a document oriented data store. The word document has a special meaning; it embodies a loose structure of composition of a set of key-value pairs and by itself is a self-contained entity. For example, your visiting card is a document. Your photo is also a document and so is the bill you receive when you buy items from a shop. Each of these documents is self-contained in that it contains enough information about a particular entity and each may vary widely as well. The easiest way to understand that is the visiting card example. A person may have a mobile phone number but not a fixed phone number and further someone may have both but there may also be a fax number. Primarily in RDBMS, to store this kind of information, we create different tables. Relate the table using primary-foreign keys and then store ‘null’ whenever there is no data to be filled. For example, the visiting card entry for the person with no fax will have null in it’s field. Similarly, we define one-many relationship and segregate them into multiple tables. For example, a person may have multiple mobile phone numbers and so on. In document oriented data stores, we don’t create such rudimentary and rigid structures to store data; instead we create a loose representation of the data and provide a unique key to identify a particular document. Documents stored may be arbitrary and thus could be a collection of documents with no relationship among them. The flexibility of loose structuring proves to be a boon for many applications but also carries certain disadvantages as well. Since the querying mechanism needs to take care of a lot of uncertainties of combinations, extra work is required to organize them. However, indexes are created for this purpose and they help to solve problems of these kind. CouchDB in essence is such a document oriented store along with several other features that help in a wide variety of engineering problems.

But before we move to the nitty-gritty details of how to use CouchDB, let us first understand the core algorithms/techniques and data structures used by CouchDB. Understanding of these core data structures and algorithms will not only help us in knowing CouchDB better but also other Cloud-based storage frameworks that we shall take up in subsequent issues. Let us start with the easiest one first.

Secondary memory data storage and

To access secondary memory, for example, hard disk takes orders of magnitude much more than to access primary memory. Access to hard disk is via the disk I/O driver and is generally an interrupt to the executing process. Further, the unit of access in a disk is not a single word but rather a block of memory addresses. When computers were slow, had smaller footprint and reduced data set for computation, memory data structures like Binary Search Tree, Linked list, etc are used to process and store data. However, when the amount of data started increasing, it became obvious that storing all data in the main memory will not be feasible. For example, can we store 10 million key-value pairs with each value being of size 10 MB on a 4GB primary memory? The answer is no. Further, an added advantage of storing data in the secondary memory benefits us to make data sets crash proof even if there is power outage or a RAM crash. To overcome problems of such kind several secondary memory data structures were designed and B-Tree is one of them. B-Tree was invented by Bayer and McCreight in the beginning of 1971 while they were working at Boeing and it is not clearly known what the ‘B’ in BTree stands for. Nonetheless, this structure (shown in figure) is a balanced tree where all the leaves are at the same height and each node includes a multiple number of elements with an equal number of child keys. According to the definition of a BTree, there are a few properties that each B Tree has to adhere to and they are:

  • Root must have at least two children.
  • All internal nodes have at least M/2 and at most M elements.
  • All the leaves have the same height.

Since, for every element there are two children, therefore a BTree node with M elements have M+1 children at most and such children can contain at most M elements. Further, there is a distinct relationship between the order of elements and their children in each node. In a single node, if we have M elements and keys of each elements are K1, K2, K3, Km then elements inside the node are ordered as per the following relation : K1 < K2 < K3 < K4 ......< Km. Note that we are not bothered about duplicate keys here. Also, within each intermediate key, we have a child whose elements are less than Kj but greater than Ki. This definition and its structure is almost like a Binary Search Tree (BST) but unlike a BST, where there is one single node with maximum two childs, in a BTree, we have M elements and M+1 children. In a typical implementation, a single node with multiple keys are either implemented as an array, a link list and in some sophisticated implementations even a BST is used.

[image_library_tag 373/67373, border=”0″ align=”middle” hspace=”4″ vspace=”4″ ,default]

Having understood the structure, let us know delve into the operations that can be performed on a BTree. Primarily, there are three operations — search, insert and delete.

[image_library_tag 375/67375, border=”0″ align=”middle” hspace=”4″ vspace=”4″ ,default]

Searching a BTree

Searching is an easy algorithm. We start with the root and look for the element. If the element is found in the node, then we return, else we find the link which is close to the search element in the node. For example, in the BTree of Figure 1, if we are looking for the existence of an element with key as “8”, our start procedure will follow as below.

Start with the root. In this case, the node at the top has an element with key ’13’. In this example, ‘8’ is less than 13 and so we take the left end. Once, we reach the left end, we continue with the same procedure. The node with elements ‘4’ and ‘7’ is searched. Clearly ‘8’ is not in the node and the node 8 is greater than 7. So, the right link corresponding to the element 8 is chosen. We reach one more node. In this node, the element 8 is at the beginning and we are successful.

Insertion and deletion in a BTree

Insertion, however, can change the structure of the tree. But the first step in insertion is always a search and a successful insert always falls from the tree. Falling from a search during insert is good as we can identify at what position we have to enter the element. For example, let us try inserting an element with key as ‘9’ in the above tree. The first thing during insertion is to find if at all we have enough space to accommodate any new element. In the BTree in figure 1, the number of elements that can be accommodated in a single node is 3. In this case, we insert the element in its right position that is after ‘8’ and then break it into three halves. In this case, the individual parts are [1-M/2-1], M/2,[M/2+1, M]. The M/2 element is moved up to the parent node and if there is a space for accommodating the element, the pointers are adjusted else, the parent node also follows the same procedure. In a way, the breaking of nodes can move up to root and even the root can break leading to an increase in height of the tree. The new BST thus created now includes 9. Similar is the case with deletion. In case of deletion, nodes could squeeze if they have less number of elements (that is M/2) and consequently the height of the tree can decrease.

It is not difficult to conclude that B Tree is an efficient structure to search, insert and delete. Also, from the terminology of disk, each such node can spawn across a block of disk thereby providing a fast access to millions of nodes. One interesting result that arises of including multiple elements in a single node is that a large number of records can be accommodated in a single BTree, thereby increasing the access speed. It is for this purpose BTrees are very popular in database design, implementation and that of creating and maintaining indexes.

In practical implementations, BTrees include not only keys but values also. That is corresponding to every key, there is a value and the value could be RDBMS row or even a document or an image.

The ‘key’ is a unique identifier created from the properties of the value or of the system (say a random number) and both of them are inserted into the BTree.

In the next issue, we will discuss about the variants of BTree and how CouchDB uses this structure.

No Comments so far

Jump into a conversation

No Comments Yet!

You can be the one to start a conversation.