Monday, June 8, 2009

Document Oriented Databases

by Hari Krishnan

History

Relational databases have long been almost the only way applications persist data. In the old days, when code was written mostly in languages like COBOL, even navigational databases were sufficient. The switch to relational databases made data easier to query. Not much has changed in the way we store data since then. This may be attributed to the fact that query performance is still the most important aspect in choosing a persistence mechanism.
Object-oriented code, as we all know, has to go through mapping tools to be persisted as relational entities. Are we going to use the same database concepts in the coming years?

Retrospect

Have you been bothered by the issues below?
• We model business logic as interactions between objects. Concepts such as triggers also model some amount of business logic. Though the confusion should not arise, people often spread business logic across both the database layer and the business layer. In some rare circumstances the same logic may even be duplicated in both layers. Example: when a new customer is created, a trigger inserts a new row into another table called privileged customer based on a data condition. The trigger here contains business logic that is not covered by unit test cases.
• Applications have domain validations, such as "customer name cannot be null". If we have such validations in our code, which we normally do, then why do we need a database that also performs them?
• We create a great object-oriented business layer and lose sleep over mapping it to a relational model. All I care about is persisting the state of my objects. Though I agree relational databases have very good query performance, are we really keeping our eyes closed to other persistence techniques?
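
To make the first two points concrete, here is a minimal Python sketch of what keeping these rules in the business layer could look like. The lifetime_spend field and the threshold are hypothetical stand-ins for the unspecified "data condition" in the trigger example; they are not taken from any real system.

```python
from dataclasses import dataclass

# Hypothetical threshold standing in for the trigger's unspecified "data condition".
PRIVILEGED_SPEND_THRESHOLD = 10_000.0


@dataclass
class Customer:
    name: str
    lifetime_spend: float = 0.0  # hypothetical field, for illustration only


def validate(customer: Customer) -> None:
    """Domain validation kept in application code: the name must be present."""
    if customer.name is None or not customer.name.strip():
        raise ValueError("customer name cannot be null or empty")


def is_privileged(customer: Customer) -> bool:
    """The rule a trigger would otherwise hide, written as a plain function."""
    return customer.lifetime_spend >= PRIVILEGED_SPEND_THRESHOLD
```

Because both rules are ordinary functions, they can be covered by the same unit tests as the rest of the business layer.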

Alternate Approach

CouchDB – A Document Oriented Database: Document-based databases do not store data in tables with uniform-sized fields for each record. Instead, each record is stored as a document that has certain characteristics. Any number of fields of any length can be added to a document.
CouchDB is a distributed, fault-tolerant, schema-free document-oriented database accessible via a RESTful HTTP/JSON API. CouchDB is not an object-oriented database.
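
As a rough illustration of what "RESTful HTTP/JSON" means in practice, the Python sketch below builds the HTTP requests that store and fetch a document by its id. It assumes a CouchDB server on the default port 5984 and an existing database called "customers"; the database name, the document id and the helper function names are all made up for this example. The requests are only constructed, not sent.

```python
import json
import urllib.request

# Assumption: CouchDB running locally on its default port.
BASE_URL = "http://localhost:5984"


def build_put_document(db: str, doc_id: str, doc: dict) -> urllib.request.Request:
    """Build (but do not send) the request that stores `doc` under `doc_id`."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{db}/{doc_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_get_document(db: str, doc_id: str) -> urllib.request.Request:
    """Build (but do not send) the request that reads a document back."""
    return urllib.request.Request(url=f"{BASE_URL}/{db}/{doc_id}", method="GET")


# "customers" and "geeky-customer" are illustrative names.
request = build_put_document("customers", "geeky-customer", {"name": "Geeky Customer"})
print(request.get_method(), request.full_url)
# With a server running, urllib.request.urlopen(request) would send it.
```

The point is that no special driver is needed: any HTTP client and a JSON encoder are enough to talk to the database.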

In a relational database we would store addresses of a customer in a different table, with a foreign key linking it to the customer.
With a document-oriented database such as CouchDB, the nested resources may be stored together with the main resource. Example JSON document:
{
"name": "Geeky Customer",
"addresses": [
{"street": "Wall Street", "number": "2"},
{"street": "Dalal Street", "number": "4"}
]
}
This brings us to an interesting thought. We do not require ORM frameworks like Hibernate and ActiveRecord, which are mostly written around SQL-like problems that CouchDB just doesn’t have.
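
To see why the mapping step mostly disappears, consider this small Python sketch. It is only an illustration under my own assumptions: the Customer and Address classes and the two helper functions are hypothetical, with field names chosen to resemble the example document. The whole object graph becomes one JSON document and comes back again without any table mapping.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Address:
    street: str
    number: str


@dataclass
class Customer:
    name: str
    addresses: List[Address] = field(default_factory=list)


def to_document(customer: Customer) -> str:
    """Serialize the customer and its nested addresses into one JSON document."""
    return json.dumps(asdict(customer))


def from_document(text: str) -> Customer:
    """Rebuild the object graph from the JSON document."""
    data = json.loads(text)
    return Customer(
        name=data["name"],
        addresses=[Address(**a) for a in data["addresses"]],
    )


customer = Customer("Geeky Customer", [Address("Wall Street", "2")])
assert from_document(to_document(customer)) == customer
```

There are no join tables or foreign keys to describe; the nesting in the object is the nesting in the stored document.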
There are libraries like CouchRest, RelaxDB, ActiveCouch, etc. that provide simple ways to connect to CouchDB.
I have taken CouchDB only as an example. There are many new databases that are quickly becoming popular for specific situations. It may be worth the effort to take a look at such alternatives.

Download this Geek Snack episode here.