Comparing MongoDB and CouchDB

We get a lot of questions along the lines of "how are MongoDB and CouchDB different?"  It's a good question: both are document-oriented databases with schemaless, JSON-style object data storage.  Both products have their place -- we are big believers that databases are specializing and that "one size fits all" no longer applies.

We are not CouchDB gurus, so please let us know in the forums if we have something wrong.

MVCC

One big difference is that CouchDB is based on MVCC (multi-version concurrency control), while MongoDB is more of a traditional update-in-place store.  MVCC is very good for certain classes of problems: problems which need intense versioning; problems with offline databases that resync later; and problems where you want a large amount of master-master replication happening.  MVCC also brings some extra work.  First, the database must be compacted periodically if there are many updates.  Second, when conflicts occur on transactions, they must be handled by the programmer manually (unless the db also does conventional locking -- although then master-master replication is likely lost).

MongoDB updates an object in place when possible.  Problems that require high update rates on objects are a great fit, and compaction is not necessary.  Mongo's replication works great but, without the MVCC model, it is more oriented towards master/slave and auto-failover configurations than towards complex master-master setups.  With MongoDB you should see high write performance, especially for updates.
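The trade-off between the two models can be sketched with a toy example.  The Python below is only an illustration under simplifying assumptions: the class names (MvccStore, InPlaceStore) and their methods are hypothetical, and neither class reflects how CouchDB or MongoDB is actually implemented.  It shows the two costs named above for MVCC (stale writers get a conflict to resolve, and old revisions pile up until compaction) next to a plain in-place update.

```python
class ConflictError(Exception):
    """Raised when a writer bases its change on a stale revision."""


class MvccStore:
    """Toy MVCC store (hypothetical): every write appends a new revision."""

    def __init__(self):
        self._revisions = {}  # doc id -> list of (rev, doc), oldest first

    def current_rev(self, doc_id):
        versions = self._revisions.get(doc_id)
        return versions[-1][0] if versions else 0

    def read(self, doc_id):
        rev, doc = self._revisions[doc_id][-1]
        return rev, dict(doc)

    def write(self, doc_id, doc, expected_rev=0):
        current = self.current_rev(doc_id)
        if expected_rev != current:
            # The caller must re-read and reconcile by hand.
            raise ConflictError(f"{doc_id}: expected rev {expected_rev}, found {current}")
        self._revisions.setdefault(doc_id, []).append((current + 1, dict(doc)))
        return current + 1

    def stored_versions(self):
        return sum(len(v) for v in self._revisions.values())

    def compact(self):
        """Keep only the newest revision of each document."""
        for doc_id in self._revisions:
            self._revisions[doc_id] = self._revisions[doc_id][-1:]


class InPlaceStore:
    """Toy update-in-place store (hypothetical): one copy per document."""

    def __init__(self):
        self._docs = {}

    def write(self, doc_id, doc):
        self._docs[doc_id] = dict(doc)

    def increment(self, doc_id, field, amount=1):
        doc = self._docs[doc_id]
        doc[field] = doc.get(field, 0) + amount  # overwrite, no new version

    def read(self, doc_id):
        return dict(self._docs[doc_id])
```

In this sketch a counter updated a thousand times leaves a thousand revisions in MvccStore until compact() runs, while InPlaceStore still holds a single copy.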

Horizontal Scalability

One fundamental difference is that a number of Couch users use replication as a way to scale.  With Mongo, we tend to think of replication as a way to gain reliability/failover rather than scalability.  Mongo uses (auto) sharding as its path to scalability (sharding is in alpha).  In this sense MongoDB is more like Google BigTable.  (We hear that Couch might one day add partitioning too.)
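To make the distinction concrete, here is a toy Python sketch of the two ideas.  Everything in it is an assumption made for illustration: the class names are hypothetical, and the hash-modulo placement is just the shortest way to show partitioning, not a description of how MongoDB's auto-sharding places documents.  The point is only that replication copies every document to every node, while sharding gives each node a slice of the data.

```python
import hashlib


def shard_for(key, shard_count):
    """Pick a shard from a stable hash of the key (illustrative placement only)."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


class ShardedCollection:
    """Each document lives on exactly one shard, so capacity grows with shards."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def insert(self, key, doc):
        self.shards[shard_for(key, len(self.shards))][key] = doc

    def find(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)


class ReplicatedCollection:
    """Each document lives on every replica, so any node can take over."""

    def __init__(self, replica_count):
        self.replicas = [{} for _ in range(replica_count)]

    def insert(self, key, doc):
        for replica in self.replicas:
            replica[key] = doc

    def find(self, key, replica=0):
        return self.replicas[replica].get(key)
```

With 100 documents and 4 nodes, the sharded collection stores 100 documents in total across its nodes, whereas the replicated one stores 400.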

Query Expression

Couch uses a clever index building scheme to generate indexes which support particular queries.  There is an elegance to the approach, although one must predeclare these structures for each query one wants to execute.  One can think of them as materialized views.
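As a rough illustration of the "predeclared, materialized" idea, the Python sketch below keeps a sorted list of (key, doc id) rows produced by a map function that is declared up front.  The View class and its methods are hypothetical names for this example, and the sketch does not reproduce CouchDB's actual view engine (real CouchDB views are written in JavaScript).

```python
import bisect


class View:
    """Toy materialized view: a sorted index built by a predeclared map function."""

    def __init__(self, map_fn):
        self.map_fn = map_fn  # doc -> iterable of keys to emit
        self.rows = []        # sorted list of (key, doc_id)

    def index(self, doc_id, doc):
        """Update the view when a document is stored."""
        for key in self.map_fn(doc):
            bisect.insort(self.rows, (key, doc_id))

    def query(self, key):
        """Look up doc ids by key; only keys the map function emits can be queried."""
        keys = [k for k, _ in self.rows]
        start = bisect.bisect_left(keys, key)
        end = bisect.bisect_right(keys, key)
        return [doc_id for _, doc_id in self.rows[start:end]]


# The "query" is fixed at declaration time: documents by city.
by_city = View(lambda doc: [doc["city"]] if "city" in doc else [])
by_city.index("u1", {"name": "ann", "city": "NYC"})
by_city.index("u2", {"name": "bob", "city": "SF"})
by_city.index("u3", {"name": "cy", "city": "NYC"})
print(by_city.query("NYC"))  # ['u1', 'u3']
```

Asking this view for documents by name would require declaring a second view with a different map function.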

Mongo uses traditional dynamic queries.  As with, say, MySQL, we can do queries where an index does not exist, or where an index is helpful but only partially so.  Mongo includes a query optimizer which makes these determinations.  We find this is very nice for inspecting the data administratively, and this method is also good when we don't want an index, for example on insert-intensive collections.  When an index corresponds perfectly to the query, the Couch and Mongo approaches are conceptually similar.  Even so, we find expressing queries as JSON-style objects in MongoDB to be quick and painless.
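A minimal Python sketch of a dynamic, JSON-style query follows.  The comparison operators ($gt, $gte, $lt, $lte) are real MongoDB query operators, but the functions matches, build_index and find are hypothetical helpers written for this example; they imitate the idea of scanning when no index exists and narrowing with an index when one partially helps, and say nothing about how MongoDB's optimizer really works.

```python
import operator

# Small subset of MongoDB-style comparison operators.
OPERATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(doc, query):
    """Return True if doc satisfies every condition in the query object."""
    for field, condition in query.items():
        value = doc.get(field)
        if isinstance(condition, dict):
            for op, operand in condition.items():
                if value is None or not OPERATORS[op](value, operand):
                    return False
        elif value != condition:
            return False
    return True


def build_index(collection, field):
    """Map each value of field to the documents holding it."""
    index = {}
    for doc in collection:
        index.setdefault(doc.get(field), []).append(doc)
    return index


def find(collection, query, indexes=None):
    """Use an index for one equality condition if available, else scan everything."""
    candidates = collection
    for field, index in (indexes or {}).items():
        if field in query and not isinstance(query[field], dict):
            candidates = index.get(query[field], [])
            break
    # Remaining conditions are checked document by document.
    return [doc for doc in candidates if matches(doc, query)]


users = [
    {"name": "ann", "age": 31, "city": "NYC"},
    {"name": "bob", "age": 25, "city": "NYC"},
    {"name": "cy", "age": 40, "city": "SF"},
]
query = {"city": "NYC", "age": {"$gt": 30}}
print(find(users, query))                                         # full scan
print(find(users, query, {"city": build_index(users, "city")}))   # index on city only
```

Both calls return the same single document (ann); the second narrows the candidates with the city index and then filters on age, which is the "helpful but only partially so" case.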

Atomicity

Both MongoDB and CouchDB support concurrent modifications of single documents.  Both forgo complex transactions involving large numbers of objects.

Durability

The products take different approaches to durability.  CouchDB is a "crash-only" design where the db can terminate at any time and remain consistent.  MongoDB works differently: after a machine crash, one would run a repairDatabase() operation when starting up again (similar to MyISAM).  MongoDB recommends using replication -- either LAN or WAN -- for true durability, as a given server could be permanently dead.  To summarize: CouchDB is better at durability when using a single server with no replication.

Map Reduce

Both CouchDB and MongoDB support map/reduce operations.  For CouchDB, map/reduce is inherent to the building of all views.  With MongoDB, map/reduce is meant for data processing jobs rather than for traditional queries.
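For readers new to the pattern, here is a generic map/reduce sketch in Python.  It illustrates the technique itself, not either product's API: map_reduce is a hypothetical helper, and both databases actually take their map and reduce functions in JavaScript.

```python
from collections import defaultdict


def map_reduce(documents, map_fn, reduce_fn):
    """Group the (key, value) pairs emitted by map_fn, then reduce each group."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


hits = [
    {"page": "/home", "visits": 3},
    {"page": "/about", "visits": 1},
    {"page": "/home", "visits": 2},
]

totals = map_reduce(
    hits,
    map_fn=lambda doc: [(doc["page"], doc["visits"])],  # emit (page, visits)
    reduce_fn=lambda key, values: sum(values),           # total per page
)
print(totals)  # {'/home': 5, '/about': 1}
```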

JavaScript

Both CouchDB and MongoDB make use of JavaScript.  CouchDB uses JavaScript extensively, including in the building of views.

MongoDB supports the use of JavaScript, but more as an adjunct.  In MongoDB, query expressions are typically expressed as JSON-style query objects; however, one may also specify a JavaScript expression as part of the query.  MongoDB also supports running arbitrary JavaScript functions server-side and uses JavaScript for map/reduce operations.

REST

Couch uses REST as its interface to the database.  With its focus on performance, MongoDB relies on language-specific database drivers for access to the database over a proprietary binary protocol.  Of course, one could add a REST interface atop an existing MongoDB driver at any time -- that would be a very nice community project.  Some early-stage REST implementations exist for MongoDB.
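The shape of such a REST layer can be sketched in a few lines of Python.  Everything here is hypothetical: FakeDriver is an in-memory stand-in rather than a real MongoDB driver, and the /<collection>/<id> URL scheme and status codes are assumptions chosen for the example, not an existing project's design.

```python
import json


class FakeDriver:
    """In-memory stand-in for a database driver (hypothetical)."""

    def __init__(self):
        self.collections = {}

    def insert(self, collection, doc_id, doc):
        self.collections.setdefault(collection, {})[doc_id] = doc

    def find_one(self, collection, doc_id):
        return self.collections.get(collection, {}).get(doc_id)

    def remove(self, collection, doc_id):
        return self.collections.get(collection, {}).pop(doc_id, None) is not None


def handle(driver, method, path, body=None):
    """Translate one HTTP-style request into a driver call; return (status, json)."""
    parts = [p for p in path.split("/") if p]
    if len(parts) != 2:
        return 400, json.dumps({"error": "expected /<collection>/<id>"})
    collection, doc_id = parts
    if method == "PUT":
        driver.insert(collection, doc_id, json.loads(body))
        return 201, json.dumps({"ok": True})
    if method == "GET":
        doc = driver.find_one(collection, doc_id)
        if doc is None:
            return 404, json.dumps({"error": "not found"})
        return 200, json.dumps(doc)
    if method == "DELETE":
        found = driver.remove(collection, doc_id)
        return (200 if found else 404), json.dumps({"ok": found})
    return 405, json.dumps({"error": "method not allowed"})
```

A real implementation would sit behind an HTTP server and call an actual driver; the translation step between verbs and driver calls is the part this sketch is meant to show.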

Performance

Philosophically, Mongo is very oriented toward performance, at the expense of features that would impede performance.  We see MongoDB being useful for many problems where databases have not been used in the past because databases are too "heavy".  Features that give MongoDB good performance are:

  • client driver per language: native socket protocol for client/server interface (not REST)
  • use of memory mapped files for data storage
  • collection-oriented storage (objects from the same collection are stored contiguously)
  • update-in-place (not MVCC)
  • written in C++

Use Cases

It may be helpful to look at some particular problems and consider how we could solve them.

  • if we were building Lotus Notes, we would use Couch, as its MVCC model, in which the programmer reconciles conflicting versions, fits perfectly.  Any problem where data is offline for hours then back online would fit this.  In general, if we need several eventually consistent master-master replica databases, geographically distributed, often offline, we would use Couch.
  • if we had very high performance requirements we would use Mongo.  For example, web site user profile object storage and caching of data from other sources.
  • if we were building a system with very critical transactions, such as financial transactions, we would not use MongoDB for those transactions -- although we might in hybrid for other data elements of the system.  For something like this we would likely choose a traditional RDBMS.
  • for a problem with very high update rates, we would use Mongo as it is good at that.  For example, updating real-time analytics counters for a web site (page views, visits, etc.)

Generally, we find MongoDB to be a very good fit for building web infrastructure.

