Capped collections are fixed-size collections with a very high-performance FIFO age-out feature (age-out is based on insertion order). They are a bit like the "RRD" concept, if you are familiar with that.
In addition, capped collections automatically and efficiently maintain insertion order for the objects in the collection; this is very powerful for certain use cases such as logging.
Capped collections are not shardable.
Creating
Unlike a standard collection, you must explicitly create a capped collection, specifying a collection size in bytes. The collection's data space is then preallocated. Note that the size specified includes database headers.
> db.createCollection("mycoll", {capped:true, size:100000})
Behavior
- Once the space is fully utilized, newly added objects will replace the oldest objects in the collection.
- If you perform a find() on the collection with no ordering specified, the objects will always be returned in insertion order. Reverse order is always retrievable with find().sort({$natural:-1}).
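The two behaviors above can be pictured with a toy in-memory model. This is only a sketch in plain JavaScript, not how the server stores data: the class name CappedModel and its methods are hypothetical, and it caps by object count rather than by bytes to keep the example short.

```javascript
// Toy model of a capped collection (hypothetical; caps by count, not bytes).
class CappedModel {
  constructor(maxObjects) {
    if (maxObjects < 1) {
      throw new Error("maxObjects must be at least 1");
    }
    this.maxObjects = maxObjects;
    this.objects = []; // kept in insertion order, oldest first
  }

  insert(obj) {
    // Once the space is full, the oldest object makes way for the new one.
    if (this.objects.length === this.maxObjects) {
      this.objects.shift();
    }
    this.objects.push(obj);
  }

  // Like find() with no ordering specified: insertion (natural) order.
  find() {
    return this.objects.slice();
  }

  // Like find().sort({$natural: -1}): newest first.
  findReverse() {
    return this.objects.slice().reverse();
  }
}

const model = new CappedModel(3);
for (let i = 1; i <= 5; i++) {
  model.insert({ n: i });
}
console.log(model.find().map((o) => o.n));        // [ 3, 4, 5 ]
console.log(model.findReverse().map((o) => o.n)); // [ 5, 4, 3 ]
```

Objects 1 and 2 age out as soon as the fourth and fifth inserts arrive, without any explicit remove.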
Usage and Restrictions
- You may insert new objects in the capped collection.
- You may update the existing objects in the collection. However, the objects must not grow in size. If they do, the update will fail. Note that if you are performing updates, you will likely want to declare an appropriate index (given there is no _id index for capped collections by default).
- The database does not allow deleting objects from a capped collection. Use the drop() method to remove all objects from the collection. (After the drop you must explicitly recreate the collection.)
- Capped collections are not shardable.
Warning: Capped collections do not have a unique index on _id. Replication requires unique _ids. Therefore, if you are using a capped collection with replication, you must ensure that you have unique _ids. Having duplicate _ids in your capped collection may cause replication to halt on slaves/secondaries and require manual intervention or a resync.
You may want to create a unique index on _id to prevent this issue; see the autoIndexId section below.
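For a capped collection that already exists, one way to act on this warning is to build the unique index yourself. The sketch below assumes the mongo shell's db handle and uses ensureIndex, the index helper in shells contemporary with this page (newer shells name it createIndex). The function name addUniqueIdIndex is hypothetical.

```javascript
// Hypothetical helper: add a unique index on _id to an existing capped collection.
// `db` is assumed to be the mongo shell's database handle.
function addUniqueIdIndex(db, collName) {
  var coll = db.getCollection(collName);
  // A unique index build fails if duplicate _ids are already present,
  // so this is best done before the collection is put under replication.
  return coll.ensureIndex({ _id: 1 }, { unique: true });
}
```

In the shell this would be called as addUniqueIdIndex(db, "mycoll").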
Applications
- Logging. Capped collections provide a high-performance means for storing logging documents in the database. Inserting objects in an unindexed capped collection will be close to the speed of logging to a filesystem. Additionally, with the built-in FIFO mechanism, you are not at risk of using excessive disk space for the logging.
- Automatic Maintenance of Insertion Order. Capped collections keep documents in their insertion order automatically, with no index being required for this property. Logging, described above, is a good example of a case where keeping items in order is important.
- Caching. If you wish to cache a small number of objects in the database, perhaps cached computations of information, capped collections provide a convenient mechanism for this. Note that for this application you will typically want to use an index on the capped collection as there will be more reads than writes.
- Automatic Age Out. If you know you want data to automatically "roll out" over time as it ages, a capped collection can be an easier way to support this than writing manual removal via cron scripts. Ejection from the capped collection is also inexpensive compared to explicit remove operations.
Recommendations
- When appropriate, do not create indexes on a capped collection. If the collection will be written to much more than it is read from, it is better to have no indexes. Note that you may create indexes on a capped collection; however, you are then moving from "log speed" inserts to "database speed" inserts -- that is, it will still be quite fast by database standards.
- Use natural ordering to retrieve the most recently inserted elements from the collection efficiently. This is (somewhat) analogous to tail on a log file.
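The tail-like read recommended above can be sketched as a small shell helper. It assumes the mongo shell's db handle; the function name tail is hypothetical, and the only query feature it relies on is the $natural sort shown earlier.

```javascript
// Hypothetical helper: return the `n` most recently inserted objects, newest first.
// `db` is assumed to be the mongo shell's database handle.
function tail(db, collName, n) {
  return db.getCollection(collName)
    .find()
    .sort({ $natural: -1 }) // reverse insertion order, no index needed
    .limit(n)
    .toArray();
}
```

In the shell, tail(db, "mycoll", 10) would return the ten newest objects.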
Options
size
The size of the capped collection. This must be specified.
max
You may also optionally cap the number of objects in the collection. Once the limit is reached, items roll out on a least recently inserted basis.
Note: When specifying a cap on the number of objects, you must also cap on size. Be sure to leave enough room for your chosen number of objects or items will roll out faster than expected. You can use the validate() utility method to see how much space an existing collection uses, and from that estimate your size needs.
db.createCollection("mycoll", {capped:true, size:100000, max:100});
db.mycoll.validate();
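As a rough worked example of the sizing note above, the size to pair with a given max can be estimated with simple arithmetic once you know an average object size. The helper name, the 20% headroom default, and the sample numbers are assumptions for illustration only; the text above says just that size includes database headers and that you should leave enough room.

```javascript
// Hypothetical helper: estimate a `size` (bytes) large enough for `maxObjects`.
// `avgObjectBytes` would come from measuring existing data, e.g. via validate().
// `headroomPercent` is an assumed safety margin for headers and overhead.
function estimateCappedSize(avgObjectBytes, maxObjects, headroomPercent) {
  if (headroomPercent === undefined) {
    headroomPercent = 20;
  }
  return Math.ceil((avgObjectBytes * maxObjects * (100 + headroomPercent)) / 100);
}

console.log(estimateCappedSize(500, 100));     // 60000
console.log(estimateCappedSize(500, 100, 50)); // 75000
```

If the estimate is too low, the collection fills by bytes before it reaches max objects, and items roll out earlier than the count alone would suggest.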
autoIndexId
The autoIndexId field may be set to true or false to explicitly enable or disable automatic creation of a unique key index on the _id object field.
Note: An index is not automatically created on _id for capped collections by default.
If you will be using the _id field, you should create an index on _id.
Given that capped collections are sometimes used without an _id index, it can be useful to insert objects without an _id field. Most drivers and the mongo shell add an _id client-side. See each driver's documentation for how to suppress this (behavior might vary by driver). In the mongo shell one could invoke:
> db.mycollection._mongo.insert(db.mycollection._fullName, myObjectWithoutAnId)
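Going the other way, if you do want the _id index, the autoIndexId option described above can be passed when the collection is created. This is a minimal sketch assuming the mongo shell's db handle; the function name createCappedWithIdIndex is hypothetical, and the options are the ones named in this section.

```javascript
// Hypothetical helper: create a capped collection with an index on _id.
// `db` is assumed to be the mongo shell's database handle.
function createCappedWithIdIndex(db, collName, sizeBytes) {
  return db.createCollection(collName, {
    capped: true,
    size: sizeBytes,   // in bytes, as elsewhere on this page
    autoIndexId: true  // explicitly enable the _id index
  });
}
```

In the shell this would be called as createCappedWithIdIndex(db, "mycoll", 100000).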
Checking if a collection is capped
You can check if a collection is capped by using the isCapped() shell function:
> db.foo.isCapped()
Here is its definition in the shell:
> db.c.isCapped
function () {
    var e = this.exists();
    return e && e.options && e.options.capped ? true : false;
}
Converting a collection to capped
You can convert a (non-capped) collection to a capped collection with the convertToCapped command:
> db.runCommand({"convertToCapped": "mycoll", size: 100000});
{ "ok": 1 }
Note that the size is in bytes.
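The check and the conversion can be combined so that an already-capped collection is left alone. The sketch below assumes the mongo shell's db handle; the function name ensureCapped and the shape of its return value are hypothetical, while isCapped() and the convertToCapped command are the ones shown above.

```javascript
// Hypothetical helper: convert a collection to capped only if it is not already.
// `db` is assumed to be the mongo shell's database handle.
function ensureCapped(db, collName, sizeBytes) {
  if (db.getCollection(collName).isCapped()) {
    return { ok: 1, converted: false }; // nothing to do
  }
  var res = db.runCommand({ convertToCapped: collName, size: sizeBytes });
  // Report whether the command itself succeeded.
  return { ok: res.ok, converted: res.ok === 1 };
}
```

In the shell this would be called as ensureCapped(db, "mycoll", 100000).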