|
It is important to choose the right shard key for a collection. If the collection is gigantic it is difficult to change the key later. When in doubt please ask for suggestions in the support forums or IRC. The basic rule of thumb is : whatever field is common to most of your queries, that field is likely the right one to use for the shard key. If a query involves the shard key, it can (typically) be sent to only one shard. The discussion below considers a structured event logging system. Documents in the events collection have the form: {
mon_node : "ny153.example.com" ,
application : "apache" ,
time : "2011-01-02T21:21:56.249Z" ,
level : "ERROR" ,
msg : "something is broken"
}
CardinalityDocuments in a sharded collection are grouped together in chunks. All data in a sharded collection is broken into chunks. A chunk consists of all the documents who have a shard key within the chunk's specified (key_low..key_high) key range. Using the logging example above, if you chose {mon_node:1}
as the shard key, then all data for a single mon_node is in one chunk, and thus on one shard. One can easily imagine having a lot of data for a mon_node and wanting to split it among shards. In addition, chunks are approximately 64MB in size. If the chunk is unsplittable because it represents a single mon_node value, the balancer migrating that chunk from server to server would be slow. If one were to shard on the key {mon_node:1,time:1}
we can then split data for a single mon_node down to the millisecond. No chunk will ever be too large. (Note there will not be a chunk for every mon_node,time value – rather on ranges such as {mon_node:"nyc_app_24",time:"2011-01-01"}..{mon_node:"nyc_app_27",time:"2011-07-12"}
or if a lot of data for one mon_node: {mon_node:"nyc_app_24",time:"2011-01-01"}..{mon_node:"nyc_app_24",time:"2011-01-07"}
Keeping chunks to a reasonable size is very important so data can be balanced and moving chunks isn't too expensive. Write scalingOne of the primary reasons for using sharding is to distribute writes. To do this, its important that writes hit as many chunks as possible. Again using the example above, choosing { time : 1 }
as the shard key will cause all writes to go to the newest chunk. If the shard key were {mon_node:1,application:1,time:1}
then each mon_node,application pair can potentially be written to different shards. Thus there were 100 mon_node,application pairs, and 10 shards, then each shard would get 1/10th the writes. Note that because the most significant part of an ObjectId is time-based, using ObjectId as the shard key has the same issue as using time directly. Query isolationAnother key consideration is how many shards any query has to hit. Ideally a query would go directly from mongos to the shard (mongod) that owns the document requested. Thus, if you know that most common queries include a condition on mon_node, then starting the shard key with mon_node would be efficient. All queries work regardless of the shard key, but if mongos cannot determine which shard that owns the data, it will send the operation to all shards in parallel. This will increase the response time latency and increase the volume of network traffic and load. SortingWhen a query includes a sort criteria, the query is sent to the same shards that would have received the equivalent query without the sort expression. Each shard queried performs a sort on its subset of the data locally (potentially utilizing an appropriate index locally if one is available). mongos merges the ordered results coming from each shard and returns the merged stream to the client. Thus the mongos process does very little work and requires little RAM for request of this nature.
ReliabilityOne important aspect of sharding is how much of the system will be affected in case an entire shard becomes unavailable (even though it is usually a reliable replica set). Say you have a twitter-like system, with comment entries like: {
_id: ObjectId("4d084f78a4c8707815a601d7"),
user_id : 42 ,
time : "2011-01-02T21:21:56.249Z" ,
comment : "I am happily using MongoDB",
}
One might use id or user_id as shard key. id will give you better granularity and spread, but in case a shard goes down, it will impact almost all your users (some data missing). If you use user_id instead, a lower percentage of users will be impacted (say 20% for a 5 shard setup), although these users will not see any of their data. Index optimizationAs mentioned in other sections about indexing, it is usually much better performance-wise if only a portion of the index is read/updated, so that this "active" portion can stay in RAM most of the time. Most shard keys described above result in an even spread over the shards but also within each mongod's index. Instead, it can be beneficial to factor in some kind of timestamp at the beginning of the shard key in order to limit the portion of the index that gets hit. {
_id: ObjectId("4d084f78a4c8707815a601d7"),
user_id : 42 ,
title: "sunset at the beach",
upload_time : "2011-01-02T21:21:56.249Z" ,
data: ...,
}
You could instead build a custom id that includes the month of the upload time, and a unique identifier (ObjectId, MD5 of data, etc). Your entry now looks like: {
_id: "2011-01_4d084f78a4c8707815a601d7",
user_id : 42 ,
title: "sunset at the beach",
upload_time : "2011-01-02T21:21:56.249Z" ,
data: ...,
}
This custom id would be the key used for sharding, and also the id used to retrieve a document. It gives both a good spread across all servers, and it reduces the portion of the index hit on most queries. Further notes:
GridFSThere are different ways that GridFS can be sharded, depending on the need. One common way to shard, based on pre-existing indexes, is:
As the default files_id is an ObjectId, files_id will be ascending and thus all GridFS chunks will be sent to a single sharding chunk. If your write load is too high for a single server to handle, you may want to shard on a different key or use a different value for the _id in the files collection. |


PLEASE POST QUESTIONS IN THE USER GROUPS FORUM. Post non-question comments and helpful hints here.
blog comments powered by Disqus