|
Here we present a list of useful commands for obtaining information about a sharding cluster. To set up a sharding cluster, see the docs on sharding configuration. Identifying a Shard Cluster// Test if we're speaking to a mongos process or straight to a mongod process. // If connected to a mongod this will return a "no such cmd" message. > db.runCommand({ isdbgrid : 1}); // If connected to mongos, this command returns { "ismaster": true, // "msg": "isdbgrid", "maxBsonObjectSize": XXX, "ok": 1 } > db.runCommand({ismaster:1}); List Existing Shards> db.runCommand({ listShards : 1});
{"servers" :
[{"_id" : ObjectId( "4a9d40c981ba1487ccfaa634") ,
"host" : "localhost:10000"},
{"_id" : ObjectId( "4a9d40df81ba1487ccfaa635") ,
"host" : "localhost:10001"}
],
"ok" : 1
}
List Which Databases are ShardedHere we query the config database, albeit through mongos. The getSisterDB command is used to return the config database. This will list all databases in the cluster. Databases with partitioned : true have sharding enabled. > config = db.getSisterDB("config") > config.databases.find() { "_id" : "admin", "partitioned" : false, "primary" : "config" } { "_id" : "MyShardedDatabase", "partitioned" : true, "primary" : "localhost:30001" } { "_id" : "MyUnshardedDatabase", "partitioned" : false, "primary" : "localhost:30002" } View Sharding Details> use admin > db.printShardingStatus(); // A very basic sharding configuration on localhost sharding version: { "_id" : 1, "version" : 2 } shards: { "_id" : ObjectId("4bd9ae3e0a2e26420e556876"), "host" : "localhost:30001" } { "_id" : ObjectId("4bd9ae420a2e26420e556877"), "host" : "localhost:30002" } { "_id" : ObjectId("4bd9ae460a2e26420e556878"), "host" : "localhost:30003" } databases: { "name" : "admin", "partitioned" : false, "primary" : "localhost:20001", "_id" : ObjectId("4bd9add2c0302e394c6844b6") } my chunks { "name" : "foo", "partitioned" : true, "primary" : "localhost:30002", "sharded" : { "foo.foo" : { "key" : { "_id" : 1 }, "unique" : false } }, "_id" : ObjectId("4bd9ae60c0302e394c6844b7") } my chunks foo.foo { "_id" : { $minKey : 1 } } -->> { "_id" : { $maxKey : 1 } } on : localhost:30002 { "t" : 1272557259000, "i" : 1 } Notice the output to the printShardingStatus command. First, we see the locations of the three shards comprising the cluster. Next, the various databases living on the cluster are displayed. The first database shown is the admin database, which has not been partitioned. The primary field indicates the location of the database, which, in the case of the admin database, is on the config server running on port 20001. The second database is partitioned, and it's easy to see the shard key and the location and ranges of chunks comprising the partition. Since there's no data in the foo database, only a single chunk exists. That single chunk includes the entire range of possible shard keys. BalancingThe balancer is a background task that tries to keep the number of chunks even across all servers of the cluster. The activity of balancing is transparent to querying. Your application doesn't need to know or care that there is any data-moving activity ongoing. To make that so, the balancer is careful about when and how much data it would transfer. Let's look at how much to transfer first. The unit of transfer is a chunk. With a steady state, the size of chunks should be roughly 64MB of data. This has shown to be the sweet spot of how much data to move at once. More than that, and the migration would take longer and the queries might perceive that in a wider difference in response times. Less than that, and the overhead of moving wouldn't pay off as highly. Regarding when to transfer load, the balancer waits for a threshold of uneven chunk counts to occur before acting. In the field, having a difference of 8 chunks between the least and most loaded shards showed to be a good heuristic. (This is an arbitrary number, granted.) The concern here is not to incur overhead if -- exagerating to make a point -- there is a difference of one doc between shard A and shard B. It's just inneficient to monitor load differences at that fine of a grain. Now, once the balancer "kicked in," it will redistribute chunks, one at a time -- in what we call rounds -- until that difference in chunks beween any two shards is down to 2 chunks. A common source of questions is why a given collection is not being balanced. By far, the most probable cause is: it doesn't need to. If the chunk difference is small enough, redistributing chunks won't matter enough. The implicit assumption here is that you actually have a large enough collection and the overhead of the balancing machinery is little compared to the amount of data your app is handling. If you do the math, you'll find that you might not hit "balancing threshold" if you're doing an experiment on your laptop. Another possibility is that the balancer is not making progress. The balancing task happens at an arbitrary mongos (query router) in your cluster. Since there can be several query routers, there is a mechanism they all use to decide which mongos will take the responsibility. The mongos acting as balancer takes a "lock" by inserting a document into the 'locks' collection of the config database. When a mongos is running the balancer the 'state' of that lock is 1 (taken). To check the state of that lock // connect to mongos > use config > db.locks.find( { _id : "balancer" } ) A typical output for that command would be { "_id" : "balancer", "process" : "guaruja:1292810611:1804289383", "state" : 1, "ts" : ObjectId("4d0f872630c42d1978be8a2e"), "when" : "Mon Dec 20 2010 11:41:10 GMT-0500 (EST)", "who" : "guaruja:1292810611:1804289383:Balancer:846930886", "why" : "doing balance round" }
There are at least three points to note in that output. One, the state attribute is 1 (2 in v2.0), which means that lock is taken. We can assume the balancer is active. Two, that balancer has been running since Monday, December the 20th. That's what the attribute "when" tells us. And, the third thing, the balancer is running on a machine called "guaruja". The attribute "who" gives that away. To check what the balancer is actually doing, we'd look at the mongos log on that machine. The balancer outputs rows to the log prefixed by "[Balancer]". Mon Dec 20 11:53:00 [Balancer] chose [shard0001] to [shard0000] { _id: "test.foo-_id_52.0", lastmod: Timestamp 2000|1, ns: "test.foo", min: { _id: 52.0 }, max: { _id: 105.0 }, shard: "shard0001" }
What this entry is saying is that the balancer decided to move the chunk _id:[52..105) from shard0001 to shard0000. Both mongod's log detailed entries of how that migrate is progressing.
If you want to pause the balancer temporarily for maintenance, you can by modifying the settings collection in the config db. // connect to mongos > use config > db.settings.update( { _id: "balancer" }, { $set : { stopped: true } } , true ); As a result of that command, one should stop seeing "[Balancer]" entries in the mongos that was running the balancer. If, for curiosity, you're running that mongos in a more verbose level, you'd see an entry such as the following. Mon Dec 20 11:57:35 "[Balancer]" skipping balancing round because balancing is disabled
You would just set stopped to false to turn on again. For more information on chunk migration and commands, see: Moving Chunks Balancer windowBy default the balancer operates continuously, but it is also possible to set an active "window" of time each day for balancing chunks. This can often be useful when a cluster adds or removes only a few GB of data per day and traffic would be less disrupted if balancing occurred during low-traffic times. This time interval is also specified in the config.settings balancer document. For example, to balance chunks only from 9AM to 9PM: // connect to mongos > use config > db.settings.update({ _id : "balancer" }, { $set : { activeWindow : { start : "9:00", stop : "21:00" } } }, true ) To specify a time range spanning midnight, just swap the order of the times: // connect to mongos > use config > db.settings.update({ _id : "balancer" }, { $set : { activeWindow : { start : "21:00", stop : "9:00" } } }, true ) This enables balancing from 9PM to 9AM. If enabling the active window feature, it is important to check periodically that the amount of data inserted each day is not more than the balancer can handle in the limited window. Currently only time of day is supported for automatic scheduling. Chunk Size ConsiderationsmongoDB sharding is based on *range partitioning*. Chunks are split automatically when they reach a certain size threshold. The threshold varies, but the rule of thumb is, expect it to between a half and the maximum chunk size in the steady state. The default maximum chunk size is 64MB (sum of objects in the collection, not counting indices), though in older versions it was 200MB. Chunk size has been intensely debated -- and much hacked. So let's understand the tradeoffs that that choice of size implies. When you move data, there's some state resetting in mongos. Queries that used to hit a given shard for a migrated chunk, now need to hit a new shard. This state resetting isn't free, so we want to move chunks infrequently (pressure to move a lot of keys at once). However, the actual moving has a cost that's proportional to the number of keys you're moving (pressure to move few keys). If you opt to change the default chunk size for a production site, you can do that by changing the value of the chunksize setting on the config database by running > use config
> db.settings.save({_id:"chunksize", value:<new_chunk_size_in_mb>})
Note though that for an existing cluster, it may take some time for the collections to split to that size, if smaller than before, and currently autosplitting is only triggered if the collection gets new documents or updates. For more information on chunk splitting and commands, see: Splitting Chunks |

PLEASE POST QUESTIONS IN THE FORUMS: http://groups.google.com/group/mongodb-user. Post tips and clarifications here.
blog comments powered by Disqus