By shard ownership we mean which server owns a particular key range.
Early draft/thoughts will change:
Contract
- the master copy of the ownership information is in the config database
- mongos instances have cached info on which server owns a shard. this information may be stale.
- mongod instances have definitive information on who owns a shard (atomic with the config db) when they know about a shards ownership
mongod
The mongod processes maintain a cache of shards the mongod instance owns:
map<ShardKey,state> ownership
State values are as follows:
- missing - no element in the map means no information available. In such a situation we should query the config database to get the state.
- 1 - this instance owns the shard
- 0 - this instance does not own the shard (indicates we queried the config database and found another owner, and remembered that fact)
Initial Assignment of a region to a node.
This is trivial: add the configuration to the config db. As the ShardKey is new, no nodes have any cached information.
Splitting a Key Range
The mongod instance A which owns the range R breaks it into R1,R2 which are still owned by it. It updates the config db. We take care to handle the config db crashing or being unreachable on the split:
lock(R) on A
update the config db -- ideally atomically perhaps with eval(). await return code.
ownership[R].erase
unlock(R) on A
After the above the cache has no information on the R,R1,R2 ownerships, and will requery configdb on the next request. If the config db crashed and failed to apply the operation, we are still consistent.
Migrate ownership of keyrange R from server A->B. We assume here that B is the coordinator of the job:
B copies range from A
lock(R) on A and B
B copies any additional operations from A (fast)
clear ownership maps for R on A and B. B waits for a response from A on this operation.
B then updates the ownership data in the config db. (Perhaps even fsyncing.) await return code.
unlock(R) on B
delete R on A (cleanup)
unlock (R) on A
We clear the ownership maps first. That way, if the config db update fails, nothing bad happens, IF mongos filters data upon receipt for being in the correct ranges (or in its query parameters).
R stays locked on A for the cleanup work, but as that shard no longer owns the range, this is not an issue even if slow. It stays locked for that operation in case the shard were to quickly migrate back.
Migrating Empty Shards
Typically we migrate a shard after a split. After certain split scenarios, a shard may be empty but we want to migrate it.
PLEASE POST QUESTIONS IN THE FORUMS: http://groups.google.com/group/mongodb-user. Post tips and clarifications here.
blog comments powered by Disqus