Geospatial Indexing

v1.4+

MongoDB supports two-dimensional geospatial indexes. They are designed with location-based queries in mind, such as "find me the closest N items to my location." The index can also efficiently filter on additional criteria, such as "find me the closest N museums to my location."

In order to use the index, you need to have a field in your object that is an array or sub-object whose first 2 elements are x,y coordinates (or y,x - just be consistent; it might be advisable to use order-preserving dictionaries/hashes in your client code, to ensure consistency).

To make sure ordering is preserved across all languages, use a 2-element array:
[ x, y ]

Some examples:

{ loc : [ 50 , 30 ] } //SUGGESTED OPTION
{ loc : { x : 50 , y : 30 } }
{ loc : { foo : 50 , y : 30 } }
{ loc : { lon : 40.739037, lat: 73.992964 } }

Creating the Index

db.places.ensureIndex( { loc : "2d" } )

By default, the index assumes you are indexing longitude/latitude and is thus configured for a [-180..180) value range.

The index space bounds are inclusive of the lower bound and exclusive of the upper bound.

If you are indexing something else, you can specify some options:

db.places.ensureIndex( { loc : "2d" } , { min : -500 , max : 500 } )

This will scale the index to store values between -500 and 500. Bounded geospatial searches are currently limited to rectangular and circular areas with no "wrapping" at the outer boundaries. You cannot insert values outside the boundary interval [min, max). For example, using the code above, the point (-500, 500) could not be inserted and would raise an error (the point (-500, 499), however, would be fine).
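
The half-open interval rule can be illustrated with a small client-side check. This is only a sketch of the rule as stated above (v1.9+ behavior); inIndexBounds is a hypothetical helper, not part of the shell or any driver.

```javascript
// Returns true when every coordinate falls inside the half-open
// interval [min, max) described above. The defaults mirror the
// longitude/latitude configuration of the index.
// Hypothetical helper for illustration only.
function inIndexBounds(point, min = -180, max = 180) {
  return point.every((value) => value >= min && value < max);
}

// Same bounds as ensureIndex( { loc : "2d" } , { min : -500 , max : 500 } )
console.log(inIndexBounds([-500, 500], -500, 500)); // false: 500 hits the exclusive upper bound
console.log(inIndexBounds([-500, 499], -500, 500)); // true
```

A check like this can be run before inserting, so that out-of-range points are caught in client code rather than as a server error.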

Pre-1.9 releases of mongo do not allow the insertion of points exactly at the boundaries.

db.places.ensureIndex( { loc : "2d" } , { bits : 26 } )

The bits parameter sets the precision of the 2D geo-hash values, the smallest "buckets" in which locations are stored. By default, precision is set to 26 bits which is equivalent to approximately 1 foot given (longitude, latitude) location values and default (-180, 180) bounds.  To index other spaces which may have very large bounds, it may be useful to increase the number of bits up to the maximum of 32.
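
As a rough way to reason about precision, the width of the smallest bucket can be estimated by dividing the index range by 2^bits. The sketch below assumes that bits applies to each axis separately (the 52-character "near" strings in the geoNear output on this page are consistent with 2 x 26 bits) and uses the approximate figure of 111 km per degree; cellSize is a hypothetical helper.

```javascript
// Estimate the width of one geohash bucket along a single axis.
// Assumption: `bits` is the number of bits used per axis.
function cellSize(bits = 26, min = -180, max = 180) {
  return (max - min) / Math.pow(2, bits);
}

const KM_PER_DEGREE = 111; // approximate, valid near the equator

const cellDegrees = cellSize();                        // default lon/lat index
const cellMeters = cellDegrees * KM_PER_DEGREE * 1000; // rough conversion to meters
console.log(cellDegrees, cellMeters);

// A larger space with the same number of bits has coarser buckets;
// raising bits to 32 makes them finer again.
console.log(cellSize(26, -500000, 500000), cellSize(32, -500000, 500000));
```

With the defaults, the estimate comes out at well under a meter, the same order of magnitude as the figure quoted above.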

You may only have 1 geospatial index per collection, for now. While MongoDB may allow you to create multiple geospatial indexes, this behavior is unsupported. Because MongoDB can only use one index to support a single query, having multiple geo indexes will in most cases produce undesirable behavior.

Implicit array expansion syntax is only supported in v1.9+, where "foo.bar" : "2d" may reference a nested field like so:

{ foo : [ { bar : [ ... ] } ] }

This restriction holds even if there are not multiple locations per document and the array size is 1. In older versions you will need to embed the nested location without a parent array:

{ foo : { bar : [ ... ] } }

Querying

The index can be used for exact matches:

db.places.find( { loc : [50,50] } )

Of course, that is not very interesting. More important is a query to find points near another point, but not necessarily matching exactly:

db.places.find( { loc : { $near : [50,50] } } )

The above query finds the closest points to (50,50) and returns them sorted by distance (there is no need for an additional sort parameter). Use limit() to specify a maximum number of points to return (a default limit of 100 applies if unspecified):

db.places.find( { loc : { $near : [50,50] } } ).limit(20)

You can also use $near with a maximum distance:

db.places.find( { loc : { $near : [50,50] , $maxDistance : 5 } } ).limit(20)

Prior to v1.9.1, geospatial indexes can be used for exact lookups only when no other criteria are specified in the query and locations are specified in arrays. For any type of $near, $within, or geoNear query, this restriction does not apply and any additional search criteria can be used.

All distances in geospatial queries are specified in the same units as the document coordinate system (aside from spherical queries, discussed below). For example, if your indexed region is of size [300, 300), representing a 300 x 300 meter field, and you have documents at locations (10, 20) and (10, 30), representing objects at points in meters (x, y), you could query for points $near : [10, 20], $maxDistance : 10. Because the distance unit is the same as in your coordinate system, this query looks for points up to 10 meters away.
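
To make the flat distance semantics concrete, here is a client-side sketch that mimics a $near query with $maxDistance and a limit over an in-memory array. It is not how the server evaluates the query; nearFlat is a hypothetical helper, and treating $maxDistance as inclusive is an assumption.

```javascript
// Client-side imitation of { loc : { $near : center , $maxDistance : d } }.limit(n)
// using plain Euclidean distance in the units of the coordinate system.
// Assumption: points exactly at maxDistance are included.
function nearFlat(docs, center, maxDistance = Infinity, limit = 100) {
  return docs
    .map((doc) => ({
      doc,
      dis: Math.hypot(doc.loc[0] - center[0], doc.loc[1] - center[1]),
    }))
    .filter((hit) => hit.dis <= maxDistance)
    .sort((a, b) => a.dis - b.dis) // closest first, as $near returns them
    .slice(0, limit);              // 100 mirrors the default limit mentioned above
}

// Hypothetical documents on a 300 x 300 meter field.
const fieldDocs = [
  { name: "far", loc: [200, 200] },
  { name: "b", loc: [10, 30] },
  { name: "a", loc: [10, 20] },
];

console.log(nearFlat(fieldDocs, [10, 20], 15).map((hit) => hit.doc.name));
```

The distances are in meters here only because the stored coordinates are in meters.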

When using longitude and latitude, which are angular measures, distance is effectively specified in approximate units of "degrees," which vary by position on the globe but can be converted very roughly to distance using 69 miles per degree of latitude or longitude. The maximum error in the northernmost or southernmost populated regions is ~2x longitudinally - for many purposes this is acceptable. Spherical queries (below) take the curvature of the earth into account.
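
A minimal sketch of that rough conversion, turning a search radius in miles into a $maxDistance in degrees. The helper name milesToDegrees and the sample point are illustrative assumptions; the resulting object is an ordinary query document that could be passed to find().

```javascript
const MILES_PER_DEGREE = 69; // rough figure quoted above

// Convert a distance in miles to approximate degrees for flat (non-spherical) queries.
function milesToDegrees(miles) {
  return miles / MILES_PER_DEGREE;
}

// Query document for "within roughly 5 miles" of a (longitude, latitude) point.
const nearQuery = {
  loc: { $near: [-74, 40.74], $maxDistance: milesToDegrees(5) },
};
console.log(JSON.stringify(nearQuery));
// In the shell: db.places.find(nearQuery).limit(20)
```

Because the conversion ignores latitude, the east-west reach of such a query is only approximate away from the equator.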

Compound Indexes

MongoDB geospatial indexes optionally support specification of secondary key values.  If you are commonly going to be querying on both a location and other attributes at the same time, add the other attributes to the index.  The other attributes are annotated within the index to make filtering faster.  For example:

db.places.ensureIndex( { location : "2d" , category : 1 } );
db.places.find( { location : { $near : [50,50] }, category : 'coffee' } );

Limits in geospatial queries are always applied to the geospatial component first. This can cause unexpected results when also re-sorting results by additional criteria, pending resolution of SERVER-4247.
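
The effect of applying the limit before a re-sort can be seen with a small in-memory example. The data and the rating field are made up for illustration, and the code only imitates the ordering of operations, not the server itself.

```javascript
// Hypothetical places near (50,50), each with a made-up rating.
const ratedPlaces = [
  { name: "A", loc: [50, 51], rating: 2 },
  { name: "B", loc: [50, 53], rating: 5 },
  { name: "C", loc: [50, 60], rating: 9 },
];

const distanceTo5050 = (p) => Math.hypot(p.loc[0] - 50, p.loc[1] - 50);
const byRatingDesc = (a, b) => b.rating - a.rating;

// What the note above describes: take the 2 nearest first, then re-sort.
const limitThenSort = [...ratedPlaces]
  .sort((a, b) => distanceTo5050(a) - distanceTo5050(b))
  .slice(0, 2)
  .sort(byRatingDesc)
  .map((p) => p.name);

// What one might expect instead: sort everything by rating, then take 2.
const sortThenLimit = [...ratedPlaces]
  .sort(byRatingDesc)
  .slice(0, 2)
  .map((p) => p.name);

console.log(limitThenSort, sortThenLimit);
```

The best-rated place "C" is missing from the first result because it was cut off by the limit before the re-sort took place.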

geoNear Command

While the find() syntax above is typically preferred, MongoDB also has a geoNear command which performs a similar function.  The geoNear command has the added benefit of returning the distance of each item from the specified point in the results, as well as some diagnostics for troubleshooting.

Valid options are: "near", "num", "maxDistance", "distanceMultiplier" and "query".

> db.runCommand( { geoNear : "places" , near : [50,50], num : 10 } );
{
        "ns" : "test.places",
        "near" : "1100110000001111110000001111110000001111110000001111",
        "results" : [
                {
                        "dis" : 69.29646421910687,
                        "obj" : {
                                "_id" : ObjectId("4b8bd6b93b83c574d8760280"),
                                "y" : [
                                        1,
                                        1
                                ],
                                "category" : "Coffee"
                        }
                },
                {
                        "dis" : 69.29646421910687,
                        "obj" : {
                                "_id" : ObjectId("4b8bd6b03b83c574d876027f"),
                                "y" : [
                                        1,
                                        1
                                ]
                        }
                }
        ],
        "stats" : {
                "time" : 0,
                "btreelocs" : 1,
                "nscanned" : 2,
                "objectsLoaded" : 2,
                "avgDistance" : 69.29646421910687
        },
        "ok" : 1
}

The above command will return the 10 closest items to (50,50). (The loc field is automatically determined by checking for a 2d index on the collection.)

If you want to add an additional filter, you can do so:

> db.runCommand( { geoNear : "places" , near : [ 50 , 50 ], num : 10,
... query : { type : "museum" } } );

query can be any regular mongo query.
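
The options listed above can be assembled into a command document programmatically. The sketch below uses a hypothetical helper, buildGeoNear, that copies only the option names mentioned in this section; the distanceMultiplier of 69 is an illustrative choice that scales returned degree distances into rough miles.

```javascript
// Build a geoNear command document from the options named in this section.
// Hypothetical helper; unknown option names are silently dropped.
function buildGeoNear(collection, near, options = {}) {
  const allowed = ["num", "maxDistance", "distanceMultiplier", "query"];
  const command = { geoNear: collection, near: near }; // command name must come first
  for (const key of allowed) {
    if (options[key] !== undefined) {
      command[key] = options[key];
    }
  }
  return command;
}

const museumCommand = buildGeoNear("places", [50, 50], {
  num: 10,
  query: { type: "museum" },
  distanceMultiplier: 69, // returned "dis" values are multiplied by this factor
});
console.log(JSON.stringify(museumCommand));
// In the shell: db.runCommand(museumCommand)
```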

Bounds Queries

$within can be used instead of $near to find items within a shape. Results are not sorted by distance, which may result in faster queries when this sorting is not required.  Shapes of type $box (rectangles), $center (circles), and $polygon (concave and convex polygons) are supported.  All bounds queries implicitly include the border of the shape as part of the boundary, though due to floating-point inaccuracy this can't strictly be relied upon.

To query for all points within a rectangle, you must specify the lower-left and upper-right corners:

> box = [[40.73083, -73.99756], [40.741404,  -73.988135]]
> db.places.find({"loc" : {"$within" : {"$box" : box}}})

A circle is specified by a center point and radius:

> center = [50, 50]
> radius = 10
> db.places.find({"loc" : {"$within" : {"$center" : [center, radius]}}})

A polygon is specified by an array or object of points, where each point may be specified by either an array or an object. The last point in the polygon is implicitly connected to the first point in the polygon.

> polygonA = [ [ 10, 20 ], [ 10, 40 ], [ 30, 40 ], [ 30, 20 ] ]
> polygonB = { a : { x : 10, y : 20 }, b : { x : 15, y : 25 }, c : { x : 20, y : 20 } }
> db.places.find({ "loc" : { "$within" : { "$polygon" : polygonA } } })
> db.places.find({ "loc" : { "$within" : { "$polygon" : polygonB } } })

Polygon searches are strictly limited to looking for points inside polygons; polygon shapes in documents can't currently be indexed in MongoDB.

Polygon searches are supported in versions >= 1.9.
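
For intuition about the three shapes, the containment tests can be sketched in client-side JavaScript. These helpers (inBox, inCenter, inPolygon, asPair) are hypothetical and are not the server's implementation; the polygon test uses the standard ray-casting technique, which does not treat border points consistently.

```javascript
// Read a point given as [x, y] or as an object whose first two fields are x, y.
function asPair(p) {
  return Array.isArray(p) ? p.slice(0, 2) : Object.values(p).slice(0, 2);
}

// $box: lower-left and upper-right corners, borders included.
function inBox(point, [lowerLeft, upperRight]) {
  return point[0] >= lowerLeft[0] && point[0] <= upperRight[0] &&
         point[1] >= lowerLeft[1] && point[1] <= upperRight[1];
}

// $center: center point and radius, border included.
function inCenter(point, [center, radius]) {
  return Math.hypot(point[0] - center[0], point[1] - center[1]) <= radius;
}

// $polygon: ray casting; the last vertex is implicitly joined to the first.
function inPolygon(point, polygon) {
  const vertices = (Array.isArray(polygon) ? polygon : Object.values(polygon)).map(asPair);
  const [px, py] = point;
  let inside = false;
  for (let i = 0, j = vertices.length - 1; i < vertices.length; j = i++) {
    const [xi, yi] = vertices[i];
    const [xj, yj] = vertices[j];
    // Does a horizontal ray from the point cross edge (j -> i)?
    const crosses = (yi > py) !== (yj > py) &&
                    px < ((xj - xi) * (py - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

const sampleBox = [[40.73083, -73.99756], [40.741404, -73.988135]];
const samplePolygonA = [[10, 20], [10, 40], [30, 40], [30, 20]];
const samplePolygonB = { a: { x: 10, y: 20 }, b: { x: 15, y: 25 }, c: { x: 20, y: 20 } };

console.log(inBox([40.735, -73.99], sampleBox));
console.log(inCenter([55, 55], [[50, 50], 10]));
console.log(inPolygon([20, 30], samplePolygonA), inPolygon([15, 22], samplePolygonB));
```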

The Earth is Round but Maps are Flat

The current implementation assumes an idealized model of a flat earth, meaning that an arcdegree of latitude (y) and longitude (x) represent the same distance everywhere. This is only true at the equator where they are both about equal to 69 miles or 111km. However, at the 10gen offices at { x : -74 , y : 40.74 } one arcdegree of longitude is about 52 miles or 83 km (latitude is unchanged). This means that something 1 mile to the north would seem closer than something 1 mile to the east.
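
The figures above follow from scaling the equatorial value by the cosine of the latitude. A short sketch, assuming a spherical earth and the rough 69 miles per degree figure (milesPerDegreeLongitude is a hypothetical helper):

```javascript
const MILES_PER_DEGREE_AT_EQUATOR = 69; // rough figure used in this document

// Length of one arcdegree of longitude at a given latitude, assuming a sphere.
function milesPerDegreeLongitude(latitudeDegrees) {
  const latitudeRadians = (latitudeDegrees * Math.PI) / 180;
  return MILES_PER_DEGREE_AT_EQUATOR * Math.cos(latitudeRadians);
}

console.log(milesPerDegreeLongitude(0));     // 69 at the equator
console.log(milesPerDegreeLongitude(40.74)); // close to the 52 miles quoted above
console.log(milesPerDegreeLongitude(60));    // about half the equatorial value
```

At 60 degrees of latitude a degree of longitude is about half as long as at the equator, which is where a 2x longitudinal error in flat queries comes from.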

New Spherical Model

v1.8+.

Spherical distances can be used by adding "Sphere" to the name of the query. For example, use $nearSphere or $centerSphere ($boxSphere and $polygonSphere don't make as much sense and so aren't supported). If you use the geoNear command to get distance along with the results, you just need to add spherical:true to the list of options.

There are a few caveats that you must be aware of when using spherical distances. The biggest is:

  1. The code assumes that you are using decimal degrees in (longitude, latitude) order. This is the same order used for the GeoJSON spec. Using (latitude, longitude) will result in very incorrect results, but is often the ordering used elsewhere, so it is good to double-check.  The names you assign to a location object (if using an object and not an array) are completely ignored; only the ordering is detected.  A few examples:
    /* assuming longitude is 13, latitude is -50 */
    [13, -50] // ok
    { x : 13, y : -50 } // ok
    { lon : 13, lat : -50 } // ok
    
    /* wrong, will make lat = longitude and lon = latitude */
    { lat : -50, lon : 13 }
    

    As above, the use of order-preserving dictionaries is required for consistent results.
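
The positional reading of locations can be imitated in client code to check what a given object would mean. This is a sketch only: locationToPair is a hypothetical helper, and it relies on JavaScript preserving insertion order for non-numeric object keys.

```javascript
// Read a location the way the caveat above describes: by position, ignoring names.
function locationToPair(loc) {
  const values = Array.isArray(loc) ? loc : Object.values(loc);
  return [values[0], values[1]]; // interpreted as [longitude, latitude]
}

/* assuming longitude is 13, latitude is -50 */
console.log(locationToPair([13, -50]));               // [ 13, -50 ]
console.log(locationToPair({ lon: 13, lat: -50 }));   // [ 13, -50 ]

/* field names do not help: this is read as longitude -50, latitude 13 */
console.log(locationToPair({ lat: -50, lon: 13 }));   // [ -50, 13 ]
```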

Also:

  1. All distances use radians. This allows you to easily multiply by the radius of the earth (about 6371 km or 3959 miles) to get the distance in your choice of units. Conversely, divide by the radius of the earth when doing queries.
  2. We don't currently handle wrapping at the poles or at the transition from -180° to +180° longitude; however, we detect when a search would wrap and raise an error.
  3. While the default Earth-like bounds are [-180, 180), valid values for latitude are between -90° and 90°.

Spherical Example

Below is a simple example of a spherical distance query, demonstrating how to convert a specified range in kilometers to a maxDistance in radians as well as converting the returned distance results from radians back to kilometers. The same conversion of kilometer to radian distance bounds is required when performing bounded $nearSphere and $centerSphere queries.

> db.points.insert({ pos : { lon : 30, lat : 30 } })
> db.points.insert({ pos : { lon : -10, lat : -20 } })
> db.points.ensureIndex({ pos : "2d" })
>
> var earthRadius = 6378 // km
> var range = 3000 // km
>
> distances = db.runCommand({ geoNear : "points", near : [0, 0], spherical : true, maxDistance : range / earthRadius /* to radians */ }).results
[
	{
		"dis" : 0.3886630122897946,
		"obj" : {
			"_id" : ObjectId("4d9123026ccc7e2cf22925c4"),
			"pos" : {
				"lon" : -10,
				"lat" : -20
			}
		}
	}
]
> pointDistance = distances[0].dis * earthRadius // back to km
2478.89269238431
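
The returned distance can be cross-checked on the client with the haversine formula, a standard way to compute great-circle distance on a sphere. This is a sketch for verification only and is not claimed to be the server's own formula; sphericalDistance is a hypothetical helper taking (longitude, latitude) pairs in degrees.

```javascript
function toRadians(degrees) {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance in radians between two [longitude, latitude] points.
function sphericalDistance([lon1, lat1], [lon2, lat2]) {
  const dLat = toRadians(lat2 - lat1);
  const dLon = toRadians(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * Math.asin(Math.sqrt(a));
}

const earthRadiusKm = 6378; // same value as the shell example above
const rangeKm = 3000;

const nearRadians = sphericalDistance([0, 0], [-10, -20]); // about 0.3887, in line with "dis" above
const farRadians = sphericalDistance([0, 0], [30, 30]);

console.log(nearRadians, nearRadians * earthRadiusKm); // radians, then km
console.log(farRadians > rangeKm / earthRadiusKm);     // true: outside the 3000 km range
```

The second point, at (30, 30), lies beyond the 3000 km range, which is why only one document appears in the results above.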

Multi-location Documents

v1.9+

MongoDB now also supports indexing documents by multiple locations. These locations can be specified in arrays of sub-objects, for example:

> db.places.insert({ addresses : [ { name : "Home", loc : [55.5, 42.3] }, { name : "Work", loc : [32.3, 44.2] } ] })
> db.places.ensureIndex({ "addresses.loc" : "2d" })

Multiple locations may also be specified in a single field:

> db.places.insert({ lastSeenAt : [ { x : 45.3, y : 32.2 }, [54.2, 32.3], { lon : 44.2, lat : 38.2 } ] })
> db.places.ensureIndex({ "lastSeenAt" : "2d" })

By default, when performing geoNear or $near-type queries on collections containing multi-location documents, the same document may be returned multiple times, since $near queries return ordered results by distance. Queries using the $within operator by default do not return duplicate documents. 

v2.0

In v2.0, this default can be overridden by the use of a uniqueDocs option for geoNear and a $uniqueDocs parameter for $within queries, like so:

> db.runCommand( { geoNear : "places" , near : [50,50], num : 10, uniqueDocs : false } )
> db.places.find( { loc : { $within : { $center : [[0.5, 0.5], 20], $uniqueDocs : true } } } )

Currently it is not possible to specify $uniqueDocs for $near queries.

Whether or not uniqueDocs is true, when using a limit the limit is applied (as is normally the case) to the number of results returned (and not to the docs or locations).  If running a geoNear query with uniqueDocs : true, the closest location in a document to the center of the search region will always be returned - this is not true for $within queries.
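
The interaction between multi-location documents, uniqueDocs and the limit can be sketched on the client as follows. This imitates the behavior described for geoNear only; geoNearMulti and the locs field are hypothetical, and flat Euclidean distance is assumed.

```javascript
// Imitate geoNear over documents that each carry several locations.
// Every (document, location) pair is a candidate result; with uniqueDocs
// only the closest location of each document is kept.
function geoNearMulti(docs, center, { uniqueDocs = false, num = 100 } = {}) {
  const hits = [];
  for (const doc of docs) {
    for (const loc of doc.locs) {
      hits.push({
        dis: Math.hypot(loc[0] - center[0], loc[1] - center[1]),
        loc: loc, // the location that produced this distance
        obj: doc,
      });
    }
  }
  hits.sort((a, b) => a.dis - b.dis);

  const seen = new Set();
  const kept = [];
  for (const hit of hits) {
    if (uniqueDocs && seen.has(hit.obj)) continue; // a closer location was already kept
    seen.add(hit.obj);
    kept.push(hit);
  }
  return kept.slice(0, num); // the limit counts results, not documents
}

const multiDocs = [
  { name: "Some Place", locs: [[10, 10], [50, 50]] },
  { name: "Another Place", locs: [[-10, -10], [-50, -50]] },
];

console.log(geoNearMulti(multiDocs, [0, 0]).length);                       // 4 results
console.log(geoNearMulti(multiDocs, [0, 0], { uniqueDocs: true }).length); // 2 results
```

Because the candidates are sorted before duplicates are dropped, the location kept for each document is always its closest one.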

In addition, when using geoNear queries and multi-location documents, often it is useful to return not only distances, but also the location in the document which was used to generate the distance.  In v2.0, to return the location alongside the distance in the geoNear results (in the field loc), specify includeLocs : true in the geoNear query. The location returned will be a copy of the location in the document used.

If the location was an array, the location returned will be an object with "0" and "1" fields in v2.0.0 and v2.0.1.

> db.runCommand({ geoNear : "places", near : [ 0, 0 ], maxDistance : 20, includeLocs : true })
{
	"ns" : "test.places",
	"near" : "1100000000000000000000000000000000000000000000000000",
	"results" : [
		{
			"dis" : 5.830951894845301,
			"loc" : {
				"x" : 3,
				"y" : 5
			},
			"obj" : {
				"_id" : ObjectId("4e52672c15f59224bdb2544d"),
				"name" : "Final Place",
				"loc" : {
					"x" : 3,
					"y" : 5
				}
			}
		},
		{
			"dis" : 14.142135623730951,
			"loc" : {
				"0" : 10,
				"1" : 10
			},
			"obj" : {
				"_id" : ObjectId("4e5266a915f59224bdb2544b"),
				"name" : "Some Place",
				"loc" : [
					[
						10,
						10
					],
					[
						50,
						50
					]
				]
			}
		},
		{
			"dis" : 14.142135623730951,
			"loc" : {
				"0" : -10,
				"1" : -10
			},
			"obj" : {
				"_id" : ObjectId("4e5266ba15f59224bdb2544c"),
				"name" : "Another Place",
				"loc" : [
					[
						-10,
						-10
					],
					[
						-50,
						-50
					]
				]
			}
		}
	],
	"stats" : {
		"time" : 0,
		"btreelocs" : 0,
		"nscanned" : 5,
		"objectsLoaded" : 3,
		"avgDistance" : 11.371741047435734,
		"maxDistance" : 14.142157540259815
	},
	"ok" : 1
}
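
The "dis" and "avgDistance" values in this output can be reproduced on the client, assuming plain Euclidean distance from the query point to each returned "loc". A short check:

```javascript
// Locations reported in the "loc" field of the three results above.
const queryPoint = [0, 0];
const returnedLocs = [[3, 5], [10, 10], [-10, -10]];

// Flat (non-spherical) distance from the query point to each location.
const flatDistances = returnedLocs.map(([x, y]) =>
  Math.hypot(x - queryPoint[0], y - queryPoint[1])
);

// Mean of the result distances, to compare with "avgDistance" in the stats.
const averageDistance =
  flatDistances.reduce((sum, d) => sum + d, 0) / flatDistances.length;

console.log(flatDistances);   // compare with the "dis" fields above
console.log(averageDistance); // compare with "avgDistance" above
```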

Sharded Collections

v1.8+. Creating a geospatial index for a sharded collection is supported with some caveats: see http://jira.mongodb.org/browse/SHARDING-83.  There are no caveats for using geospatial indexes with unsharded collections in a sharded cluster.

Implementation

The current implementation encodes geographic hash codes atop standard MongoDB B-trees. Results of $near queries are exact. One limitation of this encoding, although it is fast, is that prefix lookups don't give exact results, especially around bit flip areas. MongoDB solves this by doing a grid-neighbor search after the initial prefix scan to pick up any straggler points. This generally ensures that performance remains very high while providing correct results.
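
A geographic hash of this kind can be sketched by scaling each coordinate to an integer and interleaving the bits of x and y. The sketch below is an assumption-laden illustration, not the server's code: geoHashBits is a hypothetical helper, the x-bit-first ordering is assumed, and the server's rounding may differ in detail. It does reproduce the "near" strings printed in the geoNear outputs on this page for [0, 0] and [50, 50].

```javascript
// Interleave the bits of the scaled x and y coordinates into one bit string.
// Assumptions: `bits` bits per axis, x bit emitted before y bit,
// coordinates scaled from [min, max) onto [0, 2^bits).
function geoHashBits(x, y, bits = 26, min = -180, max = 180) {
  const cells = Math.pow(2, bits);
  const scale = (value) => Math.floor(((value - min) / (max - min)) * cells);
  const xi = scale(x);
  const yi = scale(y);
  let hash = "";
  for (let i = bits - 1; i >= 0; i--) {
    hash += (xi >>> i) & 1; // most significant bits first
    hash += (yi >>> i) & 1;
  }
  return hash;
}

console.log(geoHashBits(0, 0));   // "11" followed by fifty zeros
console.log(geoHashBits(50, 50)); // 1100110000001111110000001111110000001111110000001111
```

Nearby points tend to share a long common prefix, which is what makes prefix scans over the B-tree useful. Points on either side of a cell boundary (for example x just below and just above 0) can differ in the very first bits, which is the "bit flip" situation that the grid-neighbor search compensates for.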
