Introduction

GridFS is a storage specification for large objects in MongoDB. It splits a large object into chunks of data and stores those chunks together with a metadata document that describes the object. This document specifies the requirements of a GridFS implementation. Normally, you need not worry about the details of the format; for information on how to use GridFS, see Storing Files.

Specification

Storage Collections

GridFS uses two collections to store data:

- files, which holds the object metadata
- chunks, which holds the binary chunks of the object data
These are "subcollections" of a "root collection". The default root collection is fs, so a default GridFS store consists of the two collections fs.files and fs.chunks. The root collection is allowed to vary so that a user can segment large objects into subsets. For example, one might partition objects by type, such as pdf, contracts, videos, etc. However, fs is the default root collection for GridFS, and every GridFS implementation must support it in such a way that it does not have to be specified to perform GridFS operations. For example:

    /*
     * default root collection usage - must be supported
     */
    GridFS myFS = new GridFS(myDatabase);             // returns a default GridFS (e.g. "fs" root collection)
    myFS.storeFile(new File("/tmp/largething.mpg"));  // saves the file into the "fs" GridFS store

    /*
     * specified root collection usage - optional
     */
    GridFS myContracts = new GridFS(myDatabase, "contracts");  // returns a GridFS where "contracts" is root
    myContracts.retrieveFile("smithco", new File("/tmp/smithco_20090105.pdf"));  // retrieves object whose filename is "smithco"

Note that the above API is for demonstration purposes only; this spec does not (at this time) recommend any API. See individual driver documentation for API specifics.

files

The structure of the object metadata document is as follows:

{
"_id" : <unspecified>, // unique ID for this file
"filename" : data_string, // human name for the file
"contentType" : data_string, // valid mime type for the object
"length" : data_number, // size of the file in bytes
"chunkSize" : data_number, // size of each of the chunks. Default is 256k
"uploadDate" : data_date, // date when object first stored
"aliases" : data_array of data_string, // optional array of alias strings
"metadata" : data_object, // anything the user wants to store
"md5" : data_string // result of running the "filemd5" command on this file's chunks
}
Note that the _id field can be of any type, at the discretion of the spec implementor.

chunks

The structure of the chunk document is as follows:

{
"_id" : <unspecified>, // object id of the chunk in the chunks collection
"files_id" : <unspecified>, // _id value of the owning files collection entry
"n" : data_number, // "chunk number" - chunks are numbered in order, starting with 0
"data" : data_binary (type 0x02) // binary data for chunk
}
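
The relationship between the files fields length and chunkSize and the chunks fields files_id, n, and data can be sketched in plain Java. This is an illustration only, not a driver API: the ChunkSplitter and Chunk names are hypothetical, the chunks live in memory rather than in MongoDB, and the sketch assumes that every chunk except possibly the last holds exactly chunkSize bytes and that an empty object produces no chunks (neither point is stated above).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChunkSplitter {
    /** Default chunk size named in the files document description: 256k. */
    static final int DEFAULT_CHUNK_SIZE = 256 * 1024;

    /** In-memory stand-in for one chunks document (hypothetical type). */
    static final class Chunk {
        final Object filesId; // "files_id": _id of the owning files entry
        final int n;          // "n": chunk number, starting with 0
        final byte[] data;    // "data": binary payload of this chunk

        Chunk(Object filesId, int n, byte[] data) {
            this.filesId = filesId;
            this.n = n;
            this.data = data;
        }
    }

    /** Number of chunks needed for an object of the given length (rounds up). */
    static int chunkCount(long length, int chunkSize) {
        if (length < 0) throw new IllegalArgumentException("length must not be negative");
        if (chunkSize <= 0) throw new IllegalArgumentException("chunkSize must be positive");
        return (int) ((length + chunkSize - 1) / chunkSize);
    }

    /** Splits content into consecutively numbered chunks of at most chunkSize bytes. */
    static List<Chunk> split(Object filesId, byte[] content, int chunkSize) {
        int count = chunkCount(content.length, chunkSize);
        List<Chunk> chunks = new ArrayList<>(count);
        for (int n = 0; n < count; n++) {
            int from = n * chunkSize;
            // Widen to long so a very large chunkSize cannot overflow the sum.
            int to = (int) Math.min((long) from + chunkSize, content.length);
            chunks.add(new Chunk(filesId, n, Arrays.copyOfRange(content, from, to)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // A 10-byte object with a 4-byte chunk size yields chunks of 4, 4 and 2 bytes.
        for (Chunk c : split("file-1", new byte[10], 4)) {
            System.out.println("n=" + c.n + " bytes=" + c.data.length);
        }
    }
}
```

A real implementation would insert one document per Chunk into the chunks collection and one metadata document into the files collection, using whatever API its driver provides.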
Notes:
Indexing

GridFS implementations should create an index on { files_id: 1, n: 1 } in the chunks collection, and can then count on retrieving the chunks of a file efficiently, in order, via:

    db.fs.chunks.find({files_id: myFileID}).sort({n: 1});
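
To show why the chunks are fetched by files_id and ordered by n, here is a minimal in-memory sketch of reassembling an object from its chunks. The ChunkAssembler and Chunk names are hypothetical, the list stands in for the chunks collection, and the checks for gaps in the numbering and for the total length are assumptions about what a careful implementation might verify, not requirements stated above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ChunkAssembler {
    /** In-memory stand-in for one chunks document (hypothetical type). */
    static final class Chunk {
        final Object filesId; // "files_id"
        final int n;          // "n"
        final byte[] data;    // "data"

        Chunk(Object filesId, int n, byte[] data) {
            this.filesId = filesId;
            this.n = n;
            this.data = data;
        }
    }

    /** Mimics a find on files_id followed by an ascending sort on n. */
    static List<Chunk> findChunks(List<Chunk> all, Object filesId) {
        List<Chunk> matching = new ArrayList<>();
        for (Chunk c : all) {
            if (c.filesId.equals(filesId)) matching.add(c);
        }
        matching.sort((a, b) -> Integer.compare(a.n, b.n));
        return matching;
    }

    /** Concatenates the ordered chunks and checks the result against the files "length". */
    static byte[] assemble(List<Chunk> all, Object filesId, long expectedLength) {
        List<Chunk> ordered = findChunks(all, filesId);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < ordered.size(); i++) {
            Chunk c = ordered.get(i);
            if (c.n != i) {
                throw new IllegalStateException("missing or duplicate chunk at n=" + i);
            }
            out.write(c.data, 0, c.data.length);
        }
        if (out.size() != expectedLength) {
            throw new IllegalStateException("expected " + expectedLength + " bytes, got " + out.size());
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Chunks are stored out of order and mixed with another file's chunk.
        List<Chunk> store = new ArrayList<>();
        store.add(new Chunk("a", 1, "world".getBytes(StandardCharsets.US_ASCII)));
        store.add(new Chunk("b", 0, "other".getBytes(StandardCharsets.US_ASCII)));
        store.add(new Chunk("a", 0, "hello ".getBytes(StandardCharsets.US_ASCII)));
        byte[] file = assemble(store, "a", 11);
        System.out.println(new String(file, StandardCharsets.US_ASCII)); // prints "hello world"
    }
}
```

Against a real database, the filtering and sorting done by findChunks is the work the { files_id: 1, n: 1 } index is meant to make cheap.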