(Damien Katz -- July 19 2012)
This documents the format of the Key/Value pairs in leaf nodes, and the reduction format in inner nodes.
The underlying b-tree format is the same as already used in CouchStore.
- All integers are network byte order.
- All strings are encoded as UTF-8.
- Adjacent fields are tightly packed with no padding.
In leaf nodes, KeyValues have the following format:
- Key:
  - EmittedJsonKeyLength -- 16bit integer
  - EmittedJSONKey -- JSON -- The key emitted by the map function
  - UnquotedDocId -- String -- The raw doc ID (occupies the remaining bytes)
- Value:
  - PartitionId -- 16bit integer -- The partition ID (vbucket) to which this document ID maps.
  - One or more JSONStringValues -- These are all the values that were emitted for this EmittedJSONKey.

Each JSONStringValue is of the form:
- ValueLength -- 24bit unsigned integer
- JSONValue -- string that is ValueLength bytes long

(Parsing the JSONStringValues is simply a matter of reading the first 24 bits to get the length of the following string, then extracting that string. If there is still buffer left, the process is repeated until no value buffer remains.)
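
The leaf layout above can be decoded in a few lines. The following is a minimal sketch, not the actual CouchStore code: the function names `decode_view_key` and `decode_view_value` are hypothetical, and it assumes the key and value have already been extracted from the b-tree node as byte strings.

```python
import struct


def decode_view_key(key: bytes):
    """Split a leaf key into (emitted JSON key, raw doc ID)."""
    (key_len,) = struct.unpack_from(">H", key, 0)   # EmittedJsonKeyLength, 16 bits
    if 2 + key_len > len(key):
        raise ValueError("truncated key")
    json_key = key[2:2 + key_len].decode("utf-8")   # EmittedJSONKey
    doc_id = key[2 + key_len:].decode("utf-8")      # UnquotedDocId: remaining bytes
    return json_key, doc_id


def decode_view_value(value: bytes):
    """Split a leaf value into (partition ID, list of JSON value strings)."""
    (partition_id,) = struct.unpack_from(">H", value, 0)  # PartitionId, 16 bits
    values = []
    pos = 2
    while pos < len(value):                          # repeat until the buffer is used up
        if pos + 3 > len(value):
            raise ValueError("truncated ValueLength")
        value_len = int.from_bytes(value[pos:pos + 3], "big")  # ValueLength, 24 bits
        pos += 3
        if pos + value_len > len(value):
            raise ValueError("truncated JSONValue")
        values.append(value[pos:pos + value_len].decode("utf-8"))
        pos += value_len
    return partition_id, values
```

For example, the key bytes `b'\x00\x05"abc"doc1'` decode to `('"abc"', 'doc1')`, and the value bytes `b'\x00\x07\x00\x00\x011\x00\x00\x04null'` decode to `(7, ['1', 'null'])`.
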
When an emit happens and the Key is different from all other keys emitted for that document, there is only one JSONStringValue.
But when multiple identical keys are emitted, the values are coalesced into a list of Values, and there will be multiple values.
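
To illustrate the coalescing rule, here is a hedged sketch of an encoder for the emits of a single document. The function name `encode_view_kvs` is hypothetical, and treating two keys as "identical" when their JSON text is byte-for-byte equal is an assumption; the real engine may compare keys differently.

```python
import struct


def encode_view_kvs(doc_id: str, partition_id: int, emits):
    """Encode one document's emits as (key_bytes, value_bytes) pairs.

    emits is a list of (json_key, json_value) strings. Values emitted under
    the same key are coalesced into a single KeyValue.
    """
    grouped = {}
    for json_key, json_value in emits:
        # ASSUMPTION: keys are identical when their JSON text is equal.
        grouped.setdefault(json_key, []).append(json_value)

    pairs = []
    for json_key, json_values in grouped.items():
        key_bytes = json_key.encode("utf-8")
        key = struct.pack(">H", len(key_bytes)) + key_bytes + doc_id.encode("utf-8")
        value = struct.pack(">H", partition_id)
        for json_value in json_values:
            value_bytes = json_value.encode("utf-8")
            # 24-bit big-endian ValueLength followed by the JSON text
            value += len(value_bytes).to_bytes(3, "big") + value_bytes
        pairs.append((key, value))
    return pairs
```

Emitting `("a", 1)`, `("b", 2)` and `("a", 3)` from one document yields two KeyValues: the one for `"a"` carries two JSONStringValues, and the one for `"b"` carries one.
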

In inner nodes, Reductions have the following format:
- SubTreeCount -- 40bit integer -- Count of all Values in subtree.
  NOTE: this is possibly greater than the KeyId count in the subtree, because a document can emit multiple identical keys, and they are coalesced into a single KeyId, with all the values emitted in a list as the value.
- SubTreePartitionBitmap -- 1024 bits -- A bitfield of all partition keys in the subtree. Currently this is hardcoded at 1024 bits in length, but in the future we may change this to a variable size. Until then, it works with any # of vbuckets ≤ 1024.
- JSONReductions -- remaining bytes -- Zero or more JSONReductions, each consisting of:
  - JSONLen -- 16bit integer
  - JSON -- the actual JSON string
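
The reduction fields above can be read as follows. This is a minimal sketch with a hypothetical function name, `decode_view_reduction`. It returns the bitmap as a plain integer and does not interpret individual bits, since the text does not say which bit corresponds to which partition.

```python
def decode_view_reduction(red: bytes):
    """Split a reduction into (subtree count, bitmap as int, JSON reductions)."""
    header_len = 5 + 128                            # 40-bit count + 1024-bit bitmap
    if len(red) < header_len:
        raise ValueError("reduction shorter than fixed header")
    count = int.from_bytes(red[0:5], "big")         # SubTreeCount
    bitmap = int.from_bytes(red[5:header_len], "big")  # SubTreePartitionBitmap
    reductions = []
    pos = header_len
    while pos < len(red):                           # JSONReductions fill the rest
        if pos + 2 > len(red):
            raise ValueError("truncated JSONLen")
        json_len = int.from_bytes(red[pos:pos + 2], "big")  # JSONLen, 16 bits
        pos += 2
        if pos + json_len > len(red):
            raise ValueError("truncated JSON")
        reductions.append(red[pos:pos + json_len].decode("utf-8"))
        pos += json_len
    return count, bitmap, reductions
```
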

In leaf nodes of the b-tree keyed by doc ID, KeyValues have the following format:
- Key -- blob -- The raw docId, not quoted or JSONified in any way.
- Value:
  - PartitionId -- 16bit integer -- The partition ID (vbucket) to which this document ID maps.
  - 1 to n ViewKeysMappings, where n ≤ the # of map functions defined in the design document.

A ViewKeysMapping is:
- ViewId -- 8bit integer -- the ordinal ID of the map view in the group that the following keys were emitted from
- NumKeys -- 16bit integer -- the number of JSONKeys that follow
- JSONKeys -- a sequence of:
  - KeyLen -- 16bit integer -- Length of the following JSONKey
  - Key -- JSON string -- Emitted JSON key
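
A decoder for this value layout might look like the sketch below. The function name `decode_id_value` is hypothetical, and the value is assumed to be already available as a byte string.

```python
import struct


def decode_id_value(value: bytes):
    """Split a value into (partition ID, [(view ID, [JSON keys]), ...])."""
    (partition_id,) = struct.unpack_from(">H", value, 0)   # PartitionId, 16 bits
    pos = 2
    mappings = []
    while pos < len(value):                                # one ViewKeysMapping per pass
        # ViewId (8 bits) followed by NumKeys (16 bits), tightly packed
        view_id, num_keys = struct.unpack_from(">BH", value, pos)
        pos += 3
        keys = []
        for _ in range(num_keys):
            (key_len,) = struct.unpack_from(">H", value, pos)  # KeyLen, 16 bits
            pos += 2
            if pos + key_len > len(value):
                raise ValueError("truncated JSONKey")
            keys.append(value[pos:pos + key_len].decode("utf-8"))
            pos += key_len
        mappings.append((view_id, keys))
    return partition_id, mappings
```

The result lists, for each view, every key the document emitted.
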

In inner nodes, Reductions have the following format:
- SubTreeCount -- 40bit integer -- Count of all Keys in subtree.
- SubTreePartitionBitmap -- 1024 bits -- A bitfield of all partition keys in the subtree. Currently this is hardcoded at 1024 bits in length, but in the future we may change this to a variable size. Until then, it works with any # of vbuckets ≤ 1024.
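
As a sketch of how the bitmap could be used, the code below decodes this two-field reduction and lists the partitions present in a subtree. Both function names are hypothetical. The bit ordering is an assumption the text does not confirm: the 128 bytes are read as one big-endian 1024-bit integer, and bit i (value `1 << i`) is taken to mark partition i.

```python
def decode_id_reduction(red: bytes):
    """Split a reduction into (subtree count, bitmap as int)."""
    if len(red) < 5 + 128:
        raise ValueError("reduction shorter than 40-bit count + 1024-bit bitmap")
    count = int.from_bytes(red[0:5], "big")        # SubTreeCount, 40 bits
    bitmap = int.from_bytes(red[5:133], "big")     # SubTreePartitionBitmap
    return count, bitmap


def partitions_in(bitmap: int):
    """List partition IDs whose bit is set.

    ASSUMPTION: bit i of the big-endian integer corresponds to partition i.
    """
    return [i for i in range(1024) if (bitmap >> i) & 1]
```

A query restricted to certain vbuckets could use such a check to skip subtrees whose bitmap contains none of them.
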