fix: preserve the order of the keys (#211)
* fix: preserve the order of the keys
Co-authored-by: Rodolfo Carvalho <[email protected]>
zookzook authored Nov 26, 2023
1 parent f01d497 commit b360d87
Showing 5 changed files with 456 additions and 199 deletions.
92 changes: 85 additions & 7 deletions README.md
@@ -135,15 +135,18 @@ Mongo.insert_many(top, "users", [

## Data Representation

This driver chooses to accept both maps and lists of key-value tuples when encoding BSON documents (1), but will only decode documents into maps. This has the side effect that document field order is lost when decoding. Maps are convenient to work with, but map keys are not ordered, unlike BSON document fields.
This driver chooses to accept both maps and lists of key-value tuples when encoding BSON documents (1), but will only
decode documents into maps. Maps are convenient to work with, but Elixir map keys are not ordered, unlike BSON document
keys.

Driver users should represent documents using a list of tuples when field order matters, for example when sorting by multiple fields:
That design decision means document key order is lost when encoding Elixir maps to BSON and, conversely, when decoding
BSON documents to Elixir maps. However, see [Preserve Document Key Order](#preserve-document-key-order) to learn how to
preserve key order when it matters.

```elixir
Mongo.find(top, "users", %{}, sort: [last_name: 1, first_name: 1, _id: 1])
```

Additionally, the driver accepts both atoms and strings for document keys, but will only decode them into strings. Creating atoms from arbitrary input (such as database documents) is [discouraged](https://elixir-lang.org/getting-started/mix-otp/genserver.html#:~:text=However%2C%20naming%20dynamic,our%20system%20memory!) because atoms are not garbage collected.
Additionally, the driver accepts both atoms and strings for document keys, but will only decode them into strings.
Creating atoms from arbitrary input (such as database documents) is
[discouraged](https://elixir-lang.org/getting-started/mix-otp/genserver.html#:~:text=However%2C%20naming%20dynamic,our%20system%20memory!)
because atoms are not garbage collected.

[BSON symbols (deprecated)](https://bsonspec.org/spec.html#:~:text=Symbol.%20%E2%80%94%20Deprecated) can only be decoded (2).

@@ -169,6 +172,81 @@ Additionally, the driver accepts both atoms and strings for document keys, but w
max key :BSON_max
decimal128 Decimal{}

## Preserve Document Key Order

### Encoding from Elixir to BSON

For some MongoDB operations, the order of the keys in a document affects the result. That is the case, for example,
when sorting a query by multiple fields.

In those cases, driver users should represent documents using a list of tuples (or a keyword list) to preserve the
order. Example:

```elixir
Mongo.find(top, "users", %{}, sort: [last_name: 1, first_name: 1, _id: 1])
```

The query above sorts users by last name, then by first name, and finally by ID. If an Elixir map had been used to
specify `:sort`, the sort keys could reach MongoDB in a different order and the results would be sorted incorrectly.
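
The difference can be checked without a database connection. The following is a minimal sketch that only inspects the Elixir data structures; the variable names are illustrative and no driver function is called.

```elixir
# A keyword list is a list of {atom, value} tuples, so it keeps the order
# in which the keys were written.
sort_as_keyword = [last_name: 1, first_name: 1, _id: 1]
IO.inspect(Keyword.keys(sort_as_keyword), label: "keyword list")

# The same specification as a list of tuples with string keys. A list
# always keeps its order.
sort_as_tuples = [{"last_name", 1}, {"first_name", 1}, {"_id", 1}]
IO.inspect(Enum.map(sort_as_tuples, fn {key, _direction} -> key end), label: "tuple list")

# A map offers no ordering guarantee: the order in which its keys are
# enumerated is an implementation detail of the runtime and need not match
# the order in which they were written.
sort_as_map = %{last_name: 1, first_name: 1, _id: 1}
IO.inspect(Map.keys(sort_as_map), label: "map")
```

Either list form can be passed as the `:sort` option when the order matters.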

### Decoding from BSON to Elixir

Decoded BSON documents are always represented by Elixir maps because the driver depends on that to implement its
functionality.

If the order of document keys as stored by MongoDB is needed, the driver can be configured to use a BSON decoder module
that puts a list of keys in the original order under the `:__order__` key. The decoder does this recursively, so nested
documents are annotated as well.

```elixir
config :mongodb_driver,
  decoder: BSON.PreserveOrderDecoder
```

It is possible to customize the key. For example, to use `:original_order` instead of the default `:__order__`:

```elixir
config :mongodb_driver,
  decoder: {BSON.PreserveOrderDecoder, key: :original_order}
```
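
To make the annotation concrete, here is a small sketch of how the recorded order could be used to read a document's top-level fields in their stored order. The map below is hand-written to mimic the shape described above (string keys, plus a list of keys under `:__order__` at every level). It is an assumption for illustration, not output captured from the driver.

```elixir
# Hypothetical stand-in for a document decoded with the order-preserving
# decoder. The exact shape is assumed from the description above.
doc = %{
  "name" => "Ada",
  "address" => %{
    "city" => "London",
    "zip" => "N1",
    :__order__ => ["city", "zip"]
  },
  :__order__ => ["name", "address"]
}

# Walk the top-level keys in the recorded order instead of relying on
# map enumeration order.
ordered_fields =
  for key <- Map.fetch!(doc, :__order__) do
    {key, Map.fetch!(doc, key)}
  end

IO.inspect(Enum.map(ordered_fields, fn {key, _value} -> key end), label: "stored order")
```

This handles only the top level; nested documents carry their own order list and need the same treatment.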

The resulting maps with annotated key order can be recursively transformed into lists of tuples. That allows for
preserving the order again when encoding. Here is an example of how to achieve that:

```elixir
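defmodule MapWithOrder do
  # Recursively converts a decoded document into a list of {key, value}
  # tuples, using the key order stored under `order_key`.
  def to_list(doc, order_key \\ :__order__) do
    do_to_list(doc, order_key)
  end

  # Structs (e.g. BSON.ObjectId) are returned unchanged. This clause must
  # come before the is_map/1 clause because structs are maps too.
  defp do_to_list(%{__struct__: _} = elem, _order_key) do
    elem
  end

  # Documents: walk the keys in their recorded order, falling back to
  # Map.keys/1 when no order was recorded.
  defp do_to_list(doc, order_key) when is_map(doc) do
    doc
    |> Map.get(order_key, Map.keys(doc))
    |> Enum.map(fn key -> {key, do_to_list(Map.get(doc, key), order_key)} end)
  end

  # Lists (BSON arrays): convert each element.
  defp do_to_list(xs, order_key) when is_list(xs) do
    Enum.map(xs, fn elem -> do_to_list(elem, order_key) end)
  end

  # Any other value is returned as-is.
  defp do_to_list(elem, _order_key) do
    elem
  end
end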

# doc = ...
MapWithOrder.to_list(doc)
```

Note that structs are kept as-is, to handle special values such as `BSON.ObjectId`.
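
The reason structs need a dedicated clause is that every Elixir struct is also a map, so a clause guarded only by `is_map/1` would match it and break it apart into tuples. The sketch below shows this with `Date`, a struct from the standard library chosen only so the example runs without the driver.

```elixir
# Any struct works for the demonstration; Date ships with Elixir.
date = ~D[2023-11-26]

# Structs are maps, so an is_map/1 guard alone cannot tell them apart
# from plain documents.
IO.inspect(is_map(date), label: "is_map")

# Matching on the :__struct__ key identifies a struct.
struct_clause_matches =
  case date do
    %{__struct__: _} -> true
    _ -> false
  end

IO.inspect(struct_clause_matches, label: "matches %{__struct__: _}")

# Treated as a plain map, the struct would expose its internal fields,
# including :__struct__ itself, as if they were document keys.
IO.inspect(Map.keys(date), label: "keys of the underlying map")
```

Because Elixir tries function clauses from top to bottom, the struct clause has to appear before the `is_map/1` clause to take effect.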

The decoder module is defined at compile time. The default decoder is `BSON.Decoder`, which does not preserve document
key order. As it needs to execute fewer operations when decoding data received from MongoDB, it offers improved
performance. Therefore, the default decoder is recommended for most use cases of this driver.

## Writing your own encoding info

If you want to write a custom struct to your mongo collection - you can do that