add data modeling page for mongodb #744

Merged
132 changes: 132 additions & 0 deletions docs/connectors/mongodb/modeling.mdx
@@ -0,0 +1,132 @@
---
sidebar_position: 1
sidebar_label: Data Modeling
description: "Learn how to model your data with MongoDB and Hasura DDN."
keywords:
- mongodb
- modeling
seoFrontMatterUpdated: false
---

import Thumbnail from "@site/src/components/Thumbnail";

# Data Modeling

The MongoDB data connector makes available any database resource that is listed in its configuration. In order to
populate the configuration, the connector supports introspecting the database via the CLI `update` command.

## Queryable collections

The MongoDB data connector supports several types of queryable collections:

- Collections
- Views
- Native Queries

Collections and views are typically discovered automatically during the introspection process.

Native queries, on the other hand, are custom aggregation pipelines that you specify in the data connector
configuration. They run using the MongoDB `aggregate` command, which makes them similar to views, except that you may
configure parameters that are interpolated into the pipeline at query time (similar to a function), and they do not
require any DDL permissions on the database. See the configuration reference on
[Native Queries](/connectors/mongodb/native-operations/native-queries.mdx).
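
To illustrate what "interpolated into the pipeline at query time" means, here is a minimal sketch in plain JavaScript. It
is not the connector's implementation: the `interpolate` helper and the `{{ name }}` placeholder convention used here
are assumptions for illustration only, so check the Native Queries reference for the actual configuration format.

```js
// Hypothetical sketch: substitute arguments into an aggregation pipeline template.
// The "{{ name }}" placeholder convention is an assumption made for this example.
const PLACEHOLDER = /^\{\{\s*(\w+)\s*\}\}$/;

function interpolate(node, args) {
  if (typeof node === "string") {
    const match = PLACEHOLDER.exec(node);
    if (!match) return node; // ordinary string, leave untouched
    const name = match[1];
    if (!Object.prototype.hasOwnProperty.call(args, name)) {
      throw new Error(`missing argument: ${name}`);
    }
    return args[name]; // keep the argument's own type (number, string, ...)
  }
  if (Array.isArray(node)) return node.map((item) => interpolate(item, args));
  if (node !== null && typeof node === "object") {
    return Object.fromEntries(
      Object.entries(node).map(([key, value]) => [key, interpolate(value, args)])
    );
  }
  return node; // numbers, booleans, null
}

// A pipeline template with two parameters, similar in spirit to a function body.
const pipelineTemplate = [
  { $match: { title: "{{ title }}" } },
  { $limit: "{{ limit }}" },
];

const pipeline = interpolate(pipelineTemplate, { title: "Example Title", limit: 5 });
console.log(JSON.stringify(pipeline));
```

Because substitution happens on the parsed pipeline structure rather than on raw text, an argument value cannot change
the shape of the pipeline, which is one reason to prefer this approach over string concatenation.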

## Scalar types

The MongoDB data connector supports nearly all of the scalar types that the MongoDB database does. In addition, the
connector provides one more scalar type, `extendedJSON`, which acts like an "any" type.

:::tip Data type names

MongoDB's scalar types are listed in [MongoDB's BSON Types][BSON: data types] documentation. The MongoDB data connector
configuration uses "alias" names for each type which are camel-cased with a lower-case initial letter. The same names
are used in the connector's NDC schema.

To use standard GraphQL scalar types rather than custom scalar types in the resulting GraphQL schema, the DDN
configuration lets you express a mapping between data connector types and GraphQL types.

:::

:::info Limited support

The data connector does not support the `dbPointer` BSON type, which is deprecated in MongoDB. Other deprecated types,
`symbol` and `undefined`, are supported for the time being.

:::

:::info Extended JSON

`extendedJSON` may represent any MongoDB value including scalars, objects, arrays, or null. The automatically-generated
schema in your MongoDB data connector configuration may include fields with the `extendedJSON` type in cases where the
introspection system is unable to infer a specific type. The name is based on the serialization format used: unlike
other values, values with the `extendedJSON` type are serialized in the
[MongoDB Extended JSON format](https://www.mongodb.com/docs/manual/reference/mongodb-extended-json/).

:::
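
As a rough illustration of the Extended JSON format, the sketch below names the BSON type that a given Extended JSON
value denotes. The `bsonTypeOf` helper is hypothetical and covers only a handful of wrapper keys from the Extended JSON
specification. It is not part of the connector.

```js
// Illustrative subset of Extended JSON (v2) wrapper keys, mapped to BSON type aliases.
const WRAPPER_TYPES = {
  $oid: "objectId",
  $date: "date",
  $numberInt: "int",
  $numberLong: "long",
  $numberDouble: "double",
  $numberDecimal: "decimal",
};

// Hypothetical helper: name the BSON type that an Extended JSON value denotes.
function bsonTypeOf(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (typeof value === "string") return "string";
  if (typeof value === "boolean") return "bool";
  // Plain JSON numbers carry no width information, so this sketch guesses.
  if (typeof value === "number") return Number.isInteger(value) ? "int" : "double";
  const keys = Object.keys(value);
  if (keys.length === 1 && Object.prototype.hasOwnProperty.call(WRAPPER_TYPES, keys[0])) {
    return WRAPPER_TYPES[keys[0]];
  }
  return "object"; // any other object is treated as an ordinary document
}

const sample = {
  _id: { $oid: "66cf91a0ec1dfb55954378bd" },
  value: { $numberInt: "1" },
  created: { $date: "2024-07-03T00:00:00Z" },
  tags: ["a", "b"],
};

for (const [field, fieldValue] of Object.entries(sample)) {
  console.log(field, "->", bsonTypeOf(fieldValue));
}
```

The wrapper objects are what let a single JSON-compatible value carry type information that plain JSON cannot, such as
the difference between an `int` and a `long`.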

## Structural types

The data connector supports MongoDB array and object types, and they will appear as such in the schema it exposes. Since
MongoDB is a document store that is intended to store arbitrarily complex data structures, the MongoDB data connector is
designed to support the same.

As mentioned previously, although `extendedJSON` is presented as a scalar type from the perspective of GraphQL, it may
represent arbitrary structural data encoded using the [MongoDB Extended JSON format][Extended JSON].

## Schema Introspection

The CLI can automatically create a configuration for the MongoDB data connector which includes a schema for
your database. Because MongoDB databases do not have an internal schema, the connector schema is produced by randomly
sampling database documents. By default it samples 100 documents. You may get more accurate results by increasing this
number, which you can do by editing the `sampleSize` option in `configuration.json`.

During sampling, the types of values found are compared between documents. This includes top-level fields, nested
object fields, and array elements. For a field path where the same type of value is found in every document, the schema
sets that type. If a value at that path is missing in some documents, a nullable type is assumed.

In general, when automatic schema introspection encounters arrays or objects, it accurately reflects the complex
structure seen in the database. These types in turn map to complex GraphQL types. The introspection system falls
back to the `extendedJSON` type when fields with the same name and path in multiple objects contain
incompatible types. For example, given a collection with these documents:

```js
{ _id: { $oid: "66cf91a0ec1dfb55954378bd" }, value: 1, created: "2024-02-20" }
{ _id: { $oid: "66cf9230ec1dfb55954378be" }, value: 2, created: { $date: "2024-07-03" }, updated: { $date: "2024-10-25" }}
{ _id: { $oid: "66cf9274ec1dfb55954378bf" }, value: 3, created: { $date: "2024-07-03" } }
```

(Note that when inputting data to MongoDB `{ $oid: ... }` is interpreted as an `objectId` value, and `{ $date: ... }` is
interpreted as a `date` value.)

The CLI infers these schema types:

- `_id` : `objectId`
- `value` : `int`
- `created` : `extendedJSON`
- `updated` : nullable `date`

Because `updated` appears in one document, but not in all of them, it gets a nullable type.

Because `created` has type `string` in one document, but has type `date` in the others, it gets the `extendedJSON` type,
which is the only type that can represent both `string` and `date`.

:::tip Subtyping

The connector assumes subtyping to a limited extent: it treats `int` as a subtype of `double`. In cases where different
documents use both numeric types with the same field name, an automatically-generated connector schema will supply the
`double` type.

:::
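
The inference rules described in this section (a nullable type for fields missing from some documents, `extendedJSON`
for incompatible types, and `int` widening to `double`) can be sketched roughly as follows. This is a simplified
illustration and not the connector's implementation: the function names are hypothetical, only top-level fields are
examined, `null` values are treated like missing fields, and whole JavaScript numbers are assumed to be `int`.

```js
// Hypothetical sketch of sample-based type inference; not the connector's actual code.

// Classify one sampled value, written in the Extended JSON-style notation used above.
function typeOfValue(value) {
  if (Array.isArray(value)) return "array";
  if (typeof value === "string") return "string";
  if (typeof value === "boolean") return "bool";
  if (typeof value === "number") return Number.isInteger(value) ? "int" : "double";
  if (typeof value === "object") {
    if ("$oid" in value) return "objectId";
    if ("$date" in value) return "date";
    return "object";
  }
  return "extendedJSON";
}

// Unify two type names: equal types stay, int widens to double, anything else falls back.
function unify(a, b) {
  if (a === b) return a;
  const pair = new Set([a, b]);
  if (pair.has("int") && pair.has("double")) return "double";
  return "extendedJSON";
}

function inferSchema(docs) {
  const fields = new Map(); // field name -> { type, count }
  for (const doc of docs) {
    for (const [name, value] of Object.entries(doc)) {
      if (value === null || value === undefined) continue; // treated like a missing field
      const type = typeOfValue(value);
      const seen = fields.get(name);
      fields.set(
        name,
        seen ? { type: unify(seen.type, type), count: seen.count + 1 } : { type, count: 1 }
      );
    }
  }
  const schema = {};
  for (const [name, { type, count }] of fields) {
    // A field that is absent from at least one sampled document becomes nullable.
    schema[name] = { type, nullable: count < docs.length };
  }
  return schema;
}

const sampled = [
  { _id: { $oid: "66cf91a0ec1dfb55954378bd" }, value: 1, created: "2024-02-20" },
  { _id: { $oid: "66cf9230ec1dfb55954378be" }, value: 2, created: { $date: "2024-07-03" }, updated: { $date: "2024-10-25" } },
  { _id: { $oid: "66cf9274ec1dfb55954378bf" }, value: 3, created: { $date: "2024-07-03" } },
];

console.log(inferSchema(sampled));
```

Run against the three example documents, this sketch arrives at the same field types as the list above. Once a field
has fallen back to `extendedJSON` it stays there, since no later sample can make the earlier values compatible again.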

## Queries

The connector supports all query operations; see the [query documentation](/graphql-api/queries/index.mdx) for details.

## Mutations

We currently support mutations in the form of
[Native Mutations](/connectors/mongodb/native-operations/native-mutations.mdx).
These are arbitrary MongoDB database commands that you write into the data connector configuration.

[BSON: data types]: https://www.mongodb.com/docs/manual/reference/bson-types/
[Extended JSON]: https://www.mongodb.com/docs/manual/reference/mongodb-extended-json/