hasura · robertjdominguez · Oct 25, 2024 · Oct 25, 2024
diff --git a/docs/connectors/mongodb/modeling.mdx b/docs/connectors/mongodb/modeling.mdx
@@ -0,0 +1,132 @@
+---
+sidebar_position: 1
+sidebar_label: Data Modeling
+description: "Learn how to model your data with MongoDB and Hasura DDN."
+keywords:
+  - mongodb
+  - modeling
+seoFrontMatterUpdated: false
+---
+
+import Thumbnail from "@site/src/components/Thumbnail";
+
+# Data Modeling
+
+The MongoDB data connector makes available any database resource that is listed in its configuration. In order to
+populate the configuration, the connector supports introspecting the database via the CLI `update` command.
+
+## Queryable collections
+
+The MongoDB data connector supports several types of queryable collections:
+
+- Collections
+- Views
+- Native Queries
+
+Collections and views are typically discovered automatically during the introspection process.
+
+Native queries on the other hand are custom aggregation pipelines that you specify in the data connector configuration.
+These are queries run using the MongoDB `aggregate` command which makes them similar to views, except that you may
+configure parameters that are interpolated into the pipeline at query time (similar to a function), and they do not
+require any DDL permissions on the database. See the configuration reference on
+[Native Queries](/connectors/mongodb/native-operations/native-queries.mdx).
+
+## Scalar types
+
+The MongoDB data connector supports nearly all of scalar types that the MongoDB database does. In addition the MongoDB
+data connector provides an additional scalar type, `extendedJSON` which acts like an "any" type.
+
+:::tip Data type names
+
+MongoDB's scalar types are listed in [MongoDB's BSON Types][BSON: data types] documentation. The MongoDB data connector
+configuration uses "alias" names for each type which are camel-cased with a lower-case initial letter. The same names
+are used in the connector's NDC schema.
+
+In order to use standard GraphQL scalar types rather than custom scalar types in the resulting GraphQL schema, the DDN
+configuration enables expressing a mapping between data connector types and GraphQL types.
+
+:::
+
+:::info Limited support
+
+The data connector does not support the `dbPointer` BSON type which is deprecated in MongoDB. Other deprecated types,
+`symbol` and `undefined`, are supported for the time being.
+
+:::
+
+:::info Extended JSON
+
+`extendedJSON` may represent any MongoDB value including scalars, objects, arrays, or null. The automatically-generated
+schema in your MongoDB data connector configuration may include fields with the `extendedJSON` type in cases where the
+introspection system is unable to infer a specific type. The name is based on the serialization format used: unlike
+other values, values with the `extendedJSON` type are serialized in the
+[MongoDB Extended JSON format](https://www.mongodb.com/docs/manual/reference/mongodb-extended-json/).
+
+:::
+
+## Structural types
+
+The data connector supports MongoDB array and object types, and they will appear as such in the schema it exposes. Since
+MongoDB is a document store that is intended to store arbitrarily complex data structures, the MongoDB data connector is
+designed to support the same.
+
+As mentioned previously, although `extendedJSON` is presented as a scalar type from the perspective of GraphQL it may
+represent arbitrary structural data encoded using the [MongoDB Extended JSON format][Extended JSON].
+
+## Schema Introspection
+
+The CLI is capable of automatically creating a configuration for the MongoDB data connector which includes a schema for
+your database. Because MongoDB databases do not have an internal schema, the connector schema is produced by randomly
+sampling database documents. By default it samples 100 documents - you may get more accurate results by increasing this
+number which you can do by editing the `sampleSize` option in `configuration.json`.
+
+During sampling types of values found are compared between documents. This includes top-level fields, but also nested
+object fields, and array elements. For a field path where the same type of value is found in every document the schema
+sets that type. If a value at that path is missing in some documents a nullable type is assumed.
+
+In general automatic schema introspection when encountering arrays or objects will accurately reflect the complex
+structure seen in the database. These types will in turn map to complex GraphQL types. The introspection system falls
+back to the `extendedJSON` when encountering fields with the same name and path in multiple objects that contain
+incompatible types. For example given a collection with these documents:
+
+```js
+{ _id: { $oid: "66cf91a0ec1dfb55954378bd" }, value: 1, created: "2024-02-20" }
+{ _id: { $oid: "66cf9230ec1dfb55954378be" }, value: 2, created: { $date: "2024-07-03" }, updated: { $date: "2024-10-25" }}
+{ _id: { $oid: "66cf9274ec1dfb55954378bf" }, value: 3, created: { $date: "2024-07-03" } }
+```
+
+(Note that when inputting data to MongoDB `{ $oid: ... }` is interpreted as an `objectId` value, and `{ $date: ... }` is
+interpreted as a `date` value.)
+
+The CLI infers these schema types:
+
+- `_id` : `objectId`
+- `value` : `int`
+- `created` : `extendedJSON`
+- `updated` : nullable `date`
+
+Because `updated` appears in one document, but not in all of them, it gets a nullable type.
+
+Because `created` has type `string` in one document, but has type `date` in the others, it gets the `extendendJSON` type
+which is the only type that can represent both `string` and `date`.
+
+:::tip Subtyping
+
+The connector assumes subtyping to a limited extent: it treats `int` as a subtype of `double`. In cases where different
+documents use both numeric types with the same field name an automatically-generated connector schema will supply the
+`double` type.
+
+:::
+
+## Queries
+
+The connector supports all query operations; see the [query documentation](/graphql-api/queries/index.mdx) for details.
+
+## Mutations
+
+We currently support mutations in the form of
+[Native Mutations](/connectors/mongodb/native-operations/native-mutations.mdx).
+These are arbitrary MongoDB database commands that you write into the data connector configuration.
+
+[BSON: data types]: https://www.mongodb.com/docs/manual/reference/bson-types/
+[Extended JSON]: https://www.mongodb.com/docs/manual/reference/mongodb-extended-json/