elastisys · aarnq · Aug 26, 2024 · Aug 26, 2024 · Oct 9, 2024 · Ajarmar
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -17,13 +17,15 @@ repos:
       - id: trailing-whitespace
         exclude: ^helmfile\.d/upstream/|^tests/.*/resources/
 
+  # TODO: Migrate to same config as compliantkubernetes repo.
   - repo: https://github.com/igorshubovych/markdownlint-cli
     rev: v0.39.0
     hooks:
       - id: markdownlint
         name: lint markdown
         args:
           - --disable
+          - MD007 # Unordered list indentation
           - MD013 # Line length
           - MD024 # Multiple headings with the same content
           - MD026 # Trailing punctuation in heading

diff --git a/config/schemas/README.md b/config/schemas/README.md
@@ -109,3 +109,102 @@ This can be enabled this in other repositories by editing the file `.vscode/sett
   }
 }
 ```
+
+## Guidelines
+
+The uses for the schema is to provide structural and conditional validation here in this repository, and to provide reference documentation in the public documentation repository.
+To make it easier to read, write, and use the schema we have a set of guidelines to make the most out of it that will follow below.
+
+> [!note]
+> The word "schema" here refers to any part of the schema, even subschemas, or schemas in definitions, etc.
+>
+> The example keys given are within the schema, shortened by stripping any `properties` and `items` components.
+
+### Metadata
+
+These guidelines aims to help to create better quality reference documentation.
-These guidelines aims to help to create better quality reference documentation.
+These guidelines aim to help in creating better quality reference documentation.
-These guidelines aims to help to create better quality reference documentation.
+These guidelines aim to help in creating better quality reference documentation.
+
+#### Titles
+
+- Must be provided, else it becomes rendered as `Untitled`.
+- Should be capitalised, as it becomes part of rendered headers `<title> Schema`.
+- Should be short, as it becomes part of rendered names.
+- Should be based on the key of the schema, as it becomes more intuitive.
+- Should include the context of the schema for common keys, e.g. `Object Storage` and `Object Storage Network Policies`.
+- Should include the context type of the schema last, as it becomes part of rendered headers `<title> <type> Schema`.
+
+#### Descriptions
+
+- Must be [GitHub Flavoured Markdown (GFM)](https://github.github.com/gfm/), as it will be rendered down into markdown documentation.
+- Should be provided for all complex types, `array` and `object` with `items` or `properties` respectively, and other types of note.
+- Should be written with full sentences, including proper punctuation.
+- Should include a leading explanation of what the option does and affect.
+    - For the root of larger schemas, e.g:
+        - `.dex` - "Configure Dex, the federated OpenID connect identity provider."
+    - For the root of smaller schemas, e.g.:
+        - `.dex.expiry` - "Configure expiry when authenticating via Dex."
+    - For the leaf of schemas, e.g.:
+        - `.dex.expiry.idToken` - "Configure expiry of id tokens."
+- Should include a follow up explanation for more details when required, e.g:
+    - `.grafana` - "Compliant Kubernetes hosts two instances of Grafana one for the Platform Administrator and one for the Application Developer"
+- Should include a follow up explanation for conditionals, e.g:
+    - `.networkPolicies.ingressNginx.ingressOverride` - "When enabled a list of IPs must be set that should override the allowed ingress traffic."
+- Should include a follow up explanation for requirements, e.g:
+    - `.velero` - "This requires that `.objectStorage` is configured, and will use the bucket or container set in `.objectStorage.buckets.velero`."
+- Should include a reference to upstream documentation, and applicable mapping rules, wrapped in a note GFM alert, e.g.:
+    - `.fluentd.forwarder.buffer` - "See \[the upstream documentation for reference\]\(\<link\>\), set keys will converted from `camelCase` to `snake_case` as required."
+- Should include a note about which cluster it is applicable, either in schema roots as a general case or in schema leafs as an exception, wrapped in a note GFM alert:
+    - For the root of larger schemas, e.g.:
+        - `.dex` - "Dex is installed in the service cluster, therefore this configuration mainly applies there."
+    - For the leaf of schemas, e.g.:
+        - `.dex.subdomain` - "Must be set for both service and workload clusters."
+
+#### Defaults and examples
+
+- Must be provided for all simple types, all `array` types, all compositional, conditional, and reusable `object` types.
+- Must be valid input to the schema, with the exception of `set-me`.
+- Should degrade gracefully, e.g. disabling optional components rather than enabling.
+
+### Structural
+
+These guidelines aims to help to create schemas that are easier to read and write.
-These guidelines aims to help to create schemas that are easier to read and write.
+These guidelines aim to help in creating schemas that are easier to read and write.
-These guidelines aims to help to create schemas that are easier to read and write.
+These guidelines aim to help in creating schemas that are easier to read and write.
+
+- All schemas must define a type.
+- All schemas should define a single type to not cause type confusion.
+- All schemas should use `additionalProperties: false` when no additional properties should be defined to make it easier to find misspelled keys.
+
+#### Defines
+
+Defines here refer to the use of `$defs` to define schemas, and `$ref`:s to use schemas.
+This is usable in two situations:
+
+1. To reuse common schemas
+2. To reduce and flatten nested schemas
+
+- Must be grouped, so documentation can be generated for it, e.g.:
+    - `.opensearch.$defs.roles` is not used on its own, but contains schemas that are reused.
+- Must be flattened, so documentation can be generated for it, e.g.:
+    - `.opensearch.$defs.roles` contains schemas, but those schemas does not define subschemas instead they use `$refs` to other schemas under `.opensearch.$defs.roles`.
+- Should be used whenever possible to reuse common schemas, e.g. to have common definitions for similar things like `.grafana.ops` and `.grafana.user`.
+- Should be used whenever needed to reduce nested schemas, e.g. to reduce the generated names of a schema which is composed of the path to it within the schema.
+- Should be defined under `$defs` local to where they are used, e.g.:
+    - `.opensearch.$defs.node` with subschemas reused by `.opensearch.master`, `.opensearch.data`, and `.opensearch.client` node.
+    - `.opensearch.$defs.roles` with subschemas reduced and flattened for `.opensearch.extraRoles`.
+- Should **not** define titles, descriptions, defaults, or examples unless they are universally applicable.
+    - Else it will not be possible to override this metadata when referenced.
+
+#### Imports
+
+Imports here refers to the use of `$defs` to define schemas fetched from upstream sources.
+The same guidelines for defines are applicable here as well.
+
+- Must include a comment under the key `$comment` in the schema with the stanza "Schema imported from \[\<upstream-name\>\]\(\<upstream-link\>\).".
+- Must include a reference to upstream documentation.
+
+#### Compositions and Conditions
+
+Compositions and conditions here refers to the use of `allOf`, `anyOf`, and `oneOf`, etc. to define schema compositions and conditions.
+
+- Schemas with compositions and conditions must set defaults because otherwise they cannot be fully resolved.
+- Schemas with compositions and conditions should set examples because otherwise they cannot be fully resolved.
+- Should not use `if` `then` `else` as it has less support in tooling, and has a semantic that is generally different from other schema constructs.