Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON schema contribution guide #2248

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -17,13 +17,15 @@ repos:
- id: trailing-whitespace
exclude: ^helmfile\.d/upstream/|^tests/.*/resources/

# TODO: Migrate to same config as compliantkubernetes repo.
- repo: https://github.com/igorshubovych/markdownlint-cli
rev: v0.39.0
hooks:
- id: markdownlint
name: lint markdown
args:
- --disable
- MD007 # Unordered list indentation
- MD013 # Line length
- MD024 # Multiple headings with the same content
- MD026 # Trailing punctuation in heading
Expand Down
99 changes: 99 additions & 0 deletions config/schemas/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -109,3 +109,102 @@ This can be enabled this in other repositories by editing the file `.vscode/sett
}
}
```

## Guidelines

The uses for the schema is to provide structural and conditional validation here in this repository, and to provide reference documentation in the public documentation repository.
To make it easier to read, write, and use the schema we have a set of guidelines to make the most out of it that will follow below.

> [!note]
> The word "schema" here refers to any part of the schema, even subschemas, or schemas in definitions, etc.
>
> The example keys given are within the schema, shortened by stripping any `properties` and `items` components.

### Metadata

These guidelines aims to help to create better quality reference documentation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
These guidelines aims to help to create better quality reference documentation.
These guidelines aim to help in creating better quality reference documentation.

Minor wording nit.


#### Titles

- Must be provided, else it becomes rendered as `Untitled`.
- Should be capitalised, as it becomes part of rendered headers `<title> Schema`.
- Should be short, as it becomes part of rendered names.
- Should be based on the key of the schema, as it becomes more intuitive.
- Should include the context of the schema for common keys, e.g. `Object Storage` and `Object Storage Network Policies`.
- Should include the context type of the schema last, as it becomes part of rendered headers `<title> <type> Schema`.

#### Descriptions

- Must be [GitHub Flavoured Markdown (GFM)](https://github.github.com/gfm/), as it will be rendered down into markdown documentation.
- Should be provided for all complex types, `array` and `object` with `items` or `properties` respectively, and other types of note.
- Should be written with full sentences, including proper punctuation.
- Should include a leading explanation of what the option does and affect.
- For the root of larger schemas, e.g:
- `.dex` - "Configure Dex, the federated OpenID connect identity provider."
- For the root of smaller schemas, e.g.:
- `.dex.expiry` - "Configure expiry when authenticating via Dex."
- For the leaf of schemas, e.g.:
- `.dex.expiry.idToken` - "Configure expiry of id tokens."
- Should include a follow up explanation for more details when required, e.g:
- `.grafana` - "Compliant Kubernetes hosts two instances of Grafana one for the Platform Administrator and one for the Application Developer"
- Should include a follow up explanation for conditionals, e.g:
- `.networkPolicies.ingressNginx.ingressOverride` - "When enabled a list of IPs must be set that should override the allowed ingress traffic."
- Should include a follow up explanation for requirements, e.g:
- `.velero` - "This requires that `.objectStorage` is configured, and will use the bucket or container set in `.objectStorage.buckets.velero`."
- Should include a reference to upstream documentation, and applicable mapping rules, wrapped in a note GFM alert, e.g.:
- `.fluentd.forwarder.buffer` - "See \[the upstream documentation for reference\]\(\<link\>\), set keys will converted from `camelCase` to `snake_case` as required."
- Should include a note about which cluster it is applicable, either in schema roots as a general case or in schema leafs as an exception, wrapped in a note GFM alert:
- For the root of larger schemas, e.g.:
- `.dex` - "Dex is installed in the service cluster, therefore this configuration mainly applies there."
- For the leaf of schemas, e.g.:
- `.dex.subdomain` - "Must be set for both service and workload clusters."

#### Defaults and examples

- Must be provided for all simple types, all `array` types, all compositional, conditional, and reusable `object` types.
- Must be valid input to the schema, with the exception of `set-me`.
- Should degrade gracefully, e.g. disabling optional components rather than enabling.

### Structural

These guidelines aims to help to create schemas that are easier to read and write.
aarnq marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
These guidelines aims to help to create schemas that are easier to read and write.
These guidelines aim to help in creating schemas that are easier to read and write.


- All schemas must define a type.
- All schemas should define a single type to not cause type confusion.
- All schemas should use `additionalProperties: false` when no additional properties should be defined to make it easier to find misspelled keys.

#### Defines

Defines here refer to the use of `$defs` to define schemas, and `$ref`:s to use schemas.
This is usable in two situations:

1. To reuse common schemas
2. To reduce and flatten nested schemas

- Must be grouped, so documentation can be generated for it, e.g.:
- `.opensearch.$defs.roles` is not used on its own, but contains schemas that are reused.
- Must be flattened, so documentation can be generated for it, e.g.:
- `.opensearch.$defs.roles` contains schemas, but those schemas does not define subschemas instead they use `$refs` to other schemas under `.opensearch.$defs.roles`.
- Should be used whenever possible to reuse common schemas, e.g. to have common definitions for similar things like `.grafana.ops` and `.grafana.user`.
- Should be used whenever needed to reduce nested schemas, e.g. to reduce the generated names of a schema which is composed of the path to it within the schema.
- Should be defined under `$defs` local to where they are used, e.g.:
- `.opensearch.$defs.node` with subschemas reused by `.opensearch.master`, `.opensearch.data`, and `.opensearch.client` node.
- `.opensearch.$defs.roles` with subschemas reduced and flattened for `.opensearch.extraRoles`.
- Should **not** define titles, descriptions, defaults, or examples unless they are universally applicable.
- Else it will not be possible to override this metadata when referenced.

#### Imports

Imports here refers to the use of `$defs` to define schemas fetched from upstream sources.
The same guidelines for defines are applicable here as well.

- Must include a comment under the key `$comment` in the schema with the stanza "Schema imported from \[\<upstream-name\>\]\(\<upstream-link\>\).".
- Must include a reference to upstream documentation.

#### Compositions and Conditions

Compositions and conditions here refers to the use of `allOf`, `anyOf`, and `oneOf`, etc. to define schema compositions and conditions.

- Schemas with compositions and conditions must set defaults because otherwise they cannot be fully resolved.
- Schemas with compositions and conditions should set examples because otherwise they cannot be fully resolved.
- Should not use `if` `then` `else` as it has less support in tooling, and has a semantic that is generally different from other schema constructs.
Comment on lines +206 to +210
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it might be good to include some examples in this section? I think conditionals can get kind of advanced, and there aren't that many examples to look at in our existing schemas.