JSON Schema Tutorial

Acknowledgments

Thanks to Jason Desrosiers and Clemens Uhlenhut for their clarifications around the default keyword [1].

To Do

  1. Explain how to use this validator in this section

  2. Add a section on const before this section

  3. Add a section about definition vs. declaration

  4. Check the use of the $schema property in this section

Introduction

MIME Type : application/schema+json

No specific file extension is defined yet [2] [3].

As explained on the JSON Schema web site and in the JSON Schema GitHub repository, there are 3 specifications around JSON Schema :

Confusingly, these 3 documents can also be found at :

Warning
At the time of this writing ( January 2019 ), the status of all these specifications is Internet-Draft.
As explained here, the current draft is Draft 07.

Conventions

Here are the conventions used in this document :

this is a JSON Schema document
this is an inconsistent JSON Schema document
this is a valid JSON document
this is an invalid JSON document

Useful Tools

A lot of useful tools are listed on the JSON Schema web site.

Online Validators

Warning
Francis Galiegue’s validator only validates up to Draft 4 schemas.

Libraries

The only problem with these online validators is that they can’t handle schemas that are split into multiple files.
For that you need to use a JSON Schema validation library written in your favourite language [4].

Java
  1. Francis Galiegue’s JSON Schema Validator

    An instance is validated against a given schema using the following command :

    java -jar json-schema-validator-2.2.6-lib.jar schema.json instance.json

    The jar is downloaded from here as mentioned in the "Full" jar; command line section [5].

    Warning
    Unfortunately, this only validates up to Draft 4 schemas.
  2. JSON Schema Validator for Java

JavaScript
  1. Another Json Validator

    Install using npm install -g ajv-cli as mentioned here.

    • If you have a standalone schema, validate using ajv -s schemas/standalone.schema.json -d "examples/instance.json".

    • If you have a set of schemas, validate using ajv -s schemas/primary.schema.json -r schemas/linked.schema.json -d "examples/instance.json".

    Note
    Don’t forget to replace the paths mentioned in the above commands with your own paths !

The Simplest Schema Ever !

{}

Any well-formed JSON text will pass the validation against the above schema :

false
42
false
"string"
null
{ "key" : "value" }
[ "value1", 12, { "key" : "value" } ]
Note
RFC 7159 and Standard ECMA-404 : The JSON Data Interchange Format mention that the first four cases are valid even though certain previous specifications of JSON constrained a JSON text to be an object or an array ( See RFC 4627 ).
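
As a quick check outside of any validator, the following Python sketch parses the texts listed above with the standard library json module. The variable names are illustrative only. Since the empty schema adds no constraint, being well-formed is all that is required.

```python
import json

# The JSON texts listed above; each one is a complete JSON document.
texts = [
    'false',
    '42',
    '"string"',
    'null',
    '{ "key" : "value" }',
    '[ "value1", 12, { "key" : "value" } ]',
]

# json.loads raises json.JSONDecodeError on malformed input,
# so building this list without an exception means every text is well-formed.
parsed = [json.loads(text) for text in texts]

for text, value in zip(texts, parsed):
    print(f"{text:45} -> {type(value).__name__}")
```

Scalars such as false, 42 or null are accepted as complete documents, which is the behaviour described in the note above.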

Is this even a schema ?

A JSON Schema is just a JSON document that conforms to the JSON Schema’s Schema.

A $schema keyword [6] can be used to explicitly specify that a JSON document is a schema.

{ "$schema": "http://json-schema.org/draft-07/schema#" } (1)
  1. The value specifies the version of the specification & the location of the schema

Note that you can specify the version of the specification or even the specification the schema adheres to :

  • http://json-schema.org/hyper-schema#

JSON Schema hyperschema written against the current version of the specification.

  • http://json-schema.org/draft-04/schema#

JSON Schema written against draft 4 of the specification.

Specifying Types

The type keyword is used to specify the type of a value or a structure :

Schema :

{ "type" : "string" }

Instances :

"string"
invalid
42

The valid values for the type keyword are :

  • string

  • integer and number [7]

  • boolean [ true, false ]

  • object and array

  • null [ null ]

You have a choice

The type keyword can have a value that is an array of the allowed types.

Schema :

{ "type": ["number", "string"] } (1)
  1. Note that this specifies that the value can be a number or a string.

Instances :

42
invalid
false
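
The type check itself can be sketched in a few lines of Python. This is a hand-rolled illustration, not a conforming validator : the function names matches_type and check_type are hypothetical, and the integer rule assumes draft 6 or later, where a number with a zero fractional part such as 1.0 counts as an integer.

```python
def matches_type(value, type_name):
    """Return True if a parsed JSON value has the given JSON Schema type."""
    if type_name == "null":
        return value is None
    if type_name == "boolean":
        return isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "array":
        return isinstance(value, list)
    if type_name == "object":
        return isinstance(value, dict)
    # bool is a subclass of int in Python, so it must be excluded explicitly.
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "number":
        return is_number
    if type_name == "integer":
        # Assumption: draft 6 and later, where 1.0 is a valid integer.
        return is_number and (isinstance(value, int) or value.is_integer())
    raise ValueError(f"unknown type: {type_name}")


def check_type(value, type_keyword):
    """Accept either a single type name or an array of type names."""
    names = [type_keyword] if isinstance(type_keyword, str) else type_keyword
    return any(matches_type(value, name) for name in names)


print(check_type(42, ["number", "string"]))     # a number is allowed
print(check_type(False, ["number", "string"]))  # a boolean is not
```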

Be more specific

The enum keyword can be used in conjunction with the type keyword to restrict the set of valid values to a subset of the valid values for the type.

Schema :

{
    "type": "string",
    "enum": ["red", "amber", "green"]
}

Instances :

"red"
invalid
"black"

Consistency

If the enum keyword is used in conjunction with the type keyword, the values specified should be valid values for the type.

Schema :

inconsistent
{
    "type": "number",
    "enum": ["zero", 1, 2]
}

Instances :

invalid
"zero"

No Type

The enum keyword can be used on its own.
In this case the set of valid values can be of any type.

Schema :

{
    "enum": ["zero", 1, 2.0, null]
}

Instances :

"zero"
null
1
1.0
2
Note
The last 2 cases are valid because JSON, as opposed to JSON Schema, does not distinguish between a number and an integer : 1.0 is equal to 1 and 2 is equal to 2.0.
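
The comparison behind enum can be sketched as follows. This is only an illustration under stated assumptions : json_equal and in_enum are hypothetical names, numbers are compared by mathematical value as in draft 6 and later, and only scalar values are compared strictly ( nested arrays and objects fall back to Python equality ).

```python
def json_equal(a, b):
    """Compare two parsed JSON scalars the way enum does."""
    # Booleans are only equal to booleans (Python alone would say True == 1).
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    # Numbers are compared by value, so 1 is equal to 1.0.
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a == b
    return type(a) is type(b) and a == b


def in_enum(value, allowed):
    """Return True if value is equal to one of the enumerated values."""
    return any(json_equal(value, candidate) for candidate in allowed)


allowed = ["zero", 1, 2.0, None]
print(in_enum(1.0, allowed), in_enum(2, allowed), in_enum(True, allowed))
```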

Specific Arrays

The enum keyword can be used to enumerate valid arrays.

Schema :

{
    "type": "array",
    "enum": [ ["A", "B"], [1,2] ]
}

Instances :

["A", "B"]
invalid
["A"]

Excluding Types

The not keyword can be used to specify that a document is valid if it doesn’t conform to a certain schema.
The value must be a schema.

Schema :

{
    "not" : {
        "type": "string",
        "enum": ["red", "amber", "green"]
    }
}

or

{
    "type": "string",
    "not" : {
        "enum": ["red", "amber", "green"]
    }
}

Instances :

"black"
invalid
"red"

Specifying Formats

The format keyword [8] can be used to define specific formats.
All the current built-in formats apply to the string type :

  • date-time

Schema :

{
    "type": "string",
    "format": "date-time"
}

Instances :

"2015-11-11T23:45:00Z"
invalid
"2015-11-11T23:45:00"
  • date

Schema :

{
    "type": "string",
    "format": "date"
}

Instances :

"2015-11-11"
invalid
"2015-11-11T23:45:00Z"
  • time

  • email and idn-email

  • hostname and idn-hostname

  • ipv4 and ipv6

  • uri, uri-reference, iri and iri-reference

  • uri-template

  • json-pointer and relative-json-pointer

  • regex [9]

Warning

Note that there are significant differences between drafts regarding formats.
You should therefore validate that the draft you are using supports the specified format.

For example, draft 4 of the specification :

  • doesn’t mention the date, time, utc-millisec, regex, color, style or phone formats,

  • renames ip-address to ipv4 and host-name to hostname,

  • only mentions string formats.

User-defined formats

It is not possible to define your own format à la RELAX NG.

Specifying Constraints

The following keywords can be used to further constrain the set of valid values within the specified type.

string

  • minLength and maxLength

Schema :

{
    "type": "string",
    "minLength": 2,
    "maxLength": 3
}

Instances :

"AB"
invalid
"A"

  • pattern

Schema :

{
    "type": "string",
    "pattern": "^(\\([0-9]{3}\\))?[0-9]{3}-[0-9]{4}$"
}

Instances :

"(888)555-1212"
invalid
"(888)5551212"

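The string keywords above can be sketched in Python as follows. This is an illustration rather than a conforming validator : check_string is a hypothetical name, and Python's re module is used as a stand-in for the ECMA 262 regular expression dialect that JSON Schema refers to, which is an assumption that holds for simple patterns like this one.

```python
import re


def check_string(value, schema):
    """Apply minLength, maxLength and pattern to a parsed JSON value."""
    if not isinstance(value, str):
        return True  # these keywords only constrain strings
    # len() counts Unicode code points, not bytes.
    if "minLength" in schema and len(value) < schema["minLength"]:
        return False
    if "maxLength" in schema and len(value) > schema["maxLength"]:
        return False
    # A pattern is not implicitly anchored, hence re.search and not re.fullmatch.
    if "pattern" in schema and re.search(schema["pattern"], value) is None:
        return False
    return True


# The pattern from the schema above, once the JSON string escapes are decoded.
phone = {"pattern": r"^(\([0-9]{3}\))?[0-9]{3}-[0-9]{4}$"}

print(check_string("(888)555-1212", phone))
print(check_string("(888)5551212", phone))
```

Because patterns are not anchored, the ^ and $ in the schema are what force the whole string to match.
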
integer and number

  • multipleOf

  • minimum, exclusiveMinimum, maximum and exclusiveMaximum

Schema :

{
    "type": "number",
    "multipleOf" : 1.5,
    "minimum": 1.5,
    "exclusiveMaximum": 6
}

Instances :

1.5
3
invalid
6.0
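
The numeric keywords can be sketched as follows. This is a minimal illustration, assuming the draft 6 and later form in which exclusiveMinimum and exclusiveMaximum are themselves numbers; check_number is a hypothetical name. Exact rational arithmetic is used for multipleOf because binary floating point would otherwise reject cases such as 0.3 being a multiple of 0.1.

```python
from fractions import Fraction


def check_number(value, schema):
    """Apply multipleOf and the four bound keywords to a parsed JSON value."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return True  # these keywords only constrain numbers
    if "multipleOf" in schema:
        # Fractions built from the decimal text avoid float rounding errors.
        ratio = Fraction(str(value)) / Fraction(str(schema["multipleOf"]))
        if ratio.denominator != 1:
            return False
    if "minimum" in schema and value < schema["minimum"]:
        return False
    if "maximum" in schema and value > schema["maximum"]:
        return False
    if "exclusiveMinimum" in schema and value <= schema["exclusiveMinimum"]:
        return False
    if "exclusiveMaximum" in schema and value >= schema["exclusiveMaximum"]:
        return False
    return True


bounds = {"multipleOf": 1.5, "minimum": 1.5, "exclusiveMaximum": 6}
print([check_number(v, bounds) for v in (1.5, 3, 6.0)])
```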

Combining Schemas

Schemas can be combined to create more complex schemas using the allOf, anyOf and oneOf keywords.
The value must be an array of schemas.

  • anyOf

Schema :

{
    "anyOf": [
        { "type": "string", "maxLength": 5 },
        { "type": "integer", "maximum": 99999 }
    ]
}

Instances :

"413"
"test"
413
invalid
100000
invalid
"100000"
Tip

The anyOf keyword can be used to allow a single schema to validate either a list of items or a single item, as shown below :

{
    "definitions": {
        "plan": {
            ...
        }
    },
    "anyOf": [
        {
            "type": "array",
            "items": { "$ref": "#/definitions/plan" },
            "additionalProperties": false
        },
        { "$ref": "#/definitions/plan" }
    ]
}
  • allOf

Schema :

{
    "allOf": [
        { "type": "string", "maxLength": 5 },
        { "type": "string", "minLength": 2 }
    ]
}

Instances :

"413"
invalid
"1"

Schema :

inconsistent
{
    "allOf": [
        { "type": "string", "maxLength": 5 },
        { "type": "integer", "maximum": 99999 }
    ]
}

The combined schemas must be compatible, since the value has to adhere to all of them at the same time : here, no value can be both a string and an integer, so no instance is valid.

  • oneOf

Schema :

{
    "oneOf": [
        { "type": "number", "multipleOf": 5 },
        { "type": "number", "multipleOf": 3 }
    ]
}

Instances :

10
invalid
15
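
The difference between the three keywords comes down to how many sub-schemas must match. In the sketch below, each sub-schema is stood in for by a plain Python predicate; the function names are hypothetical and this is not how a real validator represents schemas.

```python
def any_of(checks, value):
    """Valid if at least one sub-schema matches."""
    return any(check(value) for check in checks)


def all_of(checks, value):
    """Valid if every sub-schema matches."""
    return all(check(value) for check in checks)


def one_of(checks, value):
    """Valid if exactly one sub-schema matches."""
    return sum(1 for check in checks if check(value)) == 1


def is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def multiple_of_5(value):
    return is_number(value) and value % 5 == 0


def multiple_of_3(value):
    return is_number(value) and value % 3 == 0


checks = [multiple_of_5, multiple_of_3]
print(one_of(checks, 10))  # only the first sub-schema matches
print(one_of(checks, 15))  # both match, so oneOf rejects it
```

Note that 15 would be accepted by anyOf and by allOf with the same two sub-schemas; only oneOf rejects it.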

Defining Arrays

Element Types

The items keyword is used to describe array elements.
The value must be a schema.

The element schema is written in the same way as the schemas shown above.

Schema :

{
    "type": "array",
    "items": {
        "type": "number"
    }
}

Instances :

[1, 2, 3, 4, 5]
[]
invalid
["1", "2", "3", "4", "5"]

Schema :

{
    "type": "array",
    "items": {
        "type": "string",
        "format": "date"
    }
}

Instances :

["2015-11-11", "2015-11-12", "2015-11-13", "2015-11-14", "2015-11-15"]

Schema :

{
    "type": "array",
    "items": {
        "type": ["number", "string"]
    }
}

Instances :

[1, 2, 3, 4, 5]
["1", "2", "3", "4", "5"]
["1", 2, "3", 4, "5"]

Schema :

{
    "type": "array",
    "items": {
        "type": "string",
        "enum": ["red", "amber", "green"]
    }
}

Instances :

["red", "green"]
invalid
["red", "blue"]

Schema :

{
    "type": "array",
    "items": {
        "type": "string",
        "minLength": 2,
        "maxLength": 3
    }
}

Instances :

["AA", "AB"]
invalid
["A", "AA"]

List Size

The size of the array can be specified using minItems and maxItems.

Schema :

{
    "type": "array",
    "minItems": 2,
    "maxItems": 3,
    "items": {
        "type": "string"
    }
}

Instances :

["AA", "AB"]
invalid
["AA"]
Tip
Most of the time, it is useful to have minItems set to 1.
This avoids the confusion caused by a property whose value is an empty array ( [] ), which is usually better represented by a missing property.

Sets

It is possible to mandate that each element in the list be unique using the uniqueItems keyword.

Schema :

{
    "type": "array",
    "uniqueItems": true
}

Instances :

["AA", "AB"]
invalid
["AA", "AA"]

Note that the unique items can be arrays or objects.

Tip
Two objects are considered distinct if at least one of their properties is different; the order of the properties is irrelevant.
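
The uniqueness check can be sketched as a pairwise comparison. This is a minimal illustration with hypothetical function names : it relies on Python dictionaries comparing equal regardless of key order, and it treats numbers by value, as in draft 6 and later.

```python
def json_equal(a, b):
    """Deep equality between two parsed JSON values."""
    # Booleans are only equal to booleans (Python alone would say True == 1).
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a == b
    if type(a) is not type(b):
        return False
    if isinstance(a, list):
        return len(a) == len(b) and all(json_equal(x, y) for x, y in zip(a, b))
    if isinstance(a, dict):
        # Same set of names and equal values; property order plays no role.
        return a.keys() == b.keys() and all(json_equal(a[k], b[k]) for k in a)
    return a == b


def unique_items(array):
    """Return True if no two elements of the array are equal."""
    for index, left in enumerate(array):
        for right in array[index + 1:]:
            if json_equal(left, right):
                return False
    return True


print(unique_items([{"a": 1, "b": 2}, {"b": 2, "a": 1}]))  # same object twice
```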

Tuples

A tuple is an array where each item has a different meaning and therefore type, similar to a database row.
To cater for this, the value of the items keyword can be an array of schemas instead of a single schema.

Schema :

{
    "type": "array",
    "items": [
        {
            "type": "string",
            "enum": ["maths", "physics", "french", "other"]
        },
        {
            "type": "number"
        }
    ]
}

Instances :

["maths", 82.5]
invalid
["english"]

But, as opposed to objects where property order is irrelevant, here, order matters !

invalid
[82.5, "maths"]

But, as is the case with objects, nothing is mandatory by default :

["maths"]
Caution
Unfortunately, as opposed to objects where required properties can be specified by name, there is no keyword to specify which elements of the tuple are required; the closest equivalent is minItems, which makes the first n elements mandatory.

But, as is the case with objects, additional elements are allowed by default :

["maths", 82.5, "additional text"]

Strict Definition

The additionalItems keyword is used, in tuples, to enforce that only elements specified in the schemas are allowed to appear.

Schema :

{
    "type": "array",
    "items": [
        {
            "type": "string",
            "enum": ["maths", "physics", "french", "other"]
        },
        {
            "type": "number"
        }
    ],
    "additionalItems" : false
}

Instances :

invalid
["maths", 82.5, "additional text"]

Looser Definition

Conform to a Schema

It is possible, in tuples, to allow only additional items that conform to a given schema.

In this case, the value of the additionalItems keyword must be a schema.

Schema :

{
    "type": "array",
    "items": [
        {
            "type": "string",
            "enum": ["maths", "physics", "french", "other"]
        },
        {
            "type": "number"
        }
    ],
    "additionalItems" : {
        "type": "string",
        "format": "date-time"
    }
}

Instances :

["maths", 82.5, "2015-11-11T23:45:00Z"]
invalid
["maths", 82.5, "additional text"]
Tip
The additionalItems keyword is only meaningful with tuples.
It wouldn’t make sense to use it with arrays since the schema specified by the items keyword already applies to every element of the array, so additionalItems is ignored in that case.
Arrays behave as if there were an implicit additionalItems property set to false.
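
The tuple rules described in this section can be summarised in a short Python sketch. Each positional schema is stood in for by a predicate; check_tuple and the helper names are hypothetical, and the additional_items argument mirrors the three possible values of additionalItems ( absent or true, false, or a schema ).

```python
def check_tuple(instance, item_checks, additional_items=True):
    """Validate an array against positional checks, tuple style."""
    if not isinstance(instance, list):
        return True  # items and additionalItems only constrain arrays
    for position, element in enumerate(instance):
        if position < len(item_checks):
            # Order matters: the n-th element is checked by the n-th schema.
            if not item_checks[position](element):
                return False
        elif additional_items is False:
            return False  # strict definition: nothing beyond the tuple
        elif additional_items is not True:
            if not additional_items(element):
                return False  # extra elements must conform to the schema
    return True  # missing trailing elements are fine: nothing is mandatory


def is_subject(value):
    return isinstance(value, str) and value in ("maths", "physics", "french", "other")


def is_mark(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


grade = [is_subject, is_mark]
print(check_tuple(["maths", 82.5, "additional text"], grade))
print(check_tuple(["maths", 82.5, "additional text"], grade, additional_items=False))
```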

Specifying Structures

The object type is the only structured type whose structure is user-defined.

What’s the structure ?

The properties keyword is used to define the structure of an object.

Schema :

{
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "gender": { "type": "string", "enum": ["male", "female"] },
        "birthday": { "type": "string", "format": "date" }
    }
}

Instances :

{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24"
}

As you can see, order is not enforced :

{
    "gender": "male",
    "name": "aowss",
    "birthday": "1973-01-24"
}

As you can see, nothing is mandatory :

{}

As you can see, you can add properties :

{
    "name": "aowss",
    "gender": "male",
    "nationality": "french",
    "birthday": "1973-01-24"
}
invalid
{
    "name": "aowss",
    "gender": "male",
    "birthday": false (1)
}
  1. the birthday property has been declared to be of type string in the schema and the instance specifies a boolean property.

What is required ?

The required keyword is used to specify which properties are mandatory.

Note
This is different from XML Schema where elements are mandatory by default.

Schema :

{
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "gender": { "type": "string", "enum": ["male", "female"] },
        "birthday": { "type": "string", "format": "date" }
    },
    "additionalProperties": false,
    "required": ["name", "gender"]
}

Instances :

{
    "name": "aowss",
    "gender": "male"
}
invalid
{} (1)
  1. The schema declares that name and gender are mandatory and the instance doesn’t specify these properties.

Enforce Order

It is currently not possible to enforce order.

Note
There is no equivalent to XML Schema’s sequence keyword.

Strict Definition

The additionalProperties keyword is used to enforce that only properties specified in the schema are allowed to appear.

Schema :

{
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "gender": { "type": "string", "enum": ["male", "female"] },
        "birthday": { "type": "string", "format": "date" }
    },
    "additionalProperties": false
}

Instances :

invalid
{
    "name": "aowss",
    "gender": "male",
    "nationality": "french", (1)
    "birthday": "1973-01-24"
}
  1. The schema doesn’t allow any property that has not been declared to appear in the instance.
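
The way properties, required and additionalProperties interact can be sketched as follows. This is an illustration only : check_object is a hypothetical name, property schemas are stood in for by predicates, and the date format of birthday is deliberately left unchecked.

```python
def check_object(instance, property_checks, required=(), allow_additional=True):
    """Validate an object against declared properties."""
    if not isinstance(instance, dict):
        return True  # these keywords only constrain objects
    if any(name not in instance for name in required):
        return False  # a mandatory property is missing
    for name, value in instance.items():
        if name in property_checks:
            if not property_checks[name](value):
                return False  # declared property with the wrong value
        elif not allow_additional:
            return False  # undeclared property in a strict definition
    return True


def is_string(value):
    return isinstance(value, str)


person = {
    "name": is_string,
    "gender": lambda value: value in ("male", "female"),
    "birthday": is_string,  # assumption: the date format is not checked here
}

extra = {"name": "aowss", "gender": "male", "nationality": "french"}
print(check_object(extra, person))                          # additional property allowed
print(check_object(extra, person, allow_additional=False))  # strict definition
```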

Looser Definition

Conform to a Schema

As is the case with tuples, it is possible to allow only additional properties that conform to a given schema.

In this case, the value of the additionalProperties keyword must be a schema.

Schema :

{
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "gender": { "type": "string", "enum": ["male", "female"] }
    },
    "additionalProperties": { "type": "string", "format": "date" }
}

Instances :

{
    "name": "aowss",
    "gender": "male",
    "dob": "1973-01-24"
}
invalid
{
    "name": "aowss",
    "gender": "male",
    "dob": 1973 (1)
}
  1. The schema allows non declared properties to be specified in the instance but their type must be string and their format must be date.

Restrict the number

The minProperties and maxProperties keywords are used to enforce the number of properties.

Schema :

{
    "type": "object",
    "minProperties": 2,
    "maxProperties": 3
}

Instances :

{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24"
}
invalid
{
    "name": "aowss",
    "gender": "male",
    "nationality": "french",
    "birthday": "1973-01-24" (1)
}
  1. The schema doesn’t allow for more than 3 properties.

The value of the maxProperties keyword must be greater than or equal to the number of required properties :

Schema :

inconsistent
{
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "gender": { "type": "string", "enum": ["male", "female"] },
        "birthday": { "type": "string", "format": "date" },
        "nationality": { "type": "string", "default": "french" }
    },
    "additionalProperties": false,
    "maxProperties": 2, (1)
    "required": ["name", "gender", "nationality"] (1)
}
  1. The maximum number of properties is less than the number of required properties !

If the additionalProperties keyword is specified with a value of false, these keywords only make sense to restrict the number of optional properties that can be specified.

Only these names

The patternProperties keyword is used to enforce a given pattern for the name of a property.

It’s the property’s name that must conform to the specified pattern.

The property’s value must conform to the provided schema.

Allow additional boolean properties that begin with an _ :

Schema :

{
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "gender": { "type": "string", "enum": ["male", "female"] }
    },
    "patternProperties": {
        "^_": { "type": "boolean" }
    },
    "additionalProperties": false
}

Instances :

{
    "name": "aowss",
    "gender": "male",
    "_member": true,
    "_loggedIn": false
}
invalid
{
    "name": "aowss",
    "gender": "male",
    "member": true (1)
}
  1. The schema allows non declared properties to be specified in the instance but their name must begin with _.

Tip
patternProperties can be used in conjunction with additionalProperties.
In that case, additionalProperties will refer to any properties that are not explicitly listed in properties and don’t match any of the patternProperties.
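
The following Python sketch shows which property names end up being treated as additional. The function name is hypothetical, and Python's re module is assumed to behave like the regular expression dialect used by JSON Schema for these simple patterns.

```python
import re


def additional_property_names(instance, properties, pattern_properties):
    """List the names that neither properties nor patternProperties cover."""
    return [
        name
        for name in instance
        if name not in properties
        # Patterns are not anchored, so re.search is the right call.
        and not any(re.search(pattern, name) for pattern in pattern_properties)
    ]


declared = ["name", "gender"]
patterns = ["^_"]

print(additional_property_names(
    {"name": "aowss", "gender": "male", "_member": True, "member": True},
    declared,
    patterns,
))
```

With additionalProperties set to false, the instance is rejected as soon as this list is not empty.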

It’s possible to have more than one pattern specified.

Schema :

{
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "gender": { "type": "string", "enum": ["male", "female"] }
    },
    "patternProperties": {
        "^_": { "type": "boolean" },
        "^-": { "type": "string" }
    },
    "additionalProperties": false
}

Instances :

{
    "name": "aowss",
    "gender": "male",
    "_member": true,
    "-user": "aowss"
}
invalid
{
    "name": "aowss",
    "gender": "male",
    "_member": true,
    "-user": true (1)
}
  1. The schema allows non declared properties with names that begin with - to be specified but their type must be string.

Make sure it’s an object

Caution
Note that if you don’t specify that the type is object, then any other type will be valid.

Schema :

{
     (1)
    "properties": {
        "name": { "type": "string" },
        "gender": { "type": "string", "enum": ["male", "female"] },
        "birthday": { "type": "string", "format": "date" }
    },
    "additionalProperties": false
}
  1. The schema doesn’t specify that the type of the instance must be an object.

Instances :

[ "aowss", "male" ] (1)
  1. Any type is valid, including an array.
    Since this is not an object, it doesn’t have to comply with the schema properties !

{
    "name": "aowss",
    "gender": "male"
}
invalid
{ (1)
    "name": "aowss",
    "gender": "male",
    "nationality": "french", (2)
    "birthday": "1973-01-24"
}
  1. The instance’s type is an object.

  2. The nationality property is not allowed.

If the instance’s type is an object, it must be valid with respect to the schema properties.

Warning
Beware that a lot of examples that use the $ref keyword do not enforce that the type is object !

Simple Cross Validation

The dependencies keyword is used to manage dependencies between properties.

Property Dependency

I need this property if the other property is specified

If the passport number is specified, then we need the nationality.

Schema :

{
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "gender": { "type": "string", "enum": ["male", "female"] },
        "birthday": { "type": "string", "format": "date" },
        "nationality": { "type": "string" },
        "passport": { "type": "string" }
    },
    "additionalProperties": false,
    "required": ["name", "gender", "birthday"],
    "dependencies": {
        "passport": ["nationality"]
    }
}

Note that this means that the passport property requires the nationality property and not the reverse.

Instances :

{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24"
}
{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24",
    "nationality": "french"
}
{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24",
    "passport": "02AA12345",
    "nationality": "french"
}
invalid
{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24",
    "passport": "02AA12345" (1)
     (2)
}
  1. The passport property is specified.

  2. The nationality property is not specified.

In fact, we need both or none of them !

Schema :

{
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "gender": { "type": "string", "enum": ["male", "female"] },
        "birthday": { "type": "string", "format": "date" },
        "nationality": { "type": "string" },
        "passport": { "type": "string" }
    },
    "additionalProperties": false,
    "required": ["name", "gender", "birthday"],
    "dependencies": {
        "passport": ["nationality"],
        "nationality": ["passport"]
    }
}

Instances :

{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24"
     (1)
     (2)
}
  1. The nationality property is not specified.

  2. The passport property is not specified.

invalid
{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24",
    "nationality": "french" (1)
     (2)
}
  1. The nationality property is specified.

  2. The passport property is not specified.
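
Property dependencies boil down to a conditional required check. The sketch below is a minimal illustration; check_dependencies is a hypothetical name and only the array form of the keyword is handled.

```python
def check_dependencies(instance, dependencies):
    """Apply property dependencies: each trigger requires a list of names."""
    if not isinstance(instance, dict):
        return True  # dependencies only constrain objects
    for trigger, needed in dependencies.items():
        # The rule only fires when the triggering property is present.
        if trigger in instance and any(name not in instance for name in needed):
            return False
    return True


one_way = {"passport": ["nationality"]}
both_ways = {"passport": ["nationality"], "nationality": ["passport"]}

only_nationality = {"name": "aowss", "nationality": "french"}
print(check_dependencies(only_nationality, one_way))    # nothing depends on passport
print(check_dependencies(only_nationality, both_ways))  # nationality now needs passport
```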

Different Schemas

If the nationality is specified, we need all passport details to be provided.

Schema :

{
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "gender": { "type": "string", "enum": ["male", "female"] },
        "birthday": { "type": "string", "format": "date" },
        "nationality": { "type": "string" }
    },
    "required": ["name", "gender", "birthday"],
    "dependencies": {
        "nationality": {
            "properties": {
                "passportNumber": { "type": "string" },
                "passportIssueDate": { "type": "string", "format": "date" },
                "passportExpiryDate": { "type": "string", "format": "date" }
            },
            "required": ["passportNumber", "passportIssueDate", "passportExpiryDate"]
        }
    }
}

Note that this means that the nationality property requires the passport properties.

Tip
A more natural way of understanding it is : if the nationality property is specified, then the passport details must be specified.

Instances :

{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24"
}
{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24",
    "nationality": "french",
    "passportNumber": "02AA12345",
    "passportIssueDate": "2011-02-12",
    "passportExpiryDate": "2021-02-11"
}
invalid
{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24",
    "nationality": "french" (1)
     (2)
}
  1. The nationality property is specified.

  2. The passport details are not specified.

Caution
Beware, this requires additional properties !

Note that since the passport properties are now defined in the dependencies section, additionalProperties can’t be set to false at the object level :

Schema :

inconsistent
{
    "type": "object",
    "properties": {
        ...
    },
    "additionalProperties": false, (1)
    "required": ["name", "gender", "birthday"],
    "dependencies": {
        "nationality": {
            "properties": {
                ...
            },
            "required": ["passportNumber", "passportIssueDate", "passportExpiryDate"]
        }
    }
}
  1. The additionalProperties property can’t be set to false since additional properties are defined in the dependencies.

This is different from the case where the dependency was on properties !
In that case, no additional properties were needed : they were all defined in the object schema.

Caution
Annoying side effects !!!

Since additionalProperties can’t be set to false, the following documents are valid :

Schema ( same as above ):

{
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "gender": { "type": "string", "enum": ["male", "female"] },
        "birthday": { "type": "string", "format": "date" },
        "nationality": { "type": "string" }
    },
    "required": ["name", "gender", "birthday"],
    "dependencies": {
        "nationality": {
            "properties": {
                "passportNumber": { "type": "string" },
                "passportIssueDate": { "type": "string", "format": "date" },
                "passportExpiryDate": { "type": "string", "format": "date" }
            },
            "required": ["passportNumber", "passportIssueDate", "passportExpiryDate"]
        }
    }
}

Instances :

The passport properties without the nationality :

{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24",
     (1)
    "passportNumber": "02AA12345",
    "passportIssueDate": "2011-02-12",
    "passportExpiryDate": "2021-02-11"
}
  1. The nationality property is not required since it’s the nationality that requires the passport details and not the opposite.

Some passport properties only :

{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24",
    "passportNumber": "02AA12345"
     (1)
}
  1. The passportIssueDate and passportExpiryDate properties are not required !

Passport properties with a different format :

{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24",
    "passportNumber": 212345 (1)
}
  1. The passportNumber property can be of any type : the dependent schema is only applied when the nationality property is present.

Any additional properties :

{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24",
    "number": "02AA12345" (1)
}
  1. As is always the case when additionalProperties is not set to false, any property is allowed.

Caution
Beware, by default, properties are not required !

If you don’t specify that the passport properties are mandatory, then the dependency is meaningless :

Schema :

{
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "gender": { "type": "string", "enum": ["male", "female"] },
        "birthday": { "type": "string", "format": "date" },
        "nationality": { "type": "string" }
    },
    "required": ["name", "gender", "birthday"],
    "dependencies": {
        "nationality": {
            "properties": {
                "passportNumber": { "type": "string" },
                "passportIssueDate": { "type": "string", "format": "date" },
                "passportExpiryDate": { "type": "string", "format": "date" }
            }
        }
    }
}

Instances :

{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24",
    "nationality": "french"
     (1)
}
  1. Since the dependent schema declares the passport properties without making them required, it’s fine to have none of them.

This is different from the case where the dependency was on properties !
In that case, "dependencies": { "passport": ["nationality"] } effectively meant that the nationality property was required if the passport property was present.
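
The side effects described in this section all follow from one rule : a schema dependency is applied to the whole instance, but only when the triggering property is present. A minimal Python sketch, with hypothetical function names and the date formats left unchecked :

```python
def check_schema_dependency(instance, trigger, dependent_check):
    """Apply a dependent schema only when the trigger property is present."""
    if not isinstance(instance, dict) or trigger not in instance:
        return True  # the dependency is dormant: nothing is checked
    return dependent_check(instance)


def passport_details(instance):
    """Stand-in for the dependent schema: three required string properties."""
    names = ("passportNumber", "passportIssueDate", "passportExpiryDate")
    # instance.get returns None for a missing property, which is not a string.
    return all(isinstance(instance.get(name), str) for name in names)


no_nationality = {"name": "aowss", "passportNumber": 212345}
only_nationality = {"name": "aowss", "nationality": "french"}

print(check_schema_dependency(no_nationality, "nationality", passport_details))
print(check_schema_dependency(only_nationality, "nationality", passport_details))
```

The first instance passes even though passportNumber is a number, because without nationality the dependent schema is never consulted.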

Advanced Cases

This type of array or this type of object

As we have seen above, it is possible to specify :

  • that a value can be one of several types,

  • the schema of an array,

  • the schema of an object.

These three features can be combined in a single schema.

Schema :

{
    "type": ["array", "object"],
    "items": {
        "type": "number"
    },
    "properties": {
        "name": { "type": "string" },
        "gender": { "type": "string", "enum": ["male", "female"] },
        "birthday": { "type": "string", "format": "date" }
    },
    "additionalProperties": false
}

Instances :

{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24"
}
[1, 2, 3, 4, 5]
invalid
{
    "name": "aowss",
    "gender": "male",
    "birthday": "1973-01-24",
    "nationality": "french"
}
invalid
["aowss", "male", "1973-01-24"]

This is using the fact that type can accept a list of acceptable types.

What it really means is that the type must be one of the listed types.
It is therefore more natural, at least in my opinion, to write the above schema as follows :

Schema :

{
    "oneOf" : [
        {
            "type": "array",
            "items": {
                "type": "number"
            }
        },
        {
            "type": "object",
            "properties": {
                "name": { "type": "string" },
                "gender": { "type": "string", "enum": ["male", "female"] },
                "birthday": { "type": "string", "format": "date" }
            },
            "additionalProperties": false
        }
    ]
}

This is also more flexible : you can define any number of arrays and objects or even other types as being acceptable.

In the previous schema, you could only define one array and one object since the matching of the allowed types to the specified schemas was done automatically :

  • the array type is matched to the items definition,

  • the object type is matched to the properties definition.

What about reuse ?

Referencing an existing schema

The $ref keyword is used to reference an existing schema.
The value is a URI reference; the part after the # is usually a JSON Pointer expression.

Schema :

{
    "$schema": "http://json-schema.org/draft-06/schema#",
    "definitions": {
        "passenger": { (2)
            "type": "object",
            "properties": {
                "name" : {
                    "type": "string",
                    "description": "The passenger's first and last name"
                },
                ...
            }
        }
    },
    "type": "object",
    "properties": {
        "passengers": {
            "type": "array",
            "items": {
                "$ref": "#/definitions/passenger" (1)
            },
            "uniqueItems": true
        }
    },
    "additionalProperties": false
}
  1. Reference to another location in this schema

  2. Location referenced by the $ref keyword

Tip

It is customary ( but not required ) to put the referenced schemas in the parent schema under a key called definitions.

The specification says :

This keyword plays no role in validation per se. Its role is to provide a standardized location for schema authors to inline JSON Schemas into a more general schema.

This keyword’s value MUST be an object. Each member value of this object MUST be a valid JSON Schema.

The net effect of using the $ref keyword is that it is logically replaced by what it points to.

Resulting Schema :

{
    "$schema": "http://json-schema.org/draft-06/schema#",
    "type": "object",
    "properties": {
        "passengers": {
            "type": "array",
            "items": { (1)
                "type": "object",
                "properties": {
                    "name" : {
                        "type": "string",
                        "description": "The passenger's first and last name"
                    },
                    ...
                }
            },
            "uniqueItems": true
        }
    },
    "additionalProperties": false
}
  1. The $ref keyword has been replaced by what it points to
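The "logical replacement" described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions : the helper names resolve_pointer and inline_refs are hypothetical, only same-document references are handled, and a recursive schema would make the inlining loop forever.

```python
def resolve_pointer(document, ref):
    """Follow a same-document reference such as '#/definitions/passenger'."""
    if not ref.startswith("#"):
        raise ValueError("this sketch only handles same-document references")
    node = document
    # '#/a/b' -> fragment '/a/b' -> tokens ['a', 'b']; '#' alone is the whole document.
    for token in ref[1:].split("/")[1:]:
        # Undo the two JSON Pointer escapes: ~1 means '/', ~0 means '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def inline_refs(node, root):
    """Return a copy of node where each local $ref is replaced by its target."""
    if isinstance(node, dict):
        if "$ref" in node:
            return inline_refs(resolve_pointer(root, node["$ref"]), root)
        return {key: inline_refs(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [inline_refs(item, root) for item in node]
    return node


schema = {
    "definitions": {
        "passenger": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
        }
    },
    "type": "object",
    "properties": {
        "passengers": {
            "type": "array",
            "items": {"$ref": "#/definitions/passenger"},
            "uniqueItems": True,
        }
    },
    "additionalProperties": False,
}

resolved = inline_refs(schema, schema)
print(resolved["properties"]["passengers"]["items"])
```

The definitions key is still present in the result ; since it plays no role in validation, leaving it in place is harmless.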

Current and external schemas

# refers to the current document.

The following expression points to the passenger schema under the definitions property in the current schema document :

{ "$ref": "#/definitions/passenger" }

The following expression points to the price schema under the commons property in the common.schema.json schema document :

{ "$ref": "common.schema.json#/commons/price" }

Schemas :

seat.schema.json
{
    "$schema": "http://json-schema.org/draft-06/schema#",
    "definitions": {
        "seat": {
            "type": "object",
            "properties": {
                ...,
                "price" : { "$ref": "common.schema.json#/commons/price" }
            }
        }
    },
    "type": "object",
    "properties": {
        "seat" : { "$ref": "#/definitions/seat" }
    },
    "required" : [ "seat" ],
    "additionalProperties": false
}
common.schema.json
{
    "$schema": "http://json-schema.org/draft-06/schema#",
    "commons": {
        "currency" : {
            "type": "string",
            "pattern": "^[A-Z]{3}$"
        },
        ...,
        "price": {
            "type": "object",
            "properties": {
                "amount" : {
                    "type": "number"
                },
                "currency" : { "$ref": "#/commons/currency" }
            }
        },
        ...
    }
}
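The way a reference splits into a document part and a pointer part can be sketched as follows. This is a simplified illustration : the documents dictionary is a hypothetical in-memory stand-in for the two schema files, the helper names split_ref and lookup are assumptions, and pointer escaping is ignored.

```python
from urllib.parse import urldefrag


def split_ref(ref, current_document):
    """Split a $ref into (document name, JSON Pointer)."""
    url, fragment = urldefrag(ref)
    # An empty document part means "the document that contains the $ref".
    return (url or current_document), fragment


def lookup(ref, current_document, documents):
    """Return (owning document name, referenced schema)."""
    name, fragment = split_ref(ref, current_document)
    node = documents[name]
    for token in fragment.split("/")[1:]:
        node = node[token]
    return name, node


documents = {
    "seat.schema.json": {
        "definitions": {
            "seat": {
                "type": "object",
                "properties": {
                    "price": {"$ref": "common.schema.json#/commons/price"}
                },
            }
        }
    },
    "common.schema.json": {
        "commons": {
            "currency": {"type": "string", "pattern": "^[A-Z]{3}$"},
            "price": {
                "type": "object",
                "properties": {
                    "amount": {"type": "number"},
                    "currency": {"$ref": "#/commons/currency"},
                },
            },
        }
    },
}

# The price reference leaves seat.schema.json ...
owner, price = lookup("common.schema.json#/commons/price", "seat.schema.json", documents)
# ... so the nested '#/commons/currency' is resolved against common.schema.json.
_, currency = lookup(price["properties"]["currency"]["$ref"], owner, documents)
print(owner, currency)
```

The point of returning the owning document is that a # inside common.schema.json refers to common.schema.json, not to the schema that referenced it.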

Documenting

The title and description keywords are used to describe parts of a schema.
These keywords are not used in the validation process.

{
    "$schema": "http://json-schema.org/draft-06/schema#",
    "definitions": {
        "passenger": {
            "title": "Passenger", (1)
            "description": "A Flight Passenger", (2)
            "type": "object",
            "properties": {
                "type" : {
                    "description": "The passenger's type", (3)
                    "type": "string",
                    "enum": [ "Adult", "Child", "Infant", "Young Adult"]
                },
                "frequentFlyer" : {
                    "type": "object",
                    "properties": {
                        "programme" : {
                            "title": "Frequent Flyer Programme", (4)
                            "description": "The passenger's frequent flyer programme", (3)
                            "type": "string",
                            "enum": [ "Executive Club", "AA Passenger", "Finnair Bonus"]
                        }
                    }
                }
            }
        }
    }
}
  1. A schema’s title

  2. A schema’s description

  3. A property’s description

  4. A property’s title

Default Values

The default keyword is used to document possible default values.
This keyword is not used in the validation process.

Schema :

{
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "gender": { "type": "string", "enum": ["male", "female"], "default": "Male" }, (1)
        "birthday": { "type": "string", "format": "date" },
        "nationality": { "type": "string", "default": "french" }
    },
    "additionalProperties": false,
    "required": ["name", "gender", "nationality"]
}
  1. The default value doesn’t have to comply with the schema [10] [11].
    As you can see, Male is not a valid value for the following : "enum": ["male", "female"].

Instances :

{
    "name": "aowss"
     (1)
}
  1. Since the default keyword is not used in the validation process, the mandatory gender & nationality properties must be specified.
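The following Python sketch illustrates the point. It is not how any particular validator works : missing_required and with_defaults are hypothetical helpers, and with_defaults is a pre-processing step that an application would have to run itself, since validation does not fill in defaults.

```python
def missing_required(schema, instance):
    """List the required properties that are absent from the instance."""
    return [name for name in schema.get("required", []) if name not in instance]


def with_defaults(schema, instance):
    """Copy each declared default into a copy of the instance when absent."""
    completed = dict(instance)
    for name, subschema in schema.get("properties", {}).items():
        if name not in completed and "default" in subschema:
            completed[name] = subschema["default"]
    return completed


schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "gender": {"type": "string", "enum": ["male", "female"], "default": "Male"},
        "birthday": {"type": "string", "format": "date"},
        "nationality": {"type": "string", "default": "french"},
    },
    "additionalProperties": False,
    "required": ["name", "gender", "nationality"],
}

instance = {"name": "aowss"}

# Validation ignores 'default', so both properties are reported as missing.
print(missing_required(schema, instance))

# Applying the defaults by hand fills the gaps, but "Male" is not in the enum.
completed = with_defaults(schema, instance)
print(completed["gender"] in schema["properties"]["gender"]["enum"])
```

Applying the defaults removes the missing-property errors but introduces an enum violation, which is exactly the risk of a default value that does not comply with its own schema.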

Caution
In my opinion, this keyword is useless and misleading !
It is useless since it is not used to document anything meaningful, especially if it can have a value that doesn’t comply with the schema.
It is misleading since it gives the impression that specifying a default value will have an effect on the validation process.
Tip
This is very different from XML Schema’s default keyword.

Open Content Model

The JSON Schema content model is open : by default, properties that have not been specified in the schema are allowed.
This behaviour can be changed for arrays and objects.

Although the open content model can seem a little counter-intuitive, the ideas behind it are evolvability & decoupling.

Example 1. Scenario
  1. Party A publishes a schema for its public web API.

  2. Party B and Party C use this schema to interact with Party A.

  3. Party A makes some changes to its API and publishes a new version of the schema that is backward compatible.

  4. Party B is interested in the new features and upgrades the schema it uses to the new version.

  5. Party C is not interested in the new features and continues to use the old version of the schema.

Because of the open content model, the old version of the schema still validates the new instance documents, i.e. the ones that adhere to the new schema.
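A minimal Python sketch of this behaviour, under stated assumptions : old_schema and new_instance are hypothetical examples, and the check only looks at properties and a boolean additionalProperties ( patternProperties and schema-valued additionalProperties are ignored ).

```python
def undeclared(schema, instance):
    """Names present in the instance but not declared under 'properties'."""
    declared = schema.get("properties", {})
    return sorted(name for name in instance if name not in declared)


def allows_instance_properties(schema, instance):
    """True unless the schema is closed and the instance has undeclared names."""
    # When the keyword is absent, any additional property is accepted:
    # this is the open content model.
    additional = schema.get("additionalProperties", True)
    return additional is not False or not undeclared(schema, instance)


# Old version of the schema, still used by Party C.
old_schema = {"type": "object", "properties": {"flight": {"type": "string"}}}

# Instance produced against the new version, which added a 'meal' property.
new_instance = {"flight": "BA117", "meal": "vegetarian"}

print(allows_instance_properties(old_schema, new_instance))

# The same old schema, closed with additionalProperties set to false.
closed_schema = dict(old_schema, additionalProperties=False)
print(allows_instance_properties(closed_schema, new_instance), undeclared(closed_schema, new_instance))
```

With the open schema, Party C keeps accepting the new documents ; with the closed one, the unknown meal property would be rejected.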

Caution
A lot of attention and testing is needed to ensure that the schema is really constraining the instance documents in the expected way.
There’s a fine line between evolvability and no constraints, especially considering the above-mentioned gotchas.
Note

This is one of the fundamental differences between JSON Schema and XML Schema.
In XML Schema, the content model is closed : by default only elements / attributes that have been specified are allowed.
Extension points can be defined using the any and anyAttribute elements to allow for unspecified content.

Duplicate Keys

Caution
Even though JSON allows duplicate keys, they should not be used !

JSON

The meaning is not clear

In XML you use duplicate keys to build lists.
In JSON you have the array type for that.

JSON Parsing

Parsers will throw an error or just ignore all but the last occurrence

See RFC 7159

JSON Pointer

You can’t address duplicate keys properly

JSON Schema

There is no way to specify that a key is unique since JSON Schema assumes that keys are unique

Since the validator relies on a parser that is most likely going to ignore the duplicate key, the validator will validate the instance as if there were only one key : the last one.
Therefore if an instance contains a duplicate key where the first key’s value is invalid and the second key’s value is valid, the validator will consider the instance as valid !
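Python's standard json module shows this behaviour : it silently keeps the last occurrence. The sketch below also shows one way to detect duplicates before validation, using the object_pairs_hook parameter of json.loads ; the reject_duplicates function and the sample document are illustrative assumptions.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that raises when a JSON object repeats a key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


# The first 'age' is invalid for an integer property, the second one is valid.
text = '{"age": "not a number", "age": 42}'

# Default behaviour: only the last occurrence survives parsing.
lenient = json.loads(text)
print(lenient)

# Strict behaviour: the hook sees every pair and can refuse the document.
try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    strict_error = None
except ValueError as error:
    strict_error = str(error)
print(strict_error)
```

A validator fed with the lenient result never sees the invalid first value, which is why such an instance is reported as valid.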

Summary

type | keywords
number or integer | multipleOf, maximum, exclusiveMaximum, minimum, exclusiveMinimum
string | maxLength, minLength, pattern
array | items, additionalItems, maxItems, minItems, uniqueItems
object | maxProperties, minProperties, required, properties, additionalProperties, patternProperties, dependencies

Worked Examples

One of those is required

It is possible to specify that an object can have a certain set of properties or another set of properties.
If some of the properties are shared

TBC

Same set of properties but different rules

A person has a first name, a last name and optionally an email address.
A payer is a person whose email address is required for confirmation purposes.

Schema :

{
    "$schema": "http://json-schema.org/draft-06/schema#",
    "definitions": {
        "person": {
            "type": "object",
            "properties": {
                "firstName" : { "type": "string" },
                "lastName" : { "type": "string" },
                "email" : {
                    "type": "string",
                    "format": "email"
                }
            },
            "required" : [ "firstName", "lastName" ],
            "additionalProperties": false
        },
        "payer" : {
            "allOf": [ (1)
                { "$ref": "#/definitions/person" },
                { "required" : [ "email"] } (1)
            ]
        }
    },
    "type": "object",
    "properties": {
        "person" : { "$ref": "#/definitions/person" },
        "payer" : { "$ref": "#/definitions/payer" }
    },
    "additionalProperties": false
}
  1. The payer must at the same time, as denoted by the allOf keyword, be a person and have an email, as denoted by the required keyword.

Instances :

{
    "person" : {
        "firstName" : "Aowss",
        "lastName" : "Ibrahim"
    }
}
{
    "person" : {
        "firstName" : "Aowss",
        "lastName" : "Ibrahim",
        "email" : "aowss@example.com" (1)
    }
}
  1. A person can have an email.

{
    "payer" : {
        "firstName" : "Aowss",
        "lastName" : "Ibrahim",
        "email" : "aowss@example.com"
    }
}
{
    "payer" : {
        "firstName" : "Aowss",
        "lastName" : "Ibrahim"
         (1)
    }
}
  1. The email property is required for a payer.
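To make the evaluation of allOf concrete, here is a deliberately tiny validator sketch in Python. It is an illustration, not a real JSON Schema implementation : the is_valid and resolve helpers are hypothetical and only cover local $ref, the object and string types, required, properties, a boolean additionalProperties and allOf ( format is ignored ).

```python
def resolve(schema, root):
    """Follow local '#/a/b' references (pointer escaping is not handled)."""
    while "$ref" in schema:
        node = root
        for token in schema["$ref"].split("/")[1:]:
            node = node[token]
        schema = node
    return schema


def is_valid(instance, schema, root):
    schema = resolve(schema, root)
    if schema.get("type") == "object" and not isinstance(instance, dict):
        return False
    if schema.get("type") == "string" and not isinstance(instance, str):
        return False
    if isinstance(instance, dict):
        if any(name not in instance for name in schema.get("required", [])):
            return False
        properties = schema.get("properties", {})
        for name, value in instance.items():
            if name in properties:
                if not is_valid(value, properties[name], root):
                    return False
            elif schema.get("additionalProperties", True) is False:
                return False
    # allOf: the instance must satisfy every listed subschema.
    return all(is_valid(instance, sub, root) for sub in schema.get("allOf", []))


schema = {
    "definitions": {
        "person": {
            "type": "object",
            "properties": {
                "firstName": {"type": "string"},
                "lastName": {"type": "string"},
                "email": {"type": "string", "format": "email"},
            },
            "required": ["firstName", "lastName"],
            "additionalProperties": False,
        },
        "payer": {
            "allOf": [
                {"$ref": "#/definitions/person"},
                {"required": ["email"]},
            ]
        },
    },
    "type": "object",
    "properties": {
        "person": {"$ref": "#/definitions/person"},
        "payer": {"$ref": "#/definitions/payer"},
    },
    "additionalProperties": False,
}

names = {"firstName": "Aowss", "lastName": "Ibrahim"}
print(is_valid({"person": names}, schema, schema))
print(is_valid({"payer": names}, schema, schema))
print(is_valid({"payer": dict(names, email="aowss@example.com")}, schema, schema))
```

The payer without an email passes the person branch of allOf but fails the second branch, so the whole allOf fails.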

Note
This kind of construct doesn’t exist in XML Schema 1.0.
Tip
It is not possible to make a required property optional.

This only works if the required property is not nested :

{
    "$schema": "http://json-schema.org/draft-06/schema#",
    "definitions": {
        "person": {
            "type": "object",
            "properties": {
                "firstName" : { "type": "string" },
                "lastName" : { "type": "string" },
                "contactDetails" : {
                    "type": "object",
                    "properties": {
                        "phone" : { "type" : "integer" },
                        "email" : {
                            "type": "string",
                            "format": "email" (1)
                        }
                    },
                    "additionalProperties": false
                }
            },
            "required" : [ "firstName", "lastName" ],
            "additionalProperties": false
        },
        "payer" : {
            "allOf": [
                { "$ref": "#/definitions/person" },
                { "required" : [ "email"] } (2)
            ]
        }
    },
    "type": "object",
    "properties": {
        "person" : { "$ref": "#/definitions/person" },
        "payer" : { "$ref": "#/definitions/payer" }
    },
    "additionalProperties": false
}
  1. The email property is now nested within a contactDetails property.

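  2. The required keyword only accepts property names of the object it applies to : it is not possible to reference a nested property, so this constraint looks for email at the payer level and never matches the nested one.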

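A workaround is to mark the contactDetails property as being required.
You also need to specify that it must contain at least one property to avoid an empty contactDetails object.
Note that this only guarantees that a contact detail is present, not that it is an email.
To require the email itself, the second allOf branch can constrain the nested object instead : { "required": [ "contactDetails" ], "properties": { "contactDetails": { "required": [ "email" ] } } }.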

{
    "$schema": "http://json-schema.org/draft-06/schema#",
    "definitions": {
        "person": {
            "type": "object",
            "properties": {
                "firstName" : { "type": "string" },
                "lastName" : { "type": "string" },
                "contactDetails" : {
                    "type": "object",
                    "properties": {
                        "phone" : { "type" : "integer" },
                        "email" : {
                            "type": "string",
                            "format": "email"
                        }
                    },
                    "minProperties": 1, (1)
                    "additionalProperties": false
                }
            },
            "required" : [ "firstName", "lastName" ],
            "additionalProperties": false
        },
        "payer" : {
            "allOf": [
                { "$ref": "#/definitions/person" },
                { "required" : [ "contactDetails"] } (2)
            ]
        }
    },
    "type": "object",
    "properties": {
        "person" : { "$ref": "#/definitions/person" },
        "payer" : { "$ref": "#/definitions/payer" }
    },
    "additionalProperties": false
}
  1. contactDetails must contain at least one property.

  2. contactDetails is required.

Instances :

{
    "person" : {
        "firstName" : "Aowss",
        "lastName" : "Ibrahim"
         (1)
    }
}
  1. A person without contact details.

{
    "person" : {
        "firstName" : "Aowss",
        "lastName" : "Ibrahim",
        "contactDetails" : {
            "email" : "aowss@example.com" (1)
        }
    }
}
  1. A person's contact details can be an email.

{
    "person" : {
        "firstName" : "Aowss",
        "lastName" : "Ibrahim",
        "contactDetails" : {
            "phone" : 97788987654 (1)
        }
    }
}
  1. A person's contact details can be a phone.

{
    "payer" : {
        "firstName" : "Aowss",
        "lastName" : "Ibrahim",
        "contactDetails" : {
            "email" : "aowss@example.com"
        }
    }
}
{
    "payer" : {
        "firstName" : "Aowss",
        "lastName" : "Ibrahim"
         (1)
    }
}
  1. The contactDetails property is required for a payer.

A flexible tuple

The array contains items of type string.
Each item’s set of valid values is defined by a different definition using the enum keyword.

The proposed solution is more flexible than a tuple but more restrictive than an array of strings.

Schemas :

common.schema.json
{
    "$schema": "http://json-schema.org/draft-06/schema#",
    "definitions": {
        "seatType" : {
            "type": "string",
            "enum": [ "Bulkhead", "Cot", "Exit" ]
        },
        "seatDirection" : {
            "type": "string",
            "enum": [ "Forward Facing", "Rear Facing" ]
        },
        "seatSection" : {
            "type": "string",
            "enum": [ "Aisle", "Window", "Other" ]
        },
        "aircraftSection" : {
            "type": "string",
            "enum": [ "Left", "Right", "Centre" ]
        }
    }
}
seat.schema.json
{
    "type": "array",
    "items": { (1)
        "anyOf": [ (1)
            { "$ref": "common.schema.json#/definitions/seatType" },
            { "$ref": "common.schema.json#/definitions/seatSection" },
            { "$ref": "common.schema.json#/definitions/aircraftSection" },
            { "$ref": "common.schema.json#/definitions/seatDirection" }
        ]
    },
    "additionalItems": false
}
  1. Each item in the array can be of one of the specified types.

Instances :

["Cot", "Aisle", "Left", "Forward Facing"]
["Aisle", "Left", "Forward Facing"] (1)
  1. Items are not mandatory : the seatType is missing.

["Aisle", "Cot", "Bulkhead", "Left", "Forward Facing"] (1) (2)
  1. Items can appear more than once : 2 seatType, Cot and Bulkhead, are present.

  2. Order is irrelevant : the seatSection comes before the seatType.

["Cot", "Cot", "Bulkhead", "Left", "Forward Facing"] (1)
  1. This schema does not prevent the repetition of "Cot" ; adding "uniqueItems": true to the array schema would reject it.

This is different from defining a tuple which is more constraining :

Schema :

seat.schema.json
{
    "type": "array",
    "items": [ (1)
        { "$ref": "common.schema.json#/definitions/seatType" },
        { "$ref": "common.schema.json#/definitions/seatSection" },
        { "$ref": "common.schema.json#/definitions/aircraftSection" },
        { "$ref": "common.schema.json#/definitions/seatDirection" }
    ],
    "additionalItems": false
}
  1. A 4-item tuple.

Instances :

["Cot", "Aisle", "Left", "Forward Facing"]
invalid
["Aisle", "Left", "Forward Facing"] (1)
  1. Items are positional : the seatType is missing, so "Aisle" is checked against the seatType schema and fails.

invalid
["Aisle", "Cot", "Left", "Forward Facing"] (1)
  1. Order is relevant : the seatSection must come after the seatType.

invalid
["Cot", "Bulkhead", "Aisle", "Left", "Forward Facing"] (1)
  1. Items can only appear once : you can’t have 2 seatType.
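The difference between the two schemas can be summarised with a small Python sketch. It hard-codes the four enums from common.schema.json and mimics the two forms of items by hand ; valid_flexible and valid_tuple are hypothetical helpers, not calls to a validation library.

```python
SEAT_TYPE = {"Bulkhead", "Cot", "Exit"}
SEAT_SECTION = {"Aisle", "Window", "Other"}
AIRCRAFT_SECTION = {"Left", "Right", "Centre"}
SEAT_DIRECTION = {"Forward Facing", "Rear Facing"}

# Same order as the tuple schema: one enum per position.
TUPLE = [SEAT_TYPE, SEAT_SECTION, AIRCRAFT_SECTION, SEAT_DIRECTION]


def valid_flexible(items):
    """'items' is a single schema with anyOf: each item must match one enum."""
    allowed = SEAT_TYPE | SEAT_SECTION | AIRCRAFT_SECTION | SEAT_DIRECTION
    return all(isinstance(item, str) and item in allowed for item in items)


def valid_tuple(items):
    """'items' is an array of schemas and additionalItems is false."""
    if len(items) > len(TUPLE):
        return False  # additionalItems: false forbids a fifth item
    # Position i is checked against schema i; shorter arrays are accepted.
    return all(
        isinstance(item, str) and item in allowed
        for item, allowed in zip(items, TUPLE)
    )


samples = [
    ["Cot", "Aisle", "Left", "Forward Facing"],
    ["Aisle", "Left", "Forward Facing"],
    ["Aisle", "Cot", "Left", "Forward Facing"],
    ["Cot", "Bulkhead", "Aisle", "Left", "Forward Facing"],
]
for sample in samples:
    print(sample, valid_flexible(sample), valid_tuple(sample))
```

All four samples pass the flexible check, while only the first one passes the tuple check. Note that a shorter array such as ["Cot"] also passes the tuple check, since a tuple schema does not make its positions mandatory.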

Limitations

Extending an existing schema

Suggestions

  1. The cross validation facilities involving different schemas need to be changed to avoid these issues.

    It should be possible to set additionalProperties to false.

  2. The default value for a property should conform to the schema of that property as mandated by the OpenAPI Specification.

  3. Schema inconsistencies should flag the schema as being invalid [12].

  4. It should be possible to indicate which items are mandatory in a tuple.

  5. A mechanism to define or extend existing formats should be available. The set of available formats should be extended.

  6. An enumProperties should be introduced as an equivalent to patternProperties.

  7. The uniqueItems keyword should be extended to use a JSON Pointer to reference what needs to be unique.

  8. The required keyword should be extended to use a JSON Pointer to reference what is required.

Resources

This is a very good resource.
The explanations are clear.
The presentation is very good.

This is a very good tutorial ( as are most of his tutorials ).
It provides a comparison with XML Schema ( Roger has a very extensive knowledge of XML Schema ).


2. .json can be used since a JSON Schema is a JSON document; .schema.json is often used to make the distinction between the schema and the instance document
3. once the MIME Type is registered, a file extension will probably be defined
4. you can find a list here
5. you need to download this jar : json-schema-validator-2.2.6-lib.jar, not this one : json-schema-validator-2.2.6.jar
7. leading zeros are not allowed
9. This should be a valid ECMA 262 regular expression
10. the specification says : It is RECOMMENDED that a default value be valid against the associated schema
11. the OpenAPI Specification mandates the compliance : Unlike JSON Schema, the value MUST conform to the defined type
12. if a property references a nonexistent definition, its content can be anything !