Start of JSON schema tool kit #6

Merged: 13 commits, Aug 12, 2023
110 changes: 110 additions & 0 deletions docs/json-schema.rst
@@ -0,0 +1,110 @@
===========
JSON Schema
===========

Goals:
======
Document and validate all JSON messages exchanged between microservices within IDSS Engine.

Additional Goal:
----------------
The different message types should share as much common structure as is practical. This consistency
should improve human readability, as developers become familiar with the standard, and reduce the
complexity of message handling.

Resources:
==========
- `JSON schema <https://json-schema.org/>`_
- `Understanding JSON schema <https://json-schema.org/understanding-json-schema/>`_

Convention/Patterns:
====================
- Auxiliary schemas, or subschemas, are used to group related objects. Subschema objects are defined
  at the root level of JSON files. The filename is based on the dominant object.

**Timing.json**::

    {
        "TimeString": {
            "description": "String representation of a date/time",
            "type": "string",
            "format": "date-time"
        },
        "TimeList": {
            "description": "A list of specific time_string(s)",
            "type": "array",
            "items": {"$ref": "#/TimeString"},
            "minItems": 1
        },
        "TimeRange": {
            "description": "Date/time strings specifying the start and end date/times",
            "type": "object",
            "properties": {
                "start": {"$ref": "#/TimeString"},
                "end": {"$ref": "#/TimeString"}
            },
            "required": [
                "start",
                "end"
            ]
        },
        "Timing": {
            "description": "Either a TimeString, TimeList, or TimeRange",
            "oneOf": [
                {"$ref": "#/TimeString"},
                {"$ref": "#/TimeList"},
                {"$ref": "#/TimeRange"}
            ]
        }
    }
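
The ``oneOf`` in the Timing subschema means a value must match exactly one of the three alternatives. The sketch below illustrates that rule with hand-written checks that use only the Python standard library. It is an approximation for illustration, not how the jsonschema library validates, and the function names (``is_time_string``, ``is_time_list``, ``is_time_range``, ``classify_timing``) are hypothetical.

```python
from datetime import datetime


def is_time_string(value) -> bool:
    """Rough stand-in for the "date-time" format check (assumption, not RFC 3339 complete)."""
    if not isinstance(value, str) or 'T' not in value:
        return False
    try:
        # fromisoformat() rejects a trailing 'Z' before Python 3.11, so swap it for an offset
        datetime.fromisoformat(value.replace('Z', '+00:00'))
    except ValueError:
        return False
    return True


def is_time_list(value) -> bool:
    """A non-empty array (minItems: 1) whose items are all TimeStrings."""
    return (isinstance(value, list)
            and len(value) >= 1
            and all(is_time_string(item) for item in value))


def is_time_range(value) -> bool:
    """An object with required 'start' and 'end' TimeStrings."""
    return (isinstance(value, dict)
            and is_time_string(value.get('start'))
            and is_time_string(value.get('end')))


def classify_timing(value) -> str:
    """Return the name of the single matching alternative, mirroring oneOf."""
    checks = (('TimeString', is_time_string),
              ('TimeList', is_time_list),
              ('TimeRange', is_time_range))
    matches = [name for name, check in checks if check(value)]
    if len(matches) != 1:
        raise ValueError(f'expected exactly one match, found {matches}')
    return matches[0]


print(classify_timing('2023-08-12T10:00:00Z'))
print(classify_timing(['2023-08-12T10:00:00Z', '2023-08-12T11:00:00Z']))
print(classify_timing({'start': '2023-08-12T10:00:00Z', 'end': '2023-08-12T12:00:00Z'}))
```

An empty list matches none of the alternatives (TimeList requires at least one item), so ``classify_timing([])`` raises ``ValueError`` in this sketch.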



- Schemas that define a message should be built almost entirely from subschemas. The
  filename should represent the message type and end with _schema to distinguish it from
  subschema files.

.. note:: *These examples are incomplete and will need to be updated.*


**New_data_schema.json**

*(This is currently only a partial schema; the new_data service publishes other information as well.)*

One of the messages that the new_data service publishes indicates when a field from a product source
is available for a specific issue/valid datetime. This message is essentially the Variable
subschema, except that a Variable does NOT require a Field *(it is optional)*. The new_data message
therefore uses the Variable subschema and adds the requirement that there must be at least one
field::

    {
        "type": "object",
        "allOf": [
            {"$ref": "variable.json#/Variable"},
            {"properties": {"field": {"minItems": 1}}}
        ]
    }


**Criteria_schema.json**

*(This is incomplete; a criteria message contains more, but it is a
good example of how a message is built from subschemas.)*::

    {
        "type": "object",
        "properties": {
            "corrId": {"$ref": "corr_id.json#/CorrId"},
            "issueDt": {"$ref": "timing.json#/Timing"},
            "tags": {"$ref": "tags.json#/Tags"},
            "validDt": {"$ref": "timing.json#/Timing"},
            "location": {"$ref": "location.json#/Location"}
        },
        "required": [
            "corrId",
            "issueDt",
            "tags",
            "validDt",
            "location"
        ]
    }
10 changes: 6 additions & 4 deletions python/idsse_common/idsse/common/utils.py
Collaborator


Glad we have a common method now to create timestamps; it's something that is done in many different ways in many places. We will have to update some of our services (EPM, ECM, IMS Gateway) to include the commons package, as they currently do not.

@@ -11,7 +11,7 @@

 import copy
 import logging
-from datetime import datetime, timedelta
+from datetime import datetime, timedelta, timezone
 from subprocess import Popen, PIPE, TimeoutExpired
 from typing import Sequence, Optional, Generator, Any

@@ -103,13 +103,15 @@ def exec_cmd(commands: Sequence[str], timeout: Optional[int] = None) -> Sequence

 def to_iso(date_time: datetime) -> str:
     """Format a datetime instance to an ISO string"""
-    logger.debug(f'Datetime ({datetime}) to iso')
-    return date_time.strftime('%Y-%m-%dT%H:%M:%SZ')
+    logger.debug('Datetime (%s) to iso', date_time)
+    return (f'{date_time.strftime("%Y-%m-%dT%H:%M")}:'
+            f'{(date_time.second + date_time.microsecond / 1e6):06.3f}'
+            f'{"Z" if date_time.tzinfo in [None, timezone.utc] else date_time.strftime("%Z")[3:]}')


 def to_compact(date_time: datetime) -> str:
     """Format a datetime instance to a compact string"""
-    logger.debug(f'Datetime ({datetime}) to compact -- {__name__}')
+    logger.debug('Datetime (%s) to compact -- %s', date_time, __name__)
     return date_time.strftime('%Y%m%d%H%M%S')
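
To make the new `to_iso` format easier to follow, here is a standalone sketch of the same formatting logic with the three parts (minute-resolution prefix, seconds with milliseconds, timezone suffix) pulled apart. The name `to_iso_sketch` is hypothetical and logging is omitted; it is meant as an illustration of the behavior, not a replacement for the function in the diff.

```python
from datetime import datetime, timedelta, timezone


def to_iso_sketch(date_time: datetime) -> str:
    """Format a datetime as ISO 8601 with millisecond precision (illustrative copy)."""
    # Date and time down to the minute
    prefix = date_time.strftime('%Y-%m-%dT%H:%M')
    # Seconds plus fractional part, zero-padded to the form SS.mmm
    seconds = date_time.second + date_time.microsecond / 1e6
    # Naive and UTC datetimes get 'Z'; otherwise strip the leading 'UTC' from names
    # such as 'UTC-07:00', which is what fixed-offset datetime.timezone objects report
    if date_time.tzinfo in (None, timezone.utc):
        suffix = 'Z'
    else:
        suffix = date_time.strftime('%Z')[3:]
    return f'{prefix}:{seconds:06.3f}{suffix}'


naive = datetime(2013, 12, 11, 10, 9, 8)
utc_with_millis = datetime(2013, 12, 11, 10, 9, 8, 250000, tzinfo=timezone.utc)
mountain = datetime(2013, 12, 11, 10, 9, 8, tzinfo=timezone(timedelta(hours=-7)))

print(to_iso_sketch(naive))            # 2013-12-11T10:09:08.000Z
print(to_iso_sketch(utc_with_millis))  # 2013-12-11T10:09:08.250Z
print(to_iso_sketch(mountain))         # 2013-12-11T10:09:08.000-07:00
```

Note that the `[3:]` slice assumes `%Z` yields a name of the form `UTC+HH:MM`. A tzinfo that reports a name such as `MST` would produce an empty suffix, so callers may want to stick to fixed-offset `datetime.timezone` objects.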


72 changes: 72 additions & 0 deletions python/idsse_common/idsse/common/validate_schema.py
@@ -0,0 +1,72 @@
"""Class for validating IDSSe JSON messages against schema"""
# ----------------------------------------------------------------------------------
# Created on Mon Aug 07 2023
#
# Copyright (c) 2023 Regents of the University of Colorado. All rights reserved. (1)
#
# Contributors:
# Geary Layne (1)
#
# ----------------------------------------------------------------------------------
import json
import os
from typing import Optional, Union

from jsonschema import Validator, FormatChecker, RefResolver
from jsonschema.validators import validator_for


def _get_refs(json_obj: Union[dict, list], result: Optional[set] = None) -> set:
    if result is None:
        result = set()
    if isinstance(json_obj, dict):
        for key, value in json_obj.items():
            print(key, ':', value, 'type', type(value))
            if key == '$ref':
                idx = value.index('#/')
                if idx > 0:
                    result.add(value[:idx])
            else:
                _get_refs(value, result)
    elif isinstance(json_obj, list):
        print('\tas list')
        for item in json_obj:
            _get_refs(item, result)
    return result


def get_validator(schema_name) -> Validator:
    """Get a jsonschema Validator to be used when evaluating json messages against specified schema

    Args:
        schema_name (str): The name of the message schema,
                           must exist under idss-engine-common/schema

    Returns:
        Validator: A validator loaded with schema and all dependencies
    """
    current_path = os.path.dirname(os.path.realpath(__file__))
    schema_dir = os.path.join(os.path.sep, *(current_path.split(os.path.sep)[:-4]), 'schema')
    schema_filename = os.path.join(schema_dir, schema_name+'.json')
    with open(schema_filename, 'r', encoding='utf8') as file:
        schema = json.load(file)

    base = json.loads('{"$schema": "http://json-schema.org/draft-07/schema#"}')

    dependencies = {base.get('$id'): base}
    refs = _get_refs(schema)
    while len(refs):
        new_refs = set()
        for ref in refs:
            schema_filename = os.path.join(schema_dir, ref)
            with open(schema_filename, 'r', encoding='utf8') as file:
                ref_schema = json.load(file)
            dependencies[ref_schema.get('$id', ref)] = ref_schema
            new_refs = _get_refs(ref_schema, new_refs)
        refs = {ref for ref in new_refs if ref not in dependencies}

    print(dependencies)
    resolver = RefResolver.from_schema(schema=base,
                                       store=dependencies)

    return validator_for(base)(schema, resolver=resolver, format_checker=FormatChecker())
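
The dependency loading in `get_validator` relies on `_get_refs` to find which other schema files a schema points at. The following is a minimal, standard-library-only sketch of that traversal, run against the criteria message schema from the docs. The name `collect_ref_files` is hypothetical, and it splits on `#` with `str.partition` rather than calling `index('#/')` as the diff does, so a `$ref` without a fragment does not raise. Treat it as an illustration of the idea under those assumptions.

```python
from typing import Optional, Union


def collect_ref_files(json_obj: Union[dict, list], result: Optional[set] = None) -> set:
    """Recursively gather the file part of every "$ref" that points outside the current file."""
    if result is None:
        result = set()
    if isinstance(json_obj, dict):
        for key, value in json_obj.items():
            if key == '$ref' and isinstance(value, str):
                # 'timing.json#/Timing' -> 'timing.json'; '#/TimeString' -> '' (local ref)
                filename, _, _pointer = value.partition('#')
                if filename:
                    result.add(filename)
            else:
                collect_ref_files(value, result)
    elif isinstance(json_obj, list):
        for item in json_obj:
            collect_ref_files(item, result)
    return result


# The (partial) criteria message schema shown in docs/json-schema.rst
criteria_schema = {
    'type': 'object',
    'properties': {
        'corrId': {'$ref': 'corr_id.json#/CorrId'},
        'issueDt': {'$ref': 'timing.json#/Timing'},
        'tags': {'$ref': 'tags.json#/Tags'},
        'validDt': {'$ref': 'timing.json#/Timing'},
        'location': {'$ref': 'location.json#/Location'},
    },
    'required': ['corrId', 'issueDt', 'tags', 'validDt', 'location'],
}

print(sorted(collect_ref_files(criteria_schema)))
```

Because the result is a set, `timing.json` appears once even though two properties reference it. Purely local references such as `#/TimeString` contribute nothing, which is why the loading loop only opens files for cross-file references.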
Binary file not shown.
Binary file not shown.
6 changes: 3 additions & 3 deletions python/idsse_common/test/test_path_builder.py
@@ -9,19 +9,19 @@
 #
 # --------------------------------------------------------------------------------

-import pytest # pylint: disable=import-error
 from datetime import datetime, timedelta
 from typing import Union
+import pytest

 from idsse.common.utils import TimeDelta
 from idsse.common.path_builder import PathBuilder

 # pylint: disable=missing-function-docstring
 # pylint: disable=invalid-name


 def test_from_dir_filename_creates_valid_pathbuilder():
     directory = './test_directory'
-    filename ='some_file.txt'
+    filename = 'some_file.txt'
     path_builder = PathBuilder.from_dir_filename(directory, filename)

     assert isinstance(path_builder, PathBuilder)
4 changes: 2 additions & 2 deletions python/idsse_common/test/test_utils.py
@@ -9,9 +9,9 @@
 #
 # --------------------------------------------------------------------------------

-import pytest # pylint: disable=import-error
 from datetime import datetime, timedelta

+import pytest # pylint: disable=import-error

 from idsse.common.utils import TimeDelta
 from idsse.common.utils import datetime_gen, hash_code, to_compact, to_iso
@@ -37,7 +37,7 @@ def test_timedelta_day():

 def test_to_iso():
     dt = datetime(2013, 12, 11, 10, 9, 8)
-    assert to_iso(dt) == '2013-12-11T10:09:08Z'
+    assert to_iso(dt) == '2013-12-11T10:09:08.000Z'


 def test_to_compact():