Overhaul how date & time parsing works.
This commit breaks FHIRDate into four classes:
- FHIRDate
- FHIRDateTime
- FHIRInstant
- FHIRTime

BREAKING CHANGES:
- If consuming code inspects the elementProperties array with a check
  like `element_type is FHIRDate`, that check will now fail for
  `datetime`, `instant`, and `time` elements. Backwards compatibility
  is maintained, however, for checks written like
  `issubclass(element_type, FHIRDate)`.
  Instances behave similarly: `isinstance(obj, FHIRDate)` will still
  work.
- If consuming code manually creates FHIRDate objects with a time
  component, that will now fail with a ValueError.

Since the first item is unavoidable if we want to fix the bugs listed
below and has a workaround that works before and after this change,
and the second item is not an expected workflow, I hope that such
breaking changes do not cause too much harm for consumers.
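
As a rough illustration of the first item, the sketch below uses stand-in
classes (not the real generated modules) to show why an identity check
breaks while a subclass check keeps working. The helper names
`is_date_type_fragile` and `is_date_type_robust` are hypothetical.

```python
class FHIRDate:
    """Stand-in for the real base class."""


class FHIRDateTime(FHIRDate):
    """Stand-in for one of the new subclasses."""


def is_date_type_fragile(element_type) -> bool:
    # Identity check: only true for FHIRDate itself.
    return element_type is FHIRDate


def is_date_type_robust(element_type) -> bool:
    # Subclass check: true for FHIRDate and anything derived from it.
    return issubclass(element_type, FHIRDate)


fragile_result = is_date_type_fragile(FHIRDateTime)
robust_result = is_date_type_robust(FHIRDateTime)
```

The subclass form gives the same answer before and after this change,
which is why it is the suggested workaround.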

BUG FIXES:
- FHIR `time` fields are now correctly parsed. Previously, a time of
  "10:12:14" would result in a **date** of "1001-01-01".
- Passing too much detail to a `date` field or too little detail to an
  `instant` field will now correctly raise a validation error.
  For example, a Patient.birthDate field with a time, or an
  Observation.issued field with just a year.
- Sub-seconds were incorrectly chopped off of a `datetime`'s
  `.isostring` (which the FHIR spec allows us to do) and an `instant`'s
  `.isostring` (which the FHIR spec **does not** allow us to do).
  The `.date` Python representation and the `.as_json()` call already
  worked correctly and kept the sub-seconds. Only `.isostring` was
  affected.
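
A minimal sketch of the stricter validation, assuming the `date` and
`time` patterns from this commit are applied with `fullmatch` before
parsing. The `validate` helper and the constant names are hypothetical;
only the regular expressions are copied from the diff.

```python
import datetime
import re

# Patterns copied from this commit's FHIRDate and FHIRTime classes.
DATE_REGEX = re.compile(
    r"([0-9]([0-9]([0-9][1-9]|[1-9]0)|[1-9]00)|[1-9]000)"
    r"(-(0[1-9]|1[0-2])(-(0[1-9]|[1-2][0-9]|3[0-1]))?)?"
)
TIME_REGEX = re.compile(r"([01][0-9]|2[0-3]):[0-5][0-9]:([0-5][0-9]|60)(\.[0-9]+)?")


def validate(pattern: "re.Pattern[str]", value: str) -> str:
    # Hypothetical helper: reject anything that is not a full match.
    if not pattern.fullmatch(value):
        raise ValueError("does not match expected format")
    return value


# A time string is parsed as a time, not squeezed into a date.
parsed_time = datetime.time.fromisoformat(validate(TIME_REGEX, "10:12:14"))

# A date field with a time component is rejected up front.
try:
    validate(DATE_REGEX, "1970-01-01T10:12:14")
    rejected = False
except ValueError:
    rejected = True
```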

IMPROVEMENTS:
- Leap seconds are now half-supported. The FHIR spec says clients
  "SHOULD accept and handle leap seconds gracefully", which we do by
  dropping the leap second on the floor and rolling back to :59.
  This is still an improvement on the previous behavior, which was a
  validation error. The `.as_json()` value will still preserve the
  leap second.
- The `.date` field is now always the appropriate type (datetime.date
  for FHIRDate, datetime.datetime for FHIRDateTime and FHIRInstant,
  and datetime.time for FHIRTime). Previously, a `datetime` field might
  result in a datetime.date if only given a date portion. (Which isn't
  entirely wrong, but consistently providing the same data type is
  useful.)
- The new classes have appropriately named fields in addition to the
  backwards-compatible `.date` field: FHIRDateTime.datetime,
  FHIRInstant.datetime, and FHIRTime.time. These will always be the
  same value as `.date` for now, but in a future major release the
  `.date` alias may be dropped.
- The dependency on isodate can now be dropped. It is lightly
  maintained and the stdlib can handle most of its job nowadays.
- Much better class documentation for what sort of things are
  supported and which are not.
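
The leap-second handling can be sketched with the stdlib alone. This is
an illustration under the assumption that the input has already passed
the FHIR `instant` regex and contains a full date and time. The function
name `parse_validated_instant` is hypothetical.

```python
import datetime


def parse_validated_instant(value: str) -> datetime.datetime:
    # Assumes `value` already matched the FHIR instant pattern, so ":60"
    # can only appear in the seconds position.
    value = value.replace("Z", "+00:00")  # needed before Python 3.11
    value = value.replace(":60", ":59")   # clamp the leap second
    return datetime.datetime.fromisoformat(value)


original = "2016-12-31T23:59:60Z"
parsed = parse_validated_instant(original)
```

The parsed value loses the leap second, so code that needs the exact
input should keep the original string, which is what `.as_json()`
returns.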
mikix committed Jul 22, 2024
1 parent efcaade commit 9cbc5ae
Showing 344 changed files with 2,826 additions and 1,633 deletions.
4 changes: 4 additions & 0 deletions MAINTAINERS.md
@@ -40,3 +40,7 @@ Using flit (*Note*: Alternatively, you can use [twine](https://twine.readthedocs

pip install -U flit
flit publish

### Announce the release

Make a post in the [Zulip channel](https://chat.fhir.org/#narrow/stream/179218-python) for python.
1 change: 0 additions & 1 deletion fhir-parser-resources/fhirabstractresource.py
@@ -211,6 +211,5 @@ def where(cls, struct):
return fhirsearch.FHIRSearch(cls, struct)


from . import fhirdate
from . import fhirsearch
from . import fhirelementfactory
180 changes: 128 additions & 52 deletions fhir-parser-resources/fhirdate.py
@@ -1,81 +1,157 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Facilitate working with dates.
# 2014, SMART Health IT.

import sys
import logging
import isodate
"""Facilitate working with FHIR dates and times."""
# 2014-2024, SMART Health IT.

import datetime
import re
from typing import Any, Union


class FHIRDate:
"""
A convenience class for working with FHIR dates in Python.
http://hl7.org/fhir/R4/datatypes.html#date
logger = logging.getLogger(__name__)
Converting to a Python representation does require some compromises:
- This class will convert partial dates ("reduced precision dates") like "2024" into full
dates using the earliest possible time (in this example, "2024-01-01") because Python's
date class does not support partial dates.
If such compromise is not useful for you, avoid using the `date` or `isostring`
properties and just use the `as_json()` method in order to work with the original,
exact string.
class FHIRDate(object):
""" Facilitate working with dates.
- `date`: datetime object representing the receiver's date-time
For backwards-compatibility reasons, this class is the parent class of FHIRDateTime,
FHIRInstant, and FHIRTime. But they are all separate concepts and in a future major release,
they should be split into entirely separate classes.
Public properties:
- `date`: datetime.date representing the JSON value
- `isostring`: an ISO 8601 string version of the above Python object
Public methods:
- `as_json`: returns the original JSON used to construct the instance
"""

def __init__(self, jsonval=None):
self.date = None

def __init__(self, jsonval: Union[str, None] = None):
self.date: Union[datetime.date, datetime.datetime, datetime.time, None] = None

if jsonval is not None:
isstr = isinstance(jsonval, str)
if not isstr and sys.version_info[0] < 3: # Python 2.x has 'str' and 'unicode'
isstr = isinstance(jsonval, basestring)
if not isstr:
if not isinstance(jsonval, str):
raise TypeError("Expecting string when initializing {}, but got {}"
.format(type(self), type(jsonval)))
try:
if 'T' in jsonval:
self.date = isodate.parse_datetime(jsonval)
else:
self.date = isodate.parse_date(jsonval)
except Exception as e:
logger.warning("Failed to initialize FHIRDate from \"{}\": {}"
.format(jsonval, e))

self.origval = jsonval

if not self._REGEX.fullmatch(jsonval):
raise ValueError("does not match expected format")
self.date = self._from_string(jsonval)

self.origval: Union[str, None] = jsonval

def __setattr__(self, prop, value):
if 'date' == prop:
if prop in {'date', self._FIELD}:
self.origval = None
object.__setattr__(self, prop, value)

# Keep these two fields in sync
object.__setattr__(self, self._FIELD, value)
object.__setattr__(self, "date", value)
else:
object.__setattr__(self, prop, value)

@property
def isostring(self):
def isostring(self) -> Union[str, None]:
"""
Returns a standardized ISO 8601 version of the Python representation of the FHIR JSON.
Note that this may not be a fully accurate version of the input JSON.
In particular, it will convert partial dates like "2024" to full dates like "2024-01-01".
It will also normalize the timezone, if present.
"""
if self.date is None:
return None
if isinstance(self.date, datetime.datetime):
return isodate.datetime_isoformat(self.date)
return isodate.date_isoformat(self.date)

return self.date.isoformat()

@classmethod
def with_json(cls, jsonobj):
def with_json(cls, jsonobj: Union[str, list]):
""" Initialize a date from an ISO date string.
"""
isstr = isinstance(jsonobj, str)
if not isstr and sys.version_info[0] < 3: # Python 2.x has 'str' and 'unicode'
isstr = isinstance(jsonobj, basestring)
if isstr:
if isinstance(jsonobj, str):
return cls(jsonobj)

if isinstance(jsonobj, list):
return [cls(jsonval) for jsonval in jsonobj]

raise TypeError("`cls.with_json()` only takes string or list of strings, but you provided {}"
.format(type(jsonobj)))

@classmethod
def with_json_and_owner(cls, jsonobj, owner):
def with_json_and_owner(cls, jsonobj: Union[str, list], owner):
""" Added for compatibility reasons to FHIRElement; "owner" is
discarded.
"""
return cls.with_json(jsonobj)

def as_json(self):

def as_json(self) -> Union[str, None]:
"""Returns the original JSON string used to create this instance."""
if self.origval is not None:
return self.origval
return self.isostring


##################################
# Private properties and methods #
##################################

# Pulled from spec for date
_REGEX = re.compile(r"([0-9]([0-9]([0-9][1-9]|[1-9]0)|[1-9]00)|[1-9]000)(-(0[1-9]|1[0-2])(-(0[1-9]|[1-2][0-9]|3[0-1]))?)?")
_FIELD = "date"

@staticmethod
def _parse_partial(value: str, cls):
"""
Handle partial dates like 1970 or 1980-12.
FHIR allows them, but Python's datetime classes do not natively parse them.
"""
# Note that `value` has already been regex-certified by this point,
# so we don't have to handle really wild strings.
if len(value) < 10:
pieces = value.split("-")
if len(pieces) == 1:
return cls(int(pieces[0]), 1, 1)
else:
return cls(int(pieces[0]), int(pieces[1]), 1)
return cls.fromisoformat(value)

@staticmethod
def _parse_date(value: str) -> datetime.date:
return FHIRDate._parse_partial(value, datetime.date)

@staticmethod
def _parse_datetime(value: str) -> datetime.datetime:
# Until we depend on Python 3.11+, manually handle Z
value = value.replace("Z", "+00:00")
value = FHIRDate._strip_leap_seconds(value)
return FHIRDate._parse_partial(value, datetime.datetime)

@staticmethod
def _parse_time(value: str) -> datetime.time:
value = FHIRDate._strip_leap_seconds(value)
return datetime.time.fromisoformat(value)

@staticmethod
def _strip_leap_seconds(value: str) -> str:
"""
Manually ignore leap seconds by clamping the seconds value to 59.
Python native times don't support them (at the time of this writing, but also watch
https://bugs.python.org/issue23574). For example, the stdlib's datetime.fromtimestamp()
also clamps to 59 if the system gives it leap seconds.
But FHIR allows leap seconds and says receiving code SHOULD accept them,
so we should be graceful enough to at least not throw a ValueError,
even though we can't natively represent the most-correct time.
"""
# We can get away with such relaxed replacement because we are already regex-certified
# and ":60" can't show up anywhere but seconds.
return value.replace(":60", ":59")

@staticmethod
def _from_string(value: str) -> Any:
return FHIRDate._parse_date(value)
57 changes: 57 additions & 0 deletions fhir-parser-resources/fhirdatetime.py
@@ -0,0 +1,57 @@
"""Facilitate working with FHIR time fields."""
# 2024, SMART Health IT.

import datetime
import re
from typing import Any, Union

from .fhirdate import FHIRDate


# This inherits from FHIRDate as a matter of backwards compatibility.
# (in case anyone was doing isinstance(obj, FHIRDate))
# Next time we bump the major version, we can stop that and also drop the
# backwards-compatible 'date' alias. R4-QUIRK

class FHIRDateTime(FHIRDate):
    """
    A convenience class for working with FHIR datetimes in Python.
    http://hl7.org/fhir/R4/datatypes.html#datetime

    Converting to a Python representation does require some compromises:
    - This class will convert partial dates ("reduced precision dates") like "2024" into full
      naive datetimes using the earliest possible time (in this example, "2024-01-01T00:00:00")
      because Python's datetime class does not support partial dates.
    - FHIR allows arbitrary sub-second precision, but Python only holds microseconds.
    - Leap seconds (:60) will be changed to the 59th second (:59) because Python's time classes
      do not support leap seconds.

    If such compromise is not useful for you, avoid using the `date`, `datetime`, or `isostring`
    properties and just use the `as_json()` method in order to work with the original,
    exact string.

    Public properties:
    - `datetime`: datetime.datetime representing the JSON value (naive or aware)
    - `date`: backwards-compatibility alias for `datetime`
    - `isostring`: an ISO 8601 string version of the above Python object

    Public methods:
    - `as_json`: returns the original JSON used to construct the instance
    """

    def __init__(self, jsonval: Union[str, None] = None):
        self.datetime: Union[datetime.datetime, None] = None
        super().__init__(jsonval)

    ##################################
    # Private properties and methods #
    ##################################

    # Pulled from spec for datetime
    _REGEX = re.compile(r"([0-9]([0-9]([0-9][1-9]|[1-9]0)|[1-9]00)|[1-9]000)(-(0[1-9]|1[0-2])(-(0[1-9]|[1-2][0-9]|3[0-1])(T([01][0-9]|2[0-3]):[0-5][0-9]:([0-5][0-9]|60)(\.[0-9]+)?(Z|(\+|-)((0[0-9]|1[0-3]):[0-5][0-9]|14:00)))?)?)?")
    _FIELD = "datetime"

    @staticmethod
    def _from_string(value: str) -> Any:
        return FHIRDate._parse_datetime(value)
54 changes: 54 additions & 0 deletions fhir-parser-resources/fhirinstant.py
@@ -0,0 +1,54 @@
"""Facilitate working with FHIR time fields."""
# 2024, SMART Health IT.

import datetime
import re
from typing import Any, Union

from .fhirdate import FHIRDate


# This inherits from FHIRDate as a matter of backwards compatibility.
# (in case anyone was doing isinstance(obj, FHIRDate))
# Next time we bump the major version, we can stop that and also drop the
# backwards-compatible 'date' alias. R4-QUIRK

class FHIRInstant(FHIRDate):
    """
    A convenience class for working with FHIR instants in Python.
    http://hl7.org/fhir/R4/datatypes.html#instant

    Converting to a Python representation does require some compromises:
    - FHIR allows arbitrary sub-second precision, but Python only holds microseconds.
    - Leap seconds (:60) will be changed to the 59th second (:59) because Python's time classes
      do not support leap seconds.

    If such compromise is not useful for you, avoid using the `date`, `datetime`, or `isostring`
    properties and just use the `as_json()` method in order to work with the original,
    exact string.

    Public properties:
    - `datetime`: datetime.datetime representing the JSON value (aware only)
    - `date`: backwards-compatibility alias for `datetime`
    - `isostring`: an ISO 8601 string version of the above Python object

    Public methods:
    - `as_json`: returns the original JSON used to construct the instance
    """

    def __init__(self, jsonval: Union[str, None] = None):
        self.datetime: Union[datetime.datetime, None] = None
        super().__init__(jsonval)

    ##################################
    # Private properties and methods #
    ##################################

    # Pulled from spec for instant
    _REGEX = re.compile(r"([0-9]([0-9]([0-9][1-9]|[1-9]0)|[1-9]00)|[1-9]000)-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])T([01][0-9]|2[0-3]):[0-5][0-9]:([0-5][0-9]|60)(\.[0-9]+)?(Z|(\+|-)((0[0-9]|1[0-3]):[0-5][0-9]|14:00))")
    _FIELD = "datetime"

    @staticmethod
    def _from_string(value: str) -> Any:
        return FHIRDate._parse_datetime(value)
54 changes: 54 additions & 0 deletions fhir-parser-resources/fhirtime.py
@@ -0,0 +1,54 @@
"""Facilitate working with FHIR time fields."""
# 2024, SMART Health IT.

import datetime
import re
from typing import Any, Union

from .fhirdate import FHIRDate


# This inherits from FHIRDate as a matter of backwards compatibility.
# (in case anyone was doing isinstance(obj, FHIRDate))
# Next time we bump the major version, we can stop that and also drop the
# backwards-compatible 'date' alias. R4-QUIRK

class FHIRTime(FHIRDate):
    """
    A convenience class for working with FHIR times in Python.
    http://hl7.org/fhir/R4/datatypes.html#time

    Converting to a Python representation does require some compromises:
    - FHIR allows arbitrary sub-second precision, but Python only holds microseconds.
    - Leap seconds (:60) will be changed to the 59th second (:59) because Python's time classes
      do not support leap seconds.

    If such compromise is not useful for you, avoid using the `date`, `time`, or `isostring`
    properties and just use the `as_json()` method in order to work with the original,
    exact string.

    Public properties:
    - `time`: datetime.time representing the JSON value
    - `date`: backwards-compatibility alias for `time`
    - `isostring`: an ISO 8601 string version of the above Python object

    Public methods:
    - `as_json`: returns the original JSON used to construct the instance
    """

    def __init__(self, jsonval: Union[str, None] = None):
        self.time: Union[datetime.time, None] = None
        super().__init__(jsonval)

    ##################################
    # Private properties and methods #
    ##################################

    # Pulled from spec for time
    _REGEX = re.compile(r"([01][0-9]|2[0-3]):[0-5][0-9]:([0-5][0-9]|60)(\.[0-9]+)?")
    _FIELD = "time"

    @staticmethod
    def _from_string(value: str) -> Any:
        return FHIRDate._parse_time(value)
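
The `_FIELD` attribute on each class is what keeps the new, properly
named field and the legacy `.date` alias in step. The sketch below
isolates that idea in a stand-in class, assuming the same `__setattr__`
approach as the diff above. `AliasedValue` is a hypothetical name and is
not part of the commit.

```python
class AliasedValue:
    """Stand-in showing how one write updates both field names."""

    _FIELD = "datetime"  # the "real" field name; "date" is the alias

    def __init__(self):
        self.datetime = None  # routed through __setattr__ below

    def __setattr__(self, prop, value):
        if prop in {"date", self._FIELD}:
            # Any manual change invalidates the original JSON string.
            object.__setattr__(self, "origval", None)
            # Keep the two names in sync.
            object.__setattr__(self, self._FIELD, value)
            object.__setattr__(self, "date", value)
        else:
            object.__setattr__(self, prop, value)


value = AliasedValue()
value.datetime = "set via the new name"
seen_via_alias = value.date
value.date = "set via the alias"
seen_via_field = value.datetime
```

Writing through either name updates both, so existing code that reads or
writes `.date` should keep working alongside code that uses the new
field.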