Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust usage of utcnow() and utcfromtimestamp() for Python 3 #440

Closed
wants to merge 5 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/workflows/tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ jobs:
python-version: ['3.7', '3.8', '3.9', '3.10']
cratedb-version: ['4.8.0']
sqla-version: ['1.3.24', '1.4.37']
fail-fast: true
fail-fast: false
env:
CRATEDB_VERSION: ${{ matrix.cratedb-version }}
SQLALCHEMY_VERSION: ${{ matrix.sqla-version }}
Expand Down
40 changes: 40 additions & 0 deletions CHANGES.txt
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,46 @@ Changes for crate
Unreleased
==========

- BREAKING CHANGE: The datetime handling is now independent of the time zone
setting of your machine and always uses UTC. See next section for more
details.


Datetime handling
-----------------

The datetime handling is now independent of the time zone setting of your
machine. This has to be considered when working with the SQLAlchemy dialect,
returning timestamps, and your system time zone is not expressed in UTC.

To find out about your situation, invoke::

date +"%Z %z"

where you can find out if your system is running on UTC, or not::

UTC +0000
CEST +0200

For more information about this change, please consider reading those sections
of the Python documentation and corresponding articles.

.. warning::

Because naive ``datetime`` objects are treated by many ``datetime`` methods
as local times, it is preferred to use aware datetimes to represent times
in UTC. As such, the recommended way to create an object representing

- the current time in UTC is by calling ``datetime.now(timezone.utc)``.
- a specific timestamp in UTC is by calling ``datetime.fromtimestamp(..., tz=timezone.utc)``.

References:

- https://docs.python.org/3/library/datetime.html#datetime.datetime.utcnow
- https://docs.python.org/3/library/datetime.html#datetime.datetime.utcfromtimestamp
- https://blog.ganssle.io/articles/2019/11/utcnow.html
- https://aaronoellis.com/articles/python-datetime-utcnow-considered-harmful


2022/07/04 0.27.1
=================
Expand Down
2 changes: 2 additions & 0 deletions setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,8 @@ def read(path):
'zope.testing>=4,<5',
'zope.testrunner>=5,<6',
'zc.customdoctests>=1.0.1,<2',
'freezegun>=1,<2',
'time-machine>=2,<3',
'createcoverage>=1,<2',
'stopit>=1.1.2,<2',
'flake8>=4,<5'],
Expand Down
8 changes: 5 additions & 3 deletions src/crate/client/sqlalchemy/dialect.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@
# software solely pursuant to the terms of the relevant commercial agreement.

import logging
from datetime import datetime, date
from datetime import datetime, date, timezone

from sqlalchemy import types as sqltypes
from sqlalchemy.engine import default, reflection
Expand Down Expand Up @@ -90,7 +90,8 @@ def process(value):
if not value:
return
try:
return datetime.utcfromtimestamp(value / 1e3).date()
return datetime.fromtimestamp(value / 1e3, tz=timezone.utc) \
.replace(tzinfo=None).date()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we throw away the tzinfo why not keep using utcfromtimestamp?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea is to be a preparation step to actually allow the user to pass a tzinfo here, later on?

except TypeError:
pass

Expand Down Expand Up @@ -129,7 +130,8 @@ def process(value):
if not value:
return
try:
return datetime.utcfromtimestamp(value / 1e3)
return datetime.fromtimestamp(value / 1e3, tz=timezone.utc) \
.replace(tzinfo=None)
except TypeError:
pass

Expand Down
15 changes: 6 additions & 9 deletions src/crate/client/sqlalchemy/doctests/itests.txt
Original file line number Diff line number Diff line change
Expand Up @@ -65,10 +65,10 @@ Retrieve the location from the database::
>>> location.name
'Earth'

Date should have been set at the insert due to default value via python method::
Date should have been set at the insert due to default value via Python method::

>>> from datetime import datetime
>>> now = datetime.utcnow()
>>> from datetime import datetime, timezone
>>> now = datetime.now(timezone.utc).replace(tzinfo=None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the change here. Isn't now(timezone.utc).replace(tzinfo=None) the same as utcnow()?

Copy link
Member Author

@amotl amotl Jul 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is really sad. In situations when the system time zone setting, defined through the TZ environment variable, is expressed in UTC, everything is fine. However, it is not the case if the system time zone is anything other than UTC.

With c1f183e, I've tried to improve the situation to outline the problem, on behalf of some test cases. 5894c2e quotes and references all important information about this topic, from both the Python documentation and articles by other capacities.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the case.

utcnow:

Return the current UTC date and time, with tzinfo None.

Yes, it is a naive datetime without timezone info attached, but it is using utc.

>>> print(datetime.utcnow(), '\n', datetime.now(tz=timezone.utc).replace(tzinfo=None), '\n', datetime.now())
2022-07-22 18:27:27.055192
 2022-07-22 18:27:27.055200
 2022-07-22 20:27:27.055213
↪  date +"%Z %z"
CEST +0200

Copy link
Member

@mfussenegger mfussenegger Jul 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also as the articles point out, the problem is the use of the naive datetime (lack of tzinfo) in subsequent operations. If we remove the tzinfo anyways via .replace(tzinfo=None) we don't gain anything by using now(tz=timezone.utc) but only make things more complicated.

And I'm not sure if attaching a timezone is even the correct thing to do, because the timestamp stored in CrateDB also doesn't have one attached. Assuming that timestamp values returned by the server are UTC may not be safe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. You convinced me that this goes too far. I've submitted #441 instead, which solely focuses on mitigating a fluke I regularly occasionally observed when running the test suite on CI after midnight.

>>> dt = location.date

>>> dt.year == now.year
Expand All @@ -80,9 +80,6 @@ Date should have been set at the insert due to default value via python method::
>>> dt.day == now.day
True

>>> (now - location.datetime_tz).seconds < 4
True

Verify the return type of date and datetime::

>>> type(location.date)
Expand All @@ -103,10 +100,10 @@ aren't set when the row is inserted as there is no default method::
>>> location.nullable_date is None
True

The datetime and date can be set using a update statement::
The datetime and date can be set using an update statement::

>>> location.nullable_date = datetime.today()
>>> location.nullable_datetime = datetime.utcnow()
>>> location.nullable_date = datetime.now(timezone.utc).replace(tzinfo=None).date()
>>> location.nullable_datetime = datetime.now(timezone.utc).replace(tzinfo=None)
>>> session.flush()

Refresh "locations" table:
Expand Down
5 changes: 3 additions & 2 deletions src/crate/client/sqlalchemy/tests/dialect_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,6 @@
# with Crate these terms will supersede the license and you may use the
# software solely pursuant to the terms of the relevant commercial agreement.

from datetime import datetime
from unittest import TestCase
from unittest.mock import MagicMock, patch

Expand All @@ -32,6 +31,8 @@
from sqlalchemy.orm import Session
from sqlalchemy.testing import eq_, in_

from crate.testing.util import datetime_now_utc_naive

FakeCursor = MagicMock(name='FakeCursor', spec=Cursor)


Expand Down Expand Up @@ -62,7 +63,7 @@ class Character(self.base):
name = sa.Column(sa.String, primary_key=True)
age = sa.Column(sa.Integer, primary_key=True)
obj = sa.Column(Object)
ts = sa.Column(sa.DateTime, onupdate=datetime.utcnow)
ts = sa.Column(sa.DateTime, onupdate=datetime_now_utc_naive)

self.character = Character
self.session = Session()
Expand Down
6 changes: 3 additions & 3 deletions src/crate/client/sqlalchemy/tests/insert_from_select_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,6 @@
# with Crate these terms will supersede the license and you may use the
# software solely pursuant to the terms of the relevant commercial agreement.

from datetime import datetime
from unittest import TestCase
from unittest.mock import patch, MagicMock

Expand All @@ -29,6 +28,7 @@
from sqlalchemy import select, insert

from crate.client.cursor import Cursor
from crate.testing.util import datetime_now_utc_naive


fake_cursor = MagicMock(name='fake_cursor')
Expand All @@ -51,15 +51,15 @@ class Character(Base):

name = sa.Column(sa.String, primary_key=True)
age = sa.Column(sa.Integer)
ts = sa.Column(sa.DateTime, onupdate=datetime.utcnow)
ts = sa.Column(sa.DateTime, onupdate=datetime_now_utc_naive)
status = sa.Column(sa.String)

class CharacterArchive(Base):
__tablename__ = 'characters_archive'

name = sa.Column(sa.String, primary_key=True)
age = sa.Column(sa.Integer)
ts = sa.Column(sa.DateTime, onupdate=datetime.utcnow)
ts = sa.Column(sa.DateTime, onupdate=datetime_now_utc_naive)
status = sa.Column(sa.String)

self.character = Character
Expand Down
7 changes: 4 additions & 3 deletions src/crate/client/sqlalchemy/tests/update_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@
from sqlalchemy.ext.declarative import declarative_base

from crate.client.cursor import Cursor
from crate.testing.util import datetime_now_utc_naive


fake_cursor = MagicMock(name='fake_cursor')
Expand All @@ -50,7 +51,7 @@ class Character(self.base):
name = sa.Column(sa.String, primary_key=True)
age = sa.Column(sa.Integer)
obj = sa.Column(Object)
ts = sa.Column(sa.DateTime, onupdate=datetime.utcnow)
ts = sa.Column(sa.DateTime, onupdate=datetime_now_utc_naive)

self.character = Character
self.session = Session()
Expand All @@ -60,7 +61,7 @@ def test_onupdate_is_triggered(self):
char = self.character(name='Arthur')
self.session.add(char)
self.session.commit()
now = datetime.utcnow()
now = datetime_now_utc_naive()

fake_cursor.fetchall.return_value = [('Arthur', None)]
fake_cursor.description = (
Expand Down Expand Up @@ -89,7 +90,7 @@ def test_bulk_update(self):
Checks whether bulk updates work correctly
on native types and Crate types.
"""
before_update_time = datetime.utcnow()
before_update_time = datetime_now_utc_naive()

self.session.query(self.character).update({
# change everyone's name to Julia
Expand Down
8 changes: 4 additions & 4 deletions src/crate/client/tests.py
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,6 @@
import unittest
import doctest
from pprint import pprint
from datetime import datetime, date
from http.server import HTTPServer, BaseHTTPRequestHandler
import ssl
import time
Expand Down Expand Up @@ -56,6 +55,7 @@
)
from .sqlalchemy.tests import test_suite as sqlalchemy_test_suite
from .sqlalchemy.types import ObjectArray
from ..testing.util import datetime_now_utc_naive, date_now_utc_naive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this is relative?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyCharm created that import line for me. The import statements above are also relative.


log = logging.getLogger('crate.testing.layer')
ch = logging.StreamHandler()
Expand Down Expand Up @@ -213,9 +213,9 @@ class Location(Base):
__tablename__ = 'locations'
name = sa.Column(sa.String, primary_key=True)
kind = sa.Column(sa.String)
date = sa.Column(sa.Date, default=date.today)
datetime_tz = sa.Column(sa.DateTime, default=datetime.utcnow)
datetime_notz = sa.Column(sa.DateTime, default=datetime.utcnow)
date = sa.Column(sa.Date, default=date_now_utc_naive)
datetime_tz = sa.Column(sa.DateTime, default=datetime_now_utc_naive)
datetime_notz = sa.Column(sa.DateTime, default=datetime_now_utc_naive)
nullable_datetime = sa.Column(sa.DateTime)
nullable_date = sa.Column(sa.Date)
flag = sa.Column(sa.Boolean)
Expand Down
Loading