Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle dates in the far future #42

Merged
merged 4 commits into from
Apr 18, 2023
Merged

Conversation

iaindillingham
Copy link
Member

Several tables referenced by the query contain dates in the far future (e.g. the year 9999), probably because these dates represent null values. These cause all sorts of problems, so we try to:

  • avoid them 94ab676
  • avoid drift between an SQL-based action and a Python-based action b3852d0
  • let ourselves know sooner rather than later, and with a more helpful error message 72f30e1

Several tables referenced by the query contain dates in the far future
(e.g. the year 9999), probably because these dates represent null
values. The report text, which was copied from the notebook text, says
that the figures represent event activity from 2016-01-01 to the date
the report was run. We can avoid having to handle dates in the far
future in downstream actions by making the query consistent with the
report text, and returning event activity to the date the report was
run.
We move the initialisation of `FROM_DATE` and `TO_DATE` from deep within
`render_report` to `config`, to make them more visible and to highlight
the redundancy -- their canonical values are available within
*analysis/query.sql* (former) and at runtime (latter). SQL Runner
doesn't accept parametrised queries [1] -- and even if it did, we'd
still have redundancy in *project.yaml* -- and the date and time an
upstream SQL Runner action was executed is not accessible to downstream
actions [2], so the best we can do to prevent drift between the `query`
action and the `render_report` action is to highlight the redundancy.

[1]: opensafely-core/sqlrunner#72
[2]: opensafely-core/sqlrunner#86
Because I don't know but my future self will assume that I did.
If a column given by parse_dates cannot be represented as an array of
datetimes, then the column is returned as a string; no error is raised.
We often encounter such columns, but we would like to know sooner rather
than later, and with a more helpful error message.
Copy link
Member

@milanwiedemann milanwiedemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like to see the transition from Jupyter notebook to this project. these new improvments all makes sense to me and look good

@@ -1,3 +1,5 @@
-- If you change the values of these variables,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AIUI sqlrunner doesn't (yet) support parameterised sql statements, to ensure consistency with the values in config.py could a configure or similar action be used preceding the query action to populate the values in this file with the ones from the configuration?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aware this might be lily-gilding

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could! Such an action could also use Jinja2, as the render_report action does. That said, whilst I wouldn't say the current implementation is beautiful, I also wouldn't say it needs further ornamentation.

To put it another way, I don't think the risks justify the effort. And if they did, or came close, then the effort should probably be invested in SQL Runner rather than in this study (opensafely-core/sqlrunner#72).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that would be a much more sensible approach!

@iaindillingham iaindillingham merged commit 5f6d3d2 into main Apr 18, 2023
@iaindillingham iaindillingham deleted the iaindillingham/to-date branch April 18, 2023 12:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants