-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle dates in the far future #42
Conversation
Several tables referenced by the query contain dates in the far future (e.g. the year 9999), probably because these dates represent null values. The report text, which was copied from the notebook text, says that the figures represent event activity from 2016-01-01 to the date the report was run. We can avoid having to handle dates in the far future in downstream actions by making the query consistent with the report text, and returning event activity to the date the report was run.
We move the initialisation of `FROM_DATE` and `TO_DATE` from deep within `render_report` to `config`, to make them more visible and to highlight the redundancy -- their canonical values are available within *analysis/query.sql* (former) and at runtime (latter). SQL Runner doesn't accept parametrised queries [1] -- and even if it did, we'd still have redundancy in *project.yaml* -- and the date and time an upstream SQL Runner action was executed is not accessible to downstream actions [2], so the best we can do to prevent drift between the `query` action and the `render_report` action is to highlight the redundancy. [1]: opensafely-core/sqlrunner#72 [2]: opensafely-core/sqlrunner#86
Because I don't know but my future self will assume that I did.
If a column given by parse_dates cannot be represented as an array of datetimes, then the column is returned as a string; no error is raised. We often encounter such columns, but we would like to know sooner rather than later, and with a more helpful error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like to see the transition from Jupyter notebook to this project. these new improvments all makes sense to me and look good
@@ -1,3 +1,5 @@ | |||
-- If you change the values of these variables, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AIUI sqlrunner doesn't (yet) support parameterised sql statements, to ensure consistency with the values in config.py
could a configure
or similar action be used preceding the query
action to populate the values in this file with the ones from the configuration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aware this might be lily-gilding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could! Such an action could also use Jinja2, as the render_report
action does. That said, whilst I wouldn't say the current implementation is beautiful, I also wouldn't say it needs further ornamentation.
To put it another way, I don't think the risks justify the effort. And if they did, or came close, then the effort should probably be invested in SQL Runner rather than in this study (opensafely-core/sqlrunner#72).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that would be a much more sensible approach!
Several tables referenced by the query contain dates in the far future (e.g. the year 9999), probably because these dates represent null values. These cause all sorts of problems, so we try to: