Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: use Narwhals expressions (and stable API) #528

Merged
merged 2 commits into from
Nov 5, 2024

Conversation

MarcoGorelli
Copy link
Contributor

Hey, this does a couple of things:

Comment on lines -577 to -579
df[name]
.dt.replace_time_zone("UTC")
.dt.convert_time_zone(local_tz)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separately from this PR, is this logic correct? note that dtype == nw.Datetime checks for any Datetime variant, including those which already have a time zone:

>>> nw.Datetime('us', 'Asia/Kathmandu') == nw.Datetime
True
>>> nw.Datetime('us') == nw.Datetime
True
>>> nw.Datetime('ns', 'UTC') == nw.Datetime
True

just like Polars does

I'm not saying that the logic here isn't correct, I'm just pointing this out in case there was any misunderstanding that can be corrected - dealing with time zones is always tricky 😄

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this was correct when I originally wrote it because VegaFusion previously only returned naive timestamps that were implicitly in UTC. But since then I've updated VegaFusion to work with timezone aware timestamps internally, and so it is now possible for it to return timezone aware timestamps.

So the correct logic now would be to use replace_time_zone only when it's a naive timestamp. What's the canonical way to test for a timezone aware timestamp in polars/narwhals?

Copy link
Contributor Author

@MarcoGorelli MarcoGorelli Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah thanks for explaining, so is the desired logic:

  • if it's tz-naive, then set to UTC and then convert to local_tz
  • if it's tz-aware, then just convert to local_tz

?

If so, then just .dt.convert_time_zone(local_tz) would do the trick, as that converts as if from UTC for tz-naive data https://narwhals-dev.github.io/narwhals/api-reference/expr_dt/#narwhals.expr.ExprDateTimeNamespace.convert_time_zone

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that's perfect!

@jonmmease
Copy link
Collaborator

Thanks!

@jonmmease jonmmease merged commit aced322 into vega:v2 Nov 5, 2024
19 checks passed
jonmmease pushed a commit that referenced this pull request Nov 16, 2024
* use Narwhals stable api, use expressions

* just use .dt.convert_time_zone
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants