
fix(parsers, MX): drop seldom occurring datapoints with an hour set to 25 #7696

Merged
merged 2 commits into from
Jan 6, 2025

Contributor

@consideRatio consideRatio commented Jan 6, 2025

Issue

Possibly #7556, but I'm not sure; that could be caused by something else. Either way, this is an issue I found.

Description

It appears that the date/hour data returned for the MX production parser sometimes includes an entry for a 25th hour, instead of only the expected hours 1-24. In such cases, we now drop that datapoint.
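A minimal sketch of the filter described above, for illustration only. It assumes the hour column is named "Hora" (as in the diff) and may hold either strings or integers; the helper name `drop_hour_25`, the "Eolica" column, and the sample values are hypothetical and not taken from the parser.

```python
import pandas as pd


def drop_hour_25(df: pd.DataFrame, hour_col: str = "Hora") -> pd.DataFrame:
    """Return a copy of df without rows whose hour column equals 25.

    The hour values are normalised to stripped strings first, so the
    filter also works if the column was parsed as integers rather than
    strings (comparing an integer column with "25" would match nothing).
    """
    hours = df[hour_col].astype(str).str.strip()
    return df[hours != "25"].reset_index(drop=True)


# Hypothetical sample data: three hourly rows, one with the unexpected hour 25.
sample = pd.DataFrame(
    {
        "Hora": ["1", "24", "25"],
        "Eolica": [10.0, 12.5, 11.0],
    }
)

cleaned = drop_hour_25(sample)
print(cleaned["Hora"].tolist())
```

A boolean mask is used here instead of `df.drop(...index)`; the two behave the same when index labels are unique, but the mask does not depend on the index at all.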

Double check

  • I have tested my parser changes locally with poetry run test_parser "zone_key"
  • I have run pnpx prettier@2 --write . and poetry run format in the top level directory to format my changes.

@github-actions github-actions bot added the parser, python (Pull requests that update Python code), and zone config (Pull request or issue for zone configurations) labels Jan 6, 2025
Comment on lines +151 to +154
# The hour column has been seen at least once (3rd Nov 2024) to include hours
# 1-25 rather than the expected 1-24. For now, we drop such entries if they
# show up.
df = df.drop(df[df["Hora"] == "25"].index)
Member

How does this even happen...

Is it possible this is related to daylight savings?

Contributor Author

I considered that as well, but this was 3rd Nov, and I don't think the clocks would change in November. Also, it seems MX hasn't had DST since 2022 or something like that.

Member

Then IDK what it can be, but since it doesn't happen every day, we should probably just get rid of that datapoint.

I don't even know where it would end up as our database is strictly 24 hours and not 25 😅
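One way to sanity-check the DST question raised in this thread is to measure how long a local calendar day lasts in a given time zone. The sketch below uses only the standard library; the helper name `local_day_length_hours` is made up for illustration, and the choice of zones is an assumption: "America/Mexico_City" stands in for central Mexico, and "America/Tijuana" is included only as an example of a zone that follows the US clock change, which in 2024 fell on 3rd Nov. It does not establish what the data source actually does.

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


def local_day_length_hours(tz_name: str, day: date) -> float:
    """Return the length of a local calendar day in hours.

    Both midnights are converted to UTC before subtracting, because
    subtracting two datetimes that share the same tzinfo object ignores
    the UTC offset and would always give 24 hours.
    """
    tz = ZoneInfo(tz_name)
    start = datetime.combine(day, time.min, tzinfo=tz)
    end = datetime.combine(day + timedelta(days=1), time.min, tzinfo=tz)
    elapsed = end.astimezone(timezone.utc) - start.astimezone(timezone.utc)
    return elapsed / timedelta(hours=1)


day = date(2024, 11, 3)
for zone in ("America/Mexico_City", "America/Tijuana"):
    print(zone, local_day_length_hours(zone, day))
```

With an up-to-date time zone database, this should report a 24-hour day for America/Mexico_City and a 25-hour day for America/Tijuana on that date, so a 25th hour could plausibly come from a source that follows a US-aligned clock. That is only a hypothesis; the check says nothing about how the upstream data is produced.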

Member

@VIKTORVAV99 VIKTORVAV99 left a comment

LGTM! 🎉

@VIKTORVAV99 VIKTORVAV99 enabled auto-merge (squash) January 6, 2025 16:01
@VIKTORVAV99 VIKTORVAV99 merged commit 2d517ad into electricitymaps:master Jan 6, 2025
21 checks passed