an attempt at fixing the 15m vs 1h resolutions (for now) #196
base: main
The German special case is not handled here. They have two sets of data, one 15min and one 60min. The 60min set is the one that needs to be parsed. That is why we ignored everything besides 60min before. entsoe |
what's the preferred outcome in this case? The code currently only deals with 1h intervals, so ...
It seems there's no relation between the value for a certain point and the hour-points in the 15m resolution (no average of either all hour-points or some sliding window around the hour mark), so a choice needs to be made. Given the DE example above, there's
and
There's no way to get the 63,97 value of 23:00 by using the datapoints in the 15m set (or I made some serious mistake). |
Anything besides the 1h interval could be silently discarded in the German case. The 15min data is a price from another electricity product. No idea what it really means, except that we do not want it: "day-ahead prices of the separate 10:15 auction of EXAA are also published under the filter 'resolution=PT15M'" |
addressed that, prefers the 60M data, but takes 15M data if no 60M data is available (which keeps the oddball BE "bug" situation working). Added the DE example as a test for this situation |
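The preference described above could look roughly like the sketch below. It is not the code from this PR: the function name `merge_prefer_hourly` is hypothetical, both inputs are assumed to be already reduced to one price per hour start, and the prices are invented for illustration.

```python
from datetime import datetime, timezone


def merge_prefer_hourly(hourly, quarter_hourly):
    """Combine two {hour_start: price} mappings, preferring PT60M values.

    `hourly` holds prices taken from the PT60M series, `quarter_hourly`
    holds per-hour prices derived from the PT15M series. How the PT15M
    data is reduced to one value per hour is left open here.
    """
    merged = dict(quarter_hourly)  # start with the fallback data
    merged.update(hourly)          # PT60M wins wherever both exist
    return dict(sorted(merged.items()))


# Invented example: 22:00 only exists in the 15M data, 23:00 exists in both.
h22 = datetime(2024, 10, 7, 22, tzinfo=timezone.utc)
h23 = datetime(2024, 10, 7, 23, tzinfo=timezone.utc)
merged = merge_prefer_hourly({h23: 60.0}, {h22: 40.0, h23: 45.0})
```

With this shape, a response that only has PT15M data for some hours (the BE situation) still yields a price for those hours, while the DE response ends up with only its PT60M prices.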
Not sure if your test files are the same as the DE files, but your mixed file contains
Looking at the date you committed the file (October 7), I can imagine that the data retrieved gave PT15 for the past and PT60 for future prices. Are you sure there are overlapping periods in the DE case? |
I suggest making a separate method for the PT15M logic, as we also need to cope with missing positions (price remains equal).
The PT15M logic can iterate through at most 4 positions and return the hour and the average of the prices found. As such the code works for any number of positions (max 4) in a PT15M resolution.
See some (untested) code in #202
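As an illustration of the suggestion above (not the code from #202), here is a minimal sketch that forward-fills missing positions and averages every group of four. The function name `hourly_average` and its input shape are assumptions. Note that, as discussed earlier in this thread, such an average does not reproduce the PT60M values in the DE case.

```python
def hourly_average(points: dict[int, float], positions_per_hour: int = 4) -> list[float]:
    """Average PT15M prices per hour, repeating the previous price for gaps.

    `points` maps the 1-based position to its price. A missing position
    means "price remains equal", so the last seen price is reused.
    """
    if not points:
        return []
    # Round the highest seen position up to a whole number of hours.
    hours = -(-max(points) // positions_per_hour)
    total = hours * positions_per_hour

    filled = []
    previous = None
    for position in range(1, total + 1):
        if position in points:
            previous = points[position]
        if previous is None:
            raise ValueError("the first position has no price to start from")
        filled.append(previous)

    return [
        sum(filled[start:start + positions_per_hour]) / positions_per_hour
        for start in range(0, total, positions_per_hour)
    ]
```

For example, `hourly_average({1: 10.0, 2: 20.0, 4: 30.0})` fills position 3 with the price of position 2 before averaging the four values.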
seems like i forgot to add the datafile for the DE testcase... should be good now. |
The solution would have to be generic, apparently the XML specs allow for PT30M resolutions as well... |
The setup in #202 allows for extending towards PT30M and other resolutions |
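To make the "generic resolution" idea concrete, here is a small sketch that turns a resolution string into a `timedelta` with the standard library only. The helper names `parse_resolution` and `position_start` are hypothetical, and the pattern deliberately supports only hour and minute durations such as PT15M, PT30M, PT60M or PT1H, not the full ISO 8601 duration grammar.

```python
import re
from datetime import datetime, timedelta, timezone

# Only the hour/minute subset of ISO 8601 durations is handled here.
_RESOLUTION = re.compile(r"^PT(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?$")


def parse_resolution(value: str) -> timedelta:
    """Convert e.g. 'PT15M' or 'PT1H' into a timedelta."""
    match = _RESOLUTION.match(value)
    if not match or not (match["hours"] or match["minutes"]):
        raise ValueError(f"unsupported resolution: {value!r}")
    return timedelta(
        hours=int(match["hours"] or 0),
        minutes=int(match["minutes"] or 0),
    )


def position_start(period_start: datetime, resolution: str, position: int) -> datetime:
    """Start time of a 1-based position within a period."""
    return period_start + (position - 1) * parse_resolution(resolution)


start = datetime(2024, 10, 7, 22, 0, tzinfo=timezone.utc)
third_half_hour = position_start(start, "PT30M", 3)
```

Deriving the step from the response this way means no resolution is hard-coded, so PT15M, PT30M and PT60M series share one code path.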
In the BE mixed file there is
In the DE example there are
Meaning: in the DE case the server gave 2 responses to the request: two 'TimeSeries' with different mRID, different resolutions and different classificationSequence_AttributeInstanceComponent.position. It seems classificationSequence_AttributeInstanceComponent.position is an optional parameter that is also valid in the request. Not sure what it does. I think that when there are two (or more) TimeSeries elements in the response, these can be interpreted as alternatives. |
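To show what "two TimeSeries with different mRID and resolution" means for a parser, here is a sketch using `xml.etree.ElementTree`. The sample document is a heavily simplified, invented stand-in: real responses carry an XML namespace and many more elements, and the placement of the classificationSequence element directly under TimeSeries is an assumption. The function name `read_series` is hypothetical.

```python
import xml.etree.ElementTree as ET

SEQUENCE_TAG = "classificationSequence_AttributeInstanceComponent.position"

# Invented, namespace-free sample with made-up prices.
SAMPLE = """
<Publication_MarketDocument>
  <TimeSeries>
    <mRID>1</mRID>
    <classificationSequence_AttributeInstanceComponent.position>1</classificationSequence_AttributeInstanceComponent.position>
    <Period>
      <resolution>PT60M</resolution>
      <Point><position>1</position><price.amount>50.0</price.amount></Point>
    </Period>
  </TimeSeries>
  <TimeSeries>
    <mRID>2</mRID>
    <classificationSequence_AttributeInstanceComponent.position>2</classificationSequence_AttributeInstanceComponent.position>
    <Period>
      <resolution>PT15M</resolution>
      <Point><position>1</position><price.amount>40.0</price.amount></Point>
      <Point><position>2</position><price.amount>45.0</price.amount></Point>
    </Period>
  </TimeSeries>
</Publication_MarketDocument>
"""


def read_series(xml_text: str) -> list[dict]:
    """Return one dict per TimeSeries with its mRID, sequence, resolution and points."""
    root = ET.fromstring(xml_text.strip())
    found = []
    for series in root.iter("TimeSeries"):
        # Collect direct children by tag name to avoid path syntax for dotted tags.
        header = {child.tag: child.text for child in series}
        period = series.find("Period")
        points = {}
        for point in period.iter("Point"):
            fields = {child.tag: child.text for child in point}
            points[int(fields["position"])] = float(fields["price.amount"])
        found.append({
            "mrid": header.get("mRID"),
            "sequence": header.get(SEQUENCE_TAG),
            "resolution": period.findtext("resolution"),
            "points": points,
        })
    return found


series = read_series(SAMPLE)
```

Keeping the sequence value next to each series leaves room to filter on it later, if the field turns out to distinguish the alternatives.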
Yes, I agree on this. Separate methods with duplicate code are not really clean or maintainable. I even think we should go further than this and just use the data in the resolution we get from the API, making the complete integration generic with multiple resolutions, instead of forcing it into the 60m resolution, which can be done in multiple ways and would create confusion in the future. I also think we can assume that real mixed responses like the Belgian bug of this week are not something we will see again. It would be nice to be able to handle them, but it is not really a dealbreaker.
That the German response has multiple data sets is already explained and mentioned above. More info in this issue where entsoe-py handles this case. Good spot on the classificationSequence_AttributeInstanceComponent.position, I didn't notice that one before. If this field makes it possible to separate SDAC and EXAA data, it could be used to easily parse only the SDAC data. We will need to see if we can find any documentation on this field. |
My two cents below. The resolution is actually an ISO 8601 expression of a duration (Pxxxx: period, followed by time indicators). IMHO, the best way forward would be to parse the data at the given resolution, making zero assumptions, into a pandas DataFrame and then aggregate it to the wanted resolution (60 minutes in this case, which might change in the future). Pandas has https://pandas.pydata.org/docs/reference/api/pandas.Timedelta.isoformat.html for the ISO 8601 duration format (that method formats a Timedelta, so the parsing itself would have to go through pandas.Timedelta), and the code could use the
And, as a bonus, if I remember correctly, pandas also has a forward-fill method for the
Using pandas might give a more robust parsing and handling of the data. The code would only have to translate between the XML and the dataframes, and apply the correct math to the dataframes as needed. (I kind of got pulled into this by making this one small fix (to actually just fix my battery-charging), but the problem is interesting from a developer's view, so I keep coming back ... ;-) ) |
Using pandas would pin us to version 2.1.4, as pandas is pinned by the Home Assistant core project. This already caused a mess before, so I am quite averse to taking on that dependency. And if I hadn't removed the dependency on entsoe-py a few weeks ago, the XML change would have broken the integration completely and the only solution would have been writing our own XML parser from zero. Taking dependencies is always a trade-off and an interesting topic from a developer's view. In general I love libraries, they make so much possible. But there is always a cost. Besides that, I think aggregating or resampling the data is not the responsibility of this integration. The aim is to bring the data of ENTSO-E inside Home Assistant and make it usable. The data should be kept as pure and untouched as possible. |
timedelta dependent on the returned value
This should fix the missing data for now. This needs further tuning if there's ever a non-4-step-15m-interval. I might add that later this week, no time today ...