You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Following on from #42, a description of how to move unit handling and other changes into the body of aneris.
convenience
I would keep convenience.harmonise_all where it is i.e. not merge it into the main body: it's a useful interface that can just wrap around other stuff and provides an interface that doesn't currently exist. There might be a better way to implement it rather than this groupby stuff...
convenience._harmonise_single is not that well written tbh (well, it's too big). The steps are basically:
work out what history timeseries you're harmonising to (explode if there's nothing that matches or harmonisation year isn't there/is nan)
convert the units of history to the timeseries being harmonised
determine whether to override the default tree or not. I basically added a new way of handling overrides, that behave a bit more like pyam and scmdata's filter methods. I'm not sure if the new implementation is the best, but it did what I wanted and was more intuitive to me cause I could use the variable/region names as I saw them in the input data rather than having to know about how aneris chops variable names into sectors etc. (My implementation does also require a check to make sure that the user hasn't put in an ambiguous choice over override)
harmonise (fairly simple now as we know the timeseries to harmonise, history, harmonisation year and method)
convenience.convert_units is pretty simple so could just stay (and I think is much faster because it's all pandas and pint based). Otherwise make pyam or scmdata a dependency and use their unit conversion (might come with a performance hit for large sets of timeseries).
convenience._harmonise_aligned and convenience._get_delta are just wrappers to make harmonisation slightly clearer. The pattern of having the harmonisation methods as a class method does seem a bit weird to me, but it's also not really an issue. You could merge this into the main code base. Alternately, you could just rethink the way the main code base works (could be a lot of work and there are probably good reasons it's written like it is). Finally, we could just leave these mini wrappers as they are (more stuff to maintain, but provides an extra handy interface for those who don't need the full power of the current implementation like me).
errors
Just added an errors module. Nice and easy, probably can just stay as is.
overall
In terms of migrating it all, you could put some of the unit handling stuff deep in the guts. I think the more fundamental question is whether you want to maintain one interface (the existing one or the new one) or multiple. The old one has the advantage that it's probably better when you have lots of sectors, want people to put config in csvs etc. The new one has the advantage that it's a bit more what you see is what you get, you start with some timeseries, you use the names of the timeseries as they appear to set any overrides, all unit conversions are handled for you.
I think the fundamental difference between the two is how they cut the data. My implementation just does one by one harmonisation, ignoring any need for internal consistency between the timeseries. The existing implementation is definitely much better at ensuring that regional sums are maintained sensibly, sectoral breakdowns are maintained etc. You could probably change the internal logic and config to make it more what you see is what you get (rather than breaking variable names down into gas and sector in the guts of the processing) and use some of pyam's internal summing and consistency checking to simplify the code base, that might be where I would start.
The text was updated successfully, but these errors were encountered:
Following on from #42, a description of how to move unit handling and other changes into the body of aneris.
convenience
I would keep
convenience.harmonise_all
where it is i.e. not merge it into the main body: it's a useful interface that can just wrap around other stuff and provides an interface that doesn't currently exist. There might be a better way to implement it rather than this groupby stuff...convenience._harmonise_single
is not that well written tbh (well, it's too big). The steps are basically:convenience.convert_units
is pretty simple so could just stay (and I think is much faster because it's all pandas and pint based). Otherwise make pyam or scmdata a dependency and use their unit conversion (might come with a performance hit for large sets of timeseries).convenience._harmonise_aligned
andconvenience._get_delta
are just wrappers to make harmonisation slightly clearer. The pattern of having the harmonisation methods as a class method does seem a bit weird to me, but it's also not really an issue. You could merge this into the main code base. Alternately, you could just rethink the way the main code base works (could be a lot of work and there are probably good reasons it's written like it is). Finally, we could just leave these mini wrappers as they are (more stuff to maintain, but provides an extra handy interface for those who don't need the full power of the current implementation like me).errors
Just added an errors module. Nice and easy, probably can just stay as is.
overall
In terms of migrating it all, you could put some of the unit handling stuff deep in the guts. I think the more fundamental question is whether you want to maintain one interface (the existing one or the new one) or multiple. The old one has the advantage that it's probably better when you have lots of sectors, want people to put config in csvs etc. The new one has the advantage that it's a bit more what you see is what you get, you start with some timeseries, you use the names of the timeseries as they appear to set any overrides, all unit conversions are handled for you.
I think the fundamental difference between the two is how they cut the data. My implementation just does one by one harmonisation, ignoring any need for internal consistency between the timeseries. The existing implementation is definitely much better at ensuring that regional sums are maintained sensibly, sectoral breakdowns are maintained etc. You could probably change the internal logic and config to make it more what you see is what you get (rather than breaking variable names down into gas and sector in the guts of the processing) and use some of pyam's internal summing and consistency checking to simplify the code base, that might be where I would start.
The text was updated successfully, but these errors were encountered: