-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature minimal workflow #60
Feature minimal workflow #60
Conversation
The minimal configuration allows to run the entire workflow in minimal time. On my machine, with automatic downloads available, the workflow takes less than 5 minutes to run when run with three concurrent jobs. This is a first step towards continuous integration tests.
This is currently blocked by #58. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial comments before #58 is merged in
basins = gpd.read_file(path_to_basins) | ||
def preprocess_basins(path_to_basins, bbox, path_to_output): | ||
"Filter and fix basin shapes." | ||
basins = gpd.read_file(path_to_basins, bbox=bbox) | ||
basins.geometry = basins.geometry.map(_buffer_if_necessary) | ||
basins.to_file(path_to_output, driver=DRIVER) | ||
|
||
|
||
def _buffer_if_necessary(shape): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
once incorporated, move to eurocalliopelib and add in the same assertion as I did in solar_and_wind_potentials
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this function is only used once within this repo and it is difficult to test and therefore I don't see an added value of moving it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once solar and wind potentials is a submodule, I can see a shared util library being useful. Then it makes sense for functions like this to move to the 'lib' since it can be used in both modules...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but maybe do it then? I am not 100% sure this'll work. I've played around with importing libraries of other workflows vut didn't manage to make it work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit of a confusing PR since you introduce the minimal workflow alongside restructuring the hydro stuff. I assume the restructuring isn't entirely necessary for the minimal workflow to function?
Anyway, the restructuring makes sense. One thing I wonder about is whether the last hydro step (referencing 'hydro.py') should also be in 'hydro.smk', for consistency?
basins = gpd.read_file(path_to_basins) | ||
def preprocess_basins(path_to_basins, bbox, path_to_output): | ||
"Filter and fix basin shapes." | ||
basins = gpd.read_file(path_to_basins, bbox=bbox) | ||
basins.geometry = basins.geometry.map(_buffer_if_necessary) | ||
basins.to_file(path_to_output, driver=DRIVER) | ||
|
||
|
||
def _buffer_if_necessary(shape): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once solar and wind potentials is a submodule, I can see a shared util library being useful. Then it makes sense for functions like this to move to the 'lib' since it can be used in both modules...?
In fact, the two parts of this PR are related: the minimal workflow would not work without restructuring the hydro module. |
About moving In any case, I renamed it to make its purpose clearer: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a changelog entry and we're good to go!
This is a first step towards continuous integration (#56) and should allow to run integration tests manually. This configuration takes less than 5 minutes to run on my machine, with three concurrent jobs and with automatic downloads available.
I could imagine us to enforce successful runs in each of our future pull requests -- right before merging changes into
develop
. As long as issues in #56 aren't solved, it has to run locally by the author of the pull request.In the process, I also moved estimation of storage capacities into the preprocessing of the JRC hydro stations database. Apart from problems that arose in its old place, the new place fits logically much better.