-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Tiling for Multiprocessing #652
base: main
Are you sure you want to change the base?
Conversation
04008e3
to
5555068
Compare
5555068
to
b787b68
Compare
Great! 😄 In terms of locations in the package, as these are methods that only array-related, I suggest we move them to either geoutils/geoutils/raster/sampling.py Line 152 in 6482e77
Preference for the second option. I don't picture users manipulating these lower-level functions themselves, so we could keep them outside of the Also, I remember that the "stitching" part of the splitting with overlap was quite delicate sometimes (cropping the overlapping chunks with relative indexes). If it can be useful to you, here's how we had it coded in one package ("inv" saves the indexes to stitch the overlapping chunks later on): https://github.com/iamdonovan/pyddem/blob/2fbd54049cf90f77076ad62ba5e1b7cf179cf52e/pyddem/fit_tools.py#L1390 |
[ | ||
[[0, 50, 0, 50], [0, 50, 45, 95], [0, 50, 90, 100]], | ||
[[45, 95, 0, 50], [45, 95, 45, 95], [45, 95, 90, 100]], | ||
[[90, 100, 0, 50], [90, 100, 45, 95], [90, 100, 90, 100]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have expected the tiling here to yield [0, 55, 0, 55], [0, 55, 45, 100], [45, 100, 0, 55], [45, 100, 45, 100].
The list comprehensions here should normally work, might save you time re-coding this 😉 : https://github.com/iamdonovan/pyddem/blob/main/pyddem/fit_tools.py#L1398
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please be careful using pyddem, it's under GPL-3.0 License (same as RichDem), It is therefore not advisable to import it. However, I will look into it regarding "inspiration."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks to Sébastien Dinot (https://github.com/sdinot) for proposing legally viable solutions.
-
Even projects under the GNU GPL v3.0 license are rarely written ex nihilo. This function may come from another project released under a permissive license. We need to search it. Without a specialized tool, it's mission impossible, but I have a little idea: it's time to test Software Heritage.
-
If that leads nowhere, someone needs to write an equivalent function without being influenced by the existing code (in other words, it would be best to find someone who has never seen this code). When rewritten in a black-box approach, there's little chance the final result will be identical. This implementation can potentially draw inspiration from an algorithm found in a book or article, in which case the latter should be explicitly cited as the source of inspiration to avoid any ambiguity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'm not sure where this function came from originally.
From Git blame, I'm the one who added it to Pyddem 5 years ago, but it might have been inspired from a Stackoverflow response or similar.
@rhugonnet I created a new file |
Exactly yes, it's all good now I think! 🙂 On the organization again: For next steps: We might have to think about how the user will define the tiling argument into |
@rhugonnet I agree that it is maybe better to avoid overloading Raster class with tiling. Then I moved these functions in For next steps, you are probably right, but as tiling is no longer defined in Raster, I find that it makes less sense to define an attribute |
Resolves #649
Summary
The tiling grid generation enables splitting large raster datasets into smaller tiles, with an optional overlap parameter, to facilitate multiprocessing and improve performance when handling large
Features
Tiling Grid Generation (
generate_tiling_grid
): Generate a grid of positions by splitting [row_min, row_max] x [col_min, col_max] into tiles of size row_split x col_split with optional overlap.Raster Method
compute_tiling
: Compute the raster tiling grid withgenerate_tiling_grid
, specifyingtile_size
andoverlap
. Ensure both raster (self
andraster_ref
) are compatible with the same tiling.Raster Method
plot_tiling
: Plot the raster with its tiling grid.Tests
generate_tiling_grid
(expected shape, expected first and last tile, overlap consistency).Example (on xDEM)
200x200 tiling without overlapping :
data:image/s3,"s3://crabby-images/7d6ed/7d6edc24c8b5b16733a4d8abff3e309092cc09aa" alt="image"
200x200 tiling with 10 pixels overlapping :
data:image/s3,"s3://crabby-images/350c1/350c1d48f03358066d5f4d4ec0a11cc5a43f15ea" alt="image"