Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: Refactor caching (#124, #148, #153) #219

Merged
merged 8 commits into from
Feb 23, 2024
Merged

Conversation

ntfrgl
Copy link
Member

@ntfrgl ntfrgl commented Feb 21, 2024

This PR fully implements:

  • the suggestions (1-3) in this comment, i.e., a new mix-in class Cached with a decorator @Cached.method(),
  • a bare minimum but robust version of suggestion (4), i.e., a detailed behavioral test of the mix-in itself which leaves subclass-specific tests to future work,
  • a resolution of the incorrect caching behavior in Directed nsi #153, and
  • a resolution of another minor issue due to obsolete caching attributes.

Obviously, there is still more refactoring to be done to make the caching system consistent across the project, but this PR should deliver on the main concern of #148, namely a well-behaved caching decorator. The last commit message should provide enough context for reviewing. @fkuehlein, please raise any concerns you might have, since you are expected to maintain this code going forward.

fkuehlein and others added 6 commits November 28, 2023 17:46
- replace `@cached_var` with `@functools.lru_cache`
  and `@cached_const` with `@functools.cached_property`
- refactor calls of any `cached_property`, now not callable anymore,
  substitute with `self.set_link_attribute(property)` where necessary
- remove `cache_helper` and `cache` attribute in `core.network`
- adapt `clear_cache()` methods accordingly,
  introduce `clear_cache_startswith()`
- remove `Network.clear_link_attribute()`
- adapt `Network.{del|set}_link_attribute()` accordingly,
  drop clearing cache of `path_lengths()` in `set_link_attribute()`
- related issue: #148

- locally disable pylint false positive `not-callable` on
  `multiprocess.Pool()`
- replace removed `clear_paths_cache()`
  with new `clear_cache_startswith('path_')`
- related: #148
This reverts commit 0166cd2.
Improve the approach in reverted commits fe23f23, 0166cd2:

- introduce `@Cached.method()` based on `@functools.lru_cache()`
- track mutable dependencies for cache invalidation
- fix cache clearing helper
- add behavioural spec in `tests/test_core/test_cache.py`

Adjustments in existing classes in this commit:

- define `__cache_state__()` for all subclasses of `Cached`
  (might require updates when adding further methods to the cache)
- define `@Cached.method(attrs=(...))` where appropriate
- remove obsolete class-specific caching mechanisms
- factor out cached worker: `Network.nsi_betweenness()`
- define helper: `Network.find_link_attribute()`
- define helper: `ClimateNetwork._weighted_metric()`

Adding this behaviour to classes without caching should be straightforward.
Classes that remain to be adjusted (still have a `clear_cache()` method
or subclass `Cached` without conforming to its protocoll):

- `ClimateNetwork` and its subclasses:
    The recursive interaction between `__init__()`, `_similarity_measure`
    and `_regenerate_network()` across the class hierarchy is very
    stateful and does not fit into the regular OO dependency pattern
    assumed by `Cached`. A redesign of this logic would be advisable,
    but is left for future work.
- `RecurrenceNetwork` and its subclasses, and `Surrogate`:
    Left as an exercise for the reader.
@ntfrgl ntfrgl added bug maintenance something should be improved or is outdated labels Feb 21, 2024
@ntfrgl ntfrgl added this to the Release 0.7 milestone Feb 21, 2024
@ntfrgl ntfrgl requested a review from fkuehlein February 21, 2024 13:36
@ntfrgl ntfrgl linked an issue Feb 21, 2024 that may be closed by this pull request
- fix indentation
- add missing reference
- importing `Hashable` and `Callable` from `collections.abc` directly
- thereby fix pylint warnings from 93ab748
Copy link

codecov bot commented Feb 22, 2024

Codecov Report

Attention: 54 lines in your changes are missing coverage. Please review.

Comparison is base (2739894) 62.82% compared to head (415e2bf) 63.76%.

Files Patch % Lines
src/pyunicorn/climate/havlin.py 52.00% 12 Missing ⚠️
src/pyunicorn/timeseries/recurrence_plot.py 85.89% 11 Missing ⚠️
src/pyunicorn/core/cache.py 85.71% 8 Missing ⚠️
src/pyunicorn/timeseries/joint_recurrence_plot.py 50.00% 5 Missing ⚠️
src/pyunicorn/climate/climate_network.py 80.00% 4 Missing ⚠️
src/pyunicorn/climate/hilbert.py 73.33% 4 Missing ⚠️
src/pyunicorn/core/network.py 96.77% 4 Missing ⚠️
src/pyunicorn/climate/mutual_info.py 76.92% 3 Missing ⚠️
src/pyunicorn/climate/climate_data.py 95.83% 1 Missing ⚠️
src/pyunicorn/core/geo_network.py 90.90% 1 Missing ⚠️
... and 1 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #219      +/-   ##
==========================================
+ Coverage   62.82%   63.76%   +0.93%     
==========================================
  Files          43       44       +1     
  Lines        6220     6221       +1     
==========================================
+ Hits         3908     3967      +59     
+ Misses       2312     2254      -58     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@fkuehlein
Copy link
Collaborator

fkuehlein commented Feb 23, 2024

Thanks a lot @ntfrgl for all the work put into implementing this and also for all the little cleanups and refactors along the way!

I took a bit of time to understand the structure and functionality of the new Cached class in detail, although its application appears straightforward indeed. I think I'll be able to maintain it, continuously include it in further classes and eventually add subclass-specific tests.

I added the remaining minor fixes in the commits above to make sure all checks pass, so this PR should be good to go.

@fkuehlein fkuehlein merged commit 634d3a0 into master Feb 23, 2024
3 checks passed
@fkuehlein fkuehlein deleted the 148-cache-helper branch February 23, 2024 11:26
@ntfrgl
Copy link
Member Author

ntfrgl commented Feb 25, 2024

Thank you for the careful review. I'm glad to know the project in your hands.

As a last addition, I slightly simplified the implementation of nested Cached objects in afc6abf, effectively removing some debugging artifacts. But that shouldn't concern the current release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug maintenance something should be improved or is outdated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

@cached_var may store wrong values if method has arguments
2 participants