Annotation removal using locking #3015

shawkins · 2025-10-22T18:00:13Z

Replacement / alternative to #2999.

Layered on the tombstone removal changes / default enablement of parsing resource versions. It turns out that some of the tombstone changes needed refinement anyway. For example, we need to compare to a localized latest version, not to the cache.

This seems like the most straight-forward way to do this with locking - but of course that means we have to be very careful with what locks are already held. If this seems too risky, then queuing managed in the TemporaryResourceCache could be used instead, but that will need more code restructuring.

These changes also assume we only want to deal with this skipping problem when using the InformerEventSource. That could be generalized if needed for #3011

Much like the PrimaryUpdateAndCacheUtils, users will need to call InformerEventSource.PrimaryUpdateAndCacheUtils, to properly use this via custom logic.

This does not do any cleanup the extraneous puts into the cache because of onCreated / onUpdated being called after the updateAndCacheResource method.

This also does not try to reason about deletes.

cc @csviri @metacosm

Signed-off-by: Steve Hawkins <[email protected]>

csviri

I generally like this approarch. What we should do is also document it on high level, like javadoc on TemporaryResourceCache. Like the general idea; using lock striping for IO operations, etc.

Not sure if all the scenarios are covered with unit tests, probably those could be also added.

Will have to spend more time to think through all the scenarion, overall LGTM, added some comments, and questions to clarify.

csviri · 2025-10-23T12:57:19Z

...n/java/io/javaoperatorsdk/operator/processing/event/source/informer/InformerEventSource.java

+    }
+    try {
+      temporaryResourceCache.startModifying(id);
+      var updated = updateMethod.apply(resourceToUpdate);


We might want to iterate over this, the start modifyn and recent resource update should be called from an utils, but fine by me now as it is

I had initially started down the path of a common method with the primary cache utils - but for clarity this seemed like a better place to start.

Are you expecting that we'll want to push the event skipping logic down into ManagedInformerEventSource, and/or are you thinking that we'll effectively be able to use a Context rather than a direct reference to the event source?

csviri · 2025-10-23T12:59:48Z

...ava/io/javaoperatorsdk/operator/processing/event/source/informer/TemporaryResourceCache.java

-                    .equals(previousResourceVersion))
-            || isLaterResourceVersion(resourceId, newResource, cachedResource))) {
+    // first check against the source in general - this also prevents resurrecting resources when
+    // we've already seen the deletion event


As mentioned on other PR , would be nice to describe the resurrection scenario more in detial

csviri · 2025-10-23T13:08:38Z

...ava/io/javaoperatorsdk/operator/processing/event/source/informer/TemporaryResourceCache.java

+    boolean[] known = new boolean[1];
+    synchronized (this) {
+      if (!unknownState) {
+        latestResourceVersion = resource.getMetadata().getResourceVersion();


While is this better this way as using the latest sync version?

We are effectively using the TemporaryResourceCache to reason about things the reconciler was responsible for creating / updating.

If we skip putting something in the cache, then we may not skip that event as intended.

Let's say the informer has seen:

create v1

modify v2 - the reconciler did this

modify v3 - something else did this

But has only delivered to the TemporaryResourceCache

create v1

And we're now attempting a put with v2. If we use the informer latest version, we won't actually put v2 into the TemporaryResourceCache as it looks out-of-date. But then on onAddOrUpdateEvent we'll answer that the event wasn't known and trigger a reconciliation.

So by using a local notion of latestResourceVersion we're reasoning over what the TemporaryResourceCache actually knows.

Also because the get logic has changed to check both the TemporaryResourceCache and the informer cache, we won't actually return an out-of-date resource due to the put.

csviri · 2025-10-23T13:48:09Z

...n/java/io/javaoperatorsdk/operator/api/config/informer/InformerEventSourceConfiguration.java

    private final GroupVersionKind groupVersionKind;
    private final InformerConfiguration<R> informerConfig;
    private final KubernetesClient kubernetesClient;
+    private final boolean comparableResourceVersions;


We should add this also to @Informer annotation

csviri · 2025-10-23T13:49:37Z

...n/java/io/javaoperatorsdk/operator/processing/event/source/informer/InformerEventSource.java


  @Override
-  public void onDelete(R resource, boolean b) {
+  public synchronized void onDelete(R resource, boolean b) {


I don't understand why we have this synchronized, could you elaborate pls?

The current logic has start and the onAddOrUpdate methods synchronized. I'm assuming that the intention there is to effectively block event handling until the primaryToSecondaryIndex has been built from the initial listing. I don't think delete should be handled differently.

That could also be done with a more granular lock around the usages of primaryToSecondaryIndex.

In general there's effectively a single threaded model of event delivery, so unless users are calling the onXXX methods manually, those calls will be serialized.

removing tombstones

c310e6b

Signed-off-by: Steve Hawkins <[email protected]>

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 22, 2025

shawkins mentioned this pull request Oct 22, 2025

removing tombstones #3013

Draft

remove the use of the previous annotation via locking

8558c1a

Signed-off-by: Steve Hawkins <[email protected]>

shawkins force-pushed the annotation_removal-b branch from d650f86 to 8558c1a Compare October 23, 2025 13:21

csviri reviewed Oct 23, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Annotation removal using locking #3015

Annotation removal using locking #3015

shawkins commented Oct 22, 2025 •

edited

Loading

Uh oh!

csviri left a comment

Uh oh!

csviri Oct 23, 2025

Uh oh!

shawkins Oct 23, 2025

Uh oh!

csviri Oct 23, 2025

Uh oh!

csviri Oct 23, 2025

Uh oh!

shawkins Oct 23, 2025

Uh oh!

csviri Oct 23, 2025

Uh oh!

csviri Oct 23, 2025

Uh oh!

shawkins Oct 23, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Annotation removal using locking #3015

Are you sure you want to change the base?

Annotation removal using locking #3015

Conversation

shawkins commented Oct 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

csviri left a comment

Choose a reason for hiding this comment

Uh oh!

csviri Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

shawkins Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

csviri Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

csviri Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

shawkins Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

csviri Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

csviri Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

shawkins Oct 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

shawkins commented Oct 22, 2025 •

edited

Loading

shawkins Oct 23, 2025 •

edited

Loading