@LeeHyungGeol
Contributor

Fixes #1239

📢 Type of change

  • Bugfix
  • New feature
  • Enhancement
  • Refactoring

📜 Description

Fixes an issue where a failed CompletableFuture<QueueAttributes> stayed in the cache indefinitely, so every subsequent request for that queue failed until the service was restarted.

Problem

When a newly created queue has not yet propagated in AWS SQS (eventual consistency), the first lookup fails and that failure is cached permanently. The only way to retry is to restart the service, as reported in #1234:

"Only by restarting the reception server does the look-up start to work."

Solution

Implemented Option A (Minimal Change) as discussed: failed futures are automatically removed from the cache.

private CompletableFuture<QueueAttributes> getQueueAttributes(String endpointName) {
    // The cache stores the future itself, so without the cleanup below
    // a failed future would be returned on every later call.
    CompletableFuture<QueueAttributes> future = this.queueAttributesCache.computeIfAbsent(
        endpointName, newName -> doGetQueueAttributes(endpointName, newName)
    );

    // Evict the entry when the lookup fails so the next call triggers a fresh lookup.
    future.whenComplete((result, throwable) -> {
        if (throwable != null) {
            this.queueAttributesCache.remove(endpointName);
            logger.debug("Removed failed queue attributes from cache for: {}", endpointName);
        }
    });

    return future;
}

Key Benefits:

  • Failed futures are removed from the cache as soon as they complete exceptionally, so the next call retries automatically
  • Successful futures stay cached, so lookup performance is preserved
  • No breaking changes; the fix is about 5 lines
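
To make the behavior above concrete, here is a minimal, self-contained sketch of the evict-on-failure pattern outside of SqsTemplate. The names EvictOnFailureCache, failOnceLoader and isCached are hypothetical and exist only for this illustration; the loader that fails on its first call is an assumption standing in for a queue that is not yet visible in SQS.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for the template's attribute cache; not a Spring Cloud AWS class.
class EvictOnFailureCache<V> {

    private final ConcurrentHashMap<String, CompletableFuture<V>> cache = new ConcurrentHashMap<>();
    private final Function<String, CompletableFuture<V>> loader;

    EvictOnFailureCache(Function<String, CompletableFuture<V>> loader) {
        this.loader = loader;
    }

    CompletableFuture<V> get(String key) {
        CompletableFuture<V> future = cache.computeIfAbsent(key, loader);
        // Attached after computeIfAbsent returns, so the map is never modified
        // from inside its own mapping function.
        future.whenComplete((result, throwable) -> {
            if (throwable != null) {
                cache.remove(key);
            }
        });
        return future;
    }

    boolean isCached(String key) {
        return cache.containsKey(key);
    }

    // Assumed loader for the demo: the first call fails, later calls succeed.
    static Function<String, CompletableFuture<String>> failOnceLoader(AtomicInteger calls) {
        return name -> {
            if (calls.incrementAndGet() == 1) {
                CompletableFuture<String> failed = new CompletableFuture<>();
                failed.completeExceptionally(new IllegalStateException("queue not visible yet: " + name));
                return failed;
            }
            return CompletableFuture.completedFuture("attributes-for-" + name);
        };
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        EvictOnFailureCache<String> cache = new EvictOnFailureCache<>(failOnceLoader(calls));

        CompletableFuture<String> first = cache.get("orders");
        System.out.println("first failed: " + first.isCompletedExceptionally());
        System.out.println("cached after failure: " + cache.isCached("orders"));

        String second = cache.get("orders").join();
        cache.get("orders").join(); // served from the cache, loader not called again
        System.out.println("second result: " + second);
        System.out.println("loader calls: " + calls.get());
    }
}
```

One possible hardening, not part of this PR, would be the two-argument ConcurrentHashMap.remove(key, future), which only evicts the entry if it still holds the same failed future and so cannot drop a newer entry created by a concurrent retry.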

💚 How did you test it?

Added 4 comprehensive tests:

  • shouldRemoveFailedQueueAttributesFromCache - Verifies failed cache removal and retry
  • shouldCacheSuccessfulQueueAttributes - Verifies successful cache behavior
  • shouldCacheSuccessfulQueueAttributesWithAttributeNames - Verifies cache with config options
  • shouldCreateSeparateObservationsForRetryAfterCacheFailure - Verifies observability tracking

📝 Checklist

  • I reviewed submitted code
  • I added tests to verify changes
  • I updated reference documentation to reflect the change
  • All tests passing
  • No breaking changes

🔮 Next steps

Potential improvements for follow-up PRs:

  • TTL-based cache expiration
  • Rate limiting for failed attempts
  • Retry mechanism for SqsMessageListenerContainer
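
As a rough idea of what the TTL-based expiration follow-up could look like, here is a hedged sketch. TtlFutureCache, its constructor parameters and the injectable nanoClock are all hypothetical names chosen for this example; nothing here is taken from the library, and a real implementation would likely be combined with the failure eviction from this PR.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical sketch of TTL-based expiration; none of these names exist in the library.
class TtlFutureCache<V> {

    // A cached future together with the time at which it stops being valid.
    private static final class Entry<T> {
        final CompletableFuture<T> future;
        final long expiresAtNanos;

        Entry(CompletableFuture<T> future, long expiresAtNanos) {
            this.future = future;
            this.expiresAtNanos = expiresAtNanos;
        }
    }

    private final ConcurrentHashMap<String, Entry<V>> cache = new ConcurrentHashMap<>();
    private final Function<String, CompletableFuture<V>> loader;
    private final long ttlNanos;
    private final LongSupplier nanoClock; // e.g. System::nanoTime; injectable so tests can control time

    TtlFutureCache(Function<String, CompletableFuture<V>> loader, Duration ttl, LongSupplier nanoClock) {
        this.loader = loader;
        this.ttlNanos = ttl.toNanos();
        this.nanoClock = nanoClock;
    }

    CompletableFuture<V> get(String key) {
        long now = nanoClock.getAsLong();
        // compute() replaces a missing or expired entry atomically per key.
        // The loader must return quickly and must not touch this cache itself.
        Entry<V> entry = cache.compute(key, (k, existing) -> {
            // Comparing via subtraction stays correct even if the nano clock wraps around.
            if (existing == null || now - existing.expiresAtNanos >= 0) {
                return new Entry<V>(loader.apply(k), now + ttlNanos);
            }
            return existing;
        });
        return entry.future;
    }
}
```

A caller would construct it with something like new TtlFutureCache<>(loader, Duration.ofMinutes(5), System::nanoTime); the five-minute value is only a placeholder, not a recommendation.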

@github-actions bot added the "component: sqs" (SQS integration related issue) label on Oct 20, 2025
Contributor

@tomazfernandes left a comment

Thanks for the PR @LeeHyungGeol! Looking forward to more!

@tomazfernandes merged commit fe9f347 into awspring:main on Nov 1, 2025
5 checks passed
tomazfernandes pushed a commit to tomazfernandes/spring-cloud-aws that referenced this pull request Nov 1, 2025
* fix: remove failed queue attributes from cache

Signed-off-by: LeeHyungGeol <[email protected]>

* add SqsTemplateTests test cases

Signed-off-by: LeeHyungGeol <[email protected]>

* update shouldCacheSuccessfulQueueAttributesWithAttributeNames test case comment

Signed-off-by: LeeHyungGeol <[email protected]>

* update shouldRemoveFailedQueueAttributesFromCache test case comment

Signed-off-by: LeeHyungGeol <[email protected]>

---------

Signed-off-by: LeeHyungGeol <[email protected]>
(cherry picked from commit fe9f347)
MatejNedic pushed a commit that referenced this pull request Nov 6, 2025
* fix: remove failed queue attributes from cache

* add SqsTemplateTests test cases

* update shouldCacheSuccessfulQueueAttributesWithAttributeNames test case comment

* update shouldRemoveFailedQueueAttributesFromCache test case comment

---------

(cherry picked from commit fe9f347)

Signed-off-by: LeeHyungGeol <[email protected]>
Co-authored-by: 이형걸 <[email protected]>

Development

Successfully merging this pull request may close these issues.

Allow Retrying QueueAttributes Fetching Failures
