fix: Add retry logic in "_waitForPkamAuthSuccess" to retry when secondary is temporarily not reachable #684

Merged
merged 4 commits into from
Oct 8, 2024

sitaram-kalluri
Member

@sitaram-kalluri sitaram-kalluri commented Oct 7, 2024

- What I did

  • After submitting an enrollment request, retry a few times when the secondary server is temporarily not reachable.

- How I did it

  • Introduced a new final variable "_maxActivationRetries", which represents the maximum number of retries.
  • In the "_waitForPkamAuthSuccess" method, catch any exception that is thrown and increment "retryCounter" until "_maxActivationRetries" is reached. After that, rethrow the exception.
  • In the _attemptPkamAuth method, if the enrollment is denied, an AtEnrollmentException is thrown. This exception is caught in the catch block of _waitForPkamAuthSuccess (which calls _attemptPkamAuth), triggering the retry logic that we want to avoid. Preventing that would require handling AtEnrollmentException separately here, which introduces code duplication.
  • To handle all exceptions in one place, moved the existing exception-handling logic from _attemptPkamAuth to here. With these changes, _attemptPkamAuth no longer handles exceptions; it rethrows them, and all exceptions are handled in _waitForPkamAuthSuccess.
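
The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `waitForAuthSuccess`, `EnrollmentDeniedException` (a stand-in for AtEnrollmentException), and the parameter names are hypothetical, and the choice not to reset the counter after a successful "not approved" poll is an assumption.

```dart
import 'dart:async';

/// Hypothetical stand-in for the library's AtEnrollmentException.
class EnrollmentDeniedException implements Exception {
  final String message;
  EnrollmentDeniedException(this.message);

  @override
  String toString() => 'EnrollmentDeniedException: $message';
}

/// Polls [attemptAuth] until it returns true.
///
/// - A `false` result means "not approved yet": wait and poll again.
///   This does not count as a retry.
/// - [EnrollmentDeniedException] is final and is rethrown immediately.
/// - Any other exception is treated as transient: it is retried up to
///   [maxRetries] times and then rethrown.
Future<void> waitForAuthSuccess(
  Future<bool> Function() attemptAuth, {
  int maxRetries = 5,
  Duration retryInterval = const Duration(seconds: 10),
}) async {
  var retryCounter = 0;
  while (true) {
    try {
      if (await attemptAuth()) return;
    } on EnrollmentDeniedException {
      // A denial must not trigger the retry logic.
      rethrow;
    } catch (e) {
      // Retries exhausted: surface the original error to the caller.
      if (retryCounter >= maxRetries) rethrow;
      retryCounter++;
      print('Transient error: $e. Retry $retryCounter of $maxRetries');
    }
    await Future<void>.delayed(retryInterval);
  }
}
```

Keeping the denial case in its own `on` clause ahead of the generic `catch` is what lets a single try/catch separate "give up now" from "try again", which is the point of handling all exceptions in one place.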

- How to verify it

  • Tested the changes manually; the test log is attached below:
/usr/lib/dart/bin/dart --enable-asserts --no-serve-devtools /home/sitaram/IdeaProjects/atsign/core/at_libraries/packages/at_onboarding_cli/bin/activate_cli.dart enroll -s ABC123 -n wavi:rw -p wavi -a @alice🛠 -r vip.ve.atsign.zone -k ~/.atsign/keys/@alice-pkam-local.atKeys -d local-17
Submitting enrollment request
Waiting for approval; will check every 10 seconds
Checking ... Enrollment ID: 0d387d23-e38d-4404-b297-f11adfdd1685
 not approved. Will retry in 10 seconds
Checking ...  not approved. Will retry in 10 seconds
Checking ...  not approved. Will retry in 10 seconds
Checking ... SEVERE|2024-10-07 15:36:23.447566|OnboardingCli|Exception occurred when authenticating the atSign: @alice🛠 caused by HandshakeException: Connection terminated during handshake. Attempting to retry for 1 attempt 
 not approved. Will retry in 10 seconds
Checking ... SEVERE|2024-10-07 15:36:33.462455|OnboardingCli|Exception occurred when authenticating the atSign: @alice🛠 caused by HandshakeException: Connection terminated during handshake. Attempting to retry for 2 attempt 
 not approved. Will retry in 10 seconds
Checking ... SEVERE|2024-10-07 15:46:06.025958|OnboardingCli|Exception occurred when authenticating the atSign: @alice🛠 caused by HandshakeException: Connection terminated during handshake. Attempting to retry for 3 attempt 
 not approved. Will retry in 10 seconds
Checking ... SEVERE|2024-10-07 15:46:16.032858|OnboardingCli|Exception occurred when authenticating the atSign: @alice🛠 caused by HandshakeException: Connection terminated during handshake. Attempting to retry for 4 attempt 
 not approved. Will retry in 10 seconds
Checking ... SEVERE|2024-10-07 15:46:26.040091|OnboardingCli|Exception occurred when authenticating the atSign: @alice🛠 caused by HandshakeException: Connection terminated during handshake. Attempting to retry for 5 attempt 
 not approved. Will retry in 10 seconds
Checking ... SEVERE|2024-10-07 15:46:36.047084|OnboardingCli|Exception occurred when authenticating the atSign: @alice🛠 caused by HandshakeException: Connection terminated during handshake Activation failed after 5 attempts 
[Error] HandshakeException: Connection terminated during handshake
[Error] Enrollment failed.
  Cause: HandshakeException: Connection terminated during handshake
  Please try again or contact [email protected]

Process finished with exit code 0

- Description for the changelog

  • fix: Add retry logic in "_waitForPkamAuthSuccess" to retry when secondary is temporarily not reachable

try {
  pkamAuthSucceeded =
      await _attemptPkamAuth(atLookUp, enrollmentIdFromServer);
} catch (e) {
Contributor

Does _attemptPkamAuth throw an exception if PKAM auth fails in a "normal" way (i.e. the enrollment has not yet been approved)?

Some code comments explaining what exactly is going on here would be useful

Member Author

@sitaram-kalluri sitaram-kalluri Oct 8, 2024

In the _attemptPkamAuth method, if the enrollment is denied, an AtEnrollmentException is thrown. This exception is caught in the catch block of _waitForPkamAuthSuccess (which calls _attemptPkamAuth), triggering the retry logic that we want to avoid. Preventing that would require handling AtEnrollmentException separately here, which introduces code duplication.

To handle all exceptions in one place, I moved the existing exception-handling logic from _attemptPkamAuth to here. With these changes, _attemptPkamAuth no longer handles exceptions; it rethrows them, and all exceptions are handled in _waitForPkamAuthSuccess.

Contributor

OK - I'd suggest we now remove _attemptPkamAuth and move its code inline, as _attemptPkamAuth isn't used anywhere else

Member Author

Removed the "_attemptPkamAuth" and moved its code inline.

@sitaram-kalluri sitaram-kalluri requested a review from gkc October 8, 2024 07:31
@sitaram-kalluri sitaram-kalluri requested a review from gkc October 8, 2024 12:12
@gkc gkc merged commit 1bdca44 into trunk Oct 8, 2024
11 checks passed
@gkc gkc deleted the 1394-at_activate-should-retry-if-atserver-isnt-reachable branch October 8, 2024 12:54
Development

Successfully merging this pull request may close these issues.

at_activate should retry if atServer isn't reachable
2 participants