System Test, RDS Core: active PDB failure on drain work-around #394

Open: elenagerman wants to merge 3 commits into main from pdb-wa

elenagerman
Collaborator

No description provided.

elenagerman force-pushed the pdb-wa branch 2 times, most recently from e20aad8 to 09b9dc3 (January 31, 2025 00:41)
elenagerman requested a review from yprokule (January 31, 2025 00:41)
@elenagerman
Collaborator Author

@yprokule PTAL. This is a workaround for the eIP failover drain failure caused by active PDBs.

elenagerman requested a review from yprokule (February 2, 2025 16:02)
elenagerman force-pushed the pdb-wa branch 2 times, most recently from 87550bf to 1a0b910 (February 3, 2025 16:52)
elenagerman requested a review from yprokule (February 3, 2025 16:53)
elenagerman force-pushed the pdb-wa branch 3 times, most recently from 125273a to 28f6d1f (February 3, 2025 18:23)

return fmt.Errorf("failed to uncordon %q due to %w", nodeName, err)
}

time.Sleep(15 * time.Second)

err = restoreActivePDBValues(pdbList)
Collaborator

This part won't be executed if any preceding step fails (draining, uncordoning, etc.), so a more appropriate place would be BeforeAll to capture the initial state of the PDBs and Before/AfterEach to restore them. WDYT?

Collaborator Author

I think it's a great idea, but I'm not sure how to implement it. The problem is pdbList: all the info for this list was collected during set... state

Collaborator Author

And my apologies, I slightly modified both methods:

  • I moved them to the upper level
  • they now handle all PDBs, not only those whose value is 1
  • SetMinAvailableToZeroForActivePDB now collects the original values as well as the PDB objects
  • RestoreActivePDBValues uses this map to restore the original values instead of just setting minAvailable to 1
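
To illustrate the save-and-restore idea in the list above, here is a minimal sketch under stated assumptions, not the code from this PR. The `pdb` type and both function names are hypothetical stand-ins, and `minAvailable` is simplified to a plain integer (the real Kubernetes field is an int-or-string value that may also be a percentage, and PDBs are namespaced).

```go
package main

import "fmt"

// pdb is a hypothetical, simplified stand-in for a PodDisruptionBudget.
// Assumption: minAvailable is modelled as a plain int and names are unique.
type pdb struct {
	Name         string
	MinAvailable int
}

// setMinAvailableToZero zeroes minAvailable on every PDB whose value is
// above zero and returns the original values keyed by PDB name.
func setMinAvailableToZero(pdbs []*pdb) map[string]int {
	originals := make(map[string]int)
	for _, p := range pdbs {
		if p.MinAvailable > 0 {
			originals[p.Name] = p.MinAvailable
			p.MinAvailable = 0
		}
	}
	return originals
}

// restoreMinAvailable writes the recorded values back. PDBs that were not
// modified (and so are absent from the map) are left untouched.
func restoreMinAvailable(pdbs []*pdb, originals map[string]int) {
	for _, p := range pdbs {
		if v, ok := originals[p.Name]; ok {
			p.MinAvailable = v
		}
	}
}

func main() {
	pdbs := []*pdb{
		{Name: "pdb-a", MinAvailable: 1},
		{Name: "pdb-b", MinAvailable: 3},
		{Name: "pdb-c", MinAvailable: 0},
	}

	originals := setMinAvailableToZero(pdbs)
	fmt.Println("saved entries:", len(originals))

	restoreMinAvailable(pdbs, originals)
	for _, p := range pdbs {
		fmt.Println(p.Name, p.MinAvailable)
	}
}
```

Keying the map by name keeps the restore step independent of list order; a real implementation would likely key by namespace and name.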

Collaborator Author

Actually, I can separate 'collect PDB info' into a standalone function. Wait a minute, I'll check...

Collaborator Author

@elenagerman elenagerman Feb 4, 2025

@yprokule no, it will not help. I still do not see how I can pass the PDB list state to AfterEach. Do you have any idea?

Collaborator Author

@yprokule do you have any idea how to implement it?

Collaborator

Sent you a draft directly. I also noticed that some PDBs have their minAvailable field reverted after the test updates it; this must be investigated further.

I also noticed that after a node reboot it takes some time for the respective pods to start, so I proposed an extra check in #413.

elenagerman force-pushed the pdb-wa branch 5 times, most recently from 6fb8f73 to 22e1e75 (February 6, 2025 13:38)