Possible FAILURE STATE in State Machine #1025

Open
Nhyalgotphores opened this issue Jan 16, 2024 · 1 comment

Nhyalgotphores commented Jan 16, 2024

Description

While unlikely, it is possible to trigger a failure state that is impossible to resolve with the provided tools: the assigned state has to be set manually.

In a cluster with two nodes and one quorum (monitor) node, it's possible to trigger this sequence of events (a setup sketch follows the initial state below).

Initial state

NODE1 PRIMARY
NODE2 SECONDARY
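
For context, a minimal cluster of this shape can be stood up with the standard pg_auto_failover commands; the hostnames, PGDATA paths, and monitor URI below are illustrative assumptions, not the reporter's actual setup:

    # on the QUORUM (monitor) node
    pg_autoctl create monitor --pgdata /var/lib/postgresql/monitor \
        --hostname quorum --auth trust

    # on NODE1, then NODE2; the monitor URI has the form
    # postgres://autoctl_node@host:port/pg_auto_failover
    pg_autoctl create postgres --pgdata /var/lib/postgresql/data \
        --hostname node1 --auth trust \
        --monitor 'postgres://autoctl_node@quorum:5432/pg_auto_failover'
    pg_autoctl run --pgdata /var/lib/postgresql/data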

Sequence of events

NODE1 has an error in .pgpass preventing it from communicating with the rest of the cluster; on its own, this is not a cause for an automatic switchover.
NODE2 tries to enable maintenance:
STATE secondary > wait_maintenance
NODE1 is assigned the state primary > wait_primary, but the error in .pgpass causes it to fall into demote_timeout, which creates an impasse.
Because NODE1 cannot reach its target state: FATAL pg_autoctl does not know how to reach state "wait_primary" from "demote_timeout"
And NODE2 cannot leave maintenance because it is stuck in wait_maintenance. Neither node will start, and the whole cluster gets stuck.
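
For reference, the maintenance request above is the standard user-initiated call, and the resulting current/assigned states can be watched from the quorum; the PGDATA paths here are illustrative:

    # on NODE2: request maintenance (secondary > wait_maintenance)
    pg_autoctl enable maintenance --pgdata /var/lib/postgresql/data

    # on the QUORUM: watch the current and assigned state of both nodes
    pg_autoctl show state --pgdata /var/lib/postgresql/monitor

NODE1's .pgpass follows the usual hostname:port:database:username:password format; a broken entry there is what cut it off from the rest of the cluster.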

Workaround (sketched as concrete commands below):

  1. Shut down pg_autoctl on BOTH NODES and on the QUORUM.
  2. Start the quorum's Postgres directly with pg_ctl.
  3. On the monitor database, run: update node set goalstate='primary' where nodeid=1;
  4. Stop the quorum's Postgres with pg_ctl.
  5. Start pg_autoctl again, in order: QUORUM, NODE1, NODE2.

This allows NODE1 to start and transition to wait_primary, which in turn allows NODE2 to reach maintenance.
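
Spelled out as commands, the five steps look roughly like this. This is a sketch: the PGDATA paths are assumptions, and on the monitor the node table lives in the pgautofailover schema of the pg_auto_failover database, so the UPDATE may need schema qualification depending on the search_path:

    # 1. stop pg_autoctl on NODE1, NODE2 and the QUORUM
    pg_autoctl stop --pgdata /var/lib/postgresql/data      # on each node
    pg_autoctl stop --pgdata /var/lib/postgresql/monitor   # on the quorum

    # 2. start the quorum's Postgres directly, without pg_autoctl
    pg_ctl -D /var/lib/postgresql/monitor start

    # 3. set NODE1's assigned (goal) state by hand
    psql -d pg_auto_failover \
         -c "update pgautofailover.node set goalstate='primary' where nodeid=1;"

    # 4. stop the quorum's Postgres again
    pg_ctl -D /var/lib/postgresql/monitor stop

    # 5. restart pg_autoctl in order: QUORUM, then NODE1, then NODE2
    pg_autoctl run --pgdata /var/lib/postgresql/monitor
    pg_autoctl run --pgdata /var/lib/postgresql/data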

Expected solution:

A transition from demote_timeout to wait_primary should be implemented.
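
As a starting point for whoever picks this up, the keeper's FSM can be dumped with the debug subcommands, which should make the missing demote_timeout > wait_primary edge easy to confirm (the pg_autoctl do commands are development/debug tooling, so availability and output may vary by version):

    # print the keeper's finite state machine as a Graphviz graph
    pg_autoctl do fsm gv --pgdata /var/lib/postgresql/data

    # list the reachable states from the current one
    pg_autoctl do fsm list --pgdata /var/lib/postgresql/data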

Nhyalgotphores commented Jan 16, 2024

Workaround events

                    Event Time |   Node |       Current State |      Assigned State | Comment
-------------------------------+--------+---------------------+---------------------+-----------
 2024-01-16 11:50:44.677014+01 |    0/1 |             primary |        wait_primary | Setting goal state of node 1 "<NODE1>" (<NODE1>:5433) to wait_primary and node 2 "<NODE2>" (<NODE2>:5433) to wait_maintenance after a user-initiated start_maintenance call.
 2024-01-16 11:50:44.677014+01 |    0/2 |           secondary |    wait_maintenance | Setting goal state of node 1 "<NODE1>" (<NODE1>:5433) to wait_primary and node 2 "<NODE2>" (<NODE2>:5433) to wait_maintenance after a user-initiated start_maintenance call.
 2024-01-16 11:50:48.080015+01 |    0/2 |    wait_maintenance |    wait_maintenance | New state is reported by node 2 "<NODE2>" (<NODE2>:5433): "wait_maintenance"
 2024-01-16 11:50:57.412881+01 |    0/2 |    wait_maintenance |    wait_maintenance | Node node 2 "<NODE2>" (<NODE2>:5433) is marked as unhealthy by the monitor
 2024-01-16 11:51:17.442579+01 |    0/1 |             primary |        wait_primary | Node node 1 "<NODE1>" (<NODE1>:5433) is marked as unhealthy by the monitor
 2024-01-16 11:54:01.597231+01 |    0/1 |      demote_timeout |        wait_primary | New state is reported by node 1 "<NODE1>" (<NODE1>:5433): "demote_timeout"
 2024-01-16 12:44:25.280301+01 |    0/1 |      demote_timeout |        wait_primary | Node node 1 "<NODE1>" (<NODE1>:5433) is marked as healthy by the monitor
 2024-01-16 14:05:19.371168+01 |    0/1 |      demote_timeout |        wait_primary | Node node 1 "<NODE1>" (<NODE1>:5433) is marked as unhealthy by the monitor
 2024-01-16 14:06:53.994032+01 |    0/1 |      demote_timeout |        wait_primary | Node node 1 "<NODE1>" (<NODE1>:5433) is marked as healthy by the monitor
    2024-01-16 14:40:53.268+01 |    0/1 |      demote_timeout |        wait_primary | Node node 1 "<NODE1>" (<NODE1>:5433) is marked as unhealthy by the monitor
 2024-01-16 14:43:40.262183+01 |    0/1 |             primary |             primary | New state is reported by node 1 "<NODE1>" (<NODE1>:5433): "primary"
 2024-01-16 14:43:40.262183+01 |    0/1 |             primary |        wait_primary | Setting goal state of node 1 "<NODE1>" (<NODE1>:5433) to wait_primary because none of the standby nodes in the quorum are healthy at the moment.
 2024-01-16 14:43:40.355978+01 |    0/1 |        wait_primary |        wait_primary | New state is reported by node 1 "<NODE1>" (<NODE1>:5433): "wait_primary"
 2024-01-16 14:43:40.659967+01 |    0/1 |        wait_primary |        wait_primary | Node node 1 "<NODE1>" (<NODE1>:5433) is marked as healthy by the monitor
 2024-01-16 14:44:03.793994+01 |    0/2 |    wait_maintenance |         maintenance | Setting goal state of node 2 "<NODE2>" (<NODE2>:5433) to maintenance after node 1 "<NODE1>" (<NODE1>:5433) converged to wait_primary.
 2024-01-16 14:44:03.952567+01 |    0/2 |         maintenance |         maintenance | New state is reported by node 2 "<NODE2>" (<NODE2>:5433): "maintenance"
 2024-01-16 14:51:48.666013+01 |    0/2 |         maintenance |          catchingup | Setting goal state of node 2 "<NODE2>" (<NODE2>:5433) to catchingup  after a user-initiated stop_maintenance call.
  2024-01-16 14:51:50.05806+01 |    0/2 |          catchingup |          catchingup | New state is reported by node 2 "<NODE2>" (<NODE2>:5433): "catchingup"
  2024-01-16 14:51:51.44403+01 |    0/2 |          catchingup |          catchingup | Node node 2 "<NODE2>" (<NODE2>:5433) is marked as healthy by the monitor
 2024-01-16 14:51:51.475895+01 |    0/2 |          catchingup |           secondary | Setting goal state of node 2 "<NODE2>" (<NODE2>:5433) to secondary after it caught up.
 2024-01-16 14:51:51.594911+01 |    0/2 |           secondary |           secondary | New state is reported by node 2 "<NODE2>" (<NODE2>:5433): "secondary"
 2024-01-16 14:51:51.636226+01 |    0/1 |        wait_primary |             primary | Setting goal state of node 1 "<NODE1>" (<NODE1>:5433) to primary now that we have 1 healthy  secondary nodes in the quorum.
  2024-01-16 14:51:51.88083+01 |    0/1 |             primary |             primary | New state is reported by node 1 "<NODE1>" (<NODE1>:5433): "primary"
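
For completeness, a listing like the one above can be pulled from the monitor at any time; the monitor PGDATA path is illustrative:

    pg_autoctl show events --pgdata /var/lib/postgresql/monitor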
