Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stolon-keper doesn't stop recovery after reaching recovery point #798

Open
pyhalov opened this issue Sep 20, 2020 · 0 comments · May be fixed by #799
Open

stolon-keper doesn't stop recovery after reaching recovery point #798

pyhalov opened this issue Sep 20, 2020 · 0 comments · May be fixed by #799
Labels

Comments

@pyhalov
Copy link

pyhalov commented Sep 20, 2020

What happened:
I've applied the following config:

 cat ~/stolon_restore.json 
{
         "initMode": "pitr",
         "synchronousReplication": true,
         "minSynchronousStandbys": 0,
         "maxSynchronousStandbys": 1,
         "pgParameters":{
            "log_statement":"all",
            "log_line_prefix":"%m [%r][%p]",
            "log_min_messages":"INFO",
            "wal_level":"logical",
            "max_prepared_transactions":"200",
         },
         "pitrConfig": { 
                "dataRestoreCommand": "cp -r /path/to/backup/* %d" , 
                "archiveRecoverySettings": {
                    "restoreCommand": "cp \"/path/to/backup/pg_wal/%f\" \"%p\""
                },
               "recoveryTargetSettings": { "recoveryTargetName": "_sharded_backup_1600524319"}
         } 
}

PostgreSQL recovered, but after recovery was restarted in recovery mode, tried to fetch new wal files, failed to reach recovery target and exited.


Sep 20 09:30:50 vg1 stolon-keeper[1383]: 2020-09-20 09:30:50.975 MSK [][1938]LOG:  database system is ready to accept connections
Sep 20 09:30:51 vg1 stolon-keeper[1383]: 2020-09-20T09:30:51.619+0300        INFO        cmd/keeper.go:1250        recovery completed
Sep 20 09:30:51 vg1 stolon-keeper[1383]: 2020-09-20 09:30:51.631 MSK [[local]][1959]LOG:  statement: select 1
Sep 20 09:30:51 vg1 stolon-keeper[1383]: 2020-09-20 09:30:51.637 MSK [[local]][1960]LOG:  statement: select 1 from information_schema.tables where table_schema = 'pg_catalog' and table_name = 'pg_file_settings'
Sep 20 09:30:51 vg1 stolon-keeper[1383]: 2020-09-20 09:30:51.645 MSK [[local]][1960]LOG:  statement: select name, setting from pg_file_settings where name IS NOT NULL and setting IS NOT NULL
Sep 20 09:30:51 vg1 stolon-keeper[1383]: 2020-09-20T09:30:51.657+0300        INFO        postgresql/postgresql.go:384        stopping database
Sep 20 09:30:51 vg1 stolon-proxy[1353]: 2020-09-20T09:30:51.658+0300        INFO        cmd/proxy.go:272        master address        {"address": "127.0.0.1:5443"}
Sep 20 09:30:51 vg1 stolon-keeper[1383]: 2020-09-20 09:30:51.659 MSK [][1938]LOG:  received fast shutdown request
Sep 20 09:30:51 vg1 stolon-keeper[1383]: 2020-09-20 09:30:51.659 MSK [][1938]LOG:  aborting any active transactions
Sep 20 09:30:51 vg1 stolon-keeper[1383]: 2020-09-20 09:30:51.661 MSK [][1938]LOG:  background worker "logical replication launcher" (PID 1958) exited with exit code 1
Sep 20 09:30:51 vg1 stolon-keeper[1383]: waiting for server to shut down....2020-09-20 09:30:51.672 MSK [][1947]LOG:  shutting down
Sep 20 09:30:51 vg1 stolon-proxy[1353]: 2020-09-20T09:30:51.689+0300        INFO        cmd/proxy.go:290        proxying to master address        {"address": "127.0.0.1:5443"}
Sep 20 09:30:51 vg1 stolon-keeper[1383]: 2020-09-20 09:30:51.703 MSK [][1938]LOG:  database system is shut down
Sep 20 09:30:51 vg1 stolon-keeper[1383]:  done
Sep 20 09:30:51 vg1 stolon-keeper[1383]: server stopped
Sep 20 09:30:51 vg1 stolon-keeper[1383]: 2020-09-20T09:30:51.774+0300        INFO        cmd/keeper.go:1484        our db requested role is master
Sep 20 09:30:51 vg1 stolon-keeper[1383]: 2020-09-20T09:30:51.785+0300        INFO        postgresql/postgresql.go:858        writing recovery signal file
Sep 20 09:30:51 vg1 stolon-keeper[1383]: 2020-09-20T09:30:51.785+0300        INFO        postgresql/postgresql.go:319        starting database

Look at writing recovery signal file line - keeper shouldn't have written recovery signal file after postgres restart

What you expected to happen:
I've expected postgresql to recover and start accepting queries.

Environment:

  • Stolon version:
    Head
@pyhalov pyhalov added the bug label Sep 20, 2020
pyhalov pushed a commit to pyhalov/stolon that referenced this issue Sep 20, 2020
@pyhalov pyhalov linked a pull request Sep 20, 2020 that will close this issue
pyhalov pushed a commit to pyhalov/stolon that referenced this issue Sep 20, 2020
pyhalov pushed a commit to pyhalov/stolon that referenced this issue Sep 20, 2020
pyhalov added a commit to pyhalov/stolon that referenced this issue Sep 20, 2020
pyhalov added a commit to pyhalov/stolon that referenced this issue Sep 20, 2020
pyhalov added a commit to pyhalov/stolon that referenced this issue Sep 20, 2020
pyhalov added a commit to pyhalov/stolon that referenced this issue Sep 20, 2020
pyhalov added a commit to pyhalov/stolon that referenced this issue Sep 20, 2020
pyhalov added a commit to pyhalov/stolon that referenced this issue Sep 20, 2020
pyhalov added a commit to postgrespro/stolon that referenced this issue Sep 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant