GITHUB-ISSUE-1837: Fix exec in pitr and proxysql pods #1838

Open. dcaputo-harmoni wants to merge 4 commits into base: main.


dcaputo-harmoni
Contributor

Added retries with backoff to the ProxySQL user sync to address intermittent failures. Forced the PITR file checks to return a success exit code when the files are not present, so the operator no longer raises an error.

Fixes #1837
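
The PITR half of this change can be illustrated with a small Go sketch. It assumes the approach visible in the exec URLs in the logs later in this thread (`cat <file> || true`) and runs it locally through `/bin/sh` rather than through an exec into the pitr container. `readOptionalFile` and the file contents are hypothetical and only serve the illustration; this is not the operator's code.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// readOptionalFile prints the file if it exists. The trailing "|| true"
// forces exit status 0 when cat fails, so a missing file yields empty
// output instead of an error. (Hypothetical helper, not from the operator.)
func readOptionalFile(path string) (string, error) {
	// The path is passed as $1 rather than spliced into the script text.
	out, err := exec.Command("/bin/sh", "-c", `cat "$1" || true`, "sh", path).Output()
	return string(out), err
}

func main() {
	dir, err := os.MkdirTemp("", "pitr-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Made-up contents, only to show the "file present" case.
	present := filepath.Join(dir, "gap-detected")
	if err := os.WriteFile(present, []byte("example contents"), 0o600); err != nil {
		panic(err)
	}

	for _, p := range []string{present, filepath.Join(dir, "missing")} {
		out, err := readOptionalFile(p)
		fmt.Printf("%s: output=%q err=%v\n", filepath.Base(p), out, err)
	}
}
```

One trade-off worth keeping in mind: `|| true` hides every `cat` failure, not only "file not found", so a permission problem would also look like an empty file.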

@pull-request-size pull-request-size bot added the size/M 30-99 lines label Oct 6, 2024
@dcaputo-harmoni dcaputo-harmoni changed the title GITHUB-ISSUE-1812: Fix exec in pitr and proxysql pods GITHUB-ISSUE-1837: Fix exec in pitr and proxysql pods Oct 6, 2024
@dcaputo-harmoni
Contributor Author

Just FYI: I have been running this image for 24 hours now without a single error, which is a big change from the dozens of errors (detailed in the GitHub issue) over the same period previously.

@hors hors added the community label Oct 7, 2024
@egegunes egegunes self-assigned this Oct 7, 2024
@egegunes
Contributor

egegunes commented Oct 7, 2024

@dcaputo-harmoni thank you for your contribution. I think it makes sense to have some kind of retry mechanism for syncing users. But I believe it's better to use https://pkg.go.dev/k8s.io/client-go/util/retry#OnError rather than retrying in a for loop.
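
For readers unfamiliar with the suggestion: `retry.OnError` takes a `wait.Backoff`, a `retriable func(error) bool` predicate, and the function to run, and retries only while the predicate accepts the returned error. The sketch below reproduces that shape with the standard library alone so it can run without client-go. The `backoff` type, `onError`, `isTransient`, and `flakyExec` are hypothetical stand-ins written for this illustration, not the client-go implementation and not the code in this PR.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"time"
)

// backoff is a simplified stand-in for wait.Backoff (assumption: only
// Steps, Duration and Factor are modelled here).
type backoff struct {
	Steps    int           // maximum number of attempts
	Duration time.Duration // delay before the second attempt
	Factor   float64       // multiplier applied to the delay after each attempt
}

// onError mirrors the argument order of retry.OnError: it calls fn until it
// succeeds, returns a non-retriable error, or the attempts run out.
func onError(b backoff, retriable func(error) bool, fn func() error) error {
	var err error
	delay := b.Duration
	for i := 0; i < b.Steps; i++ {
		if err = fn(); err == nil || !retriable(err) {
			return err
		}
		if i < b.Steps-1 { // no sleep after the final attempt
			time.Sleep(delay)
			delay = time.Duration(float64(delay) * b.Factor)
		}
	}
	return err
}

// isTransient treats a dropped connection as worth retrying.
func isTransient(err error) bool {
	return errors.Is(err, io.ErrUnexpectedEOF)
}

// flakyExec simulates an exec call that fails `failures` times, then succeeds.
func flakyExec(failures int, calls *int) func() error {
	return func() error {
		*calls++
		if *calls <= failures {
			return fmt.Errorf("exec syncusers: %w", io.ErrUnexpectedEOF)
		}
		return nil
	}
}

func main() {
	calls := 0
	b := backoff{Steps: 4, Duration: 10 * time.Millisecond, Factor: 2}
	err := onError(b, isTransient, flakyExec(2, &calls))
	fmt.Printf("calls=%d err=%v\n", calls, err)
}
```

With client-go available, the local `onError` and `backoff` would presumably be replaced by `retry.OnError` and a `wait.Backoff` value, while the predicate and the wrapped call keep the same shape.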

@dcaputo-harmoni
Contributor Author

@egegunes @hors just made the change, let me know if that's what you are thinking. Appears to be working as intended on my side.

@dcaputo-harmoni
Contributor Author

dcaputo-harmoni commented Oct 7, 2024

Looks like all the errors have come back on this last build. Since both the sync users and reconcile errors are occurring, it's probably not related to the retry methodology as that only applies to sync users. Any ideas?

I did notice that it's using SSL/TLS to communicate with the Kubernetes API, i.e. https://10.0.64.1:443. When I was replicating the calls using websocat from the operator pod, I needed to pass -k to allow insecure mode (i.e. skip validation of SSL certificates) for the calls to work. Could the same be needed for these calls as well?

2024-10-07T15:32:03.434Z    INFO    setup    Runs on    {"platform": "kubernetes", "version": "v1.29.4"}
2024-10-07T15:32:03.434Z    INFO    setup    Manager starting up    {"gitCommit": "9b93167760060d05e675e19ca321c44e6a28d387", "gitBranch": "github-issue-1837-fix-exec-in-pitr-and-proxysql-pods", "buildTime": "2024-10-07T13:04:08Z", "goVersion": "go1.22.8", "os": "linux", "arch": "amd64"}
2024-10-07T15:32:03.434Z    INFO    setup    Registering Components.
2024-10-07T15:32:05.830Z    INFO    controller-runtime.webhook    Registering webhook    {"path": "/validate-percona-xtradbcluster"}
2024-10-07T15:32:05.831Z    INFO    setup    Starting the Cmd.
2024-10-07T15:32:05.832Z    INFO    controller-runtime.metrics    Starting metrics server
2024-10-07T15:32:05.832Z    INFO    controller-runtime.metrics    Serving metrics server    {"bindAddress": ":8080", "secure": false}
2024-10-07T15:32:05.832Z    INFO    starting server    {"name": "health probe", "addr": ":8081"}
2024-10-07T15:32:05.832Z    INFO    controller-runtime.webhook    Starting webhook server
2024-10-07T15:32:05.832Z    INFO    controller-runtime.certwatcher    Updated current TLS certificate
2024-10-07T15:32:05.832Z    INFO    controller-runtime.webhook    Serving webhook server    {"host": "", "port": 9443}
2024-10-07T15:32:05.832Z    INFO    controller-runtime.certwatcher    Starting certificate watcher
2024-10-07T15:32:05.933Z    INFO    attempting to acquire leader lease core-mysql/08db1feb.percona.com...
2024-10-07T15:32:20.979Z    INFO    successfully acquired lease core-mysql/08db1feb.percona.com
2024-10-07T15:32:20.980Z    INFO    Starting EventSource    {"controller": "pxc-controller", "source": "kind source: *v1.PerconaXtraDBCluster"}
2024-10-07T15:32:20.980Z    INFO    Starting EventSource    {"controller": "pxcbackup-controller", "source": "kind source: *v1.PerconaXtraDBClusterBackup"}
2024-10-07T15:32:20.980Z    INFO    Starting Controller    {"controller": "pxc-controller"}
2024-10-07T15:32:20.980Z    INFO    Starting EventSource    {"controller": "pxcrestore-controller", "source": "kind source: *v1.PerconaXtraDBClusterRestore"}
2024-10-07T15:32:20.980Z    INFO    Starting Controller    {"controller": "pxcbackup-controller"}
2024-10-07T15:32:20.980Z    INFO    Starting Controller    {"controller": "pxcrestore-controller"}
2024-10-07T15:32:21.187Z    INFO    Starting workers    {"controller": "pxcbackup-controller", "worker count": 1}
2024-10-07T15:32:21.187Z    INFO    Starting workers    {"controller": "pxcrestore-controller", "worker count": 1}
2024-10-07T15:32:21.187Z    INFO    Starting workers    {"controller": "pxc-controller", "worker count": 1}
2024-10-07T15:32:23.033Z    INFO    Creating or updating backup job    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "e2a2d529-7113-469c-933a-458c2194ab8d", "name": "41015-daily-backup", "schedule": "0 0 * * *"}
2024-10-07T15:32:23.033Z    INFO    Creating or updating backup job    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "e2a2d529-7113-469c-933a-458c2194ab8d", "name": "41015-sat-night-backup", "schedule": "0 0 * * 6"}
2024-10-07T15:32:23.315Z    INFO    add new job    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "e2a2d529-7113-469c-933a-458c2194ab8d", "name": "ensure-version/core-mysql/mysql-db", "schedule": "0 4 * * *"}
2024-10-07T15:32:23.315Z    INFO    add new job    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "e2a2d529-7113-469c-933a-458c2194ab8d", "name": "telemetry/core-mysql/mysql-db", "schedule": "51 * * * *"}
2024-10-07T15:33:33.542Z    ERROR    sync users    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "633305af-d46a-4e39-8df1-c0427c8d55b6", "error": "exec syncusers failed after retries on proxysql pod mysql-db-proxysql-0: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-0/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-0/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF\nexec syncusers failed after retries on proxysql pod mysql-db-proxysql-0\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).syncPXCUsersWithProxySQL\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/users.go:934\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1
    /go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1226
2024-10-07T15:41:52.991Z    ERROR    sync users    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "0801a879-97be-4238-8213-503500e27397", "error": "exec syncusers failed after retries on proxysql pod mysql-db-proxysql-2: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-2/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-2/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF\nexec syncusers failed after retries on proxysql pod mysql-db-proxysql-2\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).syncPXCUsersWithProxySQL\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/users.go:934\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1
    /go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1226
2024-10-07T15:49:54.292Z    ERROR    Reconciler error    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "f23554d0-1b23-4c89-997d-572c50e146b5", "error": "exec binlog collector pod mysql-db-pitr-6987f698c8-nknx4: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-pitr-6987f698c8-nknx4/exec?command=%2Fbin%2Fbash&command=-c&command=cat+%2Ftmp%2Fgap-detected+%7C%7C+true&container=pitr&stderr=true&stdout=true\": unexpected EOF", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-pitr-6987f698c8-nknx4/exec?command=%2Fbin%2Fbash&command=-c&command=cat+%2Ftmp%2Fgap-detected+%7C%7C+true&container=pitr&stderr=true&stdout=true\": unexpected EOF\nexec binlog collector pod mysql-db-pitr-6987f698c8-nknx4\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/pxc/backup.CheckPITRErrors\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/pxc/backup/pitr.go:64\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).Reconcile\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:432\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:116\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:303\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email 
protected]/pkg/internal/controller/controller.go:224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:224
2024-10-07T15:51:53.051Z    ERROR    sync users    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "07628537-d0bf-47bd-b3c3-a5d814ebfe5b", "error": "exec syncusers failed after retries on proxysql pod mysql-db-proxysql-1: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-1/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-1/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF\nexec syncusers failed after retries on proxysql pod mysql-db-proxysql-1\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).syncPXCUsersWithProxySQL\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/users.go:934\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1
    /go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1226
2024-10-07T16:02:38.171Z    ERROR    sync users    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "22dcb03b-dd9e-49d4-9c7c-54b858b90b1c", "error": "exec syncusers failed after retries on proxysql pod mysql-db-proxysql-2: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-2/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-2/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF\nexec syncusers failed after retries on proxysql pod mysql-db-proxysql-2\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).syncPXCUsersWithProxySQL\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/users.go:934\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1
    /go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1226
2024-10-07T16:06:45.983Z    ERROR    sync users    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "bc678940-0715-4d26-8adf-2c940bf65dc7", "error": "exec syncusers failed after retries on proxysql pod mysql-db-proxysql-2: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-2/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-2/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF\nexec syncusers failed after retries on proxysql pod mysql-db-proxysql-2\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).syncPXCUsersWithProxySQL\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/users.go:934\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1
    /go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1226
2024-10-07T16:17:22.495Z    ERROR    Reconciler error    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "dde8cb2b-f6c1-4a66-99a8-79042a1a80b8", "error": "exec binlog collector pod mysql-db-pitr-6987f698c8-nknx4: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-pitr-6987f698c8-nknx4/exec?command=%2Fbin%2Fbash&command=-c&command=cat+%2Ftmp%2Fpitr-timeline+%7C%7C+true&container=pitr&stderr=true&stdout=true\": unexpected EOF", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-pitr-6987f698c8-nknx4/exec?command=%2Fbin%2Fbash&command=-c&command=cat+%2Ftmp%2Fpitr-timeline+%7C%7C+true&container=pitr&stderr=true&stdout=true\": unexpected EOF\nexec binlog collector pod mysql-db-pitr-6987f698c8-nknx4\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/pxc/backup.UpdatePITRTimeline\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/pxc/backup/pitr.go:128\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).Reconcile\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:437\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:116\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:303\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email 
protected]/pkg/internal/controller/controller.go:224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:224
2024-10-07T16:24:37.085Z    ERROR    sync users    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "7dca56e5-842e-4242-94bd-42c7e4f6a9a0", "error": "exec syncusers failed after retries on proxysql pod mysql-db-proxysql-0: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-0/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-0/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF\nexec syncusers failed after retries on proxysql pod mysql-db-proxysql-0\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).syncPXCUsersWithProxySQL\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/users.go:934\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1
    /go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1226
2024-10-07T16:32:28.123Z    ERROR    sync users    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "f301c060-ce23-492e-bf66-d10830a39f1d", "error": "exec syncusers failed after retries on proxysql pod mysql-db-proxysql-1: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-1/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-1/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF\nexec syncusers failed after retries on proxysql pod mysql-db-proxysql-1\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).syncPXCUsersWithProxySQL\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/users.go:934\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1
    /go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1226
2024-10-07T16:33:35.711Z    ERROR    sync users    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "29b61588-2511-49f1-8ba9-e1443395cf1d", "error": "exec syncusers failed after retries on proxysql pod mysql-db-proxysql-2: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-2/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-2/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF\nexec syncusers failed after retries on proxysql pod mysql-db-proxysql-2\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).syncPXCUsersWithProxySQL\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/users.go:934\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1
    /go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1226
2024-10-07T16:33:35.740Z    ERROR    Reconciler error    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "29b61588-2511-49f1-8ba9-e1443395cf1d", "error": "exec binlog collector pod mysql-db-pitr-6987f698c8-nknx4: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-pitr-6987f698c8-nknx4/exec?command=%2Fbin%2Fbash&command=-c&command=cat+%2Ftmp%2Fpitr-timeline+%7C%7C+true&container=pitr&stderr=true&stdout=true\": unexpected EOF", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-pitr-6987f698c8-nknx4/exec?command=%2Fbin%2Fbash&command=-c&command=cat+%2Ftmp%2Fpitr-timeline+%7C%7C+true&container=pitr&stderr=true&stdout=true\": unexpected EOF\nexec binlog collector pod mysql-db-pitr-6987f698c8-nknx4\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/pxc/backup.UpdatePITRTimeline\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/pxc/backup/pitr.go:128\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).Reconcile\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:437\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:116\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:303\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email 
protected]/pkg/internal/controller/controller.go:224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:224
2024-10-07T16:40:34.791Z    ERROR    sync users    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "98231a08-07ab-4047-a38e-3db2e7917ae8", "error": "exec syncusers failed after retries on proxysql pod mysql-db-proxysql-2: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-2/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": dial tcp 10.0.64.1:443: connect: connection timed out", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-2/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": dial tcp 10.0.64.1:443: connect: connection timed out\nexec syncusers failed after retries on proxysql pod mysql-db-proxysql-2\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).syncPXCUsersWithProxySQL\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/users.go:934\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1
    /go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1226
2024-10-07T16:46:19.610Z    ERROR    sync users    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "431416ad-7a87-4a07-9065-0de89ce8a54c", "error": "exec syncusers failed after retries on proxysql pod mysql-db-proxysql-2: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-2/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-2/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF\nexec syncusers failed after retries on proxysql pod mysql-db-proxysql-2\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).syncPXCUsersWithProxySQL\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/users.go:934\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1
    /go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1226
2024-10-07T16:48:22.514Z    ERROR    Reconciler error    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "21d1c283-44be-42d4-a1e1-daca2b547fbe", "error": "exec binlog collector pod mysql-db-pitr-6987f698c8-nknx4: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-pitr-6987f698c8-nknx4/exec?command=%2Fbin%2Fbash&command=-c&command=cat+%2Ftmp%2Fgap-detected+%7C%7C+true&container=pitr&stderr=true&stdout=true\": unexpected EOF", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-pitr-6987f698c8-nknx4/exec?command=%2Fbin%2Fbash&command=-c&command=cat+%2Ftmp%2Fgap-detected+%7C%7C+true&container=pitr&stderr=true&stdout=true\": unexpected EOF\nexec binlog collector pod mysql-db-pitr-6987f698c8-nknx4\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/pxc/backup.CheckPITRErrors\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/pxc/backup/pitr.go:64\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).Reconcile\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:432\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:116\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:303\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email 
protected]/pkg/internal/controller/controller.go:224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:224
2024-10-07T16:51:37.677Z    ERROR    sync users    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "1e7ed1c9-361b-4b35-8456-54cbca3a0d37", "error": "exec syncusers failed after retries on proxysql pod mysql-db-proxysql-0: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-0/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-0/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF\nexec syncusers failed after retries on proxysql pod mysql-db-proxysql-0\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).syncPXCUsersWithProxySQL\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/users.go:934\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1
    /go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1226
2024-10-07T17:11:00.320Z    ERROR    sync users    {"controller": "pxc-controller", "namespace": "core-mysql", "name": "mysql-db", "reconcileID": "78f9200d-0113-43c6-9713-bfac85500e20", "error": "exec syncusers failed after retries on proxysql pod mysql-db-proxysql-0: error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-0/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF", "errorVerbose": "error sending request: Post \"https://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-0/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true\": unexpected EOF\nexec syncusers failed after retries on proxysql pod mysql-db-proxysql-0\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).syncPXCUsersWithProxySQL\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/users.go:934\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).resyncPXCUsersWithProxySQL.func1
    /go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:1226

@dcaputo-harmoni
Contributor Author

dcaputo-harmoni commented Oct 7, 2024

Here is some additional debugging info from the operator container (I downloaded the latest release of websocat into it):

./websocat "wss://10.0.64.1:443/api/v1/namespaces/core-mysql/pods/mysql-db-proxysql-0/exec?command=proxysql-admin&command=--syncusers&command=--add-query-rule&container=proxysql&stderr=true&stdout=true" --header="Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" -k -E

 Syncing user accounts from PXC(mysql-db-pxc-0.mysql-db-pxc.core-mysql.svc.cluster.local:3306) to ProxySQL
 Synced PXC users to the ProxySQL database!

As stated earlier, the -k (insecure mode) flag is needed because the certs are self-signed and websocat doesn't support providing a certificate.
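For anyone reproducing the websocat test against a different pod, here is a minimal sketch of how the exec URL seen in the logs can be assembled. The `execURL` helper and its parameter names are hypothetical and only illustrate the query layout (repeated `command` parameters, then `container`, `stderr`, `stdout`); this is not the operator's or client-go's actual code.

```go
package main

import (
	"fmt"
	"net/url"
)

// execURL is a hypothetical helper that builds a pod exec URL with the same
// query layout as the requests in the operator logs. Each element of command
// becomes its own "command" query parameter, in order.
func execURL(scheme, host, namespace, pod, container string, command []string) string {
	q := url.Values{}
	for _, arg := range command {
		q.Add("command", arg)
	}
	q.Set("container", container)
	q.Set("stderr", "true")
	q.Set("stdout", "true")

	u := url.URL{
		Scheme: scheme,
		Host:   host,
		Path:   fmt.Sprintf("/api/v1/namespaces/%s/pods/%s/exec", namespace, pod),
		// Encode sorts by key, so "command" comes before "container",
		// "stderr" and "stdout"; values under one key keep their order.
		RawQuery: q.Encode(),
	}
	return u.String()
}

func main() {
	// Values taken from the log lines in this thread.
	fmt.Println(execURL(
		"wss", "10.0.64.1:443", "core-mysql", "mysql-db-proxysql-0", "proxysql",
		[]string{"proxysql-admin", "--syncusers", "--add-query-rule"},
	))
}
```

Passing `[]string{"/bin/bash", "-c", "cat /tmp/gap-detected || true"}` with the `pitr` container yields the percent-encoded query string shown in the reconciler error.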

@dcaputo-harmoni
Contributor Author

Further update: I've managed to get this working, see the last commit. In short, all of the exec commands could use retries on error, so I built the logic from client-go/util/retry into clientcmd.go and had it retry on all errors. This seems to have addressed all of the issues: I've now observed it encounter errors requiring retries and successfully recover on the retry. I'll keep this code running for a while longer, but this feels like it has done it.
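To illustrate the retry-on-all-errors approach described above, here is a rough standard-library-only sketch. It is not the code from the commit: `Backoff`, `retryOnError`, `flakyExec` and `alwaysRetry` are hypothetical names, and `Backoff` is a simplified stand-in for the backoff settings that client-go/util/retry takes (no jitter, no cap).

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Backoff is a simplified, hypothetical stand-in for the backoff settings
// used with client-go/util/retry.
type Backoff struct {
	Steps    int           // maximum number of attempts
	Duration time.Duration // delay before the second attempt
	Factor   float64       // multiplier applied to the delay after each retry
}

// retryOnError calls fn until it succeeds, returns an error that retriable
// rejects, or the attempts are exhausted. It returns the last error seen.
func retryOnError(b Backoff, retriable func(error) bool, fn func() error) error {
	steps := b.Steps
	if steps < 1 {
		steps = 1 // always try at least once
	}
	delay := b.Duration
	var err error
	for attempt := 0; attempt < steps; attempt++ {
		if attempt > 0 {
			time.Sleep(delay)
			delay = time.Duration(float64(delay) * b.Factor)
		}
		if err = fn(); err == nil || !retriable(err) {
			return err
		}
	}
	return err
}

// flakyExec returns a stand-in for a pod exec call that fails with an
// "unexpected EOF" style error for the first `failures` calls, then succeeds.
// The returned pointer exposes how many times the call was made.
func flakyExec(failures int) (func() error, *int) {
	calls := 0
	return func() error {
		calls++
		if calls <= failures {
			return errors.New("error sending request: unexpected EOF")
		}
		return nil
	}, &calls
}

// alwaysRetry treats every error as retriable, matching "retry on all errors".
func alwaysRetry(error) bool { return true }

func main() {
	exec, calls := flakyExec(2)
	b := Backoff{Steps: 5, Duration: 10 * time.Millisecond, Factor: 2}
	err := retryOnError(b, alwaysRetry, exec)
	fmt.Printf("err=%v after %d attempts\n", err, *calls)
}
```

Retrying on every error is the simplest predicate; a narrower `retriable` function could limit retries to transient transport failures if retrying non-idempotent commands ever becomes a concern.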

@pull-request-size pull-request-size bot added size/L 100-499 lines and removed size/M 30-99 lines labels Oct 11, 2024
@dcaputo-harmoni
Contributor Author

I've added the same retry logic to PodLogs and IsPodRunning, since those can occasionally hit the same errors.

@JNKPercona
Collaborator

Test name Status
affinity-8-0 passed
auto-tuning-8-0 passed
cross-site-8-0 passed
demand-backup-cloud-8-0 failure
demand-backup-encrypted-with-tls-8-0 failure
demand-backup-8-0 passed
haproxy-5-7 passed
haproxy-8-0 passed
init-deploy-5-7 passed
init-deploy-8-0 passed
limits-8-0 passed
monitoring-2-0-8-0 passed
one-pod-5-7 passed
one-pod-8-0 passed
pitr-8-0 passed
pitr-gap-errors-8-0 passed
proxy-protocol-8-0 passed
proxysql-sidecar-res-limits-8-0 passed
pvc-resize-5-7 passed
pvc-resize-8-0 passed
recreate-8-0 passed
restore-to-encrypted-cluster-8-0 passed
scaling-proxysql-8-0 passed
scaling-8-0 passed
scheduled-backup-5-7 passed
scheduled-backup-8-0 passed
security-context-8-0 passed
smart-update1-8-0 passed
smart-update2-8-0 passed
storage-8-0 passed
tls-issue-cert-manager-ref-8-0 passed
tls-issue-cert-manager-8-0 passed
tls-issue-self-8-0 passed
upgrade-consistency-8-0 passed
upgrade-haproxy-5-7 passed
upgrade-haproxy-8-0 passed
upgrade-proxysql-5-7 passed
upgrade-proxysql-8-0 failure
users-5-7 passed
users-8-0 passed
validation-hook-8-0 passed
We ran 41 out of 41 tests.

commit: e1d4558
image: perconalab/percona-xtradb-cluster-operator:PR-1838-e1d45581

Labels
community size/L 100-499 lines
Development

Successfully merging this pull request may close these issues.

Exec timeouts on PITR reconciler and sync users processes
4 participants