K8SPSMDB-1219: PBM multi storage support #1843
base: main
Conversation
pkg/apis/psmdb/v1/psmdb_defaults.go
Outdated
if len(cr.Spec.Backup.Storages) > 1 {
	mainFound := false
	for _, stg := range cr.Spec.Backup.Storages {
		if stg.Main {
			mainFound = true
		}
	}

	if !mainFound {
		return errors.New("main backup storage is not specified")
	}
}
Since we can have multiple storages configured, we can also end up with multiple storages marked as main. I think we can increase the scope of this validation and ensure that only one main storage is defined.
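A minimal sketch of what that stricter validation could look like; the second error message is an assumption, not something from this PR:

```
// Count storages marked as main; when more than one storage is
// configured, exactly one of them must be main.
mainCount := 0
for _, stg := range cr.Spec.Backup.Storages {
	if stg.Main {
		mainCount++
	}
}
switch {
case mainCount == 0:
	return errors.New("main backup storage is not specified")
case mainCount > 1:
	return errors.New("multiple storages marked as main, only one is allowed")
}
```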
@@ -35,6 +35,7 @@ import (
var (
	GitCommit string
	GitBranch string
	BuildTime string
Out of curiosity, why is build time needed for our log?
For consistency. We have this in the PXC operator and it's useful in case of problems with image caching.
pkg/apis/psmdb/v1/psmdb_types.go
Outdated
	return name, stg, nil
}

return name, stg, errors.New("main storage not found")
We can write this as `return "", BackupStorageSpec{}, errors.New("main storage not found")` and completely drop the vars at the start of the function. It is more idiomatic.
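A sketch of that shape, using the names from the surrounding diff; the early returns and the final zero-value return are the point here, the exact body of the PR's function may differ:

```
func (b BackupSpec) MainStorage() (string, BackupStorageSpec, error) {
	for name, stg := range b.Storages {
		if !stg.Main {
			continue
		}
		return name, stg, nil
	}
	return "", BackupStorageSpec{}, errors.New("main storage not found")
}
```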
	AnnotationPVCResizeInProgress = "percona.com/pvc-resize-in-progress"
)

func (cr *PerconaServerMongoDB) PBMResyncNeeded() bool {
It is totally fine as is, but if we want to be super bulletproof, we can do this:
func (cr *PerconaServerMongoDB) PBMResyncNeeded() bool {
v, exists := cr.Annotations[AnnotationResyncPBM]
return exists && v != ""
}
@@ -299,11 +299,20 @@ func (r *ReconcilePerconaServerMongoDB) rsStatus(ctx context.Context, cr *api.Pe
	}

	for _, pod := range list.Items {
		if pod.DeletionTimestamp != nil {
Recommend we use `.IsZero()` for similar checks.
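For reference, the suggested form would be roughly this; `metav1.Time.IsZero` also returns true for a nil timestamp, so it covers the nil check and the zero-value case in one go (the `continue` is an assumption about the surrounding loop body):

```
for _, pod := range list.Items {
	// Skip pods that are being deleted.
	if !pod.DeletionTimestamp.IsZero() {
		continue
	}
	// ...
}
```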
@@ -1052,6 +1054,27 @@ func (b BackupSpec) IsEnabledPITR() bool {
	return b.PITR.Enabled
}

func (b BackupSpec) MainStorage() (string, BackupStorageSpec, error) {
I would love it if we added a unit test for this function, even a really basic one, mainly because getting the right main storage is super critical not to break, and I wouldn't like us to rely on tests that somehow verify this together with other logic.
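Something along these lines would do; this is only a sketch of a basic test, not necessarily what ends up in the PR:

```
func TestMainStorage(t *testing.T) {
	spec := BackupSpec{
		Storages: map[string]BackupStorageSpec{
			"s3-main":  {Main: true, Type: BackupStorageS3},
			"s3-other": {Type: BackupStorageS3},
		},
	}

	name, stg, err := spec.MainStorage()
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	if name != "s3-main" || !stg.Main {
		t.Fatalf("expected s3-main as main storage, got %q", name)
	}

	// No storages configured: expect an error.
	if _, _, err := (BackupSpec{}).MainStorage(); err == nil {
		t.Fatal("expected error when no main storage is configured")
	}
}
```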
pbm, err := backup.NewPBM(ctx, r.client, cr)
if err != nil {
	return errors.Wrap(err, "create pbm object")
}
defer pbm.Close(ctx)
Since `pbm` is used in this scope for the first time here:

if err := enablePiTRIfNeeded(ctx, pbm, cr); err != nil {
	return errors.Wrap(err, "enable pitr if needed")
}

we can save some API calls, when the full backup logic needs to return early, by moving the PBM client creation after the if clause.
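i.e. something like this, keeping the names from the diff:

```
// ... full backup logic that may return early ...

// Create the PBM client only once it is actually needed.
pbm, err := backup.NewPBM(ctx, r.client, cr)
if err != nil {
	return errors.Wrap(err, "create pbm object")
}
defer pbm.Close(ctx)

if err := enablePiTRIfNeeded(ctx, pbm, cr); err != nil {
	return errors.Wrap(err, "enable pitr if needed")
}
```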
	}
}

if err := enablePiTRIfNeeded(ctx, pbm, cr); err != nil {
🙌🏽 nice, I like these new functions.
// running in separate goroutine to not block reconciliation
// until all resync operations finished
go func() {
Here we pass the cr object but it may change in another reconciliation loop, right? I think we should handle this scenario with some form of syncing or queuing, e.g. if multiple loops occur within a short time.
But maybe if a resync is running, new reconciliations won’t start another routine
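One way to make that guarantee explicit, sketched with a hypothetical `resyncInProgress atomic.Bool` field on the reconciler (not part of this PR):

```
// Skip spawning a second resync if one is already running; the flag is
// cleared when the background goroutine finishes.
if r.resyncInProgress.CompareAndSwap(false, true) {
	go func() {
		defer r.resyncInProgress.Store(false)
		// Use a fresh context so a finished reconcile loop doesn't cancel the resync.
		if err := pbm.ResyncMainStorageAndWait(context.Background()); err != nil {
			log.Error(err, "resync main storage")
		}
	}()
}
```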
func (b *pbmC) ResyncMainStorageAndWait(ctx context.Context) error {
	if err := b.ResyncMainStorage(ctx); err != nil {
		return errors.Wrap(err, "start resync")
	}
Just curious, would it be helpful to add a log message here, e.g. `start main storage resync`?
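i.e. roughly (just a suggestion for the message, placed before the wait loop):

```
log.Info("starting main storage resync")
if err := b.ResyncMainStorage(ctx); err != nil {
	return errors.Wrap(err, "start resync")
}
```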
ticker := time.NewTicker(1 * time.Second)
defer ticker.Stop()

log.Info("waiting for resync to start")
Would it be helpful to users if we logged how long we've been waiting? Does it make sense in this case to make the timeouts configurable?
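A sketch of how a bounded, visible wait could look; the 5-minute timeout and the elapsed-time key are assumptions for illustration, and the actual status check is elided:

```
waitCtx, cancel := context.WithTimeout(ctx, 5*time.Minute)
defer cancel()

start := time.Now()
ticker := time.NewTicker(1 * time.Second)
defer ticker.Stop()

for {
	select {
	case <-waitCtx.Done():
		return errors.Wrapf(waitCtx.Err(), "resync did not start after %s", time.Since(start))
	case <-ticker.C:
		log.Info("waiting for resync to start", "elapsed", time.Since(start).String())
		// ... check resync status and return nil once it has started ...
	}
}
```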
Force-pushed from 1baaa9d to 0f208f4 (Compare)
"single storage": { | ||
spec: BackupSpec{ | ||
Storages: map[string]BackupStorageSpec{ | ||
"storage-1": BackupStorageSpec{ |
[gofmt] reported by reviewdog 🐶
"storage-1": BackupStorageSpec{ | |
"storage-1": { |
"multiple storages": { | ||
spec: BackupSpec{ | ||
Storages: map[string]BackupStorageSpec{ | ||
"storage-1": BackupStorageSpec{ |
[gofmt] reported by reviewdog 🐶
"storage-1": BackupStorageSpec{ | |
"storage-1": { |
Type: BackupStorageS3,
S3:   BackupStorageS3Spec{},
},
"storage-2": BackupStorageSpec{
[gofmt] reported by reviewdog 🐶
"storage-2": BackupStorageSpec{ | |
"storage-2": { |
Type: BackupStorageS3,
S3:   BackupStorageS3Spec{},
},
"storage-3": BackupStorageSpec{
[gofmt] reported by reviewdog 🐶
"storage-3": BackupStorageSpec{ | |
"storage-3": { |
logf "sigs.k8s.io/controller-runtime/pkg/log" | ||
|
||
"github.com/percona/percona-backup-mongodb/pbm/config" | ||
psmdbv1 "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1" |
[goimports-reviser] reported by reviewdog 🐶
psmdbv1 "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1" | |
psmdbv1 "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1" |
Force-pushed from 0f208f4 to d8ac896 (Compare)
destroy $namespace
log "test passed"
[shfmt] reported by reviewdog 🐶
The operator has always supported multiple storages, but there was no native multi-storage support until PBM v2.6.0. The operator was reconfiguring PBM every time the user selected a storage for their backups/restores that differed from the previous one. This caused long wait periods, especially on storages with lots of backups, due to the resync operation. Another limitation was forcing users to have only one backup storage if they wanted to enable point-in-time recovery: with multiple storages, PBM would upload oplog chunks to whatever storage was last used by a backup/restore, which made consistent recovery impossible. PBM v2.6.0 added native support for multiple storages and these changes introduce it to our operator:

* The user can have one main storage in the PBM configuration. Any other storages can be added as profiles. The main storage can be found with:

```
kubectl exec cluster1-rs0-0 -c backup-agent -- pbm config
```

This commit introduces a new field `main` in the storage spec:

```
storages:
  s3-us-west:
    main: true
    type: s3
    s3:
      bucket: operator-testing
      credentialsSecret: cluster1-s3-secrets
      region: us-west-2
```

If the user only has one storage configured in `cr.yaml`, the operator will automatically use it as the main storage. If more than one storage is configured, one of them must have `main: true`. The user can't have more than one storage with `main: true`. If the user changes the main storage in `cr.yaml`, the operator will configure PBM with the new storage and start a resync.

* Any other storage in `cr.yaml` will be added to PBM as a profile. The user can see profiles using the CLI:

```
kubectl exec cluster1-rs0-0 -c backup-agent -- pbm profile list
```

When the user adds a new profile to `cr.yaml`, the operator will add it to PBM but won't start a resync. **`pbm config --force-resync` only starts a resync for the main storage.** To manually resync a profile:

```
kubectl exec cluster1-rs0-0 -c backup-agent -- pbm profile sync <storage-name>
```

If the user starts a restore using a backup in a storage configured as a profile, the operator will start a resync operation for that profile and block the restore until the resync finishes. Note: profiles are also called external storages in the PBM documentation.

* If the user has multiple storages in `cr.yaml` and changes the main storage between them, the operator:
1. configures PBM with the new main storage.
2. adds the old main storage as a profile.
3. deletes the profile for the new main storage.

* If the user configures `backupSource` in backups/restores:
  * if `cr.yaml` has no storages configured, the operator configures PBM with the storage data in the `backupSource` field. This storage will effectively be the main storage until the user adds a storage to `cr.yaml`. After a storage is configured, the PBM configuration will be overwritten and the `backupSource` storage will be gone.
  * if `cr.yaml` has a storage configured, the operator adds the `backupSource` storage as a profile.

* Oplog chunks will only be uploaded to the main storage. The user can use any backup as the base backup for point-in-time recovery.

* Incremental backup chains all need to be stored in the same storage. TBD after #1836 is merged.

---

Other significant changes in operator behavior:

* The operator now automatically configures PBM on a fresh cluster. Before these changes, PBM was not configured until the user started a backup/restore after deploying a fresh cluster. Now PBM will be configured directly with the main storage in `cr.yaml` and a resync will be started in the background.

* There's a new field in the `PerconaServerMongoDB` status: `backupConfigHash`. The operator will maintain a hash of the current PBM configuration and reconfigure PBM if the hash changes. Fields in `spec.backup.pitr` are excluded from the hash calculation; they're handled separately.

* If `PerconaServerMongoDB` is annotated with `percona.com/resync-pbm=true`, the operator will start a resync operation both for the main storage and the profiles. Resync for profiles is started with the equivalent of `pbm profile sync --all`. These resync operations will run in the background and will not block the reconciliation.

* If a backup that has the `percona.com/delete-backup` finalizer is deleted, the operator will only delete its oplog chunks if it is in the main storage.
Force-pushed from c653f23 to 15256b4 (Compare)
run_mongo "use ${database}\n db.${collection}.${command}()" "$uri" "mongodb" "$suffix" \
local full_command="db.${collection}.${command}()"
if [[ ! -z ${sort} ]]; then
[shfmt] reported by reviewdog 🐶
-	if [[ ! -z ${sort} ]]; then
+	if [[ -n ${sort} ]]; then
commit: 15256b4
CHECKLIST

Jira
- Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?

Tests
- Are OpenShift compare files changed for E2E tests (compare/*-oc.yml)?

Config/Logging/Testability