Propose WAL format versioning and change strategy. #40
We recommend the [Two-Fold Migration Strategy](#two-fold-migration-strategy) with one variation: a configuration flag that tells Prometheus to write a Y+1 version if the user desires.

In other words, we propose to add a TSDB **integer** flag `--storage.tsdb.write-wal-format` that tells Prometheus to use a particular WAL format for both WAL and WBL. This kind of flag will change its default to a new version ONLY when the previous Prometheus version can read that version.
I think we should be more specific about the version here. Prometheus defines LTS versions, so this should read:
In other words, we propose to add a TSDB **integer** flag `--storage.tsdb.write-wal-format` that tells Prometheus to use a particular WAL format for both WAL and WBL. This kind of flag will change its default to a new version ONLY when the previous Prometheus version can read that version.
In other words, we propose to add a TSDB **integer** flag `--storage.tsdb.write-wal-format` that tells Prometheus to use a particular WAL format for both WAL and WBL. This kind of flag will change its default to a new version ONLY when the previous LTS Prometheus version can read that version.
This would mean a two-year lifetime for a WAL version (but of course it may mean a jump of more than one version). So let's suppose we implement this flag and the default is 1 in version 3.1 LTS (or something). To set the default to 2 (or more) we need an LTS release a year later that can read version 2 but defaults to 1. Another year later we could set the default to 2.
WDYT?
oh, I later read the alternatives - I think it makes sense to instill a sense of stability
Ah, sorry. Copy-pasta. I really wanted to NOT do LTS for now, but initially I thought that's a good idea. This should be in alternatives. WDYT? EDIT: Actually, no you just propose an idea that's discussed in the alternatives.
This would mean a two-year lifetime for a WAL version (but of course it may mean a jump of more than one version). So let's suppose we implement this flag and the default is 1 in version 3.1 LTS (or something). To set the default to 2 (or more) we need an LTS release a year later that can read version 2 but defaults to 1. Another year later we could set the default to 2.
I'm not sure it makes sense to bind the default switch to a yearly LTS (is it once a year?) or, more generally, to any fixed time window. If we have a WAL version that is extensively tested and better, is it sensible to wait a year to switch defaults? It's a rather heavy process. However, we can consider it.
I added more details on why LTS is not an ideal option in the alternatives, can you check?
Yes, one year https://prometheus.io/docs/introduction/release-cycle/
We could backport the ability to read the new WAL, to the current LTS.
Signed-off-by: bwplotka <[email protected]>
Co-authored-by: George Krajcsovits <[email protected]> Signed-off-by: Bartlomiej Plotka <[email protected]>
Signed-off-by: bwplotka <[email protected]>
We propose the addition of a `meta.json` file in the wal directory, similar to [block meta.json](https://github.com/prometheus/prometheus/blob/5e124cf4f2b9467e4ae1c679840005e727efd599/tsdb/block.go#L171), with the Version field set to `1` for the current format and `2` for new changes, e.g. when we start to write [new records](https://github.com/prometheus/prometheus/pull/15467/files). A missing `meta.json` is equivalent to a `meta.json` file containing `{"version":1}`.
We propose to also store the new WALs in separate directories, e.g. `wal.v2`. Thanks to that, the rewrite from one version to another is **eventual** and can be done segment by segment without any forcible rewrite. The WAL will be rewritten within the next 2h of normal operation. An additional advantage of this way of versioning is that it's clear when your WAL has fully migrated to a certain version.
Should mention checkpoints here.
* We already suffer from replay problems, so I propose an eventual rewrite.
* A rewrite is riskier than a read-only WAL (of the previous version).

### Don't version WAL, don't introduce a flag
### Don't version WAL, don't introduce a flag
### Alternative: Don't version WAL, don't introduce a flag
I suggest putting "Alternative" in each of these sub-headings, because it isn't clear once you've scrolled down a bit.
E_TOO_MANY_ALTERNATIVES
## Alternatives

### Require LTS support
This is more of an extension than an alternative.
### Record-based or segment-based WAL versioning

Given we usually change the WAL by changing its records, the WAL version could simply be the [max number of types](https://github.com/prometheus/prometheus/blob/5e124cf4f2b9467e4ae1c679840005e727efd599/tsdb/record/record.go#L54) we write to the WAL.
Alternatively, it could be per segment, e.g. introduce a special version record type that appears only at the front of the segment file.
Mention checkpoints here.
* changes that merge records
* sharding?

### Maintain two WALs (well four, with WBL)
Why do we have separate WAL and WBL?
We recommend the [Two-Fold Migration Strategy](#two-fold-migration-strategy) with two details:

* A new flag that tells Prometheus what WAL version to write.
* There can be multiple "forward compatible" versions, but the official minimum is one (see the rejected [LTS idea](#require-lts-support)).
Who rejected it?
I propose to reject it in this proposal. If we decide the opposite, I will switch. Sorry for the confusion; I assume I can't use the past tense here?
Cons:
* Extremely heavy process that will make us afraid/refuse to make improvements to WAL, because it's too much work. It might fails our goal of `Balancing development velocity with user data stability risks`
* One mitigation would be an LTS retroactive strategy e.g. LTS 3.1 only supports WALv1, 3.3 adds WALv2, we do 3.1.1 with WALv2 too, 3.4 can now switch to WALv2, 4.0 can remove WALv1 support. It gives us more flexibility, but it's not very realistic to do patch release of LTS with risky feature like a new WAL.
* We literally have no formal process for LTS versions and we don't do them regularly.
* We literally have no formal process for LTS versions and we don't do them regularly.
* We have no formal process for LTS versions and we don't do them regularly.
1. LTS 3.1 only supports WAL v1.
2. 3.3 adds WAL v2.
3. We wait until the next LTS, e.g. 3.24.
4. 3.25 can now switch to WAL v2.
5. 4.0 can remove WAL v1 support.
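The LTS rule behind these steps can be checked mechanically. The sketch below is a hypothetical model, not Prometheus code: the `release` type and `defaultAllowed` helper are invented for illustration, and it assumes that LTS 3.24 can read both v1 and v2 while still writing v1.

```go
package main

import "fmt"

// release is a hypothetical description of one Prometheus release.
type release struct {
	name   string
	lts    bool
	reads  []int // WAL versions the release can read
	writes int   // WAL version the release writes by default
}

// canRead reports whether r can read the given WAL version.
func canRead(r release, version int) bool {
	for _, v := range r.reads {
		if v == version {
			return true
		}
	}
	return false
}

// defaultAllowed reports whether timeline[i] may write its default WAL
// version under the LTS rule: the most recent earlier LTS release must be
// able to read it. With no earlier LTS there is nothing to stay compatible with.
func defaultAllowed(timeline []release, i int) bool {
	for j := i - 1; j >= 0; j-- {
		if timeline[j].lts {
			return canRead(timeline[j], timeline[i].writes)
		}
	}
	return true
}

func main() {
	// Version numbers follow the list above; the read sets are assumptions.
	timeline := []release{
		{"3.1", true, []int{1}, 1},
		{"3.3", false, []int{1, 2}, 1},
		{"3.24", true, []int{1, 2}, 1},
		{"3.25", false, []int{1, 2}, 2},
		{"4.0", false, []int{2}, 2},
	}
	for i, r := range timeline {
		fmt.Printf("%s writes v%d, allowed=%t\n", r.name, r.writes, defaultAllowed(timeline, i))
	}
}
```

If 3.3 tried to default to v2 in this model, the check would fail, because the latest LTS at that point (3.1) only reads v1.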
I think you should write down what the upgrade path looks like if this approach is not chosen, in the main section of this document.
It's written in the Two-Fold strategy section, but I can use the same versions here.
We propose to add a string flag `--storage.tsdb.stateful.write-wal-version`, defaulting to `v1`, that has a "stateful" consequence: once a new version is used, users will be able to revert only to certain Prometheus versions. The flag's help text will explain clearly what is possible and which Prometheus versions you will be able to revert to.

In other words, we propose to add a TSDB flag `--storage.tsdb.stateful.write-wal-version=<version>` that tells Prometheus to use a particular WAL format for both WAL and WBL. This kind of flag will change its default to a new version ONLY when (at least) one previous Prometheus version can read that version (while writing the old one). The initial version would be `v1`.
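A rough sketch of how such a flag value might be parsed and validated is shown below. It uses Go's standard `flag` package purely for illustration rather than Prometheus's real flag wiring; `parseWALVersion` and `maxSupportedWALVersion` are hypothetical names, and the supported range `v1..v2` is an assumption.

```go
package main

import (
	"flag"
	"fmt"
	"strconv"
	"strings"
)

// maxSupportedWALVersion is a hypothetical constant: the newest WAL format
// this binary knows how to write.
const maxSupportedWALVersion = 2

// parseWALVersion converts a flag value such as "v1" into its integer
// version and rejects anything outside the supported range.
func parseWALVersion(s string) (int, error) {
	if !strings.HasPrefix(s, "v") {
		return 0, fmt.Errorf("WAL version %q must start with \"v\"", s)
	}
	v, err := strconv.Atoi(strings.TrimPrefix(s, "v"))
	if err != nil || v < 1 || v > maxSupportedWALVersion {
		return 0, fmt.Errorf("unsupported WAL version %q (supported: v1..v%d)", s, maxSupportedWALVersion)
	}
	return v, nil
}

func main() {
	fs := flag.NewFlagSet("prometheus-sketch", flag.ContinueOnError)
	raw := fs.String(
		"storage.tsdb.stateful.write-wal-version",
		"v1",
		"WAL format version to write for both WAL and WBL. Stateful: limits which versions you can revert to.",
	)
	// Simulated command line; a real binary would pass os.Args[1:].
	if err := fs.Parse([]string{"--storage.tsdb.stateful.write-wal-version=v2"}); err != nil {
		panic(err)
	}
	v, err := parseWALVersion(*raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("writing WAL format version", v)
}
```

Rejecting unknown versions at startup, rather than falling back to the default, makes it harder to write a format by accident that older releases cannot read.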
I don't think one version (6 weeks) is enough to say "v2 is good enough, turn it on for everyone".
I think I would love to keep the gate open if the change will be trivial (like literally we have this case now with nhcb IMO, but we can argue) (:
Should we have some sort of distinction in the policy for when the flag's default value can be changed for cases where the WAL change only impacts an experimental feature (such as NHCB)?
![twofold.png](../assets/2024-11-25_changing_wal_format/twofold.png)

1. We release Prometheus X+1 version that supports both Y and Y+1 data but still writes Y.
2. We release Prometheus X+2 version that supports both Y and Y+1 data, but now it writes new data as Y.
2. We release Prometheus X+2 version that supports both Y and Y+1 data, but now it writes new data as Y.
For clarity: does this mean that the new record type (Y+1) will be re-named to Y?
* Gives a bit more stability to users and fewer surprises.

Cons:
* Extremely heavy process that will make us afraid/refuse to make improvements to WAL, because it's too much work. It might fails our goal of `Balancing development velocity with user data stability risks`
* Extremely heavy process that will make us afraid/refuse to make improvements to WAL, because it's too much work. It might fails our goal of `Balancing development velocity with user data stability risks`
* Extremely heavy process that will make us afraid/refuse to make improvements to WAL, because it's too much work. It might fail our goal of `Balancing development velocity with user data stability risks`
Fixes prometheus/prometheus#15200
Also join #prometheus-wal-dev on Slack for the sync discussion!