forked from thanos-io/thanos
Release v0.37 #109
Merged
+16,052
−4,763
* fix serverAsClient goroutines leak Signed-off-by: Thibault Mange <[email protected]> * fix lint Signed-off-by: Thibault Mange <[email protected]> * update changelog Signed-off-by: Thibault Mange <[email protected]> * delete invalid comment Signed-off-by: Thibault Mange <[email protected]> * remove temp dev test Signed-off-by: Thibault Mange <[email protected]> * remove timer channel drain Signed-off-by: Thibault Mange <[email protected]> --------- Signed-off-by: Thibault Mange <[email protected]>
If we account stats for remote write and local writes we will count them twice since the remote write will be counted locally again by the remote receiver instance. Signed-off-by: Michael Hoffmann <[email protected]>
We have seen deadlocks with endpoint discovery caused by the metric collector hanging and not releasing the store labels lock. This causes the endpoint update to hang, which also makes all endpoint readers hang on acquiring a read lock for the resolved endpoints slice. This commit makes sure the Collect method on the metrics collector has a built in timeout to guard against cases where an upstream call never reads from the collection channel. Signed-off-by: Filip Petkovski <[email protected]>
…ne (thanos-io#7382) * *: Ensure objstore flag values are masked & disable debug/pprof/cmdline Signed-off-by: Saswata Mukherjee <[email protected]> * small fix Signed-off-by: Saswata Mukherjee <[email protected]> --------- Signed-off-by: Saswata Mukherjee <[email protected]>
In LabelNames and LabelValues gRPC calls were not pruned properly. While results are not wrong, this leads to inefficient fan-out for setups with many endpoints. We took the opportunity to unify the store filtering and generally also the larger layout of the gRPC methods, including logging and tracing. Signed-off-by: Michael Hoffmann <[email protected]>
Signed-off-by: Pedro Tanaka <[email protected]>
Signed-off-by: Pedro Tanaka <[email protected]>
* Appending warn to changelog about breaking change Signed-off-by: Pedro Tanaka <[email protected]> * Including warning emoji Signed-off-by: Pedro Tanaka <[email protected]> --------- Signed-off-by: Pedro Tanaka <[email protected]>
…7392) If we have a new querier it will create query hints even without the pushdown feature being present anymore. Old sidecars will then trigger query pushdown which leads to broken max, min, max_over_time and min_over_time. Signed-off-by: Michael Hoffmann <[email protected]>
* *: Using native histograms for grpc middleware metrics Since we updated the middleware library, we can now use native histograms to keep track of latencies in grpc calls. This is a semi-breaking change if people enabled native histogram collection on their Prometheus monitoring Thanos instances. Signed-off-by: Pedro Tanaka <[email protected]> adding change log Signed-off-by: Pedro Tanaka <[email protected]> * removing empty space; Signed-off-by: Pedro Tanaka <[email protected]> * Put full disclaimer in changelog Signed-off-by: Pedro Tanaka <[email protected]> --------- Signed-off-by: Pedro Tanaka <[email protected]>
* compact: recover from panics (thanos-io#7318) For thanos-io#6775, it would be useful to know the exact block IDs to aid debugging. Signed-off-by: Giedrius Statkevičius <[email protected]> * Sidecar: wait for prometheus on startup (thanos-io#7323) Signed-off-by: Michael Hoffmann <[email protected]> * Receive: fix serverAsClient.Series goroutines leak (thanos-io#6948) * fix serverAsClient goroutines leak Signed-off-by: Thibault Mange <[email protected]> * fix lint Signed-off-by: Thibault Mange <[email protected]> * update changelog Signed-off-by: Thibault Mange <[email protected]> * delete invalid comment Signed-off-by: Thibault Mange <[email protected]> * remove temp dev test Signed-off-by: Thibault Mange <[email protected]> * remove timer channel drain Signed-off-by: Thibault Mange <[email protected]> --------- Signed-off-by: Thibault Mange <[email protected]> * Receive: fix stats (thanos-io#7373) If we account stats for remote write and local writes we will count them twice since the remote write will be counted locally again by the remote receiver instance. Signed-off-by: Michael Hoffmann <[email protected]> * *: Ensure objstore flag values are masked & disable debug/pprof/cmdline (thanos-io#7382) * *: Ensure objstore flag values are masked & disable debug/pprof/cmdline Signed-off-by: Saswata Mukherjee <[email protected]> * small fix Signed-off-by: Saswata Mukherjee <[email protected]> --------- Signed-off-by: Saswata Mukherjee <[email protected]> * Query: dont pass query hints to avoid triggering pushdown (thanos-io#7392) If we have a new querier it will create query hints even without the pushdown feature being present anymore. Old sidecars will then trigger query pushdown which leads to broken max,min,max_over_time and min_over_time. 
Signed-off-by: Michael Hoffmann <[email protected]> * Cut patch release v0.35.1 Signed-off-by: Saswata Mukherjee <[email protected]> --------- Signed-off-by: Giedrius Statkevičius <[email protected]> Signed-off-by: Michael Hoffmann <[email protected]> Signed-off-by: Thibault Mange <[email protected]> Signed-off-by: Saswata Mukherjee <[email protected]> Co-authored-by: Giedrius Statkevičius <[email protected]> Co-authored-by: Michael Hoffmann <[email protected]> Co-authored-by: Thibault Mange <[email protected]>
Previously we deferred starting the gRPC server by blocking the whole startup until we could ping Prometheus. This breaks use cases that rely on the config reloader to start Prometheus. We fix it by using a channel to defer starting the gRPC server and loading external labels in an actor concurrently. Signed-off-by: Michael Hoffmann <[email protected]>
* Update Prometheus Signed-off-by: alanprot <[email protected]> * Updating prometheus to 4e664035e84e Signed-off-by: alanprot <[email protected]> * Temporarily pinning prometheus common Signed-off-by: alanprot <[email protected]> * fixing lint Signed-off-by: alanprot <[email protected]> * Using jsoniter to encode promql responses Signed-off-by: alanprot <[email protected]> * Removing e2e test case with invalid hyphen on a matcher -> prometheus now supports this use case Signed-off-by: alanprot <[email protected]> * Updating prometheus to v0.52.2-0.20240606174736-edd558884b24 Signed-off-by: alanprot <[email protected]> * pinning grpc to v1.63.2 Signed-off-by: alanprot <[email protected]> --------- Signed-off-by: alanprot <[email protected]> Co-authored-by: EC2 Default User <[email protected]>
Signed-off-by: Michael Hoffmann <[email protected]>
Allow suppressing environment variables expansion errors when unset, and thus keep the reloader from crashing. Instead leave them as is. Signed-off-by: Pranshu Srivastava <[email protected]>
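A lenient expansion of this kind might look like the sketch below. The `$(NAME)` placeholder syntax, the `expandLenient` name and the injected `lookup` function are all assumptions made for illustration. The actual reloader option and its flag are not shown in this commit message.

```go
package main

import "regexp"

// placeholder matches $(NAME). The syntax is an assumption of this sketch.
var placeholder = regexp.MustCompile(`\$\(([a-zA-Z_0-9]+)\)`)

// expandLenient replaces every $(NAME) with the value returned by lookup.
// Placeholders whose variable is unset are left exactly as written instead
// of producing an error.
func expandLenient(config string, lookup func(name string) (string, bool)) string {
	return placeholder.ReplaceAllStringFunc(config, func(match string) string {
		name := match[2 : len(match)-1] // strip the leading "$(" and trailing ")"
		if val, ok := lookup(name); ok {
			return val
		}
		return match
	})
}
```

Passing `os.LookupEnv` as `lookup` would read from the process environment. Injecting the function keeps the sketch easy to test.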
* Update adopters.yml Signed-off-by: Rishabh Soni <[email protected]> * Add files via upload Signed-off-by: Rishabh Soni <[email protected]> --------- Signed-off-by: Rishabh Soni <[email protected]>
Signed-off-by: Vasiliy Rumyantsev <[email protected]>
Signed-off-by: Pedro Tanaka <[email protected]>
Recently ran into an issue with Istio in particular, where leaving the trailing dot on the SRV record returned by `dnssrvnoa` lookups led to an inability to connect to the endpoint. Removing the trailing dot fixes this behaviour. Now, technically, this is a valid URL, and it shouldn't be a problem. One could definitely argue that Istio should be responsible here for ensuring that the traffic is delivered. The problem seems rooted in how Istio attempts to do wildcard matching of URLs it receives - including the dot leads it to look up an empty DNS field, which is invalid. The approach I take here is actually copied from how Prometheus does it. Therefore I hope we can sneak this through with the argument that 'this is how Prometheus does it', regardless of whether or not this is philosophically correct... Signed-off-by: verejoel <[email protected]>
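For illustration, stripping the trailing dot from an SRV target before building the dial address could look like this. `srvAddress` is a hypothetical helper, not the function changed in this commit, and the host name in the usage note is made up.

```go
package main

import (
	"net"
	"strconv"
	"strings"
)

// srvAddress turns an SRV target and port into a host:port string.
// SRV targets are fully qualified and usually end with a dot, which is
// removed here before joining host and port.
func srvAddress(target string, port uint16) string {
	host := strings.TrimRight(target, ".")
	return net.JoinHostPort(host, strconv.Itoa(int(port)))
}
```

For example, `srvAddress("store.thanos.svc.cluster.local.", 10901)` yields `store.thanos.svc.cluster.local:10901`, and a target without a trailing dot passes through unchanged.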
Bumps [go.opentelemetry.io/contrib/propagators/autoprop](https://github.com/open-telemetry/opentelemetry-go-contrib) from 0.38.0 to 0.53.0. - [Release notes](https://github.com/open-telemetry/opentelemetry-go-contrib/releases) - [Changelog](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/CHANGELOG.md) - [Commits](open-telemetry/opentelemetry-go-contrib@zpages/v0.38.0...zpages/v0.53.0) --- updated-dependencies: - dependency-name: go.opentelemetry.io/contrib/propagators/autoprop dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [go.opentelemetry.io/contrib/samplers/jaegerremote](https://github.com/open-telemetry/opentelemetry-go-contrib) from 0.7.0 to 0.22.0. - [Release notes](https://github.com/open-telemetry/opentelemetry-go-contrib/releases) - [Changelog](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/CHANGELOG.md) - [Commits](open-telemetry/opentelemetry-go-contrib@v0.7.0...v0.22.0) --- updated-dependencies: - dependency-name: go.opentelemetry.io/contrib/samplers/jaegerremote dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…hanos-io#7492) * compact: Update filtered blocks list before second downsample pass If the second downsampling pass is given the same filteredMetas list as the first pass, it will create duplicates of blocks created in the first pass. It will also not be able to do further downsampling e.g. 5m->1h using blocks created in the first pass, as it will not be aware of them. The metadata was already being synced before the second pass, but not updated into the filteredMetas list. Signed-off-by: Thomas Hartland <[email protected]> * Update changelog Signed-off-by: Thomas Hartland <[email protected]> * e2e/compact: Fix number of blocks cleaned assertion The value was increased in 2ed48f7 to fix the test, with the reasoning that the hardcoded value must have been taken from a run of the CI that didn't reach the max value due to CI worker lag. More likely the real reason is that commit 68bef3f the day before had caused blocks to be duplicated during downsampling. The duplicate block is immediately marked for deletion, causing an extra +1 in the number of blocks cleaned. Subtracting one from the value again now that the block duplication issue is fixed. Signed-off-by: Thomas Hartland <[email protected]> * e2e/compact: Revert change to downsample count assertion Combined with the previous commit this effectively reverts all of 2ed48f7, in which two assertions were changed to (unknowingly) account for a bug which had just been introduced in the downsampling code, causing duplicate blocks. This assertion change I am less sure on the reasoning for, but after running through the e2e tests several times locally, it is consistent that the only downsampling happens in the "compact-working" step, and so all other steps would report 0 for their total downsamples metric. Signed-off-by: Thomas Hartland <[email protected]> --------- Signed-off-by: Thomas Hartland <[email protected]>
Signed-off-by: 🌲 Harry 🌊 John 🏔 <[email protected]>
…s.go (thanos-io#7552) Signed-off-by: Nishant Bansal <[email protected]>
Signed-off-by: 🌲 Harry 🌊 John 🏔 <[email protected]>
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.24.0 to 0.25.0. - [Commits](golang/crypto@v0.24.0...v0.25.0) --- updated-dependencies: - dependency-name: golang.org/x/crypto dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
thanos-io#7528) Bumps [go.opentelemetry.io/otel/bridge/opentracing](https://github.com/open-telemetry/opentelemetry-go) from 1.21.0 to 1.28.0. - [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases) - [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md) - [Commits](open-telemetry/opentelemetry-go@v1.21.0...v1.28.0) --- updated-dependencies: - dependency-name: go.opentelemetry.io/otel/bridge/opentracing dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This commit adds the option of filtering rules by rule name, rule group, or file. This brings the rule API closer in line with the current Prometheus API. Signed-off-by: Jacob Baungard Hansen <[email protected]>
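The filtering semantics can be sketched as below. Prometheus exposes these filters as the `rule_name[]`, `rule_group[]` and `file[]` query parameters. Whether Thanos uses identical names is an assumption here. The `Rule` and `RuleGroup` types and the `filterGroups` function are simplified, hypothetical stand-ins for the real API types.

```go
package main

// Rule and RuleGroup are simplified stand-ins for the real rule API types.
type Rule struct{ Name string }

type RuleGroup struct {
	Name  string
	File  string
	Rules []Rule
}

// matches reports whether v passes a filter. An empty filter accepts everything.
func matches(filter []string, v string) bool {
	if len(filter) == 0 {
		return true
	}
	for _, f := range filter {
		if f == v {
			return true
		}
	}
	return false
}

// filterGroups keeps groups whose name and file pass their filters, and within
// those groups only the rules whose name passes ruleNames. Groups left without
// any rules are dropped, which is a choice made for this sketch.
func filterGroups(groups []RuleGroup, ruleNames, groupNames, files []string) []RuleGroup {
	var out []RuleGroup
	for _, g := range groups {
		if !matches(groupNames, g.Name) || !matches(files, g.File) {
			continue
		}
		kept := RuleGroup{Name: g.Name, File: g.File}
		for _, r := range g.Rules {
			if matches(ruleNames, r.Name) {
				kept.Rules = append(kept.Rules, r)
			}
		}
		if len(kept.Rules) > 0 {
			out = append(out, kept)
		}
	}
	return out
}
```

Treating an empty filter as "match all" means a request without any filter parameters behaves as it did before filtering existed.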
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.26.0 to 0.27.0. - [Commits](golang/net@v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: golang.org/x/net dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…hanos-io#7525) Bumps [go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc](https://github.com/open-telemetry/opentelemetry-go) from 1.27.0 to 1.28.0. - [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases) - [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md) - [Commits](open-telemetry/opentelemetry-go@v1.27.0...v1.28.0) --- updated-dependencies: - dependency-name: go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Yi Jin <[email protected]>
Signed-off-by: Yuchen Wang <[email protected]>
* support hedged requests in store Signed-off-by: milinddethe15 <[email protected]> * hedged roundtripper with tdigest for dynamic delay Signed-off-by: milinddethe15 <[email protected]> * refactor struct and fix lint Signed-off-by: milinddethe15 <[email protected]> * Improve hedging implementation Signed-off-by: milinddethe15 <[email protected]> * Improved hedging implementation Signed-off-by: milinddethe15 <[email protected]> * Update store doc Signed-off-by: milinddethe15 <[email protected]> * fix white space Signed-off-by: milinddethe15 <[email protected]> * add enabled field Signed-off-by: milinddethe15 <[email protected]> --------- Signed-off-by: milinddethe15 <[email protected]>
I always get this in logs: ``` err: receive capnp conn: close tcp ...: use of closed network connection ``` This is also visible in the e2e test. After Done() returns, the connection is closed either way so no need to close it again. Signed-off-by: Giedrius Statkevičius <[email protected]>
* Fix a storage GW bug that loses TSDB infos when joining them * E2E test demonstrating a bug in the MinT calculation in distributed Engine Signed-off-by: Michael Hoffmann <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]>
…o#7915) * always close block series client at the end Signed-off-by: Ben Ye <[email protected]> * add back close for loser tree Signed-off-by: Ben Ye <[email protected]> --------- Signed-off-by: Ben Ye <[email protected]>
* Update objstore and promql-engine to latest Signed-off-by: Saswata Mukherjee <[email protected]> * Fixes after upgrade Signed-off-by: Saswata Mukherjee <[email protected]> --------- Signed-off-by: Saswata Mukherjee <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]>
Signed-off-by: Yi Jin <[email protected]>
Signed-off-by: Yi Jin <[email protected]>
Signed-off-by: Yi Jin <[email protected]>
Signed-off-by: Yi Jin <[email protected]>
Signed-off-by: Yi Jin <[email protected]>
Merge the db_main branch into the release branch, which has been running for a few weeks. A few highlights to call out:
Changes
Verification