V0.37 dev nov 1 - git merge no fast forward #95

Merged
merged 475 commits into from
Nov 8, 2024

Conversation

jnyi
Collaborator

@jnyi jnyi commented Nov 5, 2024

Tested in dev-azure-westus; dashboards and alerts match.

[Screenshot: 2024-11-07, 11:56:13 PM]
  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Verification

coleenquadros and others added 30 commits May 27, 2024 12:22
Signed-off-by: Coleen Iona Quadros <[email protected]>
…7392)

If we have a new querier it will create query hints even without the
pushdown feature being present anymore. Old sidecars will then trigger
query pushdown, which leads to broken max, min, max_over_time, and
min_over_time.

Signed-off-by: Michael Hoffmann <[email protected]>
* *: Using native histograms for grpc middleware metrics

Since we updated the middleware library, we can now use native histograms to keep track of latencies in gRPC calls.
This is a semi-breaking change for anyone who enabled native histogram collection on the Prometheus instances that monitor Thanos.

Signed-off-by: Pedro Tanaka <[email protected]>

adding change log

Signed-off-by: Pedro Tanaka <[email protected]>

* removing empty space;

Signed-off-by: Pedro Tanaka <[email protected]>

* Put full disclaimer in changelog

Signed-off-by: Pedro Tanaka <[email protected]>

---------

Signed-off-by: Pedro Tanaka <[email protected]>
* compact: recover from panics (thanos-io#7318)

For thanos-io#6775, it would be useful
to know the exact block IDs to aid debugging.

Signed-off-by: Giedrius Statkevičius <[email protected]>

* Sidecar: wait for prometheus on startup (thanos-io#7323)

Signed-off-by: Michael Hoffmann <[email protected]>

* Receive: fix serverAsClient.Series goroutines leak (thanos-io#6948)

* fix serverAsClient goroutines leak

Signed-off-by: Thibault Mange <[email protected]>

* fix lint

Signed-off-by: Thibault Mange <[email protected]>

* update changelog

Signed-off-by: Thibault Mange <[email protected]>

* delete invalid comment

Signed-off-by: Thibault Mange <[email protected]>

* remove temp dev test

Signed-off-by: Thibault Mange <[email protected]>

* remove timer channel drain

Signed-off-by: Thibault Mange <[email protected]>

---------

Signed-off-by: Thibault Mange <[email protected]>

* Receive: fix stats (thanos-io#7373)

If we account stats for remote write and local writes we will count them
twice since the remote write will be counted locally again by the remote
receiver instance.

Signed-off-by: Michael Hoffmann <[email protected]>

* *: Ensure objstore flag values are masked & disable debug/pprof/cmdline (thanos-io#7382)

* *: Ensure objstore flag values are masked & disable debug/pprof/cmdline

Signed-off-by: Saswata Mukherjee <[email protected]>

* small fix

Signed-off-by: Saswata Mukherjee <[email protected]>

---------

Signed-off-by: Saswata Mukherjee <[email protected]>

* Query: dont pass query hints to avoid triggering pushdown (thanos-io#7392)

If we have a new querier it will create query hints even without the
pushdown feature being present anymore. Old sidecars will then trigger
query pushdown, which leads to broken max, min, max_over_time, and
min_over_time.

Signed-off-by: Michael Hoffmann <[email protected]>

* Cut patch release v0.35.1

Signed-off-by: Saswata Mukherjee <[email protected]>

---------

Signed-off-by: Giedrius Statkevičius <[email protected]>
Signed-off-by: Michael Hoffmann <[email protected]>
Signed-off-by: Thibault Mange <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]>
Co-authored-by: Giedrius Statkevičius <[email protected]>
Co-authored-by: Michael Hoffmann <[email protected]>
Co-authored-by: Thibault Mange <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]>
If we are constantly running the compactor in a loop, then we shouldn't pay
the price of constantly holding the lock in the garbage collection
function. In practice, holding the lock means that we have to wait
two or sometimes even three times as long as it takes to sync metas.
That doesn't make sense, since we are running the compactor in a loop and
the compacted blocks are properly taken care of.

Signed-off-by: Giedrius Statkevičius <[email protected]>
…5.1-to-main

Merge release 0.35.1 to main
…er-header

Query-frontend: Set value of remote_user field in Slow Query Logs from HTTP header
This commit splits the single promql_query_exec span into two
separate spans, covering query creation and execution.

Signed-off-by: Filip Petkovski <[email protected]>
Co-authored-by: Jeroen van de Lockand <[email protected]>
* receive: remove serverAsClient usage

Remove serverAsClient usage to reduce CPU usage.

Signed-off-by: Giedrius Statkevičius <[email protected]>

* receive: remove unused param

Signed-off-by: Giedrius Statkevičius <[email protected]>

* receive: make local client lazy

Signed-off-by: Giedrius Statkevičius <[email protected]>

---------

Signed-off-by: Giedrius Statkevičius <[email protected]>
Split promql span into query create and exec spans
Previously we deferred starting the gRPC server by blocking the whole
startup until we could ping Prometheus. This breaks use cases that rely
on the config reloader to start Prometheus.
We fix it by using a channel to defer starting the gRPC server
and loading external labels in an actor concurrently.

Signed-off-by: Michael Hoffmann <[email protected]>
Dependency: Update minio-go to v7.0.70 which includes support for EKS Pod Identity.

Signed-off-by: farhad <[email protected]>
* Update Prometheus

Signed-off-by: alanprot <[email protected]>

* Updating prometheus to 4e664035e84e

Signed-off-by: alanprot <[email protected]>

* Temporarily pinning prometheus common

Signed-off-by: alanprot <[email protected]>

* fixing lint

Signed-off-by: alanprot <[email protected]>

* Using jsoniter to encode promql responses

Signed-off-by: alanprot <[email protected]>

* Removing e2e test case with invalid hyphen in a matcher -> Prometheus now supports this use case

Signed-off-by: alanprot <[email protected]>

* Updating prometheus to v0.52.2-0.20240606174736-edd558884b24

Signed-off-by: alanprot <[email protected]>

* pinning grpc to v1.63.2

Signed-off-by: alanprot <[email protected]>

---------

Signed-off-by: alanprot <[email protected]>
Co-authored-by: EC2 Default User <[email protected]>
Changelog - update the changelog entry position
The distributed engine retrieves label sets once per query, and
doing the expensive copying and conversion uses a lot of memory.

We already set them in the format we need in the endpoint status,
so we can retrieve them from there.

Signed-off-by: Filip Petkovski <[email protected]>
Fetches the right version of prometheus from
the releases api rather than the tags api

Signed-off-by: Aritra24 <[email protected]>
* Remove unused/broken `vendor` key.
* Increase Go PR limit from 5 to 20.
* Fixup yaml consistency.

Signed-off-by: SuperQ <[email protected]>
Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v3...v4)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2 to 3.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](github/codeql-action@v2...v3)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [peter-evans/create-pull-request](https://github.com/peter-evans/create-pull-request) from 3 to 6.
- [Release notes](https://github.com/peter-evans/create-pull-request/releases)
- [Commits](peter-evans/create-pull-request@v3...v6)

---
updated-dependencies:
- dependency-name: peter-evans/create-pull-request
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [github.com/felixge/fgprof](https://github.com/felixge/fgprof) from 0.9.2 to 0.9.4.
- [Release notes](https://github.com/felixge/fgprof/releases)
- [Commits](felixge/fgprof@v0.9.2...v0.9.4)

---
updated-dependencies:
- dependency-name: github.com/felixge/fgprof
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
)

Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.17.8 to 1.17.9.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Changelog](https://github.com/klauspost/compress/blob/master/.goreleaser.yml)
- [Commits](klauspost/compress@v1.17.8...v1.17.9)

---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [github.com/onsi/gomega](https://github.com/onsi/gomega) from 1.29.0 to 1.33.1.
- [Release notes](https://github.com/onsi/gomega/releases)
- [Changelog](https://github.com/onsi/gomega/blob/master/CHANGELOG.md)
- [Commits](onsi/gomega@v1.29.0...v1.33.1)

---
updated-dependencies:
- dependency-name: github.com/onsi/gomega
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
pedro-stanaka and others added 13 commits October 31, 2024 17:56
Signed-off-by: Pedro Tanaka <[email protected]>
Signed-off-by: Pedro Tanaka <[email protected]>
Signed-off-by: Pedro Tanaka <[email protected]>
Signed-off-by: Pedro Tanaka <[email protected]>
Signed-off-by: Pedro Tanaka <[email protected]>
…tats-collection

QFE: new middleware to force query statistics collection
Applies the fix described in thanos-io#7883.

Signed-off-by: Filip Petkovski <[email protected]>
Signed-off-by: Filip Petkovski <[email protected]>
Signed-off-by: Filip Petkovski <[email protected]>
Signed-off-by: Yi Jin <[email protected]>
@jnyi jnyi force-pushed the v0.37-dev-nov-1 branch 2 times, most recently from 64bbfbe to 959689f Compare November 5, 2024 23:53
GiedriusS and others added 3 commits November 6, 2024 09:46
Properly preserve results from other resolve calls. There is an
assumption that resolve() is always called with the same addresses, but
that is not true with gRPC and `--endpoint-group`. Without this fix,
multiple resolves could happen at the same time, but some of the callers
would not be able to retrieve the results, leading to random errors.

Signed-off-by: Giedrius Statkevičius <[email protected]>
Read "minT" from prometheus metrics so that we also set it for sidecars
that are not uploading blocks.

Signed-off-by: Michael Hoffmann <[email protected]>
Add a wait_interval*3 timeout to SyncMetas(). We had an incident in
production where object storage had some problems and the syncer got
stuck because there was no timeout. The timeout value is arbitrary; it exists
only so that the syncer cannot stay stuck forever.

Signed-off-by: Giedrius Statkevičius <[email protected]>
fpetkovski and others added 4 commits November 7, 2024 11:47
@jnyi jnyi merged commit ebc6bc6 into databricks:db_main Nov 8, 2024
14 checks passed