perf(bloom): Compute chunkrefs for series right before sending task to builder #14808

Merged: 6 commits, Nov 7, 2024

@salvacorts salvacorts commented Nov 7, 2024

What this PR does / why we need it:
We've seen high memory usage in the planners that caused OOMs. We pulled a profile and saw most of the in-use memory was allocated by:

[image: in-use memory profile]

This PR improves this by not storing the chunkrefs in the tasks we store in the queue. Instead, we populate the chunkrefs just before sending the task to the builder.

Special notes for your reviewer:
We considered adding a memory limit to the queue and lazy loading the tasks, but by doing so we'd lose the metrics on the total number of tasks computed for a tenant, thus losing the ability to track the build progress (% of completed tasks out of the total for a tenant).

We now have the following task definitions:

  • protos.ProtoTask: The one we send to the builder through the wire. This one contains the chunkrefs.
  • protos.Task: Same as protos.ProtoTask but with Loki friendly types. Has a ToProtoTask() that converts it to protos.ProtoTask.
  • New: strategies.Task: inherits from protos.Task. Overrides the Gaps fields to have series w/o chunks (only FPs). ToProtoTask() takes forSeries as argument and populates the chunkrefs.
  • planner.QueueTask: Inherits from strategies.Task and adds queue tracking info plus forSeries, which is used later to populate the chunks when transforming the queue task into a protos.ProtoTask. ToProtoTask() calls the ToProtoTask() from strategies.Task, passing the forSeries.
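
The layering above can be illustrated with a minimal, self-contained Go sketch. This is not the code from this PR: the types are simplified stand-ins, `ForSeries` is modeled here as a plain lookup function (the real interface is different), and all field names are hypothetical. It only shows the idea that the queued task keeps fingerprints and resolves chunkrefs when it is converted for the wire.

```go
package main

import "fmt"

// Simplified stand-ins for the Loki types (assumptions, not the real definitions).
type Fingerprint uint64
type ChunkRef string

// ForSeries is modeled as a lookup from a series fingerprint to its chunkrefs.
// The real interface iterates a TSDB; this is a hypothetical simplification.
type ForSeries func(fp Fingerprint) []ChunkRef

// ProtoSeries and ProtoTask mimic the wire form, which carries the chunkrefs.
type ProtoSeries struct {
	Fingerprint Fingerprint
	Chunks      []ChunkRef
}

type ProtoTask struct {
	ID     string
	Series []ProtoSeries
}

// StrategyTask keeps only fingerprints, so it stays small while queued.
type StrategyTask struct {
	ID     string
	Series []Fingerprint
}

// ToProtoTask resolves the chunkrefs at conversion time.
func (t *StrategyTask) ToProtoTask(forSeries ForSeries) *ProtoTask {
	out := &ProtoTask{ID: t.ID, Series: make([]ProtoSeries, 0, len(t.Series))}
	for _, fp := range t.Series {
		out.Series = append(out.Series, ProtoSeries{Fingerprint: fp, Chunks: forSeries(fp)})
	}
	return out
}

// QueueTask embeds StrategyTask and remembers how to look up chunks later.
type QueueTask struct {
	*StrategyTask
	forSeries ForSeries
}

// ToProtoTask shadows the embedded method and passes the stored lookup.
func (q *QueueTask) ToProtoTask() *ProtoTask {
	return q.StrategyTask.ToProtoTask(q.forSeries)
}

func main() {
	index := map[Fingerprint][]ChunkRef{1: {"c1", "c2"}, 2: {"c3"}}
	lookup := func(fp Fingerprint) []ChunkRef { return index[fp] }

	qt := &QueueTask{
		StrategyTask: &StrategyTask{ID: "task-0", Series: []Fingerprint{1, 2}},
		forSeries:    lookup,
	}

	// Chunkrefs are only materialized here, right before "sending".
	pt := qt.ToProtoTask()
	fmt.Println(pt.ID, len(pt.Series), len(pt.Series[0].Chunks))
}
```

In this sketch the memory held per queued task is proportional to the number of fingerprints, and the chunkref slices only exist for the lifetime of the converted task.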

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
    • Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

@pull-request-size pull-request-size bot added size/L and removed size/XL labels Nov 7, 2024
@salvacorts salvacorts force-pushed the salvacorts/populate-chunks-before-sending-task branch from 1174064 to 9d6870f Compare November 7, 2024 10:40
@salvacorts salvacorts marked this pull request as ready for review November 7, 2024 10:48
@salvacorts salvacorts requested a review from a team as a code owner November 7, 2024 10:48
@salvacorts salvacorts changed the title refactor(bloom): Compute chunkrefs for series right before sending task to builder perf(bloom): Compute chunkrefs for series right before sending task to builder Nov 7, 2024
*strategies.Task

// We use forSeries in ToProtoTask to get the chunks for the series in the gaps.
forSeries common.ClosableForSeries
Contributor

What's the difference between common.ClosableForSeries and sharding.ForSeries (which is used in strategies.Task)?

Contributor Author

common.ClosableForSeries is

type ClosableForSeries interface {
	sharding.ForSeries
	Close() error
}

Added ForSeries alias to the common pkg and used it instead of common.ClosableForSeries and sharding.ForSeries

type ForSeries = sharding.ForSeries
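
As a rough illustration of how the alias and the embedding interface relate, here is a self-contained sketch. The `shardingForSeries` interface below is a simplified, hypothetical stand-in for sharding.ForSeries (the real method signature is different), and `memIndex` and `countSeries` are made-up helpers. The point is only that a value satisfying the closable interface can be passed wherever the plain `ForSeries` alias is expected.

```go
package main

import "fmt"

// shardingForSeries stands in for sharding.ForSeries.
// Assumption: the real interface has a richer signature.
type shardingForSeries interface {
	ForSeries(fn func(fp uint64)) error
}

// ForSeries is an alias, so both names denote the identical type.
type ForSeries = shardingForSeries

// ClosableForSeries adds Close() on top of the embedded interface.
type ClosableForSeries interface {
	ForSeries
	Close() error
}

// memIndex is a hypothetical in-memory implementation used for the demo.
type memIndex struct {
	fps    []uint64
	closed bool
}

func (m *memIndex) ForSeries(fn func(fp uint64)) error {
	for _, fp := range m.fps {
		fn(fp)
	}
	return nil
}

func (m *memIndex) Close() error {
	m.closed = true
	return nil
}

// countSeries only needs iteration, so it accepts the narrower interface.
func countSeries(fs ForSeries) (int, error) {
	n := 0
	err := fs.ForSeries(func(uint64) { n++ })
	return n, err
}

func main() {
	var c ClosableForSeries = &memIndex{fps: []uint64{1, 2, 3}}
	defer c.Close()

	// A ClosableForSeries is usable wherever a ForSeries is expected.
	n, err := countSeries(c)
	fmt.Println(n, err)
}
```

Accepting the narrower interface in functions that never call Close() keeps ownership of the underlying resource with the caller.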

@salvacorts salvacorts merged commit 66e6b1c into main Nov 7, 2024
59 checks passed
@salvacorts salvacorts deleted the salvacorts/populate-chunks-before-sending-task branch November 7, 2024 13:02
Member

@rfratto rfratto left a comment

LGTM, just have a question about the improvement here

@@ -87,11 +89,23 @@ func GenBlock(ref bloomshipper.BlockRef) (bloomshipper.Block, error) {
 	}, nil
 }

-func GenSeries(bounds v1.FingerprintBounds) []*v1.Series {
+func GenSeries(bounds v1.FingerprintBounds) []model.Fingerprint {
Member

nit: Should this be GenFingerprint/GenFingerprintWithStep?


// ToProtoTask converts a Task to a ProtoTask.
// It will use the opened TSDB to get the chunks for the series in the gaps.
func (t *Task) ToProtoTask(ctx context.Context, forSeries common.ForSeries) (*protos.ProtoTask, error) {
Member

Silly question: doesn't this move the memory overhead from NewTSDBSeriesIter to here? Why does this use less memory?

Comment on lines +521 to +523
if _, ok := alreadyOpen[idx]; ok {
	continue
}
Member

Nice catch :)

3 participants