-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use bytetstring instead of string for performance and switch proto libraries. #7
Conversation
} | ||
wr.Timeseries = wr.Timeseries[:len(series)] | ||
func createWriteRequest(series []*types.TimeSeriesBinary, externalLabels map[string]string, data *proto.Buffer) ([]byte, error) { | ||
wr := &prompb.WriteRequest{Timeseries: make([]prompb.TimeSeries, len(series))} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving this to inside the func had no impact on CPU but saved memory.
@@ -48,6 +48,72 @@ func TestLabels(t *testing.T) { | |||
} | |||
} | |||
|
|||
func BenchmarkFill(b *testing.B) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How these compare before/after?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Roughly 20% better if memory serves me and my notes are accurate.
// Assume roughly each series has 10 labels, we do this because at very large mappings growing the map took up to 5% of cpu time. | ||
// By pre allocating it that disappeared. | ||
strMapToIndex := make(map[string]uint32, len(s.series)*10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Kinda surprised it had so big impact. Maybe we can track the average size of this and autotune it at runtime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also surprised. This trades memory for cpu, ie this map is bigger than we actually need in most cases since we store only the unique strings.
The gogo library though deprecated has better performance than googles and is what prometheus uses. The bytestring is safe as long as the underlying buffer is not reused, which it is not.
PS: the branch name is because originally I tried interning the strings with the new unique package but it ended up not saving memory and increased cpu significantly. If/when we make the whole prometheus pipeline interned then this will need to change.