
feat: compress kept trace decision message #1430

Open
wants to merge 2 commits into base: main

Conversation

VinozzZ
Contributor

@VinozzZ VinozzZ commented Nov 14, 2024

Which problem is this PR solving?

Because of the multiplication effect of pub/sub messages fanning out through Redis, we would like to reduce the amount of network traffic sent by Redis.

One of the biggest contributors is the TraceDecision message used to communicate kept trace decisions.

This PR uses gob to encode the kept-decision data and snappy to compress it.
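
For illustration, here is a minimal sketch of that round trip, assuming a simplified TraceDecision struct and the github.com/golang/snappy package; the exact field set and wiring in this PR differ, so treat this as an outline rather than the actual implementation:

package collect

import (
	"bytes"
	"encoding/gob"

	"github.com/golang/snappy"
)

// TraceDecision here is a placeholder with only a few fields.
type TraceDecision struct {
	TraceID    string
	Kept       bool
	SampleRate uint
}

// compress gob-encodes the batch and streams the result through a snappy writer.
func compress(tds []TraceDecision) ([]byte, error) {
	var buf bytes.Buffer
	w := snappy.NewBufferedWriter(&buf)
	if err := gob.NewEncoder(w).Encode(tds); err != nil {
		return nil, err
	}
	// Close flushes any frames still buffered in the snappy writer.
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompress reverses the process: snappy-decompress, then gob-decode.
func decompress(data []byte) ([]TraceDecision, error) {
	var tds []TraceDecision
	r := snappy.NewReader(bytes.NewReader(data))
	if err := gob.NewDecoder(r).Decode(&tds); err != nil {
		return nil, err
	}
	return tds, nil
}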

Here are the benchmark results compared to the original JSON encoding.
In the BenchmarkCompressionSizes test, I generated a batch of 1000 trace decisions as the input. The data size decreases by over 90% (from roughly 160 KB per batch with JSON to roughly 12 KB with gob + snappy).

goos: darwin
goarch: arm64
pkg: github.com/honeycombio/refinery/collect
cpu: Apple M2 Max
BenchmarkDynamicJSONEncoding-12          	   3139	   353164 ns/op	 443032 B/op	      3 allocs/op
BenchmarkDynamicCompressedEncoding-12    	   3050	   383788 ns/op	 475333 B/op	     62 allocs/op
BenchmarkDynamicJSONDecoding-12          	    561	  2114997 ns/op	 590167 B/op	   5018 allocs/op
BenchmarkDynamicCompressedDecoding-12    	   3285	   360892 ns/op	 445667 B/op	   5244 allocs/op
BenchmarkCompressionSizes/JSON_Encoding-12         	   5000	   238046 ns/op	 164091 B/op	      2 allocs/op
--- BENCH: BenchmarkCompressionSizes/JSON_Encoding-12
    trace_decision_test.go:305: JSON Encoding: Total Batch: 1, Total Size: 160671 bytes, Average Size: 160671.00 bytes
    trace_decision_test.go:305: JSON Encoding: Total Batch: 100, Total Size: 16227771 bytes, Average Size: 162277.71 bytes
    trace_decision_test.go:305: JSON Encoding: Total Batch: 5000, Total Size: 819582771 bytes, Average Size: 163916.55 bytes
BenchmarkCompressionSizes/Snappy_Compression-12    	   5052	   236998 ns/op	 299547 B/op	     59 allocs/op
--- BENCH: BenchmarkCompressionSizes/Snappy_Compression-12
    trace_decision_test.go:319: Snappy Compression: Total Batch: 1, Total Size: 12115 bytes, Average Size: 12115.00 bytes
    trace_decision_test.go:319: Snappy Compression: Total Batch: 100, Total Size: 1223615 bytes, Average Size: 12236.15 bytes
    trace_decision_test.go:319: Snappy Compression: Total Batch: 5052, Total Size: 62428595 bytes, Average Size: 12357.20 bytes

I also used sync.Pool for both the compression and decompression buffers, as well as for the snappy.Writer, to reduce allocations.
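
As a rough sketch of that pooling pattern (pool names and structure are illustrative placeholders, not the PR's actual code):

package collect

import (
	"bytes"
	"sync"

	"github.com/golang/snappy"
)

var bufferPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

var snappyWriterPool = sync.Pool{
	New: func() any { return snappy.NewBufferedWriter(nil) },
}

// compressPooled reuses a bytes.Buffer and a snappy.Writer across calls
// instead of allocating fresh ones for every published decision batch.
func compressPooled(payload []byte) ([]byte, error) {
	buf := bufferPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufferPool.Put(buf)

	w := snappyWriterPool.Get().(*snappy.Writer)
	w.Reset(buf) // point the pooled writer at the fresh buffer
	defer snappyWriterPool.Put(w)

	if _, err := w.Write(payload); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	// Copy the result out before the pooled buffer is handed to the next caller.
	return bytes.Clone(buf.Bytes()), nil
}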

@VinozzZ VinozzZ requested a review from a team as a code owner November 14, 2024 23:13
@VinozzZ VinozzZ self-assigned this Nov 14, 2024
@VinozzZ VinozzZ added the type: enhancement New feature or request label Nov 14, 2024
@VinozzZ VinozzZ added this to the v2.9 milestone Nov 14, 2024
Contributor

@kentquirk kentquirk left a comment


This looks good! I had a couple of comments, but they're optional.

Comment on lines +168 to +170
out := make([]byte, buf.Len())
copy(out, buf.Bytes())
return out, nil
Contributor

I think this is the most efficient way to do that:

Suggested change
out := make([]byte, buf.Len())
copy(out, buf.Bytes())
return out, nil
return bytes.Clone(buf.Bytes()), nil
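
(For reference, bytes.Clone allocates a new slice and copies the contents in a single call, so it is equivalent to the make/copy pair it replaces here.)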


return string(data), nil
}

func newDroppedTraceDecision(msg string) ([]TraceDecision, error) {
Contributor

Why not also compress the dropped messages? Even though they're unlikely to do much better than 50% compression (they're essentially hex-coded noise most of the time), that's not nothing given how many we send.
