Support for compression from buildbarn #158
Hey @eagleonhill,

When compression support was added to the Remote Execution protocol, I was opposed to adding it in its given form, because the transmitting side does not send the size of the compressed object to the receiving side. LocalBlobAccess tries to store objects consecutively, so it would be pretty hard for it to allocate space for objects as they are being written. This means that compression would either be limited to the wire (not at rest), or bb-storage would need to do unnecessary copying of data to guarantee consecutive storage.

Furthermore, one downside of compression is that it makes fast random access to blobs a lot harder to provide. As Buildbarn relies heavily on virtual file systems (based on FUSE/NFSv4), this is something we need. Even without compression the protocol's story around random access is a bit messy, as there is no way to guarantee data integrity when reading CAS objects partially. Because of that, I am currently spending time with the Remote API working group to get chunking/decomposition of CAS objects added. See these PRs:
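To make the allocation problem concrete, here is a minimal sketch of a store that places objects back to back; the type and method names are hypothetical, not Buildbarn's actual LocalBlobAccess API:

```go
package store

// consecutiveStore is a hypothetical log-structured store that places
// objects back to back, in the spirit of LocalBlobAccess.
type consecutiveStore struct {
	data   []byte
	offset int
}

// allocate reserves a consecutive region. The size must be known before
// the first byte is written, because the next object starts immediately
// after this one.
func (s *consecutiveStore) allocate(sizeBytes int) []byte {
	region := s.data[s.offset : s.offset+sizeBytes]
	s.offset += sizeBytes
	return region
}
```

For a plain CAS object the size is part of the digest, so the region can be reserved before the first byte arrives; for a compressed blob only the uncompressed size is known, so the store would have to buffer or copy the compressed bytes first.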
Once we have proper support for chunking/decomposition, the story around compression will become a lot simpler: all (primitive) CAS objects will be bounded in size. This means that even if the receiving side doesn't know the size of a compressed CAS object, it can safely ingest it into memory to compute its size before letting LocalBlobAccess write it to disk (see the sketch below).

So in summary, the reason Buildbarn doesn't support compression is not that we don't care. We want to do it right, and that still takes a bit more time to sort out. I will keep this issue open so that people can stay informed of updates.
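A minimal sketch of that ingest step, assuming a hypothetical bound maxChunkSizeBytes coming from the chunking proposal (the function and parameter names are illustrative, not from the protocol):

```go
package ingest

import (
	"fmt"
	"io"
)

// ingestCompressedChunk buffers a compressed chunk fully in memory so that
// its size becomes known before any consecutive storage space is reserved.
// Chunking makes this safe, because every primitive CAS object is bounded
// in size.
func ingestCompressedChunk(r io.Reader, maxChunkSizeBytes int64) ([]byte, error) {
	// Read one byte past the bound so that oversized chunks are detected.
	compressed, err := io.ReadAll(io.LimitReader(r, maxChunkSizeBytes+1))
	if err != nil {
		return nil, err
	}
	if int64(len(compressed)) > maxChunkSizeBytes {
		return nil, fmt.Errorf("chunk exceeds %d bytes", maxChunkSizeBytes)
	}
	// len(compressed) is now known, so a store that needs sizes up front
	// (like LocalBlobAccess) can reserve space before writing to disk.
	return compressed, nil
}
```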
Hey @EdSchouten,
Maybe Buildbarn can have a try at it: when receiving a WriteRequest, allocate a space with …
That's the point: if we add compression support, I don't want compression only in transit, but also at rest. And that's problematic, as we can't allocate space for something whose size we don't know.
From my reading of this issue, I still don't understand why supporting gRPC-level in-transit compression (https://github.com/grpc/grpc-go/blob/master/Documentation/compression.md) is not desired/allowed. This feature should be easy to enable, and it is completely invisible to the application-level logic, correct?

The reason I ask is that we face a situation where some of our artifacts are too large to download in a reasonable amount of time. However, these artifacts compress very well and almost never change (so they stay in the cache for a long time), so we don't really care about on-disk compression. The ability to compress in transit only would solve 90% of our problems with slow builds due to downloads, and I think it should be a fairly simple change? More specifically, what we'd be looking at is having compression between the frontend and bb-clientd, and nothing else. Neither Bazel nor the storage nodes would even have to know that the compression is happening.
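For reference, the mechanism described in the linked grpc-go document is a single call option on the client; the server only needs the compressor registered, which importing the package does. Whether Buildbarn's configuration surface exposes this is a separate question; this sketch only shows the raw grpc-go API:

```go
package client

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/grpc/encoding/gzip" // importing this registers the gzip compressor
)

// dialWithCompression opens a connection on which every RPC compresses
// its messages in transit with gzip. Decompression on the receiving side
// is transparent: application logic and the bytes at rest are unaffected.
func dialWithCompression(addr string) (*grpc.ClientConn, error) {
	return grpc.Dial(addr,
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultCallOptions(grpc.UseCompressor(gzip.Name)),
	)
}
```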
+1 to gRPC-level in-transit compression. Our biggest artifacts (~1 GB) are large binaries with debug information that we know compress really well: adding "-gz=zlib" to our gcc options shrinks them by 60% or more. Unfortunately, "-gz=zlib" adds 40+ seconds to the link time, which is a dealbreaker. We're optimistic that gRPC compression would perform better. (gcc 13 adds a "-gz=zstd" option that might be better, but we're on gcc 12.3 for now.)
The Bazel remote protocol now has a compression option. That would be very useful for cases where Bazel and Buildbarn run in different clusters, and it would also reduce storage usage.
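For context, the compression support added in REAPI v2.2 works through ByteStream resource names like the ones below. Note that the hash and size embedded in the name refer to the uncompressed data, which is exactly the "receiver never learns the compressed size" problem discussed above:

```
# Download (ByteStream.Read):
{instance_name}/compressed-blobs/zstd/{uncompressed_hash}/{uncompressed_size}

# Upload (ByteStream.Write):
{instance_name}/uploads/{uuid}/compressed-blobs/zstd/{uncompressed_hash}/{uncompressed_size}
```

On the Bazel side this is enabled with a remote cache compression flag (`--experimental_remote_cache_compression` in the releases that first shipped it); the server still has to implement these resource names, which is what this issue tracks.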