
dev: table.Session.Execute does 3x allocations for every string (or bytes) byte in result #1428

Open
zveznicht opened this issue Aug 28, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@zveznicht

Every byte loaded from YDB via a query gets allocated on the heap three times.
For example, if a client reads 2 GiB/s of bytes from YDB, the Go runtime has to allocate 6 GiB/s of memory, which adds extra pressure on the GC.

These allocations are:

  1. gRPC allocation for the response message (https://github.com/grpc/grpc-go/blob/v1.63.x/rpc_util.go#L630; the buffer-handling code is being reworked right now, so I link a specific tag for clarity). To read each response, gRPC allocates a slice of bytes. It supports pooling, but by default the pool is a no-op that simply allocates a new slice every time.
  2. Proto unmarshalling of Ydb_Table.ExecuteDataQueryResponse. The standard protobuf implementation doesn't do zero-copy, so it allocates a new []byte for every bytes field and copies the values there.
  3. Proto unmarshalling of Ydb_Table.ExecuteDataQueryResponse.Result into Ydb_Table.ExecuteQueryResult. Result has the Any type, which is essentially a byte array with a type name attached, so it undergoes one more unmarshalling.

Possible improvements:

For (1) it is fairly easy to let SDK users enable the gRPC buffer pool. It is experimental, though (https://github.com/grpc/grpc-go/blob/v1.63.x/experimental/experimental.go#L46), and right now it doesn't work without additional hacks, because the pool can't be used while a stats.Handler is registered, and the SDK uses a stats.Handler internally. (Maybe this gets fixed in grpc-go itself.)

For (3) the easiest solution would be to stop using Any for the query response and instead switch to something like oneof to specialize the more common/important response types.
Otherwise, (2) and (3) can only be resolved by replacing the standard proto unmarshalling with something custom that does zero-copy. But resolving just (1) and (3) is probably enough.

@zveznicht zveznicht added the enhancement New feature or request label Aug 28, 2024
@asmyasnikov
Member

asmyasnikov commented Aug 28, 2024

For (1) it is pretty easy to allow sdk users to enable grpc pool. It is experimental though (https://github.com/grpc/grpc-go/blob/v1.63.x/experimental/experimental.go#L46).

You can set your pool with the option config.WithGrpcOptions(experimental.WithRecvBufferPool(yourPool)).
By default we don't want to set this option, because it is an experimental feature of the grpc-go library and can break between grpc-go versions. If you report good feedback about the option after some time in production, we'll consider using WithRecvBufferPool by default.
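For reference, the wiring might look roughly like this. config.WithGrpcOptions comes from the thread above; ydb.Open/ydb.With and the exact import paths are assumptions against ydb-go-sdk v3 and grpc-go v1.63.x, not a definitive recipe:

```go
// Sketch only: aside from config.WithGrpcOptions (named in this thread),
// the option and type names here are assumptions about the SDK's API.
package db

import (
	"context"

	ydb "github.com/ydb-platform/ydb-go-sdk/v3"
	"github.com/ydb-platform/ydb-go-sdk/v3/config"
	grpc "google.golang.org/grpc"
	"google.golang.org/grpc/experimental"
)

func open(ctx context.Context, dsn string) (*ydb.Driver, error) {
	// grpc.NewSharedBufferPool (grpc-go >= 1.57) reuses receive buffers
	// across RPCs instead of allocating a fresh slice per response.
	return ydb.Open(ctx, dsn,
		ydb.With(config.WithGrpcOptions(
			experimental.WithRecvBufferPool(grpc.NewSharedBufferPool()),
		)),
	)
}
```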

@zveznicht
Author

You can set your pool with option config.WithGrpcOptions(experimental.WithRecvBufferPool(yourPool))

We did, but unfortunately it doesn't work on its own, because the YDB SDK also sets a stats.Handler internally. Because of that, gRPC doesn't return the bytes back to the pool, and we had to register an additional stats.Handler to do it ourselves.

@zveznicht
Author

But in the latest version of grpc-go they seem to have removed the original byte slice from the stats.Handler event, so it shouldn't interfere with the buffer pool anymore (https://github.com/grpc/grpc-go/blob/master/experimental/experimental.go#L43). So we can probably cross this item off.

Also, I was told there is another API (query.Service) which doesn't use Any in the response, which should remove one more allocation. As soon as it becomes usable (#1429), I'll try it.
