Restructure Matrix read operations #314
base: main
Conversation
This pull request has been linked to Shortcut Story #43913: Restructure Matrix read operations.
Thanks for doing this!
(Optional)(Non-blocking) It would be really quite nice if we could unit test this, since if it breaks right now we'll have no way to know. Maybe we could set readBatchSize in the constructor, defaulting to READ_BATCH_SIZE, and then set it to a smaller number in some unit tests. A sketch of that idea follows.
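A minimal sketch of that suggestion. The class name MatrixReader and the member name are hypothetical placeholders, not names taken from the PR:

```cpp
#include <cstddef>

// From the PR: default cap on cells read per TileDB request.
constexpr size_t DEFAULT_READ_BATCH_SIZE_CELLS = 100000000;

// Hypothetical class name; the PR's actual reader class may differ.
class MatrixReader {
 public:
  explicit MatrixReader(
      size_t read_batch_size_cells = DEFAULT_READ_BATCH_SIZE_CELLS)
      : read_batch_size_cells_(read_batch_size_cells) {}

 private:
  size_t read_batch_size_cells_;  // cap used when sizing each read request
};

// In a unit test, force the batching path with a tiny cap:
// MatrixReader reader(/*read_batch_size_cells=*/16);
```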
@@ -36,6 +36,28 @@
#include <string>
#include <tiledb/tiledb>

// Default batch size for all TileDB read operations.
// This is expressed in number of array cells read per request.
constexpr size_t DEFAULT_READ_BATCH_SIZE_CELLS = 100000000;
I suggest using 100'000'000 for legibility.
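That would look like the following (the apostrophes are C++14 digit separators; the value is unchanged):

```cpp
// Same value as 100000000, grouped for readability.
constexpr size_t DEFAULT_READ_BATCH_SIZE_CELLS = 100'000'000;
```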
It looks like the reads are done in batches until the total size of the array is reached. I don't think this is a good way to do it. The initial request should be the total_size we want to read, since that will complete in the vast, vast majority of cases. If it doesn't, then we should continue the query and get more.
There is no reason to batch things this way; that is a pattern for a different use case.
Especially if there is a fixed overhead per query, we want to minimize the number of queries actually submitted.
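A sketch of the pattern being suggested, following the standard TileDB incomplete-query loop. It reuses query, subarray, layout_order, and total_size from the surrounding code; the attribute name "a" and the int32_t element type are placeholders:

```cpp
#include <vector>
#include <tiledb/tiledb>

// Request everything up front; iterate only if TileDB reports INCOMPLETE.
std::vector<int32_t> data(total_size);
query.set_subarray(subarray)
     .set_layout(layout_order)
     .set_data_buffer("a", data);

tiledb::Query::Status status;
do {
  query.submit();
  status = query.query_status();
  // On INCOMPLETE, consume what arrived (see query.result_buffer_elements())
  // and resubmit; in the common case this loop runs exactly once.
} while (status == tiledb::Query::Status::INCOMPLETE);
```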
tiledb_helpers::submit_query(tdb_func__, uri_, query);
query.set_subarray(subarray).set_layout(layout_order);
tiledb::Query::Status status;
do {
Does this result in incomplete queries whenever read_batch_size_cells is less than total_size? I don't think we should batch reads unless we actually need to. That is, we should always try to read total_size and then iterate if we don't get it.
Having incomplete queries here is a very rare edge case. Reading in batches may or may not solve that issue, but it adds latency to the common case.
The point of read_batch_size_cells is to set up a cap on the individual request size to TileDB. The cap is configurable, so we could set it to a large value in order to perform all requests in one go.
To explain why I expect this to reduce our memory footprint without a large latency overhead, let's use an example (a worked version of the arithmetic follows the list):
- Assume we want to read 15 GB of data from a tiledb:// array
- Assume that TileDB requires some extra memory (20%) to perform the filter pipeline (decompression, etc.)
- Assume read I/O throughput and network speed are 200 MB/s and all operations are I/O bound (no CPU overhead)
- Assume each extra call to core adds 500 ms of latency overhead
- One go
  - Peak allocated memory on client: 15 GB
  - Peak allocated memory on server: 15 GB + 3 GB = 18 GB
  - Peak allocated memory (REST + client when collocated during a realtime UDF): 33 GB
  - Latency: 500 ms + 2 * 15 GB / 200 MB/s = 150.5 s
- With batch size 1 GB
  - Peak allocated memory on client: 15 GB
  - Peak allocated memory on server: 1 GB + 0.3 GB = 1.3 GB
  - Peak allocated memory (REST + client when collocated during a realtime UDF): 16.3 GB
  - Latency: 15 * 500 ms + 2 * 15 GB / 200 MB/s = 157.5 s
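A quick back-of-the-envelope check of those latency figures, using only the numbers stated above (the factor of two covers the storage read plus the network transfer, each at 200 MB/s, per the assumptions):

```cpp
#include <cstdio>

int main() {
  const double total_gb = 15.0;        // data to read
  const double throughput_gb_s = 0.2;  // 200 MB/s
  const double per_query_s = 0.5;      // fixed latency per extra call to core

  // Storage read + network transfer, each at 200 MB/s: 150 s total.
  const double transfer_s = 2.0 * total_gb / throughput_gb_s;

  std::printf("one go:        %.1f s\n", 1 * per_query_s + transfer_s);   // 150.5 s
  std::printf("1 GB batches:  %.1f s\n", 15 * per_query_s + transfer_s);  // 157.5 s
}
```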
One rule in software is "don't pay for what you don't need." It is true that there is only a 5% penalty with this particular set of assumptions. However, I expect the server is going to deal with its memory footprint in its own way, which is why we might get incomplete reads. I would say just request what is needed and let the server say how much memory it wants to devote to that or not.
Do other apps try to be parsimonious this way?
I guess, on the other hand, since this is configurable, we can always just make it really large.
See in-line comment.
Restructure Matrix read operations to handle incomplete queries and batching