You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
If the cache is enabled, readWithRLock always reads the file using a siphon.
The siphon code copies every byte it reads into a bytes.Buffer. When the full file has been read, that bytes.Buffer is used to update the cache.
However, if the underlying file is e.g. a gigabyte in size, the siphon will end up with a bytes.Buffer containing that entire gigabyte. Unless you've set your cache size to over a gigabyte, this gets thrown away as soon as the ReadStream is done.
The main reason we use ReadStream is so we can deal with very large items without having to stick the entire thing in memory at once. Having discovered this, we'll probably disable the cache, but there are cases where people may wish to have a cache enabled without blowing up their memory!
The text was updated successfully, but these errors were encountered:
floren
added a commit
to floren/diskv
that referenced
this issue
Aug 18, 2021
The siphon will now stop writing to its internal buffer once the size of the buffer
exceeds the maximum cache size. Because we write until we *exceed* the max cache size,
we're safe to attempt the cache update even if the buffer only contains partial data,
because it's still over the limit & will be rejected.
I do use diskv in situations where some values are multiple gigabytes, and some are multiple megabytes.
We noticed this behavior when trying to figure out why sending items from one node's diskv to another took up so much memory. Once we figured out what was going on, we disabled the cache, but thought we'd report the behavior and offer a fix. I believe if you configure a 100MB cache, and you read a 500MB item via ReadStream, you'd be surprised to learn that a diskv makes a complete in-memory copy of the item before immediately and always throwing it away.
If you're not particularly worried about that corner case, feel free to close this issue and the related PR.
If the cache is enabled,
readWithRLock
always reads the file using asiphon
.The siphon code copies every byte it reads into a bytes.Buffer. When the full file has been read, that bytes.Buffer is used to update the cache.
However, if the underlying file is e.g. a gigabyte in size, the siphon will end up with a bytes.Buffer containing that entire gigabyte. Unless you've set your cache size to over a gigabyte, this gets thrown away as soon as the ReadStream is done.
The main reason we use ReadStream is so we can deal with very large items without having to stick the entire thing in memory at once. Having discovered this, we'll probably disable the cache, but there are cases where people may wish to have a cache enabled without blowing up their memory!
The text was updated successfully, but these errors were encountered: