If you're experiencing slow performance, please ensure your config file parameters are up to date. Please refer to the base config file.
Cloudfuse Stream supports reading and writing large files that do not fit in the file cache on the local disk. It also improves performance when only small portions of a file are accessed, since the file does not have to be downloaded in full before it is read or written. It supports the following modes:
- File-Handle based Caching
  - Separate file handles have separate buffers, regardless of whether they point to the same file
  - Ideal for scenarios where multiple handles are reading from different parts of a file
  - Not recommended for multiple-writer or single-writer, multiple-reader scenarios
  - If writing through multiple handles, the last handle closed wins, and writes from previously closed handles may not persist if their data buffers overlap
  - If writing on one handle, modified data is only visible to handles opened after the writer handle closes
- File-Name based Caching
  - Separate file handles pointing to the same file share buffers
  - Ideal for scenarios where multiple handles are reading from nearby parts of a file, and for multiple-writer or single-writer, multiple-reader scenarios
To enable stream, first specify stream under the components sequence between libfuse and attr_cache. Note that 'stream' and 'file_cache' currently cannot coexist.
components:
- libfuse
- stream
- attr_cache
- azstorage
or
components:
- libfuse
- stream
- attr_cache
- s3storage
The configuration options for stream are:
block-size-mb: 16
: Integer parameter that specifies the size of each block to be cached in memory (in MB). When using S3 storage, the parameter part-size-mb in s3storage should be set to the same value as this one.
max-buffers: 16
: Integer parameter that specifies the total number of buffers to be cached in memory.
buffer-size-mb: 16
: Integer parameter that specifies the size of each buffer to be cached in memory (in MB).
file-caching: true|false
: Boolean parameter to specify file name based caching. Default is false, which specifies file handle based caching.
After adding the components, add the following section to your Cloudfuse config file. This example lets Cloudfuse use up to 64 MB × 128 buffers = 8192 MB (8 GiB) of memory to cache data buffers, with file handle based caching:
stream:
  block-size-mb: 64
  max-buffers: 128
  buffer-size-mb: 64
  file-caching: false
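For comparison, a variant with file name based caching on an S3 backend might look like the sketch below. The sizes are illustrative assumptions rather than recommended values; block-size-mb and buffer-size-mb are kept equal only because the example above does the same. The only s3storage setting shown is part-size-mb, and the rest of that section is left out.

```yaml
stream:
  block-size-mb: 16
  max-buffers: 32       # 32 buffers x 16 MB = up to 512 MB of memory
  buffer-size-mb: 16
  file-caching: true    # file name based caching: handles to the same file share buffers

s3storage:
  part-size-mb: 16      # kept equal to stream.block-size-mb, as advised for S3 storage
  # other s3storage settings are omitted from this sketch
```

Lowering max-buffers or buffer-size-mb reduces the memory ceiling in the same proportion.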
To disable caching and stream straight from S3 or Azure Storage, set all stream buffer configuration options to 0.
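A minimal sketch of such a configuration is shown below. It assumes that "stream buffer configuration options" means the three numeric options listed earlier; file-caching is left out and so keeps its default of false.

```yaml
stream:
  block-size-mb: 0      # assumption: 0 disables in-memory block caching
  max-buffers: 0
  buffer-size-mb: 0
```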