docs: various topics update #4037

Merged: 6 commits, Sep 21, 2023
2 changes: 1 addition & 1 deletion docs/en/administration/metadata_dump_load.md
@@ -10,7 +10,7 @@ slug: /metadata_dump_load
- JuiceFS v1.0.4 starts to support importing an encrypted backup.
:::

JuiceFS supports [multiple metadata engines](../reference/how_to_set_up_metadata_engine.md), and each engine stores and manages data in a different format internally. JuiceFS provides the [`dump`](../reference/command_reference.md#dump) command to export metadata in a uniform JSON format, also there's the [`load`](../reference/command_reference.md#load) command to restore or migrate backups to any metadata storage engine.
JuiceFS supports [multiple metadata engines](../reference/how_to_set_up_metadata_engine.md), and each engine stores and manages data in a different format internally. JuiceFS provides the [`dump`](../reference/command_reference.md#dump) command to export metadata in a uniform JSON format, as well as the [`load`](../reference/command_reference.md#load) command to restore or migrate backups to any metadata storage engine. This dump / load process can also be used to migrate a Community Edition file system to Enterprise Edition (read the [enterprise docs](https://juicefs.com/docs/cloud/metadata_dump_load) for more), and vice versa.
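
As a rough sketch of such a backup-and-restore flow (the metadata URLs below are placeholders, and both commands accept further options described in the command reference):

```shell
# Export all metadata from the source engine into a JSON backup
# (redis://192.168.1.6:6379/1 is a placeholder metadata URL)
juicefs dump redis://192.168.1.6:6379/1 meta-dump.json

# Load the backup into a new, empty metadata engine
juicefs load "mysql://user:password@(192.168.1.7:3306)/juicefs" meta-dump.json
```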

## Metadata backup {#backup}

23 changes: 21 additions & 2 deletions docs/en/administration/troubleshooting.md
@@ -79,7 +79,9 @@ $ ls -l /usr/bin/fusermount
-rwsr-xr-x 1 root root 32096 Oct 30 2018 /usr/bin/fusermount
```

## Connection problems with object storage (slow internet speed) {#io-error-object-storage}
## Slow or failing reads and writes {#read-write-error}

### Connection problems with object storage (slow internet speed) {#io-error-object-storage}

If the JuiceFS Client cannot connect to the object storage, or the bandwidth is simply not enough, JuiceFS will complain in its logs:

@@ -100,7 +102,7 @@ The first issue with slow connection is upload / download timeouts (demonstrated
* Reduce buffer size, e.g. [`--buffer-size=64`](../reference/command_reference.md#mount) or even lower. With ample bandwidth, increasing the buffer size improves parallel performance, but in a low-speed environment it only makes `flush` operations slow and prone to timeouts.
* The default timeout for GET / PUT requests is 60 seconds; increasing `--get-timeout` and `--put-timeout` may help with read / write timeouts (see the sketch after this list).
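
For instance, a low-bandwidth mount might combine these options as follows (a sketch only; the metadata URL and mount point are placeholders, and the values should be tuned to your environment):

```shell
# Shrink the read/write buffer and relax object storage timeouts
# for a client on a slow connection
juicefs mount --buffer-size=64 --get-timeout=120 --put-timeout=120 \
    redis://192.168.1.6:6379/1 /jfs
```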

In addition, the ["Client Write Cache"](../guide/cache.md#writeback) feature needs to be used with caution in low bandwidth environment. Let's briefly go over the JuiceFS Client background job design: every JuiceFS Client runs background jobs by default, one of which is data compaction, and if the client has poor internet speed, it'll drag down performance for the whole system. A worse case is when client write cache is also enabled, compaction results are uploaded too slowly, forcing other clients into a read hang when accessing the affected files:
In addition, the ["Client Write Cache"](../guide/cache.md#client-write-cache) feature needs to be used with caution in low-bandwidth environments. To briefly go over the JuiceFS Client background job design: every JuiceFS Client runs background jobs by default, one of which is data compaction, and if the client has a poor internet connection, it drags down performance for the whole system. A worse case is when client write cache is also enabled: compaction results are uploaded too slowly, forcing other clients into a read hang when accessing the affected files:

```text
# While compaction results are slowly being uploaded by low-speed clients, reads from other clients will hang and eventually fail
```
@@ -111,6 +113,23 @@ In addition, the ["Client Write Cache"](../guide/cache.md#writeback) feature nee

To avoid this type of issue, we recommend disabling background jobs on low-bandwidth clients, i.e. adding the [`--no-bgjob`](../reference/command_reference.md#mount) option to the mount command, as sketched below.
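
A minimal sketch (the metadata URL and mount point are placeholders):

```shell
# Disable background jobs, including compaction, on a low-bandwidth client
juicefs mount --no-bgjob redis://192.168.1.6:6379/1 /jfs
```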

### WARNING log: block not found in object storage {#warning-log-block-not-found-in-object-storage}

When using JuiceFS at scale, warnings like the following will show up in client logs:

```
<WARNING>: fail to read sliceId 1771585458 (off:4194304, size:4194304, clen: 37746372): get chunks/0/0/1_0_4194304: oss: service returned error: StatusCode=404, ErrorCode=NoSuchKey, ErrorMessage="The specified key does not exist.", RequestId=62E8FB058C0B5C3134CB80B6
```

When this type of warning occurs but is not accompanied by I/O errors (indicated by `input/output error` in client logs), you can safely ignore it and continue normal use; the client retries automatically and resolves the issue.

This warning means that the JuiceFS Client cannot read a particular slice because a block does not exist and the object storage returns a `NoSuchKey` error. This is usually caused by:

* Clients carry out compaction asynchronously. Upon completion, compaction changes the relationship between a file and its corresponding blocks, causing problems for other clients that are already reading this file, hence the warning.
* Some clients have ["Client Write Cache"](../guide/cache.md#client-write-cache) enabled: they write a file and commit it to the Metadata Service, but the corresponding blocks are still pending upload (caused by, for example, [slow internet speed](#io-error-object-storage)). Meanwhile, other clients that are already accessing this file will see this warning.

Again, if no I/O errors occur, this warning can be safely ignored; a quick way to check is sketched below.
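
To tell the two cases apart, something like the following can be used (the log path is an assumption for a background mount run as root; adjust it to wherever your client actually writes its logs):

```shell
# Count real I/O errors vs. harmless NoSuchKey retries
grep -c "input/output error" /var/log/juicefs.log
grep -c "NoSuchKey" /var/log/juicefs.log
```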

## Read amplification

In JuiceFS, a typical read amplification manifests as object storage traffic being much larger than JuiceFS Client read speed. For example, JuiceFS Client is reading at 200MiB/s, while S3 traffic grows up to 2GiB/s.
2 changes: 1 addition & 1 deletion docs/en/development/internals.md
@@ -827,7 +827,7 @@ Slice{pos: 40M, id: 0, size: 24M, off: 0, len: 24M} // can be omitted

### Data objects

#### Object naming
#### Object naming {#object-storage-naming-format}

Block is the basic unit for JuiceFS to manage data. Its size is 4 MiB by default, and can be changed only when formatting a file system, within the interval [64 KiB, 16 MiB]. Each Block is an object in the object storage after upload, and is named in the format `${fsname}/chunks/${hash}/${basename}`, where

2 changes: 1 addition & 1 deletion docs/en/faq.md
@@ -92,7 +92,7 @@ Read [JuiceFS Internals](development/internals.md) and [Data Processing Flow](in

You can mount JuiceFS with the [`--writeback` option](reference/command_reference.md#mount), which writes small files to local disks first and then uploads them to object storage in the background. This can speed up copying many small files into JuiceFS, for example:
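
A minimal sketch (the metadata URL and mount point are placeholders):

```shell
# Write small files to local disk first, upload to object storage asynchronously
juicefs mount --writeback redis://192.168.1.6:6379/1 /jfs
```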

See ["Write Cache in Client"](guide/cache.md#writeback) for more information.
See ["Write Cache in Client"](guide/cache.md#client-write-cache) for more information.

### Does JuiceFS support distributed cache?

4 changes: 2 additions & 2 deletions docs/en/guide/cache.md
@@ -144,7 +144,7 @@ Repeated reads of the same file in JuiceFS can be extremely fast, with latencies

Starting from Linux kernel 3.15, FUSE supports [writeback-cache](https://www.kernel.org/doc/Documentation/filesystems/fuse-io.txt) mode: the kernel consolidates high-frequency random small (10-100 bytes) write requests to significantly improve performance. This comes with a side effect: sequential writes are also turned into random writes, hindering sequential write performance, so only use it in intensive random write scenarios.

To enable writeback-cache mode, use the [`-o writeback_cache`](../reference/fuse_mount_options.md#writeback_cache) option when you [mount JuiceFS](../reference/command_reference.md#mount). Note that writeback-cache mode is not the same as [Client write data cache](#writeback), the former is a kernel implementation while the latter happens inside the JuiceFS Client, read the corresponding section to learn their intended scenarios.
To enable writeback-cache mode, use the [`-o writeback_cache`](../reference/fuse_mount_options.md#writeback_cache) option when you [mount JuiceFS](../reference/command_reference.md#mount), as sketched below. Note that writeback-cache mode is not the same as [Client write data cache](#client-write-cache): the former is a kernel implementation, while the latter happens inside the JuiceFS Client. Read the corresponding section to learn their intended scenarios.
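
For example (the metadata URL and mount point are placeholders):

```shell
# Enable the FUSE kernel writeback-cache mode for intensive random small writes
juicefs mount -o writeback_cache redis://192.168.1.6:6379/1 /jfs
```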

### Read cache in client {#client-read-cache}

@@ -180,7 +180,7 @@ Below are some important options for cache configuration (see [`juicefs mount`](

There are two main read patterns, sequential read and random read. Sequential reads usually demand higher throughput, while random reads need lower latency. When local disk throughput is lower than that of the object storage, consider enabling `--cache-partial-only` so that sequential reads do not cache the whole block; rather, only small reads (like the footer of a Parquet / ORC file) are cached. This allows JuiceFS to take advantage of the low latency provided by local disk and the high throughput provided by object storage at the same time.

### Client write data cache {#writeback}
### Client write data cache {#client-write-cache}

Enabling client write cache can improve performance when writing a large number of small files. Read this section to learn about client write cache.

2 changes: 1 addition & 1 deletion docs/en/introduction/io_processing.md
@@ -48,7 +48,7 @@ Client write cache is also referred to as "Writeback mode" throughout the docs.

For scenarios that do not deem consistency and data security as top priorities, enabling client write cache is also an option to further improve performance. When client write cache is enabled, flush operations return immediately after writing data to the local cache directory. Then, local data is uploaded asynchronously to the object storage. In other words, the local cache directory is a cache layer for the object storage.

Learn more in [Client Write Cache](../guide/cache.md#writeback).
Learn more in [Client Write Cache](../guide/cache.md#client-write-cache).

## Data reading process {#workflow-of-read}

10 changes: 5 additions & 5 deletions docs/en/reference/command_reference.md
@@ -167,10 +167,10 @@ juicefs format sqlite3://myjfs.db myjfs --trash-days=0
|Items|Description|
|-|-|
|`--block-size=4096`|size of block in KiB (default: 4096). 4M is usually a better default value because many object storage services use 4M as their internal block size, thus using the same block size in JuiceFS usually yields better performance.|
|`--compress=none`|compression algorithm, choose from `lz4`, `zstd`, `none` (default). Enabling compression will inevitably affect performance, choose wisely.|
|`--compress=none`|compression algorithm, choose from `lz4`, `zstd`, `none` (default). Enabling compression will inevitably affect performance. Among the two supported algorithms, `lz4` offers better performance, while `zstd` comes with a higher compression ratio; search online for a detailed comparison.|
|`--encrypt-rsa-key=value`|A path to RSA private key (PEM)|
|`--encrypt-algo=aes256gcm-rsa`|encrypt algorithm (aes256gcm-rsa, chacha20-rsa) (default: "aes256gcm-rsa")|
|`--hash-prefix`|add a hash prefix to name of objects (default: false)|
|`--hash-prefix`|For most object storages, if object storage blocks are sequentially named, they will also be stored close together in the underlying physical regions. Under intensive concurrent consecutive reads, this can cause hotspots and hinder object storage performance.<br/><br/>Enabling `--hash-prefix` adds a hash prefix to the names of the blocks (slice ID mod 256, see [internal implementation](../development/internals.md#object-storage-naming-format)), which distributes data blocks evenly across actual object storage regions, offering more consistent performance. Obviously, this option dictates the object naming pattern and **should be specified when a file system is created; it cannot be changed on the fly.**<br/><br/>Currently, [AWS S3](https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3-announces-increased-request-rate-performance) has already made improvements and no longer requires application-side optimization, but for other types of object storages, this option is still recommended for large-scale scenarios.|
|`--shards=0`|If your object storage limits speed at the bucket level (or you're using a self-hosted object storage with limited performance), you can store the blocks into N buckets by hash of key (default: 0). When N is greater than 0, `bucket` needs to be in the form of `%d`, e.g. `--bucket "juicefs-%d"`. `--shards` cannot be changed afterwards and must be planned carefully ahead. See the example after this table.|
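
As a hedged example of how these options combine at format time (the metadata URL and bucket name are placeholders; storage type and credential options are omitted for brevity and will usually be needed in practice):

```shell
# Create a file system with lz4 compression, hashed object name prefixes,
# and data sharded across 4 buckets named juicefs-0 .. juicefs-3
juicefs format \
    --compress lz4 \
    --hash-prefix \
    --shards 4 \
    --bucket "juicefs-%d" \
    redis://192.168.1.6:6379/1 myjfs
```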

#### Management options {#format-management-options}
@@ -666,12 +666,12 @@ For metadata cache description and usage, refer to [Kernel metadata cache](../gu
|-|-|
|`--buffer-size=300`|total read/write buffering in MiB (default: 300), see [Read/Write buffer](../guide/cache.md#buffer-size)|
|`--prefetch=1`|prefetch N blocks in parallel (default: 1), see [Client read data cache](../guide/cache.md#client-read-cache)|
|`--writeback`|upload objects in background (default: false), see [Client write data cache](../guide/cache.md#writeback)|
|`--upload-delay=0`|When `--writeback` is enabled, you can use this option to add a delay to object storage upload, default to 0, meaning that upload will begin immediately after write. Different units are supported, including `s` (second), `m` (minute), `h` (hour). If files are deleted during this delay, upload will be skipped entirely, when using JuiceFS for temporary storage, use this option to reduce resource usage. Refer to [Client write data cache](../guide/cache.md#writeback).|
|`--writeback`|upload objects in background (default: false), see [Client write data cache](../guide/cache.md#client-write-cache)|
|`--upload-delay=0`|When `--writeback` is enabled, you can use this option to add a delay to object storage upload, default to 0, meaning that upload will begin immediately after write. Different units are supported, including `s` (second), `m` (minute), `h` (hour). If files are deleted during this delay, upload will be skipped entirely; when using JuiceFS for temporary storage, use this option to reduce resource usage. Refer to [Client write data cache](../guide/cache.md#client-write-cache).|
|`--cache-dir=value`|directory paths of local cache, use `:` (Linux, macOS) or `;` (Windows) to separate multiple paths (default: `$HOME/.juicefs/cache` or `/var/jfsCache`), see [Client read data cache](../guide/cache.md#client-read-cache)|
|`--cache-mode value` <VersionAdd>1.1</VersionAdd> |file permissions for cached blocks (default: "0600")|
|`--cache-size=102400`|size of cached object for read in MiB (default: 102400), see [Client read data cache](../guide/cache.md#client-read-cache)|
|`--free-space-ratio=0.1`|min free space ratio (default: 0.1), if [Client write data cache](../guide/cache.md#writeback) is enabled, this option also controls write cache size, see [Client read data cache](../guide/cache.md#client-read-cache)|
|`--free-space-ratio=0.1`|min free space ratio (default: 0.1); if [Client write data cache](../guide/cache.md#client-write-cache) is enabled, this option also controls the write cache size, see [Client read data cache](../guide/cache.md#client-read-cache)|
|`--cache-partial-only`|cache random/small read only (default: false), see [Client read data cache](../guide/cache.md#client-read-cache)|
|`--verify-cache-checksum value` <VersionAdd>1.1</VersionAdd> |Checksum level for cache data. After enabled, checksum will be calculated on divided parts of the cache blocks and stored on disks, which are used for verification during reads. The following strategies are supported:<br/><ul><li>`none`: Disable checksum verification; if local cache data is tampered with, bad data will be read;</li><li>`full` (default): Perform verification when reading the full block, use this for sequential read scenarios;</li><li>`shrink`: Perform verification on parts that are fully included within the read range, use this for random read scenarios;</li><li>`extend`: Perform verification on parts that fully include the read range, which causes read amplification and is only used for random read scenarios demanding absolute data integrity.</li></ul>|
|`--cache-eviction value` <VersionAdd>1.1</VersionAdd> |cache eviction policy (none or 2-random) (default: "2-random")|
16 changes: 3 additions & 13 deletions docs/en/reference/p8s_metrics.md
@@ -1,27 +1,17 @@
---
title: JuiceFS Metrics
sidebar_position: 4
slug: /p8s_metrics
---

:::tip
Please see the ["Monitoring and Data Visualization"](../administration/monitoring.md) documentation to learn how to collect and display JuiceFS monitoring metrics.
:::
If you haven't yet set up monitoring for JuiceFS, read ["Monitoring and Data Visualization"](../administration/monitoring.md) to learn how.

## Global labels

| Name | Description |
| ---- | ----------- |
| `vol_name` | Volume name |
| `mp` | Mount point path |

:::info
When Prometheus scrapes a target, it attaches `instance` label automatically to the scraped time series which serve to identify the scraped target, and its format is `<host>:<port>`. Refer to [official document](https://prometheus.io/docs/concepts/jobs_instances) for more information.
:::

:::info
If the monitoring metrics are reported through [Prometheus Pushgateway](https://github.com/prometheus/pushgateway) (for example, [JuiceFS Hadoop Java SDK](../administration/monitoring.md#hadoop)), the value of the `mp` label is `sdk-<PID>`, and the value of the `instance` label is the host name.
:::
| `instance` | Client host name in the format `<host>:<port>`. Refer to the [official document](https://prometheus.io/docs/concepts/jobs_instances) for more information |
| `mp` | Mount point path. If metrics are reported through [Prometheus Pushgateway](https://github.com/prometheus/pushgateway) (for example, by the [JuiceFS Hadoop Java SDK](../administration/monitoring.md#hadoop)), `mp` will be `sdk-<PID>`. See the sample after this table |
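
For instance, a series stored in Prometheus after scraping a mount point's metrics endpoint might look like this (the metric name, port and values are illustrative only; `instance` is attached by Prometheus at scrape time):

```shell
# A scraped time series carrying the global labels described above
juicefs_used_space{instance="192.168.1.8:9567", mp="/jfs", vol_name="myjfs"} 1.073741824e+09
```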

## File system

2 changes: 1 addition & 1 deletion docs/zh_cn/administration/metadata_dump_load.md
@@ -10,7 +10,7 @@ slug: /metadata_dump_load
- JuiceFS v1.0.4 starts to support restoring encrypted metadata backups via the `load` command
:::

JuiceFS supports [multiple metadata engines](../reference/how_to_set_up_metadata_engine.md), and each engine manages data in a different internal format. For easier management, JuiceFS provides the [`dump`](../reference/command_reference.md#dump) command to write all metadata to a JSON file in a uniform format for backup, as well as the [`load`](../reference/command_reference.md#load) command to restore or migrate backups to any metadata storage engine.
JuiceFS supports [multiple metadata engines](../reference/how_to_set_up_metadata_engine.md), and each engine manages data in a different internal format. For easier management, JuiceFS provides the [`dump`](../reference/command_reference.md#dump) command to write all metadata to a JSON file in a uniform format for backup, as well as the [`load`](../reference/command_reference.md#load) command to restore or migrate backups to any metadata storage engine. This dump / load process can also be used to migrate a Community Edition file system to Enterprise Edition (see the [enterprise docs](https://juicefs.com/docs/zh/cloud/metadata_dump_load)), and vice versa.

## Metadata backup {#backup}
