diff --git a/docs/en/administration/metadata_dump_load.md b/docs/en/administration/metadata_dump_load.md index 7cf67e9ece92..9df053cc2869 100644 --- a/docs/en/administration/metadata_dump_load.md +++ b/docs/en/administration/metadata_dump_load.md @@ -10,7 +10,7 @@ slug: /metadata_dump_load - JuiceFS v1.0.4 starts to support importing an encrypted backup. ::: -JuiceFS supports [multiple metadata engines](../reference/how_to_set_up_metadata_engine.md), and each engine stores and manages data in a different format internally. JuiceFS provides the [`dump`](../reference/command_reference.md#dump) command to export metadata in a uniform JSON format, also there's the [`load`](../reference/command_reference.md#load) command to restore or migrate backups to any metadata storage engine. +JuiceFS supports [multiple metadata engines](../reference/how_to_set_up_metadata_engine.md), and each engine stores and manages data in a different format internally. JuiceFS provides the [`dump`](../reference/command_reference.md#dump) command to export metadata in a uniform JSON format, as well as the [`load`](../reference/command_reference.md#load) command to restore or migrate backups to any metadata storage engine. This dump / load process can also be used to migrate a community edition file system to the enterprise edition (see the [enterprise docs](https://juicefs.com/docs/cloud/metadata_dump_load) for details), and vice versa. ## Metadata backup {#backup} diff --git a/docs/en/administration/troubleshooting.md b/docs/en/administration/troubleshooting.md index a422743574e3..1eb476abd9f1 100644 --- a/docs/en/administration/troubleshooting.md +++ b/docs/en/administration/troubleshooting.md @@ -79,7 +79,9 @@ $ ls -l /usr/bin/fusermount -rwsr-xr-x 1 root root 32096 Oct 30 2018 /usr/bin/fusermount ``` -## Connection problems with object storage (slow internet speed) {#io-error-object-storage} +## Slow or failing reads / writes {#read-write-error} + +### Connection problems with object storage (slow internet speed) {#io-error-object-storage} If JuiceFS Client cannot connect to object storage, or the bandwidth is simply not enough, JuiceFS will complain in logs: @@ -100,7 +102,7 @@ The first issue with slow connection is upload / download timeouts (demonstrated * Reduce buffer size, e.g. [`--buffer-size=64`](../reference/command_reference.md#mount) or even lower. In a large bandwidth condition, increasing buffer size improves parallel performance. But in a low speed environment, this only makes `flush` operations slow and prone to timeouts. * The default timeout for GET / PUT requests is 60 seconds; increasing `--get-timeout` and `--put-timeout` may help with read / write timeouts. -In addition, the ["Client Write Cache"](../guide/cache.md#writeback) feature needs to be used with caution in low bandwidth environment. Let's briefly go over the JuiceFS Client background job design: every JuiceFS Client runs background jobs by default, one of which is data compaction, and if the client has poor internet speed, it'll drag down performance for the whole system. A worse case is when client write cache is also enabled, compaction results are uploaded too slowly, forcing other clients into a read hang when accessing the affected files: +In addition, the ["Client Write Cache"](../guide/cache.md#client-write-cache) feature needs to be used with caution in low-bandwidth environments.
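+Putting the tuning options above together, a hedged sketch of a mount command for a slow connection might look like this (the metadata URL, mount point, and timeout values are placeholders to be adapted to the actual environment):
+
+```shell
+# Illustrative only: shrink the read/write buffer and relax object storage
+# timeouts to reduce flush failures on a slow connection.
+juicefs mount \
+    --buffer-size=64 \
+    --get-timeout=120 \
+    --put-timeout=120 \
+    redis://localhost:6379/1 /mnt/jfs
+```
+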
Let's briefly go over the JuiceFS Client background job design: every JuiceFS Client runs background jobs by default, one of which is data compaction, and if the client has poor internet speed, it'll drag down performance for the whole system. A worse case is when client write cache is also enabled: compaction results are uploaded too slowly, forcing other clients into a read hang when accessing the affected files: ```text # While compaction results are slowly being uploaded in low speed clients, read from other clients will hang and eventually fail @@ -111,6 +113,23 @@ In addition, the ["Client Write Cache"](../guide/cache.md#writeback) fea To avoid this type of issue, we recommend disabling background jobs on low-bandwidth clients, i.e. adding the [`--no-bgjob`](../reference/command_reference.md#mount) option to the mount command. +### WARNING log: block not found in object storage {#warning-log-block-not-found-in-object-storage} + +When using JuiceFS at scale, warnings like the following may show up in client logs: + +``` +: fail to read sliceId 1771585458 (off:4194304, size:4194304, clen: 37746372): get chunks/0/0/1_0_4194304: oss: service returned error: StatusCode=404, ErrorCode=NoSuchKey, ErrorMessage="The specified key does not exist.", RequestId=62E8FB058C0B5C3134CB80B6 +``` + +If this type of warning is not accompanied by I/O errors (indicated by `input/output error` in client logs), you can safely ignore it and continue normal use: the client will retry automatically and resolve the issue. + +This warning means that the JuiceFS Client cannot read a particular slice because one of its blocks does not exist, so the object storage returns a `NoSuchKey` error. This is usually caused by: + +* Clients carry out compaction asynchronously; upon completion, compaction changes the relationship between a file and its corresponding blocks, which disrupts other clients that are already reading the file, hence the warning. +* Some clients have enabled ["Client Write Cache"](../guide/cache.md#client-write-cache): they write a file and commit it to the Metadata Service, but the corresponding blocks are still pending upload (for example, due to [slow internet speed](#io-error-object-storage)). Meanwhile, other clients that are already accessing this file will see this warning. + +Again, if no errors occur on the application side, this warning can be safely ignored. + ## Read amplification In JuiceFS, a typical read amplification manifests as object storage traffic being much larger than JuiceFS Client read speed. For example, JuiceFS Client is reading at 200MiB/s, while S3 traffic grows up to 2GiB/s. diff --git a/docs/en/development/internals.md b/docs/en/development/internals.md index 7f2d1871bf14..971366a6547a 100644 --- a/docs/en/development/internals.md +++ b/docs/en/development/internals.md @@ -827,7 +827,7 @@ Slice{pos: 40M, id: 0, size: 24M, off: 0, len: 24M} // can be omitted ### Data objects -#### Object naming +#### Object naming {#object-storage-naming-format} Block is the basic unit for JuiceFS to manage data. Its size is 4 MiB by default, and can be changed only when formatting a file system, within the interval [64 KiB, 16 MiB].
Each Block is an object in the object storage after upload, and is named in the format `${fsname}/chunks/${hash}/${basename}`, where diff --git a/docs/en/faq.md b/docs/en/faq.md index 4a167a0083f9..3576665d9bbf 100644 --- a/docs/en/faq.md +++ b/docs/en/faq.md @@ -92,7 +92,7 @@ Read [JuiceFS Internals](development/internals.md) and [Data Processing Flow](in You could mount JuiceFS with the [`--writeback` option](reference/command_reference.md#mount), which will write the small files into local disks first, then upload them to object storage in the background; this can speed up copying many small files into JuiceFS. -See ["Write Cache in Client"](guide/cache.md#writeback) for more information. +See ["Write Cache in Client"](guide/cache.md#client-write-cache) for more information. ### Does JuiceFS support distributed cache? diff --git a/docs/en/guide/cache.md b/docs/en/guide/cache.md index d3cd55b62910..965fddd6e805 100644 --- a/docs/en/guide/cache.md +++ b/docs/en/guide/cache.md @@ -144,7 +144,7 @@ Repeated reads of the same file in JuiceFS can be extremely fast, with latencies Starting from Linux kernel 3.15, FUSE supports [writeback-cache](https://www.kernel.org/doc/Documentation/filesystems/fuse-io.txt) mode, the kernel will consolidate high-frequency random small (10-100 bytes) write requests to significantly improve its performance, but this comes with a side effect: sequential writes are also turned into random writes, hence sequential write performance is hindered, so only use it on intensive random write scenarios. -To enable writeback-cache mode, use the [`-o writeback_cache`](../reference/fuse_mount_options.md#writeback_cache) option when you [mount JuiceFS](../reference/command_reference.md#mount). Note that writeback-cache mode is not the same as [Client write data cache](#writeback), the former is a kernel implementation while the latter happens inside the JuiceFS Client, read the corresponding section to learn their intended scenarios. +To enable writeback-cache mode, use the [`-o writeback_cache`](../reference/fuse_mount_options.md#writeback_cache) option when you [mount JuiceFS](../reference/command_reference.md#mount). Note that writeback-cache mode is not the same as [Client write data cache](#client-write-cache): the former is a kernel implementation while the latter happens inside the JuiceFS Client; read the corresponding section to learn their intended scenarios. ### Read cache in client {#client-read-cache} @@ -180,7 +180,7 @@ Below are some important options for cache configuration (see [`juicefs mount`]( There are two main read patterns, sequential read and random read. Sequential read usually demands higher throughput while random reads need lower latency. When local disk throughput is lower than object storage, consider enabling `--cache-partial-only` so that sequential reads do not cache the whole block, but rather, only small reads (like the footer of a Parquet / ORC file) are cached. This allows JuiceFS to take advantage of the low latency provided by local disk, and the high throughput provided by object storage, at the same time. -### Client write data cache {#writeback} +### Client write data cache {#client-write-cache} Enabling client write cache can improve performance when writing a large amount of small files. Read this section to learn about client write cache.
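+A hedged sketch of how the two write-caching mechanisms above are enabled (the metadata URL and mount point are placeholders, not prescribed values):
+
+```shell
+# Kernel writeback-cache mode (FUSE level), for intensive random small writes
+juicefs mount -o writeback_cache redis://localhost:6379/1 /mnt/jfs
+
+# Client write cache (JuiceFS Client level): flush returns after writing to the
+# local cache directory, and data is uploaded to object storage asynchronously
+juicefs mount --writeback redis://localhost:6379/1 /mnt/jfs
+```
+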
diff --git a/docs/en/introduction/io_processing.md b/docs/en/introduction/io_processing.md index 708d2bda22ae..4b034a6c6c81 100644 --- a/docs/en/introduction/io_processing.md +++ b/docs/en/introduction/io_processing.md @@ -48,7 +48,7 @@ Client write cache is also referred to as "Writeback mode" throughout the docs. For scenarios that do not deem consistency and data security as top priorities, enabling client write cache is also an option to further improve performance. When client write cache is enabled, flush operations return immediately after writing data to the local cache directory. Then, local data is uploaded asynchronously to the object storage. In other words, the local cache directory is a cache layer for the object storage. -Learn more in [Client Write Cache](../guide/cache.md#writeback). +Learn more in [Client Write Cache](../guide/cache.md#client-write-cache). ## Data reading process {#workflow-of-read} diff --git a/docs/en/reference/command_reference.md b/docs/en/reference/command_reference.md index 60f5f403cc7b..ebd28b75da1e 100644 --- a/docs/en/reference/command_reference.md +++ b/docs/en/reference/command_reference.md @@ -167,10 +167,10 @@ juicefs format sqlite3://myjfs.db myjfs --trash-days=0 |Items|Description| |-|-| |`--block-size=4096`|size of block in KiB (default: 4096). 4M is usually a better default value because many object storage services use 4M as their internal block size, thus using the same block size in JuiceFS usually yields better performance.| -|`--compress=none`|compression algorithm, choose from `lz4`, `zstd`, `none` (default). Enabling compression will inevitably affect performance, choose wisely.| +|`--compress=none`|compression algorithm, choose from `lz4`, `zstd`, `none` (default). Enabling compression will inevitably affect performance. Of the two supported algorithms, `lz4` offers better performance, while `zstd` comes with a higher compression ratio; look up published benchmarks for a detailed comparison.| |`--encrypt-rsa-key=value`|A path to RSA private key (PEM)| |`--encrypt-algo=aes256gcm-rsa`|encrypt algorithm (aes256gcm-rsa, chacha20-rsa) (default: "aes256gcm-rsa")| -|`--hash-prefix`|add a hash prefix to name of objects (default: false)| +|`--hash-prefix`|For most object storage services, if object storage blocks are named sequentially, they will also be stored close together in the underlying physical regions. Under intensive concurrent sequential reads, this can create hotspots and hinder object storage performance.

Enabling `--hash-prefix` adds a hash prefix to block names (slice ID mod 256, see [internal implementation](../development/internals.md#object-storage-naming-format)), which distributes data blocks evenly across the actual object storage regions, offering more consistent performance. Note that this option dictates the object naming pattern, so it **must be specified when a file system is created, and cannot be changed on-the-fly.**

Currently, [AWS S3](https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3-announces-increased-request-rate-performance) has already made improvements and no longer requires application-side optimization, but for other types of object storage, this option is still recommended for large-scale scenarios.| |`--shards=0`|If your object storage limits speed at the bucket level (or you're using a self-hosted object storage with limited performance), you can store the blocks into N buckets by hash of key (default: 0); when N is greater than 0, `bucket` should be in the form of `%d`, e.g. `--bucket "juicefs-%d"`. `--shards` cannot be changed afterwards and must be planned carefully ahead.| #### Management options {#format-management-options} @@ -666,12 +666,12 @@ For metadata cache description and usage, refer to [Kernel metadata cache](../gu |-|-| |`--buffer-size=300`|total read/write buffering in MiB (default: 300), see [Read/Write buffer](../guide/cache.md#buffer-size)| |`--prefetch=1`|prefetch N blocks in parallel (default: 1), see [Client read data cache](../guide/cache.md#client-read-cache)| -|`--writeback`|upload objects in background (default: false), see [Client write data cache](../guide/cache.md#writeback)| -|`--upload-delay=0`|When `--writeback` is enabled, you can use this option to add a delay to object storage upload, default to 0, meaning that upload will begin immediately after write. Different units are supported, including `s` (second), `m` (minute), `h` (hour). If files are deleted during this delay, upload will be skipped entirely, when using JuiceFS for temporary storage, use this option to reduce resource usage. Refer to [Client write data cache](../guide/cache.md#writeback).| +|`--writeback`|upload objects in background (default: false), see [Client write data cache](../guide/cache.md#client-write-cache)| +|`--upload-delay=0`|When `--writeback` is enabled, you can use this option to add a delay to object storage upload; it defaults to 0, meaning that upload will begin immediately after write. Different units are supported, including `s` (second), `m` (minute), `h` (hour). If files are deleted during this delay, upload will be skipped entirely; when using JuiceFS for temporary storage, use this option to reduce resource usage. Refer to [Client write data cache](../guide/cache.md#client-write-cache).| |`--cache-dir=value`|directory paths of local cache, use `:` (Linux, macOS) or `;` (Windows) to separate multiple paths (default: `$HOME/.juicefs/cache` or `/var/jfsCache`), see [Client read data cache](../guide/cache.md#client-read-cache)| |`--cache-mode value` 1.1 |file permissions for cached blocks (default: "0600")| |`--cache-size=102400`|size of cached object for read in MiB (default: 102400), see [Client read data cache](../guide/cache.md#client-read-cache)| -|`--free-space-ratio=0.1`|min free space ratio (default: 0.1), if [Client write data cache](../guide/cache.md#writeback) is enabled, this option also controls write cache size, see [Client read data cache](../guide/cache.md#client-read-cache)| +|`--free-space-ratio=0.1`|min free space ratio (default: 0.1), if [Client write data cache](../guide/cache.md#client-write-cache) is enabled, this option also controls write cache size, see [Client read data cache](../guide/cache.md#client-read-cache)| |`--cache-partial-only`|cache random/small read only (default: false), see [Client read data cache](../guide/cache.md#client-read-cache)| |`--verify-cache-checksum value` 1.1 |Checksum level for cache data.
Once enabled, checksums are calculated on divided parts of the cache blocks and stored on disk, then used for verification during reads. The following strategies are supported:
  • `none`: disable the integrity check; if local cache data is tampered with, bad data will be read;
  • `full` (default): verify only when a complete block is read, suitable for sequential read scenarios;
  • `shrink`: verify the slices falling within the read range, excluding the slices on the read boundary (an open interval), suitable for random read scenarios;
  • `extend`: verify the slices within the read range, including the slices on the read boundary (a closed interval), which brings a certain degree of read amplification, suitable for random read scenarios with the strictest correctness requirements.
| |`--cache-eviction value` 1.1 |cache eviction policy (none or 2-random) (default: "2-random")| diff --git a/docs/en/reference/p8s_metrics.md b/docs/en/reference/p8s_metrics.md index 5d41a1ff32e6..71c4c7c37220 100644 --- a/docs/en/reference/p8s_metrics.md +++ b/docs/en/reference/p8s_metrics.md @@ -1,27 +1,17 @@ --- title: JuiceFS Metrics sidebar_position: 4 -slug: /p8s_metrics --- -:::tip -Please see the ["Monitoring and Data Visualization"](../administration/monitoring.md) documentation to learn how to collect and display JuiceFS monitoring metrics. -::: +If you haven't yet set up monitoring for JuiceFS, read ["Monitoring and Data Visualization"](../administration/monitoring.md) to learn how. ## Global labels | Name | Description | | ---- | ----------- | | `vol_name` | Volume name | -| `mp` | Mount point path | - -:::info -When Prometheus scrapes a target, it attaches `instance` label automatically to the scraped time series which serve to identify the scraped target, and its format is `<host>:<port>`. Refer to [official document](https://prometheus.io/docs/concepts/jobs_instances) for more information. -::: - -:::info -If the monitoring metrics are reported through [Prometheus Pushgateway](https://github.com/prometheus/pushgateway) (for example, [JuiceFS Hadoop Java SDK](../administration/monitoring.md#hadoop)), the value of the `mp` label is `sdk-`, and the value of the `instance` label is the host name. -::: +| `instance` | Client host name in the format `<host>:<port>`. Refer to the [official documentation](https://prometheus.io/docs/concepts/jobs_instances) for more information | +| `mp` | Mount point path; if metrics are reported through [Prometheus Pushgateway](https://github.com/prometheus/pushgateway) (for example, [JuiceFS Hadoop Java SDK](../administration/monitoring.md#hadoop)), `mp` will be `sdk-` | ## File system diff --git a/docs/zh_cn/administration/metadata_dump_load.md b/docs/zh_cn/administration/metadata_dump_load.md index 4bf53755e615..50d19bfbb5b3 100644 --- a/docs/zh_cn/administration/metadata_dump_load.md +++ b/docs/zh_cn/administration/metadata_dump_load.md @@ -10,7 +10,7 @@ slug: /metadata_dump_load - JuiceFS v1.0.4 开始支持通过 `load` 命令恢复加密的元数据备份 ::: -JuiceFS 支持[多种元数据引擎](../reference/how_to_set_up_metadata_engine.md),且各引擎内部的数据管理格式各有不同。为了便于管理,JuiceFS 提供了 [`dump`](../reference/command_reference.md#dump) 命令允许将所有元数据以统一格式写入到 JSON 文件进行备份。同时,JuiceFS 也提供了 [`load`](../reference/command_reference.md#load) 命令,允许将备份恢复或迁移到任意元数据存储引擎。 +JuiceFS 支持[多种元数据引擎](../reference/how_to_set_up_metadata_engine.md),且各引擎内部的数据管理格式各有不同。为了便于管理,JuiceFS 提供了 [`dump`](../reference/command_reference.md#dump) 命令允许将所有元数据以统一格式写入到 JSON 文件进行备份。同时,JuiceFS 也提供了 [`load`](../reference/command_reference.md#load) 命令,允许将备份恢复或迁移到任意元数据存储引擎。这个导出导入流程也可以用来将 JuiceFS 社区版文件系统迁移到企业版(参考[企业版文档](https://juicefs.com/docs/zh/cloud/metadata_dump_load)),反之亦然。 ## 元数据备份 {#backup} diff --git a/docs/zh_cn/administration/troubleshooting.md b/docs/zh_cn/administration/troubleshooting.md index 79379cc7fcfc..ff6602e81962 100644 --- a/docs/zh_cn/administration/troubleshooting.md +++ b/docs/zh_cn/administration/troubleshooting.md @@ -79,7 +79,9 @@ $ ls -l /usr/bin/fusermount -rwsr-xr-x 1 root root 32096 Oct 30 2018 /usr/bin/fusermount ``` -## 与对象存储通信不畅(网速慢) {#io-error-object-storage} +## 读写慢与读写失败 {#read-write-error} + +### 与对象存储通信不畅(网速慢) {#io-error-object-storage} 如果无法访问对象存储,或者仅仅是网速太慢,JuiceFS 客户端也会发生读写错误。你也可以在日志中找到相应的报错。 @@ -100,7 +102,7 @@ $ ls -l /usr/bin/fusermount * 降低读写缓冲区大小,比如 [`--buffer-size=64`](../reference/command_reference.md#mount)
或者更小。当带宽充裕时,增大读写缓冲区能提升并发性能。但在低带宽场景下使用过大的读写缓冲区,`flush` 的上传时间会很长,因此容易超时。 * 默认 GET/PUT 请求超时时间为 60 秒,因此增大 `--get-timeout` 以及 `--put-timeout`,可以改善读写超时的情况。 -此外,低带宽环境下需要慎用[「客户端写缓存」](../guide/cache.md#writeback)特性。先简单介绍一下 JuiceFS 的后台任务设计:每个 JuiceFS 客户端默认都启用后台任务,后台任务中会执行碎片合并(compaction)、异步删除等工作,而如果节点网络状况太差,则会降低系统整体性能。更糟的是如果该节点还启用了客户端写缓存,则容易出现碎片合并后上传缓慢,导致其他节点无法读取该文件的危险情况: +此外,低带宽环境下需要慎用[「客户端写缓存」](../guide/cache.md#client-write-cache)特性。先简单介绍一下 JuiceFS 的后台任务设计:每个 JuiceFS 客户端默认都启用后台任务,后台任务中会执行碎片合并(compaction)、异步删除等工作,而如果节点网络状况太差,则会降低系统整体性能。更糟的是如果该节点还启用了客户端写缓存,则容易出现碎片合并后上传缓慢,导致其他节点无法读取该文件的危险情况: ```text # 由于 writeback,碎片合并后的结果迟迟上传不成功,导致其他节点读取文件报错 @@ -111,6 +113,23 @@ $ ls -l /usr/bin/fusermount 为了避免此类问题,我们推荐在低带宽节点上禁用后台任务,也就是为挂载命令添加 [`--no-bgjob`](../reference/command_reference.md#mount) 参数。 +### 警告日志:找不到对象存储块 {#warning-log-block-not-found-in-object-storage} + +规模化使用 JuiceFS 时,往往会在客户端日志中看到类似以下警告: + +``` +: fail to read sliceId 1771585458 (off:4194304, size:4194304, clen: 37746372): get chunks/0/0/1_0_4194304: oss: service returned error: StatusCode=404, ErrorCode=NoSuchKey, ErrorMessage="The specified key does not exist.", RequestId=62E8FB058C0B5C3134CB80B6 +``` + +出现这一类警告时,如果并未伴随着访问异常(比如日志中出现 `input/output error`),其实不必特意关注,客户端会自行重试,往往不影响文件访问。 + +这行警告日志的含义是:访问 slice 出错了,因为对应的某个对象存储块不存在,对象存储返回了 `NoSuchKey` 错误。出现此类异常的可能原因有下: + +* JuiceFS 客户端会异步运行碎片合并(Compaction),碎片合并完成后,文件与对象存储数据块(Block)的关系随之改变,但此时可能其他客户端正在读取该文件,因此随即报错。 +* 某些客户端开启了[「写缓存」](../guide/cache.md#client-write-cache),文件已经写入,提交到了元数据服务,但对应的对象存储 Block 却并未上传完成(比如[网速慢](#io-error-object-storage)),导致其他客户端在读取该文件时,对象存储返回数据不存在。 + +再次强调,如果并未出现应用端访问异常,则可安全忽略此类警告。 + ## 读放大 {#read-amplification} 在 JuiceFS 中,一个典型的读放大现象是:对象存储的下行流量,远大于实际读文件的速度。比方说 JuiceFS 客户端的读吞吐为 200MiB/s,但是在 S3 观察到了 2GiB/s 的下行流量。 diff --git a/docs/zh_cn/development/internals.md b/docs/zh_cn/development/internals.md index 42c1e448f566..e23d8882bd56 100644 --- a/docs/zh_cn/development/internals.md +++ b/docs/zh_cn/development/internals.md @@ -830,7 +830,7 @@ Slice{pos: 40M, id: 0, size: 24M, off: 0, len: 24M} // 实际这一段也会 ### 数据对象 -#### 对象命名 +#### 对象命名 {#object-storage-naming-format} Block 是 JuiceFS 管理数据的基本单元,其大小默认为 4 MiB,且可在文件系统格式化时配置,允许调整的区间范围为 [64 KiB, 16 MiB]。每个 Block 上传后即为对象存储中的一个对象,其命名格式为 `${fsname}/chunks/${hash}/${basename}`,其中: diff --git a/docs/zh_cn/faq.md b/docs/zh_cn/faq.md index 54e518d3ea6c..fa1eb3a67058 100644 --- a/docs/zh_cn/faq.md +++ b/docs/zh_cn/faq.md @@ -92,7 +92,7 @@ JuiceFS 不将原始文件存入对象存储,而是将其按照某个大小( 请在挂载时加上 [`--writeback` 选项](reference/command_reference.md#mount),它会先把数据写入本机的缓存,然后再异步上传到对象存储,会比直接上传到对象存储快很多倍。 -请查看[「客户端写缓存」](guide/cache.md#writeback)了解更多信息。 +请查看[「客户端写缓存」](guide/cache.md#client-write-cache)了解更多信息。 ### JuiceFS 支持分布式缓存吗? 
diff --git a/docs/zh_cn/guide/cache.md b/docs/zh_cn/guide/cache.md index 2969f398c0f5..fe478cadc216 100644 --- a/docs/zh_cn/guide/cache.md +++ b/docs/zh_cn/guide/cache.md @@ -142,7 +142,7 @@ JuiceFS 客户端会跟踪所有最近被打开的文件,要重复打开相同 从 Linux 内核 3.15 开始,FUSE 支持[内核回写(writeback-cache)](https://www.kernel.org/doc/Documentation/filesystems/fuse-io.txt)模式,内核会把高频随机小 IO(例如 10-100 字节)的写请求合并起来,显著提升随机写入的性能。但其副作用是会将顺序写变为随机写,严重降低顺序写的性能。开启前请考虑使用场景是否匹配。 -在挂载命令通过 [`-o writeback_cache`](../reference/fuse_mount_options.md) 选项来开启内核回写模式。注意,内核回写与[「客户端写缓存」](#writeback)并不一样,前者是内核中的实现,后者则发生在 JuiceFS 客户端,二者适用场景也不一样,详读对应章节以了解。 +在挂载命令通过 [`-o writeback_cache`](../reference/fuse_mount_options.md) 选项来开启内核回写模式。注意,内核回写与[「客户端写缓存」](#client-write-cache)并不一样,前者是内核中的实现,后者则发生在 JuiceFS 客户端,二者适用场景也不一样,详读对应章节以了解。 ### 客户端读缓存 {#client-read-cache} @@ -178,7 +178,7 @@ JuiceFS 客户端会把从对象存储下载的数据,以及新上传的小于 读一般有两种模式,连续读和随机读。对于连续读,一般需要较高的吞吐。对于随机读,一般需要较低的时延。当本地磁盘的吞吐反而比不上对象存储时,可以考虑启用 `--cache-partial-only`,这样一来,连续读虽然会将一整个对象块读取下来,但并不会被缓存。而随机读(例如读 Parquet 或者 ORC 文件的 footer)所读取的字节数比较小,不会读取整个对象块,此类读取就会被缓存。充分地利用了本地磁盘低时延和网络高吞吐的优势。 -### 客户端写缓存 {#writeback} +### 客户端写缓存 {#client-write-cache} 开启客户端写缓存能提升特定场景下的大量小文件写入性能,请详读本节了解。 diff --git a/docs/zh_cn/introduction/io_processing.md b/docs/zh_cn/introduction/io_processing.md index 199db16c1f17..743266e75cf5 100644 --- a/docs/zh_cn/introduction/io_processing.md +++ b/docs/zh_cn/introduction/io_processing.md @@ -48,7 +48,7 @@ JuiceFS 支持随机写,包括通过 mmap 等进行的随机写。 如果对数据一致性和可靠性没有极致要求,可以在挂载时添加 `--writeback` 以进一步提升写性能。客户端缓存开启后,Slice flush 仅需写到本地缓存目录即可返回,数据由后台线程异步上传到对象存储。换个角度理解,此时本地目录就是对象存储的缓存层。 -更详细的介绍请见[「客户端写缓存」](../guide/cache.md#writeback)。 +更详细的介绍请见[「客户端写缓存」](../guide/cache.md#client-write-cache)。 ## 读取流程 {#workflow-of-read} diff --git a/docs/zh_cn/reference/command_reference.md b/docs/zh_cn/reference/command_reference.md index 57ad94029ed7..c5de58c7a371 100644 --- a/docs/zh_cn/reference/command_reference.md +++ b/docs/zh_cn/reference/command_reference.md @@ -167,10 +167,10 @@ juicefs format sqlite3://myjfs.db myjfs --trash-days=0 |项 | 说明| |-|-| |`--block-size=4096`|块大小,单位为 KiB,默认 4096。4M 是一个较好的默认值,不少对象存储(比如 S3)都将 4M 设为内部的块大小,因此将 JuiceFS block size 设为相同大小,往往也能获得更好的性能。| -|`--compress=none`|压缩算法,支持 `lz4`、`zstd`、`none`(默认),启用压缩将不可避免地对性能产生一定影响。| +|`--compress=none`|压缩算法,支持 `lz4`、`zstd`、`none`(默认),启用压缩将不可避免地对性能产生一定影响。这两种压缩算法中,`lz4` 提供更好的性能,但压缩比要逊于 `zstd`,它们的具体性能差别需要读者自行搜索了解。| |`--encrypt-rsa-key=value`|RSA 私钥的路径,查看[数据加密](../security/encryption.md)以了解更多。| |`--encrypt-algo=aes256gcm-rsa`|加密算法 (aes256gcm-rsa, chacha20-rsa) (默认:"aes256gcm-rsa")| -|`--hash-prefix`|给每个对象添加 hash 前缀,默认为 false。| +|`--hash-prefix`|对于部分对象存储服务,如果对象存储命名路径的键值(key)是连续的,那么坐落在对象存储上的物理数据也将是连续的。在大规模顺序读场景下,这样会带来数据访问热点,让对象存储服务的部分区域访问压力过大。

启用 `--hash-prefix` 将会给每个对象路径命名添加 hash 前缀(用 slice ID 对 256 取模,详见[内部实现](../development/internals.md#object-storage-naming-format)),相当于“打散”对象存储键值,避免在对象存储服务层面创造请求热点。显而易见,由于影响着对象存储块的命名规则,该选项**必须在创建文件系统之初就指定好、不能动态修改。**

目前而言,[AWS S3](https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3-announces-increased-request-rate-performance) 已经做了优化,不再需要应用侧的随机对象前缀。而对于其他对象存储服务(比如 [COS 就在文档里推荐随机化前缀](https://cloud.tencent.com/document/product/436/13653#.E6.B7.BB.E5.8A.A0.E5.8D.81.E5.85.AD.E8.BF.9B.E5.88.B6.E5.93.88.E5.B8.8C.E5.89.8D.E7.BC.80)),如果文件系统规模庞大,则建议启用该选项以提升性能。| |`--shards=0`|如果对象存储服务在桶级别设置了限速(或者你使用自建的对象存储服务,单个桶的性能有限),可以将数据块根据名字哈希分散存入 N 个桶中。该值默认为 0,也就是所有数据存入单个桶。当 N 大于 0 时,`bucket` 需要包含 `%d` 占位符,例如 `--bucket=juicefs-%d`。`--shards` 设置无法动态修改,需要提前规划好用量。| #### 管理参数 {#format-management-options} @@ -666,12 +666,12 @@ juicefs mount redis://localhost /mnt/jfs --backup-meta 0 |-|-| |`--buffer-size=300`|读写缓冲区的总大小;单位为 MiB (默认:300)。阅读[「读写缓冲区」](../guide/cache.md#buffer-size)了解更多。| |`--prefetch=1`|并发预读 N 个块 (默认:1)。阅读[「客户端读缓存」](../guide/cache.md#client-read-cache)了解更多。| -|`--writeback`|后台异步上传对象,默认为 false。阅读[「客户端写缓存」](../guide/cache.md#writeback)了解更多。| -|`--upload-delay=0`|启用 `--writeback` 后,可以使用该选项控制数据延迟上传到对象存储,默认为 0 秒,相当于写入后立刻上传。该选项也支持 `s`(秒)、`m`(分)、`h`(时)这些单位。如果在等待的时间内数据被应用删除,则无需再上传到对象存储。如果数据只是临时落盘,可以考虑用该选项节约资源。阅读[「客户端写缓存」](../guide/cache.md#writeback)了解更多。| +|`--writeback`|后台异步上传对象,默认为 false。阅读[「客户端写缓存」](../guide/cache.md#client-write-cache)了解更多。| +|`--upload-delay=0`|启用 `--writeback` 后,可以使用该选项控制数据延迟上传到对象存储,默认为 0 秒,相当于写入后立刻上传。该选项也支持 `s`(秒)、`m`(分)、`h`(时)这些单位。如果在等待的时间内数据被应用删除,则无需再上传到对象存储。如果数据只是临时落盘,可以考虑用该选项节约资源。阅读[「客户端写缓存」](../guide/cache.md#client-write-cache)了解更多。| |`--cache-dir=value`|本地缓存目录路径;使用 `:`(Linux、macOS)或 `;`(Windows)隔离多个路径 (默认:`$HOME/.juicefs/cache` 或 `/var/jfsCache`)。阅读[「客户端读缓存」](../guide/cache.md#client-read-cache)了解更多。| |`--cache-mode value` 1.1|缓存块的文件权限 (默认:"0600")| |`--cache-size=102400`|缓存对象的总大小;单位为 MiB (默认:102400)。阅读[「客户端读缓存」](../guide/cache.md#client-read-cache)了解更多。| -|`--free-space-ratio=0.1`|最小剩余空间比例,默认为 0.1。如果启用了[「客户端写缓存」](../guide/cache.md#writeback),则该参数还控制着写缓存占用空间。阅读[「客户端读缓存」](../guide/cache.md#client-read-cache)了解更多。| +|`--free-space-ratio=0.1`|最小剩余空间比例,默认为 0.1。如果启用了[「客户端写缓存」](../guide/cache.md#client-write-cache),则该参数还控制着写缓存占用空间。阅读[「客户端读缓存」](../guide/cache.md#client-read-cache)了解更多。| |`--cache-partial-only`|仅缓存随机小块读,默认为 false。阅读[「客户端读缓存」](../guide/cache.md#client-read-cache)了解更多。| |`--verify-cache-checksum=full` 1.1|缓存数据一致性检查级别,启用 Checksum 校验后,生成缓存文件时会对数据切分做 Checksum 并记录于文件末尾,供读缓存时进行校验。支持以下级别:
  • `none`:禁用一致性检查,如果本地数据被篡改,将会读到错误数据;
  • `full`(默认):读完整数据块时才校验,适合顺序读场景;
  • `shrink`:对读范围内的切片数据进行校验,校验范围不包含读边界所在的切片(可以理解为开区间),适合随机读场景;
  • `extend`:对读范围内的切片数据进行校验,校验范围同时包含读边界所在的切片(可以理解为闭区间),因此将带来一定程度的读放大,适合对正确性有极致要求的随机读场景。
| |`--cache-eviction value` 1.1|缓存逐出策略 (none 或 2-random) (默认值:"2-random")| diff --git a/docs/zh_cn/reference/p8s_metrics.md b/docs/zh_cn/reference/p8s_metrics.md index 5de862942aab..e308e30126a7 100644 --- a/docs/zh_cn/reference/p8s_metrics.md +++ b/docs/zh_cn/reference/p8s_metrics.md @@ -1,27 +1,17 @@ --- title: JuiceFS 监控指标 sidebar_position: 4 -slug: /p8s_metrics --- -:::tip 提示 -请查看[「监控」](../administration/monitoring.md)文档了解如何收集及展示 JuiceFS 监控指标 -::: +如果你尚未搭建监控系统、收集 JuiceFS 客户端指标,阅读[「监控」](../administration/monitoring.md)文档了解如何收集这些指标以及可视化。 ## 全局标签 | 名称 | 描述 | | ---- | ----------- | | `vol_name` | Volume 名称 | -| `mp` | 挂载点路径 | - -:::info 说明 -Prometheus 在抓取监控指标时会自动附加 `instance` 标签以帮助识别不同的抓取目标,格式为 `<host>:<port>`。详见[官方文档](https://prometheus.io/docs/concepts/jobs_instances)。 -::: - -:::info 说明 -如果是通过 [Prometheus Pushgateway](https://github.com/prometheus/pushgateway) 的方式上报监控指标(例如 [JuiceFS Hadoop Java SDK](../administration/monitoring.md#hadoop)),`mp` 标签的值为 `sdk-`,`instance` 标签的值为主机名。 -::: +| `instance` | 客户端主机名,格式为 `<host>:<port>`。详见[官方文档](https://prometheus.io/docs/concepts/jobs_instances) | +| `mp` | 挂载点路径,如果是通过 [Prometheus Pushgateway](https://github.com/prometheus/pushgateway) 上报,例如 [JuiceFS Hadoop Java SDK](../administration/monitoring.md#hadoop),那么 `mp` 标签的值为 `sdk-` | ## 文件系统
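+To check the global labels described above on a live mount, the client's metrics endpoint can be queried directly. A hedged sketch, assuming the default metrics address `127.0.0.1:9567` (configurable via the `--metrics` mount option):
+
+```shell
+# Illustrative only: dump a few raw metrics from a local JuiceFS client;
+# each series carries the vol_name / mp labels listed in the tables above.
+curl -s http://127.0.0.1:9567/metrics | grep '^juicefs_' | head
+```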