
Commit: 翻译 (Translation)
vergil-lai committed Oct 20, 2024
1 parent 22bba27 commit 79f24b8
Showing 48 changed files with 1,806 additions and 1,450 deletions.
2 changes: 1 addition & 1 deletion Creating_a_cluster/Adding_a_new_node.md
@@ -1,5 +1,5 @@
# Adding a new node

To add a new node to a cluster, simply start another instance of Manticore and ensure that it is accessible by the other nodes in the cluster. Connect the new node to the rest of the cluster using a [distributed table](../Creating_a_table/Creating_a_distributed_table/Creating_a_distributed_table.md) and ensure data safety with [replication](../Creating_a_cluster/Setting_up_replication/Setting_up_replication.md).
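To make this concrete, here is a minimal sketch of wiring a new instance into a distributed table (the hostnames `existing1` and `newnode` are hypothetical):

```ini
table mydist {
    type = distributed
    # shard served by a node already in the cluster
    agent = existing1:9312:shard1
    # shard served by the newly started instance
    agent = newnode:9312:shard2
}
```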

<!-- proofread -->
15 changes: 8 additions & 7 deletions Creating_a_cluster/Remote_nodes.md
@@ -1,8 +1,8 @@
# Adding a distributed table with remote agents

To understand how to add a distributed table with remote agents, it is important to first have a basic understanding of [distributed tables](../Creating_a_table/Creating_a_distributed_table/Creating_a_distributed_table.md). In this article, we will focus on how to use a distributed table as the basis for creating a cluster of Manticore instances.
<!-- example conf dist 1 -->
Here is an example of how to split data over 4 servers, each serving one of the shards:


<!-- intro -->
@@ -19,14 +19,15 @@
```ini
table mydist {
    type = distributed
    # the agent lines are collapsed in this view; a minimal sketch assuming
    # one shard per server on hypothetical hosts box1–box4
    agent = box1:9312:shard1
    agent = box2:9312:shard2
    agent = box3:9312:shard3
    agent = box4:9312:shard4
}
```
<!-- end -->
In the event of a server failure, the distributed table will still work, but the results from the failed shard will be missing.

<!-- example conf dist 2 -->
Now that we've added mirrors, each shard is found on 2 servers. By default, the master (the searchd instance with the distributed table) will randomly pick one of the mirrors.

The mode used for picking mirrors can be set using the [ha_strategy](../Creating_a_cluster/Remote_nodes/Load_balancing.md#ha_strategy) setting. In addition to the default `random` mode there's also `ha_strategy = roundrobin`.

More advanced strategies based on latency-weighted probabilities include `noerrors` and `nodeads`. These not only take out mirrors with issues but also monitor response times and do balancing. If a mirror responds slower (for example, due to some operations running on it), it will receive fewer requests. When the mirror recovers and provides better times, it will receive more requests.

<!-- intro -->
##### ini:
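The example itself is collapsed in this view; presumably it gives each of the 4 shards a second mirror host. A minimal sketch under that assumption (hostnames box1–box8 are hypothetical):

<!-- request Example -->
```ini
table mydist {
    type = distributed
    agent = box1:9312|box5:9312:shard1
    agent = box2:9312|box6:9312:shard2
    agent = box3:9312|box7:9312:shard3
    agent = box4:9312|box8:9312:shard4
}
```
<!-- end -->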
71 changes: 36 additions & 35 deletions Creating_a_cluster/Remote_nodes/Load_balancing.md
@@ -1,59 +1,59 @@
# Load balancing

Load balancing is turned on by default for any [distributed table](../../Creating_a_table/Creating_a_distributed_table/Creating_a_distributed_table.md) that uses [mirroring](../../Creating_a_cluster/Remote_nodes/Mirroring.md). By default, queries are distributed randomly among the mirrors. You can change this behavior by using the [ha_strategy](../../Creating_a_cluster/Remote_nodes/Load_balancing.md).

## ha_strategy

```ini
ha_strategy = {random|nodeads|noerrors|roundrobin}
```

The mirror selection strategy for load balancing is optional and is set to `random` by default.

The strategy used for mirror selection, or in other words, choosing a specific [agent mirror](../../Creating_a_cluster/Remote_nodes/Mirroring.md#Agent-mirrors) in a distributed table, is controlled by this directive. Essentially, this directive controls how the master performs the load balancing between the configured mirror agent nodes. The following strategies are implemented:

### Simple random balancing

<!-- example conf balancing 1 -->
The default balancing mode is simple linear random distribution among the mirrors. This means that equal selection probabilities are assigned to each mirror. This is similar to round-robin (RR), but does not impose a strict selection order.

<!-- intro -->
##### Example:

<!-- request Example -->
```ini
ha_strategy = random
```
<!-- end -->

### Adaptive randomized balancing

The default simple random strategy does not take into account the status of mirrors, error rates, and most importantly, actual response latencies. To address heterogeneous clusters and temporary spikes in agent node load, there are a group of balancing strategies that dynamically adjust the probabilities based on the actual query latencies observed by the master.

The adaptive strategies based on **latency-weighted probabilities** work as follows:

1. Latency stats are accumulated in blocks of `ha_period_karma` seconds.
2. Latency-weighted probabilities are recomputed once per karma period.
3. The "dead or alive" flag is adjusted once per request, including ping requests.

Initially, the probabilities are equal. On every step, they are scaled by the inverse of the latencies observed during the last karma period, and then renormalized. For example, if during the first 60 seconds after the master startup, 4 mirrors had latencies of 10 ms, 5 ms, 30 ms, and 3 ms respectively, the first adjustment step would go as follows:

1. Initial percentages: 0.25, 0.25, 0.25, 0.25.
2. Observed latencies: 10 ms, 5 ms, 30 ms, 3 ms.
3. Inverse latencies: 0.1, 0.2, 0.0333, 0.333.
4. Scaled percentages: 0.025, 0.05, 0.008333, 0.0833.
5. Renormalized percentages: 0.15, 0.30, 0.05, 0.50.
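In general, writing $p_i$ for a mirror's current probability and $\ell_i$ for its latency observed during the last karma period, each update computes

$$p_i' = \frac{p_i/\ell_i}{\sum_j p_j/\ell_j},$$

which is exactly steps 4–5 above: scale by inverse latency, then renormalize so the probabilities sum to 1.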

This means that the first mirror would have a 15% chance of being chosen during the next karma period, the second one a 30% chance, the third one (slowest at 30 ms) only a 5% chance, and the fourth and fastest one (at 3 ms) a 50% chance. After that period, the second adjustment step would update those chances again, and so on.

The idea is that once the **observed latencies** stabilize, the **latency weighted probabilities** will stabilize as well. All these adjustment iterations are meant to converge at a point where the average latencies are roughly equal across all mirrors.

<!-- example conf balancing 2 -->
#### nodeads
Latency-weighted probabilities, but dead mirrors are excluded from the selection. A "dead" mirror is defined as a mirror that has resulted in multiple hard errors (e.g., network failure, no answer, etc.) in a row.

<!-- intro -->
##### Example:

<!-- request Example -->
```ini
ha_strategy = nodeads
```
<!-- end -->

@@ -64,10 +64,11 @@
<!-- example conf balancing 3 -->
#### noerrors

Latency-weighted probabilities, but mirrors with a worse error/success ratio are excluded from selection.

<!-- intro -->
##### Example:


<!-- request Example -->
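The example block is collapsed in this view; a minimal sketch of what it presumably contains, following the pattern of the other strategy examples:

```ini
ha_strategy = noerrors
```
<!-- end -->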

@@ -79,43 +80,43 @@
### Round-robin balancing

<!-- example conf balancing 4 -->
Simple round-robin selection, that is, selecting the first mirror in the list, then the second one, then the third one, etc., and then repeating the process once the last mirror in the list is reached. Unlike with the randomized strategies, RR imposes a strict querying order (1, 2, 3, ..., N-1, N, 1, 2, 3, ..., and so on) and *guarantees* that no two consecutive queries will be sent to the same mirror.

<!-- intro -->
##### Example:

<!-- request Example -->
```ini
ha_strategy = roundrobin
```
<!-- end -->

## Instance-wide options

### ha_period_karma

```ini
ha_period_karma = 2m
```

`ha_period_karma` defines the size of the agent mirror statistics window, in seconds (or with a time suffix). Optional; the default is 60.

For a distributed table with agent mirrors, the server tracks several different per-mirror counters. These counters are then used for failover and balancing. (The server picks the best mirror to use based on the counters.) Counters are accumulated in blocks of `ha_period_karma` seconds.

After beginning a new block, the master may still use the accumulated values from the previous one until the new one is half full. Thus, any previous history stops affecting the mirror choice after at most 1.5 times `ha_period_karma` seconds (90 seconds with the default of 60).

Although at most 2 blocks are used for mirror selection, up to 15 of the most recent blocks are actually stored for instrumentation purposes. They can be inspected using the `SHOW AGENT STATUS` statement.

### ha_ping_interval

```ini
ha_ping_interval = 3s
```

The `ha_ping_interval` directive defines the interval between pings sent to the agent mirrors, in milliseconds (or with a time suffix). Optional; the default value is 1000.

For a distributed table with agent mirrors, the server sends all mirrors a ping command during idle periods to track their current status (whether they are alive or dead, network roundtrip time, etc.). The interval between pings is determined by the `ha_ping_interval` setting.

If you want to disable pings, set `ha_ping_interval` to 0.
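As a one-line sketch in the config:

```ini
# disable idle pings; statistics accumulate from actual queries only
ha_ping_interval = 0
```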

<!-- proofread -->
20 changes: 9 additions & 11 deletions Creating_a_cluster/Remote_nodes/Mirroring.md
@@ -1,30 +1,28 @@
# Mirroring

[Agent](../../Creating_a_table/Creating_a_distributed_table/Remote_tables.md#agent) mirrors can be used interchangeably when processing a search query. The Manticore instance(s) hosting the distributed table where the mirrored agents are defined keeps track of mirror status (alive or dead) and response times, and performs automatic failover and load balancing based on this information.

## Agent mirrors

```ini
agent = node1|node2|node3:9312:shard2
```

The above example declares that `node1:9312`, `node2:9312`, and `node3:9312` all have a table called shard2, and can be used as interchangeable mirrors. If any of these servers go down, the queries will be distributed between the remaining two. When the server comes back online, the master will detect it and begin routing queries to all three nodes again.

A mirror may also include an individual table list, as follows:

```ini
agent = node1:9312:node1shard2|node2:9312:node2shard2
```

This works similarly to the previous example, but different table names will be used when querying different servers. For example, `node1shard2` will be used when querying `node1:9312`, and `node2shard2` will be used when querying `node2:9312`.

By default, all queries are routed to the best of the mirrors. The best mirror is selected based on recent statistics, as controlled by the [ha_period_karma](../../Server_settings/Searchd.md#ha_period_karma) config directive. The master stores metrics (total query count, error count, response time, etc.) for each agent and groups these by time spans. The karma is the length of the time span. The best agent mirror is then determined dynamically based on the last two such time spans. The specific algorithm used to pick a mirror can be configured with the [ha_strategy](../../Creating_a_cluster/Remote_nodes/Load_balancing.md#ha_strategy) directive.

The karma period is in seconds and defaults to 60 seconds. The master stores up to 15 karma spans with per-agent statistics for instrumentation purposes (see `SHOW AGENT STATUS` statement). However, only the last two spans out of these are used for HA/LB logic.

When there are no queries, the master sends a regular ping command every [ha_ping_interval](../../Creating_a_cluster/Remote_nodes/Load_balancing.md#ha_ping_interval) milliseconds in order to collect statistics and check if the remote host is still alive. `ha_ping_interval` defaults to 1000 msec. Setting it to 0 disables pings, and statistics will only be accumulated based on actual queries.
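A combined sketch of the two instance-wide directives that govern this behavior (the values are the defaults mentioned in the text):

```ini
searchd {
    # statistics window used to pick the best mirror, in seconds
    ha_period_karma = 60
    # idle ping frequency in milliseconds; 0 disables pings
    ha_ping_interval = 1000
}
```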

Example:

```ini
# sharding table over 4 servers total
# (the rest of the example is collapsed in this view; below is a minimal
# sketch assuming 2 shards, each mirrored on 2 of 4 hypothetical hosts)
table mydist {
    type = distributed
    agent = node1:9312|node2:9312:shard1
    agent = node3:9312|node4:9312:shard2
}
```