diff --git a/docs/404.html b/docs/404.html
index c9e856ac..3e767da5 100644
--- a/docs/404.html
+++ b/docs/404.html
@@ -14,7 +14,7 @@

404

PAGE NOT FOUND

But if you don't change your direction, and if you keep looking, you may end up where you are heading.
-
+
\ No newline at end of file
diff --git a/docs/assets/client_api.md.725adea3.js b/docs/assets/client_api.md.eb765a2b.js
similarity index 99%
rename from docs/assets/client_api.md.725adea3.js
rename to docs/assets/client_api.md.eb765a2b.js
index 1ed62daf..eb00cffb 100644
--- a/docs/assets/client_api.md.725adea3.js
+++ b/docs/assets/client_api.md.eb765a2b.js
@@ -1,4 +1,4 @@
-import{_ as s,o as n,c as a,Q as e}from"./chunks/framework.0882ee08.js";const F=JSON.parse('{"title":"API","description":"","frontmatter":{},"headers":[],"relativePath":"client/api.md","filePath":"client/api.md"}'),o={name:"client/api.md"},l=e(`

+import{_ as s,o as n,c as a,Q as e}from"./chunks/framework.0882ee08.js";const F=JSON.parse('{"title":"API","description":"","frontmatter":{},"headers":[],"relativePath":"client/api.md","filePath":"client/api.md"}'),o={name:"client/api.md"},l=e(`

API

The Turms client currently supports four programming languages: JavaScript, Kotlin, Swift, and Dart, exposing a consistent interface and behaving in a consistent manner. Some interface parameters may differ across languages, mainly because: 1. the interface uses parameters and syntax that are closer to the characteristics and conventions of the language in question; 2. turms-client-js has unique parameters and interfaces.

Since Turms client behavior is highly consistent across languages, if you develop your application against any one of these languages, you can easily translate the business code you have written into the other three languages without changing the code logic (see the examples at the end of this article).

External Logic Structure

  • TurmsClient: TurmsClient is the only class exposed directly to the public. A TurmsClient instance represents a session between a client and a server. The following variables are the external member variables of TurmsClient (a short usage sketch follows this list).

    • driver: TurmsClient's runtime driver. It is responsible for the basic operations such as opening and closing the connection, sending and receiving the underlying data and heartbeat control. The following service layer classes are all driver-based.

    • userService: A user-related service. It is responsible for such operations as user login, adding friends, adding relationship groups, sending/processing friend requests, querying nearby users, etc.

    • groupService: A group-related service. It is responsible for operations such as creating groups, changing group owners, modifying group members' roles, modifying group information, etc.

    • messageService: A message-related service. It is responsible for operations such as sending messages, modifying sent messages, querying various messages and their status, recalling messages, etc.

    • notificationService: A notification-related service. It is responsible for receiving and responding to business-level notifications (e.g., other users sending friend requests to the user, group members coming online or going offline, etc.). Reminder: messages are not considered business-level notifications, so notificationService does not handle user messages; user messages are handled only by messageService. The concept of "notification" in TurmsNotification in the driver refers to network-level notifications from the Turms server to the Turms client, so notificationService does not handle the underlying TurmsNotification data either.

      Addendum: You can turn the notification function on and off in real time via im.turms.server.common.infra.property.env.service.business.NotificationProperties on the Turms server.

    • storageService: A storage-related service (optional extension). It is responsible for upload and download operations of user avatars, group avatars, and message attachments. Note: This service is an extension of Turms, so if you want to use this feature, you need to integrate turms-plugin-minio or your own storage plugin into the Turms server.
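To make the structure above concrete, here is a minimal sketch of how these members hang off a single TurmsClient instance. It only illustrates the layout just described; the commented-out constructor argument is taken from the example at the end of this article.

```javascript
// A TurmsClient instance exposes the driver plus the five service members described above.
// Business code talks to the services; the services run on top of the driver.
const client = new TurmsClient(); // or new TurmsClient('ws://any-turms-gateway-server.com')

client.driver;              // connection open/close, low-level send/receive, heartbeat control
client.userService;         // login, friends, relationship groups, friend requests, nearby users
client.groupService;        // create groups, change owners, manage member roles and group info
client.messageService;      // send, modify, query, and recall messages
client.notificationService; // business-level notifications (not user messages)
client.storageService;      // avatars and message attachments (requires turms-plugin-minio or a custom storage plugin)
```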

Return Value of Methods in Services

All Turms client service methods that interact with the Turms server are written based on the asynchronous model: turms-client-js uses the Promise model, turms-client-kotlin uses the Coroutines model, turms-client-swift uses the Promise model (provided by PromiseKit), and turms-client-dart uses the Future model.

Various Services can add, delete, update and query the business data provided by Turms. You need to understand their return value types in order to develop your own business code.

Deep Dive - For Responses with Status Code 10xx

  • For methods that add business data: if the return value of a method is declared as an asynchronous model (e.g., Promise<Response<string>>), the value of the generic type (such as the string type above) is guaranteed to be non-null; otherwise a ResponseError or ResponseException with the status code INVALID_RESPONSE will be thrown, indicating that data that should exist is missing. If this error occurs, it means there is a bug causing inconsistent behavior in either the Turms server or the client.

  • Methods that delete or update business data return a Void type wrapped in the asynchronous model (e.g., Promise<Response<Void>>).

  • For methods that query business data:

    If such a method returns a List type wrapped in the asynchronous model, it returns an empty List, rather than null or undefined, when the server returns empty data.

    If the wrapped type is not a List, the method returns undefined (JavaScript), null (Kotlin), or nil (Swift) when the server returns empty data. Special case: the answerGroupQuestions method can be regarded as a query method, but its return data is never null.

Deep Dive - For Responses with Status Code Other Than 10xx

These types of responses are all regarded as "error" status responses. The methods in the Service will throw ResponseError or ResponseException through the asynchronous model, and these error or exception instances will carry a specific response status code and an error reason.
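As a rough JavaScript illustration of the two conventions above (non-null data for successful responses, thrown errors for non-10xx statuses), consider the sketch below. The method names follow the operations listed in this article, but the request payload shapes and the data/code/reason field names are assumptions for illustration; see the full examples at the end of this article for real calls.

```javascript
// A minimal sketch, assuming the response conventions described above.
async function demoResponseHandling(client) {
    try {
        // Query method wrapped in the async model: an empty result is an empty array,
        // never null/undefined (payload shape below is illustrative).
        const nearbyUsers = await client.userService.queryNearbyUsers({
            latitude: 39.9042,
            longitude: 116.4074
        });
        console.log(`found ${nearbyUsers.data.length} nearby users`);

        // "Add" method: the wrapped value (e.g., a created resource ID) is guaranteed to be non-null.
        const sentMessage = await client.messageService.sendMessage({
            targetId: '2',
            text: 'Hello Turms'
        });
        console.log(`sent message ${sentMessage.data}`);
    } catch (e) {
        // Responses with a status code other than 10xx surface as ResponseError,
        // carrying the response status code and the error reason.
        console.error(`request failed: code=${e.code}, reason=${e.reason}`);
    }
}
```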

Deep Dive - Main Interface Differences

Normally, you don't need to care about the differences between client interfaces, but if your team needs to have one developer working on the upper layers based on multiple Turms clients, or if you need to compare the similarities and differences between the upper layer client code implementations for your project, you can learn about the differences in the main interfaces between the clients.

In early Turms client implementations, the interface parameters and data models of the clients were kept as uniform as possible in configuration and meaning, such as time-related configuration parameters. However, this forced uniformity did not conform to the conventions of each target language. Also, considering that in most cases the upper-layer business code of each client has its own dedicated maintainer rather than a single developer responsible for all clients, uniform semantics bring little benefit, and the current differences follow each target language's conventions, so no mandatory uniformity is enforced.

The differences in the main interfaces of the clients are listed below.

  • Time Unit: milliseconds for the JavaScript, Kotlin, and Dart clients; TimeInterval (i.e., seconds) for the Swift client. Example: connectTimeout.
  • Response Exception Model: ResponseError (inherits from Error) for the JavaScript and Swift clients; ResponseException (inherits from RuntimeException) for the Kotlin client; ResponseException (inherits from Exception) for the Dart client.
  • Asynchronous Model: Promise for the JavaScript client; Coroutines for the Kotlin client; Promise (provided by PromiseKit) for the Swift client; Future for the Dart client.

Note: For externally exposed callbacks, the Turms Swift client does not use the delegate pattern that is common in Swift; instead, like the other language clients, it passes escaping closures as function parameters.
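For example, the connectTimeout entry above means that the same 30-second timeout is written as 30 * 1000 in the JavaScript, Kotlin, and Dart clients but as a TimeInterval of 30 in the Swift client. A hedged JavaScript sketch, assuming connectTimeout is exposed as a client option (check the client version you use):

```javascript
// JavaScript client: time values are plain milliseconds.
// (The Swift client would express the same timeout as a TimeInterval of 30 seconds.)
const client = new TurmsClient({
    connectTimeout: 30 * 1000 // option name taken from the example above; treat it as illustrative
});
```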

Understanding interfaces (Important)

The interfaces of all Turms clients are very easy to understand and use. Developers don't even need to look at what interfaces Turms clients have. They can simply deduce what interfaces Turms will have based on basic IM business knowledge.

Developers generally only need to remember:

  • Create a Turms client instance through new TurmsClient(...)
  • As mentioned in the previous section on External Logic Structure, the Turms client is divided into five services: userService (related to user), groupService (related to group), messageService (related to message), notificationService (related to notification), and storageService (related to storage, optional).

Afterwards, based on business knowledge, we can infer what interfaces the Turms client will have, such as:

  • A user first needs to log in, so we naturally think of userService, the user-related service. Since it is about logging in, we look for a login method and naturally find the client.userService.login(...) method.
  • After logging in, the user needs to be able to send messages. We then think of messageService, the message-related service, and look for a method similar to sendMessage, which leads us to the client.messageService.sendMessage(...) method.
  • Since we can send messages, what method can we use to listen for received messages? It is still message-related, so we again think of messageService and consider methods like onMessage, subscribeMessage, or addMessageListener. Looking through the code, we find the client.messageService.addMessageListener(...) method.
  • Since we can listen for received messages, how do we listen for received notifications? It is notification-related, so we naturally think of notificationService. Since the method for listening for received messages is called addMessageListener, the method for listening for notifications should be addNotificationListener, which leads us to the client.notificationService.addNotificationListener(...) method (see the sketch below).
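Putting the inferred method names from the list above into a single sketch (the listener and request payload shapes are illustrative; the runnable per-language versions are in the examples at the end of this article):

```javascript
// The calls below follow directly from the inference above.
async function quickStart(client) {
    client.messageService.addMessageListener(message => {
        console.log('new message:', message);
    });
    client.notificationService.addNotificationListener(notification => {
        console.log('new notification:', notification);
    });

    await client.userService.login({ userId: '1', password: '123' });          // payload shape is illustrative
    await client.messageService.sendMessage({ targetId: '2', text: 'Hello' }); // payload shape is illustrative
}
```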

In summary, developers generally only need basic business knowledge to infer the interfaces provided by the Turms client, and do not even need to read the source code of the Turms client.

For advanced developers, the Turms client also provides a driver for implementing relatively low-level operations. In addition, as mentioned in the section on "Session Lifecycle", the Turms client is intentionally designed to be clear and easy to understand, and deliberately does not provide operations such as automatic reconnection or automatic routing, because on one hand developers can easily implement such logic themselves, and on the other hand such "hidden" internal logic can make it difficult for upper-layer developers to control the low-level driver behavior and can sometimes become a stumbling block.

Examples

The following examples include four versions, turms-client-js/kotlin/swift/dart, with equivalent functionality. They cover the following business operations: client initialization, login, listening for session disconnection (going offline), listening for notifications, listening for new messages, querying nearby users, sending messages, and creating groups.

Server-side Preparation before Trying Examples

  • Option 1: There is no need to build Turms servers locally; users connect directly to the turms-gateway on Playground via the client API (WebSocket endpoint: http://playground.turms.im:10510; TCP endpoint: http://playground.turms.im:11510). However, remember to upgrade the local client to the latest version in time to avoid data inconsistencies caused by server-side interface updates.
  • Option 2: Update the following configuration in the application.yaml configuration file (see the snippet after this list):
    1. Set turms.gateway.session.enable-authentication to false (disable user login authentication)
    2. Set turms.service.message.allow-sending-messages-to-stranger to true (allow users without a relationship to send messages to each other)
  • Option 3: Use the built-in dev profile, which already contains the above configuration. By default, the profile of application.yaml in the Turms distribution package is empty, i.e. the default profile is not dev, so you need to set it to dev manually.
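For Option 2, the two properties map to application.yaml roughly as follows; this is a sketch of the nesting implied by the property names above, not an excerpt from the distribution package.

```yaml
# application.yaml (Option 2): the two settings described above
turms:
  gateway:
    session:
      enable-authentication: false             # disable user login authentication
  service:
    message:
      allow-sending-messages-to-stranger: true # allow users without a relationship to message each other
```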

Code example

javascript
// Initialize client
 const client = new TurmsClient(); // new TurmsClient('ws://any-turms-gateway-server.com');
 
 // Listen to the offline event
diff --git a/docs/assets/client_api.md.725adea3.lean.js b/docs/assets/client_api.md.eb765a2b.lean.js
similarity index 100%
rename from docs/assets/client_api.md.725adea3.lean.js
rename to docs/assets/client_api.md.eb765a2b.lean.js
diff --git a/docs/assets/client_turms-chat-demo.md.3a23e9e1.js b/docs/assets/client_turms-chat-demo.md.3a23e9e1.js
new file mode 100644
index 00000000..479c6304
--- /dev/null
+++ b/docs/assets/client_turms-chat-demo.md.3a23e9e1.js
@@ -0,0 +1 @@
+import{_ as e,o as t,c as o,Q as i}from"./chunks/framework.0882ee08.js";const g=JSON.parse('{"title":"Turms Chat Demo","description":"","frontmatter":{},"headers":[],"relativePath":"client/turms-chat-demo.md","filePath":"client/turms-chat-demo.md"}'),a={name:"client/turms-chat-demo.md"},n=i('

Turms Chat Demo

Background

Initially, our plan was to let users reuse existing XMPP clients by making turms-gateway support the XMPP protocol. However, both paid and free XMPP clients are generally of low quality, mainly for the following reasons:

  1. Most XMPP client projects have poor code quality; in particular, many early client engineers had weak coding skills and often mixed complex UI logic with business logic (e.g., the famous open-source project JMeter), which makes redevelopment difficult. It is better to rewrite them from scratch.
  2. Both commercial and open-source XMPP clients have UI designs at an amateur level. If a client project lacks a professional UI, we doubt the capabilities of its frontend engineers and UI designers (a competent intermediate frontend engineer should be capable of designing a single product's UI independently), and we do not recommend that users adopt their solutions.
  3. Hardly any open-source XMPP client supports a complete cross-platform solution.
  4. Many low-quality XMPP clients even require payment.

Considering that developing a cross-platform IM application is not difficult and mainly involves manual work, and that IM application UI and functionality are highly generic (researching 10 commercial IM applications on the market would reveal that at least 9 of them have similar UI and functionality), we decided to first provide the IM client demo turms-chat-demo-flutter for Turms users to use or redevelop, and to support the XMPP protocol later.

Roadmap

  • November-December 2023: Complete desktop UI design; set up Flutter project framework; develop and test basic desktop components; complete Windows UI development and testing.
  • December 2023-January 2024: Adapt the UI for macOS; develop and test basic mobile components; complete Android UI development and testing.
  • January-February 2024: Adapt the UI for iOS.
  • February-March 2024: Develop the UI for the web.
  • March-April 2024: Integrate turms-client-dart and implement IM business logic (the above tasks only involve UI development and testing, excluding business logic).

Note:

  • Considering other tasks, holidays, and work situations at Turms, the above timeline may be subject to slight changes.
  • There is no plan to support mini programs.

Introduction

We want to emphasize the term demo in the project name. This term mainly has the following meanings:

  1. Whether from a product perspective or a technical perspective, this client "demo" is just one of the "possible" solutions. Users should not limit their ability to design their own IM products because of this "demo." Especially, do not assume that Turms' server is customized for this "demo." As repeatedly mentioned in the Turms documentation, Turms is a generic IM solution dedicated to solving various IM scenarios.
  2. Prepare for users' further development. This mainly involves three aspects:
    1. Separation of UI and business logic. This allows teams that require redevelopment to reuse the UI and implement their own business logic.
    2. We continue to use the permissive Apache 2.0 license instead of the more restrictive GPL license commonly used in client open-source projects.
    3. Since the UI design of IM applications worldwide is very similar, this demo also implements most of the generic UI and logic for IM and generally does not provide more customized logic, in order to make redevelopment by other teams easier.

Note: demo does not imply "low quality." Readers will understand this by examining the code quality and UI design later.

',13),s=[n];function r(l,c,d,m,u,h){return t(),o("div",null,s)}const f=e(a,[["render",r]]);export{g as __pageData,f as default};
diff --git a/docs/assets/client_turms-chat-demo.md.3a23e9e1.lean.js b/docs/assets/client_turms-chat-demo.md.3a23e9e1.lean.js
new file mode 100644
index 00000000..e867c373
--- /dev/null
+++ b/docs/assets/client_turms-chat-demo.md.3a23e9e1.lean.js
@@ -0,0 +1 @@
+import{_ as e,o as t,c as o,Q as i}from"./chunks/framework.0882ee08.js";const g=JSON.parse('{"title":"Turms Chat Demo","description":"","frontmatter":{},"headers":[],"relativePath":"client/turms-chat-demo.md","filePath":"client/turms-chat-demo.md"}'),a={name:"client/turms-chat-demo.md"},n=i("",13),s=[n];function r(l,c,d,m,u,h){return t(),o("div",null,s)}const f=e(a,[["render",r]]);export{g as __pageData,f as default};
diff --git a/docs/assets/server_deployment_config.md.68734049.js b/docs/assets/server_deployment_config.md.7619e646.js
similarity index 99%
rename from docs/assets/server_deployment_config.md.68734049.js
rename to docs/assets/server_deployment_config.md.7619e646.js
index e21704b0..a6dc2902 100644
--- a/docs/assets/server_deployment_config.md.68734049.js
+++ b/docs/assets/server_deployment_config.md.7619e646.js
@@ -34,7 +34,7 @@ import{_ as t,o as e,c as d,Q as r}from"./chunks/framework.0882ee08.js";const g=
     --health-retries=3 \\
     --health-start-period=60s \\
     -v <your-jvm-options-file-path>:/opt/turms/turms-gateway/config/jvm.options:ro \\
-    ghcr.io/turms-im/turms-gateway
    +  ghcr.io/turms-im/turms-gateway
  • If using Docker Compose, you can use something like:

    shell
    TURMS_GATEWAY_JVM_CONF=<your-jvm-options-file-path> docker compose -f docker-compose.standalone.yml up --force-recreate
    powershell
    $env:TURMS_GATEWAY_JVM_CONF=<your-jvm-options-file-path>;docker compose -f docker-compose.standalone.yml up --force-recreate

    Note: The above `TURMS_GATEWAY_JVM_CONF` path points to a path inside the image, not a path on the host. If you want to use a configuration file on the host machine, you need to modify the `docker-compose.standalone.yml` configuration file to use Docker's mount mechanism, such as:

    yaml
    turms-gateway:
       volumes:
         - <your-jvm-options-file-path>:/opt/turms/turms-gateway/config/jvm.options:ro
    diff --git a/docs/assets/server_deployment_config.md.68734049.lean.js b/docs/assets/server_deployment_config.md.7619e646.lean.js
    similarity index 100%
    rename from docs/assets/server_deployment_config.md.68734049.lean.js
    rename to docs/assets/server_deployment_config.md.7619e646.lean.js
    diff --git a/docs/assets/zh-CN_client_api.md.b7e0577a.js b/docs/assets/zh-CN_client_api.md.294bb9cc.js
    similarity index 99%
    rename from docs/assets/zh-CN_client_api.md.b7e0577a.js
    rename to docs/assets/zh-CN_client_api.md.294bb9cc.js
    index 61046002..65c0b7ec 100644
    --- a/docs/assets/zh-CN_client_api.md.b7e0577a.js
    +++ b/docs/assets/zh-CN_client_api.md.294bb9cc.js
    @@ -1,4 +1,4 @@
    -import{_ as s,o as n,c as a,Q as l}from"./chunks/framework.0882ee08.js";const d=JSON.parse('{"title":"接口","description":"","frontmatter":{},"headers":[],"relativePath":"zh-CN/client/api.md","filePath":"zh-CN/client/api.md"}'),p={name:"zh-CN/client/api.md"},o=l(`

    +import{_ as s,o as n,c as a,Q as l}from"./chunks/framework.0882ee08.js";const d=JSON.parse('{"title":"接口","description":"","frontmatter":{},"headers":[],"relativePath":"zh-CN/client/api.md","filePath":"zh-CN/client/api.md"}'),p={name:"zh-CN/client/api.md"},o=l(`

API

The Turms client currently supports four programming languages: JavaScript, Kotlin, Swift, and Dart, exposing a consistent interface and behaving in a consistent manner. Some interface parameters may differ across languages, mainly because: 1. the interface uses parameters and syntax that are closer to the characteristics and conventions of the language in question; 2. turms-client-js has unique parameters and interfaces.

Since Turms client behavior is highly consistent across languages, if you develop your application against any one of these languages, you can easily translate the business code you have written into the other three languages without changing the code logic (see the examples at the end of this article).

External Logic Structure of the Client

• TurmsClient: the only class the Turms client exposes directly to the public. A TurmsClient instance represents a session between a client and a server. The following variables are the external member variables of TurmsClient.

  • driver: TurmsClient's runtime driver. It is responsible for basic operations such as opening and closing the connection, sending and receiving the underlying data, and heartbeat control. The Service layer classes introduced below all run on top of the driver.

  • userService: the user-related service. It is responsible for operations such as user login, adding friends, adding relationship groups, sending/handling friend requests, querying nearby users, etc.

  • groupService: the group-related service. It is responsible for operations such as creating groups, changing the group owner, modifying group members' roles, modifying group information, etc.

  • messageService: the message-related service. It is responsible for operations such as sending messages, modifying sent messages, querying various messages and their status, recalling messages, etc.

  • notificationService: the notification-related service. It is responsible for receiving and responding to business-level notifications (e.g., other users sending friend requests to the user, group members coming online or going offline, etc.). Reminder: messages are not considered business-level notifications, so notificationService does not handle user messages; user messages are handled only by messageService. The concept of "notification" in TurmsNotification in the driver refers to network-level notifications from the Turms server to the Turms client, so notificationService does not handle the underlying TurmsNotification data either.

    Addendum: you can turn the notification function on and off in real time via im.turms.server.common.infra.property.env.service.business.NotificationProperties on the Turms server.

  • storageService: the storage-related service (optional extension). It is responsible for uploading and downloading user avatars, group avatars, and message attachments. Addendum: this service is an extension of Turms, so if you want to use this feature, you need to integrate turms-plugin-minio or your own storage plugin into the Turms server.

Return Value of Methods in Services

All Turms client interfaces that interact with the Turms server are written based on the asynchronous model: turms-client-js uses the Promise model, turms-client-kotlin uses the Coroutines model, and turms-client-swift uses the Promise model (provided by PromiseKit).

The various Services can add, delete, update, and query the business data provided by Turms. You need to understand their return value types in order to develop your own business code.

Deep Dive - For Responses with Status Code 10xx

• For methods that add business data: if the return value of a method is declared as an asynchronous model (e.g., Promise<Response<string>>), the value of the generic type (such as the string type above) is guaranteed to be non-null; otherwise a ResponseError or ResponseException with the status code INVALID_RESPONSE is thrown, indicating that data that should exist is missing. If this error occurs, it means there is a bug causing inconsistent behavior in either the Turms server or the client itself.

• Methods that delete or update business data return a Void type wrapped in the asynchronous model (e.g., Promise<Response<Void>>).

• For methods that query business data:

  If such a method returns a List type wrapped in the asynchronous model, it returns an empty List, rather than null or undefined, when the server returns empty data.

  If the wrapped type is not a List, the method returns undefined (JavaScript), null (Kotlin), or nil (Swift) when the server returns empty data. Special case: the answerGroupQuestions method can be regarded as a query method, but its return data is never null.

Deep Dive - For Responses with a Status Code Other Than 10xx

These responses are all regarded as "error" status responses. The methods in the Services throw a ResponseError or ResponseException through the asynchronous model, and these error or exception instances carry a specific response status code and an error reason.

Deep Dive - Main Interface Differences

Normally, you do not need to care about the differences between the client interfaces, but if your team needs one developer to build the upper layers on top of multiple Turms clients, or if you need to compare the upper-layer client code implementations of your project, you can learn about the main interface differences between the clients.

In early Turms client implementations, the interface parameters and data models of the clients were kept as uniform as possible in configuration and meaning, such as time-related parameters. However, this forced uniformity did not conform to the conventions of each target language. Also, considering that in most cases the upper-layer business code of each client has its own dedicated maintainer rather than a single developer responsible for all clients, uniform semantics bring little benefit, and the current differences follow each target language's conventions, so no mandatory uniformity is enforced.

The main interface differences between the clients are as follows:

• Time Unit: milliseconds for the JavaScript, Kotlin, and Dart clients; TimeInterval (i.e., seconds) for the Swift client. Example: connectTimeout.
• Response Exception Model: ResponseError (inherits from Error) for the JavaScript and Swift clients; ResponseException (inherits from RuntimeException) for the Kotlin client; ResponseException (inherits from Exception) for the Dart client.
• Asynchronous Model: Promise for the JavaScript client; Coroutines for the Kotlin client; Promise (provided by PromiseKit) for the Swift client; Future for the Dart client.

Addendum: for externally exposed callbacks, the Turms Swift client does not use the delegate pattern that is common in Swift; instead, like the other language clients, it passes escaping closures as function parameters.

Understanding the Interfaces (Important)

The interfaces of all Turms clients are very easy to understand and use. Developers do not even need to look at what interfaces the Turms clients have; they can deduce what interfaces Turms will have from basic IM business knowledge alone.

Developers generally only need to remember:

• Create a Turms client instance via new TurmsClient(...)
• As mentioned above in External Logic Structure of the Client, the Turms client is divided into five services: userService (user-related), groupService (group-related), messageService (message-related), notificationService (notification-related), and storageService (storage-related, optional extension).

Afterwards, we can infer from business knowledge what interfaces the Turms client will have, for example:

• A user first needs to be able to log in, so we think of userService, the user-related service. Since it is about logging in, we look for a login method and naturally find the client.userService.login(...) method.
• After logging in, the user needs to be able to send messages, so we think of messageService, the message-related service, look for a method similar to sendMessage, and find the client.messageService.sendMessage(...) method.
• Since we can send messages, what method can we use to listen for received messages? It is still message-related, so we again think of messageService and guess the method might be onMessage, subscribeMessage, or addMessageListener; looking through the code, we find client.messageService.addMessageListener(...).
• Since we can listen for received messages, how do we listen for received notifications? It is notification-related, so we think of notificationService, and since the method for listening for received messages is called addMessageListener, the method for listening for notifications should be addNotificationListener, which leads us to client.notificationService.addNotificationListener.

In summary, developers generally only need basic business knowledge to infer the interfaces provided by the Turms client, and do not even need to read the Turms client source code.

For advanced developers, the Turms client also exposes the driver object so that they can implement relatively low-level operations themselves. In addition, as mentioned in Session Lifecycle, the Turms client is deliberately designed to be clear and easy to understand, and deliberately does not provide operations such as automatic reconnection or automatic routing, because on one hand developers can easily implement such logic themselves, and on the other hand such "hidden" internal logic makes it hard for upper-layer developers to control the low-level driver behavior and can sometimes become a stumbling block.

Examples

The following examples include four versions, turms-client-js/kotlin/swift/dart, with equivalent functionality. They cover the following business operations: client initialization, login, listening for session disconnection (going offline), listening for notifications, listening for new messages, querying nearby users, sending messages, and creating groups.

Server-side Preparation before Trying the Examples

• Option 1: There is no need to build Turms servers locally; users connect directly to the turms-gateway on Playground via the client API (WebSocket endpoint: http://playground.turms.im:10510; TCP endpoint: http://playground.turms.im:11510). However, remember to upgrade the local client to the latest version in time to avoid data inconsistencies caused by server-side interface updates.
• Option 2: Update the following configuration in the application.yaml configuration file:
  1. Set turms.gateway.session.enable-authentication to false (disable user login authentication)
  2. Set turms.service.message.allow-sending-messages-to-stranger to true (allow users without a relationship to send messages to each other)
• Option 3: Use the built-in dev profile, which already contains the above configuration. By default, the profile of application.yaml in the Turms distribution package is empty, i.e. the default profile is not dev, so you need to set it to dev manually.

Code example

    javascript
    // Initialize client
     const client = new TurmsClient(); // new TurmsClient('ws://any-turms-gateway-server.com');
     
     // Listen to the offline event
    diff --git a/docs/assets/zh-CN_client_api.md.b7e0577a.lean.js b/docs/assets/zh-CN_client_api.md.294bb9cc.lean.js
    similarity index 100%
    rename from docs/assets/zh-CN_client_api.md.b7e0577a.lean.js
    rename to docs/assets/zh-CN_client_api.md.294bb9cc.lean.js
    diff --git a/docs/assets/zh-CN_client_turms-chat-demo.md.f52ce343.js b/docs/assets/zh-CN_client_turms-chat-demo.md.f52ce343.js
    new file mode 100644
    index 00000000..c391c90a
    --- /dev/null
    +++ b/docs/assets/zh-CN_client_turms-chat-demo.md.f52ce343.js
    @@ -0,0 +1 @@
    +import{_ as e,o as a,c as o,Q as t}from"./chunks/framework.0882ee08.js";const _=JSON.parse('{"title":"Turms Chat Demo","description":"","frontmatter":{},"headers":[],"relativePath":"zh-CN/client/turms-chat-demo.md","filePath":"zh-CN/client/turms-chat-demo.md"}'),l={name:"zh-CN/client/turms-chat-demo.md"},i=t('

Turms Chat Demo

Background

Initially, we planned to let users reuse existing XMPP clients by making turms-gateway support the XMPP protocol. However, both paid and free XMPP clients are generally of low quality, mainly for the following reasons:

1. Most XMPP client projects have poor code quality; in particular, many early client engineers had weak coding skills and even mixed complex UI logic with business logic (e.g., the famous open-source project JMeter), so redevelopment is harder than rewriting from scratch.
2. The UI design of both commercial and open-source clients is basically at hobbyist level. If a client project lacks a professional UI, we doubt the capabilities of its frontend engineers and UI designers (a team with one competent intermediate frontend engineer should be able to design the UI of a single product independently), and we do not recommend that users adopt their solutions.
3. Almost no open-source XMPP client supports a complete cross-platform solution.
4. Many low-quality XMPP clients even charge for use.

Considering that providing an IM application across desktop and mobile is not difficult and is mainly manual work, and that IM application UI and functionality are highly generic (survey 10 commercial IM applications on the market and you will find that at least 9 of them have basically similar UI and functionality), we decided to first provide the IM client demo turms-chat-demo-flutter for Turms users to use or redevelop, and to support the XMPP protocol later.

Roadmap

• November-December 2023: complete the desktop UI design; set up the Flutter project framework; develop and test basic desktop components; complete Windows desktop UI development and testing.
• December 2023-January 2024: adapt the UI for macOS; develop and test basic mobile components; complete Android UI development and testing.
• January-February 2024: adapt the UI for iOS.
• February-March 2024: develop the web UI.
• March-April 2024: integrate turms-client-dart and implement the IM business logic (the tasks above cover only UI development and testing, not business logic).

Note:

• Considering other tasks, holidays, and the work situation at Turms, the above timeline may change slightly.
• There is no plan to support mini programs.

Introduction

We want to emphasize the word demo in the project name. It mainly carries the following meanings:

1. Whether from a product or a technical perspective, this client demo is only one possible solution. Users should not limit their own IM product design because of this demo, and in particular should not assume that the Turms server is customized for this demo. As repeatedly mentioned in the Turms documentation, Turms is a generic IM solution dedicated to solving various IM scenarios.
2. Prepare for users' further development. This mainly involves three aspects:
  1. Separation of UI and business logic, so that teams that need redevelopment can reuse the UI and implement their own business logic.
  2. We continue to use the permissive Apache 2.0 license instead of the more restrictive GPL license commonly used in open-source client projects.
  3. Since the UI design of IM applications worldwide is very similar, this demo also implements most of the generic IM UI and logic and generally does not provide more customized logic, in order to make redevelopment by other teams easier.

Note: demo does not mean "low quality," as readers will see from the code quality and UI design later.

',13),d=[i];function r(c,m,s,h,n,u){return a(),o("div",null,d)}const p=e(l,[["render",r]]);export{_ as __pageData,p as default};
diff --git a/docs/assets/zh-CN_client_turms-chat-demo.md.f52ce343.lean.js b/docs/assets/zh-CN_client_turms-chat-demo.md.f52ce343.lean.js
new file mode 100644
index 00000000..f206ed74
--- /dev/null
+++ b/docs/assets/zh-CN_client_turms-chat-demo.md.f52ce343.lean.js
@@ -0,0 +1 @@
+import{_ as e,o as a,c as o,Q as t}from"./chunks/framework.0882ee08.js";const _=JSON.parse('{"title":"Turms Chat Demo","description":"","frontmatter":{},"headers":[],"relativePath":"zh-CN/client/turms-chat-demo.md","filePath":"zh-CN/client/turms-chat-demo.md"}'),l={name:"zh-CN/client/turms-chat-demo.md"},i=t("",13),d=[i];function r(c,m,s,h,n,u){return a(),o("div",null,d)}const p=e(l,[["render",r]]);export{_ as __pageData,p as default};
diff --git a/docs/assets/zh-CN_server_deployment_config.md.ebee420e.js b/docs/assets/zh-CN_server_deployment_config.md.f5b194c8.js
similarity index 99%
rename from docs/assets/zh-CN_server_deployment_config.md.ebee420e.js
rename to docs/assets/zh-CN_server_deployment_config.md.f5b194c8.js
index 20199644..b49599d4 100644
--- a/docs/assets/zh-CN_server_deployment_config.md.ebee420e.js
+++ b/docs/assets/zh-CN_server_deployment_config.md.f5b194c8.js
@@ -34,7 +34,7 @@ import{_ as t,o as e,c as d,Q as s}from"./chunks/framework.0882ee08.js";const g=
     --health-retries=3 \\
     --health-start-period=60s \\
     -v <your-jvm-options-file-path>:/opt/turms/turms-gateway/config/jvm.options:ro \\
-    ghcr.io/turms-im/turms-gateway
    +  ghcr.io/turms-im/turms-gateway
  • If using Docker Compose, you can use something like:

    shell
    TURMS_GATEWAY_JVM_CONF=<your-jvm-options-file-path> docker compose -f docker-compose.standalone.yml up --force-recreate
    powershell
    $env:TURMS_GATEWAY_JVM_CONF=<your-jvm-options-file-path>;docker compose -f docker-compose.standalone.yml up --force-recreate

    Note: The above TURMS_GATEWAY_JVM_CONF path points to a path inside the image, not a path on the host. If you want to use a configuration file on the host machine, you need to modify the docker-compose.standalone.yml configuration file to use Docker's mount mechanism, such as:

    yaml
    turms-gateway:
       volumes:
         - <your-jvm-options-file-path>:/opt/turms/turms-gateway/config/jvm.options:ro
    diff --git a/docs/assets/zh-CN_server_deployment_config.md.ebee420e.lean.js b/docs/assets/zh-CN_server_deployment_config.md.f5b194c8.lean.js
    similarity index 100%
    rename from docs/assets/zh-CN_server_deployment_config.md.ebee420e.lean.js
    rename to docs/assets/zh-CN_server_deployment_config.md.f5b194c8.lean.js
    diff --git a/docs/client/api.html b/docs/client/api.html
    index 6152e1e9..fbd8132b 100644
    --- a/docs/client/api.html
    +++ b/docs/client/api.html
    @@ -12,12 +12,12 @@
         
         
         
    -    
    +    
         
         
       
       
    -    
    +    
    Skip to content

    API

    Turms client currently supports four programming languages, JavaScript, Kotlin, Swift and Dart, exposing a consistent interface and behaving in a consistent manner. Some interface parameters may be inconsistent across languages, mainly due to: 1. The interface uses parameters and syntax that are closer to current language characteristics and conventions; 2. Unique parameters and interfaces of turms-client-js.

    Since Turms client behavior is highly consistent across languages, you can easily translate your written business code into the other three languages without changing the code logic (see the examples at the end of this article) if you develop your application based on either language.

    External Logic Structure

    • TurmsClient: TurmsClient is the only class exposed directly to the public. A TurmsClient instance represents a session between a client and a server. The following variables are the external member variables of TurmsClient.

      • driver: TurmsClient's runtime driver. It is responsible for the basic operations such as opening and closing the connection, sending and receiving the underlying data and heartbeat control. The following service layer classes are all driver-based.

      • userService: A user-related service. It is responsible for such operations as user login, adding friends, adding relationship groups, sending/processing friend requests, querying nearby users, etc.

      • groupService: A group-related service. It is responsible for operations such as creating groups, changing group owners, modifying group members' roles, modifying group information, etc.

      • messageService:A message-related service. It is responsible for operations such as sending messages, modifying sent messages, querying various messages and their status, recalling messages, etc.

      • notificationService: A notification-related service. It is responsible for receiving and responding to business-level notifications (e.g., other users sending friend requests to the user, group members going up and down, etc.). Reminder: messages are not considered as business-level notifications, so notificationService does not handle user messages, and user messages are only handled by messageService. The concept of "notification" in TurmsNotification in driver refers to the notification from the Turms server to the Turms client at the network level, so the notificationService does not handle the underlying TurmsNotification data.

        Addendum: You can change the notification function on and off in real-time at im.turms.server.common.infra.properties.env.service.business.NotificationProperties on the Turms server.

      • storageService: A storage-related service (optional extension). It is responsible for upload and download operations of user avatars, group avatars and message attachments. Note: This service is an extension of turms, so if you want to use this feature, you need to integrate turms-plugin-minio or your own storage plugin into the Turms server.

    Return Value of Methods in Services

    All Turms client service methods that interact with the Turms server are written based on the asynchronous model. turms-client-js uses the Promise model, turms-client-kotlin uses the Coroutines model, and turms-client-swift uses the Promise model (provided by PromiseKit).

    Various Services can add, delete, update and query the business data provided by Turms. You need to understand their return value types in order to develop your own business code.

    Deep Dive - For Responses with Status Code 10xx

• For methods that add business data: if the return value is declared as an asynchronous model (e.g., Promise<Response<string>>), the value of the generic type (the string in the preceding example) must never be null; otherwise, a ResponseError or ResponseException with the status code INVALID_RESPONSE is thrown, indicating that data that should exist is missing. If this error occurs, there is an inconsistency bug in the behavior of either the Turms server or the client.

• For methods that delete or update business data: they return a Void type wrapped by the asynchronous model (e.g., Promise<Response<Void>>).

• For methods that query business data:

  If the method returns a List type wrapped by the asynchronous model, it returns an empty List, instead of null or undefined, when the server returns no data.

  If the wrapped type is not a List, the method returns undefined (JavaScript), null (Kotlin), or nil (Swift) when the server returns no data. Special case: the answerGroupQuestions method can be counted as a query method, but its return data is never null.

    Deep Dive - For Responses with Status Code Other Than 10xx

    These types of responses are all regarded as "error" status responses. The methods in the Service will throw ResponseError or ResponseException through the asynchronous model, and these error or exception instances will carry a specific response status code and an error reason.
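For illustration, a minimal turms-client-js sketch of handling the two kinds of responses described above; the sendMessage parameters shown here are illustrative assumptions, not the authoritative signature:

javascript
try {
    // Assuming `client` is a logged-in TurmsClient instance
    const response = await client.messageService.sendMessage({
        isGroupMessage: false,
        targetId: '2',
        text: 'Hello Turms'
    });
    // Status code 10xx: for methods that add data, the generic value must not be null
    console.log('message sent, response data:', response.data);
} catch (e) {
    // Status codes other than 10xx surface as a ResponseError (or ResponseException in other clients)
    // carrying the specific response status code and an error reason
    console.error('request failed:', e);
}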

    Deep Dive - Main Interface Differences

    Normally, you don't need to care about the differences between client interfaces, but if your team needs to have one developer working on the upper layers based on multiple Turms clients, or if you need to compare the similarities and differences between the upper layer client code implementations for your project, you can learn about the differences in the main interfaces between the clients.

    In early Turms client implementations, the interface parameters and data model between the clients were kept as uniform as possible in terms of configuration and meaning, such as time-related configuration parameters. However, this forced uniformity was written in a way that did not conform to the target language conventions. Also, considering that in most cases, the upper-level business code of each client usually has a dedicated person in charge of it, rather than all by one developer, the uniform meaning is not significant, and these differences are also in line with the target language habits, so no mandatory uniformity is made.

    The differences in the main interfaces of the clients are listed below.

• Time unit (e.g., connectTimeout): the JavaScript, Kotlin, and Dart clients consistently use milliseconds; the Swift client uses TimeInterval (i.e., seconds).
• Response exception model: ResponseError (inherits from Error) in the JavaScript client; ResponseException (inherits from RuntimeException) in the Kotlin client; ResponseError (inherits from Error) in the Swift client; ResponseException (inherits from Exception) in the Dart client.
• Asynchronous model: Promise in the JavaScript client; Coroutines in the Kotlin client; Promise (provided by PromiseKit) in the Swift client; Future in the Dart client.

Note: For externally exposed callbacks, the Turms Swift client does not use the delegate pattern common in Swift; instead, like the other language clients, it takes escaping closures as function parameters.

    Understanding interfaces (Important)

    The interfaces of all Turms clients are very easy to understand and use. Developers don't even need to look at what interfaces Turms clients have. They can simply deduce what interfaces Turms will have based on basic IM business knowledge.

    Developers generally only need to remember:

    • Create a Turms client instance through new TurmsClient(...)
    • As mentioned in the previous section on External Logic Structure, the Turms client is divided into five services: userService (related to user), groupService (related to group), messageService (related to message), notificationService (related to notification), and storageService (related to storage, optional).

    Afterwards, based on business knowledge, we can infer what interfaces the Turms client will have, such as:

• If a user needs to log in first, we naturally think of the userService related to users. Since it is "logging in," we look for a login method and naturally find the client.userService.login(...) method.
• After logging in, the user needs to be able to send messages. We then think of the messageService related to messages and look for a method similar to sendMessage, which leads us to the client.messageService.sendMessage(...) method.
• Since we can send messages, what method can we use to listen for received messages? Since this is still related to messages, we again think of the messageService and consider methods like onMessage, subscribeMessage, or addMessageListener. Looking through the code, we find the client.messageService.addMessageListener(...) method.
• If we can listen for received messages, how do we listen for received notifications? Since this is related to notifications, we naturally think of the notificationService. Since the method for listening for received messages is called addMessageListener, the method for listening for notifications should be addNotificationListener, which leads us to the client.notificationService.addNotificationListener(...) method.

    In summary, developers generally only need basic business knowledge to infer the interfaces provided by the Turms client, and do not even need to read the source code of the Turms client.
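As a sketch of this inference in JavaScript; the parameter shapes are illustrative assumptions rather than the authoritative signatures:

javascript
const client = new TurmsClient();

client.messageService.addMessageListener(message => {
    console.log('received a message', message);
});
client.notificationService.addNotificationListener(notification => {
    console.log('received a notification', notification);
});

await client.userService.login({ userId: '1', password: '123' });
await client.messageService.sendMessage({ isGroupMessage: false, targetId: '2', text: 'hi' });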

For advanced developers, the Turms client also provides a driver for relatively low-level operations. In addition, as mentioned in the "Session Lifecycle" section, the Turms client is intentionally designed to be clear and easy to understand: it deliberately does not provide operations such as automatic reconnection or automatic routing, because on one hand developers can easily implement such logic themselves, and on the other hand such "hidden" internal logic can make it difficult for upper-level developers to control low-level driver behavior and can sometimes become a stumbling block.

    Examples

The following examples include four equivalent versions for turms-client-js/kotlin/swift/dart. They cover the following business operations: initializing the client, logging in, listening for session disconnections (going offline), listening for notifications, listening for new messages, querying nearby users, sending messages, and creating groups.

    Server-side Preparation before Trying Examples

• Option 1: No need to build Turms servers locally; clients connect directly to turms-gateway on Playground via the client API (WebSocket endpoint: http://playground.turms.im:10510; TCP endpoint: http://playground.turms.im:11510). However, make sure to upgrade the local client to the latest version in time to avoid inconsistent data caused by server-side interface updates.
• Option 2: Update the following configuration in the application.yaml configuration file (see the snippet after this list):
  1. Set turms.gateway.session.enable-authentication to false (disable user login authentication)
  2. Set turms.service.message.allow-sending-messages-to-stranger to true (allow users without a relationship to send messages to each other)
• Option 3: Use the built-in dev profile. The dev profile provided by Turms already includes the above configuration. By default, the profile of application.yaml in the Turms distribution package is empty, i.e., the default profile is not dev, so you need to set it to dev manually.
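For reference, a sketch of the Option 2 changes in application.yaml; the property names come from the list above, while the nested layout is simply the usual expansion of those dotted keys:

yaml
turms:
  gateway:
    session:
      enable-authentication: false # disable user login authentication
  service:
    message:
      allow-sending-messages-to-stranger: true # allow users without a relationship to message each other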

    Code example

    javascript
    // Initialize client
     const client = new TurmsClient(); // new TurmsClient('ws://any-turms-gateway-server.com');
     
     // Listen to the offline event
...
             intro: 'nope'))
         .data;
     print('group $groupId has been created');

    Communication Protocol Used Between Client and Server

    Data Format

    For general requests and responses:

    • Client based on the pure TCP protocol: varint-encoded payload length + payload (Protobuf-encoded TurmsNotification or TurmsRequest).
    • Client based on the WebSocket protocol: payload (Protobuf-encoded TurmsNotification or TurmsRequest). The byte length of the payload is transmitted through the underlying WebSocket frame.
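As an illustration of the TCP framing described above, here is a minimal JavaScript sketch (not the actual turms-client-js implementation) that prefixes a Protobuf-encoded payload with its varint-encoded length:

javascript
// Encode a non-negative integer as a varint (LEB128), as used for the payload length
function encodeVarint(value) {
    const bytes = [];
    while (value > 0x7F) {
        bytes.push((value & 0x7F) | 0x80);
        value >>>= 7;
    }
    bytes.push(value);
    return Uint8Array.from(bytes);
}

// Frame a Protobuf-encoded TurmsRequest for the pure TCP transport:
// varint-encoded payload length followed by the payload bytes
function frameTcpMessage(payload) {
    const header = encodeVarint(payload.length);
    const frame = new Uint8Array(header.length + payload.length);
    frame.set(header, 0);
    frame.set(payload, header.length);
    return frame;
}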

    For heartbeat requests:

• Client based on the pure TCP protocol: a one-byte array [0]. The 0 is the varint-encoded payload length, i.e., the payload is 0 bytes, so a heartbeat is a single byte.
    • Client based on the WebSocket protocol: a binary message with an empty body (0 bytes).
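Building on the frameTcpMessage sketch above, the heartbeat formats follow directly:

javascript
// TCP heartbeat: a 0-byte payload, whose varint-encoded length is the single byte 0
const tcpHeartbeat = frameTcpMessage(new Uint8Array(0)); // Uint8Array [ 0 ]

// WebSocket heartbeat: an empty binary message
// (assuming `ws` is an open WebSocket instance)
// ws.send(new ArrayBuffer(0));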

    Note: The reasons why Turms does not implement heartbeat through WebSocket's PING/PONG are:

• Browsers send WebSocket PING frames at different, implementation-defined intervals.
• Upper-layer code cannot control PING/PONG behavior, or even observe that it is happening.
    • The heartbeat logic at the network level should not be coupled with the heartbeat at the application layer.

    Metrics

    Reference: Observability System

    Network Connection Metrics

    Each client of Turms will provide metrics related to the network connection. Developers can get the metrics through turmsClient.driver.connectionMetrics. This object contains the following data:

• addressResolverTime (milliseconds): the domain name resolution time. turms-client-js does not provide this data.
• connectTime (milliseconds): for non-turms-client-js clients, the time spent in the TCP handshake; for turms-client-js clients, the total time of domain name resolution, TCP handshake, TLS handshake, and establishment of the WebSocket connection.
• tlsHandshakeTime (milliseconds): the TLS handshake time. turms-client-js/swift does not provide this data.
• dataReceived (bytes): for non-turms-client-js clients, the number of bytes received over the TCP connection; for turms-client-js clients, the number of bytes of the binary frames received over the WebSocket connection.
• dataSent (bytes): for non-turms-client-js clients, the number of bytes sent over the TCP connection; for turms-client-js clients, the number of bytes of the binary frames sent over the WebSocket connection.
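For example, a minimal sketch of reading these metrics from turms-client-js, assuming turmsClient is an existing TurmsClient instance that has already connected:

javascript
const metrics = turmsClient.driver.connectionMetrics;
console.log(`connectTime: ${metrics.connectTime} ms`);
console.log(`dataReceived: ${metrics.dataReceived} bytes, dataSent: ${metrics.dataSent} bytes`);
// addressResolverTime and tlsHandshakeTime are not provided by turms-client-js (see the list above)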

    Business Request Metrics

    TODO


    Quick Start

    1. Clone the Turms repository (currently none of the client code is released to any public dependency repository). Reference command: git clone --depth 1 https://github.com/turms-im/turms.git

    2. In your project, import the corresponding client as follows:

      • For projects using turms-client-js:

        First go to the directory of the turms-client-js subproject, and execute the command npm run quickbuild, which will install the dependencies and compile the release package of the turms-client-js. Then:

        • For projects using modules:
          • Installation: Add under dependencies of package.json: "turms-client-js": "file:<YOUR_OWN_PATH>/turms-client-js"
          • Use: import Turms client through import TurmsClient from 'turms-client-js'
    • For projects that do not use modules: add to your HTML: <script type="text/javascript" src="<YOUR_OWN_PATH>/turms-client-js/dist/turms-client.iife.js"></script>, and use the global object TurmsClient directly.
      • For projects using turms-client-kotlin:

        • Installation: In the directory of the turms-client-kotlin subproject, execute the command mvn clean install, which will compile turms-client-kotlin and install its JAR file to the local Maven repository.

        • Usage:

          • For Maven projects, add:

            xml
            <dependency>
                         <groupId>im.turms</groupId>
                         <artifactId>turms-client-kotlin</artifactId>
                         <version>0.10.0-SNAPSHOT</version>
     </dependency>

...
                         path: <YOUR_OWN_DIR>/turms_client_dart
                    dependencies:
                       turms_client_dart:
                         path: <YOUR_OWN_DIR>/turms_client_dart
                • Write business logic code
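For example, a minimal sketch of this last step for a project that imports turms-client-js as a module (installed as described in step 2); the login parameters are illustrative:

javascript
import TurmsClient from 'turms-client-js';

const client = new TurmsClient(); // or pass your own turms-gateway WebSocket URL
await client.userService.login({ userId: '1', password: '123' });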


            Version Requirements

            The minimum requirements for the version of the Turms client are mainly based on three factors: the global market share of the platform, the minimum supported version of TLSv1.2 on the platform, and the elegance of code implementation. In addition, Turms does not provide official support for obsolete protocols such as TLSv1 and TLSv1.1.

• Android: 21+. Considering the market share of Android 21+ and the elegance of the code implementation, 21+ is supported.
• iOS: 12.0+. Considering the global market share of iOS 12.0+ and the habits of Apple product users, turms-client-swift adopts NWConnection to implement the TCP protocol, so the device version requirements are equivalent to those of devices supporting NWConnection. In addition, turms-client-swift will not consider using the ancient CFStreamCreatePairWithSocketToHost to implement the TCP protocol.
• Browser: any browser that supports the WebSocket protocol. For IE browsers, turms-client-js only provides official support for IE 11. Also, turms-client-js will not downgrade WebSocket to polling.
• Desktop: turms-client-kotlin (JDK 8+) and turms-client-js (Node.js 8+). If you use turms-client-kotlin, JDK 8+ is required because JDK 8+ provides support for TLSv1.2 by default. Turms provides official support for Node.js 8+ if you use turms-client-js.

            Note:

• turms-client-kotlin uses Socket instead of SocketChannel, mainly because the Android SDK does not provide a standard TLS implementation for SocketChannel, which would otherwise have to be implemented by hand. Considering the variety of Android systems and their limited system functions (especially compared to server-side environments), a hand-rolled TLS implementation could easily lead to various unexpected bugs, so Socket is used together with the official TLS implementation.

            Session Lifecycle

The session lifecycle of the Turms client is relatively easy to understand. Specifically: first set up a network-layer connection through driver.connect(...), then log in at the business level through userService.login(...); after a successful login, the corresponding session is established. Finally, the session close notification is sent to the server through the userService.logout(...) method, and the network-layer connection is also closed.

To keep the logic simple, and to make it convenient for upper-level developers to compose various logics themselves, Turms does not provide operations such as automatic reconnection or automatic routing: on one hand, developers can easily implement such logic themselves; on the other hand, such "hidden" internal logic can make it difficult for upper-level developers to control low-level driver behavior and can sometimes become a stumbling block.

            Note: Similar to the session close mechanism based on the close frame in WebSocket, when Turms server closes a session, it also notifies the client that the session has been closed through a session close signal, and after the signal is flushed, it notifies the underlying WebSocket/TCP to close the connection. Turms server does not need to wait for any response from the client regarding the session close signal, and the client does not send any response to the server regarding the session close signal.

            Lifecycle Callback Hooks

• Network layer: driver.addOnConnectedListener. Invoked when the network-layer connection is established. Usually you don't need to add connection event listeners through addOnConnectedListener; instead, run custom code after the successful asynchronous execution of driver.connect(...).
• Network layer: driver.addOnDisconnectedListener. Invoked when the network-layer connection is disconnected.
• Business logic layer: userService.addOnOnlineListener. Invoked when the session is established, i.e., when the user logs in. Usually you don't need to add online event listeners through addOnOnlineListener; instead, run custom code after the successful asynchronous execution of userService.login(...).
• Business logic layer: userService.addOnOfflineListener. Invoked when the session is disconnected, i.e., when the user logs out.
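Putting the lifecycle and hooks together, a minimal turms-client-js sketch (parameter shapes are illustrative assumptions):

javascript
const client = new TurmsClient();

// Business logic layer: invoked when the session is disconnected
client.userService.addOnOfflineListener(info => {
    console.log('session closed', info);
});

await client.driver.connect();                                      // network layer: open the connection
await client.userService.login({ userId: '1', password: '123' });  // business layer: establish the session
// ... exchange requests, messages, and notifications ...
await client.userService.logout();                                  // notify the server and close the connection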

            Turms Chat Demo

            Background

Initially, our plan was to let users reuse existing XMPP clients by making turms-gateway support the XMPP protocol. However, both paid and free XMPP clients are generally of low quality, mainly for the following reasons:

1. Most XMPP client projects have poor code quality, especially early ones whose client engineers lacked solid coding skills and often mixed complex UI logic with business logic (e.g., the famous open-source project JMeter), which makes redevelopment difficult; it is better to rewrite them from scratch.
2. Both commercial and open-source XMPP clients have UI designs that are at an amateur level. If a client project lacks a professional UI, we doubt the capabilities of its frontend engineers and UI designers (a competent intermediate frontend engineer should be able to design a single product UI independently), and we do not recommend that users adopt its solution.
            3. There is hardly any open-source XMPP client that supports a complete cross-platform solution.
            4. Many low-quality XMPP clients even require payment.

Considering that developing a cross-platform IM application is not difficult and mainly involves manual work, and that IM application UIs and functionalities are highly generic (researching 10 commercial IM applications on the market would reveal that at least 9 of them have similar UIs and functionalities), we decided to first provide the IM client demo turms-chat-demo-flutter for Turms users to use or redevelop. We will support the XMPP protocol later.

            Roadmap

            • November-December 2023: Complete desktop UI design; set up Flutter project framework; develop and test basic desktop components; complete Windows UI development and testing.
            • December 2023-January 2024: Adapt the UI for MacOS; develop and test basic mobile components; complete Android UI development and testing.
            • January-February 2024: Adapt the UI for iOS.
            • February-March 2024: Develop the UI for the web.
            • March-April 2024: Integrate turms-client-dart and implement IM business logic (the above tasks only involve UI development and testing, excluding business logic).

            Note:

            • Considering other tasks, holidays, and work situations at Turms, the above timeline may be subject to slight changes.
            • There is no plan to support mini programs.

            Introduction

            We want to emphasize the term demo in the project name. This term mainly has the following meanings:

            1. Whether from a product perspective or a technical perspective, this client "demo" is just one of the "possible" solutions. Users should not limit their ability to design their own IM products because of this "demo." Especially, do not assume that Turms' server is customized for this "demo." As repeatedly mentioned in the Turms documentation, Turms is a generic IM solution dedicated to solving various IM scenarios.
            2. Prepare for users' further development. This mainly involves three aspects:
              1. Separation of UI and business logic. This allows teams that require redevelopment to reuse the UI and implement their own business logic.
              2. We continue to use the permissive Apache 2.0 license instead of the more restrictive GPL license commonly used in client open-source projects.
              3. Since the UI design of IM applications worldwide is very similar, this demo will also implement most of the generic UI and logic for IM. It generally does not provide more customized logic to facilitate redevelopment by other teams.

            Note: demo does not imply "low quality." Readers will understand this by examining the code quality and UI design later.


            turms-client-js Shared Context

            Background

The Turms server does not support, and does not plan to support, a user establishing multiple sessions on the same platform at the same time. Therefore, if a user opens multiple tabs in the browser and tries to log in with the same user ID and device type, one and only one session can be established successfully; from the browser's perspective, one and only one tab can log in successfully. This behavior is suitable for general social applications.

            Application Scenarios

However, some instant messaging scenarios need to support the following: from the user's perspective, the user only needs to log in once on one page, and the clients in the other tabs are then logged in as well. The Turms clients in all tabs should be able to send and receive requests, messages, and notifications under the same user identity. This is suitable for scenarios such as customer service systems.

            To support the above scenarios, a Shared Context needs to be used. Specifically, for Turms clients of the same domain (same protocol; same domain name; same port), same user ID, and same device type in different tabs, they can share the WebSocket connection with the Turms server and logged-in user information.

            Note: Only Turms clients with the same domain, user ID, and device type can share context. Therefore, your client can log in with different user identities in different tabs to support features such as "some tabs share A user's session, while others share B user's session."

            Usage

            turms-client-js does not enable the shared context by default, but if your application needs to use this feature, you can enable it by passing the parameter useSharedContext: true when creating a TurmsClient instance as follows:

            javascript
            var client = new TurmsClient({
                  useSharedContext: true
             });
// ...
                 userId: 1,
                 password: "123",
                 deviceType: DeviceType.ANDROID
});

            Community

            FAQ

            Why does Issues use English?

            The fundamental reason: Issues are written in a single language to facilitate searching. During the use of Issues, encountering open source projects that use multiple languages is the most troublesome because when searching for a problem, such as "How is the blocklist mechanism implemented in the Turms server", for bilingual projects, we usually need to search for both "黑名单" and "blocklist" keywords. In other words, at least two searches are needed to ensure that all related Issues are found, resulting in a poor user search experience. However, if Issues are only in English, users only need to search for the "blocklist" keyword.

            Secondary reason: using English facilitates global open source and promotion, while using non-English languages goes against our open source philosophy.

            In addition, we do not exclude users from submitting Issues in non-English languages, but encourage them to use English more often. However, we will always reply in English.

            Why are There no QQ Groups, WeChat Groups, Slack Channels, or Other Groups?

Using various chat groups for issue management and discussion is a very bad practice; issue management should be done primarily through GitHub's Issues, for the following reasons:

            • Issues allows for focused discussion on a single issue
            • It is easy for later users to search for issues
            • Developers can do task tracking through Issues
            • Users can view the progress of various tasks through Issues, open and transparent

            However, various groups cannot achieve the above functions. On the contrary, various groups are a manifestation of closed project information and go against the purpose of open source. Some open source projects will intentionally block the flow of information to earn consultation or service fees, but this is not the purpose of Turms.

            In practice, groups and even video conferences are more often used for quick discussions among developers internally, especially in the early stages of drafting, but the final results of the discussion and the key issues involved are still recorded in Issues or documents to facilitate users and developers to understand the ins and outs of a problem.

            Can I Ask "Newbie Questions"?

            There are no so-called "newbie questions" in the Turms project, only "questions related to the Turms project" and "questions unrelated to the Turms project." Everyone may appear "not very professional" when they encounter a new field, and as newcomers, we hope that there will be more goodwill and tolerance from people in this field. Similarly, as long as it is a question related to the Turms project, we will reply. And when encountering "basic questions", we usually think not "this question is terrible," but "can we add some documents, or optimize the documents to provide more guidance to new users". Therefore, users do not need to worry about asking so-called "newbie questions."

            In addition, there is an attitude problem. As long as everyone respects each other, any question can be discussed. The common unacceptable attitudes are: 1. Not reading the documentation, not checking Issues first, and not willing to think before asking directly; 2. Condescending.

            Of course, learning how to ask questions is also a very interesting thing. For details, please refer to "How To Ask Questions The Smart Way".

            Can Responses Generated by a Model Similar to ChatGPT be Used for Discussion?

ChatGPT is an excellent memorizer, but its analysis of various technical solutions is quite naive. Engaging in discussions with ChatGPT-generated responses only reflects a lack of critical thinking and a lack of responsibility towards the project. Therefore, whether we answer such responses depends on how much substance remains after the ChatGPT-generated content is removed.

            Let me mention why we pay so much attention to the issue of "attitude." In fact, engineers with work experience have probably had similar experiences: their work depends on the cooperation of other teams. Although certain tasks may be technically simple, they can become stalled due to the laziness and negative cooperation of other team members, making progress on their own projects extremely difficult. Therefore, in projects that require team collaboration, addressing technically manageable issues on one's own is usually the easiest part, while motivating and coordinating various project teams to work together and complete tasks by the deadline is the most challenging and demanding aspect.

            Some engineers without work experience might consider technical expertise as the primary survival skill for engineers. However, a responsible attitude is actually the most critical survival skill in the workplace or community (of course, if someone is genuinely responsible for a project, their technical skills won't be lacking either). Apart from specific domains, for most projects, the technical competence displayed by most qualified engineers is quite similar. The real differentiation lies in their level of dedication and responsibility towards a project.

            Therefore, to demonstrate a responsible attitude, please refrain from directly using ChatGPT generated responses to participate in discussions.

            How to Identify Responses Generated by a Model Similar to ChatGPT

            1. The writing style generated by GPT is often too apparent and can be manually recognized.
            2. Use the open-source model from Hugging Face, Hello-SimpleAI/chatgpt-detector-roberta, to detect responses generated by ChatGPT-like models online.
            3. Even as GPT continues to develop and display more diverse writing styles, there are now many pre-trained language models and various corpora available. Therefore, it's possible to train a new model to detect GPT-like responses based on transfer learning. This process can be relatively fast, taking just one day, or slower, taking 2-3 days.

            About Upstream First

            Directly interacting with the open source community and solving problems at the source is called upstream first.

            For Turms, upstream first mainly involves two aspects: communication and code feedback.

• Communication: Before implementing a feature or fixing a bug, it is best to open an issue on GitHub in advance. Some features may seem common and easy to implement, yet Turms has not implemented them. This is often because such a seemingly simple feature involves many details, such as:

              • Are there any other related or extended requirements for this requirement?
              • Can this requirement be implemented in this way? Can all related features be implemented in this way? Does the code implementation need to be separate? Is the code implementation universal? Can this template implement almost all related requirements?
              • Can it be implemented in both single-machine and distributed scenarios?
              • From a different business perspective or technical perspective, is there a better design and implementation?

              Therefore, a "seemingly" simple requirement may involve a large amount of requirement analysis and technical analysis. If developers silently implement some features locally, they will face a series of issues mentioned above when giving back the code. If major design problems are discovered during the implementation at this time, some previous efforts may be wasted (of course, there are still gains, at least knowing that "there is room for optimization in the current solution"). Therefore, when facing complex features, developers should be mentally prepared for "design may be overturned repeatedly."

              To minimize this situation, when designing and implementing complex features, it is best for developers to initiate a new discussion in Issue, so as to reduce the number of times of design being overturned and save developers' time and effort.

              Note: Sometimes, even if the design is completed in advance, more ingenious designs may be discovered during the implementation, and the more complex the function, the more design iterations it usually involves. However, these "overturned/half-overturned" iterations are best discussed and developed repeatedly before the code is released, rather than discovering them after the code is released.

Note: Because of the complexity of requirements, many seemingly simple issues on GitHub Issues are left "pending". Many feature-related issues are just seeds: developers still need to do more detailed requirement analysis, design, and coding, and the most difficult part is usually the requirement analysis, which must clarify "what needs to be done" while considering both current and future requirements and avoiding over-design. This is also why the Turms documentation mentions several times that "the design and implementation of IM business functions are far more difficult than the design and implementation of technology middleware".

            • Reduce your maintenance costs and facilitate the continuous merging of upstream updates. If a developer forks the Turms project for complex secondary development, they will face a long-term maintenance problem: if the developer wants to use upstream's new code, they need to constantly adapt their own branch, and the faster upstream Turms server updates, the greater the developer's adaptation workload. There may even be logical conflicts that the developer is not aware of.

On the contrary, if developers give the code back to upstream, such problems will not occur, because we will not only maintain the contributed code together, but also consider whether new designs are consistent with the contributed code when designing other new related functional modules for Turms.

            • Reduce maintenance conflicts and avoid overturning local implementations repeatedly. Developers may have added some new features or fixed some bugs locally, but have not given back. After a period of time, developers may find that upstream considers the functionality they have implemented to be more thoughtful and complete, and the bug fixes are more ingenious (readers can read about the difficulty of Turms server-side bugs in Task Difficulty). Ultimately, developers have to revert all their original work, then re-pull upstream and start over again. The workload among them is painful to think about, and the more developers change locally, the more conflicts there may be.

            About Contacting Turms Author for Private Chat and Custom Development

            If readers' teams are interested in doing redevelopment themselves, they can directly refer to the article on Redevelopment.

            For users who wish to pay Turms' author for custom development, it's worth noting that Turms' author generally only accepts unpaid development for common needs (yes, generally, only unpaid development for the community). The reason for this is quite simple; Turms' author doesn't lack money, and even if the Turms project incurs a loss of several tens of thousands of Chinese yuan every year, we can still ensure the continuous operation of the Turms project because we never intended to profit from it in the first place. So, either we will only accept a very high offer that's hard to refuse, or we will only accept unpaid development for the community.

            Therefore, unless you are prepared to offer a very high price, it's not advisable to try to contact Turms' author for custom development. If you genuinely want Turms' author to prioritize fulfilling your requirements, you can describe your needs clearly and post them in Issues, and then we will schedule them based on the cost-effectiveness of the requirements and your respect for the requirements you've proposed.

            Of course, if you are even willing to pay a high fee for custom development to Turms' author, I also recommend considering commercial solutions directly, even though their development level, work attitude, and work responsibility are probably not as good as Turms' author. Of course, this mainly depends on which country and company's solution you decide to adopt.

            Compared to free development, custom development differs in the following aspects:

            • A complete, phased project schedule will be provided, including design, development, testing, delivery, and so on.

• Assistance with designing requirements. Readers might wonder why they would need Turms' author to design requirements if they already want custom development. This is much like what Henry Ford said: "If I had asked people what they wanted, they would have said faster horses." What users ask for may not be what they truly need, and having insight into users' real needs is one of the essential skills of an engineer.

            • Guaranteed fixed working hours. During this time, only project-related custom design, development, testing, deployment, and addressing various questions will be done.

              Of course, all of the above is done by Turms' author during their off-hours.

            If some users are concerned that Turms' author might intentionally slow down the development and release progress of the features they want due to not having paid, this won't happen either, because Turms' author doesn't lack money and doesn't intend to profit from open-source, so there's no motivation to intentionally delay.

            - +
            Skip to content

            Community

            FAQ

            Why does Issues use English?

            The fundamental reason: Issues are written in a single language to facilitate searching. During the use of Issues, encountering open source projects that use multiple languages is the most troublesome because when searching for a problem, such as "How is the blocklist mechanism implemented in the Turms server", for bilingual projects, we usually need to search for both "黑名单" and "blocklist" keywords. In other words, at least two searches are needed to ensure that all related Issues are found, resulting in a poor user search experience. However, if Issues are only in English, users only need to search for the "blocklist" keyword.

            Secondary reason: using English facilitates global open source and promotion, while using non-English languages goes against our open source philosophy.

            In addition, we do not exclude users from submitting Issues in non-English languages, but encourage them to use English more often. However, we will always reply in English.

            Why are There no QQ Groups, WeChat Groups, Slack Channels, or Other Groups?

            Using various groups for issues management and discussion is a very bad practice, and issues management should have been prioritized using GitHub's Issues. The reasons for this are as follows.

            • Issues allows for focused discussion on a single issue
            • It is easy for later users to search for issues
            • Developers can do task tracking through Issues
            • Users can view the progress of various tasks through Issues, open and transparent

            However, various groups cannot achieve the above functions. On the contrary, various groups are a manifestation of closed project information and go against the purpose of open source. Some open source projects will intentionally block the flow of information to earn consultation or service fees, but this is not the purpose of Turms.

            In practice, groups and even video conferences are more often used for quick discussions among developers internally, especially in the early stages of drafting, but the final results of the discussion and the key issues involved are still recorded in Issues or documents to facilitate users and developers to understand the ins and outs of a problem.

            Can I Ask "Newbie Questions"?

            There are no so-called "newbie questions" in the Turms project, only "questions related to the Turms project" and "questions unrelated to the Turms project." Everyone may appear "not very professional" when they encounter a new field, and as newcomers, we hope that there will be more goodwill and tolerance from people in this field. Similarly, as long as it is a question related to the Turms project, we will reply. And when encountering "basic questions", we usually think not "this question is terrible," but "can we add some documents, or optimize the documents to provide more guidance to new users". Therefore, users do not need to worry about asking so-called "newbie questions."

            In addition, there is an attitude problem. As long as everyone respects each other, any question can be discussed. The common unacceptable attitudes are: 1. Not reading the documentation, not checking Issues first, and not willing to think before asking directly; 2. Condescending.

            Of course, learning how to ask questions is also a very interesting thing. For details, please refer to "How To Ask Questions The Smart Way".

            Can Responses Generated by a Model Similar to ChatGPT be Used for Discussion?

            ChatGPT is an excellent memorizer, but its analysis of various technical solutions is quite naive. Engaging in discussions with ChatGPT responses only reflects a lack of critical thinking and a lack of responsibility towards the projects. Therefore, whether we should answer such responses depends on the proportion of responses after removing ChatGPT answers.

            Let me mention why we pay so much attention to the issue of "attitude." In fact, engineers with work experience have probably had similar experiences: their work depends on the cooperation of other teams. Although certain tasks may be technically simple, they can become stalled due to the laziness and negative cooperation of other team members, making progress on their own projects extremely difficult. Therefore, in projects that require team collaboration, addressing technically manageable issues on one's own is usually the easiest part, while motivating and coordinating various project teams to work together and complete tasks by the deadline is the most challenging and demanding aspect.

            Some engineers without work experience might consider technical expertise as the primary survival skill for engineers. However, a responsible attitude is actually the most critical survival skill in the workplace or community (of course, if someone is genuinely responsible for a project, their technical skills won't be lacking either). Apart from specific domains, for most projects, the technical competence displayed by most qualified engineers is quite similar. The real differentiation lies in their level of dedication and responsibility towards a project.

            Therefore, to demonstrate a responsible attitude, please refrain from directly using ChatGPT generated responses to participate in discussions.

            How to Identify Responses Generated by a Model Similar to ChatGPT

            1. The writing style generated by GPT is often too apparent and can be manually recognized.
            2. Use the open-source model from Hugging Face, Hello-SimpleAI/chatgpt-detector-roberta, to detect responses generated by ChatGPT-like models online.
            3. Even as GPT continues to develop and display more diverse writing styles, there are now many pre-trained language models and various corpora available. Therefore, it's possible to train a new model to detect GPT-like responses based on transfer learning. This process can be relatively fast, taking just one day, or slower, taking 2-3 days.

            About Upstream First

            Directly interacting with the open source community and solving problems at the source is called upstream first.

            For Turms, upstream first mainly involves two aspects: communication and code feedback.

            • Communication: Before doing a feature or fixing a bug, it is best to open an issue on GitHub in advance. Some features may seem common and easy to implement, but Turms currently does not have them implemented. It's possible that this seemingly simple feature often involves many details, such as:

              • Are there any other related or extended requirements for this requirement?
  • Can this requirement be implemented in a particular way? Can all related features be implemented in the same way? Should the code be implemented separately, or can it be made generic so that one template covers almost all related requirements?
              • Can it be implemented in both single-machine and distributed scenarios?
              • From a different business perspective or technical perspective, is there a better design and implementation?

  Therefore, a "seemingly" simple requirement may involve a large amount of requirement analysis and technical analysis. If developers silently implement some features locally, they will face the series of issues mentioned above when contributing the code back. If major design problems are discovered during the implementation at this point, some of the previous effort may be wasted (of course, there are still gains, at least the knowledge that "there is room for optimization in the current solution"). Therefore, when facing complex features, developers should be mentally prepared for the design being overturned repeatedly.

  To minimize this situation, when designing and implementing complex features, it is best for developers to open a new discussion in GitHub Issues first, so as to reduce the number of times a design gets overturned and to save their own time and effort.

  Note: Sometimes, even if the design is completed in advance, more ingenious designs may be discovered during the implementation, and the more complex the feature, the more design iterations it usually involves. However, these "overturned/half-overturned" iterations are best discussed and worked through repeatedly before the code is contributed, rather than being discovered after the code has been contributed.

  Note: Because of the complexity of requirements, many seemingly simple issues on GitHub remain pending. Many feature-related issues are just seeds: developers still need to do more detailed requirement analysis, design, and coding, and the hardest part is usually the requirement analysis, which has to clarify "what needs to be done" while considering both current and future requirements and avoiding over-design. This is also why the Turms documentation mentions several times that "the design and implementation of IM business functions are far more difficult than the design and implementation of technology middleware".

            • Reduce your maintenance costs and facilitate the continuous merging of upstream updates. If a developer forks the Turms project for complex secondary development, they will face a long-term maintenance problem: if the developer wants to use upstream's new code, they need to constantly adapt their own branch, and the faster upstream Turms server updates, the greater the developer's adaptation workload. There may even be logical conflicts that the developer is not aware of.

  On the contrary, if developers contribute the code back upstream, such problems will not occur, because we will not only maintain the contributed code together, but also take it into account when designing other new related functional modules for Turms so that the designs stay consistent.

• Reduce maintenance conflicts and avoid repeatedly overturning local implementations. Developers may have added some new features or fixed some bugs locally without contributing them back. After a while, they may find that upstream has implemented the same functionality more thoughtfully and completely, and fixed the bugs more elegantly (readers can read about the difficulty of Turms server-side bugs in Task Difficulty). Ultimately, developers have to revert all their original work, then re-pull upstream and start over. The wasted effort is painful to contemplate, and the more developers change locally, the more conflicts there may be.

            About Contacting Turms Author for Private Chat and Custom Development

            If readers' teams are interested in doing redevelopment themselves, they can directly refer to the article on Redevelopment.

            For users who wish to pay Turms' author for custom development, it's worth noting that Turms' author generally only accepts unpaid development for common needs (yes, generally, only unpaid development for the community). The reason for this is quite simple; Turms' author doesn't lack money, and even if the Turms project incurs a loss of several tens of thousands of Chinese yuan every year, we can still ensure the continuous operation of the Turms project because we never intended to profit from it in the first place. So, either we will only accept a very high offer that's hard to refuse, or we will only accept unpaid development for the community.

Therefore, unless you are prepared to offer a very high price, it is not advisable to contact Turms' author for custom development. If you genuinely want Turms' author to prioritize your requirements, describe your needs clearly and post them in Issues, and we will schedule them based on the cost-effectiveness of the requirements and how seriously you treat the requirements you have proposed.

Of course, if you are willing to pay a high fee to Turms' author for custom development, it is also worth considering commercial solutions directly, even though their development level, work attitude, and sense of responsibility are probably not as good as those of Turms' author. Of course, this mainly depends on which country's and which company's solution you decide to adopt.

            Compared to free development, custom development differs in the following aspects:

            • A complete, phased project schedule will be provided, including design, development, testing, delivery, and so on.

• Assistance with designing requirements. Readers might wonder why they would need Turms' author to design requirements if they already want custom development. This is much like what Henry Ford said: "If I had asked people what they wanted, they would have said faster horses." What users ask for may not necessarily be what they truly need, and having insight into users' real needs is one of the essential skills required of engineers.

            • Guaranteed fixed working hours. During this time, only project-related custom design, development, testing, deployment, and addressing various questions will be done.

              Of course, all of the above is done by Turms' author during their off-hours.

            If some users are concerned that Turms' author might intentionally slow down the development and release progress of the features they want due to not having paid, this won't happen either, because Turms' author doesn't lack money and doesn't intend to profit from open-source, so there's no motivation to intentionally delay.

            Architecture Design

            Architecture Features

            Common Architecture Features

1. (Agility) Support updating Turms servers without users being aware of any downtime, to support rapid iteration
2. (Scalability) Turms servers are stateless so that they can be scaled out; support multi-active deployment across data centers
3. (Deployability) Support container deployment to facilitate integration (CI/CD) with cloud services. Turms provides three container deployment solutions out of the box: Docker image, Docker Compose file, and Terraform module
4. (Observability) Support a relatively complete set of observability features for business analysis and troubleshooting
5. (Scalability) Support medium to large scale instant messaging applications, with no need to refactor even as an application grows from medium to large scale (there is still a lot of optimization work to be done for large applications, but Turms servers are easy to upgrade)
6. (Security) Support API throttling and a global user/IP blocklist to resist most CC attacks
7. (Simplicity) The Turms architecture is lightweight, which makes Turms easy to learn and redevelop (please refer to Turms Architecture Design for details)
8. Turms depends on a MongoDB sharded cluster to support request routing (such as read-write separation) and tiered storage for medium to large scale applications

            Architecture Description

            Reference Architecture Diagram

            Architecture Differences with Other IM Projects

            Like the code implementation of Turms server, the architecture design of Turms is also very lean. Whenever possible, services are not split, and external services are not introduced unnecessarily. This is reflected in:

            • In the architecture design of some IM projects, they will separate the three major functions of session management, relayed message cache, and message sending in turms-gateway into three independent services to achieve business decoupling and traffic shaping. However, compared with the architecture of Turms, this approach adds two more failure points, increases development and operation difficulty, and requires RPC operations, resulting in lower throughput. Specifically:

              In terms of business decoupling, some IM projects will use the queue of the relayed messages to implement asynchronous consumption of downstream consumers for various statistical functions. However, using data from consuming the message queue to perform message statistics is a poor design. A more comprehensive, professional, and easy-to-implement solution is to use distributed collection and analysis of business logs (such as the AWS-based CloudWatch Logs => Kinesis Firehose => S3 => Athena/QuickSight solution), which is explained in detail in the log section of the observability system. The logic between session management and message sending in turms-gateway is not complex, so there is little benefit to decoupling, and no such requirement exists.

  In terms of traffic shaping, cloud services with elastic scaling (Auto Scaling) are better suited to traffic shaping than message queues (such as Kafka, RocketMQ, or other cloud services). Cloud service providers offer resource monitoring, and elastic scaling services can automatically scale resources based on various system metrics (such as CPU/memory utilization) and other custom metrics (such as the number of online users), and automatically release resources when idle, which is more in line with modern operations. Taking AWS as an example, operations staff can use CloudWatch to monitor the Turms server metrics above and combine it with Application Auto Scaling for automated scaling of server resources. If operations staff are familiar with these services, it may take only 3-10 minutes to go from purchasing these cloud services from scratch to completing the configuration.

  In terms of high availability, some IM architectures use highly available (multi-AZ deployment) message queue cloud services plus self-developed message sending services that consume the queue to ensure that notifications are not lost. However, in the architecture design of Turms, even if the Turms message push server turms-gateway is forcibly shut down (for example, by a hardware failure or server crash), the Turms server cluster can self-heal. And because, in the Turms process design, the application built on the Turms client is expected to send requests and re-synchronize data with the newly connected Turms server every time it reconnects (corresponding to the callback turmsClient.userService.addOnOnlineListener(...); see the client sketch after this list), messages and statuses will not be lost due to turms-gateway crashes or network disconnections.

              The reason why some IM projects insist on decoupling and introducing message queues, even when there are only tens of thousands or less online users, is simply to enhance their resumes and increase their irreplaceability, adding unnecessary technologies to the project and engaging in excessive design.

              Generally, only in the cloud architecture design of small and medium-sized IM scenarios based on serverless architecture can message queues play the most significant role. Still, even in such scenarios, as mentioned above, users can send notifications to AWS SQS to ensure high availability of message services and use Lambda functions to push messages to ensure that notifications are not lost. In this type of architecture design, users do not have self-developed services.

              In addition, the reason why serverless architecture is most suitable for small and medium-sized IM scenarios is that:

              • Lambda services have many quota restrictions, see Lambda quotas.

              • Compared with developing based on serverless architecture, designing and implementing self-developed IM services will be much simpler and more controllable. Blindly pursuing more "fashionable" serverless architecture may not be progress, but regression.
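To make the reconnection behavior described above concrete, here is a minimal turms-client-js sketch. addOnOnlineListener is the callback named in the text; the import form, client construction, and the syncMissedData helper are illustrative assumptions rather than the exact API.

```typescript
// Sketch: re-sync application data whenever the client goes (back) online, so that
// messages and statuses missed during a gateway crash or disconnection are pulled again.
// The import form and `syncMissedData()` are illustrative placeholders.
import TurmsClient from 'turms-client-js';

const client = new TurmsClient();

client.userService.addOnOnlineListener(() => {
  // Invoked every time the user session becomes online, including after reconnecting
  // to a different turms-gateway instance.
  void syncMissedData();
});

async function syncMissedData(): Promise<void> {
  // e.g. query messages and notifications created since the last locally stored timestamp.
}
```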

            • In the architecture design of some IM projects, they will separate session management into two services: network connection management and session logic management to ensure that when updating the session logic management service during downtime, the client does not need to disconnect from the network connection management service. However, considering that turms-gateway has almost no session business logic, and the existing business logic is very fixed, the main business logic is implemented in turms-service. Therefore, there is little need for turms-gateway to update the business logic during downtime, and thus splitting network connection management and session logic management into two independent services would add more failure points, result in performance degradation, and have little benefit for Turms. Therefore, Turms architecture design does not currently split session management into separate services.

              Notes:

              • The reason why the code implementation of Turms server is also very lean is because of the Basic Development Conventions.
• In fact, the early design of Turms considered not using a distributed memory service like Redis and instead adopting another common distributed in-memory solution: a design similar to Hazelcast's distributed map or Ignite's distributed cache, which would let Turms servers synchronize data through distributed maps and reduce the dependence on external services. However, considering the high-availability design of the cluster, the release process design of the Turms server itself, and other factors, Redis was ultimately introduced to implement the distributed memory.

            Relationship Between Turms Architecture and Cloud Architecture

            As of 2022, AWS is still the top cloud provider in terms of global market share, so the following discussion will mainly be based on AWS cloud.

            • The architecture design of Turms must ensure that its technical solutions do not rely on any cloud services to maintain technical neutrality, avoid being tied to any vendor's technology stack, and make it easy for non-cloud users to deploy a complete set of Turms servers (such as Kubernetes). At the same time, the technology solutions used by Turms must have the support of cloud vendors to ensure that cloud users can easily deploy a complete set of highly available Turms servers through various cloud services provided by different vendors.

              For the core IM functionality of Turms server, this requirement does not affect the release of Turms' core features because these features are implemented in the same way regardless of whether they are deployed on the cloud or not.

  However, for some IM extension features, such as file storage and data analysis, the implementation is more complicated because we need to consider, design, and implement various solutions. Taking business data analysis as an example: if Turms were designed only for AWS, implementing business data analysis would be very simple. In general, it would be based on the business logs provided by the Turms server, provide a set of CloudFormation configurations, and analyze data according to the needs and configurations of different users, for example (the easiest but not the cheapest) CloudWatch Logs Insights, (S3-based, cost-effective but not real-time) CloudWatch Logs => S3 => Athena/QuickSight, or (S3-based, cost-effective, and introducing Kinesis Firehose for real-time data integration) CloudWatch Logs => Kinesis Firehose => S3 => Athena/QuickSight, among other data analysis solutions. However, Turms also needs to meet the needs of users who do not want to use other third-party services, so it will need to develop its own data analysis solution later on. Therefore, the workload is much larger, and extension features are released much more slowly.

              But as mentioned above, if users can use third-party services to analyze the business logs provided by Turms, they don't have to wait for Turms to provide a solution.

            • Turms' cloud architecture design is very simple.

  • Turms' cloud architecture is just a subset of cloud architecture design. Compared with the enterprise cloud architecture design of large-scale hybrid clouds (which includes not only the deployment architecture of various projects, but also organizational structure design, hybrid cloud network architecture design, etc.), designing the cloud architecture for a project of Turms' size is still quite simple, even though Turms can be considered a large-scale project within the open source community, and users with a basic understanding of cloud services should be able to follow Turms' cloud architecture design.

              • Turms' cloud architecture is very traditional. If users have deployed other traditional web services' cloud architectures, deploying Turms is almost the same, especially since Turms provides multiple deployment schemes and even Terraform-based schemes to help users automatically purchase and configure cloud services.

    The relatively complicated part of Turms' cloud architecture is that some cloud vendors do not directly support MongoDB services. For example, AWS does not directly support recent versions of MongoDB. Although AWS provides the DocumentDB service, which is compatible with older versions of MongoDB, due to the competition between MongoDB and AWS, the latest MongoDB version that DocumentDB is compatible with is currently locked at 4.0, and AWS puts relatively little effort into maintaining it. Overall, the DocumentDB service is of limited value and has poor prospects, so it is recommended to use the MongoDB Atlas service directly.

                However, because MongoDB is a partner of AWS, users can easily integrate MongoDB Atlas enterprise service into AWS through VPC Peering and deploy it.

            The General Process of Client Accessing Server

This is the general process for a client to access the server, and it is also how the Turms architecture achieves horizontal scaling; you can adjust it according to your actual situation.

• When the client needs to establish a TCP connection with the turms-gateway server, the client can use the DNS service to query the IP address corresponding to the access layer server's domain name, which points to the SLB/ELB service (usually based on LVS and Nginx), a global acceleration service, or turms-gateway itself, depending on the needs and size of your actual application. The DNS service can be configured with one or more public IP addresses (in the production environment, do not expose the server's public IP address directly, so as to mitigate DDoS attacks) and return an IP address to the client via round-robin or other policies.

              Notes:

              • Regardless of whether the Turms client is using a TCP connection or an upper layer WebSocket connection, the upstream services of turms-gateway (DNS/SLB, etc.) should perform load balancing of TCP connections based on the client IP address.

  • It is highly recommended that you enable the Sticky Session feature of the SLB service so that a session is always connected to the same turms-gateway server. This has the advantage of mitigating a large portion of DDoS attacks: because turms-gateway supports blocking clients automatically, it can quickly detect and block IPs or users with abnormal behavior on the local server, but the default interval for synchronizing blocked-client data between turms-gateway servers is about 10~15 seconds. If the Sticky Session feature is turned off, attackers can exploit this synchronization interval by switching their TCP connections between turms-gateway servers to keep up DDoS attacks.

              • Normally, you should place the SSL certificate on the upstream server of turms-gateway, i.e. the upstream SLB service or Nginx server, etc.

              • Since turms-gateway is designed with a stateless architecture, any client can connect to any turms-gateway server, and you can flexibly scale up or down turms-gateway servers; the state (i.e., user session information) is transferred to the distributed in-memory Redis servers.

• After the client gets the IP address and successfully establishes a TCP connection with the turms-gateway, the turms-gateway checks whether the IP has been blocked and whether the turms-gateway itself is overloaded; if so, it actively disconnects the TCP connection. Otherwise, it accepts the TCP connection.

• If the turms-gateway accepts the TCP connection:

  • For a Turms client using a TCP connection, the client can start sending a Protobuf data stream of TurmsRequest. This data stream consists of two parts: a ZigZag-encoded body-length header and a Protobuf-encoded body (see the framing sketch after this note).
  • For a Turms client using a WebSocket connection, the client initiates an HTTP upgrade request to the turms-gateway server after the TCP connection is established successfully, requesting an upgrade to WebSocket. If the upgrade succeeds, the client can put the Protobuf-encoded TurmsRequest data in the body of a WebSocket binary frame and send it to the turms-gateway server.

              Note: At this point, the Turms client only sets up a network connection to the turms-gateway, but the user has not yet logged in and no session has been established.
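The sketch below illustrates the TCP framing described in the first bullet above: a ZigZag/varint-encoded body-length header followed by the Protobuf-encoded body. It is a simplified illustration under the assumptions stated in the comments, not the actual client implementation, and encodeTurmsRequest stands in for the generated Protobuf code.

```typescript
// Sketch of the client -> turms-gateway TCP framing described above:
// a ZigZag + varint encoded body-length header, followed by the Protobuf-encoded body.
// This is an illustration only; check the actual encoding against the client source.

function zigZag(n: number): number {
  // Map a signed 32-bit integer to an unsigned one (ZigZag encoding).
  return (n << 1) ^ (n >> 31);
}

function varint(n: number): number[] {
  // Encode an unsigned integer as a Protobuf-style varint (7 bits per byte).
  const out: number[] = [];
  let v = n >>> 0;
  while (v >= 0x80) {
    out.push((v & 0x7f) | 0x80);
    v >>>= 7;
  }
  out.push(v);
  return out;
}

function frame(body: Uint8Array): Uint8Array {
  const header = varint(zigZag(body.length));
  const framed = new Uint8Array(header.length + body.length);
  framed.set(header, 0);
  framed.set(body, header.length);
  return framed;
}

// Usage (encodeTurmsRequest is a placeholder for the generated Protobuf serializer):
// socket.write(frame(encodeTurmsRequest(request)));
```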

• After the stream is forwarded by the load balancing service (optional), it reaches the turms-gateway server first. The turms-gateway server performs a simple Protobuf format verification on the stream (without verifying the legitimacy of the business request, in order to decouple the business logic from turms-service servers, so that turms-service servers can update the business request format independently without stopping turms-gateway servers). If the data stream is illegal, the TCP connection is closed.

              Otherwise, if it is a legitimate request, it is partially parsed to confirm whether the turms-gateway server can handle the request on its own. For example, for both login and logout requests, the turms-gateway server can handle them on its own.

            • If the turms-gateway server can handle the request on its own, it will return a response. If it cannot handle it, then it detects whether the user has logged in on the local server, and if not, it rejects the request and sends back a response. If the user is logged in, a turms-service server is first selected from the list of available turms-service servers according to the load balancing policy, and then the request is forwarded to that turms-service server for processing through the self-developed RPC implementation.

  • If the turms-gateway server detects that the client request is a login request, it forms a session ID based on the user ID and the device type specified by the login request, and determines whether the session ID conflicts with an already logged-in session based on the user session information in Redis or the local cache. If there is a conflict, the login request is rejected and a response is sent back informing the client of the failure reason. Otherwise, the current user session information is registered in Redis and a success response is sent back; at this point, the user enters the online state (a sketch of this session registration appears at the end of this section).

                Notes:

    • At any given moment, a session ID (user ID + device type) constitutes a user session with only one turms-gateway server, over a single TCP connection to that server. All subsequent service requests of the user are handled within this one session and TCP connection until the session is closed and the user goes offline.

    • Different devices under one user ID can form a "user session" with different turms-gateway servers at the same time, regardless of whether they are from different IPs.

                  However, it is recommended that all devices under a user ID are always connected to a single turms-gateway because:

                  1. If logged into the same turms-gateway, the server only needs to send its byte stream to one turms-gateway server instead of multiple when forwarding messages or notifications to a user, in order to reduce system resource overhead and increase throughput.
                  2. All devices of the same user on the same turms-gateway server share the session's heartbeat clock, thus reducing the number of TTL heartbeat refresh requests that the turms-gateway server sends to Redis;
                  3. If the server has user status caching enabled, it may use a user status that has not been updated when forwarding messages or notifications, so new messages may not be sent to the newly logged-in device immediately.
  • If the turms-gateway server is unable to handle the client request, the request will be forwarded to a turms-service server via the RPC service. After receiving the client request, the turms-service server verifies and processes the request, triggering the ClientRequestHandler plugin to assist developers in implementing custom logic (such as filtering sensitive words). Additionally, during the processing, corresponding CRUD requests are usually sent to mongos. Once the client request has been processed, turms-service sends the generated response back to the turms-gateway server. For the notifications generated during the processing, the turms-service server first queries Redis or the local cache based on the IDs of the notified users to obtain the node IDs of the turms-gateway servers that these users are connected to. The notifications are then sent to these turms-gateway servers via the RPC service for notification pushing.

                Note: Turms adopts the MongoDB sharded cluster. After receiving the CRUD request, mongos routes the request according to the configuration.

              • Regardless of whether the turms-gateway server receives a response or notification, it does not perform any validity checks but instead directly forwards it to the user. During the notification pushing, the turms-gateway server triggers the NotificationHandler plugin to assist developers in implementing custom logic (such as pushing messages to offline users).

              (Notably, all network IO operations in Turms are implemented based on Netty, i.e., all of the above RPC and database calls are asynchronous and non-blocking.)
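As a rough illustration of the session registration step above (a session keyed by user ID + device type, registered in Redis with a heartbeat TTL), here is a hedged sketch using the ioredis client. The key layout, stored node ID, and TTL value are illustrative assumptions, not the actual Turms implementation.

```typescript
// Sketch: register a user session (user ID + device type) in Redis with a TTL and
// reject the login if a conflicting online session already exists.
// Key layout, stored value, and TTL are illustrative assumptions.
import Redis from 'ioredis';

const redis = new Redis();

async function registerSession(userId: number, deviceType: string, gatewayNodeId: string): Promise<boolean> {
  const key = `session:${userId}:${deviceType}`;
  // NX: only set the key if it does not exist, i.e. there is no conflicting session.
  const result = await redis.set(key, gatewayNodeId, 'EX', 60, 'NX');
  return result === 'OK'; // false => session conflict, reject the login request
}

async function refreshHeartbeat(userId: number, deviceType: string): Promise<void> {
  // Called periodically; devices of the same user on the same gateway can share this refresh.
  await redis.expire(`session:${userId}:${deviceType}`, 60);
}
```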

            Collection Schema Design

            Requirements Analysis and Collection Schema Design

            When doing architecture design, it is often said that "key requirements determine architecture design, secondary requirements verify architecture" (here "requirements" include functional requirements, quality attribute requirements, and constraint requirements). However, as Turms is a general instant messaging project, its requirements are not as clear and specific as those of a concrete instant messaging project. Therefore, facing endless business requirements and various possible constraints, Turms cannot and should not design for every scenario. Therefore, when designing Turms, we follow the principle of "prioritizing key universal instant messaging requirements".

            When abstracting various complex requirements into actual business models, it is necessary to understand the priority relationship between requirements and ultimately express these requirement relationships in the form of collection schemas, which is the most important embodiment of technical architecture implementation. Therefore, it is essential to review and adjust the default collection schemas provided by Turms according to your own product requirements.

            Default Collection Index Design

Key Points (if your team needs to develop based on Turms, please remember the following points):

• The indexes are designed mainly around the characteristics and constraints of distributed data sharding, following the principles of read-heavy, write-light workloads and prioritizing key universal instant messaging requirements.
• The indexes are not designed for data analysis (please refer to Turms Data Analytics for details).
• The indexes are not designed for the admin API (to avoid unnecessary index overhead, at the cost of relatively poor flexibility of the admin API).
• Turms does not use auxiliary indexes to support extra business features (therefore, if your project has extra business features, you need to develop on top of Turms. Of course, this is also very simple to implement, and qualified intermediate to advanced engineers should have this ability).

It is particularly important to emphasize the principle of "prioritizing key universal instant messaging requirements", because it reminds not only developers but also product managers and clients to pay attention to collection design. In scenarios involving distributed data sharding, some seemingly "simple-to-implement" features can consume a lot of resources and increase the difficulty of development and operation once actually implemented. Therefore, for such high-effort, low-reward features, it is necessary to confirm through multiple iterations whether the requirement is reasonable and necessary, whether the corresponding risks and costs can be borne, and whether it needs to be implemented at all. Only after weighing these factors is it appropriate to consider whether to adopt a more flexible collection design to facilitate future updates and reduce the risk of a thorough refactoring.

Take the "query groups joined by a user" feature as an example. The GroupMember collection in Turms is used to manage the relationship between groups and users. By default, this collection shards data based on group IDs. Therefore, if you need to find group-related data by group ID in a distributed database cluster, it is very easy for the database (a targeted query). Conversely, if you need to find the groups that a user has joined based on their user ID without creating a new auxiliary collection, it becomes extremely inefficient (a scatter-gather query): because the database cannot locate the relevant group data based on the user ID, it sends the query to all database servers, causing a large number of invalid and redundant requests with only a small proportion of valid ones, ultimately resulting in an effective throughput of the database cluster that is lower than that of a single database.
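The difference between the two access patterns can be sketched with the MongoDB Node.js driver as below; the collection name and field paths (groupMember, _id.groupId, _id.userId) are illustrative and do not necessarily match Turms' actual schema.

```typescript
// Sketch: with GroupMember sharded by group ID, a query by group ID is routed to a single
// shard (targeted), while a query by user ID has to hit every shard (scatter-gather).
// Collection and field names are illustrative, not Turms' actual schema.
import { MongoClient } from 'mongodb';

const client = new MongoClient('mongodb://mongos:27017');
const groupMember = client.db('turms').collection('groupMember');

// Targeted query: the shard key (group ID) is in the filter, so mongos routes it to one shard.
const membersOfGroup = await groupMember.find({ '_id.groupId': 100 }).toArray();

// Scatter-gather query: no shard key in the filter, so mongos must ask every shard.
const groupsOfUser = await groupMember.find({ '_id.userId': 200 }).toArray();
```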

As the user scale grows, this either leads to overturning the architecture and starting over due to a misjudgment of primary and secondary requirements, or to customizing and extending the existing design (for example, implementing an auxiliary collection by oneself, as ShardingSphere does, to help with data sharding, though such an implementation is likely to introduce a large amount of redundant data and transactions). Therefore, it is necessary to understand Turms' default collection index design in depth and remember that "the default indexes are designed mainly around the characteristics and constraints of distributed data sharding, following the principles of read-heavy, write-light workloads and prioritizing key universal instant messaging requirements".

            The Cost of Rich Features

            After gaining a deep understanding of Turms' default collection index design, you will understand why so many large and medium-sized instant messaging applications do not provide, and should not provide, some seemingly "simple" features. You will also understand what needs to be paid attention to when implementing an instant messaging application in practice. On the other hand, you should also be wary of instant messaging technology solutions that claim to provide rich business features because they are likely only suitable for user scales of hundreds or thousands. If your product needs to scale up later, you will find that some existing collection designs and data sharding designs are contradictory, and you may need to start refactoring from schemas, which can ultimately lead to a complete reconstruction of the project, forcing you to start over with a self-developed solution.

Here is an example feature: "To limit the number of groups each user can create, the server needs to be able to quickly find the number of groups owned by that user." This seems like a very "simple" feature to implement. However, due to Turms' default index design principle mentioned above, Turms only shards by group ID to support quick retrieval of group member information.

Therefore, we cannot quickly query the number of groups owned by a user based on the group owner ID with a targeted query. There are roughly only three feasible options (note that these three solutions can be applied to other extended feature designs by analogy; a sketch of two of them follows the list below):

1. Create a single-column index on the group owner ID. A targeted query is still impossible, but after the scatter-gather query each shard can answer relatively quickly thanks to the index. (Note: this type of implementation is the default implementation Turms provides for extended functionality, but it is disabled in the default configuration.)

2. Dimensional modeling: create an auxiliary collection specifically for recording the group owner ID and the corresponding group IDs. A targeted query can then be achieved, but some key operations require distributed transactions to ensure data consistency, and there is still data redundancy.

3. Use a statistics collection to record the number of groups each user already owns. This solution is the most efficient and has the least redundancy, but it still requires distributed transactions and has the worst extensibility.
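To make the trade-offs concrete, here is a hedged sketch of options 1 and 3 using the MongoDB Node.js driver; the collection names, field names, and transaction wiring are illustrative assumptions rather than Turms' actual implementation.

```typescript
// Sketch of options 1 and 3 above; all names are illustrative, not Turms' actual schema.
import { MongoClient } from 'mongodb';

const client = new MongoClient('mongodb://mongos:27017');
const db = client.db('turms');

// Option 1: a single-column index on the owner ID. Counting is still a scatter-gather
// query across shards, but each shard can answer quickly from the index.
await db.collection('group').createIndex({ ownerId: 1 });
const ownedGroupCount = await db.collection('group').countDocuments({ ownerId: 200 });

// Option 3: a statistics collection keeping one counter per user, updated in the same
// (distributed) transaction that creates the group, so the limit check is one targeted read.
const session = client.startSession();
await session.withTransaction(async () => {
  await db.collection('group').insertOne({ ownerId: 200, name: 'my group' }, { session });
  await db.collection('userGroupStats').updateOne(
    { _id: 200 },
    { $inc: { ownedGroupCount: 1 } },
    { upsert: true, session }
  );
});
await session.endSession();
```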

            It is clear that for implementing a seemingly "simple" feature, our three implementation solutions not only have vastly different requirements for system resources but also have time complexity that is not in the same order of magnitude.

            Therefore, one should always be wary of instant messaging solutions that claim to be "feature-rich".

            Collection Structure

            Turms' collection structure may contain fields that your product does not use at all, but these unused fields are not stored in the database, so you do not need to worry about them increasing database overhead.

            How Turms' Collection Structure was Designed

            Turms' collection structure was not designed in a single commit or within a few days but was sorted out through a long period of iterative analysis and practice. The process was roughly as follows:

            1. Analyze business needs, grasp the intricate logic between businesses, and clarify the main and secondary relationships of the needs. It is required not only to cover all existing requirements but also to predict future business needs as much as possible and confirm which business needs are not needed.
            2. Analyze the specific code logic of the business implementation and determine the necessary fields.
            3. Determine the field ID. It is worth noting that composite IDs can have independent indices internally. For example, the composite ID of the GroupMember collection is group ID + user ID, and these two fields have their independent indices for implementing other business functions.
            4. Build indexes. First, consider whether each field indeed needs an index and whether it can be made into an optional index. Then, consider whether several fields can be combined into a composite index (including analyzing the cardinality of records, the frequency of using composite indices, whether the query condition can always follow the leftmost matching principle, and whether it can also avoid table returning queries).
            5. Determine whether to shard the collection, including analyzing whether the collection needs data cold and hot separation. If sharding is required, whether data can be sharded "incidentally" based on the above index information.

            Collection Details

            Summary

            The following content is just a basic theory. As we mentioned in How the collection structure of Turms is designed, the actual business is more complex and changeable. Therefore, in the face of specific collection index design, it is necessary to combine its actual application scenarios Do analysis and design.

            Data Fragmentation

            Except for small collections such as Admin (Admin), GroupType (GroupType) and other small collections that do not need data sharding, most other collections support data sharding, such as User (User), Group (Group) and Message (Message) are combined to realize that when sending CRUD requests to mongos, mongos can do load balancing and balance data load by itself, and it is also to support the separation of hot and cold data.

            record creation time index

            The composite index of many collections has the record creation time field, which is to match the pull mode of Turms, to support quick query of records in a certain time range and avoid repeated query by the client. This is why most query statements on the Turms client can include a query time interval parameter, and if the client request does not include this parameter, the Turms server will assign a query time interval by default to ensure query performance.

            ID only uses B-tree index

            We prohibit the use of Hashed indexes for record IDs. This is because MongoDB does not support uniqueness constraints through Hashed indexes. Only B-tree indexes can be used to ensure the uniqueness of records. Therefore, even if we add a For the Hashed index, MongoDB will automatically create an additional B-tree index, which is not worth the loss.

            Optional fields and indexes

            There are dozens of optional indexes in the Turms collection, but they are not enabled by default, because:

            • Although many IM business requirements are typical, they are in conflict with each other. For example, it is necessary to support message or request sender can query the message or request sent by himself and message or request sender cannot query the message sent by himself. or requests (the default implementation).
            • Or some IM business requirements are typical, but not so common, such as whether the processor of the group request can query the requests he has processed. The optional indexes used to support such extended IM functions account for the majority.
            • If these optional indexes are turned on by default, it is designed for small IM applications. For larger IM applications, it is a mistake that we mentioned above as "the fatal price of rich functions".

            **The principle for us to choose the default implementation scheme is: choose the scheme that does not require additional fields or indexes, has the lowest storage cost, and can be logically consistent with other IM business requirements. And if your application really needs to support another solution, we generally provide multiple sets of alternative solutions, which need to be configured by the user to replace the default implementation. **

            As long as you grasp this basic principle, you can deduce why the indexes of the Turms collection are so designed. In addition, each model and each field in the code actually has index-related comments, which are used to guide users: what fields are suitable for indexing in which scenarios, and why some fields do not use indexes. Users can design with reference to this note.

            Note: Very few optional indexes are enabled by default, because the scenarios corresponding to these indexes are very common, and only a few applications do not need to use these scenarios. In addition, Turms has not yet optimized the scenarios where these optional indexes are not enabled, so it is currently recommended that you do not manually turn them off.

            Replenish:

            • These optional indexes can be enabled by configuring turms.service.mongo.[service name].optional-index.[collection name].[field name]=true, such as turms.service.mongo.message. optional-index.message.sender-id=true.

              Reminder: IntelliJ IDEA supports configuration auto-completion

            • Users can also directly create the indexes they want to use on the MongoDB service, and it is very simple to add or delete indexes or fields in MongoDB, so even if the user misses configuration, or the requirements are not clear in the early stage, and new requirements come later, there is no need to worry about being unable to do so Add a new index or field.

              Additional supplement: Each version of MongoDB will release some very practical new features. There may be some complex functions that we need to fully develop ourselves in the early days, but in the new version of MongoDB, we only need to execute one command to realize it, which greatly reduces the development and effort. It is difficult to operate and maintain and improve the reliability of functions, so it is highly recommended that you deploy the new version of MongoDB as much as possible.

            By default, the request sender field of the request model is not indexed

            The two collections such as friend request and group request do not index the request sender by default. In other words, once the user sends the request, he can no longer query the requests he has already sent, and the client needs to record it locally. If your product really needs the server to record and query the requests sent by users, you need to configure the above optional index by yourself, and let turms-service add this index when creating the table for the first time, or you can directly add it on the MongoDB server Build an index into the collection.

            Message

            Message is currently the only model that supports separate storage of hot and cold data. The separation of hot and cold data can greatly save the cost of the database server, such as putting hot data in a 16-core 128G server, and putting cold data in a 4-core 8G server. In addition, other models currently do not have the meaning of separate storage of hot and cold data, so other models do not support it.

            index
            • Business scenario: Do you need to support the message sender to be able to query the messages he sent himself?

              • Scheme 1 (default scheme): This feature is not supported, use message sending time + recipient ID compound index

                Since message needs to support the separation of hot and cold data, the composite index of the message is: message sending time + recipient ID, and the sharding key is message sending time, to ensure that we can combine Zones in different time intervals Allocate to different Shards, and realize the cold and hot separation storage of messages.

                (If message does not need to support the separation of hot and cold data, then the composite index of Turms’ message model should be: recipient ID + message sending time, and the shard key is recipient ID, to ensure that MongoDB can Do load balancing for both read and write requests, and ensure that messages sent to the same recipient are divided into the same Chunks as much as possible to improve query speed)

                Supplement: As for why there is no separation of hot and cold data for collections such as add friend request and group invitation request, this is because although these requests are indeed closely related to the creation time in terms of business performance, for example, add friend request has passed After a period of time, from a business point of view, it is in the state of The request has expired and cannot be processed. However, for the recipient of the request, even if it is an expired request, the user often needs to quickly query all the requests he has received through query statements, and the number of visits will not decrease with time. For example, if a user has received 20 friend requests this year and 20 friend requests last year, and the client can query at most 50 requests each time, then the database should use the recipient ID as the dimension to store the recipients of the same request The data is divided into a Chunk. Instead of dividing the data of the same request receiver into different Chunks and loading them into different databases based on the request creation time. Therefore, we do not support hot and cold data separation for these collections. For this type of collection, we generally use a composite index such as request recipient ID + request creation time, and use request recipient ID as the shard key to collect all requests received by a request recipient as much as possible. Put them in the same Chunk.

              • Solution 2: Support this feature, use message sending time + session ID compound index

                If your product needs this solution, you only need to configure turms.service.message.use-conversation-id=true when the turms-service server starts for the first time. Just pay special attention: if you have already created a table in the database and created a message record in the method of Scheme 1, the Turms server will not create a composite index of message sending time + session ID at present, nor will it The message data will be swiped and the session ID will be filled in the message.

                Supplementary knowledge: Private chat session ID is a 16-byte long byte array, and its value is composed of message sender ID and message receiver. Group chat session ID is an 8-byte long byte array whose value consists of group IDs.

              • Solution 3: This feature is supported, but it is generally not recommended, and Turms does not provide support. The scheme is: under the composite index scheme of message sending time + recipient ID, enable optional index for sender ID.

                The reason why this solution is not recommended is because it is a very common scenario for users to query messages in a session, and this solution needs to query twice when querying messages in a session: one is to query the messages sent by the other party, and the other is to query itself The messages sent, are so inefficient that Turms offers no support.

            • Message deletion time B-tree index. If your product needs to support logical deletion, turms-service will fill in the value of this field when "deleting" a message, otherwise this field will not be used.

            TODO

            - +
            Skip to content

            Collection Schema Design

            Requirements Analysis and Collection Schema Design

            When doing architecture design, it is often said that "key requirements determine the architecture, and secondary requirements verify it" (here "requirements" include functional requirements, quality attributes, and constraints). However, since Turms is a general-purpose instant messaging project, its requirements are not as clear and specific as those of a concrete IM product. Facing endless business requirements and all kinds of possible constraints, Turms cannot and should not design for every scenario, so when designing Turms we follow the principle of "prioritizing key universal instant messaging requirements".

            When abstracting complex requirements into concrete business models, you have to understand the priority relationships between requirements and ultimately express them in the form of collection schemas, which is where the technical architecture is most concretely embodied. It is therefore essential to review, and adjust, the default collection schemas provided by Turms according to your own product requirements.

            Default Collection Index Design

            Key points (if your team needs to develop based on Turms, please remember the following four points):

            • The indexes are designed mainly around the characteristics and constraints of distributed data sharding, for read-heavy, write-light workloads, and with key universal instant messaging requirements prioritized.
            • The indexes are not designed for data analysis (see Turms Data Analytics for details).
            • The indexes are not designed for the admin API (to avoid unnecessary index overhead, at the cost of relatively poor flexibility of the admin API).
            • Turms does not use auxiliary indexes to support extra business features. Therefore, if your project has extra business features, you need to develop them on top of Turms; this is straightforward to implement, and competent intermediate-to-senior engineers should be able to do it.

            It is particularly important to emphasize the principle of "prioritizing key universal instant messaging requirements", because it reminds not only developers but also product managers and customers to pay attention to collection design. In scenarios involving distributed data sharding, some features that look "simple to implement" can consume a great deal of resources and make development and operations significantly harder once they are actually built. For such "laborious and futile" features, first confirm whether the requirement is reasonable, necessary, and worth its risks and costs; only after weighing whether it must be implemented and how it may evolve over future iterations should you consider making the collection design more flexible, so that future changes are easier and the risk of a thorough refactoring is reduced.

            Take the "query groups joined by a user" feature as an example. The GroupMember collection in Turms manages the relationship between groups and users, and by default it shards data by group ID. Therefore, finding group-related data by group ID on a distributed database cluster is easy for the database (targeted queries). Conversely, finding the groups a given user has joined by user ID, without creating an auxiliary collection, is extremely inefficient (scatter-gather queries): because the database cannot locate the relevant group data from the user ID, it has to broadcast the query to every database server, producing a large number of useless, redundant requests with only a small proportion of useful ones, and ultimately making the effective throughput of the database cluster lower than that of a single database.
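            The difference between the two query patterns can be sketched in mongosh. This is a hypothetical illustration: the collection name, shard key, and field names below are illustrative rather than Turms' exact schema.

            js
            // Hypothetical mongosh sketch (collection and field names are illustrative):
            // the collection is sharded by the group ID inside its composite _id.
            sh.shardCollection("turms.groupMember", { "_id.groupId": 1 });

            // Targeted query: the filter contains the shard key, so mongos routes the
            // request to the single shard that owns this group's chunk.
            db.groupMember.find({ "_id.groupId": NumberLong(100) });

            // Scatter-gather query: no shard key in the filter, so mongos has to
            // broadcast the request to every shard and merge the results.
            db.groupMember.find({ "_id.userId": NumberLong(200) });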

            As the user scale grows, this either leads to a misjudgment of primary and secondary requirements that forces you to overturn the architecture and start over, or forces you to customize and extend the existing design (for example, implementing an auxiliary collection yourself, as ShardingSphere does, to assist data sharding, which is likely to introduce a large amount of redundant data and transactions). It is therefore necessary to understand Turms' default collection index design in depth and remember that "the default indexes are designed mainly around the characteristics and constraints of distributed data sharding, for read-heavy, write-light workloads, and with key universal instant messaging requirements prioritized".

            The Cost of Rich Features

            After gaining a deep understanding of Turms' default collection index design, you will understand why many large and medium-sized instant messaging applications do not, and should not, provide some seemingly "simple" features, and what to pay attention to when implementing an instant messaging application in practice. Conversely, you should be wary of instant messaging solutions that claim to provide rich business features, because they are likely suitable only for user scales of hundreds or thousands. If your product later needs to scale up, you will find that some of the existing collection designs conflict with the data sharding design, and you may have to refactor starting from the schemas, which can end in a complete rebuild of the project and force you back to a self-developed solution.

            Consider this feature as an example: "To limit the number of groups each user can create, the server needs to be able to quickly find the number of groups owned by that user." It looks very "simple" to implement. However, because of Turms' default index design principle above, Turms shards only by the group ID so that group member information can be retrieved quickly.

            Therefore, we cannot use a targeted query to quickly find the number of groups owned by a user from the group owner ID. There are roughly only three feasible options (note that these three solutions can be applied to other extended feature designs by analogy):

            1. Create a single-column index on the group owner ID. A targeted query is still impossible, but after the scatter query each shard can answer relatively quickly. (Note: this type of implementation is the default one Turms provides for extended features, and it is disabled in the default configuration; see the sketch after this list.)

            2. Dimensional modeling: create an auxiliary collection dedicated to recording the group owner ID and the corresponding group IDs. A targeted query becomes possible, but some key operations then require distributed transactions to keep the data consistent, and the data is still redundant.

            3. Use a statistics collection that records the number of groups each user already owns. This solution is the most efficient at query time and has the least redundancy, but it still requires distributed transactions and has the worst extensibility.
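            As a concrete illustration of option 1 above, here is a hypothetical mongosh sketch; the collection and field names are illustrative and not Turms' exact schema.

            js
            // Hypothetical sketch of option 1 (names are illustrative): a single-column
            // index on the group owner ID.
            db.group.createIndex({ ownerId: 1 });

            // Counting a user's groups is still a scatter query (the shard key is the
            // group ID, not the owner ID), but every shard answers quickly from the index.
            db.group.countDocuments({ ownerId: NumberLong(200) });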

            It is clear that for implementing a seemingly "simple" feature, our three implementation solutions not only have vastly different requirements for system resources but also have time complexity that is not in the same order of magnitude.

            Therefore, one should always be wary of instant messaging solutions that claim to be "feature-rich".

            Collection Structure

            Turms' collection structure may contain fields that your product does not use at all, but unused fields are simply not stored in the database, so you do not need to worry about them adding storage overhead.

            How Turms' Collection Structure was Designed

            Turms' collection structure was not designed in a single commit or within a few days but was sorted out through a long period of iterative analysis and practice. The process was roughly as follows:

            1. Analyze the business requirements, grasp the intricate logic between them, and clarify which requirements are primary and which are secondary. This must not only cover all existing requirements but also predict future ones as much as possible, and explicitly decide which requirements will not be supported.
            2. Analyze the concrete code logic of the business implementation and determine the necessary fields.
            3. Determine the document ID. Note that the fields inside a composite ID can each have their own index; for example, the composite ID of the GroupMember collection is group ID + user ID, and each of the two fields has an independent index used by other business features (see the sketch after this list).
            4. Build the indexes. First consider whether each field really needs an index and whether it can be made optional; then consider whether several fields can be combined into a compound index (analyzing record cardinality, how often the compound index would be used, whether queries can always follow the leftmost-prefix rule, and whether covered queries can avoid fetching the full documents).
            5. Decide whether to shard the collection, including whether it needs hot/cold data separation. If sharding is required, check whether the data can be sharded "incidentally" based on the index information above.
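            The note about composite IDs in step 3 can be sketched in mongosh as follows; this is a hypothetical illustration, and the field names are not Turms' exact schema.

            js
            // Hypothetical sketch for step 3 (names are illustrative): the _id is a
            // composite of group ID + user ID, and the user ID part gets its own index
            // so it can serve other business queries independently.
            db.groupMember.insertOne({
                _id: { groupId: NumberLong(100), userId: NumberLong(200) },
                joinDate: new Date()
            });
            db.groupMember.createIndex({ "_id.userId": 1 });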

            Collection Details

            Summary

            The following content is only the basic theory. As mentioned in "How Turms' Collection Structure was Designed", real business is more complex and changeable, so when designing indexes for a specific collection you must analyze and design them together with its actual application scenarios.

            Data Sharding

            Except for small collections that do not need sharding, such as Admin and GroupType, most collections support data sharding, for example User, Group, and Message. Sharding lets mongos both load-balance CRUD requests and balance the data distribution by itself, and it is also the basis for hot/cold data separation.

            Record creation time index

            The compound indexes of many collections include the record creation time field. This matches Turms' pull mode: it supports fast queries of records within a given time range and helps the client avoid querying the same records repeatedly. This is also why most query methods of the Turms client accept a time-range parameter, and why, if a client request does not carry one, the Turms server assigns a default time range to guarantee query performance.

            IDs use B-tree indexes only

            We prohibit Hashed indexes on record IDs because MongoDB cannot enforce uniqueness constraints through a Hashed index; only a B-tree index can guarantee the uniqueness of records. Even if we added a Hashed index, MongoDB would still automatically create an additional B-tree index, which is not worth the cost.

            Optional fields and indexes

            There are dozens of optional indexes in the Turms collections, but they are not enabled by default, because:

            • Although many IM business requirements are typical, they conflict with each other. For example, "the sender of a message or request can query what they have sent" conflicts with "the sender cannot query what they have sent" (the default implementation).
            • Some IM requirements are typical but not that common, for example whether the handler of a group join request can query the requests they have processed. Most of the optional indexes exist to support such extended IM features.
            • Turning these optional indexes on by default would amount to designing for small IM applications; for larger IM applications it would be exactly the mistake described above as "the cost of rich features".

            The principle we follow when choosing the default implementation is: pick the scheme that requires no additional fields or indexes, has the lowest storage cost, and remains logically consistent with the other IM business requirements. If your application really needs a different scheme, we generally provide several alternatives that you can enable through configuration to replace the default implementation.

            Once you grasp this basic principle, you can deduce why the indexes of the Turms collections are designed the way they are. In addition, every model and every field in the code carries index-related comments that explain which fields are worth indexing in which scenarios and why some fields are not indexed. You can refer to these comments in your own designs.

            Note: a very small number of optional indexes are enabled by default, because the scenarios they serve are very common and only a few applications do not need them. In addition, Turms has not yet optimized for the case where these indexes are disabled, so we currently recommend not turning them off manually.

            Additional notes:

            • These optional indexes can be enabled by configuring turms.service.mongo.[service name].optional-index.[collection name].[field name]=true, for example turms.service.mongo.message.optional-index.message.sender-id=true.

              Reminder: IntelliJ IDEA supports auto-completion for these configuration properties.

            • You can also create the indexes you want directly on the MongoDB server (see the sketch after these notes). Adding or removing indexes and fields in MongoDB is very simple, so even if you miss a configuration, or the requirements are unclear at the beginning and new ones arrive later, there is no need to worry about being unable to add a new index or field.

              Additional note: every MongoDB release ships some very practical new features. Some complex features that we had to develop entirely by ourselves in the early days can now be achieved with a single command in newer MongoDB versions, which greatly reduces development and operations effort and improves feature reliability, so we highly recommend deploying the newest MongoDB version you can.
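            For reference, creating the sender-id index mentioned in the configuration example above by hand might look like the following hypothetical mongosh command; the collection and field names are illustrative.

            js
            // Hypothetical mongosh equivalent of enabling the sender-id optional index
            // manually (collection and field names are illustrative):
            db.message.createIndex({ senderId: 1 });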

            By default, the request sender field of the request model is not indexed

            The friend request and group join request collections do not index the request sender by default. In other words, once a user has sent a request, they can no longer query the requests they have sent, and the client needs to record them locally. If your product really needs the server to record and query the requests a user has sent, you have to enable the optional index above yourself and let turms-service add it when it creates the collection for the first time, or create the index directly on the MongoDB server.

            Message

            Message is currently the only model that supports hot/cold data separation. Separating hot and cold data can greatly reduce database server cost, for example by putting hot data on a 16-core, 128 GB server and cold data on a 4-core, 8 GB server. Hot/cold separation is currently not meaningful for the other models, so they do not support it.

            Indexes
            • Business scenario: do you need to let message senders query the messages they have sent?

              • Scheme 1 (the default): the feature is not supported; use a compound index of message sending time + recipient ID.

                Since Message needs to support hot/cold data separation, its compound index is message sending time + recipient ID, and the shard key is the message sending time, so that Zones covering different time ranges can be assigned to different shards and messages can be stored hot/cold separately (see the zone sketch after this list).

                (If Message did not need hot/cold separation, the compound index of Turms' message model would be recipient ID + message sending time, with the recipient ID as the shard key, so that MongoDB could load-balance both read and write requests and keep messages sent to the same recipient in the same chunks as much as possible to speed up queries.)

                Supplement: as for why there is no hot/cold separation for collections such as friend requests and group invitations, although these requests are indeed closely tied to their creation time in business terms (for example, after some time a friend request enters the state "the request has expired and can no longer be handled"), the recipient of a request still often needs to quickly query all the requests they have received, even expired ones, and the number of such reads does not decrease over time. For example, if a user received 20 friend requests this year and 20 last year, and the client can pull at most 50 requests per query, then the database should group the data of the same recipient into one chunk keyed by the recipient ID, rather than splitting the same recipient's data into different chunks by request creation time and loading them onto different servers. That is why we do not support hot/cold separation for these collections; for this type of collection we generally use a compound index of request recipient ID + request creation time, with the request recipient ID as the shard key, so that all requests received by one recipient are placed in the same chunk as much as possible.

              • Scheme 2: the feature is supported; use a compound index of message sending time + conversation ID.

                If your product needs this scheme, you only need to configure turms.service.message.use-conversation-id=true when the turms-service server starts for the first time. Pay special attention: if you have already created the collection and written message records under Scheme 1, the Turms server will not currently create the message sending time + conversation ID compound index for you, nor will it backfill the conversation ID into the existing message records.

                Background: a private chat conversation ID is a 16-byte array composed of the message sender ID and the message recipient ID; a group chat conversation ID is an 8-byte array composed of the group ID.

              • Scheme 3: the feature is supported, but it is generally not recommended and Turms does not provide built-in support. The scheme: on top of the Scheme 1 compound index (message sending time + recipient ID), enable the optional index on the sender ID.

                This scheme is not recommended because querying the messages of a conversation is a very common scenario, and under this scheme it takes two queries: one for the messages sent by the other party and one for the messages sent by oneself. This is inefficient enough that Turms offers no support for it.

            • Message deletion time B-tree index: if your product needs to support logical deletion, turms-service fills in this field when "deleting" a message; otherwise the field is not used.
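            As a rough illustration of how Scheme 1's shard key (the message sending time) enables hot/cold separation, here is a hypothetical mongosh sketch using zone sharding; the shard names, zone names, cutoff date, and field name are all illustrative.

            js
            // Hypothetical zone-sharding sketch (shard, zone, field names and the cutoff
            // date are illustrative): recent chunks stay on the "hot" shard, historical
            // chunks live on the "cold" shard.
            sh.addShardToZone("shard-hot", "hot");
            sh.addShardToZone("shard-cold", "cold");
            sh.updateZoneKeyRange("turms.message",
                { deliveryDate: ISODate("2023-01-01") }, { deliveryDate: MaxKey() }, "hot");
            sh.updateZoneKeyRange("turms.message",
                { deliveryDate: MinKey() }, { deliveryDate: ISODate("2023-01-01") }, "cold");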

            TODO

            + \ No newline at end of file diff --git a/docs/design/status-aware.html b/docs/design/status-aware.html index 73f3e163..e8cccddc 100644 --- a/docs/design/status-aware.html +++ b/docs/design/status-aware.html @@ -17,7 +17,7 @@ -

            Status Awareness

            Status awareness falls into two categories: "user online status awareness" and "business data change awareness" (such as receiving new messages, or group members changing).

            Since the concrete implementation of status awareness is closely tied to concrete product requirements, you need to be able to do the following two things:

            1. Judge whether a product requirement is reasonable. A typically unreasonable requirement: a group can have 10,000 members; whenever a member sends a message, it must be guaranteed to reach the other 9,999 members 100% of the time; and users must be able to pull chat history from several years ago.
            2. Distinguish primary from secondary requirements and strike a balance between quality attributes. IM implementations are full of details; ask whether it is really necessary to design a large number of back-and-forth mechanisms (such as conversation-level auto-increment message IDs) just to cover extreme cases, when doing so greatly increases development cost and failure points and lowers overall server throughput.

            User online status awareness

            In short, Turms checks the health of the user's TCP connection through heartbeat packets and uses it to judge whether the user is "online". If you do not care about the underlying implementation, you only need to read: Client API - Session Lifecycle.

            Specific principles (background knowledge)

            Background

            From the perspective of the transport layer, a TCP connection is only a virtual connection that simulates a physical one through two-way message delivery and acknowledgement. If the four-way close handshake is not completed (that is, the prescribed close messages are never sent and acknowledged), TCP still considers the connection established; if you then try to read from it, you get an error such as "An existing connection was forcibly closed by the remote host". Therefore, for an instant messaging application built on top of TCP, if we do no extra work the server can only mistakenly believe that "the user is online".

            Common reasons why TCP does not complete the four-way close handshake

            • Client: the client application is forcibly closed
            • Server: the load stays too high to respond, or the server goes down and the server application is forcibly closed
            • Intermediate routing: an unexpected interruption on the link (e.g. a NAT timeout on a mobile access network)

            Solutions for abnormal disconnection

            To ensure that the server can detect the "user offline" state, the Turms client sends a heartbeat packet to the server once a certain interval has passed since its last request of any type (such as a message-sending request), in order to maintain its "online" status (configuring smart heartbeats based on network conditions is not supported yet). When the server receives a heartbeat packet, or any other business request, from the client, it refreshes the client's online status on the Redis server to keep the session alive.
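            The keep-alive idea can be sketched as follows. This is only a conceptual sketch, not the Turms client's actual implementation; the interval value and the sendHeartbeat function are assumptions.

            js
            // Conceptual sketch of the keep-alive idea (not the Turms client's actual code).
            const HEARTBEAT_INTERVAL_MILLIS = 120 * 1000; // assumed interval
            let lastRequestDate = Date.now();

            // Call this whenever any business request is sent.
            function onRequestSent() {
                lastRequestDate = Date.now();
            }

            setInterval(() => {
                if (Date.now() - lastRequestDate >= HEARTBEAT_INTERVAL_MILLIS) {
                    sendHeartbeat(); // hypothetical function that writes a heartbeat packet
                    lastRequestDate = Date.now();
                }
            }, 1000);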

            Business data change awareness

            To let users perceive changes in business data (additions, deletions, and updates), Turms supports a push mode (the server actively notifies), a pull mode (the client actively pulls; pulling by Timeline is supported), and a combined push-pull mode, in order to balance real-time performance against resource consumption and to let developers tune the weight between the two.

            Awareness approaches

            Method 1: Push mode (the server actively notifies)

            The push mode means that when a business model changes (because of an add, delete, or update operation), the server actively notifies the relevant online users of the event. When the notification arrives, the Turms client triggers the onNotification callback of NotificationService; the parameter of this callback is a TurmsRequest object representing the request that triggered the event.
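            A hedged sketch of handling such notifications on the client is shown below; the addNotificationListener registration method and the request field accessed are assumptions for illustration, and the exact API may differ across client versions.

            js
            // Hedged sketch: registering a notification callback (the registration method
            // name and the request field below are illustrative).
            turmsClient.notificationService.addNotificationListener(request => {
                if (request.createGroupMemberRequest) {
                    // e.g. someone was added to a group the local user belongs to
                }
            });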

            Notification-related behaviors can be configured according to the im.turms.server.common.infra.property.env.service.business.NotificationProperties class. Each notification type can be configured individually, and all notification-related configurations can be dynamically updated while the cluster is running.

            Example

            Take the property im.turms.server.common.infra.property.env.service.business.NotificationProperties#notifyMembersAfterGroupUpdated as an example. This property controls "whether to notify group members when the group information changes", where group information means global group data such as the group name, group type, and group mute time.

            If you set this property to true, the clients of the group members receive a notification about the change whenever the group information changes; otherwise they receive no notification.

            Evaluation

            The notification mechanism ensures that notifications reach the relevant users in real time, but its drawback is that it easily causes pointless resource consumption (depending on the concrete business scenario). For example, user A has joined 100 groups but usually only looks at 3 of them; if notifications are enabled for all status changes of all 100 groups, both the server and the client waste a lot of resources handling notifications that the user never reads.

            To solve this kind of problem and meet other common needs (such as letting users who were offline detect whether a business model changed when they come back online, or letting online users perceive changes even when notifications are turned off), Turms also provides a pull mode (the client actively pulls) for perceiving business model changes.

            Method 2: Pull mode (the client actively pulls. Supports pulling by Timeline)

            In order to make up for the deficiencies of the push mode mentioned above, Turms also provides a pull mode.

            Implementation

            Every Turms business model carries version information recording when the model was last updated. When the client requests a resource, it can (optionally) send along the time at which it last updated that business model. The Turms server compares this version with the version of the current business model: if the version sent by the client is earlier, the server returns the latest data; otherwise it responds with the status code NO_CONTENT and the client receives empty data.
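            The flow on the client side might look like the sketch below. The query method, its parameter, and the local-storage helpers are illustrative assumptions, not exact turms-client-js signatures.

            js
            // Hedged sketch of the pull mode (method and helper names are illustrative).
            async function syncJoinedGroups() {
                const lastUpdatedDate = loadLastUpdatedDate(); // hypothetical local-storage helper
                const groups = await turmsClient.groupService.queryJoinedGroupInfos({
                    lastUpdatedDate
                });
                if (groups) {
                    // the server-side data is newer: refresh the local cache
                    saveGroups(groups); // hypothetical local-storage helper
                    saveLastUpdatedDate(new Date()); // hypothetical local-storage helper
                } // else: NO_CONTENT, the local copy is already up to date
            }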

            Common pull timing (synchronization timing)
            • When your app is switched to the foreground
            • When the session is reconnected
            • Depends on specific business (see example below)
            Example

            Continuing the example above: suppose we want group members to perceive changes to other members' profiles in real time. If we used the notification mechanism, and each of user A's 100 groups had 100 other online members, a change to user A's profile would have to be pushed to 10,000 (100 groups × 100 members per group) group members, which is unacceptable in a real application.

            In practice, the client usually requests a user's information from the server at specific moments (for example when the user opens that user's profile UI or opens a chat window with them), and uses the version comparison above to avoid pointless resource consumption.

            Keep this constant attention to real-time performance versus resource consumption in mind, so that you do not design unrealistic application scenarios.

            Client-side real-time awareness of user behavior versus server-side delay

            Take blocking a user as an example. Turms caches user relationships for 1 minute by default to avoid frequent database queries, which is reasonable behavior. If user A "blocks" user B, it may therefore happen that user B can still send messages to user A during the cache window (because the Turms server is a distributed cluster, and the server holding the relationship cache is not necessarily the server that received the block request). This behavior is acceptable to the Turms server; it is not a bug.

            The reasonable and ideal reference solution lives in the client's business layer (the business logic you control, not the Turms client): even if the Turms server delivers a message to the Turms client, your client should still check, according to your product's own business logic, whether the sender has been blocked, and if so hide or drop the message.
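            Such a client-side check might look like the hedged sketch below; the blocklist lookup and the rendering function are your own business logic, not Turms client APIs.

            js
            // Hedged sketch: re-checking the local blocklist before showing a message
            // (myBlockedUserIds and renderMessage are hypothetical business-layer code).
            turmsClient.messageService.addMessageListener(message => {
                if (myBlockedUserIds.has(message.senderId)) {
                    return; // drop or hide messages from blocked users
                }
                renderMessage(message);
            });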

            Message awareness

            Read diffusion and write diffusion

            The architecture of Turms is designed around the read diffusion message model. The following comparison of read diffusion and write diffusion is provided for reference:

            Meaning
            • Read diffusion: 1. Each user has an individual conversation (also called a mailbox or Timeline) with every user or group they chat with. 2. When a user sends a message, whether private chat or group chat, the database stores only one message record. 3. When a user queries messages, the client sends a request specifying the list of private chat conversation IDs to pull private chat messages, and another request specifying the list of group chat conversation IDs to pull group chat messages.
            • Write diffusion: 1. Each user has exactly one mailbox. 2. When a user sends a message, it is written to the mailboxes of all members of the conversation; if a group chat has 100 other members, the message is written 100 times. 3. When a user queries messages, the client does not need to specify conversation IDs; it only asks the server to read its own mailbox.

            Advantageous scenarios
            • Read diffusion: scenarios where users have relatively few conversations (private chat and group chat conversations) and groups have many members. Note: if an application has only private chats and no group chats, the difference between read and write diffusion is small under the Turms implementation, because both models write one record when a message is sent and perform one index-based lookup when it is read (Turms uses the compound index of message sending time + recipient ID; see the Message collection design).
            • Write diffusion: to avoid copying messages too many times, write diffusion is relatively more suitable for scenarios with many group chats but few members per group.

            Disadvantageous scenarios
            • Read diffusion: because the client must specify the list of group chat conversation IDs, the disadvantageous scenario is many group chat conversations with users who read messages frequently. Reminder: the Turms server completes these queries with a single MongoDB request based on indexes, so they are actually very efficient; this is a disadvantage only relative to write diffusion.
            • Write diffusion: the more group members, the more times a message is copied, so the disadvantageous scenario is a single group with many members who send messages frequently.

            Technical implementation
            • Read diffusion: 1. Read requests can be load-balanced through MongoDB's shard and replica architecture. 2. All read requests are index-based and therefore performant.
            • Write diffusion: 1. Write operations are difficult to load-balance. 2. Implementing IM features such as updating or recalling messages is very costly, and distributed consistency issues and message storms must be taken into account.

            Message reliability
            • Read diffusion: if the product has high requirements on message reliability, that is, messages must not be lost and their content must stay consistent, read diffusion is much simpler to implement, because the database stores only one copy of each message and users only need to read that copy.
            • Write diffusion: because the message must be written into every group member's mailbox, weakly (or strongly) consistent distributed transactions have to be introduced, otherwise messages may be lost; but distributed transactions cause poor throughput.

            General comments
            • Read diffusion: 1. Applicable to a wide range of products. Features that are hugely expensive under write diffusion (such as sharing message history with new group members, or multi-device message synchronization) usually only require the client to customize its query conditions and send one query to the Turms server; the server does not need to change a single line of code, and these queries are all index-based. 2. Even in its disadvantageous scenarios, read diffusion can still rely on indexes to stay efficient.
            • Write diffusion: it requires writing a large number of message copies, any update operation (recall/update) needs distributed transactions, and implementing IM features (such as sharing message history with new group members, or multi-device synchronization) is very complicated. In summary, write diffusion extends very poorly; its usage scenarios are basically limited to applications with only private chats, no group chats, and simple features, yet for private-chat-only applications, as noted above, the performance of read and write diffusion is similar. If your product manager later asks for a new business feature, your team will quickly realize how fatal it is for an IM system to support only write diffusion: a feature that is efficient and easy to implement under read diffusion becomes inefficient and hard to implement under write diffusion.

            To emphasize again: unless you are very sure that your product's use case is as simple and limited as described above (the number of private chat conversations does not matter, but there are many group chat conversations and each group has few members) and that future business requirements will basically not change, adopting the write diffusion message model basically means that one day your product will have to be refactored back to the read diffusion model or support both models at once. Of course, write diffusion can also be kept around for a long time as "technical debt".

            Reminders:

            • Moving from a write diffusion implementation to a read diffusion implementation is almost equivalent to redesigning and reimplementing the whole project from scratch. Precisely because the message model has such a large impact on an IM architecture, whenever we talk about the Turms architecture, the first sentence is always "the Turms architecture is designed based on the read diffusion message model".
            • In the Turms server implementation, a "recall message" is itself a message, namely a special system message.

            Message reception, message update and message recall

            Turms implements message reception, update and recall on the client side based on the "push mode" and "pull mode" described above. Specifically:

            • Combining the "common pull timings" above with the discussion of message reachability, ordering and deduplication below, Turms can achieve 100% message delivery, consistent message ordering, and message deduplication.

            • Notifications of message updates and recalls are themselves messages, namely special system messages. After the Turms server receives a message update or recall request from a user, it first checks whether the feature is enabled, whether the user has permission, whether the request is within the allowed time window, and so on. If the checks pass, it then does the following (taking message recall as an example; message update works in the same way):

              • The Turms server first modifies the target message record stored in the database and marks it with a "message recalled" timestamp.

              • It then generates a "recall" system message (note: a message, not a notification) and inserts it into the message collection.

              • Finally, it sends this "recall" system message to the corresponding online users to tell their clients that some earlier messages have been recalled.

                After the client receives the system message, the developer needs to handle it in the business layer (the Turms client does nothing beyond parsing which messages were recalled), for example physically deleting the recalled message locally, just hiding it, or replacing it with something like "This message was recalled at XX".

                Supplement: as described above, when the Turms server processes a recall it currently sends a "recall" system message to the corresponding online clients so that they can quickly recall the locally stored message. A configuration item will also be added for applications that do not want the Turms server to send this system message actively.

              • If the user is offline and therefore never receives the "recall" system message, then the next time the user logs in, the client still needs to actively pull the messages received while it was offline; during that pull, the "recall" system message inserted above is pulled down as well, and when the developer detects such a system message they can handle it in the business layer.

                Reminder: developers can use the addMessageListener interface of the client's message service to determine whether a received message is a "recall" system message. Take the turms-client-js client as an example:

                js
                turmsClient.messageService.addMessageListener((message, addition) => {
                    // ...
                });
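                A fuller hedged sketch is shown below; the recalledMessageIds field on the addition parameter is an assumption based on the MessageAddition model and may differ across client versions, and the handler function is hypothetical business-layer code.

                js
                turmsClient.messageService.addMessageListener((message, addition) => {
                    // addition.recalledMessageIds is assumed to list the IDs of recalled messages
                    const recalledIds = addition.recalledMessageIds || [];
                    if (recalledIds.length) {
                        for (const messageId of recalledIds) {
                            hideOrReplaceLocalMessage(messageId); // hypothetical business-layer handler
                        }
                        return;
                    }
                    // otherwise handle the message as a normal one
                });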
                +    
                Skip to content

                Status Awareness

                Status awareness is divided into two categories, one is "user online status awareness", and the other is "business data change awareness" (such as receiving new messages, group members sending changes).

                Since the specific implementation of state awareness is closely related to specific product requirements, you need to be able to grasp the following two points:

                1. Determine whether the product demand is reasonable. Usually unreasonable requirements, such as: there can be 10,000 users in a group, when a user sends a message, it is necessary to ensure that the message can be 100% sent to other 9999 users, and the user can pull a few years ago Chat information.
                2. Distinguish primary and secondary requirements, and try to strike a balance between quality attributes. There are many details in the implementation of IM services. Is it really necessary to design a large number of back-and-forth strategies (such as message session-level auto-increment IDs) in order to be compatible with extreme situations, which not only greatly increases the development cost and failure points, but also makes the overall server throughput drops.

                User online status awareness

                In short, Turms detects the health status of the user's TCP connection through the heartbeat packet and judges whether the user is "online". Also, if you don't care about the underlying implementation, you only need to read: Client API - Session Lifecycle.

                Specific principles (expanding knowledge)

                background

                From the perspective of the network transport layer, TCP is just a virtual connection, which needs to simulate a physical connection through two-way message delivery and message confirmation. In the case of waving for the first time (that is, the specified message transmission and confirmation is not completed), TCP still determines that the connection is in a hold state (if you try to read data from the TCP connection at this time, it will throw a message similar to "An existing connection was forcibly closed by the remote host" message). Therefore, for the upper-layer instant messaging application developed based on the TCP protocol, if we do not do extra work, the server can only mistakenly believe that "the user is online".

                Common reasons why TCP did not complete the four wave

                • Client: The client application is forcibly closed
                • Server: The load continues to be too high to respond; the server directly goes down, causing the server application to be forcibly closed
                • Link intermediate routing: unexpected interruption (eg: mobile access network NAT timeout)

                Solutions for abnormal disconnection

                In order to ensure that the server can perceive the state of "user offline", the Turms client will, after a certain time interval from the last request of any type (such as a request to send a message) (for now, it does not support the configuration of smart heartbeat according to network conditions), Send heartbeat packets to the server to maintain its "online status". After the server receives the heartbeat packet or other business requests from the client, it will refresh the online status of the client on the Redis server to keep alive.

                Business data change perception

                In order to allow users to perceive changes in business data (addition, deletion, modification), Turms supports push mode (server-side active notification), pull mode (client-side active pull mechanism. Support pull by Timeline) and push-pull combination mode to achieve real-time Balance between real-time performance and resource consumption, and allow developers to adjust the weight between real-time performance and resource consumption.

                Perception

                Method 1: push mode (active notification from the server)

                The push mode means that when a business model changes (due to an add, delete, or update operation), the server actively notifies the relevant online users that the event has occurred. When the notification arrives, the Turms client triggers the onNotification callback of NotificationService; its parameter is a TurmsRequest object representing the request that triggered the event.
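
                A minimal sketch of reacting to such notifications on the client side; the registration style and the field checked below are illustrative assumptions, while the TurmsRequest parameter comes from the description above:

                js
                // Assumed registration style; newer client versions may expose an add*Listener method instead.
                turmsClient.notificationService.onNotification = (request) => {
                    // "request" is the TurmsRequest that triggered the event
                    if (request.createGroupMemberRequest) {   // hypothetical field check
                        // e.g. refresh the local member list of the related group
                    }
                };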

                Notification-related behaviors can be configured through the im.turms.server.common.infra.property.env.service.business.NotificationProperties class. Each notification type can be configured individually, and all notification-related configurations can be updated dynamically while the cluster is running.

                Example

                Take the property im.turms.server.common.infra.property.env.service.business.NotificationProperties#notifyMembersAfterGroupUpdated as an example. This property controls "whether to notify group members when group information changes". Group information here refers to global group information such as the group name, group type, and group mute time.

                If you set this property to true, the clients of the group members will receive a notification carrying the change whenever the group information changes; otherwise, group member clients will not receive any notification.

                Evaluation

                The notification mechanism ensures that notifications are delivered to the relevant users in real time, but its disadvantage is that it can easily lead to meaningless resource consumption (depending on the specific business scenario). For example, user A has joined 100 groups but usually only looks at 3 of them. If notifications are enabled for all status changes of the 100 groups, both the server and the client waste a lot of resources handling notifications that the user never reads.

                To solve this type of problem and to meet other common needs (such as letting users detect whether a business model changed while they were offline, or letting online users perceive changes even when notifications are turned off), Turms also provides a pull mode (the client actively pulls) for perceiving business model changes.

                Method 2: Pull mode (the client actively pulls; supports pulling by Timeline)

                To make up for the shortcomings of the push mode described above, Turms also provides a pull mode.

                Implementation

                Every business model in Turms carries version information recording when the model was last updated. When the client requests a resource from the server, it can (optionally) carry the time at which it last updated that business model. The Turms server compares this version with the version of the current business model: if the version sent by the client is earlier, the server returns the latest data; otherwise the status code NO_CONTENT is returned and the client receives empty data.
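
                A minimal sketch of this version comparison from the client side; the method name, parameter, response shape, and storage helpers below are illustrative assumptions, not the confirmed client API:

                js
                // Assumed method and storage helpers; only the "last updated date" idea comes from the text above.
                const lastUpdatedDate = loadLocalGroupsUpdateDate();   // hypothetical local storage helper
                const response = await turmsClient.groupService.queryJoinedGroupInfos(lastUpdatedDate);
                if (response.data) {
                    // the server returned data newer than the local copy
                    saveLocalGroups(response.data, new Date());        // hypothetical local storage helper
                }
                // otherwise the local copy is already up to date (NO_CONTENT, empty data)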

                Common pull timing (synchronization timing)
                • When your app is switched to the foreground
                • When the session is reconnected
                • Depends on specific business (see example below)
                Example

                Continuing with the example above: suppose we want group members to perceive changes in other members' profiles in real time. If we used the notification mechanism, and each of user A's groups had 100 other online users, then a change in user A's profile would have to be pushed to 10,000 (100 groups × 100 members per group) group members, which is unacceptable in practice.

                In practice, the client usually requests a user's information from the server only at specific moments (for example, when the user opens someone's profile page or opens a chat window with that person), and uses version comparison at the same time to avoid meaningless resource consumption.

                Keep this kind of design thinking, which always weighs real-time performance against resource consumption, in mind, so that you do not design unrealistic application scenarios.

                Real-time perception of user behavior by the client vs. server-side delay

                Take the implementation of blocking users as an example. Turms caches user relationships for 1 minute by default to avoid frequent database queries, which is reasonable behavior. If user A "blocks" user B at this moment, user B may still be able to send messages to user A during the cache window even though user A has already blocked user B (because the Turms server is a distributed cluster, and the server holding the relationship cache is not necessarily the server that received the block request). This behavior is acceptable to the Turms server; it is not a bug.

                A reasonable and practical solution is: at the business layer of the client (the business logic is controlled by you, not by the Turms client), even if the Turms server delivers a message to the Turms client, your client should still check, according to your product's own business logic, whether the sender has been blocked, and if so hide or drop the message.
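
                A minimal sketch of this client-side double check, assuming your business layer keeps its own blocklist; the blocklist, the senderId field, and the rendering helper are assumptions, while addMessageListener comes from the client example later in this document:

                js
                turmsClient.messageService.addMessageListener((message) => {
                    if (myBlocklist.has(message.senderId)) {   // hypothetical business-layer blocklist
                        return;                                // hide or drop the message
                    }
                    renderMessage(message);                    // hypothetical UI function
                });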

                Message awareness

                Read diffusion and write diffusion

                The architecture of Turms is designed based on the read diffusion message model. The following compares the advantages and disadvantages of read diffusion and write diffusion for reference:

                Meaning
                • Read diffusion: 1. Each user has a separate conversation (also called a mailbox or Timeline) with every user or group they chat with. 2. When a user sends a message, whether in a private chat or a group chat, the database only needs to store one message record. 3. When a user queries messages, the client sends a request specifying the list of private chat session IDs to pull private chat messages, and another request specifying the list of group chat session IDs to pull group chat messages.
                • Write diffusion: 1. Each user has exactly one mailbox. 2. When a user sends a message, the message has to be written to the mailbox of every member in the session; if a group chat has 100 other members, the message is written 100 times. 3. When a user queries messages, the client does not need to specify session IDs; it only sends one request to read the messages in its own mailbox.

                Advantageous scenarios
                • Read diffusion: scenarios with relatively few user sessions (private chat sessions and group chat sessions) and groups with many members. Note: if an application has only private chats and no group chats, then under the Turms server implementation the difference between read diffusion and write diffusion is small, because in both models sending a message requires one database write, and reading messages requires one index-based lookup (Turms uses a compound index of message sending time + recipient ID, see Message Collection Design).
                • Write diffusion: to avoid copying messages too many times, write diffusion is relatively better suited to scenarios with many group chats but few members per group.

                Disadvantageous scenarios
                • Read diffusion: because the client needs to specify the list of group chat session IDs, the disadvantageous scenario is: many group chat sessions whose users read messages frequently. Reminder: the Turms server completes the query above with one index-based MongoDB client request, so performance is actually efficient; this scenario is only a disadvantage relative to write diffusion.
                • Write diffusion: because the more group members there are, the more times each message is copied, the disadvantageous scenario is: a single group with many members who send messages frequently.

                Technical implementation
                • Read diffusion: 1. Read requests can be load-balanced through MongoDB's sharded replica architecture. 2. All read requests are index-based and perform well.
                • Write diffusion: 1. Write operations are difficult to load-balance. 2. Implementing IM features such as updating and recalling messages is very costly, and distributed consistency issues and message storms have to be considered.

                Message reliability
                • Read diffusion: if the product has high requirements for message reliability, i.e. messages must not be lost and message content must stay consistent, the read diffusion implementation is much simpler, because the database stores only one copy of each message and users only read that copy.
                • Write diffusion: because the message has to be written to every group member's mailbox, weak (or strong) distributed consistency transactions must be introduced, otherwise messages may be lost; and distributed consistency transactions lead to poor throughput.

                General comments
                • Read diffusion: 1. Read diffusion is applicable to a wide range of products. In contrast to the huge cost of implementing features under write diffusion, with read diffusion the client usually only needs to customize its query conditions and send a query to the Turms server (e.g. sharing history messages with new group members, multi-device message synchronization); the server does not need to change a single line of code, and these queries are completed based on indexes. 2. Even in its disadvantageous scenarios, read diffusion can still rely on indexes to stay efficient.
                • Write diffusion: because write diffusion writes a large number of message copies, any update operation (recall/update) also requires distributed transactions, and implementing IM features (such as sharing messages with new group members, multi-device synchronization) is very complicated. In short, the business extensibility of write diffusion is extremely poor, and its usage scenarios are basically limited to applications with only private chats, no group chats, and simple business functions; and as mentioned above, for private-chat-only applications the performance of read diffusion and write diffusion is similar. If your product manager later asks for new business features, your team will quickly realize how fatal it is for an IM system to support only write diffusion: a feature that is very efficient and easy to implement under read diffusion becomes inefficient and hard to implement under write diffusion.

                To emphasize again: unless you are very sure that your product's use cases are as simple and limited as described above (the number of private chat sessions does not matter; there are many group chat sessions but each group has few members) and that future business needs will basically stay unchanged, adopting the write diffusion message model basically means that your product will one day have to be refactored back to the read diffusion model, or to support both models at once. Of course, write diffusion can also be kept for a long time as "technical debt".

                Reminders:

                • Changing from a write-diffusion implementation to a read-diffusion implementation almost means redesigning and reimplementing the entire project from scratch. Precisely because the message model has such a large impact on an IM architecture, whenever we talk about the Turms architecture the first sentence is always "the Turms architecture is designed based on the read diffusion message model".
                • In the Turms server implementation, a "recall message" is itself a message, i.e. a special system message.

                Message reception, message update and message withdrawal

                Turms implements message receiving, updating, and recalling on the client side based on the "push mode" and "pull mode" described above. Specifically:

                • Combining the "common pull timing" above with "About the reachability, orderliness and repeatability of messages" below, Turms can achieve 100% message arrival, consistent message ordering, and message deduplication.

                • The notifications for message update and recall are themselves messages, i.e. special system messages. After receiving a message update or recall request from a user, the Turms server first checks whether the feature is enabled, whether the user has permission, whether the request is within the allowed time window, and so on. If the checks pass, it will (taking the message recall process as an example; message update works the same way):

                  • The Turms server first modifies the original target message record in the database and marks it with a "message recalled" timestamp.

                  • Then it generates a "recall message" system message (note: a message, not a notification) and inserts it into the message collection.

                  • Finally, it sends this "recall message" system message to the corresponding online users to inform their clients that certain messages have been recalled.

                    After the client receives this system message, the developer handles it at the business layer (the Turms client does no logical processing other than parsing which messages were recalled), for example physically deleting the message from the local database, just hiding it, or replacing the recalled message with text such as "This message was recalled at XX".

                    Supplement: as mentioned above, when the Turms server currently processes a recall request, it sends a "recall message" system message to the corresponding online clients so that they can quickly recall the locally received messages. Configuration items will be added later for applications that do not want the Turms server to actively send this system message.

                  • If a user is offline and therefore does not receive the "recall message" system message, then the next time the user logs in, the client still has to actively pull the messages received while offline, and the "recall message" system message inserted above is pulled down along with them. When developers detect such system messages, they can apply their own business-layer handling.

                    Reminder: developers can use the addMessageListener interface of the client's message service to determine whether a received message is a "recall message" system message. Taking the turms-client-js client as an example:

                    js
                    turmsClient.messageService.addMessageListener((message, addition) => {
                        if (addition.recalledMessageIds.length) {
                            // this is a system message that recalls messages
                        } else {
                            // it is not
                        }
                    });

                In addition:

                • Regarding message deletion on the Turms server: currently the server only performs a soft delete or hard delete of the messages and runs no recall-related logic. We will add configuration items in the future for applications that want messages to be recalled when they are deleted.
                • At present, the Turms server does not yet provide support for "update message" that is as complete as that for "recall message"; this part will be improved in the near future.

                About the reachability, orderliness and repeatability of messages

                Architectural design is always the art of balance, and blindly promising "100% message delivery" is just sales rhetoric. For example, in their technical implementation of distributed transactions, most Internet applications use weak distributed consistency transactions with better performance rather than strong distributed consistency transactions that are more reliable but perform poorly. Whether 100% message delivery is necessary depends on the business scenario. For example, a live chat room not only does not require every message to arrive, it even requires the server to actively discard user messages according to load and message priority, or to deliver a message to only some of the users.

                A live-streaming scenario may not require message ordering either, and instead calls for a design such as "maximize throughput; try to keep messages ordered, but do not spend extra resources to guarantee it". Some IM applications even choose, "to balance high throughput and high reachability, a best-effort delivery mechanism for free groups and a guaranteed delivery mechanism for VIP groups". Real application requirements are always varied.

                Therefore, it is emphasized again: when designing features, distinguish primary from secondary requirements and strike a balance between quality attributes; never design in isolation from real business scenarios.

                Summary

                Since the detailed comparison of the message features below is relatively involved, this summary gives you the final solutions up front.

                In general, Turms follows the principle that whatever the client can implement itself, the Turms server does not implement, so as to achieve maximum throughput and flexible business implementations. If a feature must be implemented by the server and has little impact on throughput, it is enabled by default; otherwise it is disabled by default. Specifically:

                • Reachability

                  • Solution 1: if you want to achieve nearly 100% message delivery, enable use-sequence-id-for-group-conversation and use-sequence-id-for-private-conversation under turms.service.message.sequence-id (both disabled by default). With this mechanism, every time a message record is generated, the server requests a session-level auto-increment sequence ID from Redis and assigns it to the record. The client can then detect lost messages through the continuity of these IDs together with the message sending time (the sending time is needed because, if Redis crashes and the sequence data is lost, sequence IDs start again from the beginning; when the client detects that the sequence number has become smaller, it can use the sending time to determine which message is the latest).

                    Note: sequence ID has nothing to do with message ID.

                  • Solution 2 (default implementation): if you do not require 100% message delivery, keep the above configuration disabled to obtain greater message push throughput.

                • Orderliness

                  • Sequential eventual consistency

                    • Solution 1: use the auto-increment sequence ID mentioned above, which also yields message ordering as a side effect.
                    • Solution 2 (default implementation): use server time to order messages. Reminder: it is not only messages that rely on system time; many functional modules of the Turms server rely on it heavily, such as IDs generated by the Snowflake algorithm, log timestamps, and timestamp-based rate limiting and abuse prevention.
                  • Consistency of receiving order: some IM systems delay sending or displaying messages on the client side to reduce cases where "the client first receives a message sent later and then receives a message sent earlier", which would force the message UI to be re-sorted. Turms has no plans to provide such support.

                  • Causal consistency: when the client sends a message, it can carry the preMessageId field to indicate the ID of the last message displayed in the sender's message UI. This record has no practical effect for Turms itself, but other clients can use it when rendering their message UI to achieve causal consistency of the message flow between clients.

                    Note: preMessageId has nothing to do with the implementation of "message reachability"; it is only used by your product to sort the message UI.

                • Repeatability (deduplication). Here the Turms server only guarantees that every message record has a globally unique ID; deduplication has to be implemented by the developer on the client side. If your application needs 100% message deduplication, you have to persist the IDs of received messages; if it only needs deduplication within one application lifecycle, it is enough to keep received message IDs in memory and, whenever the server pushes a new message, check whether a message with that ID has already been processed (see the sketch after this list).

                  Reminder: usually you only need to keep message IDs from a recent window of local time (such as the last day); storing all of them is unnecessary.
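
                A minimal sketch of such in-memory deduplication; the message field name and the business-layer handler are assumptions, while addMessageListener and the globally unique message ID come from the text above:

                js
                const handledMessageIds = new Set();

                turmsClient.messageService.addMessageListener((message) => {
                    if (handledMessageIds.has(message.id)) {   // "id" is assumed to be the message ID field
                        return;                                // already processed, ignore the duplicate
                    }
                    handledMessageIds.add(message.id);
                    handleMessage(message);                    // hypothetical business-layer handler
                });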

                In addition, the following section describes a common but frequently flawed design in the industry, the "message confirmation mechanism that requires server participation", as a negative example: it achieves the worst "reachability" and "repeatability" at the highest cost, and its performance and scalability are also poor. (TODO: this part of the documentation has not been updated yet.)

                Message Confirmation Mechanism (Acknowledge)

                It is worth noting that:

                1. Turms' message confirmation mechanism does not require the participation of the Turms server.
                2. The message confirmation mechanism is completely independent of the business-level "message read" feature; the two are unrelated.
                Introduction
                • Ack mechanism requiring server participation: in some instant messaging architectures, after receiving messages the client is required to send a confirmation request to the server at a certain interval (such as every 5 or 10 seconds) rather than confirming each message as soon as it arrives; this improves confirmation efficiency and reduces the probability of losing messages due to network delays. The server records the latest confirmation time of each session, so that when the user pulls messages from all sessions (such as when coming online), all messages from the confirmation time to the present can be pulled with a simple request.
                • Ack mechanism not requiring server participation: the client stores the last confirmation time of each session locally. When the client wants to fetch the messages of any of its sessions, it sends the corresponding session IDs and confirmation times to the server, and the server returns all messages from those times to the present.

                Advantages
                • Requiring server participation: 1. The client is simple to implement and does not need to store session information locally.
                • Not requiring server participation: 1. The client can customize the range of messages it fetches, which fits more business cases and easily supports multi-device message synchronization. 2. The server does not need to first look up the confirmation time of all sessions and then pull messages by ack time, so performance is better. 3. The client does not need to send periodic confirmation requests to the server, which completely saves the overhead of a large number of confirmation operations.

                Disadvantages
                • Requiring server participation: 1. The server has to first look up the confirmation time of all sessions and then pull messages by confirmation time, so performance is relatively poor. 2. The client needs to send confirmations to the server for the messages it receives.
                • Not requiring server participation: 1. When the client sends a request, it has to carry all the session IDs it wants to query together with the corresponding confirmation times, so the request body is relatively large (but this also corresponds to advantage ② above). 2. Developers need to implement a local database on the client (e.g. a Realm database; Turms may help developers implement local storage as an extension in the future).

                About message reachability

                Architectural design is always the art of balance, and blindly promising "100% message delivery" is just sales rhetoric. For example, in their technical implementation of distributed transactions, most Internet applications use weak distributed consistency transactions with better performance rather than strong distributed consistency transactions that are more reliable but perform poorly. Whether 100% message delivery is necessary depends on the business scenario (for example, in a live chat room, not only are messages not required to arrive, the server is even required to actively discard user messages according to its load).

                The solution for achieving 100% message delivery is also relatively simple: implement a session-level auto-increment ID generator through Redis so that message IDs increase monotonically within a session. The client can then detect missing messages through gaps in the IDs and, when it finds one, request the specified messages from the server.
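
                A minimal sketch of such a session-level sequence generator, not Turms' actual server code; it only assumes a Redis client exposing the standard INCR command (ioredis is used here for illustration):

                js
                import Redis from 'ioredis';   // assumed Redis client library

                const redis = new Redis();

                // INCR is atomic, so concurrent senders in the same conversation still get
                // strictly increasing IDs; a client seeing a gap (e.g. 41 -> 43) knows a
                // message is missing and can request it from the server again.
                function nextSequenceId(conversationId) {
                    return redis.incr(`conversation:seq:${conversationId}`);
                }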

                Turms will also support the session-level auto-increment message ID implementation described above to guarantee 100% message delivery (TODO), while continuing to provide a global ID implementation based on the Snowflake algorithm for the best throughput (at the cost that 100% delivery cannot be guaranteed).

                About the implementation of the unread message count

                Business needs

                • When used as an app badge (badge number), display the total number of unread messages (on iOS the total must be calculated on the server side), with or without support for offline updates.
                • When used as a conversation badge inside the app, display the number of unread messages of each conversation.

                Plans

                Plan A: do not support carrying the unread message count in offline message push (default implementation). Plan B: support carrying the unread message count in offline message push (TODO).

                Implementation
                • Plan A: when the client receives or pulls messages, it sends a request to the server to count the "unread messages" in real time. In this plan the Turms server has no real concept of an unread message count; it only counts the number of messages within a given sending-time interval according to the client's request.
                • Plan B: use Redis to carry the unread message count in offline message push: either carry both the per-session unread count and the total unread count, or carry only the total. Whenever a message is sent, add 1 to the per-session unread count and to the total in Redis; when the user reads messages, or when a user or group is deleted, perform the opposite subtraction on the Redis records. (Note: the total unread count must be calculated by the server.)

                Advantages
                • Plan A: 1. Simple to implement and flexible enough to support various business needs, without introducing a Redis server. 2. When sending a message, no request to Redis is needed to update unread counts, so write throughput is higher.
                • Plan B: 1. Supports carrying the unread message count in offline message push. 2. Reading the unread count requires no real-time calculation, so read throughput is higher.

                Disadvantages
                • Plan A: 1. Does not support carrying the unread message count in offline message push. 2. When the client reads the unread count, it has to be calculated in real time, so read throughput is lower (note: the calculation is backed by indexes).
                • Plan B: 1. A Redis server has to be introduced, which increases operation and maintenance cost and difficulty. 2. Every time the server receives a new message, it has to send a request to Redis to update the unread counts, so write throughput is lower.

                Relationship with unread messages
                • Plan A: both unread messages and the unread message count are per-device. The client sends its locally stored last confirmation time to the server (as in the ack mechanism described above) to obtain the "unread" messages and the "unread" message count after that point in time. Therefore, the unread messages and unread counts seen by different devices may be inconsistent.
                • Plan B: unread messages are still per-device, but the unread message count is per-user. If message A is "read" on the desktop client, the mobile client can still consider it "unread", but the unread count pushed to all of the user's clients is uniformly reduced by 1. So the unread messages seen by different devices may be inconsistent, while the unread message count stays consistent.

                Supplement
                • Plan A: as mentioned above, this plan could in fact be "forced" to support carrying the unread count in offline message push. But because it is not designed for frequent reads of the unread count, having the server calculate the count in real time for every pushed message performs too poorly, so in practice this is not supported.
                • Plan B: TODO: this implementation will be supported in the near future.

                Both plans have their own advantages and disadvantages, and which one to use depends on the business requirements of the specific application: if you do not need offline message push to carry the unread count, use Plan A; if you do, use Plan B. If you have additional requirements beyond these two plans, you need to do the secondary development yourself.

                Implementation

                TODO

                About the implementation of offline push

                For online users, developers can use the notification properties to configure whether the server is allowed to actively push messages to online users (true by default). For offline users, offline push usually has to go through the push SDKs provided by phone vendors and their push channels.

                However, since Turms itself does not integrate with any vendor and has no plan to do so, you need to implement custom offline push logic through the NotificationHandler plugin. The handler provides a handle function that accepts four parameters: the message information, the online user IDs, the offline user IDs, and an optional unread message count. In this function you can call the vendor's push SDK to implement your offline push logic.
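
                A conceptual sketch only, not the actual plugin extension-point signatures: the four parameters follow the description above, and the push SDK wrapper is hypothetical:

                js
                async function handle(message, onlineRecipientIds, offlineRecipientIds, unreadCount) {
                    if (!offlineRecipientIds.length) {
                        return;
                    }
                    // hypothetical wrapper around a vendor push SDK (APNs, FCM, vendor channels, ...)
                    await vendorPushSdk.push({
                        userIds: offlineRecipientIds,
                        title: 'New message',
                        badge: unreadCount
                    });
                }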

                Message batch pull

                TODO: not supported yet. Since message pulling is controlled by the client itself, this feature can be implemented efficiently and flexibly, and we will provide support before the official release.

                Extra-large groups

                Implementing an extra-large group is not difficult in itself, but its business requirements and scenarios differ greatly from those of general social applications, so a dedicated set of strategies is needed to support it.

                Strategy (TODO)

                1. Messages are sent according to priority
                2. Intelligently limit the peak value of messages, and actively discard messages according to the server status and message priority
                3. Send messages in buckets (subgroups)
                4. Message roaming is usually not required

                Group-related Features

                Types of group members include: group owner, administrator, ordinary member, visitor, anonymous visitor

                • Admin API path: /groups. For specific API details, please refer to the OpenAPI documentation
                • Client Interface: Please refer to the GroupServiceController class.
                • The underlying request model: please refer to the interface description file in the https://github.com/turms-im/proto/tree/master/request/group directory
                • Configuration class: im.turms.server.common.infra.property.env.service.business.group.GroupProperties

                Function list

                Function — Description — Related configuration properties

                • Create group — Create a new group. Config: turms.service.group.activate-group-when-created
                • Dismiss group — The group owner can dismiss the group. Config: turms.service.group.delete-group-logically-by-default
                • Leave group voluntarily — Except for the group owner, users can leave the group on their own initiative. The group owner has to transfer the group to another member before being able to leave.
                • Transfer group ownership — The group owner can transfer ownership of the group to another member. After the transfer, the transferee becomes the new group owner and the original owner becomes an ordinary member. The group owner can also choose to leave the group directly while transferring.
                • Modify group information — Supports the group name, group avatar, group introduction, group announcement, group type, and other fields.
                • Mute group — Ordinary members cannot send messages during the mute period; only the group owner and administrators can send messages.
                • Get group information — Find groups based on filter conditions (such as group ID).
                • Add group members — Add group members.
                • Send group invitation — Group members whose role has invitation permission can send an invitation to a specified user. Config: turms.service.group.invitation.content-limit, turms.service.group.invitation.expire-after-seconds, turms.service.group.invitation.expired-invitations-cleanup-cron, turms.service.group.invitation.delete-expired-invitations-when-cron-triggered
                • Recall group invitation — The group owner, administrators, and the sender of the invitation can recall a pending invitation. Config: turms.service.group.invitation.allow-recall-pending-invitation-by-owner-and-manager
                • Send group join request — Config: turms.service.group.join-request.content-limit, turms.service.group.join-request.expire-after-seconds, turms.service.group.join-request.expired-join-requests-cleanup-cron, turms.service.group.join-request.delete-expired-join-requests-when-cron-triggered
                • Recall group join request — Config: turms.service.group.join-request.allow-recall-join-request-sent-by-oneself
                • Set group join questions — For groups whose joining policy is "join after the requester answers the questions correctly", the group owner and administrators can set join questions. There can be multiple questions, and one question can have multiple answers. Config: turms.service.group.question.answer-content-limit, turms.service.group.question.max-answer-count, turms.service.group.question.question-content-limit
                • Delete group join questions — Delete group join questions.
                • Remove group members — The group owner and administrators can remove group members; administrators cannot remove the group owner or other administrators.
                • Update group member information — Depending on the corresponding "group type", group members with the specified roles can modify the member information of other members (for example, the group owner assigns administrator roles to members).
                • Mute group members — A muted user can stay in the group but cannot send messages.
                • Real-time group member location sharing — Group members can share their location with other members in real time.
                • Group blacklist — Once a user is blacklisted, they can no longer enter the group. If the blocked user was a group member before being blocked, they are automatically removed from the member list.

                Group type configuration

                In terms of group configuration, Turms uses the concept of "group types". By default, Turms provides a general group type, and you can also add, delete, update, and query "group types" to meet your own customized needs.

                Corresponding admin API: /groups/types (for specific API details, please refer to the OpenAPI documentation). Corresponding configuration model: im.turms.service.domain.group.po.GroupType

                Configuration list

                Attribute — Description — Configuration attribute name

                • Maximum number of group members — Valid values are 1~∞ — groupSizeLimit
                • Group invitation policy — Supported configurations: ① only the group owner can invite: OWNER, OWNER_REQUIRING_APPROVAL; ② the group owner and administrators can invite: OWNER_MANAGER, OWNER_MANAGER_REQUIRING_APPROVAL; ③ the group owner, administrators, and group members can invite: OWNER_MANAGER_MEMBER, OWNER_MANAGER_MEMBER_REQUIRING_APPROVAL; ④ everyone can invite: ALL, ALL_REQUIRING_APPROVAL — invitationStrategy
                • Invitee consent mode — Supported configurations: ① the invitee's consent is required: the inviter sends an invitation to the invitee, and if the invitee accepts it, they automatically join the group (the strategies with _REQUIRING_APPROVAL); ② the invitee's consent is not required: the inviter is not allowed to send invitations and instead adds the invitee to the group directly (the strategies without _REQUIRING_APPROVAL) — invitationStrategy
                • Group joining policy — Supported configurations: ① the requester can join after the group owner or an administrator approves the join request: JOIN_REQUEST; ② the requester joins automatically after answering the join questions correctly: QUESTION; ③ any user who is not blocked can join on their own initiative: MEMBERSHIP_REQUEST; ④ no user can join on their own initiative; the group owner or an administrator has to send an invitation or add the user directly: INVITATION — joinStrategy
                • Group information update strategy — Supported configurations: ① only the group owner can modify; ② the group owner and administrators can modify; ③ the group owner, administrators, and group members can modify; ④ everyone can modify — groupInfoUpdateStrategy
                • Group member information update strategy — The group owner can modify the member information of everyone in the group; administrators can only modify the member information of ordinary members — memberInfoUpdateStrategy
                • Guest speaking — Can be prohibited or allowed — guestSpeakable
                • Group members modify their own information — Can be prohibited or allowed — selfInfoUpdatable
                • Group message read receipts — Can be turned on or off — enableReadReceipt
                • Modify sent messages — Can be turned on or off — messageEditable

                Reminders:

                • There is no mutually exclusive relationship between the "invitation policy", "invitee consent mode", and "joining policy" above; they are all compatible with each other, so developers can combine them according to their own application scenarios.

                • If an administrator modifies the invitation policy or joining policy of a group type and this changes the policy that applies to a group, the data corresponding to the old policy is archived rather than deleted by the system. Authorized users can still delete, update, and query that data.

                  For example, a group originally let new users join based on the "approve join requests" policy and has already received some join requests. If a system administrator (note: ordinary users do not have permission to modify group types) changes the joining policy to the question-and-answer policy, the previously received join requests are not deleted by the system. When a group administrator tries to approve those join requests, the server notifies them that the group's joining policy has changed and rejects the approval, but group administrators can still delete, update, and query those requests.

                  In addition, some readers may feel that Turms' group policies are complicated, but this "complexity" has nothing to do with end users: users only need to configure according to their own application scenarios, which is very simple; it is only Turms' implementation of these dynamically combined policies that is more complex.

                • We have no plan to support the feature "users block a group to refuse group invitations and being pulled into the group".

                Scenario walkthrough

                User joins a group

                1. The client queries the group information of the specified group through turmsClient.groupService.queryGroups(...).

                2. Obtain the group type information from the locally hard-coded mapping between group type IDs and group type information.

                  Notes:

                  • Here, the client does not support dynamically querying group type information, because the group types of most applications are fixed and there is no need to pull them dynamically.
                  • If your application uses only one group type, you can hard-code the group type information directly on the client, skip steps ① and ②, and go straight to the next step.
                3. According to the joining policy in the group type information, determine which client API needs to be called to join the group (a combined sketch follows this list):

                  • If the policy is JOIN_REQUEST, call turmsClient.groupService.createJoinRequest(...) to send a join request and wait for a group administrator's approval.
                  • If the policy is QUESTION, call turmsClient.groupService.queryGroupJoinQuestions(...) to query the join questions and then turmsClient.groupService.answerGroupQuestions(...) to answer them; once the score reaches the threshold set by the group administrators, the user joins the group automatically.
                  • If the policy is MEMBERSHIP_REQUEST, call turmsClient.groupService.joinGroup(...) to join the group directly without any approval.
                  • If the policy is INVITATION, wait for a group administrator to send the current user an invitation to join the group.
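
                A sketch of the join flow above; only the method names come from the steps, while the parameters, response shapes, and helper functions are illustrative assumptions:

                js
                const groups = await turmsClient.groupService.queryGroups([groupId]);     // step 1 (parameter shape illustrative)
                const joinStrategy = lookUpGroupTypeLocally(groups).joinStrategy;         // step 2, hypothetical local lookup

                if (joinStrategy === 'JOIN_REQUEST') {
                    await turmsClient.groupService.createJoinRequest(groupId, 'Hi, please let me join');
                } else if (joinStrategy === 'QUESTION') {
                    const questions = await turmsClient.groupService.queryGroupJoinQuestions(groupId);
                    await turmsClient.groupService.answerGroupQuestions(buildAnswers(questions));  // hypothetical answer builder
                } else if (joinStrategy === 'MEMBERSHIP_REQUEST') {
                    await turmsClient.groupService.joinGroup(groupId);
                } else {
                    // INVITATION: wait for a group administrator to send an invitation
                }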
                - +
                Skip to content

                Group-related Features

                Types of group members include: group owner, administrator, ordinary member, visitor, anonymous visitor

                • Admin API path: /groups. For specific API details, please refer to the OpenAPI documentation
                • Client Interface: Please refer to the GroupServiceController class.
                • The underlying request model: please refer to the interface description file in the https://github.com/turms-im/proto/tree/master/request/group directory
                • Configuration class: im.turms.server.common.infra.property.env.service.business.group.GroupProperties

                function list

                Function
                DescriptionRelated configuration attribute name
                New GroupNew Groupturms.service.group.activate-group-when-created
                The group owner dismisses the groupThe group owner can dismiss the groupturms.service.group.delete-group-logically-by-default
                Actively withdraw from the groupExcept for the group owner, other users can actively withdraw from the group. The group owner needs to transfer the group to other group members before they can withdraw from the group
                Group owner transfer groupThe group owner can transfer the owner authority of the group to other members in the group. After the transfer, the transferred person becomes the new group owner, and the original group owner becomes an ordinary member. The group owner can also choose to quit the group directly while transferring
                Modify group informationSupport group name, group avatar, group introduction, group notification, group type and other fields
                Group banOrdinary members of the group cannot send messages during the mute period, only the group owner and administrator can send messages
                Get group informationFind groups based on filter conditions (such as group ID)
                Add group membersAdd group members
                Send invitation to join the groupGroup members with the role of invitation permission can send invitation to the specified userturms.service.group.invitation.content-limit
                turms.service.group.invitation.expire-after- seconds
                turms.service.group.invitation.expired-invitations-cleanup-cron
                turms.service.group.invitation.delete-expired-invitations-when-cron-triggered
                Cancel the invitation to join the groupThe group owner, administrator and initiator of the invitation to join the group can cancel the invitation to join the groupturms.service.group.invitation.allow-recall-pending-invitation-by-owner-and-manager
                Send group requestturms.service.group.join-request.content-limit
                turms.service.group.join-request.expire-after-seconds
                turms.service.group.join -request.expired-join-requests-cleanup-cron
                turms.service.group.join-request.delete-expired-join-requests-when-cron-triggered
                Cancel group join requestturms.service.group.join-request.allow-recall-join-request-sent-by-oneself
                Set group entry questionsFor groups whose group entry policy is "join after the group entry requester answers the questions correctly", group owners and administrators can set group entry questions. There can be multiple questions for entering the group, and one question can have multiple answersturms.service.group.question.answer-content-limit
                turms.service.group.question.max-answer-count
                turms .service.group.question.question-content-limit
                Delete group entry questionDelete group entry question
                Remove group membersGroup owners and administrators can remove group members, and administrators cannot remove group owners and other administrators
                Update group member informationAccording to the corresponding "group type", group members with specified roles can modify the member information of other group members (for example: the group owner assigns administrator roles to group members)
                Muting group membersMuted users can be in the group, but cannot send messages
                Group member coordinates sharing in real timeGroup members can share their coordinates with other group members in real time
                Group BlacklistAfter a user is blacklisted, he will no longer be able to enter the group. If the blocked user is a current group member before being blocked, the user will be automatically removed from the group member list after being blocked

                Group type configuration

                In terms of group configuration, Turms uses the concept of "group types". By default, Turms provides a general group type, and you can also add, delete, modify and query the "group type" to meet your customized group type needs.

                Corresponding admin API: /groups/types. For specific API details, please refer to the OpenAPI documentation Corresponding configuration model: im.turms.service.domain.group.po.GroupType

                Configuration list

                AttributeDescriptionConfiguration attribute name
                Maximum number of group membersValid value is 1~∞groupSizeLimit
                Group Invitation PolicySupport configuration:
                ①Only the group owner can invite: OWNER, OWNER_REQUIRING_APPROVAL;
                ②The group owner + administrator can invite: OWNER_MANAGER, OWNER_MANAGER_REQUIRING_APPROVAL;< br />③Group owner + administrator and group members can invite: OWNER_MANAGER_MEMBER, OWNER_MANAGER_MEMBER_REQUIRING_APPROVAL;
                ④Everyone can invite: ALL, ALL_REQUIRING_APPROVAL
                invitationStrategy
                Invitee's Consent ModeSupport configuration:
                ①The invitee's consent is required: the inviter sends an invitation to the invitee. If the invitee agrees to the invitation, it will automatically join the group: the strategy with _REQUIRING_APPROVAL;
                ②The invitee's consent is not required: the inviter is prohibited from sending invitations to the invitee. The inviter can directly add the invitee to the group: strategy without _REQUIRING_APPROVAL
                invitationStrategy
                Group Joining PolicySupported configuration:
                ①After the group owner or administrator approves the group joining request, the group requester can join: JOIN_REQUEST;
                ②After the group joining requester answers the questions correctly , automatically join: QUESTION;
                ③Allow unblocked users to actively join:MEMBERSHIP_REQUEST;
                ④No user is allowed to actively join, the group owner or administrator needs to send an invitation or directly pull Entering the group: INVITATION
                joinStrategy
                Group information update strategySupported configuration:
                ①Only the group owner can modify;
                ②Group owner + administrator can modify;
                ③Group owner+administrator+group members can modify;
                br />④ Everyone can modify
                groupInfoUpdateStrategy
                Group member information update strategyThe group owner can modify the member information of everyone in the group, and the administrator can only modify the member information of ordinary members in the groupmemberInfoUpdateStrategy
                Guest SpeakProhibited, AllowedguestSpeakable
                Group members modify their own informationCan be prohibited, allowedselfInfoUpdatable
                Group message read receiptCan be turned on and offenableReadReceipt
                Modify sent messagesCan be turned on and offmessageEditable

                remind:

                • There is no mutually exclusive relationship between the above "invitation policy", "invitee consent mode" and "group policy", and they are all compatible with each other, so developers can match them according to their own application scenarios .

                • If the administrator modifies the invitation policy or joining policy of a group type, which leads to a change in the policy corresponding to the group, the data corresponding to the old policy will be archived and will not be deleted by the system. Authorized users can still delete, modify and query these data.

                  For example, a group originally allowed new users to join the group based on the policy of "approving group entry requests", and the group has received some group entry requests. If the system administrator (note: users do not have permission to modify the group type) modify the group policy to "question-and-answer based" policy to allow new users to join the group, then the previously received request to join the group will not be deleted by the system. When the group administrator tries to approve these group entry requests, the server will also notify the group policy of the change and reject the approval. But group administrators can still delete, modify and query these group requests.

                  In addition, some users may think that the group policy of Turms is more complicated, but this kind of "complexity" has nothing to do with users. Users only need to configure according to their own application scenarios. It is very simple to use, just the development of Turms It is more complicated to implement these dynamic combination strategies.

                • We have no plan to support the feature of "users block groups to refuse to receive group invitations and be pulled into groups".

                Scene introduction

                User joins a group

                1. The client queries the group information of the specified group through turmsClient.groupService.queryGroups(...).

                2. Obtain group type information based on the relationship between the local hard-coded group type ID and group type information.

                  Replenish:

                  • Here, the client does not support dynamic query of group type information because the group type of most applications is fixed, and there is no need to dynamically pull information.
                  • If your application only uses one group type, you can directly hard-code the group type information on the client side, skip steps ① and ②, and go directly to the next step.
                3. According to the group entry policy in the group type information, determine which client API needs to be called to join the group:

  • If it is the JOIN_REQUEST policy, call turmsClient.groupService.createJoinRequest(...) to send a join request and wait for a group administrator's approval.
  • If it is the QUESTION policy, call turmsClient.groupService.queryGroupJoinQuestions(...) to query the join questions, then call turmsClient.groupService.answerGroupQuestions(...) to answer them. Once the score reaches the threshold set by the group administrator, the user joins the group automatically.
  • If it is the MEMBERSHIP_REQUEST policy, call turmsClient.groupService.joinGroup(...) to join the group directly without any approval.
  • If it is the INVITATION policy, wait for a group administrator to send the current user an invitation to join the group.
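The following is a minimal sketch of this flow in JavaScript. It assumes an already logged-in turmsClient; the joinStrategy field name, the object-style parameters and the buildAnswers helper are illustrative assumptions, not the authoritative client API.

```javascript
// Sketch only: join a group according to the joining policy of its group type.
// Assumes `turmsClient` is an initialized, logged-in Turms client and that
// `groupId` / `groupType` were obtained in steps 1 and 2 above.
async function joinGroupByPolicy(turmsClient, groupId, groupType) {
  switch (groupType.joinStrategy) { // field name is an assumption
    case 'JOIN_REQUEST':
      // Send a join request and wait for a group administrator's approval.
      await turmsClient.groupService.createJoinRequest({ groupId, content: 'Hello, please let me in' });
      break;
    case 'QUESTION': {
      // Answer the join questions; joining succeeds once the score reaches the threshold.
      const questions = await turmsClient.groupService.queryGroupJoinQuestions({ groupId, withAnswers: false });
      const answers = buildAnswers(questions); // buildAnswers is application-specific (hypothetical)
      await turmsClient.groupService.answerGroupQuestions(answers);
      break;
    }
    case 'MEMBERSHIP_REQUEST':
      // Join directly without any approval.
      await turmsClient.groupService.joinGroup({ groupId });
      break;
    case 'INVITATION':
      // Nothing to do proactively: wait for a group administrator to send an invitation.
      break;
  }
}
```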
                + \ No newline at end of file diff --git a/docs/feature/index.html b/docs/feature/index.html index ca08eab9..be53a9f4 100644 --- a/docs/feature/index.html +++ b/docs/feature/index.html @@ -17,8 +17,8 @@ -

                Business Features

1. In the feature list, some features are marked with the "✍" icon. It indicates that you need to decide for yourself, based on your own application scenarios, whether the feature should be triggered, and then call the relevant API, because Turms itself cannot determine whether the current context meets the conditions for triggering that feature.
2. This feature list references commercial instant messaging services such as Netease Yunxin, Huanxin, Rongyun, LeanCloud and Tencent Cloud Communication. Turms provides almost all of the business features these commercial services provide, and improves on them in many ways.
3. Turms' configuration parameters are extremely flexible. You can, for example, configure a group with an upper limit of 10,000 members, allow a single message of up to 100MB, turn off most business features, or extend message forwarding to all users; the Turms server will not interfere with any of your business scenarios. Turms simply provides the most common and reasonable defaults, such as a default upper limit of 500 members per group and 1MB per message.
4. If you don't find the feature you need in this list, first check whether your requirement can be met just by configuring Turms properties. If you confirm that it cannot, please raise it in the Issues area; it will be evaluated on a cost-benefit basis and we will try to meet your needs as best we can.
5. Turms' version numbers do not strictly follow Semantic Versioning; major version bumps are mainly driven by the introduction of key features. Breaking changes are presented separately in the related Breaking Changes section.

                Notice

• For some feature points, the Turms server or client does not directly implement the business behavior itself. Taking the "burn after reading" feature as an example, what Turms actually does is pass an additional burnAfter parameter along with the message; how and when to "burn" the message after it is read, and whether the copies in the user's local database are also "burned", are implementation details that the upper-layer application has to handle, and Turms does not intervene.
• When designing features, keep in mind the relevant laws and regulations of your country and avoid designs that conflict with regulatory requirements, such as "Internet Interactive Service Security Management Requirements Part 4: Instant Messaging Service".
                + \ No newline at end of file diff --git a/docs/feature/message.html b/docs/feature/message.html index c0e71d2e..ad318a8e 100644 --- a/docs/feature/message.html +++ b/docs/feature/message.html @@ -17,8 +17,8 @@ -

                Message-related Features

                • Admin API path: /messages. For specific API details, please refer to the OpenAPI documentation
                • Client interface: Please refer to the MessageServiceController class
                • The underlying request model: please refer to the interface description file in the https://github.com/turms-im/proto/tree/master/request/message directory
                • Configuration class: im.turms.server.common.infra.property.env.service.business.message.MessageProperties

Feature list

The list below gives each message-related function, its description, and the related configuration properties.

• Offline Messages
  Implementation idea: each time the Turms client logs in, it can actively request from the Turms server the number of offline messages received in each private chat and group chat while the user was offline, together with the data of the last N messages (1 by default), so as to balance message freshness against service performance. By default, the Turms server does not periodically delete any offline messages stored on it.
  Related configuration: turms.service.message.default-available-messages-number-with-total

• Roaming Messages ✍
  When a new device logs in, the developer calls the message query interface of the Turms client, specifying the number of messages and the time range, to request roaming messages from the Turms server. The implementation of roaming messages is essentially the same as that of "historical messages".
  (✍ Reason: Turms cannot judge what counts as a "new device login".)

• Multi-terminal synchronization
  When a user has multiple clients online at the same time, the Turms server sends messages to all of the user's online clients.

• Historical Messages
  Supports querying a user's historical messages. By default, Turms stores messages (both user messages and system messages) permanently. The implementation of historical messages is essentially the same as that of "roaming messages".
  Related configuration: turms.service.message.message-retention-period-hours, turms.service.message.expired-messages-cleanup-cron

• Send Message
  Related configuration: turms.service.message.time-type, turms.service.message.persist-message, turms.service.message.persist-record, turms.service.message.persist-pre-message-id, turms.service.message.persist-sender-ip, turms.service.message.check-if-target-active-and-not-deleted, turms.service.message.max-text-limit, turms.service.message.max-records-size-bytes, turms.service.message.allow-send-messages-to-oneself, turms.service.message.allow-send-messages-to-stranger, turms.service.message.delete-message-logically-by-default, turms.service.message.send-message-to-other-sender-online-devices, turms.service.message.use-conversation-id, turms.service.message.sequence-id.use-sequence-id-for-group-conversation, turms.service.message.sequence-id.use-sequence-id-for-private-conversation

• Message Recall
  Withdraw a message that has already been delivered. By default, the sender is allowed to recall a message within 5 minutes of its delivery.
  Related configuration: turms.service.message.allow-recall-message, turms.service.message.available-recall-duration-seconds

• Message Editing
  Edit a message that has been sent successfully.
  Related configuration: turms.service.message.allow-edit-message-by-sender

• Burn after reading
  After the recipient receives the sender's message, the recipient's client automatically destroys it after the time preset (or defaulted) by the sender.

• Read Receipt ✍
  Notify the private chat peer or group members that the current user has read a message; check the read/unread status of the other party in private and group conversations.
  (✍ Reason: Turms cannot know under what circumstances your user has "read a certain message"; the developer needs to call turmsClient.messageService.readMessage() to inform the other party that the current user has read a message.)
  Related configuration: turms.service.conversation.read-receipt.enabled, turms.service.conversation.read-receipt.allow-move-read-date-forward, turms.service.conversation.read-receipt.update-read-date-after-message-sent, turms.service.conversation.read-receipt.update-read-date-when-user-querying-message, turms.service.conversation.read-receipt.use-server-time

• Message Forwarding
  Forward a message to another user or group.

• @someone (mention)
  "@someone" is used to specifically remind a user. If the Turms client detects that a user mentioned by @ in a received message is the currently logged-in user, it triggers the @ callback; developers can implement the follow-up business logic themselves, typically to remind the mentioned user. There is no essential difference between an @ message in a group and an ordinary message, except that special handling (triggering the callback) is needed when an @ message is received.

• Typing ✍
  When one party in a conversation is typing, inform the recipient(s) that the user is typing a message.
  (✍ Reason: Turms has no way of knowing whether your users are typing.)
  Related configuration: turms.service.conversation.typing-status.enabled
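As mentioned in the Read Receipt entry above, turmsClient.messageService.readMessage() informs the other party that the current user has read a message. The exact parameters depend on your client version, so the argument shape below is an assumption.

```javascript
// Sketch only: tell the other side of a private chat that the current user
// has read a message. The parameter names here are illustrative assumptions.
await turmsClient.messageService.readMessage({
  isGroupMessage: false,
  targetId: '123',   // the private chat peer (illustrative)
  messageId: '456'   // the message that was read (illustrative)
});
```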

                Precautions when querying session messages

By default, Turms does not support "in a private chat conversation, the message sender can query the messages they sent themselves" (specific reason: Message Index Design; note that in a group conversation, the message sender can always query their own messages). Developers can enable conversation IDs by setting turms.service.message.use-conversation-id=true in the configuration file of the turms-service server.

After that, the semantics of turmsClient.messageService.queryMessages({areGroupMessages: false, fromIds: [10, 11, 12]}) change from the original "query the messages sent to the current user by the users whose IDs are 10, 11 and 12 in private chat conversations" to "query the messages sent to the current user by the users whose IDs are 10, 11 and 12, together with the messages sent by the current user to those users, in private chat conversations", as shown in the sketch below.
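A small sketch of the call above, assuming an already logged-in turmsClient; anything beyond the parameters shown in the text is an assumption.

```javascript
// Sketch only: query private-chat messages involving users 10, 11 and 12.
// With turms.service.message.use-conversation-id=true on turms-service, both
// directions of each private conversation are returned; otherwise only the
// messages sent to the current user are returned.
const messages = await turmsClient.messageService.queryMessages({
  areGroupMessages: false,
  fromIds: [10, 11, 12]
});
console.log(`Fetched ${messages.length} messages`); // assumes the result is an array of messages
```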

                Business message type

From a developer's point of view, the Turms client uses one and only one data model when sending messages, namely CreateMessageRequest. Since it has fields of type string and List<byte[]>, you can actually pass any kind of data when sending a message. The common business message types below are defined merely to help developers implement typical message types and get started quickly.

Reminder: Turms messages (of all business types) can be marked as system messages. However, system messages can only be sent through the Turms admin API; the Turms client cannot send them.
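As a hedged illustration of sending a system message through the admin HTTP API (/messages on turms-service), the request body fields below are assumptions for illustration; check the OpenAPI documentation for the real schema and authentication scheme.

```javascript
// Sketch only: send a system message via the admin API. Field names in the body
// are illustrative assumptions; authentication headers are omitted.
await fetch('http://localhost:8510/messages', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    isSystemMessage: true,   // assumption: marks the message as a system message
    targetIds: [123],        // assumption: recipient user IDs
    text: 'Scheduled maintenance tonight at 02:00 UTC'
  })
});
```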

The built-in business message types and their contents are:

• Text message: the content is text. Reminder: the text can also be JSON, or binary data encoded as Base64.
• Image message: the content is an optional description part (image URL, file size, image dimensions) and optional image data.
• Voice message: the content is an optional description part (URL, duration, size and format of the voice file) and optional voice data.
• Video message: the content is an optional description part (URL, duration, size and format of the video file) and optional video data.
• File message: the content is an optional description part (URL, size and format of the file) and optional file data.
• Geographic location message: the content is the location title, address, longitude and latitude.
• Combined message: the content is text plus any number of other messages of any content type (for example, a message containing text, images and audio at the same time).
• Custom message: Turms uses only one data structure during transmission, which can carry string and List<byte[]> data, so developers are free to implement any custom message type, such as red envelope or rock-paper-scissors messages.
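A minimal sketch of sending a custom message by packing JSON into the text field. The object-style sendMessage parameters shown here are an assumption; check the exact signature of your turms-client version.

```javascript
// Sketch only: a "rock-paper-scissors" custom message carried in the text field.
// The sendMessage parameter shape is an assumption, not the authoritative API.
const payload = JSON.stringify({ type: 'rock-paper-scissors', move: 'rock' });
await turmsClient.messageService.sendMessage({
  isGroupMessage: false,
  targetId: '123',   // recipient user ID (illustrative)
  text: payload,     // the string field of CreateMessageRequest
  records: []        // the List<byte[]> field, for optional binary attachments
});
```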

                Implementation of binary data transmission

                There are two main implementation schemes for the transmission of binary data (files):

Option 1: send binary data in the records field of the message API via the Turms client (not recommended)

• Introduction: by default, Turms supports transmitting and storing binary data attached to messages, so you can put binary data such as images, videos and files in the records field.
• Advantages: simple to implement.
• Disadvantages: a Turms client establishes one and only one TCP connection with the server, so if a user transfers a large file through the records field, it blocks the transmission of other business requests; moreover, when MongoDB queries message data it loads the whole message record into memory, which greatly slows down message queries.

Option 2: use an object storage service (AWS S3, Alibaba Cloud OSS, etc.)

• Introduction: your application client (note: the client of your IM application, not the Turms client) requests an OSS access token from your application server, uses the token to upload the file to the OSS service, and then passes the file URL returned by OSS to the Turms server; Turms stores the URL text instead of the file's binary data. Because Turms plugins allow developers to implement file management services themselves, you can also implement this with a plugin; for example, turms-plugin-minio, the official integration with the MinIO object storage server, is implemented on top of the Turms plugin mechanism and can serve as a reference.
• Advantages: unlimited capacity; CDN acceleration for a better user experience; visual UI management and various operation and maintenance features. Cloud storage services generally also support practical features such as redundant storage, server-side encryption, and hot/cold tiered storage (which greatly reduces storage costs).
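A rough sketch of the object-storage flow, under stated assumptions: requestUploadToken and uploadToOss are hypothetical functions provided by your own application server and OSS SDK, and the sendMessage parameter shape is an assumption.

```javascript
// Sketch only: upload a file to object storage, then send its URL as a message.
// requestUploadToken() and uploadToOss() are hypothetical application code, not Turms APIs.
async function sendFileMessage(turmsClient, recipientId, file) {
  const token = await requestUploadToken(file.name); // your app server issues an OSS token
  const fileUrl = await uploadToOss(token, file);    // upload directly to the OSS service
  // Turms only stores the URL text, not the binary data of the file.
  await turmsClient.messageService.sendMessage({
    isGroupMessage: false,
    targetId: recipientId,
    text: JSON.stringify({ type: 'file', url: fileUrl, name: file.name, size: file.size })
  });
}
```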

                Reference: Storage Service

                + \ No newline at end of file diff --git a/docs/feature/simultaneous-login.html b/docs/feature/simultaneous-login.html index ab1a3831..12bbb5f0 100644 --- a/docs/feature/simultaneous-login.html +++ b/docs/feature/simultaneous-login.html @@ -17,8 +17,8 @@ -
                + \ No newline at end of file diff --git a/docs/feature/user.html b/docs/feature/user.html index aac0ceec..8892f56b 100644 --- a/docs/feature/user.html +++ b/docs/feature/user.html @@ -17,8 +17,8 @@ -

                User-related Features

                • Admin API path: /users. For specific API details, please refer to the OpenAPI documentation
                • Client Interface: Please refer to the UserServiceController class
                • The underlying request model: please refer to the interface description file in the https://github.com/turms-im/proto/tree/master/request/user directory
                • Configuration class: im.turms.server.common.infra.property.env.service.business.user.UserProperties

                User information function

Each entry below gives the function, its description, and the related configuration.

• Add user. Related configuration: turms.service.user.activate-user-when-added
• Delete user. Related configuration: turms.service.user.delete-user-logically
• Modify user profile: users modify their own nickname, introduction and avatar URL.
• Get user profile: users view their own or another user's profile.
• Set user profile access permissions: users can set the access permission of each profile item. The permissions are: visible to everyone, visible to friends, visible only to yourself.
• User permission group: administrators can grant different permissions to different users. Configuration model: im.turms.service.domain.user.po.UserPermissionGroup

                User Relationship Hosting

Concepts:

• Relationship: relationships are divided into one-way and two-way relationships. A one-way relationship means that the relationship owner has a specific relationship with a related user, such as "one-way friend" (allowing the other party to send messages and friend requests) or "blocked user" (prohibiting the other party from sending messages, friend requests, etc.). Establishing a one-way relationship does not require the other party's approval. A two-way relationship means that user A has a one-way relationship with user B and user B has a one-way relationship with user A. For example, user A can block user B while user B chooses not to block user A.
• Related users: users who have a one-way or two-way relationship with each other (one designates the other as a friend or blocks the other). Two users are strangers if they have neither kind of relationship.
• Related-user group: a related-user group consists of a group name and a set of related users, and every relationship must belong to at least one related-user group. If the client does not specify a group when creating a relationship, the relationship is put into the user's default group. Note in particular that a single related-user group can therefore contain both "friend" and "blocked" users; of course, you can restrict a group to only one type of related user through your own business rules.

Additional note: in fact, there is no such concept as "friend/blocked user" in the Turms domain model; it boils down to a boolean called isBlocked.
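A conceptual sketch of a one-way relationship record as described above; the field names other than isBlocked are illustrative, not the exact schema.

```javascript
// Conceptual sketch only: a one-way relationship from user 1 to user 2.
const relationship = {
  ownerId: 1,        // the relationship owner (illustrative field name)
  relatedUserId: 2,  // the related user (illustrative field name)
  isBlocked: false   // false => "friend"-like relationship, true => "blocked user"
};
```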

Each entry below gives the function, its description, and the related configuration.

• Get relationships: get the relationships owned by the current user according to optional filters (such as user IDs, "related or not", "friend or blocked user", etc.) and grouping conditions.
• Add a relationship (and optionally initiate a friend request): 1. if the relationship to add is "friend", then depending on your Turms server configuration the user can either add the "friend" relationship directly, or must first initiate a friend request, in which case the "friend" relationship is only established once the requestee approves it; 2. if the relationship to add is "blocked user", no approval is required and it takes effect immediately; the user will no longer receive any messages or requests from the blocked user. Related configuration: turms.service.user.friend-request.content-limit, turms.service.user.friend-request.delete-expired-requests-when-cron-triggered, turms.service.user.friend-request.allow-send-request-after-declined-or-ignored-or-expired, turms.service.user.friend-request.friend-request-expire-after-seconds, turms.service.user.friend-request.expired-user-friend-requests-cleanup-cron
• Approve/reject a friend request: users can approve or reject a friend request. If the request is approved, the two users establish a two-way "friend" relationship.
• Delete relationships: delete a certain type of relationship or a specified relationship according to optional conditions (such as "related / not related", "friend / blocked user"). Related configuration: deleteTwoSidedRelationships
• Modify a relationship: modify relationship (friend / blocked user) information. When changing a relationship to "friend", a friend request needs to be sent by default (this step can be disabled).
• Create a related-user group: when creating a group, the group name and the related users to add can be specified at the same time. The same user can be added to multiple groups.
• Delete a related-user group: delete a related-user group, optionally transferring the related users in the deleted group to other groups (if not specified, they are moved to the default group).
• Rename a related-user group.
• Get the user's own related-user group information.
• Add a related user to a group: add or move a related user to a related-user group. The operation fails if the group does not exist.
• Delete a related user from a related-user group.

                GPS

                Configuration class: im.turms.server.common.infra.property.env.common.location.LocationProperties

Each entry below gives the function, its description, and the related configuration.

• User location recording: periodically record user locations. Related configuration: turms.location.enabled, turms.location.treat-user-id-and-device-type-as-unique-user
• People nearby: search for other nearby users based on the current real-time coordinates. Related configuration: turms.location.users-nearby-request.default-max-available-nearby-users-number, turms.location.users-nearby-request.default-max-distance-meters, turms.location.users-nearby-request.max-available-users-nearby-number-limit, turms.location.users-nearby-request.max-distance-meters

                Statistics function

                Configuration class: im.turms.server.common.infra.property.env.service.env.StatisticsProperties

                Although Turms provides some basic statistical functions, it is recommended that users collect various statistical data through cloud services, such as Amazon CloudWatch.

• Online user statistics: the Master node in the Turms cluster periodically records the number of online users in the cluster in logs. Related configuration: turms.service.statistics.log-online-users-number, turms.service.statistics.online-users-number-logging-cron
                + \ No newline at end of file diff --git a/docs/hashmap.json b/docs/hashmap.json index e821e8be..13432ff3 100644 --- a/docs/hashmap.json +++ b/docs/hashmap.json @@ -1 +1 @@ -{"zh-cn_reference_admin-api.md":"aa307b66","server_development_testing.md":"bcef65b5","client_communication-protocol.md":"ece16fe5","client_api.md":"725adea3","zh-cn_server_deployment_distribution.md":"6598d373","zh-cn_server_development_plugin.md":"48efe233","community_index.md":"d5c5ea8c","feature_index.md":"dc889387","feature_message.md":"5a764148","feature_simultaneous-login.md":"f7c62cdc","feature_user.md":"a73ae67a","index.md":"88e8fd9c","reference_admin-api.md":"cd1aa4c1","server_development_rules.md":"d366e1d5","server_deployment_getting-started.md":"db0bd29d","server_development_code.md":"8d1c62a0","zh-cn_client_metrics.md":"4e0dce30","server_module_anti-spam.md":"1eb3d9b0","server_module_cluster.md":"871e6a9f","server_module_identity-access-management.md":"4d1d60bf","server_module_observability.md":"536db662","server_module_security.md":"2811177b","server_module_system-resource-management.md":"14cf6b55","server_module_xmpp.md":"dcf5ade5","turms-admin.md":"e78a83e8","zh-cn_client_communication-protocol.md":"c3b01846","zh-cn_design_schema.md":"067d7975","zh-cn_client_quick-start.md":"242732d8","zh-cn_client_turms-client-js.md":"de32f7bf","server_module_data-analytics.md":"89dd732f","zh-cn_design_architecture.md":"1e666ada","zh-cn_reference_status-code.md":"0a40300d","server_module_storage.md":"53d425fa","zh-cn_feature_simultaneous-login.md":"d2fc551a","zh-cn_feature_user.md":"512ab7fc","zh-cn_client_api.md":"b7e0577a","zh-cn_server_deployment_config.md":"ebee420e","server_development_redevelopment.md":"fbceab5a","zh-cn_client_requirements.md":"334bb1c0","zh-cn_feature_message.md":"0c4082f1","server_module_chatbot.md":"a5496727","feature_group.md":"82db277a","zh-cn_server_deployment_getting-started.md":"bc587018","zh-cn_feature_group.md":"cb522884","design_architecture.md":"a62a79fa","design_status-aware.md":"f346f12c","client_metrics.md":"d00c67a6","zh-cn_server_development_redevelopment.md":"55189f5c","zh-cn_server_development_rules.md":"d361d583","zh-cn_server_development_testing.md":"7382a948","zh-cn_server_module_chatbot.md":"8b1749eb","zh-cn_server_module_cluster.md":"a5af839f","server_development_plugin.md":"4ad29477","zh-cn_server_module_data-analytics.md":"0e180478","zh-cn_server_module_identity-access-management.md":"33a2fed4","zh-cn_server_module_observability.md":"f5a551b7","zh-cn_server_module_security.md":"e70930d1","zh-cn_server_module_storage.md":"7a7fd54a","zh-cn_server_module_system-resource-management.md":"71ac4014","zh-cn_server_module_xmpp.md":"d1012659","zh-cn_turms-admin.md":"b9c1d665","zh-cn_client_session.md":"8e3853f1","client_session.md":"aad38adf","zh-cn_server_module_anti-spam.md":"4c2dcfde","zh-cn_index.md":"19b7ad26","zh-cn_design_status-aware.md":"02556b58","client_requirements.md":"2fe21189","client_turms-client-js.md":"90459bf7","reference_status-code.md":"8a608fc0","server_deployment_config.md":"68734049","server_deployment_distribution.md":"76550939","zh-cn_server_development_code.md":"a90098da","design_schema.md":"32995249","zh-cn_community_index.md":"7d14e5f0","zh-cn_feature_index.md":"35f4ed6d","client_quick-start.md":"2495bdda"} 
+{"client_session.md":"aad38adf","client_turms-chat-demo.md":"3a23e9e1","client_turms-client-js.md":"90459bf7","client_communication-protocol.md":"ece16fe5","client_metrics.md":"d00c67a6","client_quick-start.md":"2495bdda","community_index.md":"d5c5ea8c","feature_group.md":"82db277a","feature_index.md":"dc889387","server_deployment_getting-started.md":"db0bd29d","server_development_code.md":"8d1c62a0","client_api.md":"eb765a2b","server_deployment_distribution.md":"76550939","zh-cn_server_module_observability.md":"f5a551b7","design_schema.md":"32995249","design_status-aware.md":"f346f12c","server_module_cluster.md":"871e6a9f","feature_user.md":"a73ae67a","server_module_data-analytics.md":"89dd732f","server_module_chatbot.md":"a5496727","design_architecture.md":"a62a79fa","reference_admin-api.md":"cd1aa4c1","reference_status-code.md":"8a608fc0","server_deployment_config.md":"7619e646","server_development_plugin.md":"4ad29477","server_development_redevelopment.md":"fbceab5a","server_development_rules.md":"d366e1d5","server_development_testing.md":"bcef65b5","server_module_anti-spam.md":"1eb3d9b0","feature_message.md":"5a764148","feature_simultaneous-login.md":"f7c62cdc","client_requirements.md":"2fe21189","index.md":"88e8fd9c","zh-cn_server_development_code.md":"a90098da","server_module_identity-access-management.md":"4d1d60bf","server_module_observability.md":"536db662","server_module_security.md":"2811177b","server_module_storage.md":"53d425fa","server_module_system-resource-management.md":"14cf6b55","turms-admin.md":"e78a83e8","zh-cn_client_api.md":"294bb9cc","zh-cn_client_communication-protocol.md":"c3b01846","zh-cn_client_metrics.md":"4e0dce30","zh-cn_client_quick-start.md":"242732d8","zh-cn_client_requirements.md":"334bb1c0","zh-cn_community_index.md":"7d14e5f0","zh-cn_design_architecture.md":"1e666ada","zh-cn_server_development_plugin.md":"48efe233","zh-cn_server_development_redevelopment.md":"55189f5c","zh-cn_server_development_rules.md":"d361d583","zh-cn_server_development_testing.md":"7382a948","zh-cn_server_module_anti-spam.md":"4c2dcfde","zh-cn_server_module_chatbot.md":"8b1749eb","zh-cn_server_module_cluster.md":"a5af839f","zh-cn_server_module_data-analytics.md":"0e180478","zh-cn_server_module_security.md":"e70930d1","zh-cn_server_module_system-resource-management.md":"71ac4014","zh-cn_server_module_xmpp.md":"d1012659","zh-cn_turms-admin.md":"b9c1d665","zh-cn_design_schema.md":"067d7975","zh-cn_design_status-aware.md":"02556b58","zh-cn_feature_group.md":"cb522884","zh-cn_feature_index.md":"35f4ed6d","zh-cn_feature_simultaneous-login.md":"d2fc551a","zh-cn_feature_user.md":"512ab7fc","zh-cn_index.md":"19b7ad26","zh-cn_reference_admin-api.md":"aa307b66","zh-cn_reference_status-code.md":"0a40300d","zh-cn_server_deployment_config.md":"f5b194c8","zh-cn_server_deployment_distribution.md":"6598d373","zh-cn_server_deployment_getting-started.md":"bc587018","server_module_xmpp.md":"dcf5ade5","zh-cn_feature_message.md":"0c4082f1","zh-cn_server_module_storage.md":"7a7fd54a","zh-cn_server_module_identity-access-management.md":"33a2fed4","zh-cn_client_session.md":"8e3853f1","zh-cn_client_turms-chat-demo.md":"f52ce343","zh-cn_client_turms-client-js.md":"de32f7bf"} diff --git a/docs/index.html b/docs/index.html index 3536c300..0ef3f6fa 100644 --- a/docs/index.html +++ b/docs/index.html @@ -17,7 +17,7 @@ -

                What is Turms

                Turms is the most advanced open-source instant messaging engine for 100K~10M concurrent users in the world. Please refer to Turms Documentation for details.

                Playground

                (Version of demo servers: ghcr.io/turms-im/turms-admin:latest, ghcr.io/turms-im/turms-gateway:latest, ghcr.io/turms-im/turms-service:latest)

                You can use any turms-client-(java/js/swift) implementation to send requests to turms-gateway and interact with other users.

In addition, the Playground can be set up automatically with just one command: ENV=dev,demo docker compose -f docker-compose.standalone.yml --profile monitoring up --force-recreate -d

                Quick Start

Run the following commands to set up a minimum viable cluster (including turms-gateway, turms-service and turms-admin) and the servers it depends on (a MongoDB sharded cluster and Redis) automatically:

                sh
                git clone --depth 1 https://github.com/turms-im/turms.git
                 cd turms
                 docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions
                 docker compose -f docker-compose.standalone.yml up --force-recreate
                git clone --depth 1 https://github.com/turms-im/turms.git
                @@ -38,7 +38,7 @@
Original Project Documentation: https://turms-im.github.io/docs
Original Project Name: turms-im/turms
Original Project: https://github.com/turms-im/turms
Original Project Documentation: https://turms-im.github.io/docs

                Q & A

                1. How is the Turms project profitable?

We do not need to be profitable at present. Of course, we do not rule out profit, but we will not deliberately write bad documentation or do a bad job in order to earn consulting, training and other fees. It is also worth mentioning that there are indeed many nominally open source projects that earn support fees precisely by writing poor documentation and doing a poor job.

2. If profit-making organizations, such as training institutions or companies, cite Turms' documentation, or even sell the Turms project as a SaaS service, is there anything they need to pay attention to?

We don't mind whether your team plans to make a profit from the Turms project. Your team only needs to comply with the Apache License 2.0 and credit the Turms project as described above.

                3. The Turms project is suitable for making SaaS services, so why doesn't the Turms project adopt the AGPL or SSPL license?

                  We currently do not need to make a profit, and we do not plan to make a profit. We only require users to comply with the Apache License 2.0 license.

4. If the Turms project is not profit-driven, how good can its quality be?

Our documentation and source code answer this question for us: in the open source community, there is no open source IM project that can compete with Turms in medium and large scale IM application scenarios. It is also worth mentioning that being commercial does not imply high quality; the documentation and code quality of many commercial projects is shockingly poor.

                5. Does Turms use dual license agreements or have hidden charges?

No. Some projects are free for personal use but charge for commercial use, use dual licensing, or have many hidden charges. The Turms project is licensed under the Apache License 2.0, and there is no charge. Some projects claim to be open source software but are not; for details, please refer to The Open Source Definition.

                Special Thanks

                Mainly developed in IntelliJ IDEA and CLion.

                License kindly provided by JetBrains Community Support Team.


                Admin API

The Turms server Admin API follows the OpenAPI 3.0 standard, and each server exposes its OpenAPI documentation externally via an HTTP service.

If you need to consult the API documentation, you can access http://localhost:<port>/openapi/ui after starting the Turms server. If you need the API documentation in JSON format, you can get it at http://localhost:<port>/openapi/docs. The default port of the turms-gateway admin HTTP server is 9510, while turms-service uses port 8510.
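For example, a minimal sketch of fetching the OpenAPI documents with curl, assuming a locally running cluster with the default ports:

sh
# OpenAPI JSON of the turms-service admin API (default port 8510)
curl http://localhost:8510/openapi/docs
# OpenAPI JSON of the turms-gateway admin API (default port 9510)
curl http://localhost:9510/openapi/docs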

Note: When deploying the Turms server to a production environment, there is usually no need to expose the Admin API port of the Turms server to the public network; keeping it closed avoids unwanted attacks.

                Interface Design Guidelines

To make the interfaces self-explanatory and easy for developers to understand at a glance, the design of Turms' Admin API draws on the RESTful style, with further optimization and unification, and follows the guidelines below (a short curl sketch follows the list).

• The path portion of the URL represents the target resource, such as /users/relationships, or a representation of the resource, such as /users/relationships/page, which indicates that the resource is returned in paged form. A URI returns a response in one, and only one, format.

• The POST method creates resources, DELETE deletes resources, PUT updates resources, GET queries resources, and the more specific HEAD checks resources (similar to GET but without a response body, interacting only through HTTP status codes)

• The query string of the request is used to locate resources, such as ?ids=1,2,3, or to carry an additional directive, such as ?reset=true

  Note: Unlike the RESTful style, the Turms server does not use the URL path for resource location. For example, for the interface that downloads a JFR file, the RESTful style would be GET /flight-recordings/jfr/{id}, but in the Turms server it is GET /flight-recordings/jfr?id={id}

• The request body describes the data to be created or updated
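As a rough illustration of these conventions, here is a curl sketch against a local turms-service on the default port 8510. It is illustrative only: whether a given endpoint accepts these exact parameters depends on the API documentation above, and the admin credentials required by Admin API Security are omitted.

sh
# Locate resources by the query string rather than the URL path
curl -X GET 'http://localhost:8510/users/relationships?ids=1,2,3'
# Ask for the paged representation of the same resource
curl -X GET 'http://localhost:8510/users/relationships/page'
# Check a resource: HEAD behaves like GET but returns only HTTP status codes
curl -I 'http://localhost:8510/users/relationships?ids=1,2,3'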

Consumers of the Admin API

• Your front-end management system or back-end server, which calls the API via HTTP(S) requests

• The turms-admin web project, which administrators use as the management console

Note: The Admin API is not for end users but for internal calls within your team, so normally you do not need to expose an external IP and port of the turms-service server for it.

                Categories

                Monitoring

Type | Controller | Path | Supported Servers
Log Management | LogController | /logs | All
Metrics Management | MetricsController | /metrics | All
Flight Recording Management | FlightRecordingController | /flight-recordings | All

                Plugin

Type | Controller | Path | Supported Servers
Plugin Management | PluginController | /plugins | All

                Administrator

Type | Controller | Path | Supported Servers | Notes
Admin Management | AdminController | /admins | turms-service | Each Turms cluster has a default account with the role ROOT, whose account name and password are both turms
Admin Role Management | AdminRoleController | /admins/roles | turms-service | By default, each Turms cluster has a super administrator role ROOT with all privileges

                Cluster

Type | Controller | Path | Supported Servers
Cluster Node Management | MemberController | /cluster/members | turms-service
Cluster Configuration Management | SettingController | /cluster/settings | turms-service

                Blocklist

Type | Controller | Path | Supported Servers
IP Blocklist Management | IpBlocklistController | /blocked-clients/ips | turms-service
User Blocklist Management | UserBlocklistController | /blocked-clients/users | turms-service

                User Session

Type | Controller | Path | Supported Servers
User Session Management | SessionController | /sessions | turms-gateway

All of the APIs in the following tables exist only on the turms-service server and are not available on the turms-gateway server.

                User

Type | Controller | Path
User Information Management | UserController | /users
User Online Info Management | UserOnlineInfoController | /users/online-infos
User Permission Group Management | UserPermissionGroupController | /users/permission-groups
User Relationship Management | UserRelationshipController | /users/relationships
User Relationship Group Management | UserRelationshipGroupController | /users/relationships/groups
User Friend Request Management | UserFriendRequestController | /users/relationships/friend-requests

                Group

Type | Controller | Path
Group Management | GroupController | /groups
Group Type Management | GroupTypeController | /groups/types
Group Question Management | GroupQuestionController | /groups/questions
Group Member Management | GroupMemberController | /groups/members
Group Blocklist Management | GroupBlocklistController | /groups/blocked-users
Group Invitation Management | GroupInvitationController | /groups/invitations
Group Join Request Management | GroupJoinRequestController | /groups/join-requests

                Chat Session

Type | Controller | Path
Conversation Management | ConversationController | /conversations

                Message Classes

Type | Controller | Path
Message Management | MessageController | /messages

                Statistics

                The statistics-related interfaces currently exposed to the public are mostly Legacy APIs, which are not recommended. We will adjust and refactor them later. Please refer to the chapter Data Analysis for specific reasons.

                Admin API Security

Every HTTP request sent to the Turms server goes through the Turms server's authentication and authorization process, which is described in Administrator Security.


                Status Code

There are two kinds of status codes that developers need to understand: ResponseStatusCode and SessionCloseStatus. The contents of the following tables do not need to be memorized; you only need to know where to look them up when you encounter an unfamiliar status code.

                ResponseStatusCode

                ResponseStatusCode indicates the processing status in the request response, similar to the HTTP status code.

                Each request response will contain a ResponseStatusCode. For the specific status code declaration, please refer to the im.turms.client.model.ResponseStatusCode class under the turms-client-kotlin project.

Client-specific status codes

Client-specific status codes never appear on the Turms server; they indicate that the client request was rejected locally on the client.

Category | Name | Status Code
Connection related | CONNECT_TIMEOUT | 1
Request related | INVALID_REQUEST | 100
 | CLIENT_REQUESTS_TOO_FREQUENT | 101
 | REQUEST_TIMEOUT | 102
 | ILLEGAL_ARGUMENT | 103
Notification related | INVALID_NOTIFICATION | 200
 | INVALID_RESPONSE | 201
Session related | CLIENT_SESSION_ALREADY_ESTABLISHED | 300
 | CLIENT_SESSION_HAS_BEEN_CLOSED | 301
Message related | MESSAGE_IS_REJECTED | 400
Storage related | QUERY_PROFILE_URL_TO_UPDATE_BEFORE_LOGIN | 500

                Common Status Codes

Category | Name | Status Code
Successful Response | OK | 1000
 | NO_CONTENT | 1001
 | ALREADY_UP_TO_DATE | 1002
Client request error | INVALID_REQUEST_FROM_SERVER | 1100
 | CLIENT_REQUESTS_TOO_FREQUENT_FROM_SERVER | 1101
 | ILLEGAL_ARGUMENT_FROM_SERVER | 1102
 | RECORD_CONTAINS_DUPLICATE_KEY | 1103
 | REQUESTED_RECORDS_TOO_MANY | 1104
 | SEND_REQUEST_FROM_NON_EXISTING_SESSION | 1105
 | UNAUTHORIZED_REQUEST | 1106
Server Error | SERVER_INTERNAL_ERROR | 1200
 | SERVER_UNAVAILABLE | 1201
User login related errors | UNSUPPORTED_CLIENT_VERSION | 2000
 | LOGIN_TIMEOUT | 2010
 | LOGIN_AUTHENTICATION_FAILED | 2011
 | LOGGING_IN_USER_NOT_ACTIVE | 2012
 | LOGIN_FROM_FORBIDDEN_DEVICE_TYPE | 2013
User session related errors | SESSION_SIMULTANEOUS_CONFLICTS_DECLINE | 2100
 | SESSION_SIMULTANEOUS_CONFLICTS_NOTIFY | 2101
 | SESSION_SIMULTANEOUS_CONFLICTS_OFFLINE | 2102
 | CREATE_EXISTING_SESSION | 2103
 | UPDATE_NON_EXISTING_SESSION_HEARTBEAT | 2104
User location related errors | USER_LOCATION_RELATED_FEATURES_ARE_DISABLED | 2200
 | QUERYING_NEAREST_USERS_BY_SESSION_ID_IS_DISABLED | 2201
User information related errors | UPDATE_INFO_OF_NON_EXISTING_USER | 2300
 | USER_PROFILE_NOT_FOUND | 2301
 | PROFILE_REQUESTER_NOT_IN_CONTACTS_OR_BLOCKED | 2302
 | PROFILE_REQUESTER_HAS_BEEN_BLOCKED | 2303
User permission group related errors | QUERY_PERMISSION_OF_NON_EXISTING_USER | 2400
User relation related errors | ADD_NOT_RELATED_USER_TO_GROUP | 2500
 | CREATE_EXISTING_RELATIONSHIP | 2501
User friend request related error | REQUESTER_NOT_FRIEND_REQUEST_RECIPIENT | 2600
 | CREATE_EXISTING_FRIEND_REQUEST | 2601
 | FRIEND_REQUEST_SENDER_HAS_BEEN_BLOCKED | 2602
Group information related errors | UPDATE_INFO_OF_NON_EXISTING_GROUP | 3000
 | NOT_OWNER_TO_UPDATE_GROUP_INFO | 3001
 | NOT_OWNER_OR_MANAGER_TO_UPDATE_GROUP_INFO | 3002
 | NOT_MEMBER_TO_UPDATE_GROUP_INFO | 3003
Group type related errors | NO_PERMISSION_TO_CREATE_GROUP_WITH_GROUP_TYPE | 3100
 | CREATE_GROUP_WITH_NON_EXISTING_GROUP_TYPE | 3101
Group ownership related errors | NOT_ACTIVE_USER_TO_CREATE_GROUP | 3200
 | NOT_OWNER_TO_TRANSFER_GROUP | 3201
 | NOT_OWNER_TO_DELETE_GROUP | 3202
 | SUCCESSOR_NOT_GROUP_MEMBER | 3203
 | OWNER_QUITS_WITHOUT_SPECIFYING_SUCCESSOR | 3204
 | MAX_OWNED_GROUPS_REACHED | 3205
 | TRANSFER_NON_EXISTING_GROUP | 3206
Errors related to group entry questions | NOT_OWNER_OR_MANAGER_TO_CREATE_GROUP_QUESTION | 3300
 | NOT_OWNER_OR_MANAGER_TO_DELETE_GROUP_QUESTION | 3301
 | NOT_OWNER_OR_MANAGER_TO_UPDATE_GROUP_QUESTION | 3302
 | NOT_OWNER_OR_MANAGER_TO_ACCESS_GROUP_QUESTION_ANSWER | 3303
 | CREATE_GROUP_QUESTION_FOR_INACTIVE_GROUP | 3304
 | CREATE_GROUP_QUESTION_FOR_GROUP_USING_JOIN_REQUEST | 3305
 | CREATE_GROUP_QUESTION_FOR_GROUP_USING_INVITATION | 3306
 | CREATE_GROUP_QUESTION_FOR_GROUP_USING_MEMBERSHIP_REQUEST | 3307
 | GROUP_QUESTION_ANSWERER_HAS_BEEN_BLOCKED | 3308
 | MEMBER_CANNOT_ANSWER_GROUP_QUESTION | 3309
 | ANSWER_INACTIVE_QUESTION | 3310
 | ANSWER_QUESTION_OF_INACTIVE_GROUP | 3311
Group membership related errors | ADD_USER_TO_GROUP_REQUIRING_INVITATION | 3400
 | ADD_USER_TO_INACTIVE_GROUP | 3401
 | ADD_USER_WITH_ROLE_HIGHER_THAN_REQUESTER | 3402
 | ADD_BLOCKED_USER_TO_GROUP | 3403
 | ADD_BLOCKED_USER_TO_INACTIVE_GROUP | 3404
 | NOT_OWNER_OR_MANAGER_TO_REMOVE_GROUP_MEMBER | 3405
 | NOT_OWNER_TO_REMOVE_GROUP_OWNER_OR_MANAGER | 3406
 | NOT_OWNER_TO_UPDATE_GROUP_MEMBER_INFO | 3407
 | NOT_OWNER_OR_MANAGER_TO_UPDATE_GROUP_MEMBER_INFO | 3408
 | NOT_MEMBER_TO_QUERY_MEMBER_INFO | 3409
Group blacklist related errors | NOT_OWNER_OR_MANAGER_TO_ADD_BLOCKED_USER | 3500
 | NOT_OWNER_OR_MANAGER_TO_REMOVE_BLOCKED_USER | 3501
Group join request related errors | GROUP_JOIN_REQUEST_SENDER_HAS_BEEN_BLOCKED | 3600
 | NOT_JOIN_REQUEST_SENDER_TO_RECALL_REQUEST | 3601
 | NOT_OWNER_OR_MANAGER_TO_ACCESS_GROUP_REQUEST | 3602
 | RECALL_NOT_PENDING_GROUP_JOIN_REQUEST | 3603
 | SEND_JOIN_REQUEST_TO_INACTIVE_GROUP | 3604
 | SEND_JOIN_REQUEST_TO_GROUP_USING_MEMBERSHIP_REQUEST | 3605
 | SEND_JOIN_REQUEST_TO_GROUP_USING_INVITATION | 3606
 | SEND_JOIN_REQUEST_TO_GROUP_USING_QUESTION | 3607
 | RECALLING_GROUP_JOIN_REQUEST_IS_DISABLED | 3608
Group invite related errors | GROUP_INVITER_NOT_MEMBER | 3700
 | GROUP_INVITEE_ALREADY_GROUP_MEMBER | 3701
 | NOT_OWNER_OR_MANAGER_TO_RECALL_INVITATION | 3702
 | NOT_OWNER_OR_MANAGER_TO_ACCESS_INVITATION | 3703
 | NOT_OWNER_TO_SEND_INVITATION | 3704
 | NOT_OWNER_OR_MANAGER_TO_SEND_INVITATION | 3705
 | NOT_MEMBER_TO_SEND_INVITATION | 3706
 | INVITEE_HAS_BEEN_BLOCKED | 3707
 | RECALLING_GROUP_INVITATION_IS_DISABLED | 3708
 | SEND_GROUP_INVITATION_TO_GROUP_NOT_REQUIRE_INVITATION | 3709
 | RECALL_NOT_PENDING_GROUP_INVITATION | 3710
Chat session related errors | UPDATING_TYPING_STATUS_IS_DISABLED | 4000
 | UPDATING_READ_DATE_IS_DISABLED | 4001
 | MOVING_READ_DATE_FORWARD_IS_DISABLED | 4002
Message sending related error | MESSAGE_RECIPIENT_NOT_ACTIVE | 5000
 | MESSAGE_SENDER_NOT_IN_CONTACTS_OR_BLOCKED | 5001
 | PRIVATE_MESSAGE_SENDER_HAS_BEEN_BLOCKED | 5002
 | GROUP_MESSAGE_SENDER_HAS_BEEN_BLOCKED | 5003
 | SEND_MESSAGE_TO_INACTIVE_GROUP | 5004
 | SEND_MESSAGE_TO_MUTED_GROUP | 5005
 | SENDING_MESSAGES_TO_ONESELF_IS_DISABLED | 5006
 | MUTED_MEMBER_SEND_MESSAGE | 5007
 | GUESTS_HAVE_BEEN_MUTED | 5008
 | MESSAGE_IS_ILLEGAL | 5009
Message update related error | UPDATING_MESSAGE_BY_SENDER_IS_DISABLED | 5100
 | NOT_SENDER_TO_UPDATE_MESSAGE | 5101
 | NOT_MESSAGE_RECIPIENT_TO_UPDATE_MESSAGE_READ_DATE | 5102
Message recall related error | RECALL_NON_EXISTING_MESSAGE | 5200
 | RECALLING_MESSAGE_IS_DISABLED | 5201
 | MESSAGE_RECALL_TIMEOUT | 5202
Message query related errors | NOT_MEMBER_TO_QUERY_GROUP_MESSAGES | 5300
Storage related errors | STORAGE_NOT_IMPLEMENTED | 6000

                SessionCloseStatus

                SessionCloseStatus indicates why the session was closed.

For the specific status code declaration, please refer to the im.turms.server.common.access.common.SessionCloseStatus class.

Cause Category | Name | Status Code | Meaning
Illegal client behavior | ILLEGAL_REQUEST | 100 | Illegal request
 | HEARTBEAT_TIMEOUT | 110 | Heartbeat timeout
 | LOGIN_TIMEOUT | 111 | Login timeout
 | SWITCH | 112 | Session timeout; TCP or WebSocket switches to UDP and enters the dormant keep-alive state
Server behavior | SERVER_ERROR | 200 | Server exception error
 | SERVER_CLOSED | 201 | The server enters the shutdown state
 | SERVER_UNAVAILABLE | 202 | Service unavailable
Network layer error | CONNECTION_CLOSED | 300 | No close frame received; the network-layer connection was forcibly closed
Unknown error | UNKNOWN_ERROR | 400 | Unknown server or client behavior error
Closed by the user | DISCONNECTED_BY_CLIENT | 500 | The current user actively requests to close the session
 | DISCONNECTED_BY_OTHER_DEVICE | 501 | The current session is closed because another device of the current user comes online
Closed by the administrator | DISCONNECTED_BY_ADMIN | 600 | The administrator actively closes the session through the API
User status change | USER_IS_DELETED_OR_INACTIVATED | 700 | The user account is deleted or enters the inactive state
 | USER_IS_BLOCKED | 701 | The user IP or user ID is blocked

                    Configuration

                    Importance

There are many business scenarios for instant messaging, so different businesses have vastly different requirements for hardware resources (for example, an architecture that requires a database versus one that does not). In order to use server resources effectively, please be sure to carefully understand the configuration parameters provided by the Turms server.

                    • Scenario 1: 100% message reachability vs actively discarding messages

                      • In social applications, messages are generally required to have a 100% reachability rate. Conversely, for live chat room applications, the server will even actively discard user messages or send messages to only some users in the chat room according to message priority and server load.
  • For the former, Turms uses Redis to pull incremental sequence IDs at the session level to achieve 100% delivery of messages. For the latter, Turms actively discards messages based on the messages buffered in memory and on server load information. The two have completely different but equally reasonable requirements for message reachability, so their implementations also have completely different hardware requirements.
                    • Scenario 2: Read-diffused message storage vs zero-message storage

  • Application A is an instant messaging application mainly for business customers. It has a requirement: when a user sends a message in a business group, the user can know whether every other user in the group has read the message; even if the sender goes offline after sending the message, they can still check the read status of the message from other users when they come back online.

    Therefore, if a business group has 100 users, when one of them sends a message, Turms needs to store 1 Message and 1 Conversation record (Turms adopts the read-diffusion message model; note that this Conversation record carries the last read time of each of the other 99 group members).

                      • Application B is a live barrage chat application, which handles messages very casually. When a user sends a message on a live channel, the user not only does not need to know the read status of other users, but even the message itself does not require storage (that is, no offline message requirement).

                        Therefore, if a live channel has 100 members, when one of the users sends a message, Turms needs to store 0 Message and 0 Conversation records.

  • By contrast, application A requires the message storage function while application B does not, so the collection for storing messages is not even used in application B's architecture (although in practice, user messages are usually still stored for user behavior analysis). The hardware requirements of the two are therefore also quite different.

                    Local configuration and global configuration

                    The Turms server has two types of configurations: local configuration and global configuration, among which:

 | Local Configuration | Global Configuration
Application Domain | Valid only for the current node | Valid for all nodes in the cluster
Storage Location | Stored in the local application-[profile].yaml file | Stored in the turms-config/shared-cluster-properties collection in the MongoDB database
Mutability | For properties marked with the MutableProperty annotation, users can perform real-time updates with zero downtime while the Turms cluster is running, through the dedicated admin API | Same as the local configuration

                    Configuration Categories

                    The configuration is divided into two categories, one is the configuration of the JVM, and the other is the configuration of the Turms server.

                    JVM configuration

                    The JVM default configuration file of turms-gateway is: turms-gateway/dist/config/jvm.options

                    The default JVM configuration file for turms-service is: turms-service/dist/config/jvm.options.

                    Users generally use the default JVM configuration and do not need to modify the JVM configuration by themselves.

                    If the user wants to modify the JVM configuration, there are two ways:

1. Modify the environment variable TURMS_GATEWAY_JVM_CONF (for turms-gateway) or TURMS_SERVICE_JVM_CONF (for turms-service) and point it to a custom JVM configuration file to use a fully custom JVM configuration. The following takes modifying the JVM configuration of turms-gateway as an example:

                      1. If you start via the run.sh script, you can use something like export TURMS_GATEWAY_JVM_CONF=<your-jvm-options-file-path> && sh run.sh -f to set the environment variable and start.

                      2. If you start from a Docker image, you can use something like:

                        shell
                        docker run -d --name turms-gateway --ulimit nofile=1048576 \
                           --memory-swappiness=0 \
                           -p 7510:7510 -p 9510:9510 -p 10510:10510 -p 11510:11510 -p 12510:12510 \
                           --health-cmd="curl -I --silent $${HOST}:9510/health || exit 1" \
   ...
   --health-retries=3 \
   --health-start-period=60s \
   -v <your-jvm-options-file-path>:/opt/turms/turms-gateway/config/jvm.options:ro \
   ghcr.io/turms-im/turms-gateway

  3. If via Docker Compose, you can use something like:

    shell
    TURMS_GATEWAY_JVM_CONF=<your-jvm-options-file-path> docker compose -f docker-compose.standalone.yml up --force-recreate
    powershell
    $env:TURMS_GATEWAY_JVM_CONF=<your-jvm-options-file-path>; docker compose -f docker-compose.standalone.yml up --force-recreate

    Note: The above `TURMS_GATEWAY_JVM_CONF` path points to a path inside the image, not a path on the host. If you want to use a configuration file on the host machine, you need to modify the `docker-compose.standalone.yml` configuration file to use Docker's mount mechanism, such as:
                   
```yaml
turms-gateway:
  # ...
    - <your-jvm-options-file-path>:/opt/turms/turms-gateway/config/jvm.options:ro
```
                   
2. Modify the environment variable TURMS_GATEWAY_JVM_OPTS (for turms-gateway) or TURMS_SERVICE_JVM_OPTS (for turms-service) to append custom JVM options on top of the JVM configuration file and override the declared JVM configuration. The way to set the variable is the same as above, so it is not repeated here.

  Note: The format of this variable is: -D<name>=<value> -D<name>=<value>, such as: -Dspring.profiles.active=DEV -Dturms.cluster.discovery.address.advertise-host=myturms.
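  For instance, a minimal sketch using the run.sh startup path described above; the property values are placeholders chosen only for illustration:

  sh
  # Append JVM system properties on top of the default jvm.options
  export TURMS_GATEWAY_JVM_OPTS="-Dspring.profiles.active=dev -Dturms.cluster.node.id=turms001"
  sh run.sh -f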

                Turms server configuration

                Turms configurations fall into four broad categories:

• Turms Gateway configuration: configuration specific to the turms-gateway server
• Turms Service configuration: configuration specific to the turms-service server
• Common general configuration: configuration that can be shared by the turms-gateway and turms-service servers
• The configuration of the plugin itself: configuration provided by Turms server plugins themselves

                Configuration method

1. The aforementioned TURMS_GATEWAY_JVM_CONF or TURMS_SERVICE_JVM_CONF, and TURMS_GATEWAY_JVM_OPTS or TURMS_SERVICE_JVM_OPTS, can also be used to configure the parameters of the Turms server.
2. Modify the application.yaml configuration file. Specifically:
  1. Directly modify the application.yaml file under the server directory in the repository. If the configuration source file is modified, the user can no longer use the official Turms Docker image and has to package the server into a JAR and build a custom image, so this method is generally only used for local development and testing, not for production environments.
  2. Use the Docker mount mechanism mentioned above to mount a custom server configuration file to the path /opt/turms/turms-gateway/config/application.yaml.
3. Call the Admin HTTP API to modify the configuration; the path is PUT /cluster/settings, as shown in the sketch below.
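For example, a minimal sketch of method 3, updating mutable cluster-wide properties through the admin API on a turms-service node (the JSON body is left as a placeholder, and the admin credentials required by Admin API Security are not shown here):

sh
curl -X PUT 'http://localhost:8510/cluster/settings' \
  -H 'Content-Type: application/json' \
  -d '<the-properties-to-update-as-JSON>'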

Reminder: Plugin configuration works in the same way as Turms server configuration, except that it does not support dynamic modification through the Admin HTTP API for the time being; it can still be configured with methods 1 and 2 above. For example, if a plugin targets the turms-gateway server, the user can put the plugin's configuration into the TURMS_GATEWAY_JVM_OPTS environment variable of the turms-gateway server.

                Profiles

If developers need to maintain and switch between different sets of values for the same Turms server configuration, profiles (configuration sets) can be used.

By default, the configuration hard-coded in the Turms server source code and the configuration specified in the application.yaml file form the default production configuration. If developers want to switch to another configuration set, they can do so by modifying the spring.profiles.active property in the application.yaml file.

For example, a common use case: when developing and debugging locally, to switch from the production configuration to the default development configuration, the developer can change the spring.profiles.active value in the application.yaml file to dev. The Turms server will then adopt the configuration specified in both application.yaml and application-dev.yaml (the default development configuration), with the configuration in application-dev.yaml taking higher priority and overriding the defaults.
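A minimal application.yaml sketch for this use case:

yaml
spring:
  profiles:
    active: dev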

                Introduction to Configuration Parameters

Since the Turms server has hundreds of configuration items, this section only briefly introduces the configuration categories. If readers want to look up specific configuration items, they can refer to the code of the configuration classes under the im.turms.server.common.infra.property package, or continue to browse the configuration item descriptions provided in the Configuration Items section below.

Reminder: After you compile the turms/turms-gateway server project locally, the compiler will generate the file target/classes/META-INF/spring-configuration-metadata.json. IntelliJ IDEA detects this file automatically and provides configuration hints and completion when you enter Turms-related configuration items.

Turms Service configuration
Category | Class | Field Name | Description | Supplement
Admin API | AdminApiProperties | adminApi | Related configuration of the administrator API
Client API | ClientApiProperties | clientApi | Related configuration of the client API
Fake data | FakeProperties | fake | Fake data related configuration
Data Source | MongoProperties | mongo | MongoDB database related configuration | Turms completely reuses the URI configuration of MongoDB. Reference document: https://docs.mongodb.com/manual/reference/connection-string/
 | TurmsRedisProperties | redis | Redis database configuration
Statistics | StatisticsProperties | statistics | Statistics related configuration
Notification | NotificationProperties | notification | Notification related configuration
File storage | StorageProperties | storage | Storage related configuration
Business behavior | UserProperties | user | User-related configuration
 | GroupProperties | group | Group related configuration
 | ConversationProperties | conversation | Conversation service related configuration
 | MessageProperties | message | Message service related configuration
                Turms Gateway configuration
Category | Class | Field Name | Description
Admin API | AdminApiProperties | adminApi | Related configuration of the admin API
Client API | ClientApiProperties | clientApi | Client-oriented HTTP access layer related configuration (that is, ReasonController related configuration)
 | NotificationLoggingProperties | notificationLogging | Notification log related configuration
Service interface | UdpProperties | udp | UDP server related configuration
 | TcpProperties | tcp | TCP server configuration
 | WebSocketProperties | websocket | WebSocket server related configuration
 | DiscoveryProperties | serviceDiscovery | Service discovery related configuration
Fake data | FakeProperties | fake | Fake data related configuration
Data source | MongoProperties | mongo | MongoDB database related configuration
 | TurmsRedisProperties | redis | Redis database configuration
Business behavior | SimultaneousLoginProperties | simultaneousLogin | Multi-login related configuration
 | SessionProperties | session | Session related configuration
                Common general configuration
Class | Field Name | Description
ClusterProperties | cluster | Cluster related configuration, including the current node information, service discovery registration information, configuration center information, and RPC parameters
HealthCheckProperties | healthCheck | Node health status monitoring related configuration
IpProperties | ip | Public network IP detection related configuration
LocationProperties | location | User coordinate related configuration
LoggingProperties | logging | Basic logging configuration
PluginProperties | plugin | Plugin related configuration
SecurityProperties | security | User and administrator password encryption related configuration
UserStatusProperties | userStatus | User session (connection) status related configuration
                The configuration of the plugin itself

                If users want to check the configuration items of the official Turms server plugin, they can read the corresponding plugin documentation, which will list the configuration items provided by the plugin.

Server port number configuration

Server | Configuration Item | Port | Function
turms-admin | | 6510 (HTTP) | Provides the web page of the background administrator system
turms-service/turms-gateway | turms.cluster.connection.server.port | 7510 (TCP) | Used for RPC of the turms-service and turms-gateway servers
turms-service | turms.service.admin-api.http.port | 8510 (HTTP) | Provides the admin API and metrics API
turms-gateway | turms.gateway.admin-api.http.port | 9510 (HTTP) | Provides the metrics API
turms-gateway | turms.gateway.websocket.port | 10510 (WebSocket) | Interacts with the turms-client-js client
turms-gateway | turms.gateway.tcp.port | 11510 (TCP) | Interacts with clients
turms-gateway | turms.gateway.udp.port | 12510 (UDP) | Interacts with clients (clients do not support it yet)
Note: The UDP server is an experimental feature and is not in the first release plan.

Configuration items

                Note: The table below does not include the configuration of the Turms server plugin.

                Configuration ItemsGlobal AttributesVariable AttributesData TypeDefault ValueDescription
                turms.cluster.connection.client.keepalive-interval-secondsint5
                turms.cluster.connection.client.keepalive-timeout-secondsint15
                turms.cluster.connection.client.reconnect-interval-secondsint15
                turms.cluster.connection.server.hoststring0.0.0.0
                turms.cluster.connection.server.portint7510
                turms.cluster.connection.server.port-auto-incrementbooleanfalse
                turms.cluster.connection.server.port-countint100
                turms.cluster.discovery.address.advertise-hoststringThe advertise address of the local node exposed to admins. (e.g. 100.131.251.96)
                turms.cluster.discovery.address.advertise-strategyenumPRIVATE_ADDRESSThe advertise strategy is used to decide which type of address should be used so that admins can access admin APIs and metrics APIs
                turms.cluster.discovery.address.attach-port-to-hostbooleantrueWhether to attach the local port to the host. e.g. The local host is 100.131.251.96, and the port is 9510 so the service address will be 100.131.251.96:9510
                turms.cluster.discovery.delay-to-notify-members-change-secondsint3Delay notifying listeners on members change. Waits for seconds to avoid thundering herd
                turms.cluster.discovery.heartbeat-interval-secondsint10
                turms.cluster.discovery.heartbeat-timeout-secondsint30
                turms.cluster.idstringturms
                turms.cluster.node.active-by-defaultbooleantrue
                 turms.cluster.node.idstringThe node ID must start with a letter or underscore, followed by zero or more characters in [a-zA-Z0-9_]. e.g. "turms001", "turms_002"
                turms.cluster.node.leader-eligiblebooleantrueOnly works when it is a turms-service node
                turms.cluster.node.priorityint0The priority to be a leader
                turms.cluster.node.zonestringe.g. "us-east-1" and "ap-east-1"
                turms.cluster.rpc.request-timeout-millisint30000The timeout for RPC requests in milliseconds
                turms.flight-recorder.closed-recording-retention-periodint0A closed recording will be retained for the given period and will be removed from the file system after the retention period. 0 means no retention. -1 means unlimited retention.
                turms.gateway.admin-api.address.advertise-hoststringThe advertise address of the local node exposed to admins. (e.g. 100.131.251.96)
                turms.gateway.admin-api.address.advertise-strategyenumPRIVATE_ADDRESSThe advertise strategy is used to decide which type of address should be used so that admins can access admin APIs and metrics APIs
                turms.gateway.admin-api.address.attach-port-to-hostbooleantrueWhether to attach the local port to the host. e.g. The local host is 100.131.251.96, and the port is 9510 so the service address will be 100.131.251.96:9510
                turms.gateway.admin-api.enabledbooleantrueWhether to enable the APIs for administrators
                turms.gateway.admin-api.http.hoststring0.0.0.0
                turms.gateway.admin-api.http.max-request-body-size-bytesint10485760
                turms.gateway.admin-api.http.portint9510
                turms.gateway.admin-api.log.enabledbooleantrueWhether to log API calls
                turms.gateway.admin-api.log.log-request-paramsbooleantrueWhether to log the parameters of requests
                turms.gateway.admin-api.rate-limiting.capacityint50The maximum number of tokens that the bucket can hold
                turms.gateway.admin-api.rate-limiting.initial-tokensint50The initial number of tokens for new session
                turms.gateway.admin-api.rate-limiting.refill-interval-millisint1000The time interval to refill. 0 means never refill
                turms.gateway.admin-api.rate-limiting.tokens-per-periodint50Refills the bucket with the specified number of tokens per period if the bucket is not full
                 turms.gateway.admin-api.use-authenticationbooleantrueWhether to use authentication. If false, all HTTP requesters will impersonate the root user and all HTTP requests will be allowed. You may set it to false when you want to manage authentication via security groups, NACL, etc
                turms.gateway.client-api.logging.excluded-notification-categoriesSet-enum[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                turms.gateway.client-api.logging.excluded-notification-typesSet-enum[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                turms.gateway.client-api.logging.excluded-request-categoriesSet-enum[]Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                turms.gateway.client-api.logging.excluded-request-typesSet-enum[]Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                turms.gateway.client-api.logging.heartbeat-sample-ratefloat0
                turms.gateway.client-api.logging.included-notification-categoriesLinkedHashSet-LoggingCategoryProperties[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                turms.gateway.client-api.logging.included-notificationsLinkedHashSet-LoggingRequestProperties[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                turms.gateway.client-api.logging.included-request-categoriesLinkedHashSet-LoggingCategoryProperties[
                {
                "category": "ALL",
                "sampleRate": 1
                }
                ]
                Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                turms.gateway.client-api.logging.included-requestsLinkedHashSet-LoggingRequestProperties[]Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                turms.gateway.client-api.max-request-size-bytesint16384The client session will be closed and may be blocked if it tries to send a request larger than the size. Note: The average size of turms requests is 16~64 bytes
                turms.gateway.client-api.rate-limiting.capacityint50The maximum number of tokens that the bucket can hold
                turms.gateway.client-api.rate-limiting.initial-tokensint50The initial number of tokens for new session
                turms.gateway.client-api.rate-limiting.refill-interval-millisint1000The time interval to refill. 0 means never refill
                turms.gateway.client-api.rate-limiting.tokens-per-periodint1Refills the bucket with the specified number of tokens per period if the bucket is not full
                 turms.gateway.client-api.return-reason-for-server-errorbooleanfalseWhether to return the reason for the server error to the client. Note: 1. It may reveal sensitive data like the IP of internal servers if true; 2. turms-gateway never returns stack trace information regardless of whether it is true or false.
                turms.gateway.fake.enabledbooleanfalseWhether to fake clients. Note that faking only works in non-production environments
                turms.gateway.fake.first-user-idlong100
                 turms.gateway.fake.request-count-per-intervalint10The number of requests to send per interval. If requestIntervalMillis is 1000, requestCountPerInterval is effectively the TPS
                 turms.gateway.fake.request-interval-millisint1000The interval at which to send requests
                 turms.gateway.fake.user-countint10Run this number of real clients as fake users, with IDs in [firstUserId, firstUserId + userCount), to connect to turms-gateway. Ensure that "turms.service.fake.userCount" is set to a number larger than or equal to (firstUserId + userCount)
                turms.gateway.notification-logging.enabledbooleanfalseWhether to parse the buffer of TurmsNotification to log. Note that the property has an impact on performance
                turms.gateway.service-discovery.advertise-hoststringThe advertise address of the local node exposed to the public. The property can be used to advertise the DDoS Protected IP address to hide the origin IP address (e.g. 100.131.251.96)
                 turms.gateway.service-discovery.advertise-strategyenumPRIVATE_ADDRESSThe advertise strategy is used to help clients or load balancing servers to access the local node. Note: For security, do NOT use "PUBLIC_ADDRESS" in production, to avoid exposing the origin IP address to DDoS attacks.
                turms.gateway.service-discovery.attach-port-to-hostbooleantrueWhether to attach the local port to the host. For example, if the local host is 100.131.251.96, and the port is 10510, so the service address will be 100.131.251.96:10510
                turms.gateway.service-discovery.identitystringThe identity of the local node will be sent to clients as a notification if identity is not blank and "turms.gateway.session.notifyClientsOfSessionInfoAfterConnected" is true (e.g. "turms-east-0001")
                turms.gateway.session.client-heartbeat-interval-secondsint60The client heartbeat interval. Note that the value will NOT change the actual heartbeat behavior of clients, and the value is only used to facilitate related operations of turms-gateway
                turms.gateway.session.close-idle-session-after-secondsint180A session will be closed if turms server does not receive any request (including heartbeat request) from the client during closeIdleSessionAfterSeconds. References: https://mp.weixin.qq.com/s?__biz=MzAwNDY1ODY2OQ==&mid=207243549&idx=1&sn=4ebe4beb8123f1b5ab58810ac8bc5994&scene=0#rd
                turms.gateway.session.device-details.expire-after-secondsint2592000Device details information will expire after the specified time has elapsed. 0 means never expire
                turms.gateway.session.device-details.itemsList-DeviceDetailsItemProperties[]
                turms.gateway.session.identity-access-management.enabledbooleantrueWhether to authenticate and authorize users when logging in. Note that user ID is always required even if enabled is false. If false at startup, turms-gateway will not connect to the MongoDB server for user records
                turms.gateway.session.identity-access-management.http.authentication.response-expectation.body-fieldsMap{
                "authenticated": true
                }
                turms.gateway.session.identity-access-management.http.authentication.response-expectation.headersMap{}
                turms.gateway.session.identity-access-management.http.authentication.response-expectation.status-codesSet-string[
                "2??"
                ]
                turms.gateway.session.identity-access-management.http.request.headersMap{}
                turms.gateway.session.identity-access-management.http.request.http-methodenumGET
                turms.gateway.session.identity-access-management.http.request.timeout-millisint30000
                turms.gateway.session.identity-access-management.http.request.urlstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa256.p12.file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa256.p12.key-aliasstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa256.p12.passwordstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa256.pem-file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa384.p12.file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa384.p12.key-aliasstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa384.p12.passwordstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa384.pem-file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa512.p12.file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa512.p12.key-aliasstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa512.p12.passwordstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa512.pem-file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.hmac256.file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.hmac256.p12.file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.hmac256.p12.key-aliasstring
                turms.gateway.session.identity-access-management.jwt.algorithm.hmac256.p12.passwordstring
                turms.gateway.session.identity-access-management.jwt.algorithm.hmac384.file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.hmac384.p12.file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.hmac384.p12.key-aliasstring
                turms.gateway.session.identity-access-management.jwt.algorithm.hmac384.p12.passwordstring
                turms.gateway.session.identity-access-management.jwt.algorithm.hmac512.file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.hmac512.p12.file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.hmac512.p12.key-aliasstring
                turms.gateway.session.identity-access-management.jwt.algorithm.hmac512.p12.passwordstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ps256.p12.file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ps256.p12.key-aliasstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ps256.p12.passwordstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ps256.pem-file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ps384.p12.file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ps384.p12.key-aliasstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ps384.p12.passwordstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ps384.pem-file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ps512.p12.file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ps512.p12.key-aliasstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ps512.p12.passwordstring
                turms.gateway.session.identity-access-management.jwt.algorithm.ps512.pem-file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.rsa256.p12.file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.rsa256.p12.key-aliasstring
                turms.gateway.session.identity-access-management.jwt.algorithm.rsa256.p12.passwordstring
                turms.gateway.session.identity-access-management.jwt.algorithm.rsa256.pem-file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.rsa384.p12.file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.rsa384.p12.key-aliasstring
                turms.gateway.session.identity-access-management.jwt.algorithm.rsa384.p12.passwordstring
                turms.gateway.session.identity-access-management.jwt.algorithm.rsa384.pem-file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.rsa512.p12.file-pathstring
                turms.gateway.session.identity-access-management.jwt.algorithm.rsa512.p12.key-aliasstring
                turms.gateway.session.identity-access-management.jwt.algorithm.rsa512.p12.passwordstring
                turms.gateway.session.identity-access-management.jwt.algorithm.rsa512.pem-file-pathstring
                turms.gateway.session.identity-access-management.jwt.authentication.expectation.custom-payload-claimsMap{
                "authenticated": true
                }
                turms.gateway.session.identity-access-management.jwt.verification.audiencestring
                turms.gateway.session.identity-access-management.jwt.verification.custom-payload-claimsMap{}
                turms.gateway.session.identity-access-management.jwt.verification.issuerstring
                turms.gateway.session.identity-access-management.typeenumPASSWORDNote that if the type is not PASSWORD, turms-gateway will not connect to the MongoDB server for user records
                turms.gateway.session.min-heartbeat-interval-secondsint18The minimum interval to refresh the heartbeat status by client requests to avoid refreshing the heartbeat status frequently
                turms.gateway.session.notify-clients-of-session-info-after-connectedbooleantrueWhether to notify clients of the session information after connected with the server
                turms.gateway.session.switch-protocol-after-secondsint540If the turms server only receives heartbeat requests from the client during switchProtocolAfterSeconds, the TCP/WebSocket connection will be closed with the close status "SWITCH" to indicate the client should keep sending heartbeat requests over UDP if they want to keep online. Note: 1. The property only works if UDP is enabled; 2. For browser clients, UDP is not supported
                turms.gateway.simultaneous-login.allow-device-type-others-loginbooleantrueWhether to allow the devices of DeviceType.OTHERS to login
                turms.gateway.simultaneous-login.allow-device-type-unknown-loginbooleantrueWhether to allow the devices of DeviceType.UNKNOWN to login
                turms.gateway.simultaneous-login.login-conflict-strategyenumDISCONNECT_LOGGED_IN_DEVICESThe login conflict strategy is used for servers to know how to behave if a device is logging in when there are conflicted and logged-in devices
                turms.gateway.simultaneous-login.strategyenumALLOW_ONE_DEVICE_OF_EACH_DEVICE_TYPE_ONLINEThe simultaneous login strategy is used to control which devices can be online at the same time
                 turms.gateway.tcp.backlogint4096The maximum number of connection requests waiting in the backlog queue. Set it large enough to handle bursts and GC pauses, but not too large, so as to limit the impact of SYN flood attacks
                turms.gateway.tcp.close-idle-connection-after-secondsint300A TCP connection will be closed on the server side if a client has not established a user session in a specified time. Note that the developers on the client side should take the responsibility to close the TCP connection according to their business requirements
                turms.gateway.tcp.connection-timeoutint30
                turms.gateway.tcp.enabledbooleantrue
                turms.gateway.tcp.hoststring0.0.0.0
                turms.gateway.tcp.portint-1
                turms.gateway.tcp.wiretapbooleanfalse
                turms.gateway.udp.enabledbooleantrue
                turms.gateway.udp.hoststring0.0.0.0
                turms.gateway.udp.portint-1
                 turms.gateway.websocket.backlogint4096The maximum number of connection requests waiting in the backlog queue. Set it large enough to handle bursts and GC pauses, but not too large, so as to limit the impact of SYN flood attacks
                turms.gateway.websocket.close-idle-connection-after-secondsint300A WebSocket connection will be closed on the server side if a client has not established a user session in a specified time. Note that the developers on the client side should take the responsibility to close the WebSocket connection according to their business requirements
                turms.gateway.websocket.connect-timeoutint30Used to mitigate the Slowloris DoS attack by lowering the timeout for the TCP connection handshake
                turms.gateway.websocket.enabledbooleantrue
                turms.gateway.websocket.hoststring0.0.0.0
                turms.gateway.websocket.portint-1
                turms.health-check.check-interval-secondsint3
                turms.health-check.cpu.retriesint5
                turms.health-check.cpu.unhealthy-load-threshold-percentageint95
                turms.health-check.memory.direct-memory-warning-threshold-percentageint50Log warning messages if the used direct memory exceeds the max direct memory of the percentage
                turms.health-check.memory.heap-memory-gc-threshold-percentageint60If the used memory has used the reserved memory specified by maxAvailableMemoryPercentage and minFreeSystemMemoryBytes, try to start GC when the used heap memory exceeds the max heap memory of the percentage
                turms.health-check.memory.heap-memory-warning-threshold-percentageint95Log warning messages if the used heap memory exceeds the max heap memory of the percentage
                turms.health-check.memory.max-available-direct-memory-percentageint95The server will refuse to serve when the used direct memory exceeds the max direct memory of the percentage to try to avoid OutOfMemoryError
                turms.health-check.memory.max-available-memory-percentageint95The server will refuse to serve when the used memory (heap memory + JVM internal non-heap memory + direct buffer pool) exceeds the physical memory of the percentage. The server will try to reserve max(maxAvailableMemoryPercentage of the physical memory, minFreeSystemMemoryBytes) for kernel and other processes. Note that the max available memory percentage does not conflict with the usage of limiting memory in docker because docker limits the memory of the container, while this memory percentage only limits the available memory for JVM
                turms.health-check.memory.min-free-system-memory-bytesint134217728The server will refuse to serve when the free system memory is less than minFreeSystemMemoryBytes
                turms.health-check.memory.min-heap-memory-gc-interval-secondsint10
                turms.health-check.memory.min-memory-warning-interval-secondsint10
                turms.ip.cached-private-ip-expire-after-millisint60000The cached private IP will expire after the specified time has elapsed. 0 means no cache
                turms.ip.cached-public-ip-expire-after-millisint60000The cached public IP will expire after the specified time has elapsed. 0 means no cache
                turms.ip.public-ip-detector-addressesList-string[
                "https://checkip.amazonaws.com",
                "https://whatismyip.akamai.com",
                "https://ifconfig.me/ip",
                "https://myip.dnsomatic.com"
                ]
                 The public IP detectors will only be used to query the public IP of the local node if needed (e.g. if the node discovery property "advertiseStrategy" is "PUBLIC_ADDRESS"). Note that the HTTP response body must be a plain IP string instead of JSON
                turms.location.enabledbooleantrueWhether to handle users' locations
                turms.location.nearby-user-request.default-max-distance-metersint10000The default maximum allowed distance in meters
                turms.location.nearby-user-request.default-max-nearby-user-countshort20The default maximum allowed number of nearby users
                turms.location.nearby-user-request.max-distance-metersint10000The maximum allowed distance in meters
                turms.location.nearby-user-request.max-nearby-user-countshort100The maximum allowed number of nearby users
                turms.location.treat-user-id-and-device-type-as-unique-userbooleanfalseWhether to treat the pair of user ID and device type as a unique user when querying users nearby. If false, only the user ID is used to identify a unique user
                turms.logging.console.enabledbooleanfalse
                turms.logging.console.levelenumINFO
                turms.logging.file.compression.enabledbooleantrue
                turms.logging.file.enabledbooleantrue
                turms.logging.file.file-pathstring@HOME/@SERVICE_TYPE_NAME.log
                turms.logging.file.levelenumINFO
                turms.logging.file.max-file-size-mbint32
                turms.logging.file.max-filesint320
                turms.plugin.dirstringpluginsThe relative path of plugins
                turms.plugin.enabledbooleantrueWhether to enable plugins
                 turms.plugin.java.allow-savebooleanfalseWhether to allow saving plugins via the HTTP API
                 turms.plugin.js.allow-savebooleanfalseWhether to allow saving plugins via the HTTP API
                turms.plugin.js.debug.enabledbooleanfalseWhether to enable debugging
                turms.plugin.js.debug.inspect-hoststringlocalhostThe inspect host
                turms.plugin.js.debug.inspect-portint24242The inspect port
                turms.plugin.network.pluginsList-NetworkPluginProperties[]
                turms.plugin.network.proxy.connect-timeout-millisint60000The HTTP proxy connect timeout in millis
                turms.plugin.network.proxy.enabledbooleanfalseWhether to enable HTTP proxy
                turms.plugin.network.proxy.hoststringThe HTTP proxy host
                turms.plugin.network.proxy.passwordstringThe HTTP proxy password
                turms.plugin.network.proxy.portint8080The HTTP proxy port
                turms.plugin.network.proxy.usernamestringThe HTTP proxy username
                turms.security.blocklist.ip.auto-block.corrupted-frame.block-levelsList-BlockLevel[
                {
                "blockDurationSeconds": 600,
                "goNextLevelTriggerTimes": 1,
                "reduceOneTriggerTimeIntervalMillis": 60000
                },
                {
                "blockDurationSeconds": 1800,
                "goNextLevelTriggerTimes": 1,
                "reduceOneTriggerTimeIntervalMillis": 60000
                },
                {
                "blockDurationSeconds": 3600,
                "goNextLevelTriggerTimes": 0,
                "reduceOneTriggerTimeIntervalMillis": 60000
                }
                ]
                turms.security.blocklist.ip.auto-block.corrupted-frame.block-trigger-timesint5Block the client when the block condition is triggered the times
                turms.security.blocklist.ip.auto-block.corrupted-frame.enabledbooleanfalse
                turms.security.blocklist.ip.auto-block.corrupted-request.block-levelsList-BlockLevel[
                {
                "blockDurationSeconds": 600,
                "goNextLevelTriggerTimes": 1,
                "reduceOneTriggerTimeIntervalMillis": 60000
                },
                {
                "blockDurationSeconds": 1800,
                "goNextLevelTriggerTimes": 1,
                "reduceOneTriggerTimeIntervalMillis": 60000
                },
                {
                "blockDurationSeconds": 3600,
                "goNextLevelTriggerTimes": 0,
                "reduceOneTriggerTimeIntervalMillis": 60000
                }
                ]
                turms.security.blocklist.ip.auto-block.corrupted-request.block-trigger-timesint5Block the client when the block condition is triggered the times
                turms.security.blocklist.ip.auto-block.corrupted-request.enabledbooleanfalse
                turms.security.blocklist.ip.auto-block.frequent-request.block-levelsList-BlockLevel[
                {
                "blockDurationSeconds": 600,
                "goNextLevelTriggerTimes": 1,
                "reduceOneTriggerTimeIntervalMillis": 60000
                },
                {
                "blockDurationSeconds": 1800,
                "goNextLevelTriggerTimes": 1,
                "reduceOneTriggerTimeIntervalMillis": 60000
                },
                {
                "blockDurationSeconds": 3600,
                "goNextLevelTriggerTimes": 0,
                "reduceOneTriggerTimeIntervalMillis": 60000
                }
                ]
                turms.security.blocklist.ip.auto-block.frequent-request.block-trigger-timesint5Block the client when the block condition is triggered the times
                turms.security.blocklist.ip.auto-block.frequent-request.enabledbooleanfalse
                turms.security.blocklist.ip.enabledbooleantrue
                turms.security.blocklist.ip.sync-blocklist-interval-millisint10000
                turms.security.blocklist.user-id.auto-block.corrupted-frame.block-levelsList-BlockLevel[
                {
                "blockDurationSeconds": 600,
                "goNextLevelTriggerTimes": 1,
                "reduceOneTriggerTimeIntervalMillis": 60000
                },
                {
                "blockDurationSeconds": 1800,
                "goNextLevelTriggerTimes": 1,
                "reduceOneTriggerTimeIntervalMillis": 60000
                },
                {
                "blockDurationSeconds": 3600,
                "goNextLevelTriggerTimes": 0,
                "reduceOneTriggerTimeIntervalMillis": 60000
                }
                ]
                turms.security.blocklist.user-id.auto-block.corrupted-frame.block-trigger-timesint5Block the client when the block condition is triggered the times
                turms.security.blocklist.user-id.auto-block.corrupted-frame.enabledbooleanfalse
                turms.security.blocklist.user-id.auto-block.corrupted-request.block-levelsList-BlockLevel[
                {
                "blockDurationSeconds": 600,
                "goNextLevelTriggerTimes": 1,
                "reduceOneTriggerTimeIntervalMillis": 60000
                },
                {
                "blockDurationSeconds": 1800,
                "goNextLevelTriggerTimes": 1,
                "reduceOneTriggerTimeIntervalMillis": 60000
                },
                {
                "blockDurationSeconds": 3600,
                "goNextLevelTriggerTimes": 0,
                "reduceOneTriggerTimeIntervalMillis": 60000
                }
                ]
                turms.security.blocklist.user-id.auto-block.corrupted-request.block-trigger-timesint5Block the client when the block condition is triggered the times
                turms.security.blocklist.user-id.auto-block.corrupted-request.enabledbooleanfalse
                turms.security.blocklist.user-id.auto-block.frequent-request.block-levelsList-BlockLevel[
                {
                "blockDurationSeconds": 600,
                "goNextLevelTriggerTimes": 1,
                "reduceOneTriggerTimeIntervalMillis": 60000
                },
                {
                "blockDurationSeconds": 1800,
                "goNextLevelTriggerTimes": 1,
                "reduceOneTriggerTimeIntervalMillis": 60000
                },
                {
                "blockDurationSeconds": 3600,
                "goNextLevelTriggerTimes": 0,
                "reduceOneTriggerTimeIntervalMillis": 60000
                }
                ]
                turms.security.blocklist.user-id.auto-block.frequent-request.block-trigger-timesint5Block the client when the block condition is triggered the times
                turms.security.blocklist.user-id.auto-block.frequent-request.enabledbooleanfalse
                turms.security.blocklist.user-id.enabledbooleantrue
                turms.security.blocklist.user-id.sync-blocklist-interval-millisint10000
                turms.security.password.admin-password-encoding-algorithmenumBCRYPTThe password encoding algorithm for admins
                turms.security.password.initial-root-passwordstringThe initial password of the root user
                turms.security.password.user-password-encoding-algorithmenumSALTED_SHA256The password encoding algorithm for users
                turms.service.admin-api.address.advertise-hoststringThe advertise address of the local node exposed to admins. (e.g. 100.131.251.96)
                turms.service.admin-api.address.advertise-strategyenumPRIVATE_ADDRESSThe advertise strategy is used to decide which type of address should be used so that admins can access admin APIs and metrics APIs
                turms.service.admin-api.address.attach-port-to-hostbooleantrueWhether to attach the local port to the host. e.g. The local host is 100.131.251.96, and the port is 9510 so the service address will be 100.131.251.96:9510
                 turms.service.admin-api.allow-delete-without-filterbooleanfalseWhether to allow administrators to delete data without any filter. It is better to keep this false to prevent administrators from deleting all data by accident
                turms.service.admin-api.default-available-records-per-requestint10The default available records per query request
                turms.service.admin-api.enabledbooleantrueWhether to enable the APIs for administrators
                turms.service.admin-api.http.hoststring0.0.0.0
                turms.service.admin-api.http.max-request-body-size-bytesint10485760
                turms.service.admin-api.http.portint8510
                turms.service.admin-api.log.enabledbooleantrueWhether to log API calls
                turms.service.admin-api.log.log-request-paramsbooleantrueWhether to log the parameters of requests
                turms.service.admin-api.max-available-online-users-status-per-requestint20The maximum available online users' status per query request
                turms.service.admin-api.max-available-records-per-requestint1000The maximum available records per query request
                turms.service.admin-api.max-day-difference-per-count-requestint31The maximum day difference per count request
                turms.service.admin-api.max-day-difference-per-requestint90The maximum day difference per query request
                turms.service.admin-api.max-hour-difference-per-count-requestint24The maximum hour difference per count request
                turms.service.admin-api.max-month-difference-per-count-requestint12The maximum month difference per count request
                turms.service.admin-api.rate-limiting.capacityint50The maximum number of tokens that the bucket can hold
                turms.service.admin-api.rate-limiting.initial-tokensint50The initial number of tokens for new session
                turms.service.admin-api.rate-limiting.refill-interval-millisint1000The time interval to refill. 0 means never refill
                turms.service.admin-api.rate-limiting.tokens-per-periodint50Refills the bucket with the specified number of tokens per period if the bucket is not full
                 turms.service.admin-api.use-authenticationbooleantrueWhether to use authentication. If false, all HTTP requesters will impersonate the root user and all HTTP requests will be allowed. You may set it to false when you want to manage authentication via security groups, NACL, etc
                turms.service.client-api.disabled-endpointsSet-enum[]The disabled endpoints for client requests. Return ILLEGAL_ARGUMENT if a client tries to access them
                turms.service.client-api.logging.excluded-notification-categoriesSet-enum[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                turms.service.client-api.logging.excluded-notification-typesSet-enum[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                turms.service.client-api.logging.excluded-request-categoriesSet-enum[]Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                turms.service.client-api.logging.excluded-request-typesSet-enum[]Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                turms.service.client-api.logging.included-notification-categoriesLinkedHashSet-LoggingCategoryProperties[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                turms.service.client-api.logging.included-notificationsLinkedHashSet-LoggingRequestProperties[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                turms.service.client-api.logging.included-request-categoriesLinkedHashSet-LoggingCategoryProperties[
                {
                "category": "ALL",
                "sampleRate": 1
                }
                ]
                Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                turms.service.client-api.logging.included-requestsLinkedHashSet-LoggingRequestProperties[]Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                 turms.service.conversation.read-receipt.allow-move-read-date-forwardbooleanfalseWhether to allow moving the last read date forward
                 turms.service.conversation.read-receipt.enabledbooleantrueWhether to allow updating the last read date
                turms.service.conversation.read-receipt.update-read-date-after-message-sentbooleantrueWhether to update the read date after a user sent a message
                turms.service.conversation.read-receipt.update-read-date-when-user-querying-messagebooleanfalseWhether to update the read date when a user queries messages
                turms.service.conversation.read-receipt.use-server-timebooleantrueWhether to use the server time to set the last read date when updating
                turms.service.conversation.typing-status.enabledbooleantrueWhether to notify users of typing statuses sent by other users
                turms.service.fake.clear-all-collections-before-fakingbooleanfalseWhether to clear all collections before faking at startup
                turms.service.fake.enabledbooleanfalseWhether to fake data. Note that faking only works in non-production environments
                turms.service.fake.fake-if-collection-existsbooleanfalseWhether to fake data even if the collection has already existed
                 turms.service.fake.user-countint1000The total number of users to fake
                turms.service.group.activate-group-when-createdbooleantrueWhether to activate a group when created by default
                turms.service.group.delete-group-logically-by-defaultbooleantrueWhether to delete groups logically by default
                turms.service.group.invitation.allow-recall-pending-invitation-by-owner-and-managerbooleanfalseWhether to allow the owner and managers of a group to recall pending group invitations
                turms.service.group.invitation.delete-expired-invitations-when-cron-triggeredbooleanfalseWhether to delete expired group invitations when the cron expression is triggered
                turms.service.group.invitation.expire-after-secondsint2592000A group invitation will become expired after the specified time has passed
                turms.service.group.invitation.expired-invitations-cleanup-cronstring0 15 2 * * *Clean the expired group invitations when the cron expression is triggered if "deleteExpiredInvitationsWhenCronTriggered" is true
                turms.service.group.invitation.max-content-lengthint200The maximum allowed length for the text of a group invitation
                turms.service.group.join-request.allow-recall-join-request-sent-by-oneselfbooleanfalseWhether to allow users to recall the join requests sent by themselves
                turms.service.group.join-request.delete-expired-join-requests-when-cron-triggeredbooleanfalseWhether to delete expired group join requests when the cron expression is triggered
                turms.service.group.join-request.expire-after-secondsint2592000A group join request will become expired after the specified time has elapsed
                turms.service.group.join-request.expired-join-requests-cleanup-cronstring0 30 2 * * *Clean the expired group join requests when the cron expression is triggered if "deleteExpiredJoinRequestsWhenCronTriggered" is true
                turms.service.group.join-request.max-content-lengthint200The maximum allowed length for the text of a group join request
                turms.service.group.member-cache-expire-after-secondsint15The group member cache will expire after the specified seconds. If 0, no group member cache
                turms.service.group.question.answer-content-limitint50The maximum allowed length for the text of a group question's answer
                turms.service.group.question.max-answer-countint10The maximum number of answers for a group question
                turms.service.group.question.question-content-limitint200The maximum allowed length for the text of a group question
                turms.service.message.allow-edit-message-by-senderbooleantrueWhether to allow the sender of a message to edit the message
                turms.service.message.allow-recall-messagebooleantrueWhether to allow users to recall messages. Note: To recall messages, more system resources are needed
                turms.service.message.allow-send-messages-to-oneselfbooleanfalseWhether to allow users to send messages to themselves
                turms.service.message.allow-send-messages-to-strangerbooleantrueWhether to allow users to send messages to a stranger
                turms.service.message.available-recall-duration-secondsint300The available recall duration for the sender of a message
                turms.service.message.cache.sent-message-cache-max-sizeint10240The maximum size of the cache of sent messages.
                turms.service.message.cache.sent-message-expire-afterint30The retention period of sent messages in the cache. For a better performance, it is a good practice to keep the value greater than the allowed recall duration
                turms.service.message.check-if-target-active-and-not-deletedbooleantrueWhether to check if the target (recipient or group) of a message is active and not deleted
                turms.service.message.default-available-messages-number-with-totalint1The default available messages number with the "total" field that users request
                turms.service.message.delete-message-logically-by-defaultbooleantrueWhether to delete messages logically by default
                turms.service.message.expired-messages-cleanup-cronstring0 45 2 * * *Clean the expired messages when the cron expression is triggered
                turms.service.message.is-recalled-message-visiblebooleanfalseWhether to respond with recalled messages to clients' message query requests
                turms.service.message.max-records-size-bytesint15728640The maximum allowed size for the records of a message
                turms.service.message.max-text-limitint500The maximum allowed length for the text of a message
                turms.service.message.message-retention-period-hoursint0A message will be retained for the given period and will be removed from the database after the retention period
                 turms.service.message.persist-messagebooleantrueWhether to persist messages in databases. Note: If false, senders will not get the message ID after the message has been sent and cannot edit it
                turms.service.message.persist-pre-message-idbooleanfalseWhether to persist the previous message ID of messages in databases
                turms.service.message.persist-recordbooleanfalseWhether to persist the records of messages in databases
                turms.service.message.persist-sender-ipbooleanfalseWhether to persist the sender IP of messages in databases
                turms.service.message.sequence-id.use-sequence-id-for-group-conversationbooleanfalseWhether to use the sequence ID for group conversations so that the client can be aware of the loss of messages. Note that the property has a significant impact on performance
                turms.service.message.sequence-id.use-sequence-id-for-private-conversationbooleanfalseWhether to use the sequence ID for private conversations so that the client can be aware of the loss of messages. Note that the property has a significant impact on performance
                turms.service.message.time-typeenumLOCAL_SERVER_TIMEThe time type for the delivery time of message
                turms.service.message.use-conversation-idbooleanfalseWhether to use conversation ID so that a user can query the messages sent by themselves in a conversation quickly
                turms.service.mongo.admin.optional-index.admin.registration-datebooleanfalse
                turms.service.mongo.admin.optional-index.admin.role-idbooleanfalse
                turms.service.mongo.group.optional-index.group-blocked-user.block-datebooleanfalse
                turms.service.mongo.group.optional-index.group-blocked-user.requester-idbooleanfalse
                turms.service.mongo.group.optional-index.group-invitation.group-idbooleantrue
                turms.service.mongo.group.optional-index.group-invitation.inviter-idbooleanfalse
                turms.service.mongo.group.optional-index.group-invitation.response-datebooleanfalse
                turms.service.mongo.group.optional-index.group-join-request.creation-datebooleanfalse
                turms.service.mongo.group.optional-index.group-join-request.group-idbooleantrue
                turms.service.mongo.group.optional-index.group-join-request.responder-idbooleanfalse
                turms.service.mongo.group.optional-index.group-join-request.response-datebooleanfalse
                turms.service.mongo.group.optional-index.group-member.join-datebooleanfalse
                turms.service.mongo.group.optional-index.group-member.mute-end-datebooleanfalse
                turms.service.mongo.group.optional-index.group.creation-datebooleanfalse
                turms.service.mongo.group.optional-index.group.creator-idbooleanfalse
                turms.service.mongo.group.optional-index.group.deletion-datebooleantrue
                turms.service.mongo.group.optional-index.group.mute-end-datebooleanfalse
                turms.service.mongo.group.optional-index.group.owner-idbooleantrue
                turms.service.mongo.group.optional-index.group.type-idbooleanfalse
                turms.service.mongo.message.optional-index.message.deletion-datebooleantrue
                turms.service.mongo.message.optional-index.message.reference-idbooleanfalse
                turms.service.mongo.message.optional-index.message.sender-idbooleanfalse
                turms.service.mongo.message.optional-index.message.sender-ipbooleantrue
                turms.service.mongo.message.tiered-storage.auto-range-updater.cronstring0 0 3 * * *
                turms.service.mongo.message.tiered-storage.auto-range-updater.enabledbooleantrue
                turms.service.mongo.message.tiered-storage.enabledbooleantrue
                turms.service.mongo.message.tiered-storage.tiersLinkedHashMap{
                "cold": {
                "days": 270,
                "enabled": true,
                "shards": [
                ""
                ]
                },
                "frozen": {
                "days": 0,
                "enabled": true,
                "shards": [
                ""
                ]
                },
                "hot": {
                "days": 30,
                "enabled": true,
                "shards": [
                ""
                ]
                },
                "warm": {
                "days": 60,
                "enabled": true,
                "shards": [
                ""
                ]
                }
                }
                The storage properties for tiers from hot to cold. Note that the order of the tiers is important
                turms.service.mongo.user.optional-index.user-friend-request.recipient-idbooleanfalse
                turms.service.mongo.user.optional-index.user-friend-request.requester-idbooleanfalse
                turms.service.mongo.user.optional-index.user-friend-request.response-datebooleanfalse
                turms.service.mongo.user.optional-index.user-relationship-group-member.group-indexbooleanfalse
                turms.service.mongo.user.optional-index.user-relationship-group-member.join-datebooleanfalse
                turms.service.mongo.user.optional-index.user-relationship-group-member.related-user-idbooleanfalse
                turms.service.mongo.user.optional-index.user-relationship.establishment-datebooleanfalse
                turms.service.notification.friend-request-created.notify-friend-request-recipientbooleantrueWhether to notify the recipient when the requester has created a friend request
                turms.service.notification.friend-request-created.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have created a friend request
                turms.service.notification.friend-request-replied.notify-friend-request-requesterbooleantrueWhether to notify the requester when a recipient has replied to the friend request sent by the requester
                turms.service.notification.friend-request-replied.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have replied to a friend request
                turms.service.notification.group-blocked-user-added.notify-blocked-userbooleanfalseWhether to notify the user when they have been blocked by a group
                turms.service.notification.group-blocked-user-added.notify-group-membersbooleanfalseWhether to notify group members when a user has been blocked by a group
                turms.service.notification.group-blocked-user-added.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have added a blocked user to a group
                turms.service.notification.group-blocked-user-removed.notify-group-membersbooleanfalseWhether to notify group members when a user is unblocked by a group
                turms.service.notification.group-blocked-user-removed.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have removed a blocked user from a group
                turms.service.notification.group-blocked-user-removed.notify-unblocked-userbooleanfalseWhether to notify the user when they are unblocked by a group
                turms.service.notification.group-conversation-read-date-updated.notify-other-group-membersbooleanfalseWhether to notify other group members when a group member has updated their read date in a group conversation
                turms.service.notification.group-conversation-read-date-updated.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have updated the read date in a group conversation
                turms.service.notification.group-created.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have created a group
                 turms.service.notification.group-deleted.notify-group-membersbooleantrueWhether to notify group members when the group owner has deleted the group
                turms.service.notification.group-deleted.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have deleted a group
                turms.service.notification.group-invitation-added.notify-group-membersbooleanfalseWhether to notify group members when a user has been invited
                turms.service.notification.group-invitation-added.notify-group-owner-and-managers | boolean | true | Whether to notify the group owner and managers when a user has been invited
                turms.service.notification.group-invitation-added.notify-invitee | boolean | true | Whether to notify the user when they have been invited by a group member
                turms.service.notification.group-invitation-added.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have invited a user to a group
                turms.service.notification.group-invitation-recalled.notify-group-members | boolean | false | Whether to notify group members when an invitation has been recalled
                turms.service.notification.group-invitation-recalled.notify-group-owner-and-managers | boolean | true | Whether to notify the group owner and managers when an invitation has been recalled
                turms.service.notification.group-invitation-recalled.notify-invitee | boolean | true | Whether to notify the invitee when a group member has recalled their received group invitation
                turms.service.notification.group-invitation-recalled.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have recalled a group invitation
                turms.service.notification.group-join-request-created.notify-group-members | boolean | false | Whether to notify group members when a user has created a group join request for their group
                turms.service.notification.group-join-request-created.notify-group-owner-and-managers | boolean | true | Whether to notify the group owner and managers when a user has created a group join request for their group
                turms.service.notification.group-join-request-created.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have created a group join request
                turms.service.notification.group-join-request-recalled.notify-group-members | boolean | false | Whether to notify group members when a user has recalled a group join request for their group
                turms.service.notification.group-join-request-recalled.notify-group-owner-and-managers | boolean | true | Whether to notify the group owner and managers when a user has recalled a group join request for their group
                turms.service.notification.group-join-request-recalled.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have recalled a group join request
                turms.service.notification.group-member-added.notify-added-group-member | boolean | true | Whether to notify the group member when added by others
                turms.service.notification.group-member-added.notify-other-group-members | boolean | true | Whether to notify other group members when a group member has been added
                turms.service.notification.group-member-added.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have added a group member
                turms.service.notification.group-member-info-updated.notify-other-group-members | boolean | false | Whether to notify other group members when a group member's information has been updated
                turms.service.notification.group-member-info-updated.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have updated their group member information
                turms.service.notification.group-member-info-updated.notify-updated-group-member | boolean | false | Whether to notify the group member when others have updated their group member information
                turms.service.notification.group-member-online-status-updated.notify-group-members | boolean | false | Whether to notify other group members when a member's online status has been updated
                turms.service.notification.group-member-removed.notify-other-group-members | boolean | true | Whether to notify other group members when a group member has been removed
                turms.service.notification.group-member-removed.notify-removed-group-member | boolean | true | Whether to notify the group member when removed by others
                turms.service.notification.group-member-removed.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have removed a group member
                turms.service.notification.group-updated.notify-group-members | boolean | true | Whether to notify group members when the group owner or managers have updated their group
                turms.service.notification.group-updated.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have updated a group
                turms.service.notification.message-created.notify-message-recipients | boolean | true | Whether to notify the message recipients when a sender has created a message to them
                turms.service.notification.message-created.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have created a message
                turms.service.notification.message-updated.notify-message-recipients | boolean | true | Whether to notify the message recipients when a sender has updated a message sent to them
                turms.service.notification.message-updated.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have updated a message
                turms.service.notification.one-sided-relationship-group-deleted.notify-relationship-group-members | boolean | false | Whether to notify members when a one-sided relationship group owner has deleted the group
                turms.service.notification.one-sided-relationship-group-deleted.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have deleted a relationship group
                turms.service.notification.one-sided-relationship-group-member-added.notify-new-relationship-group-member | boolean | false | Whether to notify the new member when a user has added them to their one-sided relationship group
                turms.service.notification.one-sided-relationship-group-member-added.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have added a new member to their one-sided relationship group
                turms.service.notification.one-sided-relationship-group-member-removed.notify-removed-relationship-group-member | boolean | false | Whether to notify the removed member when a user has removed them from their one-sided relationship group
                turms.service.notification.one-sided-relationship-group-member-removed.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have removed a member from their one-sided relationship group
                turms.service.notification.one-sided-relationship-group-updated.notify-relationship-group-members | boolean | false | Whether to notify members when a one-sided relationship group owner has updated the group
                turms.service.notification.one-sided-relationship-group-updated.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have updated a relationship group
                turms.service.notification.one-sided-relationship-updated.notify-related-user | boolean | false | Whether to notify the related user when a user has updated a one-sided relationship with them
                turms.service.notification.one-sided-relationship-updated.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have updated a one-sided relationship
                turms.service.notification.private-conversation-read-date-updated.notify-contact | boolean | false | Whether to notify the other contact when a contact has updated their read date in a private conversation
                turms.service.notification.private-conversation-read-date-updated.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have updated the read date in a private conversation
                turms.service.notification.user-info-updated.notify-non-blocked-related-users | boolean | false | Whether to notify non-blocked related users when a user has updated their information
                turms.service.notification.user-info-updated.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have updated their information
                turms.service.notification.user-online-status-updated.notify-non-blocked-related-users | boolean | false | Whether to notify non-blocked related users when a user has updated their online status
                turms.service.notification.user-online-status-updated.notify-requester-other-online-sessions | boolean | true | Whether to notify the requester's other online sessions when they have updated their online status
                turms.service.push-notification.apns.bundle-id | string | |
                turms.service.push-notification.apns.enabled | boolean | false |
                turms.service.push-notification.apns.key-id | string | |
                turms.service.push-notification.apns.sandbox-enabled | boolean | false |
                turms.service.push-notification.apns.signing-key | string | |
                turms.service.push-notification.apns.team-id | string | |
                turms.service.push-notification.fcm.credentials | string | |
                turms.service.push-notification.fcm.enabled | boolean | false |
                turms.service.statistics.log-online-users-number | boolean | true | Whether to log the number of online users
                turms.service.statistics.online-users-number-logging-cron | string | 0/15 * * * * * | The cron expression that specifies when to log the number of online users
                turms.service.storage.group-profile-picture.allowed-content-type | string | image/* | The allowed "Content-Type" of the resource that the client can upload
                turms.service.storage.group-profile-picture.allowed-referrers | List<String> | [] | Restrict access to the resource to the specified referrers only (e.g. "https://github.com/turms-im/turms/*")
                turms.service.storage.group-profile-picture.download-url-expire-after-seconds | int | 300 | The presigned URLs are valid only for the specified duration. 0 means no expiration
                turms.service.storage.group-profile-picture.expire-after-days | int | 0 | Delete the resource the specified number of days after creation. 0 means no expiration
                turms.service.storage.group-profile-picture.max-size-bytes | int | 1048576 | The maximum size of the resource that the client can upload. 0 means no limit
                turms.service.storage.group-profile-picture.min-size-bytes | int | 0 | The minimum size of the resource that the client can upload. 0 means no limit
                turms.service.storage.group-profile-picture.upload-url-expire-after-seconds | int | 300 | The presigned URLs are valid only for the specified duration. 0 means no expiration
                turms.service.storage.message-attachment.allowed-content-type | string | / | The allowed "Content-Type" of the resource that the client can upload
                turms.service.storage.message-attachment.allowed-referrers | List<String> | [] | Restrict access to the resource to the specified referrers only (e.g. "https://github.com/turms-im/turms/*")
                turms.service.storage.message-attachment.download-url-expire-after-seconds | int | 300 | The presigned URLs are valid only for the specified duration. 0 means no expiration
                turms.service.storage.message-attachment.expire-after-days | int | 0 | Delete the resource the specified number of days after creation. 0 means no expiration
                turms.service.storage.message-attachment.max-size-bytes | int | 1048576 | The maximum size of the resource that the client can upload. 0 means no limit
                turms.service.storage.message-attachment.min-size-bytes | int | 0 | The minimum size of the resource that the client can upload. 0 means no limit
                turms.service.storage.message-attachment.upload-url-expire-after-seconds | int | 300 | The presigned URLs are valid only for the specified duration. 0 means no expiration
                turms.service.storage.user-profile-picture.allowed-content-type | string | image/* | The allowed "Content-Type" of the resource that the client can upload
                turms.service.storage.user-profile-picture.allowed-referrers | List<String> | [] | Restrict access to the resource to the specified referrers only (e.g. "https://github.com/turms-im/turms/*")
                turms.service.storage.user-profile-picture.download-url-expire-after-seconds | int | 300 | The presigned URLs are valid only for the specified duration. 0 means no expiration
                turms.service.storage.user-profile-picture.expire-after-days | int | 0 | Delete the resource the specified number of days after creation. 0 means no expiration
                turms.service.storage.user-profile-picture.max-size-bytes | int | 1048576 | The maximum size of the resource that the client can upload. 0 means no limit
                turms.service.storage.user-profile-picture.min-size-bytes | int | 0 | The minimum size of the resource that the client can upload. 0 means no limit
                turms.service.storage.user-profile-picture.upload-url-expire-after-seconds | int | 300 | The presigned URLs are valid only for the specified duration. 0 means no expiration
                turms.service.user.activate-user-when-added | boolean | true | Whether to activate a user by default when the user is added
                turms.service.user.delete-two-sided-relationships | boolean | false | Whether to delete the two-sided relationships when a user requests to delete a relationship
                turms.service.user.delete-user-logically | boolean | true | Whether to delete a user logically
                turms.service.user.friend-request.allow-send-request-after-declined-or-ignored-or-expired | boolean | false | Whether to allow resending a friend request after the previous request has been declined, ignored, or expired
                turms.service.user.friend-request.delete-expired-requests-when-cron-triggered | boolean | false | Whether to delete expired friend requests when the cron expression is triggered
                turms.service.user.friend-request.expired-user-friend-requests-cleanup-cron | string | 0 0 2 * * * | Clean up expired friend requests when the cron expression is triggered, if deleteExpiredRequestsWhenCronTriggered is true
                turms.service.user.friend-request.friend-request-expire-after-seconds | int | 2592000 | A friend request becomes expired after the specified time has elapsed
                turms.service.user.friend-request.max-content-length | int | 200 | The maximum allowed length for the text of a friend request
                turms.service.user.max-intro-length | int | 100 | The maximum allowed length for a user's intro
                turms.service.user.max-name-length | int | 20 | The maximum allowed length for a user's name
                turms.service.user.max-password-length | int | 16 | The maximum allowed length for a user's password
                turms.service.user.max-profile-picture-length | int | 100 | The maximum allowed length for a user's profile picture
                turms.service.user.min-password-length | int | -1 | The minimum allowed length for a user's password. If 0, the password can be an empty string "". If -1, the password can be null
                turms.service.user.respond-offline-if-invisible | boolean | false | Whether to respond to clients with the OFFLINE status if a user is in the INVISIBLE status
                turms.shutdown.job-timeout-millis | long | 120000 | Wait for a job for at most 2 minutes by default to cover extreme cases. Though this is a long time, a graceful shutdown is usually better than a forced shutdown
                turms.user-status.cache-user-sessions-status | boolean | true | Whether to cache the status of user sessions
                turms.user-status.user-sessions-status-cache-max-size | int | -1 | The maximum size of the cache of users' session status
                turms.user-status.user-sessions-status-expire-after | int | 60 | The life duration of each remote user's session status in the cache. Note that the cache makes the presentation of users' session status inconsistent during that time
                - + \ No newline at end of file diff --git a/docs/server/deployment/distribution.html b/docs/server/deployment/distribution.html index 6cf6e69f..15d5d626 100644 --- a/docs/server/deployment/distribution.html +++ b/docs/server/deployment/distribution.html @@ -17,7 +17,7 @@ -
                Skip to content

                Distribution

                The Directory Structure of Server Release Package

                The directory structure of the turms-gateway and turms-service server release packages is as follows:

                ├─bin
                 │ └─run.sh
                 ├─config
                 │ ├─application.yaml
                @@ -200,7 +200,7 @@
                 net.ipv4.tcp_moderate_rcvbuf = 1
                 # Default: 1. TCP uses 16 bits to record the window size, and the maximum value can be 65535B. If this value is exceeded, the tcp_window_scaling mechanism needs to be enabled
                 net.ipv4.tcp_window_scaling = 1

                Once configured, execute sudo sysctl -p to load the latest configuration of sysctl.

                A special note: as mentioned in System Resource Management, the Turms server reserves part of the memory for the system kernel, and this memory mainly refers to the buffers of the TCP connections mentioned above.

                Initial congestion window (initcwnd) configuration

                Keep the default value: 10 MSS.

                Reference documents:

                - + \ No newline at end of file diff --git a/docs/server/deployment/getting-started.html b/docs/server/deployment/getting-started.html index 810db248..cf620733 100644 --- a/docs/server/deployment/getting-started.html +++ b/docs/server/deployment/getting-started.html @@ -17,7 +17,7 @@ -
                Skip to content

                Build and Start

                Automatically Build and Start

                Stand-alone Environment

                Applicable scenarios: the build process is convenient and fast, but it cannot meet the requirements of disaster recovery, elastic scaling, zero-downtime upgrades, or load balancing. It is mainly used to build demos and to serve users who do not require an SLA.

                Based on Docker Compose

                The following commands automatically build a complete minimal Turms cluster (including turms-gateway, turms-service and turms-admin) and the servers it depends on (a MongoDB sharded cluster and Redis)

                bash
                git clone --depth 1 https://github.com/turms-im/turms.git
                 cd turms
                 docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions
                 # Or "ENV=dev,demo docker compose -f docker-compose.standalone.yml --profile monitoring up --force-recreate -d" to run with sidecar services in dev profile
                @@ -108,7 +108,7 @@
                 docker run -p 6510:6510 ghcr.io/turms-im/turms-admin
                 docker run -p 7510:7510 -p 8510:8510 ghcr.io/turms-im/turms-service
                 docker run --ulimit nofile=102400:102400 -p 7510:7510 -p 9510:9510 -p 10510:10510 -p 11510:11510 -p 12510:12510 ghcr.io/turms-im/turms-gateway

                In addition, you can use a custom application.yaml and jvm.options by mounting a volume, for example: -v /your-custom-config-dir:/opt/turms/turms/config.
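
                If you prefer Compose over a raw docker run command, the same idea can be expressed as a volume mount in a Compose file. The sketch below is only illustrative: the service name and the choice to write a separate Compose file are assumptions, while the image, the ports and the /opt/turms/turms/config path come from the commands shown above.

                yaml
                # Hypothetical Compose sketch: mount a local config directory over the default config path.
                services:
                  turms-service:
                    image: ghcr.io/turms-im/turms-service
                    ports:
                      - "7510:7510"
                      - "8510:8510"
                    volumes:
                      - ./your-custom-config-dir:/opt/turms/turms/config   # contains application.yaml and jvm.options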

                Solution 2: Download and decompress the Turms server release package (since v0.10.0 has not yet been published on the Releases page, this solution is not available at the moment), then run it as follows:

                • (If you install MongoDB and Redis locally with their default configurations, you can skip this step) Configure config/jvm.options and config/application.yaml according to your needs (you can set Turms' custom configuration parameters here, and you can also configure multiple MongoDB or mongos server addresses. For details, please refer to: https://docs.mongodb.com/manual/reference/connection-string).

                • (Ansible is recommended) On every machine that needs to run a Turms server, run the bin/run.sh script (by default it runs the Thin package; if you need to run the Fat package, add the -f parameter, e.g. sh run.sh -f), then start the turms-gateway server. The turms-gateway and turms-service servers automatically discover other server nodes through MongoDB (used as the service registry), so the Turms cluster starts working.

                Solution 3: Clone the source code of the Turms repository, and run the turms-gateway and turms-service servers directly from the IDE. (Reference command: git clone --depth 1 https://github.com/turms-im/turms.git)

    Notes:

    • When the turms-service server starts, it automatically checks whether a super administrator account with the ROOT role and the account name turms exists in the database. If it does not, the turms-service server automatically creates an administrator account with the ROOT role, the name turms and the password turms.security.password.initial-root-password (default: turms). In a production environment, please remember to change the default password (see the application.yaml sketch after these notes).
    • The above operations are mainly intended to give you a first experience of a Turms cluster. If you need to deploy Turms in a production environment, be sure to refer to the Wiki manual, understand the meaning of the various configuration parameters, and customize them so that your business needs and traffic mix are met with minimal resource consumption.
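
    For example, to change the default root administrator password before the first startup, you can set the property mentioned above in config/application.yaml. The YAML nesting below is simply derived from the dot-separated property name turms.security.password.initial-root-password; the password value is a placeholder.

    yaml
    # Minimal sketch: override the initial root administrator password (placeholder value).
    turms:
      security:
        password:
          initial-root-password: "change-me-to-a-strong-password"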

    The general process of Turms server startup and shutdown

    Start the process

    1. Connect to and verify the mongos and Redis servers.
    2. Check whether the MongoDB collections have been created. If they already exist, skip this step. Otherwise, create the collections, add indexes and shard keys, and add zones for the separate storage of hot and cold data. If fake data is enabled for MongoDB, turms-service automatically generates fake data in MongoDB for development and testing.
    3. For the turms-service server, check whether a super administrator account with the ROOT role and the account name turms already exists in MongoDB. If it does not, create an administrator account with the ROOT role, the name turms and the password turms.security.password.initial-root-password (default: turms) in MongoDB.
    4. Register the local node with the service registry. If registration succeeds, pull and apply the cluster's global configuration, and start an RPC server to accept RPC client connections. If it fails, throw an exception and exit the process.
    5. Start the admin HTTP server to receive admin API requests. In addition, for turms-gateway, start the gateway servers (such as TCP/WebSocket) to accept client connections and requests.
    6. For turms-gateway, if fake clients are enabled, real client connections are created and random client requests (random request types and parameters) are sent for development and testing.

    At this point, the server is started.

    Shutdown Process

    (for turms-gateway)

    1. Deny new client network connections and client requests.
    2. Close the fake clients and close the established client sessions.
    3. Shut down the servers that accept TCP, UDP, or WebSocket client connections, and the HTTP admin API server.

    (for turms-gateway and turms-service)

    4. Turn off the blocklist synchronization mechanism.
    5. Close the cluster services (such as the connections between RPC nodes, and the service registration and discovery service).
    6. Turn off the plugin mechanism.
    7. After sending the pending requests to Redis and MongoDB, close the network connections from the Turms server to Redis and MongoDB.
    8. After flushing all logs, close the log service.

    At this point, the server shutdown is complete.

    - + \ No newline at end of file diff --git a/docs/server/development/code.html b/docs/server/development/code.html index 907d6930..ecdc48e5 100644 --- a/docs/server/development/code.html +++ b/docs/server/development/code.html @@ -17,7 +17,7 @@ -
    Skip to content

    Source Code

    This article explains the package structure of the Turms server and the approximate source code implementation of each main functional module to help developers read the source code and understand the related process faster.

    Reminder:

    1. The Turms server heavily uses the reactive framework reactor-core. This article assumes that the reader has mastered reactive programming; if not, it is recommended to learn reactor-core on your own first.
    2. Turms optimizes its code from time to time. Some function names or implementations may change slightly, but the ideas will not change.
    3. The source code of each module usually does much more than what is described below, but for readability this article only explains the main flow and omits many details. Readers interested in the details can read the source code after going through the relevant explanations here and getting a general picture of the main flow, and then dig into the specific implementation.

    Project Structure

    We often say that code is documentation. Code lets readers understand the implementation details and logical relationships of each function from a micro perspective, while packages are like the table of contents of that documentation. A good package structure should clearly show the hierarchy and structure of the "documentation" at a macro level so that readers can understand it. This article explains the package structure of the Turms server to help developers better understand the relationships and hierarchy between packages.

    Background (extended content)

    No matter which package-division philosophy is used, there are in fact only four basic categories: by feature, by type, by layer, and no division at all; the various higher-level design ideas are simply different combinations of these basic categories.

    In addition, even for the same project, different package structures are usually suitable at different development stages. We often say that architecture evolves, and package organization also needs to evolve. For example, in the early days of the Turms server there were not many modules in total; if we had designed the early package structure according to the way today's Turms server divides its many modules, the result would have been lower, not higher, readability of the package structure: design for the sake of design, that is, over-design.

    Package-division goals (extended content)

    When designing the package structure, you must have a clear goal; otherwise it is easy to fall into "forcing a division just to fit a certain package design". For example, the service layer of some projects first writes an interface class and then writes the implementation class, without thinking about why such an interface is needed, or forcibly applies a DDD layered template without considering whether some designs seriously violate established conventions, which makes programming awkward.

    The main goals of the Turms server project's package division are:

    • Try to ensure high cohesion of feature modules and reduce module complexity. This is mainly for code maintainability: it avoids the very common by-feature + by-type (or by-feature + by-type + by-layer) mixed designs, because a mixed design both makes the ownership of code ambiguous and reduces the readability of the package structure by using different division strategies under the same level of packages, which is not conducive to long-term maintenance.
    • Try to ensure the independence of business sub-domains. This is mainly to draw clear business boundaries and make each module easy to read and change (in addition, turms-service will support deployment in various business-domain combinations in the future; for example, turms-service could be deployed as a service for the user business domain only, for the message business domain only, or for the user + message business domains together, etc.).
    • Feature modules of the supporting domain must be separated from business modules. This is mainly to draw a clear boundary between the problem domain and the supporting domain.
    • Try to let developers infer the upstream and downstream relationships of packages from the package structure. This is mainly for code readability: in long-term programming practice, when the package structure of a medium or large project is not layered, we may have to go through the packages or code several times before we can infer their likely upstream and downstream relationships.
    • Keep the package hierarchy as shallow as possible while the logic stays clear.

    In addition, when reading the package structures of various excellent open source projects, we find that most well-known medium and large open source server projects may not do layered design at all; they usually divide packages mainly by feature, supplemented by a by-type mixed design, or follow a conventional MVC or DDD layered design. We generally rate these approaches as "moderate, in line with conventions, but unsatisfactory", because they do not meet the goals listed above very well; many developers get lost and "don't know where to start" when reading the source code of such projects, and coding often runs into the problem of ambiguous code ownership.

    Package-division approach

    The various package-division philosophies usually only provide ideas for ideal scenarios and must not be applied blindly. When we designed the package structure for the Turms server, we mainly referred to: the design ideas of the various package-division philosophies, the practices and actual results of excellent open source projects, common conventions, project scale, project type, package scale, and long-term programming experience.

    Therefore, the various package-division philosophies are just "reference suggestions". In practice, we rely on long-term programming experience to judge whether a design fits the actual situation of the Turms project, and we borrow the strengths of each philosophy. Readers can also see many design ideas, and even the shadow of DDD-based microservice design, in the package structure of the Turms server project. Specifically, the package-structure diagram of the Turms server is as follows:

    Each box above is the name of an actual package in the Turms server, and the lines are the logical relationships between packages. Specifically:

    The first level is divided into layers: access, domain, storage and infra. Among them:

    • access: The access layer is responsible for session management and request scheduling between administrators and clients. This layer will distribute user requests to the access layer of the domain layer.

    • domain: business domain layer, responsible for processing the logic related to the various business domains. The domain layer is divided into three layers, access, service and repository, according to the common layered package design.

      • Among them, the access layer inside the domain layer is relatively special. Above the service layer there is not only the Controller-layer admin package that dispatches administrator HTTP requests, but also the Controller-layer client or servicerequest package that dispatches client requests (for turms-gateway it is the client package, and for turms-service it is the servicerequest package). Both share the Service layer, so a single access layer is used to cover both.

      • About why model is a separate package

        For example, dto (Data Transfer Object), bo (Business Object) and po (Persistent Object) in the diagram above are all anemic models; only model is a rich model. These classes not only store state (data) but also come with some behavior (logic) used to handle various highly cohesive logic, so they are special and placed in a separate package.

      • About the rpc package

        Some domains (domain) have their own unique RPC requests, and these RPC requests will be grouped under this domain. For example, the RPC request im.turms.server.common.domain.session.rpc.SetUserOfflineRequest under the Session domain.

        In addition, the implementation of cluster RPC is under the im.turms.server.common.infra.cluster.service.rpc package.

    • storage: Storage layer, providing MongoDB client management and Redis client management, corresponding to mongo package and redis package respectively.

    • infra: Basic service layer, responsible for providing basic functions for access and domain layers, such as log processing, configuration management, etc. The infra layer is divided into packages according to functional characteristics.

    In summary, the package division of the Turms server is actually designed quite carefully:

    • Through the four layers access, domain, storage and infra, developers can quickly understand the source-code layering of the Turms server based on the MVC layering knowledge they already have, and can clearly see how the packages of each layer relate to user sessions and user requests.
    • The business domains under the domain layer help developers quickly see which business domains each Turms server has. The interior of each domain follows the common MVC layered design, so developers can quickly understand the internal upstream and downstream logic of a business domain based on prior knowledge.
    • The infra layer helps developers understand which functional modules the supporting domain of the Turms server contains.

    Therefore, this package hierarchy is actually relatively clear, which is conducive to long-term maintenance. In addition, readers may have seen the shadow of many package-division philosophies in the practice described above; Turms only designs with reference to these philosophies and does not have to follow them.

    Additional notes:

    • Regarding why the first-level packages are not split into Java Modules: there is no need for modules at this stage, and splitting them would also increase the complexity of the project structure. Do not add entities unless necessary.
    • Most of the anemic models in the Turms server are represented by Java records, but some are still represented by classes for performance reasons (i.e. whether a whole new object must be created just to change one field).

    Request processing flow between packages

    After understanding the Turms package design above, readers should have a rough picture of how the Turms server processes requests. Here we take the most classic scenario, "client login", as an example and briefly walk through the flow from the perspective of packages (readers can follow along with the package diagram above), to help readers understand the layered package design more clearly.

    • When the client logs in, it must first establish a plain TCP or WebSocket connection with the turms-gateway server. At this point the access layer handles the network connection; since it is a client connection, the access/client package (rather than access/admin) is responsible.

    • After the network connection is established and the client sends a login request to turms-gateway, turms-gateway passes the parsed request through the access layer to the Controller in the domain/session/access/client/controller package. The Controller hands the specific business logic to the Service in the domain/session/access/client/service package, and the Service will: 1. hand the related MongoDB query operations to the Repository in the domain/session/access/client/repository package, which only assembles the related CRUD statements and passes them to the MongoDB client implementation in the storage/mongo package, which in turn sends the final requests to the MongoDB server; 2. hand the related Redis operations to the storage/redis package.

      After the request is processed, the result is returned step by step through callbacks, following the upstream and downstream relationships of the packages.

    • As for the infra layer and the various other subpackages, most of them provide supporting capabilities for the layers above, such as the infra/logging package for logs and the infra/cluster package for the cluster implementation.

    The processing flow of other types of requests (administrator HTTP requests, client business requests based on TCP/WebSocket connections) is roughly the same as above, and readers can infer other cases by themselves.

    The following chapters will continue to explain the processing flow of client requests from a more detailed source code perspective.

    Client request processing flow

    Before reading the following, readers are advised to read Standard Process for Client Access to Server first to understand the design ideas behind it from an architectural perspective, so that it is harder to "get lost" when reading the source code.

    Request model: im.turms.server.common.access.client.dto.request.TurmsRequest

    Response and notification model: im.turms.server.common.access.client.dto.notification.TurmsNotification

    UML sequence diagram

    turms-gateway

    Introduction: It is used to maintain the network connection with the client, maintain the application layer session, and send most of the business requests to the turms-service server.

    Network layer configuration

    1. Start the server that receives the client request

      TCP server: im.turms.gateway.access.client.tcp.TcpServerFactory#create

      WebSocket server: im.turms.gateway.access.client.websocket.WebSocketServerFactory#create

      Based on the reactor-netty library, these two functions mainly bind the server's listening address, configure the EventLoop thread pools, (optionally) configure SSL, enable the related metrics, and do other routine server setup work.

    2. For a pure TCP connection (not a prepared WebSocket connection), bind the codec Handlers to the newly established TCP connection

      In the im.turms.gateway.access.client.tcp.TcpServerFactory#create function, through the following callback, bind the corresponding codec instances of TurmsRequest and TurmsNotification to the new TCP connection.

      java
      .doOnConnection(connection -> {
             // Inbound
             connection.addHandlerLast("varintLengthBasedFrameDecoder", CodecFactory.getExtendedVarintLengthBasedFrameDecoder(maxFrameLength));
             // Outbound
        @@ -492,7 +492,7 @@
              }
              return sink.asMono();
         }

        At this point, the processing flow of the RPC sender ends.

        In particular, the reason why the request ID is not encoded upstream is that some RPC requests may be sent to multiple RPC receivers; for example, group messages are often forwarded to multiple turms-gateway servers. With separate encoding, the byte data passed from upstream can be shared without memory copies, which greatly improves memory usage. This is one of the reasons why Turms implements its own RPC service.

        RPC receiver of HandleServiceRequest

        TODO

      - + \ No newline at end of file diff --git a/docs/server/development/plugin.html b/docs/server/development/plugin.html index faa3f954..f1846ead 100644 --- a/docs/server/development/plugin.html +++ b/docs/server/development/plugin.html @@ -17,7 +17,7 @@ -
      Skip to content

      Custom Plugins

      List of plugin extension points

      Category | Extension | Description
      Admin class | AdminActionHandler | Admin action handler. Used to monitor the administrator's API operations
      User class | UserAuthenticator | User login authentication. When the client requests turms-gateway to log in, turms-gateway calls the plugin to run custom login authentication logic. With this plugin, you don't need to (optionally) synchronize user information from your business system to Turms
       | UserOnlineStatusChangeHandler | User online status change handler. When any user enters the online or offline state, turms-gateway calls this interface
      Request class | ClientRequestHandler | Client service request handler. Used to modify request parameters (or even transform them into other business requests) and implement custom requests. This handler is called when Turms receives a client service request. Through this plugin, you can implement sensitive-word filtering and other functions
      Notification and message class | NotificationHandler | Notification handler. When a behavior needs to be notified to relevant users, turms-gateway calls this handler. Commonly used to integrate custom third-party push services
       | ExpiredMessageDeletionNotifier | Expired message auto-deletion notification handler. When Turms automatically and periodically deletes expired messages, the Turms server calls this interface to inform the plugin implementer of all messages to be deleted. Commonly used by developers to back up messages
      Service implementation class | StorageServiceProvider | Storage service provider. The Turms project itself does not implement storage services, but only exposes the storage-service-related interfaces for plugins to implement (refer to turms-plugin-minio)
      Business model lifecycle class | (TODO) |

      Plugin loading method

      • Local loading: The Turms server detects whether the JAR files (ending with .jar) and JavaScript files (ending with .js) in the plugins directory of the release package are plugin implementations. If they are, they are loaded when the Turms server starts.

        Note: The Turms server will not load plugins stored in the lib directory.

        Expanded information: Directory structure of the Turms server release package

      • Load via HTTP interface:

        • Add the API interface of the Java plugin: POST /plugins/java
        • Add the API interface of the JavaScript plugin: POST /plugins/js

        Expanded information: Plug-in related API interface

      • Loading via turms-admin (based on "loading via HTTP"): On the /cluster/plugin page, administrators can also upload Java plug-ins and JavaScript plug-ins through the UI.

      Plugin implementation

      The Turms server supports plugin implementations based on JVM languages or JavaScript.

       | JVM language plugin | JavaScript plugin
      Language version | Java 21 (bytecode 65.0) | ECMAScript 2022
      Advantages | Suitable for implementing functions with complex logic, e.g. the official Turms plugin turms-plugin-antispam (sensitive-word filtering) | Just create a new JavaScript file and write custom logic directly; no compilation or packaging needed; easy to support hot updates
      Disadvantages | Even for a little custom logic, you still need to set up a plugin project first and then package the code into a JAR with a build tool, which is cumbersome | Complex logic is better implemented as a Java plugin; memory overhead is larger than for a Java plugin; interpreted execution, so runtime efficiency is lower
      General comments | More suitable for complex, heavyweight, relatively fixed plugins; this type of plugin is more like a "project" | More suitable for small, lightweight plugins that need hot updates; this type of plugin is more like a "small patch"

      JVM language version (take Java as an example)

      Implementation steps

      1. Install the JAR package dependencies of the Turms project for use when compiling your plugin

        • Clone Turms' warehouse. Reference command: git clone --depth 1 https://github.com/turms-im/turms.git
        • In the root directory of the Turms project (that is, the parent directory of the .git directory), execute the mvn install -DskipUTs -DskipITs -DskipSTs command to compile the Turms project source code, and automatically install the generated JAR package to the local In the Maven repository, for your plug-in compilation
      2. Build the plug-in project

        • Option 1 (recommended): Clone the turms/turms-plugin-demo directory to the local, and develop based on this template. This solution can reduce unnecessary repeated configuration work.

        • Option 2: Manual construction. Specific steps are as follows:

          1. Create a new Maven project and add dependencies in pom.xml (to implement the turms-gateway server plug-in, add turms-gateway dependencies. To implement turms-service plug-ins, add turms-service dependencies):

            xml
            <dependency>
            +    
            Skip to content

            Custom Plugins

            List of plugin extension points

Category | Extension | Description
Admin class | AdminActionHandler | Admin action handler, used to monitor the administrators' API operations
User class | UserAuthenticator | User login authentication. When a client requests turms-gateway to log in, turms-gateway calls the plugin to run custom login authentication logic. With this plugin, you do not need to synchronize the user information of your business system to Turms
User class | UserOnlineStatusChangeHandler | User online status change handler. When any user goes online or offline, turms-gateway calls this interface
Request class | ClientRequestHandler | Client service request handler, used to modify request parameters (or even transform a request into another business request) and to implement custom requests. This handler is called when Turms receives a client service request. With this plugin you can implement features such as sensitive word filtering
Notification and message class | NotificationHandler | Notification handler. When a behavior needs to be notified to the relevant users, turms-gateway calls this handler. Commonly used to integrate custom third-party push services
Notification and message class | ExpiredMessageDeletionNotifier | Expired message deletion notifier. When Turms periodically and automatically deletes expired messages, the Turms server calls this interface to tell the plugin implementer which messages are about to be deleted. Commonly used by developers to back up messages
Service implementation class | StorageServiceProvider | Storage service provider. The Turms project itself does not ship a concrete storage service implementation; it only exposes the storage-related interfaces for plugins to implement (refer to turms-plugin-minio)
Business model lifecycle class | (TODO) |

            Plugin loading method

• Local loading: The Turms server checks whether the JAR files (ending with .jar) and the JavaScript files (ending with .js) in the plugins directory of the release package are plugin implementations. If they are, they are loaded when the Turms server starts.

  Note: The Turms server will not load plugins stored in the lib directory.

  Further reading: Directory structure of the Turms server release package

• Loading via the HTTP API:

  • API to add a Java plugin: POST /plugins/java
  • API to add a JavaScript plugin: POST /plugins/js

  Further reading: Plugin-related API interfaces

• Loading via turms-admin (based on "loading via HTTP"): on the /cluster/plugin page, administrators can also upload Java and JavaScript plugins through the UI.

            Plugin implementation

The Turms server supports plugins implemented in JVM languages or in JavaScript.

Aspect | JVM language plugin | JavaScript plugin
Language version | Java 21 (bytecode 65.0) | ECMAScript 2022
Advantages | Suitable for implementing features with complex logic, e.g. the official Turms plugin turms-plugin-antispam for sensitive word filtering | Just create a new JavaScript file and write custom logic directly; no compilation or packaging is needed, and hot updates are easy to support
Disadvantages | Even for a small piece of custom logic, you still have to set up a plugin project and package the code into a JAR with a build tool, which is cumbersome | Complex logic is better implemented as a Java plugin; the memory overhead is larger than that of a Java plugin; interpreted execution is less efficient
General comments | Better suited to complex, heavyweight, relatively stable plugins; this type of plugin is more like a "project" | Better suited to small, lightweight plugins that need hot updates; this type of plugin is more like a "small patch"

JVM language plugin (taking Java as an example)

            Implementation steps

1. Install the JAR dependencies of the Turms project for use when compiling your plugin

  • Clone the Turms repository. Reference command: git clone --depth 1 https://github.com/turms-im/turms.git
  • In the root directory of the Turms project (i.e., the parent directory of the .git directory), run the mvn install -DskipUTs -DskipITs -DskipSTs command to compile the Turms project source code and automatically install the generated JAR packages into the local Maven repository for your plugin build
            2. Build the plug-in project

• Option 1 (recommended): Copy the turms/turms-plugin-demo directory to your machine and develop based on this template. This approach avoids unnecessary, repetitive configuration work.

• Option 2: Build it manually. The steps are as follows:

  1. Create a new Maven project and add the dependency to pom.xml (to implement a plugin for the turms-gateway server, add the turms-gateway dependency; to implement a turms-service plugin, add the turms-service dependency):

                  xml
                  <dependency>
                       <groupId>im.turms</groupId>
                       <artifactId>turms-gateway</artifactId>
                       <version>0.10.0-SNAPSHOT</version>
                  @@ -222,7 +222,7 @@
                   }
                   
                   export default MyTurmsPlugin;

Where:

• The MyTurmsExtension class is a developer-defined extension that extends TurmsExtension; the class name can be chosen freely. Within it:

  • The getExtensionPoints function must exist and returns the names of the extension points implemented by the extension class. If the developer declares an extension point but does not implement its interface functions, the Turms server simply skips this plugin when invoking the plugin callbacks, without reporting an error.
• The MyTurmsPlugin class is a developer-defined plugin that extends TurmsPlugin; the class name can be chosen freely. Within it:

  • The getDescriptor function must exist, and the object it returns is the plugin's description information:

    • The id field is used to distinguish plugins. It has no required format, but it must not be empty.

    • The other fields are only descriptive and currently have no practical function, so they can all be left empty.

  • The getExtensions function must exist, and the object it returns is an array of extension classes, such as MyTurmsExtension above.

• export default is used to export the developer-defined plugin, such as MyTurmsPlugin above (see the sketch below).
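To make the structure above concrete, the following is a minimal sketch of such a JavaScript plugin, reconstructed from the description in this section (the official example above is truncated in this diff). The class names, the extension point name, and the descriptor fields other than id are illustrative assumptions; TurmsExtension and TurmsPlugin are assumed to be provided to the plugin script by the Turms server.

js
// A minimal sketch reconstructed from the description above; not the official template.
class MyTurmsExtension extends TurmsExtension {
    // Must exist: returns the names of the extension points this extension implements.
    // 'ClientRequestHandler' is a placeholder; use the extension point you actually implement.
    getExtensionPoints() {
        return ['ClientRequestHandler'];
    }
    // The callback functions of the declared extension points would be implemented here.
}

class MyTurmsPlugin extends TurmsPlugin {
    // Must exist: returns the plugin descriptor. Only "id" is required and must not be empty.
    getDescriptor() {
        return {
            id: 'my-turms-plugin',
            version: '0.0.1',
            description: ''
        };
    }
    // Must exist: returns the array of extension classes.
    getExtensions() {
        return [MyTurmsExtension];
    }
}

// Export the developer-defined plugin.
export default MyTurmsPlugin;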

Precautions:

• The Turms server only checks whether files ending with .js in the plugins directory are plugin implementations, so plugins placed in other directories, such as the lib directory, will not be recognized or used.
• Turms does not implement access control for plugins, so you need to make sure there is no malicious code in a plugin. Note: a malicious plugin can not only call functions that forcibly shut down the Turms server, it can even take control of the operating system directly.
• The context is per plugin: each plugin has its own independent context, and all functions of a plugin share one context. In other words, a function executed later can see the changes that an earlier function made to the context.
• JavaScript plugins can also access the Java classes and instances of the Turms server just like Java plugins, and can even call System.exit() directly (see the sketch below), but writing complex plugins in JavaScript is not recommended.
• Calling Node.js modules is not supported.
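As a hedged illustration of the precaution about reaching Java classes (and of why untrusted plugins are dangerous): assuming host access is enabled for plugin scripts, as the precautions above imply, GraalVM's standard Java.type lookup can load a server-side Java class from JavaScript.

js
// Hedged sketch: assumes host access is enabled for plugin scripts (the precautions
// above state that Java classes and even System.exit() are reachable from JavaScript).
// Java.type is GraalVM's standard way to look up a host class from a guest script.
const System = Java.type('java.lang.System');
const startMillis = System.currentTimeMillis(); // calling a static method of a Java class
// A malicious plugin could just as easily call System.exit(1) and force the server down,
// which is why plugin code must be trusted.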

                  Main global object

• The load function is a global function provided by GraalVM, used to load external JavaScript resources.
• The turms object. This object exposes:
  • the log object, used for logging
  • the fetch function, used to send HTTP requests (see the sketch below)
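A hedged usage sketch of these globals inside a plugin function follows. The exact method names on turms.log and the exact signature of turms.fetch are assumptions (the section is still marked TODO below), and the function name and URL are purely illustrative. The sketch also shows the per-plugin shared context described in the precautions above.

js
// Module-level state: shared across all calls to this plugin's functions,
// because each plugin has one independent context.
let invocationCount = 0;

function handleSomething(request) {
    invocationCount++;
    // Assumed logging API: a conventional level method on the mounted log object.
    turms.log.info('Handled ' + invocationCount + ' calls so far');
    // Assumed Fetch-like API: send an HTTP request to an illustrative URL.
    return turms.fetch('https://example.com/webhook', {
        method: 'POST',
        body: JSON.stringify({ request: request })
    });
}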

                  TODO

Plugin debugging steps

In debug mode (set turms.plugin.js.debug.enabled to true to enable it):

1. When the plugin host, i.e. the Turms Java server, calls a JavaScript plugin function through the Java Proxy-based implementation (the proxy source code is in im.turms.server.common.infra.plugin.JsExtensionPointInvocationHandler), the WebSocket debugger server that listens for the JavaScript plugin waits for the developer to attach the Chrome debugger, which guarantees that the JavaScript plugin code is not executed until the debugger is attached. At this point, the Java thread that called the JavaScript plugin function enters the WAITING state and waits for the JavaScript plugin function to finish executing.

2. To inspect the JavaScript plugin code, the developer opens Chrome and enters the listening address of the WebSocket debugger server. On that page, the developer can set breakpoints in the JavaScript plugin code for debugging. The listening address is printed to the console by the Turms server and looks similar to:

                    Debugger listening on ws://127.0.0.1:24242/bd62b7be-bdec-48a6-9ad0-9314af33d531 For help, see: https://www.graalvm.org/tools/chrome-debugger E.g. in Chrome open: devtools://devtools/bundled/js_app.html?ws=127.0.0.1:24242/bd62b7be-bdec-48a6-9ad0-9314af33d531

Here, devtools://devtools/bundled/js_app.html?ws=127.0.0.1:24242/bd62b7be-bdec-48a6-9ad0-9314af33d531 is the address to open in Chrome.

3. After the Chrome debugger is attached, the JavaScript plugin function starts to execute.

4. After the JavaScript plugin function finishes executing, the Java calling thread returns to the RUNNABLE state, and the Java proxy function then returns the data returned by the JavaScript plugin function.

Configuration items

Config name | Default value | Description
turms.plugin.enabled | true | Whether to enable the plugin mechanism
turms.plugin.dir | plugins | The directory where local plugins are located. The Turms server loads plugins from this directory
turms.plugin.network.proxy.enabled | false | Whether to use an HTTP proxy when downloading network plugins
turms.plugin.network.proxy.username | | HTTP proxy username
turms.plugin.network.proxy.password | | HTTP proxy password
turms.plugin.network.proxy.host | | HTTP proxy hostname
turms.plugin.network.proxy.port | 8080 | HTTP proxy port number
turms.plugin.network.proxy.connect-timeout-millis | 60_000 | HTTP proxy connection timeout (milliseconds)
turms.plugin.network.plugins[?].url | | Plugin URL
turms.plugin.network.plugins[?].type | AUTO | Plugin type. When the value is AUTO, the Turms server detects the plugin type from the URL path: if the URL ends with .jar, it is treated as a Java plugin; if it ends with .js, it is treated as a JavaScript plugin; otherwise the Turms server throws an exception because the plugin type cannot be recognized. When the value is JAVA, the plugin is treated as a Java plugin. When the value is JAVA_SCRIPT, the plugin is treated as a JavaScript plugin
turms.plugin.network.plugins[?].use-local-cache | false | Whether to use the local plugin cache. If false, the Turms server re-downloads the plugins every time it starts
turms.plugin.network.plugins[?].download.http-method | GET | The HTTP request method used when requesting the plugin URL
turms.plugin.network.plugins[?].download.timeout-millis | 60_000 | Timeout for downloading a plugin (milliseconds)

                  OpenAPI address: http://playground.turms.im:8510/openapi/ui#/plugin-controller

Controller | Path | Description
PluginController | GET /plugins | Query plugins
PluginController | PUT /plugins | Update plugins
PluginController | DELETE /plugins | Delete plugins
PluginController | POST /plugins/java | Add a Java plugin
PluginController | POST /plugins/js | Add a JavaScript plugin
            - + \ No newline at end of file diff --git a/docs/server/development/redevelopment.html b/docs/server/development/redevelopment.html index db22b766..ec58920e 100644 --- a/docs/server/development/redevelopment.html +++ b/docs/server/development/redevelopment.html @@ -17,7 +17,7 @@ -

            About Secondary Development

            Reasons for Secondary Development Based on Turms

            Objective Reasons

• Uniqueness. The Turms solution is the only solution in the global open-source instant messaging field that is based on a modern architecture and modern engineering technology and is suitable for medium- and large-scale deployments. Dozens of other open-source IM projects are still in a primitive state; most of them focus on enterprise communication or end-to-end encrypted IM, and usually appeal only to enterprise users. Apart from Turms, there is no medium-to-large open-source IM project in the global community designed for conventional Internet applications.

• Standardization. The architecture of Turms is a variant of the standard commercial instant messaging architecture, so if your professional team designs along common commercial standards, the architecture it designs will be similar to the current architecture of Turms, and there is no need to start self-development from scratch.

• Simplicity. The overall architecture of Turms and the implementation of each module are actually relatively simple and lightweight, so secondary development is not difficult.

• Controllability. Turms is developed under the Apache 2.0 license, is 100% open source, and includes a lot of self-developed basic middleware, which keeps the underlying technology under control and avoids running out of development momentum later in the project.

• Complete documentation. It includes design documents for modules such as message awareness, the observability system, sensitive word filtering, anti-spam rate limiting, and the global blacklist. The Turms documentation is written with the attitude of "we are afraid readers will not understand": it describes not only what to do and how to do it, but also why to do it, helping developers understand each functional module through the design concepts, ideas, and key points, which is actually quite rare in the open-source community. In contrast, in order to earn consulting fees or out of fear of being plagiarized, some people behind open-source IM projects write with the attitude of "we are afraid users will understand", so they are unwilling to write good documentation.

  Reminder: The importance of design documents to developers and architects is self-evident. When evaluating open-source IM projects, readers can check whether a project's documentation is written "for fear of not being understood" or "for fear of being understood".

• An IM system is full of details, and the skill level of developers varies widely, so the quality of the resulting projects is hard to guarantee. Making it possible for user A to send messages to user B or group B covers at most 1% of an IM system's functionality, and these functional modules are not general-purpose libraries that can be plugged in at will; they need to be customized. For example, Turms implements sensitive word filtering based on a double-array-Trie Aho-Corasick automaton, and each implementation interlocks with the others (in fact, even the Turms documents cross-reference one another), so every module has to be self-developed, which requires a strong foundation from both designers and developers.

  (If you want to know how many detailed features a complete IM system has, keep reading the Turms documentation. Of course, the features of an IM system can be even richer; this is what we said above: IM is not only complex, it can be made almost endlessly complicated.)

  Turms has already implemented an essentially complete IM server system. We have implemented, or laid a solid foundation for, both the features users expect and the ones they did not think of. Even for features that are not implemented, we generally explain why they are deliberately left out, to keep things transparent.

  In addition, some of Turms' implementation schemes may look like the "natural" solution, but in fact, when we design and implement a scheme we have usually rejected many alternatives. Behind it lies a great deal of derivation and practice; users only see the final solution and therefore feel that "this is the natural solution". The design documents of each Turms module contain related explanations on this point.

• High code quality. The Turms server consistently maintains the level expected of a senior engineer in its code implementation and strikes a balance between performance and readability. For details, refer to the Turms server source code and the design documents of each module. The reason we dare to say that the Turms server approaches the limit of the Java ecosystem is that, besides the Turms server itself being very efficient, we have refactored many inefficient but critical dependencies (such as mongo-java-driver and Lettuce), and have even developed our own replacements (such as the logging and cluster implementations), to ensure maximum performance.

  In particular, some open-source projects claim good performance, but the code tells the truth. Here are three general, quick, and practical ways to judge the coding level of open-source authors, for the reader's reference:

  • (Elementary) Reasonable use of syntax, data structures, and programming paradigms.
  • (Intermediate) Observe the author's vocabulary and word accuracy through class names, variable names, function names, and so on. Vocabulary and word accuracy are hard to fake, and it is usually easy to infer the author's technical background, level, and coding experience from them. If the author has a rich vocabulary and uses words accurately, the coding level is usually not bad.
  • (Advanced) Anti-paradigm design (such as going against design patterns, against conventional algorithm designs, using Unsafe operations, etc.). Reasonable use of design patterns shows whether the author has design thinking, while daring to go against the paradigm usually means the author has a clear goal in mind, is very familiar with the relevant designs and underlying code, understands the shortcomings of the conventional designs, has the confidence to answer "why not follow the standard routine", and only then dares to design against the paradigm.

  Of course, the above methods are only for reference; in practice there can be more points to examine.

• The technical choices are forward-looking. As software engineers, we understand one point deeply: today's celebrated, well-known technical solutions may become yesterday's news tomorrow and turn into "technical debt", such as Hadoop on the server side, and Bootstrap, Backbone.js, and Ember.js on the web side. When Turms selects technologies, it considers not only the current state of the art, such as in the cluster design and implementation, but also how technologies will evolve, such as the Project Valhalla and Project Loom mentioned in the discussion of system resource management.

• There is a large market demand for self-developed IM services. Searching IM engineer positions on various recruiting websites shows that a large number of companies at home and abroad are still hiring IM-related talent, and companies spend millions or even tens of millions building self-developed IM services from scratch or on top of ancient open-source IM projects, which makes poor use of social resources.

In addition, if you are still hesitating over whether to adopt another open-source IM project, we highly recommend comparing it with Turms. After skimming the documentation and source code of Turms and the other open-source IM projects, we believe you will have a clear answer in mind.

            Subjective Reasons

            • The core business of your project is related to instant messaging, or you have plans to further develop instant messaging business.
            • The extended functions required by your project are not currently available in Turms, especially the extended functions that need to be implemented through auxiliary index tables (for auxiliary index tables, please refer to Turms Collection Design).
• Your project has a large number of project-specific IM implementation details. Although Turms provides hundreds of configuration options, these are only general-purpose configurations. Depending on the specific business needs, the concrete implementations of IM-related features can be extremely varied, and Turms cannot provide these relatively niche business functions directly, otherwise the amount of code would grow exponentially, so you need to do the secondary development yourself.

            Project Import

            1. Pull the Turms repository: git clone https://github.com/turms-im/turms.git

2. Since the proto model files of the Turms sub-projects are kept in an independent repository, you also need to pull the submodule code by running the following commands in the root directory of the Turms project:

            git submodule update --init --recursive
             git submodule foreach git pull origin master
3. (Optional) If you use IntelliJ IDEA, you can import the entire Turms project via File -> New -> Project from Existing Sources. IDEA will automatically recognize the directory structure of the whole Turms project and import the corresponding Maven dependencies.

            Build a Development Environment

Except for the Turms server, building the other Turms sub-projects is routine and simple, so it is not described in detail here.

Setting up the development environment for the Turms server is also very simple. The steps are:

1. Install JDK 21 to develop the Turms server

2. Download, install, and start the Redis server. Taking RHEL/CentOS as an example:

              bash
              yum install epel-release
               yum update
              @@ -28,7 +28,7 @@
               yum install redis
               systemctl start redis
               systemctl enable redis

  For the Windows platform, you can download a Windows build from tporadowski/redis for local development and testing.

3. Download, install, and start a MongoDB sharded cluster

  • Download MongoDB 4.4
  • Start the MongoDB sharded cluster: it is recommended to install mtools to build a MongoDB sharded cluster automatically. The installation command is pip3 install mtools[mlaunch]. After installing mtools, just run mlaunch init --replicaset --sharded 1 --nodes 1 --config 1 --hostname localhost --port 27017 --mongos 1 and wait a few seconds for the MongoDB sharded cluster to be set up
4. After confirming that the Redis server and the MongoDB sharded cluster are running normally, you can start the Turms servers

Notes:

• Redis and MongoDB can be registered as services that start automatically at boot, so you do not have to rebuild them manually after every restart. Even when building manually, a developer can usually bring up Redis and a MongoDB sharded cluster in 10 to 30 seconds after a few practice runs; the setup and startup process is very simple.
• For server-side development, it is recommended to change spring.profiles.active=prod to dev in the application.yaml of turms-gateway and turms-service. This is because:
  • With the default production configuration, the Turms server does not print logs to the console, which is inconvenient for debugging
  • In the dev environment, turms-service automatically generates fake data in the MongoDB database, and turms-gateway automatically creates TCP-based fake clients that send real client requests to turms-gateway at random (random request types and random request parameters), which is convenient for developer testing
• If you want to change the port of the MongoDB server, just replace 27017 globally with your target port in the Turms project.

            About Task Difficulty

For teams planning to do secondary development based on Turms (i.e. changing the source code of the Turms project itself), the task difficulty table below can be used as a reference when assigning tasks to members.

            The difficulty value of the task ranges from 0 to 10, where:

            • 0 means extremely simple
            • 1~3 means simple
            • 4~6 means medium
            • 7~9 means difficult
            • 10 means unachievable

            Server

            "Code implementation difficulty" is mainly considered from two perspectives, one is the logic complexity, and the other is the workload (the degree of tediousness, mainly relying on "physical strength" to achieve). For example, the same amount of self-developed implementation of spring-webflux, its logical complexity is 1~3, but the workload is 5~6, and the combination of the two is 5~6. The algorithm implementation is generally high logic complexity and low workload.

Task | Requirements analysis | Related process design | Code implementation difficulty (prerequisite: the code implementation must be efficient)
IM basic business functions | 3~7. It is necessary to consider whether all IM features are logically consistent and whether they can be implemented efficiently (which may in turn constrain the IM feature requirements), etc. | 4~6: initial stage. For example, choosing between read diffusion, write diffusion, and a read-write hybrid for messages, and between push, pull, and push-pull hybrids for various notifications. 1~2: current stage | 1~3. Most tasks are regular CRUD operations. The few tasks rated 3 are about balancing code elegance against an efficient implementation, which is more of a code design problem
Extended functions | 2~5 | 3~4: initial stage. 1~2: current stage | 2: rate limiting and anti-spam mechanism. 4~5: global blacklist. 7~8: sensitive word filtering
Middleware implementations and basic libraries | 1~3 | 1~3 | 1~4. 1: e.g. metrics, the distributed snowflake ID generator. 2~3: e.g. logging, the distributed configuration center. 3~4: e.g. the plugin mechanism, RPC, service registration and discovery
Bug fixes | 0~3 | 0~3 | 1: most regular bugs. Turms seldom fixes a bug in isolation: before fixing it, we generally deduce whether the business process design that caused the bug is reasonable and whether there is room for optimization, and only then fix the bug. Hard-to-fix bugs generally have nothing to do with the code implementation; they usually come from flaws in the process design. For example, if the architecture design is wrong, say read diffusion should have been used but write diffusion was chosen, then no matter how the upper layer is changed, it only scratches the surface
Custom algorithms and data structures | 1 | 1~2 | 1: general custom data structures, such as im.turms.server.common.infra.collection.FastEnumMap. 2: lock-free thread-safe custom data structures, such as im.turms.server.common.infra.collection.ConcurrentEnumMap and im.turms.server.common.infra.throttle.TokenBucket. 4~5: lock-free thread-safe growable custom data structures, such as im.turms.server.common.infra.collection.SpscGrowableLongRingBuffer. 8: im.turms.plugin.antispam.ac.AhoCorasickDoubleArrayTrie used for sensitive word filtering

            General comments:

• The difficulty of IM features lies in requirements analysis and high-level design. Adding a new IM feature requires considering not only whether it is logically consistent with the other IM features, but also whether the current architecture can implement it efficiently, whether distributed transactions are needed, whether collection fields have to be added in the database, and many other issues. As for code packaging and layering, it was more complicated in the early stage, but those problems have been solved and are now relatively stable, so new tasks generally do not run into difficulties in code flow design. The concrete code implementation is usually routine, though some implementations can be relatively tedious.
• Implementing custom middleware and basic libraries is basically not difficult; the main thing to pay attention to is requirements analysis (which is, of course, much simpler than the requirements analysis of IM business functions).
• Most bugs are not difficult in themselves, but you need to trace back to the root cause of the bug and consider whether the business process has room for optimization (which, again, is really requirements analysis).
• Except for the double-array-Trie-based Aho-Corasick automaton, which is difficult to implement, most other custom algorithms are relatively easy to implement. In fact, very few algorithms and data structures need to be customized, so a team doing secondary development should not run into algorithm or data structure problems.

Special mention: even deciding not to implement a feature requires requirements analysis. For example, the processes of some Turms features have been designed and their code has even been written, but because the requirement may conflict with other requirements, or because it is dispensable while carrying a large performance cost, these features remain in a pending state: implemented but not released.

            turms-admin

turms-admin itself has no technical difficulties; its code and implementation are relatively standard, and it does not suffer from the problem, common in medium and large front-end projects for historical reasons, of a large number of heterogeneous nested sub-projects (for example, a root project using Backbone with sub-projects nested inside it mixing Vue, Angular, React, etc., plus all kinds of dependency version conflicts), so junior front-end engineers should be able to get started and do secondary development.

The time spent on a new UI feature is generally split as: requirements analysis (40%) > UI design (30%) >= code implementation (30%)

turms-client

turms-client itself has no technical difficulties either, and its code and implementation are relatively standard, so junior engineers should be able to get started and do secondary development.

Relatively speaking, the difficulty of turms-client lies in API design: making the interfaces as self-explanatory as possible while still giving developers the ability to extend the underlying layer.

            - + \ No newline at end of file diff --git a/docs/server/development/rules.html b/docs/server/development/rules.html index 2dd5909d..0e93d7db 100644 --- a/docs/server/development/rules.html +++ b/docs/server/development/rules.html @@ -17,7 +17,7 @@ -

            Basic Development Rules

            Conservative Design vs. Radical Design

Java itself is a very conservative language, and its ecosystem is also very conservative. Its design principle is roughly "provide a set of safe APIs so that no matter how Java users use these APIs, they cannot cause errors inside Java" (except for the Unsafe classes), hence the various access control mechanisms, internal memory copies, and repeated locking. The coding principle of the Turms server is, in general, "write the code in whatever way makes the program run fast; if a caller dares to pass or use data carelessly, we simply report an error or ignore it". For example, Turms' StringUtil uses jdk.internal.misc.Unsafe#getReference to obtain the byte[] inside a String object so as to avoid a memory copy, and the caller has to guarantee that it does not "do anything wrong"; whereas Java's own String#getBytes(), in order to guarantee that users cannot modify the internal byte[], copies the byte[] and hands the copy to the caller.

Consequently, when it comes to strings, in a conventional Spring-based web application, after an HTTP request is cut out of the TCP byte stream, its data may have to be repeatedly converted and concatenated among String, StringBuilder, byte[], HeapByteBuffer, DirectByteBuffer, and other representations. In the end, it is very common for a business-level String object to be copied 5 to 30 times by third-party libraries and by Java internals.

Taking a concrete application as an example: suppose we use Spring to create a Controller bean and define in it an API method whose return type is String, in order to return metrics data in Prometheus format through this API. Even with the "most elegant" implementation under this premise, this piece of data has to be memory-copied at least 4 times (excluding the part where the system kernel flushes the data to the network card; Turms is optimized to need only one memory copy, from heap memory to off-heap memory; the metrics data is about 8 KB):

1. Write the basic Java data into a StringBuilder: a heap-to-heap memory copy
2. StringBuilder#toString(): another heap memory copy
3. String#getBytes(): at least one more heap memory copy
4. Write the byte[] into an off-heap DirectByteBuffer so the system kernel can write the data out

The effective memory usage is extremely low, and note that the above is only the simplest String-returning API; the flows involved in real applications are more complicated, so it is very common for a string to be copied 5 to 30 times over the course of one flow. This is why, when HTTP servers are built on the mainstream ecosystems of their languages, a conventional Java web application may use tens or even hundreds of times as much memory as an equivalent C++ HTTP server.

Besides the various network APIs, the logging implementation also deals with strings frequently. In terms of memory, Turms is much more efficient than general-purpose implementations: Turms allocates pooled off-heap memory directly through PooledByteBufAllocator.DEFAULT and writes basic Java data directly into off-heap memory blocks. Throughout the whole process, we avoid Java's own inefficient implementations and thereby avoid meaningless heap-to-heap memory copies.

In summary, although Java itself is relatively conservative, Turms is relatively radical: it prioritizes performance over "elegant code" and makes good use of the Unsafe classes when necessary. Of course, our "radicalism" also has limits, e.g.: 1. never replace Java's internal class implementations; 2. try not to write JNI or C code

Additional notes:

1. Regarding Java syntax sugar, our attitude is relatively relaxed. For example, both for (X x : Collection<X>) (which creates an iterator object and costs at least a few dozen bytes) and the more efficient for (int i = 0; i < length; i++) are allowed
2. Besides the conservative tendency, there is also a rather paradoxical phenomenon in the Java community: "selective blindness during optimization". On the one hand, a String gets copied dozens of times without anyone caring; on the other hand, people painstakingly study JVM memory tuning. When Turms looks at optimization candidates, it mainly goes by "cost-effectiveness" and prioritizes the parts with the highest return, so as to avoid effort spent in the wrong places

Basic principles of server-side development

            Prioritization of coding strategies

            General rules: performance (low time and space complexity) > code readability > design patterns

• Performance > code readability. For example, use long instead of java.util.Date or java.time.Instant to represent time, to avoid creating new objects and doing conversions during time handling; as another example, the nextIncreasingId and nextLargeGapId functions of the class im.turms.server.common.infra.cluster.service.idgen.SnowflakeIdGenerator repeat about 10 lines of code, but we do not extract the common code, to avoid opening a new method stack frame (disregarding possible JVM inlining).
• Performance > design patterns. Example scenarios:
  • Iterating over the char[] elements of a String. With the chain-of-responsibility pattern, different Handler classes implement different kinds of processing logic. Although this keeps the logic clear, every Handler has to traverse the char[], so the processing time complexity is O(n*m) (n is the length of the char[], m is the number of Handlers). Code of this complexity is prohibited in the Turms server. In such cases the code has to be written against the design pattern: put the processing logic into a single traversal as far as possible, preferably without opening new functions to separate the logic (this part is optional), and use comments to divide the different pieces of processing logic, so as to avoid function call overhead.
  • The efficient design of the Protobuf model has always been praised, but the official Java implementation of Protobuf is conservative and inefficient. For example, a Protobuf model is immutable and only its Builder is mutable, so to modify a Protobuf model you must first call toBuilder() to get a Builder and then create a new Protobuf model instance, which makes the effective memory usage low. (As an extra note: its string decoding implementation is also very inefficient; to stay compatible with older Java versions it decodes into char[], while newer Java versions store only byte[] inside String, so an additional type conversion is needed.) In the code under our control, we avoid the Builder whenever it is not needed, to avoid meaningless memory consumption.

Exception: in rare cases, code readability takes precedence over performance. Take the rule "reflection is prohibited while processing client requests and admin API requests", mentioned below, as an example. Despite this rule, if a request needs to create an Entity object for the database driver, we still create and populate this object through reflection, because without reflection we would have to hand-write the serialization and deserialization logic of hundreds of fields, which is a huge amount of work and error-prone. The benefit of using reflection here is very high, so reflection is allowed.

There are many more examples like the above; see the Turms server code for details. When adding new code, just make sure there is hardly any room left for time or space optimization in the new code. If there is still room for optimization but the benefit is low and the implementation is complex, it can be optimized later.

            Threads and locks

• The use of elastic thread pools is prohibited; creating new threads requires a dedicated code review

• Try not to use synchronized or Lock operations (including reentrant locks) while processing client requests and admin API requests. If a critical section is really needed, prefer refactoring the code flow or replacing the lock with CAS techniques.

            Memory and GC

• Copying ByteBuf is prohibited

• For network I/O operations, the use of non-pooled memory or heap memory is prohibited; only pooled direct memory is allowed

• Try not to create new objects; use object pools where possible. A common design practice is to split data models into DTO and BO models in order to logically separate different layers. For such scenarios, Turms tries to use a single data model and produces a response conforming to the DTO model by customizing Jackson's serialization logic

  Also: this rule will change after Project Valhalla is released; in particular, we will remove most of the existing object pools

• Try not to create objects with many unused fields. For example, Turms replaced MongoDB's FindOptions model with a custom QueryOptions model, partly because FindOptions is used frequently yet carries dozens of useless fields.

• While processing client requests and admin API requests, the use of Stream is prohibited

• Regarding the question "why do some functions that could seemingly use primitive parameters still use wrapper classes": wrapper classes are used because, although some parameters of a function may look usable as primitives, these values will very likely end up being passed to Java collection implementations (such as Map<Long, Object>), to functions that accept objects (Object types, Long types, generics, etc.), or stored in Object fields of classes. Therefore, if a function only uses a primitive locally, that primitive is likely to be converted back and forth between the wrapper class and the primitive many times over the whole flow. In short, in most cases the Turms server uses wrapper classes uniformly to avoid such repeated conversions, and uses primitives only when we can guarantee the value will not be converted into a wrapper class.

  This is also why, in "About the Valhalla Project", we said that the design philosophy of "everything is an object" "lingers like a curse": in complex logic it is difficult to keep a primitive from being boxed into a wrapper class, and meaningless objects waste a lot of memory, which is why we have been waiting for the Valhalla project to finalize its work on wrapper classes and to support features such as the List<int> type.

            Proxy and Reflection

• Do not use dynamic proxy technologies (such as Java dynamic proxies, CGLib, Spring AOP, etc.); try not to use proxies at all, or use static compile-time techniques instead (such as Lombok).

  The only exception: in the plugin mechanism of the Turms server, Java dynamic proxies are used to proxy plugins written in JavaScript.

• While processing client requests and admin API requests, reflection is prohibited unless avoiding it would require writing a large amount of complicated code. For example, Turms uses reflection when serializing and deserializing the hundreds of fields of MongoDB Entity models.

In addition, if a third-party dependency violates the above principles, that dependency will be refactored according to a cost-effectiveness-based schedule.

            text format

            toString() text format

            The text format implemented by the Java project toString() is varied, and even the internal code of Java itself has many text formats with inconsistent styles. As far as the bracket style is concerned, there are not only the default [key=value] format of Java record, but also the (key=value) format generated by Lombok, and the {key=value} format generated by Google AutoValue.

            In order to achieve a unified text format, the Turms server project uniformly adopts the following format:

            • For the prefix and suffix of the text, use { and } respectively, instead of [] or (). Because in the text format design of Turms, [] refers to an array, and () refers to a special mark to make important information more eye-catching. For specific rules, see Server operation log and exception text format below.

            • Use the mainstream = instead of : between keys and values.

            • For string values, you need to use "" to wrap the value; for other non-array values, use the toString() form of the value; for array values, use [] to include the value in the array .

              For example: ClassName{key1=value, key2=[value1, value2]}

            **Note: The Turms server has not yet unified the text format of toString(), but the content described above is the direction of improvement in the future. **

            Server running log and exception text format

            Because there are many details in the text format design of logs and exceptions, and the principles of many common practices are in conflict with each other, and there is no unified best practice in the Java field, almost all large and medium-sized open source projects (including Java itself) source code) cannot achieve a unified text format, but a mix of various text formats, and the specific format depends mainly on the current "feeling" of the engineer.

            Therefore, this section specifically explains which text formats the Turms server uses, and why some other common text formats are not used, so as to reduce readers' confusion in practice.

            The Importance of a Uniform Format

            For some text formatting rules, readers may not feel the difference between the rules when reading a single log. But when readers need to read dozens, even hundreds, or thousands of different logs, they can understand how much energy saving in reading using a standardized and unified text format.

            specific rules

            simply put:

            • Important information in the text should be put in the end of the sentence as much as possible. Vital information is usually a variable.
            • When the important information is at the end of the sentence, you need to use : to separate the important information from other texts. For example: use Could not find the class: my.company.Main instead of The class (my.company.Main) could not be found.
            • Sentences do not need to omit the articles a, an and the. This point is especially emphasized because most well-known large and medium-sized open source projects tend to omit articles.
            • For noun phrases, restrictive appositions are usually used instead of attributive nouns. For example, restrictive appositions: The collection "messasge" or The setting "turms.whatever.min"; attributive nouns: The "messasge" collection and The "turms.whatever.min" setting.
            • Function and use of special symbols:
            RoleSymbols usedIn a sentenceWhen paired with : When paired with an arrayCommon examples
            Represents an array value[,]Use [value] format.
            Such as Detected illegal operations [CREATE, DELETE] on the collection "message"
            use : [value] format.
            Such as Detected illegal operations: [CREATE, DELETE]
            Indicates interval[..] closed interval, (..) open intervalsuch as: [1..2], ``
            Wrap information that needs to be specially separated for eye-catching()Use the (value) format.
            Such as The path (/turms/1.txt/) is illegal
            No need to use (), just use : value format.
            Such as Could not find any resource from the path: /turms/1.txt
            No need to use (), just use [value] format.
            Such as The paths [/1.txt, /2.txt] are illegal
            object, enumeration value, path, domain name, field reference
            Wrap key-value pairs{}Use {key=value} format.
            Example
            use : {key=value} formatuse [{key=value}, {key=value}] format
            Package name or string value""Use "value" format.
            Such as The property "turms.whatever.min" must be greater than 0; The setting name "abc123" should not contain any digit
            use : "value" format.
            Such as Unknown property: "turms.whatever.min"
            use ["value", "value"] format.
            Such as The properties ["turms.whatever.min", "turms.whatever.max"] are unknown
            field name, parameter name, database collection name
            • Difference between name and reference

              Let’s give a relatively easy-to-understand example first. Take the field name and reference as an example. Suppose there is a field name in a class com.abc.Song (song), then the name of the field is name, When the name is used in a sentence, double quotes "" are required, such as The field "name" contains illegal characters. The reference of the field is com.abc.Song#name, and when the reference is used in a sentence, parentheses () are required, such as The field (com.abc.Song#name) should be accessible.

              But in the actual development process, we will find that many strings themselves can have multiple interpretations. For example, if there is a class whose name is com.my.Main, then this name can be interpreted as either a class name or a class reference. And considering that the class name will not have the serious ambiguity that may be brought about by the above-mentioned name, and the practice of most well-known open source projects of CUHK does not use "" to wrap the class name, so for the class name, when designing Turms, It is uniformly interpreted as a class reference rather than a class name, so this type of reference needs to follow the usage rules of (), not the usage rules of "".

            The next section will explain why Turms is designed this way, and why some other common designs are not used.

            TODO: Update later

            About the use of dependent libraries

            Many dependency libraries are keen to abstract and encapsulate the underlying implementation to achieve "internal logic transparency, and users do not need to care about the logic behind it". Such a design is more practical for some applications that are simple in logic, require fast online, and do not pursue performance. But as a project develops further and further optimized, this uncontrollable abstraction layer will become a stumbling block for troubleshooting, performance optimization, and function customization. Problems caused by the abstraction layer, such as:

            • Requirement iteration and version update are seriously lagging behind. If our project uses an abstraction layer A dependency, A dependency encapsulates B dependency. If we need to add a new feature or fix a bug to the B dependency, the usual process is: we raise an issue to the B dependency community. If we are lucky, we will get a reply within 2 to 4 days on average. If luck is still good, the other party is willing to change. Assuming that the changes are not significant, the relevant PRs will be merged after 1 week. It may wait 2 weeks, 1 month, or even a few months, and the B dependency finally releases a new version. Then we have to wait for the A dependency to update the B dependency version, which may take another 2 weeks, 1 month, or even a few months. By the time we actually get to use the new features, it may have been a few months. But more often than not, the maintainer of B's dependency is not willing to modify the relevant code at all.

            • The vast majority of well-known dependent libraries only care about function realization, not performance, and basically have the attitude of "the function can be used, and the performance can make do". (Turms solves most of the following problems by refactoring dependent code) such as:

              • mongo-java-driver repeatedly creates a large number of intermediate objects when making API calls. For the default configuration object, no Cache is used.
              • Lettuce needs to repeatedly expand the memory when serializing the instruction parameters passed to Redis, and the memory data of the Cache is not cached.
              • Log4j2 actually uses getBytes to read the data of the string, and uses StringBuilder to do the splicing of the log (compared to the log implementation of Turms, which directly uses the byte[] value data inside String, and uses the provided by Netty io.netty.buffer.AbstractByteBufAllocator#directBuffer` to splice logs and do log output). (Supplement: If readers are interested in log implementation, you can read log implementation, understand how Turms implements logs)
              • In the official Java implementation of Protobuf, its string decoding implementation is also very inefficient. For example, in order to be compatible with lower versions of Java, it uses char[] for encoding, but the String of the new version of Java only stores byte[ ], so a meaningless memory copy is required (note: the string itself is the largest data in the client request).
              • Spring is a typical representative of inefficient code, such as:
                • When org.springframework.core.codec.CharSequenceEncoder processes UTF-8 encoded strings, 1 character corresponds to 3 bytes to open up DirectByteBuffer for output. In other words, for the above-mentioned 8K Prometheus data, only this piece of Spring needs to use 2.4MB, and an extra 1.6MB is used. Of course, Spring is even more efficient, because it also performs string copying when String#getBytes(...).
                • spring-boot-actuator:v2.6.6 does not support zero copy when exporting huge heap dump files (see org.springframework.boot.actuate.management.HeapDumpWebEndpoint.TemporaryFileSystemResource#isFile)
                • Spring's AOP is often used to proxy Controller layer method calls, which can be used to capture parsed parameters and print logs (WebFilter cannot obtain parsed parameters). But AOP will add 19 stacks to a method and use a lot of reflection. The time required to call from the AOP proxy to the Controller method layer is even longer than the internal business processing time of Turms (additional supplement: AOP is a very bad design, Spring Should be designed for the chain of responsibility adopted by the Controller layer).

            To sum up, the code quality of many well-known Java dependent libraries is not high, and even the code performance and quality are worrying, and the source code is shocking to read. Instead, readers can refer to how the Turms server is coded to optimize various details to the extreme.

            • When the dependency library that focuses on abstract implementation is combined with responsive programming, it will bring developers a hell-level experience in troubleshooting problems, especially when bugs are related to memory that needs to be released manually. In the troubleshooting of conventional problems, we can usually quickly troubleshoot the problem through the stack information. But in reactive programming, such a method usually does not work, and we rely more on logical reasoning to troubleshoot problems. That is, familiarize yourself with the upstream and downstream code (including the code in the dependent package), and deduce all the processes that the code may go through.

              If the code has few abstraction layers and the call relationship is flat, the troubleshooting process is actually very simple. Maybe we only need to glance at dozens of lines of code in a class to roughly know the cause of the problem. However, if a large number of "encapsulation, abstraction, users do not need to pay attention to the underlying implementation logic" dependency libraries are used in the process, the hell-level experience will come. Originally, we might only need a function with dozens of lines to implement all the relevant logic. But if we implement related functions based on the abstract library, when we troubleshoot, the code we may want to check may be A abstract class (A1, A2, A3...) class -> B abstract class (B1, B2, B3.. .)->C abstract class (C1,C2,C3...)->..., jump between dozens of classes and dozens of methods, and reason.

              The most typical comparison example is: Turms' im.turms.gateway.access.client.websocket.WebSocketServerFactory#getHttpRequestHandler implements a set of WebSocket handshake logic in a function of dozens of lines. But if this set of logic is implemented by Spring, it will mix the classes under different packages and various logics together. When troubleshooting, if it is accompanied by some memory that needs to be released manually, hell level Here comes the troubleshooting experience. What can be solved with dozens of lines of code, a library like Spring requires thousands of lines of code. For example, there are multiple sets of underlying Web implementations inside WebFlux, which is euphemistically called "encapsulation and abstraction, and users do not need to pay attention to the underlying implementation logic."

            • Some dependent libraries will automatically suppress exceptions in some places, and the upper-layer application code cannot perceive them. When something goes wrong, the underlying library code and the upper-level application code run on different stacks in most cases. Unless the underlying dependency library supports global exception callbacks, the upper-layer application cannot even perceive the occurrence of exceptions. For some Trivial-level errors, it doesn't matter if the upper-layer application cannot perceive them. But if it is an abnormality that some upper-layer applications are very concerned about (such as the abnormal disconnection of the RPC TCP connection), this will be the fuse that causes the abnormality and disorder of the entire system.

            • Developers of some well-known dependent libraries even lack the most basic security knowledge. For example, the developers of Log4j actually added code to automatically detect whether there is a ${jndi} pattern in the string to be printed, and if it exists, call the corresponding JNDI service, and enable this function by default. As a developer who specializes in writing log-dependent libraries, he lacks security common sense and has passed PR review.

            On the other hand, self-development can avoid all the above-mentioned problems. While improving the controllability of the code, it also greatly reduces the difficulty of research and development and troubleshooting, and improves code performance and resource utilization.

            In summary, when a Turms project references a class library, it usually does not introduce an abstract encapsulation library (such as Spring), but only an implementation library. Points that require performance optimization or logic optimization in the dependent library will be directly refactored inside the Turms project. Considering the difficulty of self-development and code controllability, we will choose self-development as much as possible in most cases.

            Supplement: Although the Java ecosystem is prosperous, there are actually few high-quality libraries. Therefore, most medium and large-scale Java open source projects that pursue performance usually try to develop various functional modules by themselves instead of using third-party dependent libraries, such as: Elasticsearch, Cassandra, Ignite . In addition, in the entire Java ecosystem, the only library we currently trust in the technical level of its developers is: Netty

            Exception capture and printing

            Role: Understanding the exception capture and printing principles of the Turms server can help developers quickly locate the exception and find the root cause of the exception.

            In reactive programming, the most criticized exception is that exceptions under this programming paradigm are usually very difficult to locate, and their stack information is basically useless. If the developer randomly prints the exception log in the reactive programming mode, it is very likely that the debugger will not even be able to judge where the exception is thrown from the log, let alone reverse the execution code.

            But in fact, the principle and practice of good exception log printing are relatively simple, and if you follow this principle, it usually takes a few seconds or minutes to locate the exception. The basic principle is that the most downstream code throws an exception without printing. If the midstream code needs to translate the exception, it will continue to be thrown upwards after the translation, without printing; the most upstream exception will be received and ** will be printed. As for what code is considered "the most upstream", the code that calls subscribe() is considered the "most upstream". This principle is actually very simple in practice, but the exception capture in reactive programming "looks" more complicated. For example, under the im.turms.service.access.servicerequest.dispatcher.ServiceRequestDispatcher#dispatch0 function in the turms-service server, there is an operation of "send notifications to relevant users according to the processing results of the Service layer" , whose code is as follows:

            java
            return result
            +    
            Skip to content

            Basic Development Rules

            Conservative Design vs. Radical Design

Java itself is a very conservative language, and its ecosystem is also conservative. Its design principle is to provide a set of safe APIs so that, no matter how users call them, they cannot corrupt Java's internal state (the Unsafe classes excepted); hence the various access-control mechanisms, internal memory copies, and repeated locking. The coding principle of the Turms server is roughly the opposite: write whatever makes the program run fast, and if a caller passes or uses data recklessly, either report an error or simply ignore it. For example, Turms' StringUtil uses jdk.internal.misc.Unsafe#getReference to obtain the byte[] inside a String object and so avoid a memory copy, and the caller must guarantee that it will not misuse that array; Java's own String#getBytes(), by contrast, copies the internal byte[] before handing it to the caller precisely to guarantee that the caller cannot modify it.
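
To make the contrast concrete, here is a hedged sketch of the zero-copy idea using plain reflection (not the actual Unsafe-based StringUtil; the class name is hypothetical). It only works when the JVM is started with --add-opens java.base/java.lang=ALL-UNNAMED, and the caller must never mutate the returned array:

java
import java.lang.reflect.Field;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only, not the actual Turms StringUtil.
public final class StringBytes {

    private static final Field VALUE_FIELD;

    static {
        try {
            // Requires --add-opens java.base/java.lang=ALL-UNNAMED on modern JDKs
            VALUE_FIELD = String.class.getDeclaredField("value");
            VALUE_FIELD.setAccessible(true);
        } catch (NoSuchFieldException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Zero-copy view of the String's internal storage (Latin-1 or UTF-16 encoded);
    // the caller must treat the returned array as read-only
    public static byte[] getInternalBytes(String s) throws IllegalAccessException {
        return (byte[]) VALUE_FIELD.get(s);
    }

    // The "safe" JDK way: always allocates and copies a new array
    public static byte[] getCopiedBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}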

Therefore, in string handling practice, for a conventional Spring-based web application, after an HTTP request is cut out of the TCP byte stream, its data may be repeatedly switched and spliced among String, StringBuilder, byte[], HeapByteBuffer, DirectByteBuffer and other representations. In the end, it is very common for a single business-level String to be copied 5 to 30 times by third-party libraries and by Java itself.

Taking a concrete application as an example: suppose we use Spring to create a Controller bean and define an API method whose return type is String, so as to return Prometheus-format metrics through this API. Even with the "most elegant" implementation under this premise, the data has to be copied in memory at least 4 times (excluding the part where the kernel flushes data to the network card; Turms is optimized to need only one copy, from heap memory to off-heap memory; the metrics payload is about 8 KB; the four copies are written out in the sketch after the list below):

1. Write Java primitive data into a StringBuilder: a heap-to-heap memory copy
2. StringBuilder#toString(): another heap memory copy
3. String#getBytes(): at least one more heap memory copy
4. Write the byte[] into an off-heap DirectByteBuffer so that the kernel can write the data out
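
Written out as code, the four copies above look roughly like this (an illustrative fragment, not actual Spring or Turms code):

java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative only: the four memory copies described in the list above
public class CopyChainExample {
    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();                // 1. primitive data -> heap (StringBuilder storage)
        sb.append("jvm_memory_used_bytes ").append(123456789L);
        String body = sb.toString();                           // 2. heap -> heap copy
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);  // 3. at least one more heap copy
        ByteBuffer direct = ByteBuffer.allocateDirect(bytes.length);
        direct.put(bytes);                                     // 4. heap -> off-heap copy for the kernel to write out
    }
}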

The effective memory utilization is extremely low, and note that the above is only the simplest String-returning API; the flow in a real application is more complicated, so it is very common for a string to be copied 5 to 30 times along the way. This is also why an HTTP server built on the mainstream Java ecosystem may use tens or even hundreds of times the memory of an equivalent C++ HTTP server.

Besides the various network APIs, log implementations also deal with String frequently. In terms of memory, Turms is much more efficient than general-purpose implementations: it allocates pooled off-heap memory through PooledByteBufAllocator.DEFAULT and writes Java primitive data straight into the off-heap blocks, avoiding Java's own inefficient implementations and thus the meaningless heap-to-heap copies along the whole path.
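
A minimal sketch of that idea (not the actual Turms logging code; the log layout is hypothetical): write the data straight into pooled direct memory without building intermediate Strings:

java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only, not the Turms log implementation
public final class DirectLogSketch {

    public static ByteBuf encodeUserIdLog(long userId) {
        ByteBuf buffer = PooledByteBufAllocator.DEFAULT.directBuffer(32);
        buffer.writeCharSequence("userId=", StandardCharsets.US_ASCII);
        writeLong(buffer, userId);
        buffer.writeByte('\n');
        // The caller hands the buffer to the log writer and releases it afterwards
        return buffer;
    }

    // Writes the decimal digits of a non-negative long without creating a String
    private static void writeLong(ByteBuf buffer, long value) {
        if (value == 0) {
            buffer.writeByte('0');
            return;
        }
        byte[] digits = new byte[20];
        int pos = digits.length;
        while (value > 0) {
            digits[--pos] = (byte) ('0' + (value % 10));
            value /= 10;
        }
        buffer.writeBytes(digits, pos, digits.length - pos);
    }
}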

In summary, although Java itself is relatively conservative, Turms is relatively radical: it prioritizes performance over "elegant code" and makes good use of the Unsafe classes when necessary. Of course, our "radicalism" has limits, for example: 1. We never replace Java's internal class implementations; 2. We try not to write JNI or C code.

Supplement:

1. For Java syntactic sugar, our attitude is relatively indifferent. For example, both for (X x : Collection<X>) (which needs to create an iterator object and consumes at least dozens of bytes) and the more efficient for (int i = 0; i < length; i++) are allowed.
2. Besides the conservative tendency, there is a rather paradoxical phenomenon in the Java community: "selective blindness during optimization". On the one hand a String may be copied dozens of times without anyone caring; on the other hand people painstakingly study JVM memory tuning. When Turms faces optimization candidates, it prioritizes by cost-effectiveness and works on the high-payoff parts first, instead of taking futile approaches.

Basic conventions for server-side development

            Prioritization of coding strategies

            General rules: performance (low time and space complexity) > code readability > design patterns

• Performance > code readability. For example, use long instead of java.util.Date or java.time.Instant to represent time, to avoid creating new objects and doing conversions; another example is that the nextIncreasingId and nextLargeGapId functions of the im.turms.server.common.infra.cluster.service.idgen.SnowflakeIdGenerator class repeat about 10 lines of code, but we do not extract the common code, to avoid opening a new method stack frame (regardless of whether the JVM would inline it).
• Performance > design patterns. For example:
  • Iterating over the char[] elements of a String. With the chain-of-responsibility pattern, each type of processing logic lives in its own Handler class. Although this keeps the logic clear, every Handler has to traverse the char[], so the overall time complexity is O(n*m) (n is the length of the char[], m is the number of Handlers). Code of this complexity is prohibited in the Turms server. Instead, write the code in an "anti-design-pattern" way: put as much processing logic as possible into a single traversal, try not to open new functions just to separate logic (this part is optional), and use comments to divide the different processing steps, so as to avoid function stack overhead (see the sketch after this list).
  • Protobuf's model design is often praised as efficient, but the official Java implementation is conservative and inefficient. For example, a Protobuf model is immutable and only its Builder is mutable, so to modify a Protobuf model you must first call toBuilder() and then build a brand-new model instance, which makes the effective memory utilization low (in addition, its string decoding is also inefficient: to stay compatible with older Java versions it decodes into char[], while modern Java Strings store byte[] internally, so an extra conversion is needed). In our own controllable code we avoid using the Builder whenever possible, to avoid meaningless memory consumption.
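
A minimal sketch of the single-pass style described in the first example above (the validation rules are hypothetical): one traversal, with comments instead of Handler classes separating the logic:

java
// Illustrative only: single-pass processing, logic separated by comments
// instead of per-Handler passes
public final class NameValidator {
    public static boolean isValidName(String name) {
        int length = name.length();
        for (int i = 0; i < length; i++) {
            char c = name.charAt(i);
            // 1. reject control characters
            if (c < 0x20) {
                return false;
            }
            // 2. reject path separators
            if (c == '/' || c == '\\') {
                return false;
            }
            // 3. further checks can be appended here without another O(n) pass
        }
        return true;
    }
}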

Exception: in rare cases, code readability takes precedence over performance. Take the rule "the use of reflection is prohibited during the processing of client requests and admin API requests" mentioned in this article as an example: despite this rule, if a request needs to create an Entity object for the database driver, we still create and populate that object through reflection, because without reflection we would have to hand-write serialization and deserialization logic for hundreds of fields, which is a huge and error-prone workload. The benefit of using reflection here is very high, so reflection is allowed.

There are many more examples like the above; see the Turms server code for details. When adding new code, just make sure there is hardly any room left for time or space optimization in it. If there is still room for optimization but the benefit is low and the implementation is complex, it may be left for later.

            Threads and locks

• The use of elastic thread pools is prohibited, and creating new threads requires a dedicated code review

• Try not to use synchronized or Lock operations (including reentrant locks) during the processing of client requests and admin API requests. If a critical section is really needed, prefer refactoring the code flow or replacing the lock with CAS techniques (see the sketch below).
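
A minimal sketch of the second rule (the counter is a hypothetical example, not actual Turms code): a lock-free update loop based on compareAndSet instead of synchronized or Lock:

java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: CAS retry loop replacing a lock-guarded counter
public final class RequestCounter {

    private final AtomicLong count = new AtomicLong();

    // No synchronized/Lock: compareAndSet retries until the update wins
    public long incrementIfBelow(long limit) {
        while (true) {
            long current = count.get();
            if (current >= limit) {
                return current;
            }
            if (count.compareAndSet(current, current + 1)) {
                return current + 1;
            }
        }
    }
}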

            Memory and GC

• Copying ByteBuf is prohibited

• For network I/O operations, the use of non-pooled memory or heap memory is prohibited; only pooled direct memory is allowed

• Try not to create new objects; use object pools where possible. A common design practice is to split DTO and BO models in order to logically separate the data models of different layers. For this scenario, Turms tries to use a single data model and produces a response that conforms to the DTO shape by customizing Jackson's serialization logic.

  Also: this rule will change after Project Valhalla is released; in particular, we will remove most of the existing object pools.

• Try not to create objects with many unused fields. For example, Turms replaced MongoDB's FindOptions model with a custom QueryOptions model, partly because FindOptions is used frequently but carries dozens of useless fields.

            • During the processing of client requests and admin API requests, the use of Stream is prohibited

• Regarding the question "why do some functions that seem able to take primitive parameters still use wrapper classes": although some parameters of a function may look like they could be primitives, these values will most likely end up being passed to Java collection implementations (such as Map<Long, Object>) or to functions and fields typed as objects (Object, Long, generics, and so on). So if a function used primitives only for its own benefit, the value would probably be converted back and forth between the wrapper class and the primitive many times over the whole flow. To sum up, in most cases the Turms server uniformly uses wrapper classes to avoid such repeated conversions, and uses primitives only when we can guarantee the value will never be boxed (see the sketch below).

  In addition, this is why we said, in the section about Project Valhalla, that the design concept of "everything is an object" "lingers like a curse": it is hard for a primitive not to be boxed somewhere in complex logic, and the resulting meaningless wrapper objects waste a lot of memory. This is also why we have been waiting for Project Valhalla to finalize value classes and support features like List<int>.
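
A minimal sketch of the reasoning above (the registry is hypothetical): a primitive long handed to a collection-backed API gets boxed anyway, so accepting the wrapper up front avoids repeated conversions:

java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: why a wrapper parameter can avoid repeated boxing
public final class BoxingExample {

    private static final Map<Long, Object> SESSIONS = new HashMap<>();

    // Takes the wrapper directly: the caller's Long flows through without re-boxing
    static void register(Long userId, Object session) {
        SESSIONS.put(userId, session);
    }

    // Takes a primitive: the long is auto-boxed again on every collection access
    static void registerPrimitive(long userId, Object session) {
        SESSIONS.put(userId, session); // boxing happens here
    }
}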

            Proxy and Reflection

• Do not use dynamic proxy technology (such as Java dynamic proxies, CGLIB, or Spring AOP). Try not to use proxies at all, or use compile-time techniques (such as Lombok) instead.

  The only exception: in the plugin mechanism of the Turms server, Java dynamic proxies are used to proxy plugins written in JavaScript.

• Unless avoiding reflection would require writing a lot of tedious code, the use of reflection is prohibited during the processing of client requests and admin API requests, as well as in other scenarios. For example, Turms does use reflection when serializing and deserializing the hundreds of fields of MongoDB Entity models.

In addition, if a third-party dependency violates the above principles, that dependency will be refactored according to a cost-effectiveness schedule.

Text format

            toString() text format

The text formats produced by toString() in Java projects vary widely, and even Java's own internal code mixes inconsistent styles. As far as bracket style alone is concerned, there is the default [key=value] format of Java records, the (key=value) format generated by Lombok, and the {key=value} format generated by Google AutoValue.

            In order to achieve a unified text format, the Turms server project uniformly adopts the following format:

• For the prefix and suffix of the text, use { and } respectively, instead of [] or (), because in Turms' text format design [] denotes an array and () is a special mark used to make important information stand out. For the specific rules, see "Server running log and exception text format" below.

            • Use the mainstream = instead of : between keys and values.

• For string values, wrap the value with ""; for other non-array values, use the value's toString() form; for array values, wrap the values with [].

              For example: ClassName{key1=value, key2=[value1, value2]}
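
A minimal sketch following the rules above (the class and its fields are hypothetical):

java
import java.util.Arrays;

// Illustrative only: {} as prefix/suffix, = between keys and values,
// "" for string values, [] for arrays
public class User {

    private final long id;
    private final String name;
    private final String[] roles;

    public User(long id, String name, String[] roles) {
        this.id = id;
        this.name = name;
        this.roles = roles;
    }

    @Override
    public String toString() {
        return "User{"
                + "id=" + id
                + ", name=\"" + name + "\""
                + ", roles=" + Arrays.toString(roles)
                + '}';
    }
}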

**Note: The Turms server has not yet unified the text format of toString(); the content described above is the direction of future improvement.**

            Server running log and exception text format

Because the text format design of logs and exceptions involves many details, the principles behind many common practices conflict with each other, and there is no unified best practice in the Java world, almost no large or medium-sized open source project (including the source code of Java itself) achieves a unified text format; instead they mix various formats, and the specific format mainly depends on the engineer's "feeling" at the time.

            Therefore, this section specifically explains which text formats the Turms server uses, and why some other common text formats are not used, so as to reduce readers' confusion in practice.

            The Importance of a Uniform Format

For some text formatting rules, readers may not feel the difference between them when reading a single log line. But when readers need to read dozens, hundreds, or even thousands of different logs, they will appreciate how much reading effort a standardized, unified text format saves.

Specific rules

Simply put:

• Important information in a sentence should be put at the end as much as possible. Important information is usually a variable.
• When the important information is at the end of the sentence, use : to separate it from the rest of the text. For example: use Could not find the class: my.company.Main instead of The class (my.company.Main) could not be found.
• Sentences should not omit the articles a, an and the. This is especially emphasized because most well-known large and medium-sized open source projects tend to omit articles.
• For noun phrases, restrictive appositives are usually used instead of attributive nouns. For example, restrictive appositives: The collection "message" or The setting "turms.whatever.min"; attributive nouns: The "message" collection and The "turms.whatever.min" setting.
• Functions and usage of special symbols:
  • [ ] and , — represent an array value. In a sentence, use the [value] format, such as Detected illegal operations [CREATE, DELETE] on the collection "message". When paired with :, use the : [value] format, such as Detected illegal operations: [CREATE, DELETE].
  • [..] and (..) — indicate an interval: [..] for a closed interval and (..) for an open interval, such as [1..2].
  • ( ) — wraps information that needs to stand out. In a sentence, use the (value) format, such as The path (/turms/1.txt/) is illegal. When paired with :, () is not needed; just use the : value format, such as Could not find any resource from the path: /turms/1.txt. When paired with an array, () is not needed; just use the [value] format, such as The paths [/1.txt, /2.txt] are illegal. Common examples: object, enumeration value, path, domain name, field reference.
  • { } — wraps key-value pairs. In a sentence, use the {key=value} format. When paired with :, use the : {key=value} format. When paired with an array, use the [{key=value}, {key=value}] format.
  • " " — wraps names or string values. In a sentence, use the "value" format, such as The property "turms.whatever.min" must be greater than 0 or The setting name "abc123" should not contain any digit. When paired with :, use the : "value" format, such as Unknown property: "turms.whatever.min". When paired with an array, use the ["value", "value"] format, such as The properties ["turms.whatever.min", "turms.whatever.max"] are unknown. Common examples: field name, parameter name, database collection name.
            • Difference between name and reference

Let's start with a relatively easy-to-understand example using a field name and a field reference. Suppose there is a field name in a class com.abc.Song (a song): the name of the field is name, and when the name is used in a sentence, double quotes "" are required, such as The field "name" contains illegal characters. The reference of the field is com.abc.Song#name, and when the reference is used in a sentence, parentheses () are required, such as The field (com.abc.Song#name) should be accessible.

  But in actual development we find that many strings can be interpreted in several ways. For example, if a class is named com.my.Main, this string can be read either as a class name or as a class reference. Considering that a class name does not suffer from the serious ambiguity a field name may have, and that most well-known medium and large open source projects do not wrap class names in "", Turms uniformly interprets such strings as class references rather than class names, so they follow the usage rules of () rather than those of "".

            The next section will explain why Turms is designed this way, and why some other common designs are not used.

            TODO: Update later

About the use of dependency libraries

Many dependency libraries are keen to abstract and encapsulate the underlying implementation so that "the internal logic is transparent and users do not need to care about what happens behind it". Such a design is practical for applications that are simple in logic, need to go online fast, and do not pursue performance. But as a project evolves and keeps being optimized, this uncontrollable abstraction layer becomes a stumbling block for troubleshooting, performance optimization and feature customization. Problems caused by the abstraction layer include:

• Requirement iteration and version updates lag badly. Suppose our project uses an abstraction-layer dependency A, and A wraps dependency B. If we need a new feature or a bug fix in B, the usual process is: we raise an issue to B's community; with luck we get a reply within 2 to 4 days on average; with more luck the maintainers are willing to make the change; assuming the change is small, the PR is merged after about a week; then we may wait 2 weeks, a month or even several months until B finally releases a new version; then we have to wait for A to upgrade its B version, which may take another 2 weeks, a month or several months. By the time we can actually use the new feature, months may have passed. And more often than not, B's maintainers are not willing to modify the code at all.

• The vast majority of well-known dependency libraries only care about getting features to work, not performance, and basically take the attitude of "the function works, the performance will do". (Turms solves most of the following problems by refactoring the dependency code.) For example:

  • mongo-java-driver repeatedly creates a large number of intermediate objects when making API calls, and does not even cache the default configuration objects.
  • Lettuce has to repeatedly expand memory when serializing the command parameters passed to Redis, and data that could be cached is not cached.
  • Log4j2 reads string data via getBytes and splices logs with StringBuilder (by comparison, the Turms log implementation directly uses the byte[] value inside String and uses Netty's io.netty.buffer.AbstractByteBufAllocator#directBuffer to splice and output logs). (Supplement: if readers are interested in log implementation, read the log implementation chapter to understand how Turms implements logging.)
  • In the official Java implementation of Protobuf, string decoding is very inefficient. For example, to stay compatible with older Java versions it decodes into char[], but the String of modern Java stores byte[] internally, so a meaningless memory copy is required (note: strings are usually the largest data in a client request).
  • Spring is a typical representative of inefficient code, for example:
    • When org.springframework.core.codec.CharSequenceEncoder processes UTF-8 strings, it allocates a DirectByteBuffer at 3 bytes per character for output. In other words, for the roughly 8 KB Prometheus payload mentioned above, this piece of Spring code alone needs about 24 KB, an extra 16 KB. And Spring is even less efficient than that, because it also performs a string copy via String#getBytes(...).
    • spring-boot-actuator:v2.6.6 does not support zero copy when exporting huge heap dump files (see org.springframework.boot.actuate.management.HeapDumpWebEndpoint.TemporaryFileSystemResource#isFile)
    • Spring AOP is often used to proxy Controller-layer method calls, for example to capture parsed parameters and print logs (a WebFilter cannot obtain parsed parameters). But AOP adds about 19 stack frames to a method call and uses a lot of reflection; the time spent going from the AOP proxy to the Controller method can even exceed Turms' internal business processing time. (Additional note: this style of AOP is a bad design; Spring should have adopted a chain-of-responsibility design for the Controller layer.)

To sum up, the code quality of many well-known Java dependency libraries is not high; their performance and quality can even be worrying, and reading their source code can be shocking. Readers can instead refer to how the Turms server code optimizes the various details to the extreme.

• When dependency libraries that focus on abstraction are combined with reactive programming, troubleshooting becomes a hellish experience for developers, especially when bugs involve memory that must be released manually. For conventional problems we can usually locate the cause quickly from stack information, but in reactive programming that approach usually does not work and we rely more on logical reasoning: becoming familiar with the upstream and downstream code (including the code inside the dependencies) and deducing all the paths the code may take.

  If the code has few abstraction layers and a flat call graph, troubleshooting is actually simple; we may only need to glance at a few dozen lines in one class to roughly know the cause of the problem. But if a lot of "encapsulated, abstracted, users don't need to care about the underlying implementation" libraries are involved, the hellish experience begins. Logic that could have lived in one function of a few dozen lines now has to be traced through abstract class A (A1, A2, A3...) -> abstract class B (B1, B2, B3...) -> abstract class C (C1, C2, C3...) -> ..., jumping among dozens of classes and dozens of methods while reasoning about each.

  The most typical comparison: Turms' im.turms.gateway.access.client.websocket.WebSocketServerFactory#getHttpRequestHandler implements the whole WebSocket handshake in a function of a few dozen lines. If the same logic is implemented through Spring, classes from different packages and all kinds of logic get mixed together, and if the troubleshooting also involves memory that must be released manually, the hellish experience arrives. What can be solved with dozens of lines of code takes a library like Spring thousands of lines; WebFlux, for example, contains multiple sets of underlying web implementations, euphemistically described as "encapsulation and abstraction so that users do not need to care about the underlying implementation".

• Some dependency libraries silently suppress exceptions in places the upper-layer application code cannot perceive. When something goes wrong, the library code and the application code usually run on different stacks, so unless the library supports global exception callbacks, the application cannot even tell that an exception occurred. For trivial errors this does not matter, but for exceptions the application cares deeply about (such as an abnormal disconnection of an RPC TCP connection), it becomes the fuse that throws the whole system into disorder.

• Developers of some well-known dependency libraries even lack the most basic security knowledge. For example, the Log4j developers added code that detects whether the string to be printed contains a ${jndi} pattern and, if so, calls the corresponding JNDI service, and they enabled this behavior by default. That developers specializing in a logging library lack such basic security common sense, and that the change passed PR review, is remarkable.

Self-development, on the other hand, avoids all of the above problems. It improves the controllability of the code while greatly reducing the difficulty of development and troubleshooting, and improves performance and resource utilization.

In summary, when the Turms project references a library, it usually does not introduce an abstract encapsulation library (such as Spring), only implementation libraries, and the points in a dependency that need performance or logic optimization are refactored directly inside the Turms project. Weighing development difficulty against code controllability, we choose self-development whenever possible.

Supplement: Although the Java ecosystem is prosperous, there are actually few high-quality libraries, which is why most medium and large Java open source projects that pursue performance, such as Elasticsearch, Cassandra and Ignite, usually develop their own functional modules instead of relying on third-party libraries. In addition, in the entire Java ecosystem, the only library whose developers we currently trust at the technical level is Netty.

            Exception capture and printing

Purpose: understanding the exception capture and printing principles of the Turms server helps developers quickly locate an exception and find its root cause.

In reactive programming, the most criticized aspect of exceptions is that they are usually very difficult to locate and their stack traces are largely useless. If developers print exception logs haphazardly in reactive code, the debugger may not even be able to tell from the log where the exception was thrown, let alone reconstruct the execution path.

In fact, the principle and practice of good exception logging are fairly simple, and following the principle usually makes it possible to locate an exception in seconds or minutes. The basic principle: the most downstream code throws the exception without printing it; if midstream code needs to translate the exception, it rethrows the translated exception, still without printing; only the most upstream code receives the exception and prints it. "Most upstream" means the code that calls subscribe(). The principle is simple in practice, but exception capture in reactive programming "looks" more complicated. For example, inside the im.turms.service.access.servicerequest.dispatcher.ServiceRequestDispatcher#dispatch0 function of the turms-service server there is an operation that "sends notifications to the relevant users according to the result of the Service layer", whose code is as follows:

java
return result
        .name(CLIENT_REQUEST_NAME)
        .tag(CLIENT_REQUEST_TAG_TYPE, requestType.name())
        .metrics()
        ...
                });
        })
        ...

As mentioned above, this piece of code delivers notifications through the notifyRelatedUsersOfAction function. We do not care about its internal implementation; we only need the subscribe(...) call at the most upstream point to make sure that any exception it may throw is caught and printed.
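
The principle is easy to see in a self-contained sketch (illustrative names only, not the actual Turms code): the downstream method only throws, the midstream method only translates and rethrows, and only the outermost subscribe(...) prints:

java
import reactor.core.publisher.Mono;

// Illustrative only: "throw downstream, translate midstream, print at the outermost subscribe"
public final class ExceptionPrincipleExample {

    // Most downstream: throws, never logs
    static Mono<String> loadUserName(long userId) {
        return Mono.error(new IllegalStateException("Could not find the user: " + userId));
    }

    // Midstream: translates the exception and rethrows, never logs
    static Mono<String> handleRequest(long userId) {
        return loadUserName(userId)
                .onErrorMap(e -> new RuntimeException("Failed to handle the request for the user: " + userId, e));
    }

    public static void main(String[] args) {
        // Most upstream: the code that calls subscribe() catches and prints
        handleRequest(1L)
                .subscribe(
                        name -> System.out.println("Handled: " + name),
                        e -> e.printStackTrace());
    }
}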

Custom exception classes inherit only from RuntimeException

In the Turms server project, custom exception classes may only inherit from RuntimeException; defining exception classes that inherit from Exception (checked exceptions) is prohibited.

The debate over checked versus unchecked exceptions has never been settled, but nowadays many articles flatly criticize checked exceptions as a design failure of Java; later languages such as Kotlin, Scala and C# do not even have the concept, and most well-known large and medium-sized open source projects now only define subclasses of RuntimeException rather than checked exceptions.

Common reasons why checked exceptions are considered a bad design include:

• For a third-party library or downstream code, checked exceptions cause interface signature compatibility problems across versions.

• In a large or medium-sized project where all submodules use checked exceptions, an upstream interface may end up declaring dozens of exceptions, and any addition, removal or modification of those declarations ripples through the whole codebase.

• Java's own code contains conflicting exception designs. For example, the lambdas in the Java Streams design do not support throwing checked exceptions, so inside a Stream lambda a checked exception must either be handled on the spot (usually the wrong thing to do) or converted into an unchecked exception (losing the point of checked exceptions); Java even had to introduce UncheckedIOException (see the sketch after this list).

• In practice, people constantly defeat the purpose checked exceptions were designed for, so it is better not to use them at all. For example:

  • Catch all Exception directly
  • Translate the checked exception into a RuntimeException, such as try { ... } catch (Exception e) { throw new RuntimeException(e); }
  • Because the call stack is deep, developers sometimes perform meaningless catches downstream just to avoid polluting upstream signatures, and may even write catch (Exception e) { /* do nothing */ } by mistake
• Many developers misunderstand exception design and then define exceptions incorrectly. For example, many believe that if the upstream code can avoid or recover from an exception, a subclass of RuntimeException should be used, and if the upstream code cannot avoid it, a checked exception should be used. Such a view is blindly optimistic and lacks real project and coding experience, because whether an exception thrown downstream can be handled depends on the logic of the upstream code, not on the assumptions of the downstream code.

  For example, when the plugin module of the Turms server loads a plugin, the plugin's class loader may throw NoClassDefFoundError. According to the early Java team, "an Error is a subclass of Throwable that indicates serious problems that a reasonable application should not try to catch", so the upstream code of the plugin module should not catch the Error; but as a server, Turms cannot let the whole server become abnormal just because a problematic plugin class was loaded, so the truly reasonable approach for the upstream code is to catch these Errors instead of letting the server crash or fall into an abnormal state.
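
A minimal sketch of the Stream conflict mentioned in the list above (the file-reading logic is hypothetical): the lambda cannot declare the checked IOException, so it has to be wrapped into an unchecked exception:

java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Illustrative only: a Stream lambda cannot throw a checked exception
public final class CheckedExceptionInStreams {
    public static List<String> readFiles(List<Path> paths) {
        return paths.stream()
                .map(path -> {
                    try {
                        return Files.readString(path);
                    } catch (IOException e) {
                        // The checked exception has to be wrapped to compile at all
                        throw new UncheckedIOException(e);
                    }
                })
                .toList();
    }
}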

For the Turms server project, the only scenario where checked exceptions can really help is this: in a few cases, when designing a downstream module we already know that the upstream caller must distinguish between the various exceptions the downstream throws, and a checked exception ensures the upstream does not forget to handle some of them. But such scenarios are very rare, and designing downstream code around the logic of upstream callers is itself bad practice.

Therefore, to avoid the various problems caused by checked exceptions, to unify the exception design style, and to avoid wasting time on insignificant disputes such as "why does module A use one kind of exception while the similar module B uses another", the Turms server project only allows custom exception classes that inherit from RuntimeException and prohibits custom exception classes that inherit from Exception (checked exceptions).
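
A minimal sketch of what this rule allows (the class name is hypothetical, not an actual Turms class):

java
// Illustrative only: custom exceptions extend RuntimeException, never Exception
public class ResourceNotFoundException extends RuntimeException {

    public ResourceNotFoundException(String message) {
        super(message);
    }

    public ResourceNotFoundException(String message, Throwable cause) {
        super(message, cause);
    }
}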


            Testing

            About stress testing

            Why doesn't the Turms server provide a stress test report

For two simple Java functions that do the same thing, we can easily compare their performance with JMH. But for a project as large as Turms there is no such silver bullet. The complexity is mainly reflected in the following aspects:

1. Turms supports a variety of architectures, and many features can also be switched on and off.

  For example, the Configuration Parameters chapter mentions that some use cases do not even require data storage, so under otherwise identical conditions the throughput of a deployment without storage is naturally higher than that of one with storage.

  Another example: About message accessibility, order and repeatability mentions that the Turms server disables support for 100% message delivery by default, because guaranteeing delivery has a price: it requires at least one Redis server to distribute session-level sequence numbers, and every message send must request a sequence number, so the throughput is naturally lower than in scenarios that do not guarantee 100% delivery.

  Another example is whether the server needs to push user status when a user logs in. For scenarios that do not need the push, the pressure on the server is naturally much lower than for scenarios that do.

  Another example: the Observability System - Log chapter mentions that the Turms server defaults to, and recommends, 100% log sampling of user requests; 100% sampling requires a large number of I/O operations, so its throughput is naturally lower than with no sampling at all.

2. For most business requests, the majority of the requests the Turms server sends to MongoDB are for permission and data validation, and only a small part actually carries out the business operation the user asked for. For example, if user A bans user B in group 123, the status of user A, user B and group 123 must each be checked, and user B is banned only if all checks pass. Without these checks the throughput would naturally be much higher, but no real project other than a toy skips them.

3. The Turms server usually deletes related business data through distributed transactions. Without distributed transactions the throughput is naturally higher, but dirty data is easily produced.

4. Turms provides many caching features and will support more in the future. Caching is a classic trade of space for time. Take the group member cache as an example: when a group member sends a message, the Turms server needs to query the group's member list. With the cache, the Turms server can answer from a local Map, whose throughput is naturally much higher than sending the query to the database; the advantage of not using the cache is that the member list is more up to date.

5. Stand-alone and distributed deployments also produce completely different stress test results. The Turms server will even support Unix Domain Sockets instead of TCP connections in the stand-alone deployment scenario.

To sum up, if Turms only wanted to produce a good-looking stress test report, the Turms server would not need to store any data, guarantee message delivery, push user status, or perform permission verification, data validation and log sampling on user requests; no business operation would use transactions, all data would be cached for a long time, and so on. The final throughput would naturally be high, but such a stress test report is a castle in the air, and few real scenarios would use such a configuration. This is not only why we, as developers of a medium-to-large application, are reluctant to provide a simplistic stress test report, but also why we do not trust the stress test reports provided by other medium and large applications.

When we look at the performance of any application, fast or slow, the main question is "why is it so fast/slow?". For example, when researching why the JVM takes up so much memory, if we only see Java's extremely redundant and ubiquitous object headers, we will sigh "so it is a redundant design, the author's bad design, no wonder it takes up so much memory"; but if we look at how the JVM designs and uses the Code Cache, we will sigh instead "so it is trading space for time, the author's good intention, no wonder it takes up so much memory", and the evaluation goes in a completely different direction.

In the final analysis, laymen see the spectacle while experts see the craft. Not to mention medium and large applications, even for a small Java library we cannot simply trust its performance report. For example, Log4j2 shows excellent numbers in its performance report, but if we read its source code we will find that the Log4j2 implementation is not that efficient, and it is Turms' self-developed, Netty-based log implementation that really pushes performance to the limit (for details, please refer to the Self-developed Implementation design document and the code im.turms.server.common.logging.core.logger.AsyncLogger#doLog). Comparing the source code of the two shows that they are not on the same level in terms of performance optimization. To help users see how Turms works under the hood, the documents are written in detail and the locations of key code are marked, so that users can evaluate whether Turms suits their own application scenarios.

            turms-performance-testing project (preview document)

Although Turms does not plan to provide a ready-made stress test report, we will build a distributed stress test platform customized for the Turms server in the near future. The platform's UI display and report analysis will be handled by turms-admin, while node control and task execution will be handled by the Controller node and the Agent nodes in turms-performance-testing respectively.

In particular, the reason why Turms can quickly customize and develop many platforms also benefits from the "controllability" mentioned in our reasons for secondary development based on Turms: the Turms project is 100% open source and has self-developed a lot of basic middleware to keep the underlying technology under control and avoid running out of development momentum in the later stages of the project. Therefore, we are not constrained by third-party dependencies when we start new projects, and we are full of motivation.

            + \ No newline at end of file diff --git a/docs/server/module/anti-spam.html b/docs/server/module/anti-spam.html index 48060870..dc70a374 100644 --- a/docs/server/module/anti-spam.html +++ b/docs/server/module/anti-spam.html @@ -17,7 +17,7 @@ -
              Skip to content

              Content Moderation

Turms does not support, and will not support in the future, anti-spam detection for pictures, videos and voice. All the content below applies only to text detection.

              Feature Comparison

In practice, the biggest advantages of commercial sensitive-word filtering services are a rich word list, timely updates and support for multiple languages; the main disadvantages are that the fee is charged per detection and that a network request has to be sent for each detection. The biggest advantages of turms-plugin-antispam are that it is free and performs fast local detection, traversing the target string only once; the main disadvantage is that no word list is provided. Specifically:

In particular, because organized abuse ("black industry") objectively exists, the actual cost of "charging per detection" may be higher than you expect.

| | Commercial antispam service (including sensitive-word filtering) | turms-plugin-antispam |
|---|---|---|
| Free | No. Billed per detection | Yes |
| Open source | No. Fully closed source | Yes. Fully open source |
| Matching speed | A network request has to be sent, which is several orders of magnitude slower than the matching speed of turms-plugin-antispam | Extremely fast local matching (an AC automaton implemented on a double-array Trie); the performance overhead of matching can be ignored. In NORMALIZATION mode, the time complexity of matching is O(n), where n is the length of the input string. In NORMALIZATION_TRANSLITERATION mode, the time complexity of transliteration is O(n), where n is the length of the input string, and the time complexity of matching the transliterated result is O(m), where m is the length of the transliterated string. Supplement: transliteration of Chinese characters means converting them into pinyin |
| Text denoising (e.g. removing punctuation, normalizing letters and numbers) | Partially supported | Partially supported |
| Glyph-variant matching (such as "Martian" text) | Partial support | TODO (1.1) |
| Split-word matching | Partial support | TODO (1.2) |
| Exact matching of similar sounds and glyphs | Supported | Supported |
| Fuzzy matching of similar sounds and glyphs | Supported | TODO (1.1) |
| Polyphone matching | Supported | TODO (1.1) |
| Word list | Closed source, but the word list is rich and updated in a timely manner | Not provided. See below for the reasons |
| Multi-language/dialect support | Supports multiple languages and dialects | Users need to collect word lists themselves. There are also projects that call a "translation API" to translate the source language into a specific language and then match, but turms-plugin-antispam does not provide this kind of implementation |
| Rare character support | Partial support | Partial support. turms-plugin-antispam can recognize code points in the Unicode Basic Multilingual Plane (BMP) and supports more than 20,000 Chinese characters (the latest edition of the "Xinhua Dictionary" includes only a bit more than 10,000). Because most IM applications do not need to display particularly rare characters (such as "𤳵"), it is recommended that your UI front end simply replace such code points with a placeholder like "?". turms-plugin-antispam has no plans to support code points outside the BMP |
| Combined sensitive words | Supported | TODO (1.1) |
| Vertical text detection | Not supported | Not supported |
| Additional word-list information | Abundant additional information, such as sensitive-word categories (pornography, politics, terrorism, contraband, abuse, flooding, advertising, advertising law, values, etc.) | TODO (1.0). In addition, although Turms will support this function in the future, Turms will still not provide a sensitive word list |
| Whitelist | Supported | TODO (1.1) |
| Regionally differentiated services | Partially supported | Not supported |
| Manual review system | Partially supported | Not supported |

              Complexity of sensitive word detection

• Not all text can be detected. Take the string "Turms is an excellent open-source IM project" as an example and display it vertically as plain text. If the sensitive-word detection system does not support feature extraction, it cannot detect this type of text:

                text
                ╔═╤═╤═╤═╤═╗
                 ║┊│item│of│is│T║
                 ║┊│目│I│一│u║
                 ║┊│┊│M│a│r║
                @@ -40,7 +40,7 @@
                 
                 안녕하세요,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
                 こんにちは

                Configuration explanation

                Configuration class: im.turms.plugin.antispam.property.AntiSpamProperties

                Configuration prefix: turms.plugin.antispam

Configuration items

| Configuration name | Default value | Description |
|---|---|---|
| enabled | true | Whether to enable the anti-spam function |
| dictParsing.binFilePath | null | The path of the binary dictionary file. This file stores the parsed word-list data and is used to avoid re-parsing the word-list text every time the server starts. If both "textFilePath" and "binFilePath" are configured, "binFilePath" is used first |
| dictParsing.textFilePath | null | The path of the word-list text file |
| dictParsing.textFileCharset | "UTF-8" | The encoding of the word-list text. It is recommended to use "UTF-8" uniformly |
| dictParsing.skipInvalidCharacter | true | Whether to skip invalid characters automatically when parsing the word-list text. If false, an exception is thrown when an illegal character is encountered during parsing |
| dictParsing.extendedWords.enabled | true | Whether to enable the extended word-list feature. If true, all data in the word list is parsed and used. If false, only the word field is parsed and used, which greatly reduces memory overhead |
| textParsingStrategy | NORMALIZATION_TRANSLITERATION | The parsing strategy for the dictionary text and user input text. NORMALIZATION: normalize the input text, for example ⑩HELLO(你{}好./ -> 10hello. NORMALIZATION_TRANSLITERATION: normalize and transliterate the input text, for example ⑩HELLO(你{}好./ -> 10hellonihao |
| unwantedWordHandleStrategy | REJECT_REQUEST | The strategy for handling illegal text. REJECT_REQUEST: return the "MESSAGE_IS_ILLEGAL" error status code to the client. MASK_TEXT: replace illegal characters with the mask and continue to process the request normally |
| mask | '*' | The mask character used when "unwantedWordHandleStrategy" is "MASK_TEXT" |
| maxNumberOfUnwantedWordsToReturn | 0 | When the handling strategy is REJECT_REQUEST and this value is greater than 0, the strings detected as illegal text are returned separated by ASCII 0x1E (Record Separator). The illegal text will eventually be received by the client |
| textTypes | All other user-visible text | Configure which text fields of which requests should be checked |
| silentIllegalTextTypes | Empty | Configure the request text fields for which, when illegal characters are detected, the server responds to the client with an "OK" status code but does not actually continue processing the request. In real business scenarios this value is usually CREATE_MESSAGE_REQUEST_TEXT, which is used to silently refuse to deliver user messages |
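As a rough illustration of the two unwantedWordHandleStrategy behaviors above, the sketch below masks detected spans with the configured mask character or rejects the request. The helper type "Span" and the method names are hypothetical and are not the plugin's actual API.

java
// Hypothetical sketch of the two unwantedWordHandleStrategy behaviors described above;
// "Span" and the method names are illustrative, not the plugin's real API.
import java.util.List;

final class UnwantedWordHandlingSketch {

    record Span(int start, int end) {}

    // MASK_TEXT: replace every character of each detected word with the mask (default '*').
    static String maskText(String text, List<Span> detectedSpans, char mask) {
        char[] chars = text.toCharArray();
        for (Span span : detectedSpans) {
            for (int i = span.start(); i < span.end(); i++) {
                chars[i] = mask;
            }
        }
        return new String(chars);
    }

    // REJECT_REQUEST: refuse the request with the MESSAGE_IS_ILLEGAL status code instead of processing it.
    static void rejectRequest(List<Span> detectedSpans) {
        throw new IllegalArgumentException(
                "MESSAGE_IS_ILLEGAL: " + detectedSpans.size() + " unwanted word(s) detected");
    }
}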

                Admin API

                TODO

                Reasons not to use other open source implementations

Among the open-source implementations currently available worldwide, the quality is very low: the code quality is poor (high space and time complexity), many matching features are not supported, and the authors lack engineering design experience; there are even paid, semi-open-source IM projects that match by traversing the whole word list. There is no existing implementation with the algorithmic and code quality of turms-plugin-antispam, and the overall implementation of a traditional anti-spam solution (not involving machine learning) is not difficult, so Turms chose to develop its own, which also fully prepares for many later extensions. Specifically:

• Those who know algorithms often do not know engineering design, and those who know engineering design often do not know algorithms. On the one hand, implementing an AC automaton based on a double-array Trie is hard, and Java's data structures are relatively conservative: for example, to keep internal and external data isolated, many methods of String and StringBuilder involve memory copies, so the engineer needs basic optimization awareness to avoid these Java "pitfalls" in the algorithm implementation. On the other hand, the anti-spam design and the algorithm implementation in Turms follow one unified logic: they are designed for the Turms IM project and serve real IM needs, which guarantees that "the features we can imagine can be implemented, and unnecessary features need not be provided, avoiding unnecessary time and space overhead".
• Self-development allows the algorithm implementation and the code around it to be customized to the project's needs to guarantee absolute efficiency (space complexity pressed down to O(1), time complexity to O(n), and sensitive-word matching completed in a single pass over the string). For example, the standard AC automaton algorithm has no notion of "skipping a character during matching". If we wanted to "detect only code points in the BMP" with a standard implementation, we would have to filter and copy the original char[] into a new char[] before passing it to the AC automaton. Such frequent memory copying is inefficient and unnecessary, especially since "user text messages" are the most memory-intensive and most frequent data among all user requests. With a custom implementation, we only need an extra if condition that skips the character directly while the AC automaton is matching, which is simple and clear and does not allocate new memory, so the space efficiency is high (see the sketch below).
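A minimal sketch of that idea, assuming a hypothetical transition function standing in for the double-array-Trie AC automaton (this is not the plugin's actual code): the matching loop simply skips code units outside the BMP instead of copying the input first.

java
// Sketch only: "Transition" is a hypothetical automaton transition function standing in for
// the double-array Trie of the AC automaton; it is not turms-plugin-antispam's real API.
final class BmpAwareMatchingSketch {

    interface Transition {
        // Returns the next automaton state; by this sketch's convention,
        // a negative state means a sensitive word has just been matched.
        int next(int state, char c);
    }

    static int countMatches(String text, Transition transition) {
        int state = 0;
        int hits = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // The extra "if" described above: skip code units outside the BMP
            // (surrogate pairs) directly, without copying the input into a new char[].
            if (Character.isSurrogate(c)) {
                continue;
            }
            state = transition.next(state, c);
            if (state < 0) {
                hits++;
                state = 0;
            }
        }
        return hits;
    }
}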
              - + \ No newline at end of file diff --git a/docs/server/module/chatbot.html b/docs/server/module/chatbot.html index 9bc27d75..919765c1 100644 --- a/docs/server/module/chatbot.html +++ b/docs/server/module/chatbot.html @@ -17,7 +17,7 @@ -
              Skip to content

              Chatbot

              turms-plugin-rasa

              Introduction

              turms-plugin-rasa is a plugin implementation of the turms-service chatbot based on the open-source conversational AI framework Rasa.

              The workflow of turms-plugin-rasa is simple: it forwards messages sent by users to the Rasa server, and then sends the response returned by the Rasa server to the user in the form of a message.
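As a rough illustration of that flow (not the plugin's actual code), the sketch below posts a user's message to Rasa's REST webhook and returns the raw reply. The URL matches the default configured below; the request and response shapes follow Rasa's REST channel, which accepts {"sender", "message"} and returns a JSON array of bot responses.

java
// Illustrative sketch only; turms-plugin-rasa's real implementation differs.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class RasaForwardingSketch {

    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    // Forwards a user's text to the Rasa REST webhook and returns the raw JSON reply,
    // a JSON array of bot responses (e.g. objects with "text"/"image" fields).
    // Note: no JSON escaping is performed here; this is a sketch, not production code.
    static String forwardToRasa(String userId, String text) throws Exception {
        String body = "{\"sender\":\"" + userId + "\",\"message\":\"" + text + "\"}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:5005/webhooks/rest/webhook"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        // The plugin would then deliver each returned response to the user as a Turms message.
        return response.body();
    }
}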

              Installation

              Configuration

| Configuration item | Default value | Description |
|---|---|---|
| turms-plugin.rasa.enabled | true | Whether to enable the plugin |
| turms-plugin.rasa.instances[?].chatbot-user-id | 0 | When a user sends a message to this user ID, the message is forwarded to the Rasa server |
| turms-plugin.rasa.instances[?].url | http://localhost:5005/webhooks/rest/webhook | The address of the Rasa server that receives user messages |
| turms-plugin.rasa.instances[?].request.timeoutMillis | 60_000 | Request timeout (in milliseconds) |
| turms-plugin.rasa.instances[?].response.format | PLAIN | When set to PLAIN, the text field of the Rasa server's response is sent directly to the user as a message; when set to JSON, the Rasa server's response is first serialized into JSON text and then sent to the user as a message. See below for the specific JSON format |
| turms-plugin.rasa.instances[?].response.delimiter | \n | When format is PLAIN and the user sends one message but the Rasa server returns multiple responses, this string is used as the delimiter between the text fields of the responses |
| turms-plugin.rasa.instances[?].response.persist | DEFAULT | Whether to persist messages generated from the Rasa server's responses. TRUE means persist; FALSE means do not persist; DEFAULT means decide based on the property turms.service.message.persist-message |

              The JSON format of the message sent to the user is:

              json
              [
                   {
                       "text": <string>,
                       "image": <string>
              @@ -30,7 +30,7 @@
                   },
                   ...
               ]
              - + \ No newline at end of file diff --git a/docs/server/module/cluster.html b/docs/server/module/cluster.html index 1d328524..824a7782 100644 --- a/docs/server/module/cluster.html +++ b/docs/server/module/cluster.html @@ -17,8 +17,8 @@ -
              Skip to content

              Cluster Design and Implementation

The cluster code implementation of Turms is relatively clear and easy to understand. The implementation package is src/main/java/im/turms/server/common/infra/cluster; the configuration package is src/main/java/im/turms/server/common/infra/property/env/common/cluster.

              The reason for pure self-development

| | Self-developed | Third-party services |
|---|---|---|
| Customized functions | Turms has many customized, detailed requirements, and all the functions are linked together. Developing them ourselves ensures that new requirements are implemented immediately; a new requirement typically takes 5-60 minutes and requires no hacky code | Others do not necessarily provide customized functions, and even if they do, it usually takes weeks, months or even years before new features are released in new versions. Such low efficiency is absolutely unacceptable |
| Difficulty of learning | The services are clearly divided and the code is streamlined, so they can be learned and mastered quickly. Spending 10 to 30 minutes training newcomers with basic knowledge is enough for them to master the Turms cluster services | Projects such as ZooKeeper or Eureka, each of which covers only one microservice concern, have far more source code than the sum of the six Turms cluster services below. Moreover, third-party services also involve some relatively complex functions that are completely useless for Turms, such as ZooKeeper's Zab protocol, which only increases the learning difficulty. Grasping their implementation details requires a lot of practice and source-code reading |
| Difficulty of implementation | The implementation difficulty of the cluster service code is low. For example, the total difficulty of Turms' cluster service implementation is far lower than that of the "AC automaton based on a double-array Trie" mentioned for the sensitive-word filtering feature. In addition, implementing the cluster services is much easier than implementing the IM business logic | Adapter code has to be written for the characteristics of each third-party service. Although the difficulty is low, the complexity of third-party source code makes it hard to guarantee that the adapter code will always behave as expected (with the time spent learning the source code of various third-party services and writing adapter code, several sets of cluster services could have been self-developed from scratch) |
| Difficulty of deployment and O&M | Among Turms' cluster services, only the "configuration center service" and the "service registry" need a MongoDB service for deployment, and both share it. Therefore: 1. since business data storage also uses MongoDB, operators can choose to share one MongoDB service without deploying anything extra; 2. cloud vendors at home and abroad provide managed MongoDB services, so a single-instance or clustered MongoDB service with same-city disaster recovery can be deployed with a few mouse clicks | Most of the "configuration center" and "service registry" services offered by cloud vendors are bound to the specific vendor, which makes deployment very inflexible. On the other hand, if Turms adopted an open-source solution such as Eureka, since vendors do not provide managed services for such solutions, operators would have to buy cloud servers and deploy and maintain them themselves, which greatly increases the O&M difficulty |
| Performance | Turms can exploit the characteristics of its business code so that the cluster service implementations cooperate with each other, ensuring that no redundant data is produced in the whole process. All network operations are implemented on top of Netty, with extremely high performance | Since third-party services are built for general requirements, a lot of adapter code has to be written, which increases resource overhead and learning difficulty. Their own implementations also cannot guarantee extreme efficiency, and some services even use blocking APIs |

In summary, there is almost no advantage in using third-party services, so Turms adopts a purely self-developed solution. In fact, companies with some engineering strength and some customization needs choose self-development for the same reasons.

Node

              Implementation class: im.turms.server.common.infra.cluster.node.Node

              Configuration class: im.turms.server.common.infra.property.env.common.cluster.NodeProperties

Each server has one and only one Node instance. Internally, the Node class manages node information and node life-cycle events and orchestrates the services of the node; externally, it accepts user-defined configuration, exposes the node services and provides some commonly used utility functions for the business code.

Services

              Distributed configuration center service (Config)

              Service class: im.turms.server.common.infra.cluster.service.config.SharedConfigService

              Configuration class: im.turms.server.common.infra.property.env.common.cluster.SharedConfigProperties

Nowadays, basic service implementations in the microservice field are flourishing. Taking the configuration center as an example, implementations include Kubernetes ConfigMaps, the configuration services of cloud vendors (such as AWS AppConfig) and open-source implementations (such as ZooKeeper). As a technology-neutral open-source project, Turms must not have its technology stack bound to any vendor; at the same time, the chosen approach needs to be easy for cloud vendors to support, so that operators can "deploy with a click of the mouse", and it must satisfy key properties such as disaster tolerance, high availability, observability and ease of operation. Therefore, Turms implements the configuration center itself on top of MongoDB to meet all of the above requirements.

The add, delete, update and query operations on the configuration are implemented as conventional MongoDB operations, which is routine and will not be described in detail. The only thing worth special attention is that Turms monitors configuration changes through MongoDB's Change Stream mechanism, while the official client implementation, mongo-java-driver, uses a polling mechanism to watch for changes rather than the MongoDB server actively notifying the client.
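For illustration, a minimal Change Stream watch written directly against the mongo-java-driver might look like the sketch below; the connection string, database and collection names are placeholders, not Turms' actual ones.

java
// Sketch: watching a shared-config collection for changes via a MongoDB Change Stream.
// The connection string and database/collection names are placeholders, not Turms' real ones.
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.changestream.ChangeStreamDocument;
import org.bson.Document;

final class SharedConfigWatchSketch {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> collection =
                    client.getDatabase("turms-config").getCollection("sharedProperties");
            // Under the hood the driver polls the change stream cursor; the MongoDB server
            // does not actively push notifications to the client.
            for (ChangeStreamDocument<Document> change : collection.watch()) {
                System.out.println(change.getOperationType() + ": " + change.getFullDocument());
            }
        }
    }
}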

Supplement:

              • Because the "service information" of the service registry is essentially a configuration, the following service registration and discovery are also implemented based on the configuration center.
              • The configuration center of the MongoDB cluster itself is also implemented based on the MongoDB server, that is, the Config server.

              TODO

              Service registration and discovery service (Discovery)

              Service class: im.turms.server.common.infra.cluster.service.discovery.DiscoveryService

              Configuration class: im.turms.server.common.infra.property.env.common.cluster.DiscoveryProperties

              Responsibilities

              The service is primarily responsible for:

• Do its best to ensure that the current node is registered in the service registry. When a node starts, it registers its information with the service registry; if registration fails at startup (for example, the node information is already registered), it actively shuts down the server process and reports the failure. If the node's registration information is abnormally deleted from the service registry while the node is running (for example, an administrator deletes the data by mistake), the node automatically re-registers its information
• Delete the current node's registration information from the service registry when the server shuts down gracefully. Note: if the server is shut down forcibly (for example, the machine loses power), the registration information will not be deleted by the node itself; instead, the service registry removes it automatically after detecting a 60-second heartbeat timeout. During this period, other nodes keep trying to establish a TCP connection with the node until its registration information is removed
• Listen to the node addition, deletion and modification events of the service registry, and notify the "network connection service" to establish or disconnect the corresponding TCP connections
• Elect the Leader

              Register node record format

              There are two types of record formats for registration nodes: Member and Leader

Member

              Class: im.turms.server.common.infra.cluster.service.config.domain.discovery.Member

| Field category | Field name | Description |
|---|---|---|
| Key | clusterId | Cluster ID |
| Key | nodeId | Node ID |
| General information | zone | The zone where the node is located. Used as the data center ID in the Snowflake ID algorithm |
| General information | nodeVersion | Node version. Used to ensure that operations between nodes are version compatible |
| General information | nodeType | Node type. Used to ensure that RPC requests are sent to the correct type of node |
| General information | isSeed | If a node's lastHeartbeatDate has timed out for 60 seconds and isSeed is false, the node is automatically removed from the service registry. If isSeed is true, the node is not removed even if the heartbeat times out |
| General information | registrationDate | Node registration time |
| General information | isLeaderEligible | Whether the node can participate in the Leader election |
| General information | priority | Priority. Mainly used in the Leader election; the node with a higher value is preferentially elected as the Leader |
| RPC address information | memberHost | RPC host. Used so that other nodes can communicate with this node through the host |
| RPC address information | memberPort | RPC port. Used so that other nodes can communicate with this node through the port |
| Supplementary address information | adminApiAddress | No practical effect. Only used so that administrators can learn the address of the Admin API through the Admin API |
| Supplementary address information | wsAddress | No practical effect. Only used so that administrators can learn the address of the client WebSocket service through the Admin API |
| Supplementary address information | tcpAddress | No practical effect. Only used so that administrators can learn the address of the client TCP service through the Admin API |
| Supplementary address information | udpAddress | No practical effect. Only used so that administrators can learn the address of the client UDP service through the Admin API |
| Status information | hasJoinedCluster | True means the node has completed a heartbeat refresh successfully. The field has no practical effect and only serves as an indicator of the node's heartbeat health; even an unhealthy node can still handle client requests. The value of this field for each cluster node is updated by the Leader node according to each node's lastHeartbeatDate |
| Status information | isHealthy | Deny service when False. Specifically: a turms-gateway server refuses to establish new sessions and process user requests; a turms-service server refuses to process RPC requests sent by turms-gateway servers; and when an RPC client picks a server node to handle a request, it only selects from healthy nodes |
| Status information | isActive | When False, the node is prohibited from processing client requests. The value of this field can only be updated through the Admin API. It can be used to gradually drain traffic from nodes during a grayscale release before shutting them down for an update |
| Status information | lastHeartbeatDate | The time of the last heartbeat refresh, used by the Leader node to update each node's hasJoinedCluster |

Leader

| Field category | Field name | Description |
|---|---|---|
| Key | clusterId | Cluster ID |
| Key | nodeId | Leader node ID |
| General information | renewDate | Lease renewal time. If it is not refreshed for more than 60 seconds, the service registry automatically deletes the Leader record |
| General information | generation | Generation. Mainly used to reject lease operations attempted by a previous-generation Leader that has not yet detected the birth of the new Leader |

              Leader Election

              Conditions for nodes to participate in the election:

              • The node type must be turms-service, not turms-gateway. This is because some Leader actions can only be performed by turms-service, and turms-gateway has no ability to perform these operations.
              • im.turms.server.common.infra.property.env.common.cluster.NodeProperties#leaderEligible is true (default true)
• The node status must be active

Automatic Election

Each node eligible for election tries to elect a Leader: 1. when the server starts; 2. when it learns through the Change Stream that the Leader information in the service registry has been deleted; 3. when it finds that its own isLeaderEligible has changed from False to True. At these moments:

The current node first pulls all the node information in the service registry at that moment and finds the batch of election-eligible nodes with the highest priority. If the current node is in this batch and there is no Leader in its local node-information snapshot, it sends a Leader registration request to the service registry and tries to elect itself as the Leader. If there is no Leader in the service registry, the registration succeeds; otherwise it fails.

Note: if a node with a higher priority joins the cluster, it will not snatch the Leader role (see the sketch below).
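A rough sketch of that automatic election attempt, with hypothetical types standing in for Turms' real ones:

java
// Hypothetical sketch of the automatic election described above; not Turms' actual code.
import java.util.Comparator;
import java.util.List;

final class LeaderElectionSketch {

    record Member(String nodeId, boolean leaderEligible, boolean active, int priority) {}

    interface Registry {
        List<Member> findAllMembers();
        // Atomically registers the Leader record; returns false if a Leader already exists.
        boolean tryRegisterLeader(String nodeId);
    }

    static boolean tryElectSelf(Member self, Registry registry, boolean leaderKnownLocally) {
        List<Member> members = registry.findAllMembers();
        int highestPriority = members.stream()
                .filter(m -> m.leaderEligible() && m.active())
                .map(Member::priority)
                .max(Comparator.naturalOrder())
                .orElse(Integer.MIN_VALUE);
        // Only nodes in the highest-priority eligible batch may register themselves,
        // and only if no Leader is present in the local snapshot.
        if (self.priority() < highestPriority || leaderKnownLocally) {
            return false;
        }
        return registry.tryRegisterLeader(self.nodeId());
    }
}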

              Manual election (Admin API)

The API POST /cluster/members/leader forces the cluster to re-elect the Leader. The API has an id parameter. If id is empty, the election-eligible node with the highest priority in the cluster is forced to become the Leader. If id is not empty, the node with that node ID is elected as the Leader regardless of its priority; an exception is thrown if the node does not exist or is not eligible for election.

              Leader's Responsibilities

Generally speaking, some actions must be triggered or executed by only one node, and that node is usually the Leader. Some server implementations achieve this by having nodes compete for a distributed lock, but the reliability, controllability and performance of that approach are far inferior to using a single Leader, so Turms does not use the distributed-lock scheme.

              In terms of specific actions:

• One of the Leader's most important actions is to update the latest status of each node according to the heartbeat refresh time of the other nodes in the service registry (MongoDB); the specific code is in im.turms.server.common.infra.cluster.service.discovery.LocalNodeStatusManager#updateMembersStatus (see the sketch after this list)
• "Periodically (cron) send instructions to Redis to clear expired blacklist records": this action only needs to be executed by one node, i.e. the Leader
• "Periodically (cron) delete expired database data, such as user messages": this is only executed by the Leader. (Supplement: the code of this kind of operation is actually legacy code kept "just in case"; very few applications will really delete user data, so it is disabled by default and can be ignored)

              TODO

              Network connection service (Connection)

              Service class: im.turms.server.common.infra.cluster.service.connection.ConnectionService

              Configuration class: im.turms.server.common.infra.property.env.common.cluster.connection.ConnectionProperties

In the Turms server cluster implementation, Connection is a concept that sits between Transport and RPC: on the one hand it maintains the TCP connections between nodes, and on the other hand it relies on RpcService to perform heartbeats between nodes (used to detect whether the TCP connections between nodes are healthy). ConnectionService and RpcService are not merged into one service because each contains a lot of its own logic; to follow the single-responsibility principle as much as possible and avoid mixing a large amount of TCP connection maintenance with RPC logic, the two services are kept separate.

              Responsibilities

• Connect to other cluster nodes over TCP according to the requests of the service registration and discovery service. Note: there is one and only one TCP connection between any two nodes
• If the connection to another cluster node is accidentally lost, perform a best-effort reconnect
• Send heartbeat requests to confirm that the TCP connections between nodes are indeed alive

              Network connection life cycle

• Establish a TCP connection
• Perform the application-layer handshake and exchange the basic necessary node information, such as the node ID, so that each side knows which node its TCP peer is. Note: this handshake is not the handshake of the TCP protocol
• After the handshake succeeds, the nodes can send and receive network data
• Before closing the TCP connection, send the application-layer "wave" (goodbye) to notify the peer that the node is about to disconnect actively, so that it can be distinguished from an accidental TCP disconnection. Note: this wave is not part of the TCP protocol
• Close the TCP connection

              Codec service (Codec)

              Service class: im.turms.server.common.infra.cluster.service.codec.CodecService

This service mainly provides the data encoding and decoding implementation for the RPC service. Notably, Turms does not use reflection to implement serialization and deserialization uniformly; instead, the implementation is customized for each data type. This is mainly because: 1. a customized implementation guarantees absolute efficiency, e.g. a Set<DeviceType> can use the bits of a single byte to indicate which values exist instead of a group of bytes; 2. avoiding reflection keeps the code efficient; 3. the code is what-you-see-is-what-you-get, avoiding obscure operations.
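For illustration, the Set<DeviceType> example above could be encoded as a single bitmask byte like the sketch below (the enum values are illustrative; this is not the actual codec code):

java
// Sketch: encoding a set of device types as one bitmask byte, one bit per enum value.
import java.util.EnumSet;
import java.util.Set;

final class DeviceTypeSetCodecSketch {

    enum DeviceType { ANDROID, IOS, DESKTOP, BROWSER, SERVER, OTHERS } // illustrative values

    static byte encode(Set<DeviceType> deviceTypes) {
        byte bits = 0;
        for (DeviceType type : deviceTypes) {
            bits |= 1 << type.ordinal(); // set the bit at the enum's ordinal position
        }
        return bits;
    }

    static Set<DeviceType> decode(byte bits) {
        Set<DeviceType> deviceTypes = EnumSet.noneOf(DeviceType.class);
        for (DeviceType type : DeviceType.values()) {
            if ((bits & (1 << type.ordinal())) != 0) {
                deviceTypes.add(type);
            }
        }
        return deviceTypes;
    }
}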

              RPC service

              Service class: im.turms.server.common.infra.cluster.service.rpc.RpcService

              Configuration class: im.turms.server.common.infra.property.env.common.cluster.RpcProperties

              This service is based on the underlying TCP network connection provided by the "network connection service" and the data serialization and deserialization capabilities provided by the "codec service" to implement the relevant logic of the RPC operation.

              Encoding format

              The components of an RPC request:

1. The varint-encoded payload length, used to delimit the bytes of each RPC request in the TCP byte stream. For most RPC requests this part occupies 1-2 bytes.
2. Request header: data type ID (2 bytes) + request ID (4 bytes)
3. Request body: different requests use different encodings, but all of them are custom encodings that guarantee extreme efficiency. The largest data in a request body is usually user-generated text, such as a chat message

              Components of an RPC response:

1. The varint-encoded payload length, used to delimit the bytes of each RPC response in the TCP byte stream. For most RPC responses this part usually occupies 1 byte.
2. Response header: data type ID (2 bytes) + the ID of the request being responded to (4 bytes)
3. Response body: response bodies fall into two categories, normal responses and error responses. A normal response carries some data type, such as one of the eight primitive types or a composite data type. An error response is essentially just a composite data type, expressed as the RpcException data type, which describes the error through the RpcErrorCode, ResponseStatusCode and description (String) fields
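A rough sketch of that wire format with the field widths described above (the helper names are illustrative, not Turms' actual codec):

java
// Sketch: framing an RPC request as [varint length][2-byte data type ID][4-byte request ID][body].
import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;

final class RpcFramingSketch {

    static ByteBuf frameRequest(int dataTypeId, int requestId, byte[] body) {
        int payloadLength = Short.BYTES + Integer.BYTES + body.length;
        ByteBuf buf = Unpooled.buffer();
        writeVarint(buf, payloadLength); // usually 1-2 bytes for typical requests
        buf.writeShort(dataTypeId);      // request header: data type ID (2 bytes)
        buf.writeInt(requestId);         // request header: request ID (4 bytes)
        buf.writeBytes(body);            // request body: custom per-type encoding
        return buf;
    }

    static void writeVarint(ByteBuf buf, int value) {
        while ((value & ~0x7F) != 0) {
            buf.writeByte((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        buf.writeByte(value);
    }
}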

Supplement

• Some requests (such as the user chat messages carried in notifications) are sent to multiple different RPC nodes; their request bodies all share off-heap direct memory, so no memory copying is needed

• Turms currently does not plan to compress RPC request and response data, mainly because the compression ratio of the available compression algorithms on this data is not ideal, while compression and decompression consume a lot of memory and CPU. Overall, the cost-performance ratio of compression is too low, so it is not used.

  That said, for data transferred between the server and the client, compression support will be considered in the future. The fundamental motivation is to improve data deliverability (especially in weak network environments) by compressing data, at the cost of more memory and CPU usage (new memory has to be allocated when compressing/decompressing).

              Backpressure

The turms-gateway server's back-pressure handling for the turms-service servers is fairly clever. Specifically, each node judges its own health according to its CPU and memory load and synchronizes this health information to the other nodes. turms-gateway picks a node whose isHealthy is True from the known list of turms-service nodes and sends RPC requests to it. If turms-gateway finds that isHealthy is False for all turms-service nodes, it no longer sends RPCs and throws an exception directly instead, as sketched below.
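A minimal sketch of that selection logic, with hypothetical types:

java
// Hypothetical sketch of picking a healthy turms-service node for an RPC request.
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

final class HealthyNodeSelectorSketch {

    record ServiceNode(String nodeId, boolean isHealthy) {}

    static ServiceNode pickHealthyNode(List<ServiceNode> knownTurmsServiceNodes) {
        List<ServiceNode> healthyNodes = knownTurmsServiceNodes.stream()
                .filter(ServiceNode::isHealthy)
                .toList();
        if (healthyNodes.isEmpty()) {
            // Back pressure: no healthy turms-service node, so fail fast instead of sending the RPC.
            throw new IllegalStateException("No healthy turms-service node is available");
        }
        return healthyNodes.get(ThreadLocalRandom.current().nextInt(healthyNodes.size()));
    }
}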

              Failover

For an RPC request without a specific target node: if one Turms server sends an RPC request to another Turms server and the peer responds with an error, the sender automatically sends the RPC request to yet another Turms server. For example, if a client sends a request to turms-gateway, turms-gateway first randomly picks a turms-service node to handle the request; if that turms-service responds with an error, turms-gateway automatically looks for another turms-service node to handle the user request.

              Distributed ID Generation Service (IdGen)

              Service class: im.turms.server.common.infra.cluster.service.idgen.IdService

The distributed ID generator quickly provides cluster-unique IDs for each business scenario. Generating a cluster-unique ID only requires local operations on a node (specific code: im.turms.server.common.infra.cluster.service.idgen.SnowflakeIdGenerator#nextLargeGapId), which is extremely efficient.

              Principle

              Turms' distributed ID generator is implemented based on the mainstream Snowflake ID algorithm, and the generated ID is of long data type, specifically:

• The highest bit (1 bit) is always 0, indicating a positive number
• 41 bits represent the timestamp in milliseconds, which can cover about 69 years; the specific UTC interval is [2020-10-13, 2090-06-19]. 2020-10-13 is the hard-coded epoch; to change it, just modify the value of im.turms.server.common.infra.cluster.service.idgen.SnowflakeIdGenerator#EPOCH
• 4 bits represent the data center ID, with a range of [0, 15]. In practice, IDs are usually assigned per cloud-service region, i.e. each region gets one ID. Turms automatically maps the zone name in NodeProperties#zone to a value in [0, 15]. Note: if there are more than 16 zone names, they are still mapped into [0, 15], but this means data center IDs can repeat and cluster nodes risk generating the same ID; the affected node prints a warning log to warn of this risk
• 8 bits represent the worker node ID, with a range of [0, 255]. Turms automatically maps the node's im.turms.server.common.infra.property.env.common.cluster.NodeProperties#zone "zone name" to a value in [0, 255]. Note: if a data center has more than 256 nodes, the node IDs are still mapped into [0, 255], but this means worker node IDs can repeat and cluster nodes risk generating the same ID; the affected node prints a warning log to warn of this risk
• 10 bits represent the sequence number. Up to 1024 sequence numbers can be generated within one timestamp unit (1 millisecond), i.e. at most 1024 unique IDs per millisecond, or 1,024,000 per second, so in practice duplicate IDs will not occur

              Supplement: According to the node information, the code to update the data center ID and the working node ID information is in: addOnMembersChangeListener of im.turms.server.common.infra.cluster.service.idgen.IdService#IdService

              Variation implementation

              Concrete implementation: im.turms.server.common.infra.cluster.service.idgen.SnowflakeIdGenerator#nextLargeGapId

              The IDs generated by the conventional snowflake algorithm are monotonically increasing. But in most cases, Turms' business implementation uses IDs with large intervals to avoid monotonically increasing IDs. The reason for this is to use large-spacing IDs to ensure that when these data are stored in the MongoDB database, MongoDB can generate enough Chunks based on these IDs, and load-balance these Chunks to each MongoDB server for storage. . The monotonically increasing ID will cause all new data to be always allocated to the only hotspot MongoDB server, causing the load balancing of the database to fail.

              The implementation of large-spacing ID is also very simple, just rearrange the fields, the specific order is: serial number, time stamp, data center ID, work node ID (the ID order of the conventional snowflake algorithm is time stamp, data Center ID, worker node ID, serial number). Since the serial number occupies the highest bit of the ID, and the generated serial number is monotonically increasing in the interval [0, 1023], it can ensure that the generated ID quickly occupies a large range of values, and is divided into multiple Chunks by MongoDB and stored in a load-balanced manner. In different MongoDB servers.

              - +
              Skip to content

              Cluster Design and Implementation

The cluster implementation of Turms is relatively clear and easy to understand. The implementation code is in the package src/main/java/im/turms/server/common/infra/cluster, and the configuration classes are in src/main/java/im/turms/server/common/infra/property/env/common/cluster.

Reasons for a purely self-developed implementation

The trade-offs, compared criterion by criterion:

Customized functions
• Self-developed: Turms has many customized, detailed requirements, and its features are closely interlinked. Developing the cluster services ourselves ensures new requirements are implemented immediately: a new requirement typically takes roughly 5 to 60 minutes, and no hacky code is needed.
• Third-party services: They do not necessarily offer the needed customization. Even when they do, it usually takes weeks, months, or even years before a new feature is released in a new version. Such low efficiency is unacceptable.

Difficulty of learning
• Self-developed: The services are clearly divided and the code is streamlined, so they can be learned and mastered quickly. Spending 10 to 30 minutes training a newcomer is enough for them to grasp the Turms cluster services.
• Third-party services: Projects such as ZooKeeper or Eureka, each of which covers only one microservice concern, have far more source code than the six Turms cluster services described below combined. They also involve features that are relatively complex yet useless for Turms, such as ZooKeeper's Zab protocol, which only adds to the learning curve. Mastering their implementation details requires a lot of practice and source-code reading.

Implementation difficulty
• Self-developed: The cluster service code is easy to implement. For example, the overall difficulty of Turms' cluster services is far lower than that of the "Aho-Corasick automaton based on a double-array trie" mentioned in the "sensitive word filtering" feature, and much lower than the implementation of the IM business logic itself.
• Third-party services: Adapter code has to be written for the characteristics of each third-party service. Although that code is not hard, the complexity of the third-party source code makes it difficult to guarantee the adapter will always behave as expected. (The time spent studying the source code of various third-party services and writing adapters would already be enough to self-develop several sets of cluster services from scratch.)

Deployment and O&M difficulty
• Self-developed: Among Turms' cluster services, only the "configuration center service" and the "service registry" need MongoDB for deployment, and they share one MongoDB service. Therefore: 1. since business data storage also uses MongoDB, operations staff can choose to share a single MongoDB service without any extra deployment; 2. cloud vendors at home and abroad offer managed MongoDB, so a single-instance or clustered MongoDB service can be deployed with a few clicks, with same-city disaster recovery out of the box.
• Third-party services: Most "configuration center" and "service registry" offerings from cloud vendors are bound to a specific vendor, which severely limits deployment flexibility. If Turms instead adopted an open-source solution such as Eureka, vendors do not provide managed services for it, so operations staff would have to purchase and operate cloud servers themselves, which greatly increases O&M difficulty.

Performance
• Self-developed: Turms can tailor the cluster services to the characteristics of its business code so that they cooperate with each other, ensuring no redundant data is produced anywhere in the process. All network operations are implemented on Netty, with extremely high performance.
• Third-party services: Because they are built for general requirements, a lot of adapter code has to be written, which adds resource overhead and makes the system harder to learn. Their own implementations also cannot guarantee extreme efficiency, and some services even use blocking APIs.

In summary, third-party services offer almost no advantage here, so Turms adopts a purely self-developed solution. In practice, companies with some engineering capability and customization needs make the same choice for the same reasons.

Node

              Implementation class: im.turms.server.common.infra.cluster.node.Node

              Configuration class: im.turms.server.common.infra.property.env.common.cluster.NodeProperties

Each server has exactly one Node instance. Internally, the Node class manages node information and node lifecycle events and coordinates the node's services. Externally, it accepts user-defined configuration, exposes the node services, and provides some commonly used utility functions for business implementation code.

Services

              Distributed configuration center service (Config)

              Service class: im.turms.server.common.infra.cluster.service.config.SharedConfigService

              Configuration class: im.turms.server.common.infra.property.env.common.cluster.SharedConfigProperties

              Nowadays, basic service implementation schemes in the field of microservices are flourishing. Taking the implementation of the configuration center as an example, the implementation solutions include: ConfigMaps of K8S, configuration services of cloud service vendors (such as AppConfig of AWS), and open source implementations (such as Zookeeper). As Turms is a technology-neutral open source project, its technology stack must not be bound by vendors. But at the same time, it is necessary to ensure that these implementations can be easily supported by cloud service vendors, so that operation and maintenance personnel can "implement and deploy with a click of a mouse." At the same time, it must meet various key features such as disaster tolerance, high availability, monitorability, and easy operation. Therefore, Turms implements the configuration center through MongoDB self-development to meet all the above requirements.

Adding, deleting, updating, and querying specific configuration entries are implemented as ordinary MongoDB CRUD operations, which is routine and not worth detailing. The only point worth special attention: Turms monitors configuration changes through MongoDB's Change Stream mechanism, and the official client implementation mongo-java-driver watches for changes by polling, rather than the MongoDB server actively pushing notifications to the client.
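
As a rough illustration (not Turms' actual code), listening for configuration changes through a Change Stream with the MongoDB Reactive Streams driver looks roughly like the following; the connection string, database, and collection names are assumptions made only for this example:

java
import com.mongodb.client.model.changestream.ChangeStreamDocument;
import com.mongodb.reactivestreams.client.MongoClient;
import com.mongodb.reactivestreams.client.MongoClients;
import com.mongodb.reactivestreams.client.MongoCollection;
import org.bson.Document;
import reactor.core.publisher.Flux;

public final class ConfigWatchExample {
    public static void main(String[] args) {
        // Assumed connection string, database, and collection names (illustration only)
        MongoClient client = MongoClients.create("mongodb://localhost:27017");
        MongoCollection<Document> collection = client
                .getDatabase("turms-config")
                .getCollection("shared-properties");
        // watch() opens a Change Stream; the driver polls the server internally,
        // the server does not push notifications to the client
        Flux.from(collection.watch())
                .doOnNext((ChangeStreamDocument<Document> change) ->
                        System.out.println("Configuration changed: " + change.getOperationType()))
                .blockLast();
    }
}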

Supplement:

              • Because the "service information" of the service registry is essentially a configuration, the following service registration and discovery are also implemented based on the configuration center.
              • The configuration center of the MongoDB cluster itself is also implemented based on the MongoDB server, that is, the Config server.

              TODO

              Service registration and discovery service (Discovery)

              Service class: im.turms.server.common.infra.cluster.service.discovery.DiscoveryService

              Configuration class: im.turms.server.common.infra.property.env.common.cluster.DiscoveryProperties

              Responsibilities

              The service is primarily responsible for:

• Ensure, on a best-effort basis, that the current node is registered in the service registry. When a node starts, it registers its information with the service registry; if registration fails at startup (for example, because the node information is already registered), it actively shuts down the server process and reports the failure. If the node's registration information is deleted abnormally from the service registry while the node is running (for example, an administrator deletes the data by mistake), the node automatically re-registers its information
• When the server shuts down gracefully, delete the current node's registration information from the service registry. Note: if the server is shut down forcibly (for example, the machine loses power), the node cannot delete its own registration information; it is removed automatically after the service registry detects a 60-second heartbeat timeout. During this period, other nodes keep trying to establish a TCP connection with this node until its registration information is removed by the service registry
• Listen to the node addition, deletion, and modification events of the service registry, and notify the "network connection service" to establish or close the corresponding TCP connections
• Elect the Leader

Registered node record formats

There are two types of records for registered nodes: Member and Leader.

Member

              Class: im.turms.server.common.infra.cluster.service.config.domain.discovery.Member

| Field category | Field name | Description |
| --- | --- | --- |
| Key | clusterId | Cluster ID |
| | nodeId | Node ID |
| General information | zone | The zone where the node is located. Used as the data center ID in the Snowflake ID algorithm |
| | nodeVersion | Node version number. Used to ensure that operations between nodes are version compatible |
| | nodeType | Node type. Used to ensure that RPC requests are sent to the correct type of node |
| | isSeed | If a node's lastHeartbeatDate has timed out for 60 seconds and isSeed is false, the node is automatically removed from the service registry. If isSeed is true, the node is not removed even if its heartbeat times out |
| | registrationDate | Node registration time |
| | isLeaderEligible | Whether the node can participate in the Leader election |
| | priority | Priority. Mainly used in the Leader election; a node with a higher value is preferentially elected as the Leader |
| RPC address information | memberHost | RPC host. Used so that other nodes can communicate with this node through the host |
| | memberPort | RPC port number. Used so that other nodes can communicate with this node through this port |
| Supplementary address information | adminApiAddress | No practical effect. Only used so that administrators can learn the address of the Admin API through the Admin API |
| | wsAddress | No practical effect. Only used so that administrators can learn the address of the client WebSocket service through the Admin API |
| | tcpAddress | No practical effect. Only used so that administrators can learn the address of the client TCP service through the Admin API |
| | udpAddress | No practical effect. Only used so that administrators can learn the address of the client UDP service through the Admin API |
| Status information | hasJoinedCluster | When true, the node has successfully completed a heartbeat refresh operation. The field has no practical effect and only serves as an indicator of heartbeat health; even an unhealthy node can still handle client requests. The value of this field for each cluster node is updated by the Leader node according to each node's lastHeartbeatDate |
| | isHealthy | The node denies service when false. Specifically: a turms-gateway server refuses to establish new sessions and process user requests; a turms-service server refuses to process RPC requests sent by turms-gateway servers; and when picking a server to handle an RPC, only healthy nodes are selected |
| | isActive | When false, the node is prohibited from processing client requests. The value can only be updated through the Admin API. It can be used to gradually drain traffic from a node during a gray release before shutting it down for an update |
| | lastHeartbeatDate | The last heartbeat refresh time, used by the Leader node to update hasJoinedCluster |

              Leader
| Field category | Field name | Description |
| --- | --- | --- |
| Key | clusterId | Cluster ID |
| | nodeId | Leader node ID |
| General information | renewDate | Lease renewal time. If it is not refreshed for more than 60 s, the service registry automatically deletes the Leader record |
| | generation | Generation. Mainly used to reject lease operations attempted by a previous-generation Leader that has not yet noticed the birth of the new Leader |

              Leader Election

              Conditions for nodes to participate in the election:

              • The node type must be turms-service, not turms-gateway. This is because some Leader actions can only be performed by turms-service, and turms-gateway has no ability to perform these operations.
              • im.turms.server.common.infra.property.env.common.cluster.NodeProperties#leaderEligible is true (default true)
              • Node status must be active
              Automatic Election

Each node eligible for election attempts an election: 1. when its server starts; 2. when it learns through the Change Stream that the Leader record in the service registry has been deleted; 3. when it finds that its own isLeaderEligible value has changed from false to true.

In these cases, the node first pulls all node information currently in the service registry and finds the batch of election-eligible nodes with the highest priority. If the current node is in this batch and there is no Leader in its local snapshot of node information, it sends a Leader registration request to the service registry and tries to elect itself as the Leader. If no Leader exists in the service registry, the registration succeeds; otherwise, it fails.

Note: if a node with a higher priority joins the cluster later, it does not preempt the existing Leader.
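
A minimal sketch of this "register yourself as Leader only if no Leader exists" step, assuming the Leader record uses the cluster ID as its _id so that a concurrent insert by another node fails with a duplicate-key error (an assumption made for this illustration, not Turms' actual code):

java
import com.mongodb.MongoWriteException;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

import java.util.Date;

public final class LeaderElectionSketch {

    /** Returns true if this node won the election, false if a Leader already exists. */
    static boolean tryRegisterAsLeader(MongoCollection<Document> leaderCollection,
                                       String clusterId,
                                       String nodeId,
                                       int generation) {
        Document leader = new Document("_id", clusterId) // assumption: one Leader record per cluster
                .append("nodeId", nodeId)
                .append("renewDate", new Date())
                .append("generation", generation);
        try {
            leaderCollection.insertOne(leader);
            return true; // no Leader existed, so the registration succeeded
        } catch (MongoWriteException e) {
            return false; // duplicate key: another node is already the Leader
        }
    }
}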

              Manual election (Admin API)

The Admin API POST /cluster/members/leader forces the cluster to re-elect the Leader. The API has an id parameter: if id is empty, the eligible node with the highest priority in the current cluster is forced to become the Leader; if id is not empty, the node whose node ID equals id is elected as the Leader regardless of its priority. An exception is thrown if the node does not exist or is not eligible for election.
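
For example, forcing a specific node to become the Leader could be done as follows; the host, port, and the assumption that id is passed as a query parameter are illustrative only and should be checked against your deployment and the Admin API documentation:

java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public final class ForceLeaderExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Assumed host/port and query-parameter style (illustration only)
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8510/cluster/members/leader?id=turms-east-001"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}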

              Leader's Responsibilities

Generally speaking, certain actions must be triggered or executed by exactly one node, and that node is usually the Leader. Some server implementations achieve this by having nodes preempt a distributed lock, but the reliability, controllability, and performance of that approach are far inferior to using a single Leader, so Turms does not use a distributed-lock scheme.

              In terms of specific actions:

• One of the most important actions of the Leader is updating the latest status of each node according to the heartbeat refresh times of the other nodes in the service registry (MongoDB). The specific code is in im.turms.server.common.infra.cluster.service.discovery.LocalNodeStatusManager#updateMembersStatus
• Periodically (via cron) sending instructions to Redis to clear expired blacklist records. This only needs to be executed by one node, i.e. the Leader, on a schedule
• Periodically (via cron) deleting expired database data, such as user messages; this is also only executed by the Leader. (Supplement: this kind of operation is actually legacy code kept "just in case". Very few applications really delete user data, so it is disabled by default and can be ignored.)

              TODO

              Network connection service (Connection)

              Service class: im.turms.server.common.infra.cluster.service.connection.ConnectionService

              Configuration class: im.turms.server.common.infra.property.env.common.cluster.connection.ConnectionProperties

In the Turms server cluster implementation, Connection is a concept that sits between Transport and RPC: on the one hand, Connection maintains the TCP connections between nodes; on the other hand, it relies on RpcService to perform the heartbeat operations between nodes (used to detect whether the TCP connection between two nodes is healthy). ConnectionService and RpcService are not merged into one service because both have a lot of logic of their own; to follow the single-responsibility principle as much as possible and avoid mixing a large amount of TCP connection maintenance logic with RPC logic, the two services are kept separate.

              Responsibilities

• At the request of the service registration and discovery service, connect to other cluster nodes over TCP. Note: there is one and only one TCP connection between any two nodes
• If the connection to another cluster node is lost unexpectedly, reconnect on a best-effort basis
              • Send a heartbeat request to confirm that the TCP connection between nodes is indeed valid

              Network connection life cycle

              • Establish a TCP connection
• Perform the application-layer handshake and exchange the basic information of the nodes, such as the node ID, so that each side knows which node its TCP peer is (a rough sketch of this step follows this list). Note: this handshake is not the handshake of the TCP protocol itself
• After the handshake succeeds, the nodes can send and receive network data
• Before closing the TCP connection, send an application-layer "goodbye" so the peer knows the node is disconnecting intentionally, distinguishing this from an accidental TCP disconnection. Note: this is not the closing handshake of the TCP protocol itself
              • Close the TCP connection
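
As a rough sketch of the application-layer handshake idea (the frame format here is invented for illustration and is not Turms' actual wire format), the snippet below length-prefixes the local node ID and writes it right after the TCP connection is established:

java
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public final class HandshakeSketch {

    /** Sends a minimal handshake frame: [2-byte length][UTF-8 node ID]. Illustrative format only. */
    static void sendHandshake(Socket socket, String localNodeId) throws IOException {
        byte[] nodeIdBytes = localNodeId.getBytes(StandardCharsets.UTF_8);
        DataOutputStream out = new DataOutputStream(socket.getOutputStream());
        out.writeShort(nodeIdBytes.length); // tell the peer how many bytes to read
        out.write(nodeIdBytes);             // the node ID tells the peer which node this connection belongs to
        out.flush();
    }
}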

              Codec service (Codec)

              Service class: im.turms.server.common.infra.cluster.service.codec.CodecService

This service mainly provides the data codec implementation for the RPC service. Notably, Turms does not use reflection to implement serialization and deserialization generically, but customizes the implementation for each piece of data. This is mainly because: 1. a customized implementation guarantees maximum efficiency; for example, a Set<DeviceType> can use a single byte, with one bit per value, instead of a group of bytes; 2. avoiding reflection keeps the code fast; 3. the code is what-you-see-is-what-you-get, avoiding obscure operations.
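
The Set<DeviceType> example above can be illustrated as follows; DeviceType here is a stand-in enum defined only for this sketch, not Turms' actual class:

java
import java.util.EnumSet;
import java.util.Set;

public final class DeviceTypeCodecSketch {

    // Stand-in enum for illustration; the real DeviceType lives in the Turms codebase
    enum DeviceType { ANDROID, IOS, DESKTOP, BROWSER, KIOSK, OTHERS }

    /** Encodes the set into a single byte: bit i is set if the constant with ordinal i is present. */
    static byte encode(Set<DeviceType> deviceTypes) {
        byte bits = 0;
        for (DeviceType type : deviceTypes) {
            bits |= (byte) (1 << type.ordinal());
        }
        return bits;
    }

    /** Decodes the byte back into a set of device types. */
    static Set<DeviceType> decode(byte bits) {
        Set<DeviceType> deviceTypes = EnumSet.noneOf(DeviceType.class);
        for (DeviceType type : DeviceType.values()) {
            if ((bits & (1 << type.ordinal())) != 0) {
                deviceTypes.add(type);
            }
        }
        return deviceTypes;
    }
}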

              RPC service

              Service class: im.turms.server.common.infra.cluster.service.rpc.RpcService

              Configuration class: im.turms.server.common.infra.property.env.common.cluster.RpcProperties

              This service is based on the underlying TCP network connection provided by the "network connection service" and the data serialization and deserialization capabilities provided by the "codec service" to implement the relevant logic of the RPC operation.

              Encoding format

              The components of an RPC request:

1. A varint-encoded length, used to delimit the byte range of each RPC request in the TCP byte stream. For most RPC requests, this part occupies 1–2 bytes.
2. Request header: data type ID (2 bytes) + request ID (4 bytes)
3. Request body: different requests use different encodings, but all of them use custom encoding to guarantee maximum efficiency. The largest data in a request body is user-defined text, such as a chat message

              Components of an RPC response:

1. A varint-encoded length, used to delimit the byte range of each RPC response in the TCP byte stream. For most RPC responses, this part occupies 1 byte.
2. Response header: data type ID (2 bytes) + the request ID being responded to (4 bytes)
3. Response body: the response body falls into two categories, normal responses and error responses. A normal response carries various data types, such as the eight primitive types and other composite data types. An error response is essentially just another composite data type, expressed as RpcException, whose error information is described by the RpcErrorCode, ResponseStatusCode, and description (String) fields.
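
A minimal sketch of this framing, writing a protobuf-style varint length followed by the 2-byte data type ID and 4-byte request ID; exactly which bytes the length covers and the body encoding are simplified here for illustration:

java
import java.io.ByteArrayOutputStream;

public final class RpcFramingSketch {

    /** Writes an unsigned int as a protobuf-style varint (7 bits per byte, MSB = "more bytes follow"). */
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    /** Frames one RPC request: varint length + 2-byte data type ID + 4-byte request ID + body. */
    static byte[] frame(short dataTypeId, int requestId, byte[] body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, 2 + 4 + body.length);  // assumed here to cover header + body
        out.write((dataTypeId >>> 8) & 0xFF);   // header: data type ID (2 bytes)
        out.write(dataTypeId & 0xFF);
        out.write((requestId >>> 24) & 0xFF);   // header: request ID (4 bytes)
        out.write((requestId >>> 16) & 0xFF);
        out.write((requestId >>> 8) & 0xFF);
        out.write(requestId & 0xFF);
        out.writeBytes(body);                   // request body (custom per-request encoding)
        return out.toByteArray();
    }
}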

Supplement

• Some requests (such as the user chat messages in "Notification") are sent to multiple different RPC nodes; their request bodies all share the same off-heap direct memory, so no memory copy is required

• Turms currently has no plan to compress RPC request and response data, mainly because the compression ratio of the various compression algorithms is not ideal while compression and decompression consume a lot of memory and CPU. Overall, the cost/benefit ratio of compression is too poor and the loss outweighs the gain, so compression is not used.

  That said, compression support will be considered in the future for data transfer between server and client. The fundamental motivation: at the cost of more memory and CPU usage (new memory has to be allocated when compressing/decompressing), compressed data improves data deliverability, especially in weak network environments

              Backpressure

The turms-gateway server implements backpressure towards the turms-service servers in a fairly indirect way. Specifically, each node judges its own health according to its CPU and memory load and synchronizes that health information to the other nodes. turms-gateway then picks, from its known list of turms-service nodes, a node whose isHealthy is true and sends RPC requests to it. If turms-gateway finds that isHealthy is false for every turms-service node, it stops sending RPCs and throws an exception directly instead.
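
A minimal sketch of this selection logic, with a hypothetical Member record standing in for the real registry model:

java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public final class HealthyNodeSelector {

    // Hypothetical, minimal member model for illustration
    record Member(String nodeId, boolean isHealthy) {
    }

    /** Picks a random healthy turms-service member, or throws if none is healthy. */
    static Member pickHealthyMember(List<Member> turmsServiceMembers) {
        List<Member> healthy = turmsServiceMembers.stream()
                .filter(Member::isHealthy)
                .toList();
        if (healthy.isEmpty()) {
            throw new IllegalStateException("No healthy turms-service node is available");
        }
        return healthy.get(ThreadLocalRandom.current().nextInt(healthy.size()));
    }
}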

              Failover

For an RPC request without a specific target node, if one Turms server sends the RPC request to another Turms server and the peer responds with an error, the sender automatically sends the RPC request to yet another Turms server. For example, when a client sends a request to turms-gateway, turms-gateway first picks a turms-service at random to process the user request; if that turms-service responds with an error, turms-gateway automatically looks for another turms-service to handle the request.
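
The retry-on-error behavior can be sketched with Reactor as follows; sendRpc and the node-picking helpers are hypothetical placeholders, not Turms' actual routing code:

java
import reactor.core.publisher.Mono;

public final class RpcFailoverSketch {

    /** Sends the RPC to one node and retries on another node if the first call errors out. */
    static <T> Mono<T> sendWithFailover(Object request) {
        return Mono.defer(() -> RpcFailoverSketch.<T>sendRpc(pickNode(), request))
                .onErrorResume(firstError -> RpcFailoverSketch.<T>sendRpc(pickAnotherNode(), request));
    }

    // Hypothetical helpers; the real routing lives in Turms' discovery and RPC services
    static String pickNode() { return "turms-service-001"; }
    static String pickAnotherNode() { return "turms-service-002"; }
    static <T> Mono<T> sendRpc(String nodeId, Object request) { return Mono.empty(); }
}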

              Distributed ID Generation Service (IdGen)

              Service class: im.turms.server.common.infra.cluster.service.idgen.IdService

              The distributed ID generator is used to quickly provide the unique ID of the cluster for each business scenario. Generating a unique ID for a cluster only requires nodes to perform local operations (specific code: im.turms.server.common.infra.cluster.service.idgen.SnowflakeIdGenerator#nextLargeGapId), which is extremely efficient.

              Principle

              Turms' distributed ID generator is implemented based on the mainstream Snowflake ID algorithm, and the generated ID is of long data type, specifically:

              • The highest bit (1 bit) is always 0, indicating a positive number
              • 41 bits represent the time stamp in milliseconds, which can represent about 69 years. The specific UTC time interval is: [2020-10-13, 2090-06-19]. 2020-10-13 is the hard-coded Epoch time, if you want to modify the time, just modify the value of im.turms.server.common.infra.cluster.service.idgen.SnowflakeIdGenerator#EPOCH
• 4 bits represent the data center ID, with a range of [0, 15]. In practice, IDs are usually assigned per cloud region, i.e. each region has its own ID. Turms automatically maps the node's zone name (NodeProperties#zone) to a value in the interval [0, 15]. Note: if there are more than 16 zone names, the extra names are still mapped into [0, 15], which means data center IDs may be duplicated and cluster nodes risk generating the same ID. A node downgraded in this way prints a warning log to flag the risk of duplicate IDs.
• 8 bits represent the worker node ID, with a range of [0, 255]. Turms automatically maps the node to a value in the interval [0, 255] according to im.turms.server.common.infra.property.env.common.cluster.NodeProperties. Note: if there are more than 256 nodes in one data center, the extra nodes are still mapped into [0, 255], which means worker node IDs may be duplicated and cluster nodes risk generating the same ID. A node downgraded in this way prints a warning log to flag the risk of duplicate IDs.
• 10 bits represent the sequence number. Up to 1024 sequence numbers can be represented within one timestamp unit (1 millisecond), i.e. at most 1024 unique IDs can be generated per millisecond, or 1,024,000 per second, so duplicate IDs will not occur in practice.

Supplement: the code that updates the data center ID and worker node ID according to the node information is in the addOnMembersChangeListener callback registered in im.turms.server.common.infra.cluster.service.idgen.IdService#IdService

              Variation implementation

              Concrete implementation: im.turms.server.common.infra.cluster.service.idgen.SnowflakeIdGenerator#nextLargeGapId

The IDs generated by the conventional snowflake algorithm are monotonically increasing, but in most cases Turms' business implementation uses large-gap IDs instead, precisely to avoid monotonically increasing IDs. The reason is that large-gap IDs ensure that when the data is stored in MongoDB, MongoDB can generate enough chunks based on these IDs and balance those chunks across the MongoDB servers. Monotonically increasing IDs would cause all new data to always be routed to a single hotspot MongoDB server, defeating the database's load balancing.

The large-gap ID implementation is also very simple: just rearrange the fields into the order sequence number, timestamp, data center ID, worker node ID (the conventional snowflake order is timestamp, data center ID, worker node ID, sequence number). Since the sequence number occupies the highest bits of the ID and increases monotonically over the interval [0, 1023], the generated IDs quickly spread across a large range of values, so MongoDB splits them into multiple chunks and stores them on different MongoDB servers in a load-balanced manner.
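
The bit rearrangement can be sketched as follows, using the widths given above (1 sign bit, 10 sequence bits, 41 timestamp bits, 4 data center bits, and 8 worker bits for the large-gap variant); this is a simplified illustration, not the actual SnowflakeIdGenerator code:

java
public final class LargeGapIdSketch {

    // 2020-10-13T00:00:00Z in milliseconds, matching the epoch date stated above
    private static final long EPOCH = 1602547200000L;

    /** Conventional snowflake layout: timestamp | data center ID | worker ID | sequence. */
    static long nextIncreasingId(long timestampMillis, long dataCenterId, long workerId, long sequence) {
        long timestamp = timestampMillis - EPOCH;
        return (timestamp << 22) | (dataCenterId << 18) | (workerId << 10) | sequence;
    }

    /** Large-gap layout: sequence | timestamp | data center ID | worker ID. */
    static long nextLargeGapId(long timestampMillis, long dataCenterId, long workerId, long sequence) {
        long timestamp = timestampMillis - EPOCH;
        return (sequence << 53) | (timestamp << 12) | (dataCenterId << 8) | workerId;
    }
}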

              + \ No newline at end of file diff --git a/docs/server/module/data-analytics.html b/docs/server/module/data-analytics.html index a95b9e0f..8b054124 100644 --- a/docs/server/module/data-analytics.html +++ b/docs/server/module/data-analytics.html @@ -17,8 +17,8 @@ -

              Data Analysis

When designing the table structure for a small instant messaging scenario, there is no need to consider sharding the data model, and the business model and the statistical model can be integrated directly. For small business scenarios it is therefore possible to quickly implement out-of-the-box, efficient basic data analysis functions in code, and to extend them with common statistical APIs based on indexed fields.

However, the Turms project is designed for medium and large instant messaging scenarios, where data analysis and business implementation must be separated at the architectural level, which also implies separating business models from data models. If you need to perform data analysis, you can collect the metric and tracking (instrumentation) logs generated by the turms-gateway and turms-service servers and analyze them with cloud services or a self-developed implementation.

In addition, since there are indeed many common IM-related statistics, we will open a new project, turms-data, dedicated to data analysis. Together with the Turms server and turms-admin, it will provide log and database data collection, data warehouse construction, analysis and statistics of business indicators, result visualization, and other functions.

Note: since Turms was originally designed mainly for small instant messaging scenarios, all API query fields were backed by indexes at that time, which guaranteed query efficiency. After the project shifted to medium and large scenarios, many indexes were removed, but the query fields of the corresponding APIs (especially the statistics APIs) were not, so some APIs (especially the statistics APIs) still implement query parameters with full-table scans; this is legacy code. We will classify these APIs by implementation performance to ensure that the inefficient ones are not misused.

              + \ No newline at end of file diff --git a/docs/server/module/identity-access-management.html b/docs/server/module/identity-access-management.html index f25b3870..d147c88e 100644 --- a/docs/server/module/identity-access-management.html +++ b/docs/server/module/identity-access-management.html @@ -17,7 +17,7 @@ -

              Identity and Access Management

              Login authentication and authorization

              Turms not only provides a built-in identity and access management mechanism, but also supports user-defined identity and access management implementation based on plug-ins.

| Config name | Default value | Description |
| --- | --- | --- |
| turms.gateway.session.identity-access-management.enabled | true | Whether to enable the identity and access management mechanism. If false, both the built-in mechanism and any plugin-based custom implementation are turned off; any user is allowed to log in and is authorized to send any type of request |
| turms.gateway.session.identity-access-management.type | password | The type of built-in identity and access management mechanism to use. The type can be noop, password, jwt, or http. See below for details |

              Built-in identity and access management mechanism

              1. NOOP

              Turn off the built-in identity and access management mechanism, and allow any user to log in, and authorize it to send any request type.

              • turms.gateway.session.identity-access-management.type=noop

2. Password-based authentication

User authentication is based on the password stored in the user collection of the MongoDB database built by the Turms server. Authorization is not currently supported for this mechanism.

              • turms.gateway.session.identity-access-management.type=password

3. JWT-based authentication

              The JWT token contains the user's authentication and authorization information.

Workflow
• The client application requests a JWT token from your own server
• After the client application obtains the JWT token, it sends the JWT string to the turms-gateway server through the password field of the Turms client login interface turmsClient.userService.login
• After the turms-gateway server receives the JWT token, it verifies the token according to the algorithm specified in the token and the public key configuration (asymmetric algorithms: RS256, RS384, RS512, PS256, PS384, PS512, ES256, ES384, ES512) or secret key configuration (symmetric algorithms: HS256, HS384, HS512) that the developer configured on the turms-gateway server (a minimal verification sketch follows this list)
• If the developer has not configured the key for the algorithm specified by the JWT on the turms-gateway server, a corresponding error message is returned to the client to indicate that the algorithm is not supported
• If the JWT token passes verification, the user is authenticated and authorized according to the authentication and authorization information in the token
• If the JWT token fails verification, the corresponding error message is returned to the client
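
To make the verification step concrete, here is a minimal sketch of checking an HS256-signed token using only the JDK; it is illustrative only and is not the Turms server's actual implementation:

java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

public final class JwtHs256VerifySketch {

    /** Returns true if the token's HS256 signature matches the shared secret. */
    static boolean verifyHs256(String jwt, byte[] secret) throws Exception {
        String[] parts = jwt.split("\\.");
        if (parts.length != 3) {
            return false;
        }
        // The signature covers "<base64url(header)>.<base64url(payload)>"
        byte[] signingInput = (parts[0] + "." + parts[1]).getBytes(StandardCharsets.US_ASCII);
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(secret, "HmacSHA256"));
        byte[] expected = mac.doFinal(signingInput);
        byte[] actual = Base64.getUrlDecoder().decode(parts[2]);
        // Constant-time comparison to avoid leaking timing information
        return MessageDigest.isEqual(expected, actual);
    }
}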
              JWT body (Payload) format
              json
              {
                    "iss": string, // issuer
                    "sub": string, // subject
                    "aud": array<string>, // audience
              @@ -92,7 +92,7 @@
                        "resources": "*" // a string of ["*", "USER", "GROUP_BLOCKED_USER", ...], or an array that contains these strings
                    }]
               }

The meanings of the authenticated and statements fields are the same as those of the corresponding fields in the JWT payload above, so they are not repeated here.

| Config name | Default value | Description |
| --- | --- | --- |
| turms.gateway.session.identity-access-management.type | password | Set to http to enable identity and access management based on external HTTP responses |
| turms.service.message.check-if-target-active-and-not-deleted | true | When using the HTTP mechanism, this must be set to false; otherwise, because the user does not exist in the Turms database, the user will not be able to send messages |
| turms.gateway.session.identity-access-management.http.request.url | "" | Request URL |
| turms.gateway.session.identity-access-management.http.request.headers | true | Additional request headers |
| turms.gateway.session.identity-access-management.http.request.http-method | GET | Request method |
| turms.gateway.session.identity-access-management.http.request.timeout-millis | 30000 | Request timeout |
| turms.gateway.session.identity-access-management.http.authentication.response-expectation.status-codes | "2??" | Match the response status code against this value; if it matches, continue with the other matches, otherwise authentication fails |
| turms.gateway.session.identity-access-management.http.authentication.response-expectation.headers | | Match the response headers against this value; if it matches, continue with the other matches, otherwise authentication fails |
| turms.gateway.session.identity-access-management.http.authentication.response-expectation.body-fields | | Match the response body against this value; if it matches, continue with the other matches, otherwise authentication fails |

              Plug-in-based custom identity and access management implementation

              Authentication plugin interface: im.turms.gateway.infra.plugin.extension.UserAuthenticator

              Authorization plug-in interface: TODO

Readers can refer to the plugin implementation documentation and implement the plugin interfaces above.

              Authentication and authorization of business logic

The Turms server treats any permission information sent by a client as untrustworthy, so it performs all necessary permission checks itself according to the business configuration you set on the Turms server.

Take the "modify a sent message" function as an example. This behavior triggers a series of decisions: Turms first checks whether the target message was indeed sent by this user, and then decides whether the user is allowed to modify a sent message according to allowEditMessageBySender configured on the Turms server (true by default). If you set it to false, the client catches a ResponseException (Kotlin) or ResponseError (JavaScript/Swift) object, which is represented by the business status code model ResponseStatusCode (composed of a code and a reason description).

As another example, for a "simple" "send message" request, the Turms server checks whether the sender is active, whether "allow sending messages to strangers (non-contacts)" is enabled, and whether the sender is in the recipient's blocklist. If the recipient is a group, it also checks whether the sender is a member of the group and whether the sender is muted. All you need to do is call a sendMessage(...) interface.

              - + \ No newline at end of file diff --git a/docs/server/module/observability.html b/docs/server/module/observability.html index 88f968e5..d0f43af3 100644 --- a/docs/server/module/observability.html +++ b/docs/server/module/observability.html @@ -17,7 +17,7 @@ -

              Observability

To achieve high system reliability and give the system the ability to predict capacity and troubleshoot anomalies (such as detecting DDoS attacks), building the system's observability is very important. If a server provides no support for observability, then no matter how rich its features are, it is just a toy project.

Moreover, the derivative products of the observability system are an important corporate asset. If business operators neglect observability, they cannot effectively analyze user behavior and preferences, let alone optimize business strategies, which effectively means the company is giving up a considerable amount of value.

Like other conventional servers, Turms divides the concrete implementation of observability into three categories: metrics (aggregated values), logs (events), and tracing (request-oriented).

Metrics

Metrics are composed of aggregated values and are generally divided into system metrics, application metrics, and business metrics. System metrics observe the running status and trends of the system or container; application metrics observe the running status and trends of the JVM and the Turms application layer; business metrics observe the status and trends of the business. With the default sampling, they occupy only a very small amount of memory.

In terms of concrete implementation, Turms' metrics system is based on the mainstream metrics library Micrometer, and it exposes the endpoint /metrics to export JSON, /metrics/prometheus to export the OpenMetrics format, and /metrics/csv to export CSV.
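
As a rough illustration of registering such a metric with Micrometer (not Turms' actual registration code; the global registry and the demo setup are used here only for the example):

java
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.Metrics;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public final class MetricsSketch {
    public static void main(String[] args) {
        // Back the global registry with an in-memory registry for this demo
        Metrics.addRegistry(new SimpleMeterRegistry());
        // "user.logged_in" is one of the business metrics listed below
        Counter loggedIn = Metrics.counter("user.logged_in");
        loggedIn.increment();
        System.out.println("user.logged_in = " + loggedIn.count());
    }
}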

Note: there is also a type of statistic that is relatively resource-intensive, such as daily/weekly/monthly active users and user retention rate. Implementing these is routine, but such higher-level statistics are better suited to dedicated log services or products, so Turms does not provide this data directly.

              System Metrics

Cloud vendors also provide this type of metric; their metric coverage is usually richer, and storage, visualization, and analysis are available out of the box. Turms provides the following important metrics mainly to fulfill its responsibility as a server; they cannot meet the customization needs of cloud users and some other users. Users who can use cloud services should prefer them.

| Category | Name | Type | Meaning |
| --- | --- | --- | --- |
| Uptime | process.uptime | TimeGauge | How long the process has been running |
| | process.start.time | TimeGauge | Process start time |
| Processor | system.cpu.count | Gauge | Number of CPU cores available to the process |
| | system.load.average.1m | Gauge | System CPU load in the last minute |
| | system.cpu.usage | Gauge | Recent system CPU usage |
| | process.cpu.usage | Gauge | Recent process CPU usage |
| Memory | system.memory.total | Gauge | System physical memory size |
| | system.memory.free | Gauge | Available system physical memory |
| | system.memory.swap.total | Gauge | System swap memory size |
| | system.memory.swap.free | Gauge | Available system swap memory |
| Storage | disk.total | Gauge | Total storage capacity |
| | disk.free | Gauge | Available storage capacity |
| File descriptor | process.files.open | Gauge | Number of open file descriptors |
| | process.files.max | Gauge | Maximum number of open file descriptors |

Application Metrics

              JVM Metrics

              The following description is based on the HotSpot virtual machine. Turms does not provide official support for other virtual machines.

| Category | Name | Type | Meaning |
| --- | --- | --- | --- |
| GC | jvm.gc.max.data.size | Gauge | Maximum available heap memory in the old generation |
| | jvm.gc.live.data.size | Gauge | Memory occupied by the old generation after GC |
| | jvm.gc.memory.allocated | Counter | Total memory allocated in Eden |
| | jvm.gc.memory.promoted | Counter | Total memory allocated in the old generation |
| | jvm.gc.pause | Timer | GC pause time |
| Memory | jvm.buffer.count | Gauge | Number of buffers in each buffer pool |
| | jvm.buffer.memory.used | Gauge | Used memory of each buffer pool. Note: the off-heap memory used by the Turms application layer is recorded here |
| | jvm.buffer.total.capacity | Gauge | Total capacity of each buffer pool |
| | jvm.memory.used | Gauge | Used memory of each memory pool. Note: the off-heap memory used by the Turms application layer is not recorded here |
| | jvm.memory.committed | Gauge | Available (committed) memory of each memory pool |
| | jvm.memory.max | Gauge | Maximum memory of each memory pool |
| Thread | jvm.threads.peak | Gauge | Peak thread count |
| | jvm.threads.daemon | Gauge | Number of daemon threads |
| | jvm.threads.live | Gauge | Number of currently active threads |
| | jvm.threads.states | Gauge | Number of threads in each thread state |
| Class | jvm.classes.loaded | Gauge | Number of loaded classes |
| | jvm.classes.unloaded | Counter | Number of unloaded classes |

Note: Turms uses pooled off-heap memory (allocated through Netty's PooledByteBufAllocator) for network I/O. Because it deliberately does not release this off-heap memory but caches it to avoid inefficient allocation and release, the memory usage of Turms keeps rising and shows no overall downward trend. This is not a memory leak; Turms is simply caching this off-heap memory.

              Inter-cluster TCP connection metrics

For connection metrics, because the number of server nodes is limited, each metric uses the remote address of the TCP peer as a tag to distinguish the metric data of each peer, so that communication between nodes can be observed in more detail.

TCP server

| Category | Name | Type | Meaning |
| --- | --- | --- | --- |
| Connection | turms.node.tcp.server.data.received | DistributionSummary | Bytes received |
| | turms.node.tcp.server.data.sent | DistributionSummary | Bytes sent |
| | turms.node.tcp.server.errors | Counter | Number of times connection exceptions were triggered |
| | turms.node.tcp.server.tls.handshake.time | Timer | TLS handshake time |
| ByteBufAllocator (memory) | TODO | | |

TCP client

| Category | Name | Type | Meaning |
| --- | --- | --- | --- |
| Connection | turms.node.tcp.client.data.received | DistributionSummary | Bytes received |
| | turms.node.tcp.client.data.sent | DistributionSummary | Bytes sent |
| | turms.node.tcp.client.errors | Counter | Number of times connection exceptions were triggered |
| | turms.node.tcp.client.tls.handshake.time | Timer | TLS handshake time |
| | turms.node.tcp.client.connect.time | Timer | TCP connection establishment time |
| | turms.node.tcp.client.address.resolver | Timer | Address resolution time |
| ByteBufAllocator (memory) | TODO | | |

              RPC metrics

| Name | Type | Meaning |
| --- | --- | --- |
| rpc.request.subscribed | Counter | Number of times a given type of RPC request has been processed |
| rpc.request.flow.duration | Timer | Processing time of a given type of RPC request |

              Admin API Metrics

Because the number of administrator IPs is unbounded, these metrics do not use the remote address of the peer as a tag to distinguish the metric data of each peer.

| Category | Name | Type | Meaning |
| --- | --- | --- | --- |
| Connection | admin.api.data.received | DistributionSummary | Bytes received |
| | admin.api.data.sent | DistributionSummary | Bytes sent |
| | admin.api.errors | Counter | Number of times connection exceptions were triggered |
| | admin.api.tls.handshake.time | Timer | TLS handshake time |

              Turms Client Metrics

For connection metrics, because the number of clients is unbounded, these metrics do not use the remote address of the peer as a tag to distinguish the metric data of each peer. The connection metrics do, however, use the uri tag to distinguish the metric data of the three connection types TCP/UDP/WebSocket.

| Category | Name | Type | Meaning |
| --- | --- | --- | --- |
| Connection | turms.client.network.data.received | DistributionSummary | Bytes received |
| | turms.client.network.data.sent | DistributionSummary | Bytes sent |
| | turms.client.network.errors | Counter | Number of times connection exceptions were triggered |
| | turms.client.network.tls.handshake.time | Timer | TLS handshake time |
| | turms.client.network.connect.time | Timer | Connection establishment time |
| | turms.client.network.address.resolver | Timer | Domain name resolution time |
| Request | turms.client.request.subscribed | Counter | Number of times a given type of client request has been processed |
| | turms.client.request.flow.duration | Timer | Processing time of a given type of client request |
| ConnectionProvider (connection pool) | TODO | | |
| ByteBufAllocator (memory) | TODO | | |

              Business Metrics

              | Server | Name | Type | Meaning | | ------------- | --------------- | ------- | ----------- -| | turms-gateway | user.logged_in | Counter | Number of logged in users | | | user.online | Gauge | Number of online users | | turms-service | user.registered | Counter | Number of registered users | | | user.deleted | Counter | Number of deleted users | | | group.created | Counter | Number of created groups | | | group.deleted | Counter | Number of deleted groups | | | message.sent | Counter | Number of sent messages |

              logs

              Each log corresponds to the events that occur when the Turms server is running, and is used to track the running status of the system and generate high-latitude statistical data. There are two categories of logs in Turms, namely application logs and business logs. The number of application running logs is small and takes up little space, and follows the principle of precision and accuracy. However, the client API access log designed for business analysis is different. It is the basic data of most statistical data and an important asset of the enterprise. Therefore, Turms defaults and recommends 100% sampling of it, which consumes a lot of storage.

              Notice

              • The data format design of all logs, metrics and link tracking in Turms is designed with "simple and fast, convenient and fast query" and "accurate sampling, convenient for log service analysis", but Turms itself does not provide any log analysis function.

              • The log timestamp and log cutting of Turms are based on UTC time, not system time.

              • When Turms has FATAL level logs, manual intervention is required to fix them. The currently existing FATAL level log types are:

                • Detects that a table in the database has been dropped, or renamed.

                • It is detected that the file system storing the log is full and cannot continue to print the log.

                  Note: Turms cannot continue to print logs when it detects that the file system is full, so Turms will not print this FATAL level log until the user has not made enough space. Turms will optimize this later to ensure that the log can be printed out in a timely manner. Of course, since the current system is equipped with a monitoring system, when the operation and maintenance personnel receive a warning that the storage space exceeds the custom threshold, they should deal with it in advance.

              • Turms will continuously print the log and print the log into a file for storage in the file system. When the storage space of the file system is insufficient, Turms server will stop printing logs, but will not discard the logs, but will accumulate the logs in the memory, so when too many logs accumulated in the memory and the memory is insufficient, It will also trigger the automatic protection mechanism of the Turms server to reject all user requests, so as to prevent the Turms server from going down due to insufficient memory. Therefore, the operation and maintenance personnel must ensure that the system where the Turms server is located has sufficient storage space at all times.

                Further reading: Memory health detection mechanism of Turms server

              Self-developed implementation (expanding knowledge)

              reason

              1. Turms defaults and highly recommends 100% sampling of the client API, which requires efficient implementation of Logging
              2. The implementation of third-party Logging is too redundant, with low performance and high memory usage
              3. Avoid third-party Logging developers writing critical bugs like Remote code injection in Log4j due to lack of security common sense
              4. The log implementation of Turms is "almost no function implemented", and the implemented functions are also implemented according to almost the highest performance standard (we directly write the basic data of Java into DirectByteBuf, and directly write to the file descriptor , there is no string copy), so the throughput of this implementation can be several times higher than log4j2 async logger, and the memory overhead is several times higher

              Implementation

              The Turms log implementation is very streamlined, and only implements a few percent of the core functions of the standard log library. The main steps for printing logs are:

              For regular logs:

              * Call `im.turms.server.common.infra.logging.core.logger.AsyncLogger#doLog` function
              +    
              Skip to content

              Observability

To achieve high system reliability and give the system the ability to predict capacity and troubleshoot anomalies (such as detecting DDoS attacks), building an observability system is essential. If a server provides no support for observability, then no matter how rich its features are, it is little more than a toy project.

Moreover, the data derived from the observability system is an important asset of the enterprise. Business operators who ignore observability cannot effectively analyze user behavior and preferences, let alone optimize business strategies, which means giving up a considerable amount of value.

              Turms, like other conventional servers, divides the specific implementation of observability into three categories, namely: metrics (aggregated values), logs (events), link tracking (request-oriented).

Metrics

Metrics consist of aggregated values and are generally divided into system metrics, application metrics and business metrics. System metrics observe the running status and trends of the system or container; application metrics observe the running status and trends of the JVM and the Turms application layer; business metrics observe the status and trends of business development. With the default sampling configuration, metrics occupy only a very small amount of memory.

In terms of code implementation, the metrics system of Turms is built on the mainstream metrics library Micrometer, and provides the endpoints /metrics (JSON format), /metrics/prometheus (OpenMetrics format) and /metrics/csv (CSV format) to export metrics.
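For illustration, a minimal sketch of scraping the OpenMetrics endpoint with Java's built-in HttpClient (the host and port are assumptions; substitute the address of your server's admin HTTP API):

java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class MetricsScraper {
    public static void main(String[] args) throws Exception {
        // The address below is an assumption; point it at your Turms admin HTTP API
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8510/metrics/prometheus"))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // Each line of the body is an OpenMetrics sample that Prometheus can scrape directly
        System.out.println(response.body());
    }
}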

Note: there is also a type of statistical data that is relatively expensive to compute, such as daily/weekly/monthly active users and user retention rate. The implementation of these functions is routine, but such higher-level statistics are better suited to specialized log services or products, so Turms does not provide such data directly.

              System Metrics

Cloud service vendors also provide this type of metric, usually with richer measurement points, and with storage, visualization and analysis available out of the box. Turms provides the following important metrics mainly to fulfill its responsibility as a server; they cannot satisfy the customization needs of cloud users and some other users. Users who can use cloud services should prefer them.

| Category | Name | Type | Meaning |
| --- | --- | --- | --- |
| Uptime | process.uptime | TimeGauge | How long the process has been running |
| | process.start.time | TimeGauge | Process start time |
| Processor | system.cpu.count | Gauge | Number of available CPU cores for the process |
| | system.load.average.1m | Gauge | System CPU load in the last minute |
| | system.cpu.usage | Gauge | Recent system CPU usage |
| | process.cpu.usage | Gauge | Recent process CPU usage |
| Memory | system.memory.total | Gauge | System physical memory size |
| | system.memory.free | Gauge | Available physical memory size of the system |
| | system.memory.swap.total | Gauge | System swap memory size |
| | system.memory.swap.free | Gauge | Available swap memory size of the system |
| Storage | disk.total | Gauge | Total storage capacity |
| | disk.free | Gauge | Available storage capacity |
| FileDescriptor | process.files.open | Gauge | Number of open file descriptors |
| | process.files.max | Gauge | Maximum number of open file descriptors |

Application Metrics

              JVM Metrics

              The following description is based on the HotSpot virtual machine. Turms does not provide official support for other virtual machines.

| Category | Name | Type | Meaning |
| --- | --- | --- | --- |
| GC | jvm.gc.max.data.size | Gauge | Maximum available heap memory in the old generation |
| | jvm.gc.live.data.size | Gauge | Memory occupied by the old generation after GC |
| | jvm.gc.memory.allocated | Counter | Total memory allocated in Eden |
| | jvm.gc.memory.promoted | Counter | Total memory promoted to the old generation |
| | jvm.gc.pause | Timer | GC pause time |
| Memory | jvm.buffer.count | Gauge | Number of buffers in each memory buffer pool |
| | jvm.buffer.memory.used | Gauge | Used memory of each memory buffer pool. Note: the off-heap memory used by the Turms application layer is recorded here |
| | jvm.buffer.total.capacity | Gauge | Total capacity of each memory buffer pool |
| | jvm.memory.used | Gauge | Used memory of each memory pool. Note: the off-heap memory used by the Turms application layer is not recorded here |
| | jvm.memory.committed | Gauge | Committed memory of each memory pool |
| | jvm.memory.max | Gauge | Maximum memory of each memory pool |
| Thread | jvm.threads.peak | Gauge | Peak number of threads |
| | jvm.threads.daemon | Gauge | Number of daemon threads |
| | jvm.threads.live | Gauge | Number of currently live threads |
| | jvm.threads.states | Gauge | Number of threads in each thread state |
| Class | jvm.classes.loaded | Gauge | Number of loaded classes |
| | jvm.classes.unloaded | Counter | Number of unloaded classes |

Note: Turms uses off-heap memory from the memory pool (that is, it allocates off-heap memory through Netty's PooledByteBufAllocator) when performing network I/O operations. It deliberately does not release this off-heap memory but caches it, to avoid inefficient allocate-and-release cycles. As a result, the memory usage of Turms keeps rising and shows no overall downward trend. This is not a memory leak; Turms is simply caching the off-heap memory.
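A small illustration of the pooled off-heap allocation behavior described above (a generic Netty sketch, not Turms code): releasing a pooled direct buffer returns it to the pool rather than to the operating system, which is why process memory stays high.

java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;

public class PooledBufferDemo {
    public static void main(String[] args) {
        // Allocate off-heap (direct) memory from Netty's default pooled allocator
        ByteBuf buf = PooledByteBufAllocator.DEFAULT.directBuffer(1024);
        try {
            buf.writeBytes(new byte[]{1, 2, 3});
        } finally {
            // Returns the buffer to the pool; the memory is cached, not given back to the OS
            buf.release();
        }
    }
}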

Inter-node TCP connection metrics

For connection metrics, because the number of server nodes is limited, each metric uses the remote address of the TCP peer as a tag to distinguish the data of each peer, so that communication between nodes can be observed in more detail.

TCP server

| Category | Name | Type | Meaning |
| --- | --- | --- | --- |
| Connection | turms.node.tcp.server.data.received | DistributionSummary | Number of bytes received |
| | turms.node.tcp.server.data.sent | DistributionSummary | Number of bytes sent |
| | turms.node.tcp.server.errors | Counter | Number of times connection exceptions were triggered |
| | turms.node.tcp.server.tls.handshake.time | Timer | TLS handshake time |
| ByteBufAllocator (memory) | TODO | | |

TCP client

| Category | Name | Type | Meaning |
| --- | --- | --- | --- |
| Connection | turms.node.tcp.client.data.received | DistributionSummary | Number of bytes received |
| | turms.node.tcp.client.data.sent | DistributionSummary | Number of bytes sent |
| | turms.node.tcp.client.errors | Counter | Number of times connection exceptions were triggered |
| | turms.node.tcp.client.tls.handshake.time | Timer | TLS handshake time |
| | turms.node.tcp.client.connect.time | Timer | TCP connection establishment time |
| | turms.node.tcp.client.address.resolver | Timer | Address resolution time |
| ByteBufAllocator (memory) | TODO | | |

              RPC metrics

| Name | Type | Meaning |
| --- | --- | --- |
| rpc.request.subscribed | Counter | Number of times a certain type of RPC request has been processed |
| rpc.request.flow.duration | Timer | Processing time of a certain type of RPC request |

              Admin API Metrics

Because the number of administrator IPs is unbounded, these metrics do not use the remote address of the peer as a tag to distinguish the data of each peer.

| Category | Name | Type | Meaning |
| --- | --- | --- | --- |
| Connection | admin.api.data.received | DistributionSummary | Bytes received |
| | admin.api.data.sent | DistributionSummary | Bytes sent |
| | admin.api.errors | Counter | Number of times connection exceptions were triggered |
| | admin.api.tls.handshake.time | Timer | TLS handshake time |

              Turms Client Metrics

For connection metrics, because the number of clients is unbounded, each metric does not use the remote address of the peer as a tag to distinguish the data of each peer. In addition, the connection metrics use the tag uri to distinguish the data of the three connection types: TCP, UDP and WebSocket.

| Category | Name | Type | Meaning |
| --- | --- | --- | --- |
| Connection | turms.client.network.data.received | DistributionSummary | Bytes received |
| | turms.client.network.data.sent | DistributionSummary | Bytes sent |
| | turms.client.network.errors | Counter | Number of times connection exceptions were triggered |
| | turms.client.network.tls.handshake.time | Timer | TLS handshake time |
| | turms.client.network.connect.time | Timer | Connection establishment time |
| | turms.client.network.address.resolver | Timer | Domain name resolution time |
| Request | turms.client.request.subscribed | Counter | Number of times a certain type of client request has been processed |
| | turms.client.request.flow.duration | Timer | Processing time of a certain type of client request |
| ConnectionProvider (connection pool) | TODO | | |
| ByteBufAllocator (memory) | TODO | | |

              Business Metrics

| Server | Name | Type | Meaning |
| --- | --- | --- | --- |
| turms-gateway | user.logged_in | Counter | Number of logged-in users |
| | user.online | Gauge | Number of online users |
| turms-service | user.registered | Counter | Number of registered users |
| | user.deleted | Counter | Number of deleted users |
| | group.created | Counter | Number of created groups |
| | group.deleted | Counter | Number of deleted groups |
| | message.sent | Counter | Number of sent messages |

Logs

Each log entry corresponds to an event that occurs while the Turms server is running, and is used to track the running status of the system and to generate high-level statistics. There are two categories of logs in Turms: application logs and business logs. Application logs are few in number, take up little space, and follow the principle of being precise and accurate. The client API access log designed for business analysis is different: it is the basis of most statistics and an important asset of the enterprise, so Turms defaults to, and recommends, 100% sampling of it, which consumes a lot of storage.

              Notice

• The data formats of all logs, metrics and tracing data in Turms are designed to be "simple and fast to query" and "accurately sampled and convenient for log services to analyze", but Turms itself does not provide any log analysis functionality.

• Log timestamps and log rotation in Turms are based on UTC time, not system time.

              • When Turms has FATAL level logs, manual intervention is required to fix them. The currently existing FATAL level log types are:

  • A table in the database is detected to have been dropped or renamed.

  • The file system storing the logs is detected to be full, so logs can no longer be written.

    Note: when the file system is full, Turms cannot write logs at all, so this FATAL log cannot actually be printed until the user frees up enough space. Turms will optimize this later to ensure the log can be emitted in time. Of course, since production systems are usually equipped with monitoring, operation and maintenance personnel should act in advance when they receive a warning that storage usage exceeds the configured threshold.

• Turms continuously writes logs to files in the file system. When the file system runs out of space, the Turms server stops writing logs but does not discard them; instead it accumulates them in memory. When too many logs accumulate and memory becomes insufficient, the automatic protection mechanism of the Turms server is triggered and all user requests are rejected, to prevent the server from going down due to insufficient memory. Therefore, operation and maintenance personnel must ensure that the system hosting the Turms server always has sufficient storage space.

                Further reading: Memory health detection mechanism of Turms server

Self-developed implementation (extended knowledge)

Reasons

1. Turms defaults to, and highly recommends, 100% sampling of client API access logs, which requires a highly efficient logging implementation
2. Third-party logging implementations are bloated, with lower performance and higher memory usage
3. Avoid critical bugs introduced by third-party logging developers who lack security common sense, such as the remote code injection vulnerability in Log4j
4. The log implementation of Turms implements "almost no features", and the features it does implement follow close to the highest performance standard (Java primitive data is written directly into a DirectByteBuf and then directly to the file descriptor, with no string copies), so its throughput can be several times higher than the log4j2 async logger while its memory overhead is several times lower

              Implementation

The Turms log implementation is very lean and implements only a few percent of the core features of a standard logging library. The main steps for writing a log are:

              For regular logs:

* Call the `im.turms.server.common.infra.logging.core.logger.AsyncLogger#doLog` function
 * Internally, `doLog` allocates a block of off-heap memory through `PooledByteBufAllocator.DEFAULT`, traverses the message template, writes non-placeholder parts directly into this memory, replaces placeholders with the specific parameters, and finally puts this buffer into the MPSC queue for log processing (based on jctools' `MpscUnboundedArrayQueue`)
 * When the log processing thread detects a new log (that is, the `ByteBuffer` object), it writes the off-heap memory to an NIO `FileChannel` (which can be a console or a file); at the operating system level, `pwrite` is used to write the off-heap memory directly to the file descriptor

For various API logs (such as client API logs), a more customized implementation is used (see the sketch after this list):

• The caller writes API information (such as client IP, request size, etc.) directly into a DirectByteBuf and passes this buffer to the AsyncLogger#doLog function
• The doLog function writes the common template information of the log (such as timestamp, node ID, etc.) into another DirectByteBuf and splices it with the buffer above into a CompositeByteBuf
• When the log processing thread detects a new log (that is, the CompositeByteBuf object), it writes the off-heap memory to an NIO FileChannel (which can be a console or a file); at the operating system level, pwrite is used to write the off-heap memory directly to the file descriptor
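A minimal sketch of this producer/consumer pipeline, assuming jctools and Netty on the classpath (the class below is illustrative and heavily simplified; it is not the actual Turms AsyncLogger):

java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;
import org.jctools.queues.MpscUnboundedArrayQueue;

import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SketchAsyncLogger {
    private final MpscUnboundedArrayQueue<ByteBuf> queue = new MpscUnboundedArrayQueue<>(1024);
    private final FileChannel channel;

    public SketchAsyncLogger(Path file) throws Exception {
        channel = FileChannel.open(file, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        Thread consumer = new Thread(this::drain, "log-consumer");
        consumer.setDaemon(true);
        consumer.start();
    }

    // Producer side: encode the message into pooled off-heap memory and enqueue it
    public void log(String message) {
        ByteBuf buf = PooledByteBufAllocator.DEFAULT.directBuffer(message.length() + 1);
        buf.writeCharSequence(message, StandardCharsets.US_ASCII);
        buf.writeByte('\n');
        queue.offer(buf);
    }

    // Consumer side: a single thread drains the MPSC queue and writes buffers to the file
    private void drain() {
        for (;;) {
            ByteBuf buf = queue.poll();
            if (buf == null) {
                Thread.onSpinWait();
                continue;
            }
            try {
                // Write the direct buffer to the file descriptor without copying into a String
                buf.readBytes(channel, buf.readableBytes());
            } catch (Exception e) {
                // A real implementation would handle I/O errors and backpressure here
            } finally {
                buf.release(); // return the pooled off-heap memory
            }
        }
    }
}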

With this design, the log-writing performance of Turms is pushed close to the practical limit.

Additional notes

• There is an even more efficient way to write: bypass the NIO FileChannel and call the underlying JNI implementation directly to write to the file descriptor. However, considering code maintainability, and the fact that Java does not expose these low-level functions by default, this approach is not adopted.

• The memory mentioned above is allocated through PooledByteBufAllocator.DEFAULT without an upper limit on memory usage, and Turms "dares" to use MpscUnboundedArrayQueue to store logs without limiting its maximum capacity. This is because the Turms server has its own memory management mechanism, which guarantees an upper bound on memory usage while gradually releasing memory that is no longer used.

• Turms does not support, and will not support, styling console text. Styling requires ANSI escape codes, which the log files do not need to store; implementing this would require maintaining one ByteBuf for the console and another for the log file, doubling the memory consumed per log entry, so it is not considered.

  In addition, developers can use third-party tools or plugins, such as the Grep Console plugin for IntelliJ IDEA, to add styles to the Turms server console logs.

              • About "why there are garbled characters when printing non-ASCII characters", this is because:

                background:

                • The byte[] value inside the Java 21 String class has and can only store LATIN-1 or UTF-16 encoded data
                • Turms server itself has and only prints ASCII characters (Turms server will not print any text entered by users or administrators)
                • Log printing is a frequently used function, meaningless memory copying is absolutely prohibited.

                In the above background, when Turms prints String, it does not get its byte data through getBytes("UTF-8"), but directly obtains the internal LATIN-1 of String through Unsafe Or UTF-16 encoded byte data, so the log file may be LATIN-1 and UTF-16 mixed encoding.

                When users view log files in UTF-8 encoding, ASCII characters in LATIN-1 encoding can be displayed correctly, and ASCII characters in UTF-16 encoding can also be displayed, but each ASCII character will be more With a null character (binary encoding 0000 0000), characters that are not compatible with other encodings will be displayed as garbled characters, so if the Turms server prints non-ASCII characters, the user will see garbled characters.

                In addition, unless Java supports storing UTF-8 encoded byte data in the future, the Turms server will not consider using an inefficient implementation such as getBytes("UTF-8").
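A small demonstration of the encoding effect described above (it uses the public getBytes API purely for illustration; the class name is hypothetical, not from Turms):

java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingDemo {
    public static void main(String[] args) {
        String ascii = "OK";
        // LATIN-1 bytes: [79, 75] - displays correctly in a UTF-8 viewer
        System.out.println(Arrays.toString(ascii.getBytes(StandardCharsets.ISO_8859_1)));
        // UTF-16BE bytes: [0, 79, 0, 75] - each ASCII character is paired with a null byte,
        // which is the extra character users see when viewing the file as UTF-8
        System.out.println(Arrays.toString(ascii.getBytes(StandardCharsets.UTF_16BE)));
    }
}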

In summary, these notes once again confirm what we have repeatedly mentioned in every chapter: for a server that pursues performance, "more features" is often a drawback.

              Reasons for not using JSON format

With the development of microservices, logs in JSON format are becoming more and more popular; for example, MongoDB started to support JSON-format logs in version 4.4. Using the JSON format has three main advantages:

• It greatly unifies the log format across servers, which is especially valuable for companies with dozens to thousands of heterogeneous services that can mandate a JSON log format for every project
• Every programming language has good support for JSON, so there is almost no difficulty in printing or parsing the logs
• The log services of the various cloud vendors support JSON-format logs well, out of the box

              The reasons why the Turms server does not use the JSON format are:

• The structure of the Turms server is very simple, so there is no need to unify the log format through JSON.
• JSON serialization requires additional memory and CPU resources and has a large storage overhead; using compression consumes yet more CPU. In particular, the CPU needed for serialization plus compression can be even higher than the CPU the Turms server needs to process business requests, which is unacceptable for Turms.
• The JSON format is actually not very readable as raw data. Because a log is displayed as a single line representing one event, a single-line JSON record brings a lot of "noise": JSON metadata, keys and values are interleaved, making the raw data laborious to read. In contrast, the client API access log of the Turms server separates fields with the | delimiter; users only need to read a few logs the first time to know what each field represents.

Of course, adopting the traditional single-line format makes cloud-service parsing relatively more complicated and the configuration less flexible. However, since such parsing is configured once and for all, and considering the points above, the Turms server does not use the JSON format for its logs and keeps the traditional single-line format.

              Category

              GC log

Used for JVM performance testing, analysis and tuning, and for troubleshooting and locating problems.

The JVM GC configuration of the turms-gateway server is: -Xlog:gc*,gc+age=trace,safepoint:file=${TURMS_GATEWAY_HOME}/log/turms-gateway-gc.log:utctime,pid,tags:filecount=32,filesize=32m

The JVM GC configuration of the turms-service server is: -Xlog:gc*,gc+age=trace,safepoint:file=${TURMS_SERVICE_HOME}/log/turms-service-gc.log:utctime,pid,tags:filecount=32,filesize=32m

              Server running log

Describes the main events that occur in the Turms server, such as transitions of the RPC connection state, server-side errors during request processing, etc.

File name: turms-gateway.log (turms-gateway server); turms-service.log (turms-service server)

Composition: event time, log level, server type, node ID, trace ID, thread, class, message. The main purpose of the server information is to identify the source node of a log entry during distributed log collection. Other log types also use this format (except that client API access logs and notification logs do not record the "class" field); they only use a customized message format in the "message" part.

Format: %d{${sys:LOG_DATEFORMAT_PATTERN}}{GMT+0} ${sys:LOG_LEVEL_PATTERN} ${myctx:NODE_TYPE} ${myctx:NODE_ID} %-19.19X{traceId} %t %-40.40c{1.} : %m%n${sys:LOG_EXCEPTION_CONVERSION_WORD}

Parsing Regex: (?P<time>\d{4}-\d{2}-\d{2}\s\d{1,2}\:\d{2}\:\d{2}\.\d{3})\s+(?P<level>[A-Z]{4,5})\s+(?P<node_type>[A-Z])\s+(?P<node_id>\S*)\s+\[(?P<trace_id>.{19})\]\s+(?P<thread>\S*)\s+(?P<class>\S*)\s+:\s(?P<msg>.*)
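For illustration, a sketch of applying this pattern in Java (Java named groups use (?<name>...) rather than (?P<name>...) and cannot contain underscores; the sample line below is synthetic and only meant to match the pattern):

java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogLineParser {
    private static final Pattern LOG_PATTERN = Pattern.compile(
            "(?<time>\\d{4}-\\d{2}-\\d{2}\\s\\d{1,2}:\\d{2}:\\d{2}\\.\\d{3})\\s+"
                    + "(?<level>[A-Z]{4,5})\\s+(?<nodeType>[A-Z])\\s+(?<nodeId>\\S*)\\s+"
                    + "\\[(?<traceId>.{19})\\]\\s+(?<thread>\\S*)\\s+(?<clazz>\\S*)\\s+:\\s(?<msg>.*)");

    public static void main(String[] args) {
        // Synthetic line for illustration only; real lines come from the turms-*.log files
        String line = "2021-09-03 00:08:22.537 INFO S lkcu0f1m [1234567890123456789]"
                + " main TurmsApplication : Started";
        Matcher matcher = LOG_PATTERN.matcher(line);
        if (matcher.matches()) {
            System.out.println(matcher.group("level") + " from node " + matcher.group("nodeId")
                    + ", trace " + matcher.group("traceId") + ": " + matcher.group("msg"));
        }
    }
}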

              Example:

              spreadsheet
              
              @@ -35,7 +35,7 @@
               2021-09-03 00:08:37.636 INFO G hkivjeav 8332948877634499289 -client-io-15-3 : 190|1|0||19|UPDATE_TYPING_STATUS_REQUEST
              turms-service server

              File name: turms-service-notification.log

Format: notification trigger user ID|sending status|number of notification target users|session close status code|notification size|notification forwarding request ID|notification forwarding request type. Where:

• Notification trigger user information: ID of the user who triggered the notification
• Notification receiving user information: number of notification target users, number of online receiving users
• Notification information: session close status code, notification size
• Notification forwarding request information: ID of the forwarded request, type of the forwarded request

              Example:

              spreadsheet
2021-09-03 00:08:22.537 INFO S hkivjeav 3166178398923546492 -client-io-15-3 : 149|1|1||75|4971734074638762694|UPDATE_FRIEND_REQUEST_REQUEST
2021-09-03 00:08:37.636 INFO S hkivjeav 8332948877634499289 -client-io-15-3 : 190|1|0||19|6469201046445182337|UPDATE_TYPING_STATUS_REQUEST

              Slow log

              TODO

              Collection and Analysis

Turms only provides the raw data; it does not provide, and does not plan to provide, log collection and analysis functions.

Reasons

• Cloud vendors now provide advanced services for log collection, parsing, storage, retrieval, analysis and alerting. Through SQL-style queries, various high-level statistics and charts can be obtained (such as daily active users, monthly active users, daily message volume, session duration, proportion of new sessions, retention rate and other operational data). Precisely because this approach has become one of the industry's best practices, Turms itself does not provide relatively complex functions that are better suited to big-data projects.
• Log collection techniques are routine. However, from a business-value perspective, it is hard to plan reasonably which logs should be collected, which fields should be indexed, which logs should be analyzed in real time and which offline; these questions are tied directly to business value and cost. Therefore, Turms can only give suggestions rather than intervene directly.
• Log-related services and products compete with each other, and the log-related implementation of the Turms server should remain neutral. Therefore, the Turms server does not integrate any SDK and only provides the raw logs for log services to collect.
• From the perspective of microservice responsibilities, the functions of the Turms server should not be overly coupled.

Link tracking

Role

Request-oriented; used to quickly track the execution of a request across nodes and within a specific node.

Implementation

The tracing specification [OpenTracing](https://opentracing.io/specification) stipulates that Trace and Span should be used as the units of link tracking. However, compared with systems made up of dozens, hundreds or even thousands of microservices, the call chain of Turms is extremely simple, and there is no need to track requests with Span information. Moreover, if Turms implemented standard OpenTracing, the extra tracing information attached to many requests would be even larger than the body of most RPC requests.

Therefore, Turms only adds a trace ID field to all logs. When performing link tracking, developers only need to query the trace ID field to understand all the nodes a request passes through, as well as its execution within each node.

              Monitoring and alarming

In an observable system, the running status of the servers needs to be monitored in real time based on metrics and logs, and alert notifications need to be sent when an anomaly is found.

Turms does not provide, and does not plan to provide, an alerting function. On the one hand, cloud services such as AWS CloudWatch and related products provide extremely rich, mature, out-of-the-box collection, analysis and alerting for metrics and logs; for users familiar with cloud products, it usually takes only 3 to 10 minutes to go from purchasing the services to having monitoring and alerting for Turms in place. On the other hand, from the perspective of microservice responsibilities, the functions of the Turms server should not be overly coupled, so there is no need to integrate these monitoring and alerting functions.

Even users who do not plan to use cloud services can use professional and mature open source solutions such as Prometheus Alertmanager. For users familiar with the relevant operations, it usually takes only 10 to 60 minutes to build such a system from scratch.

              - + \ No newline at end of file diff --git a/docs/server/module/security.html b/docs/server/module/security.html index 55e4259b..278fbc9a 100644 --- a/docs/server/module/security.html +++ b/docs/server/module/security.html @@ -17,7 +17,7 @@ -
                Security

                Client Security

                For security reasons, this article does not describe CC attacks for which Turms does not provide a special defense mechanism.

                Client blacklist mechanism

                The server's handling of banned clients

When turms-gateway detects that an IP or user ID has just been blocked, it first sends a Turms business-layer session close notification with the USER_IS_BLOCKED status code to the established sessions of the blocked client, telling the client that it has been banned. After the data is flushed, the Turms server automatically closes the underlying TCP connection.

When turms-gateway detects that the peer IP of a newly established TCP connection has been banned, or that the user ID sending a login request has been banned, by default it directly closes the TCP connection without sending any notification about the reason for the closure, such as "Your IP/user ID has been blocked for XX time".

                There are two points to note:

• turms-gateway itself cannot refuse a connection from a banned IP before the TCP connection is established. If you want the server to reject banned IPs before the TCP handshake, you can use the callback plugins provided by Turms to notify the security system of your cloud service to ban the IP, thereby implementing a complete IP ban.

  In addition, the reason why Turms does not call system services to block the IP completely is that: when the server is forcibly shut down, the blocked IPs would not be automatically removed; and modifying the underlying network configuration ourselves may conflict with the cloud service's own network management services and cause server anomalies.

• When a client connects or logs in, turms-gateway actively disconnects connections from banned IPs or users, but does not send a notification about the reason for closing the connection. The advantages of this are: 1. cloud bandwidth is billed by outbound traffic while inbound traffic is free, so not sending a business-layer response reduces the bandwidth cost incurred during DDoS attacks; 2. it reduces information exposure and avoids giving useful information to attackers.

                Automatic ban mechanism

The situations that currently trigger automatic detection and banning of clients are:

• When a user sends requests too frequently, reaching a certain number of times

• When the WebSocket frames sent by a user do not conform to the specification or are too large, reaching a certain number of times. The size of a request is based on the Payload Length value in the WebSocket frame header

• When the Turms client requests sent by a user cannot be parsed or are too large, reaching a certain number of times. The size of a request is based on the Payload Length value of the client request header in the TCP byte stream

  Additional notes:

  • When the server detects that a data frame or client request is "too large", it does not continue to parse the subsequent payload. If the declared Payload Length does not match the actual payload length, the request is judged to be illegal
  • The specific request size limit can be configured through turms.gateway.client-api.max-request-size-bytes

                In other words, after the TCP connection is established, any behavior of the user may trigger a ban.

The automatic ban mechanism of Turms is graded and provides 3 levels by default, whose ban durations are 1 minute, 30 minutes and 60 minutes. Under the default configuration, when a client triggers 5 illegal behaviors, the server blocks the client's IP and user ID with the level-1 configuration. If a certain number of illegal behaviors are triggered again within the ban period, the next ban level is applied, and so on.

To modify the default configuration, use the properties under the turms.security.blocklist.ip.auto-block and turms.security.blocklist.user-id.auto-block prefixes, with the help of IDEA's autocompletion. The specific configuration items are declared in the im.turms.server.common.infra.property.env.common.security.AutoBlockItemProperties class.

Administrators can use the APIs /blocked-clients/ips and /blocked-clients/users to add, delete, modify and query blocked IPs and blocked user IDs respectively. The specific operations follow the general rules of the Turms HTTP interface design, so they are not repeated here.

Implementation principle of banning (extended knowledge)

The principle of synchronizing blocked-client data is similar to common distributed Replicated Map implementations: each server holds a weakly consistent local copy of the map, while one or more Redis servers store the reference copy and also record a log of every block and unblock action so that each server can perform incremental synchronization. When a new server comes online, or a server's local log data lags behind by 100,000 records, that server requests a full synchronization from Redis; otherwise a server only needs to request the incremental logs from Redis, at a default interval of 10 seconds, to synchronize its local copy.

In addition, the causal consistency currently implemented by Turms works as follows: the order of block and unblock actions is determined by the insertion order of the block-log queue in Redis, and each server applies the logs in queue order, which guarantees the eventual consistency of the blocked-client data.
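A highly simplified sketch of this synchronization loop (BlocklistStore and BlockEvent are hypothetical abstractions standing in for the Redis-backed reference copy; this is not the Turms implementation):

java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BlocklistReplica {
    private final Map<Long, Long> blockedUserIdToExpireTime = new HashMap<>();
    private long lastAppliedLogId;

    // Invoked periodically (the default sync interval mentioned above is 10 seconds)
    public void sync(BlocklistStore store) {
        long latestLogId = store.getLatestLogId();
        if (latestLogId - lastAppliedLogId > 100_000) {
            // The local copy lags too far behind: request a full snapshot
            blockedUserIdToExpireTime.clear();
            blockedUserIdToExpireTime.putAll(store.getFullSnapshot());
            lastAppliedLogId = latestLogId;
            return;
        }
        // Otherwise apply the incremental block/unblock logs in insertion order,
        // which is what provides the eventual consistency described above
        List<BlockEvent> events = store.getLogsAfter(lastAppliedLogId);
        for (BlockEvent event : events) {
            if (event.isBlock()) {
                blockedUserIdToExpireTime.put(event.userId(), event.expireTime());
            } else {
                blockedUserIdToExpireTime.remove(event.userId());
            }
            lastAppliedLogId = event.logId();
        }
    }

    public boolean isBlocked(long userId) {
        Long expireTime = blockedUserIdToExpireTime.get(userId);
        return expireTime != null && expireTime > System.currentTimeMillis();
    }

    // Hypothetical abstraction over the reference copy and its log queue
    public interface BlocklistStore {
        long getLatestLogId();
        Map<Long, Long> getFullSnapshot();
        List<BlockEvent> getLogsAfter(long logId);
    }

    public record BlockEvent(long logId, long userId, long expireTime, boolean isBlock) {}
}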

                Why not use Bloom Filter

The theoretical approach of implementing a blacklist with a Bloom Filter is widely known, but in practice a Bloom Filter has many pitfalls in this scenario. Specifically:

• The features and engineering practices that a Bloom Filter supports are limited. For example:

  • In a distributed environment, how to determine the order of "block" and "unblock" operations, and how to guarantee eventual consistency
  • How to set different ban durations (such as five minutes or half an hour) for different banned users
  • How to attach additional information to banned users, such as the reason for being blocked
  • How to synchronize the blacklist between nodes, and how to do incremental synchronization
  • How to implement the "unblock" operation, and at what cost

  In short, even the most basic functions of a blacklist system cannot be realized by a Bloom Filter alone in a distributed environment, and even if they were barely realized with the help of other engineering practices, the advantages of the Bloom Filter itself would be gone.

• The amount of blocked-user data is itself very small, so a Bloom Filter cannot play to its strengths. Even simply to judge whether a user has been blocked, storing 1 million banned user IDs takes about 12MiB or 61.4MiB of memory in total (as an aside, this example also confirms what we mentioned in the article about the Valhalla project: Java's waste of memory feels a little "self-defeating"). Because thread-safe collections are usually used in practice, and most thread-safe Sets are implemented on top of a Map, thread-safe Maps are used uniformly below:

java
public static void main(String[] args) {
     int number = 1_000_000;
     // pre-size both thread-safe maps for 1,000,000 banned user IDs
     var map1 = ConcurrentHashMap.newKeySet((int) (number / 0.75F + 1.0F));
     var map2 = new NonBlockingHashMapLong<>(number);
                  @@ -76,7 +76,7 @@
                            1        40        40   org.jctools.maps.NonBlockingHashMapLong
                            1        64        64   org.jctools.maps.NonBlockingHashMapLong$CHM
                           11            12583464   (total)
• There is the problem of false positives: a Bloom Filter may misjudge a client that has never been blocked as blocked

Client API rate limiting (anti-abuse)

The rate limiting implementation of turms-gateway adopts the mainstream token bucket algorithm (for example, AWS API Gateway implements traffic shaping with the token bucket algorithm).

                Basics

Whatever the algorithm, it needs to calculate the "number of requests allowed". For a unified description below, the word "token" refers to this "number of requests allowed". In addition, the table below describes the general implementation of each algorithm; variants do not change the essence of the algorithms, so they are not discussed.

| | Fixed time window algorithm | Sliding time window algorithm | Token bucket algorithm | Leaky bucket algorithm |
| --- | --- | --- | --- | --- |
| Token cap | Fixed or dynamic cap (usually fixed) | Fixed or dynamic cap (usually fixed) | Fixed or dynamic cap (usually fixed) | Fixed or dynamic cap (usually fixed) |
| Current number of available tokens | Calculated from a single time interval | Calculated from multiple time intervals | Calculated from the current number of tokens in stock | Calculated from the current number of tokens in stock |
| Token issuance interval | Coarse-grained interval issuance (e.g. every 1 minute) | Fine-grained interval issuance (e.g. every 15 seconds) | Fine-grained interval issuance (e.g. every 1 second) | Fine-grained interval issuance (e.g. every 1 second) |
| Clear count on token issue | Yes | Yes, but generally only the earliest few windows are cleared | No | No |
| Resource overhead | No timer required, minimal overhead | No timer required, minimal overhead | No timer required, minimal overhead | Each session needs to maintain an MPSC synchronization queue plus a timer to poll the queue, so the overhead is very high |
| Difficulty of implementation | Very simple | Very simple | Very simple | Relatively troublesome |
| General comment | Because the count must be cleared and the granularity is too coarse, the client can burst a large number of requests right before each token issuance, causing the "double burst traffic" problem | Avoids the "double burst traffic" problem, but because of the "clear count" operation its control precision is not as good as the token bucket and leaky bucket algorithms | Can handle burst requests through stocked tokens and smoothly throttle requests through fine-grained token issuance. The CPU credit mechanism of cloud services is similar to this | Slightly longer, see below |

Both the leaky bucket algorithm and the token bucket algorithm can absorb burst requests and throttle requests smoothly. A distinctive capability of the leaky bucket algorithm is that it can also limit the load on downstream services (most importantly the database). However, downstream throttling comes at a price: it requires operation and maintenance personnel to estimate downstream throughput accurately, otherwise the downstream service may sit idle while the upstream service is throttling.

In addition, using MPSC queues to buffer requests not only reduces throughput, but also increases memory overhead and GC frequency, which hurts user experience and amplifies the effect of DDoS attacks. (Note: reading the Turms server source code, you will find that the request-processing path is kept as "light" as possible, so maintaining an MPSC queue for every user session is considered a heavy operation.)

                In summary, the Turms server finally uses the token bucket algorithm.
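For illustration, a minimal token bucket sketch (not the Turms implementation): tokens are refilled lazily based on elapsed time, which is why no timer thread is needed, matching the "no timer required" property in the table above.

java
public final class TokenBucket {
    private final int capacity;          // maximum number of stored tokens
    private final int refillPerInterval; // tokens added per interval
    private final long intervalMillis;   // refill interval in milliseconds
    private double tokens;
    private long lastRefillTime;

    public TokenBucket(int capacity, int refillPerInterval, long intervalMillis) {
        this.capacity = capacity;
        this.refillPerInterval = refillPerInterval;
        this.intervalMillis = intervalMillis;
        this.tokens = capacity;
        this.lastRefillTime = System.currentTimeMillis();
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        long elapsedIntervals = (now - lastRefillTime) / intervalMillis;
        if (elapsedIntervals > 0) {
            // Lazy refill: catch up for the intervals that have passed since the last call
            tokens = Math.min(capacity, tokens + elapsedIntervals * refillPerInterval);
            lastRefillTime += elapsedIntervals * intervalMillis;
        }
        if (tokens >= 1) {
            tokens -= 1;
            return true;  // request allowed
        }
        return false;     // request rejected (rate limited)
    }
}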

In particular, compared with a traditional HTTP server, the CPU and memory resources needed to receive and process a regular HTTP request and response can be hundreds of times the resources needed for an interaction between the Turms server and its client (for example, excluding the network-layer protocol headers, the average size of a Turms client request is about 32 bytes). Therefore, there is no need to take occasional bursts of Turms client requests from a small number of users too seriously; the resources used to process hundreds of Turms client requests may be similar to those used to process a single HTTP request (of course, other forms of CC attack can still cause significant resource consumption).

Other notes:

• turms-gateway does not support, and currently does not plan to support, global rate limiting. The reason is that global rate limiting is usually over-engineering: it is meant to mitigate DDoS attacks at all times, yet it adds Redis as a point of failure and reduces the request-processing throughput of the whole system, so it often does more harm than good
• Turms does not currently support assigning different weights to different types of requests, for example a login request costing 3 tokens and a message request costing 1 token
• turms-gateway supports updating the token bucket configuration at runtime with zero downtime

                User Information Security

For most users who have been on the Internet for some years, unless they have a strong sense of security, their plaintext passwords have very likely already been leaked (the details can be found through social engineering databases). Combined with the fact that most users reuse relatively fixed passwords, no matter how the server encrypts them, the security of the "password" itself remains relatively low.

                TODO

                Admin Security

                Administrator authentication and authorization

                Authentication

Authentication: the server uses the common HTTP Basic authentication to confirm which administrator sent an HTTP request.

Configuration item: turms.security.password.admin-password-encoding-algorithm, whose possible values are: bcrypt (default), salted_sha256 and noop.
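For illustration, a hedged sketch of calling an admin API endpoint with HTTP Basic authentication from Java (the host and port are assumptions; /blocked-clients/ips is the admin API mentioned earlier, and turms/turms are the default root credentials described below):

java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class AdminApiBasicAuthDemo {
    public static void main(String[] args) throws Exception {
        // Encode "account:password" for the Basic scheme; replace with your own credentials
        String credentials = Base64.getEncoder().encodeToString("turms:turms".getBytes());
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8510/blocked-clients/ips"))
                .header("Authorization", "Basic " + credentials)
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + ": " + response.body());
    }
}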

                Supported key encryption algorithms

• BCrypt. Its cost is hard-coded to 10 (2^10 rounds), which makes it hard for attackers to recover plaintext passwords (for example via rainbow tables) if the password hashes are exfiltrated from the database. (A minimal usage sketch follows this list.)

For the concrete algorithm implementation, refer to the Bouncy Castle code forked under the turms-server-common subproject: org.bouncycastle.crypto.generators.BCrypt#generate

• Salted SHA-256

• NOOP (plaintext storage)

Special note: the password field in the admin collection is not stored as a string (such as a Base64-encoded string), but as the raw byte[] data.
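The following is a minimal sketch of how a password could be encoded and verified with the Bouncy Castle BCrypt.generate API referenced above. The storage layout (salt prepended to the hash), the class name, and the helper methods are illustrative assumptions, not Turms' actual scheme.

import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import org.bouncycastle.crypto.generators.BCrypt;

public final class AdminPasswordSketch {

    private static final int COST = 10; // 2^10 rounds, matching the hard-coded cost mentioned above

    public static byte[] encode(String rawPassword) {
        byte[] salt = new byte[16];                 // BCrypt requires a 128-bit salt
        new SecureRandom().nextBytes(salt);
        byte[] hash = BCrypt.generate(rawPassword.getBytes(StandardCharsets.UTF_8), salt, COST);
        // Store salt + hash together as raw bytes (the admin collection stores byte[], not a Base64 string)
        byte[] encoded = new byte[salt.length + hash.length];
        System.arraycopy(salt, 0, encoded, 0, salt.length);
        System.arraycopy(hash, 0, encoded, salt.length, hash.length);
        return encoded;
    }

    public static boolean matches(String rawPassword, byte[] encoded) {
        byte[] salt = Arrays.copyOfRange(encoded, 0, 16);
        byte[] expected = Arrays.copyOfRange(encoded, 16, encoded.length);
        byte[] actual = BCrypt.generate(rawPassword.getBytes(StandardCharsets.UTF_8), salt, COST);
        // Note: not a constant-time comparison; a real implementation should use one
        return Arrays.equals(expected, actual);
    }
}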

                Authorization

Authorization: the server determines what operations the sender of the HTTP request is allowed to perform.

Because Turms' own permission management requirements are very simple, its design and implementation are also relatively simple: there are no concepts such as user groups, group roles or role inheritance, and no many-to-many relationship between users and roles. Specifically, Turms adopts an RBAC (Role-Based Access Control) design.

                Turms' RBAC Model

Turms' RBAC model consists of three subjects: Admin, Role and Permission. An admin can have only one role, and a role can have multiple permissions. Specifically (a minimal sketch follows the list):

• Each role also has a rank field; only a relatively high-rank administrator can add, delete and modify the account information (such as passwords) of relatively low-rank administrators
• Permissions describe which operations a role can perform on which resources, for example adding, deleting, modifying and querying user resources
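Below is a minimal, hypothetical sketch of this RBAC model (Admin -> one Role -> many Permissions). The type and field names are illustrative, not the actual Turms classes.

import java.util.Set;

enum Permission {
    USER_CREATE, USER_DELETE, USER_UPDATE, USER_QUERY
}

record Role(long id, String name, int rank, Set<Permission> permissions) {
    boolean hasPermission(Permission permission) {
        return permissions.contains(permission);
    }
}

record Admin(String account, Role role) {
    // Only a higher-rank admin may modify a lower-rank admin's account information
    boolean canManage(Admin other) {
        return role.rank() > other.role().rank();
    }

    boolean isAuthorized(Permission permission) {
        return role.hasPermission(permission);
    }
}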
Special role - Root

                Root is a built-in administrator role in Turms, which has all administrator privileges and cannot be modified or deleted.

                Special root account - turms

The root account turms has the Root role. Its account name cannot be modified at runtime (although it can be changed by modifying the hard-coded value im.turms.server.common.domain.admin.constant.AdminConst#ROOT_ADMIN_ACCOUNT). Its initial password is turms by default, but users can define a custom initial password via the configuration item turms.security.password.initial-root-password, which takes effect when turms-service starts and the admin collection has not yet been created.

                Log desensitization

                TODO

                - + \ No newline at end of file diff --git a/docs/server/module/storage.html b/docs/server/module/storage.html index 825c9911..026d4b96 100644 --- a/docs/server/module/storage.html +++ b/docs/server/module/storage.html @@ -17,8 +17,8 @@ -

                Storage Service

Turms itself does not provide a storage service directly. Instead, the server exposes a set of common storage interfaces for developers to implement, and the Turms client provides the corresponding turmsClient.storageService API for developers to call.

                Notice:

• Developers can implement the storage interaction logic between the application client and their own server entirely without any interface provided by the Turms client or server. Turms simply maintains a set of common storage service implementations so that most developers do not have to start from scratch. Even if a developer does not intend to use Turms' storage implementation, since storage service implementations are all similar, they can refer to Turms' implementation to build their own storage logic and save development time.
• The functions provided by the Turms client storage service are a superset of the functions of the Turms server's official storage service plugin; that is, the Turms client storage service is designed to interact with the official storage plugin of the Turms server, and can also be extended to interact with other third-party plugins.

                Plugin interface and configuration

Storage resources are currently divided into three types: User Profile Picture, Group Profile Picture and Message Attachment. Each resource type has corresponding interfaces for adding (modifying), deleting, and querying, which developers can implement.

Interface

                Plugin interface: im.turms.service.infra.plugin.extension.StorageServiceProvider

Interface function introduction (a simplified sketch follows the table):

Resource type | Function name | Expected function | Return value description
User Profile Picture | deleteUserProfilePicture | Delete a user profile picture |
 | queryUserProfilePictureUploadInfo | Query user profile picture upload information | The return value format is Map<String, String>; the plugin implementer can customize the return value
 | queryUserProfilePictureDownloadInfo | Query user profile picture download information | The return value format is Map<String, String>; the plugin implementer can customize the return value
Group Profile Picture | deleteGroupProfilePicture | Delete a group profile picture |
 | queryGroupProfilePictureUploadInfo | Query group profile picture upload information | The return value format is Map<String, String>; the plugin implementer can customize the return value
 | queryGroupProfilePictureDownloadInfo | Query group profile picture download information | The return value format is Map<String, String>; the plugin implementer can customize the return value
Message Attachment | deleteMessageAttachment | Delete a message attachment |
 | shareMessageAttachmentWithUser | Share a message attachment with the specified user |
 | shareMessageAttachmentWithGroup | Share a message attachment with the specified group |
 | unshareMessageAttachmentWithUser | No longer share a message attachment with the specified user |
 | unshareMessageAttachmentWithGroup | No longer share a message attachment with the specified group |
 | queryMessageAttachmentUploadInfo | Query message attachment upload information | The return value format is Map<String, String>; the plugin implementer can customize the return value
 | queryMessageAttachmentUploadInfoInPrivateConversation | Query message attachment upload information in a private conversation |
 | queryMessageAttachmentUploadInfoInGroupConversation | Query message attachment upload information in a group conversation |
 | queryMessageAttachmentDownloadInfo | Query message attachment download information | The return value format is Map<String, String>; the plugin implementer can customize the return value
 | queryMessageAttachmentInfosUploadedByRequester | Query the message attachments uploaded by the requester |
 | queryMessageAttachmentInfosInPrivateConversations | Query message attachments in private conversations |
 | queryMessageAttachmentInfosInGroupConversations | Query message attachments in group conversations |
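The exact method signatures are defined by im.turms.service.infra.plugin.extension.StorageServiceProvider and are not reproduced here. The sketch below is a simplified, hypothetical shape based only on the table above; the reactive return types, parameters and interface name are assumptions, meant only to show how the Map<String, String> upload/download info could be returned by a plugin implementer.

import java.util.Map;
import reactor.core.publisher.Mono;

// Hypothetical, simplified shape of a storage provider; not the real extension point.
interface SimplifiedStorageProvider {
    Mono<Void> deleteUserProfilePicture(long userId);
    // The table above states that the upload/download info methods return Map<String, String>,
    // whose entries (for example a presigned URL and form fields) are defined by the plugin implementer.
    Mono<Map<String, String>> queryUserProfilePictureUploadInfo(long userId);
    Mono<Map<String, String>> queryUserProfilePictureDownloadInfo(long userId);
}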

                General configuration

Configuration item | Default value | Description
turms.service.storage.user-profile-picture.expire-after-days | 0 | The expiration time (in days) of the resource since its creation time. A value of 0 means no expiration
turms.service.storage.user-profile-picture.allowed-referrers | Empty | Only allow the specified Referrers to access resources
turms.service.storage.user-profile-picture.allowed-content-type | */* | The resource Content-Type allowed for upload. */* means unrestricted
turms.service.storage.user-profile-picture.min-size-bytes | 0 | The minimum size of resources allowed to be uploaded. A value of 0 means unlimited
turms.service.storage.user-profile-picture.max-size-bytes | 1MB | The maximum size of resources allowed to be uploaded. A value of 0 means unlimited
turms.service.storage.user-profile-picture.download-url-expire-after-seconds | 300 | Expiration time of the resource download URL (seconds)
turms.service.storage.user-profile-picture.upload-url-expire-after-seconds | 300 | Expiration time of the resource upload URL (seconds)
turms.service.storage.group-profile-picture... | ... | Same as turms.service.storage.user-profile-picture
turms.service.storage.message-attachment... | ... | Same as turms.service.storage.user-profile-picture

                Official plugin implementation

                Bucket Basic Design Guidelines

Since the functions provided by object storage services are similar, the official plugins based on object storage services that Turms provides now and in the future follow the bucket design guidelines below.

As mentioned above, Turms currently has three types of storage resources: User Profile Picture, Group Profile Picture and Message Attachment. The corresponding bucket names are user-profile-picture, group-profile-picture and message-attachment respectively. Specifically:

• user-profile-picture and group-profile-picture are public buckets. For the URLs of these resources, Turms supports generating predictable URLs, which lets the client compute resource URLs by itself and avoid asking the Turms server for them, as well as generating unpredictable URLs to defeat crawlers. Which kind of URL your application should use depends on your product requirements.
• message-attachment is a private bucket, which grants authorized users temporary access to message attachments through presigned URLs.
• The upload process for all resources likewise grants authorized users a temporary multipart upload endpoint through presigned URLs (see the sketch after this list).
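To illustrate the presigned-URL flow, here is a small sketch using the MinIO Java SDK; it is not the plugin's actual code. The endpoint and credentials shown match the turms-plugin.minio defaults listed later in this document, while the object key is a hypothetical example.

import io.minio.GetPresignedObjectUrlArgs;
import io.minio.MinioClient;
import io.minio.http.Method;

public final class PresignedUrlSketch {
    public static void main(String[] args) throws Exception {
        MinioClient client = MinioClient.builder()
                .endpoint("http://localhost:9000")       // matches the turms-plugin.minio.endpoint default
                .credentials("minioadmin", "minioadmin")  // matches the default access/secret keys
                .build();
        String url = client.getPresignedObjectUrl(
                GetPresignedObjectUrlArgs.builder()
                        .method(Method.GET)               // use Method.PUT for an upload URL
                        .bucket("message-attachment")
                        .object("123456789")              // hypothetical message attachment key
                        .expiry(300)                      // seconds, analogous to download-url-expire-after-seconds=300
                        .build());
        System.out.println(url);
    }
}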

Of course, the above are only the default configurations. Mainstream object storage services support many practical features, such as tiered storage of hot and cold data (e.g. the Amazon S3 Intelligent-Tiering storage class), encryption, and fine-grained permission control. Users can apply further configuration through the object storage service on top of the buckets created by Turms.

turms-plugin-minio

                Introduction

                turms-plugin-minio is a turms-service storage service implementation plugin developed based on the open source object storage service MinIO.

                Install

After the plugin is started on the server side, the client can call the corresponding APIs under turmsClient.storageService to add, delete, modify and query storage resources.

                Since the storage interface of the Turms client adopts a general-purpose interface design and is not customized for turms-plugin-minio, you need to pay attention to the following when calling the client API:

                • When calling the queryMessageAttachment interface, the parameter fetchDownloadInfo must be true; when calling the queryMessageAttachmentDownloadInfo interface, the parameter fetch must be true.

                Business functions

                Message attachment function
                Upload message attachment
Features | Support
Do not specify any session, upload message attachment | TODO
                Upload message attachments to a specific private chat session
                Upload message attachments to multiple private chat sessions
                Upload message attachments to a specific group chat session
                Upload message attachments to specified multiple group chat sessions
                Delete message attachment
Features | Support
Delete message attachments in any conversation | TODO
                Share and Unshare
Features | Support
Share uploaded message attachments to a single private chat session
Share uploaded message attachments to multiple private chat sessions
Share uploaded message attachments to a single group chat session
Share uploaded message attachments to multiple group chat sessions
Cancel sharing of uploaded message attachments to a single private chat session | TODO
Cancel sharing of uploaded message attachments to multiple private chat sessions
Cancel sharing of uploaded message attachments to a single group chat session | TODO
Cancel sharing of uploaded message attachments to multiple group chat sessions

                For more advanced sharing functions, such as detailed permission control, custom sharing duration, encrypted sharing and other functions, there is no plan to support them in the near future.

Query
Features | Support
                Specify the attachments that the other party shared with me in a single private chat session
                Specify the attachments I send to the other party in a single private chat session
                Specify the attachments that the other party shared with me and the attachments I sent to the other party in a single private chat session
                Specify the attachments that the other party shared with me in multiple private chat sessions
                Specify the attachments I send to each other in multiple private chat sessions
                Specify the attachments shared by the other party to me and the attachments I send to the other party in multiple private chat sessions
                Attachments shared to me in all private chat sessions
The attachments I sent to the other party in all private chat sessions | Does not support "query only the attachments I sent to the other party in private chat sessions", but supports "the attachments I shared in all sessions"
                In all private chat sessions, the attachments shared by the other party to me and the attachments I sent to the other party
                Specify attachments shared by a single user (can be myself) in a single group chat session
                Specify attachments shared by multiple users (including myself) in a single group chat session
                Specifies attachments shared by all users (including myself) in a single group chat session
                Specify attachments shared by a single user (can be myself) in multiple group chat sessions
                Specify attachments shared by multiple users (including myself) in multiple group chat sessions
                Specify attachments shared by all users (including myself) in multiple group chat sessions
In all group chat sessions, specify the attachments shared by a single user | Does not support "in all group chat sessions, the attachments I shared", but supports "in all sessions, the attachments I shared"
                In all group chat sessions, specify attachments shared by multiple users (can include myself)
                Attachments shared by all users (including myself) in all group chat sessions
                Across all conversations, my shared attachments
                Across all sessions, various other query objects

                Permission Control

                • View message attachments

• Regardless of whether the user who uploaded a message attachment has left the private chat or group chat session, they always have the right to query the message attachments they uploaded.

  Moreover, even if the user who uploaded a message attachment leaves the session, all the other users in the session still have the right to view that attachment.

• Users can view, and can only view, message attachments shared by other users in the private chat or group chat sessions they have joined. In other words, if a user joins a session and then leaves it, the user can no longer view the attachments in that session; only after joining the session again does the user regain the right to view them.

Security

                Upload limit: TODO

Stored file data validation

If validation of stored file data is implemented on top of cloud services, the logic is relatively simple. For example, on AWS you can use S3 event notifications to trigger a custom Lambda function that validates the data uploaded by a user, or add a Lambda@Edge function listening to the origin-response event on the CloudFront side to perform the validation. Apart from writing some code for the custom validation logic itself, everything else can basically be set up with a few clicks.

However, as a standalone storage service, MinIO does not provide serverless features such as Lambda functions. Compared with serverless solutions, implementing low-cost, highly available data validation logic on top of MinIO's event mechanism is much more troublesome. Therefore, Turms does not currently support validation of stored file data; support will follow.

                Configuration

Configuration item | Default value | Description
turms-plugin.minio.enabled | true | Whether to enable the plugin
turms-plugin.minio.endpoint | "http://localhost:9000" | The address of the MinIO server
turms-plugin.minio.region | "" | The region of the MinIO server
turms-plugin.minio.access-key | minioadmin | The access key of the MinIO server
turms-plugin.minio.secret-key | minioadmin | The secret key of the MinIO server
turms-plugin.minio.retry.enabled | true | Whether to retry when bucket initialization fails
turms-plugin.minio.retry.initial-interval-millis | 30_000 | The initial retry interval when bucket initialization fails
turms-plugin.minio.retry.interval-millis | 30_000 | The retry interval when bucket initialization fails
turms-plugin.minio.retry.max-attempts | 3 | The maximum number of retries when bucket initialization fails
turms-plugin.minio.resource-id.mac.enabled | false | Whether to apply a MAC algorithm to the object key of the resource to generate unpredictable URLs and defeat crawlers. If this item is not enabled, a user can derive the corresponding picture URL from the user ID or group ID. The final resource URL is: <bucket>/<base62(object key)><base62(mac(object key))>, e.g. user-profile-picture/123456789 => user-profile-picture/8M0kX1aEllpuvXRV09grkIEtD4R. Note: if the MAC algorithm is enabled, the client must set the parameter fetch to true when calling the queryXXXDownloadInfo interfaces, and set fetchDownloadInfo to true when calling the queryXXX interfaces (see the sketch after this table)
turms-plugin.minio.resource-id.mac.base64-key | "AHR1cm1zLWltL3R1cm1zgA==" | The Base64-encoded key of the MAC algorithm
turms-plugin.minio.resource-id.base62.enabled | false | Whether to encode the object key of the resource with the Base62 algorithm to shorten the URL. The final resource URL is: <bucket>/<base62(object key)>, or <bucket>/<base62(object key)><base62(mac(object key))>, e.g. user-profile-picture/123456789 => user-profile-picture/8M0kX or user-profile-picture/8M0kX1aEllpuvXRV09grkIEtD4R. Note: 1. When turms-plugin.minio.resource-id.mac.enabled is true, the Base62 algorithm is always applied. 2. If the Base62 algorithm is enabled, the client must set the parameter fetch to true when calling the queryXXXDownloadInfo interfaces, and set fetchDownloadInfo to true when calling the queryXXX interfaces
turms-plugin.minio.resource-id.base62.charset | ... | The character set of the Base62 algorithm
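To make the MAC and Base62 options above more concrete, here is an illustrative sketch (not the plugin's actual code) that computes a MAC over an object key and Base62-encodes the result. The concrete MAC algorithm, the key handling and the Base62 alphabet used by turms-plugin-minio are assumptions here, and only the mac+encode step of the URL scheme is shown.

import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public final class ResourceIdSketch {

    private static final char[] BASE62 =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz".toCharArray();

    public static String macBase62(String objectKey, byte[] macKey) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");   // the concrete MAC algorithm is an assumption
        mac.init(new SecretKeySpec(macKey, "HmacSHA256"));
        byte[] digest = mac.doFinal(objectKey.getBytes(StandardCharsets.UTF_8));
        return base62(digest);
    }

    private static String base62(byte[] bytes) {
        // Treat the bytes as an unsigned big integer and convert it to base 62
        BigInteger value = new BigInteger(1, bytes);
        BigInteger base = BigInteger.valueOf(62);
        StringBuilder builder = new StringBuilder();
        while (value.signum() > 0) {
            BigInteger[] divRem = value.divideAndRemainder(base);
            builder.append(BASE62[divRem[1].intValue()]);
            value = divRem[0];
        }
        return builder.reverse().toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] key = Base64.getDecoder().decode("AHR1cm1zLWltL3R1cm1zgA=="); // the default base64-key above
        System.out.println("user-profile-picture/" + macBase62("123456789", key));
    }
}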
                + \ No newline at end of file diff --git a/docs/server/module/system-resource-management.html b/docs/server/module/system-resource-management.html index 5df50d4c..95987020 100644 --- a/docs/server/module/system-resource-management.html +++ b/docs/server/module/system-resource-management.html @@ -17,10 +17,10 @@ -

                System Resource Management

The importance of memory and CPU resources to a server is self-evident. Each Turms module uses memory and CPU as efficiently as possible; for details, refer to the documentation and code of each module. In addition, to keep the server running normally, Turms provides an internal health-check mechanism, which cooperates with the upper-layer "denial of service" mechanism to do its best to keep the server healthy.

Turms provides a system resource monitoring configuration class, im.turms.server.common.infra.property.env.common.healthcheck.HealthCheckProperties, which lets users configure the allowed memory usage and CPU usage. The HealthCheckManager on the Turms server continuously checks the available physical memory and the CPU usage. If it detects that the available physical memory is too low or the CPU usage is too high, it will do the following (a minimal probe sketch follows the list):

• Mark the isHealthy field of the node in the service registry as false. Since RPC senders only select RPC servers whose isHealthy is true, this achieves an effect similar to back pressure
• Refuse to provide external services. Specifically: a turms-gateway server refuses to establish new sessions and process user requests; a turms-service server refuses to process RPC requests sent by turms-gateway servers (note: even in the "unhealthy" state, turms-service still serves the admin API)
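Below is a minimal, hypothetical probe in the spirit of HealthCheckManager. It only checks free physical memory and CPU load via the JDK's com.sun.management.OperatingSystemMXBean; the thresholds, field names and class name are illustrative, not the actual Turms configuration.

import com.sun.management.OperatingSystemMXBean;
import java.lang.management.ManagementFactory;

public final class HealthProbeSketch {

    private final OperatingSystemMXBean os =
            (OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();

    private final long minFreeSystemMemoryBytes = 128L * 1024 * 1024; // illustrative threshold
    private final double maxCpuLoad = 0.95;                            // illustrative threshold

    public boolean isHealthy() {
        long freePhysicalMemory = os.getFreeMemorySize(); // JDK 14+; older JDKs: getFreePhysicalMemorySize()
        double cpuLoad = os.getCpuLoad();                 // JDK 14+; older JDKs: getSystemCpuLoad()
        return freePhysicalMemory >= minFreeSystemMemoryBytes && cpuLoad <= maxCpuLoad;
    }
    // An unhealthy node would be marked isHealthy=false in the service registry and would
    // reject new sessions and requests until the probe recovers.
}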

                Memory management

                JVM basic memory knowledge

                The memory area of the JVM HotSpot virtual machine can be divided into:

• Heap memory: the Eden space, the Survivor spaces and the Old Generation

• Non-heap memory

  • Direct memory: the direct buffer pool
  • JVM internal memory (JVM-specific memory): native method stacks, Metaspace, Code Cache, etc.

  Special attention: the non-heap memory reported by java.lang.management.MemoryMXBean#getNonHeapMemoryUsage does not include the direct buffer pool. Specifically, in JDK 21 this function covers the following memory spaces:

                  • CodeHeap 'non-nmethods'
                  • CodeHeap 'non-profiled nmethods'
                  • CodeHeap 'profiled nmethods'
                  • Compressed Class Space
                  • Metaspace

                Reference document: How to Monitor VM Internal Memory

                Use of Managed Memory

The managed memory of the Turms server refers to two areas: heap memory and direct memory.

                Heap memory

                Practical significance

The practical consideration for heap memory is straightforward: configure as large a heap as possible to reduce the number of GC cycles and stop-the-world pauses.

                Configuration

                The default heap configuration of the JVM is as follows:

-XX:MaxRAMPercentage=75
-XX:InitialRAMPercentage=75

Specifically:

• InitialRAMPercentage and MaxRAMPercentage specify how much memory to reserve, but page faults will still occur when the Turms server first touches this memory area. Although the JVM can turn the reserved memory into committed memory up front via AlwaysPreTouch to avoid page faults at runtime, enabling this option makes it difficult for the server to monitor the heap memory actually in use, so adding this configuration is currently not recommended.
• InitialRAMPercentage and MaxRAMPercentage are set to the same value mainly to keep the memory as contiguous as possible and to avoid repeated GC and stop-the-world pauses caused by the heap growing and shrinking.
• The heap memory is not configured to a value close to 100%, so that the remaining physical memory is left for the JVM's own off-heap memory (most notably direct memory, CodeCache, Metaspace, etc.), the system kernel (such as buffers for maintaining TCP connections) and sidecar services (such as a log collection service).

In addition, it is recommended not to allocate more than 32GB of memory to a Turms server in a production environment, because this:

• Keeps the JVM's compressed pointers enabled, reducing unnecessary memory usage
• Prevents a single server from carrying too much load, mitigates the thundering-herd effect when the server shuts down, and improves the user experience

                Direct Memory

All the direct memory described below is allocated by PooledByteBufAllocator.DEFAULT in the actual code; that is, it is all direct memory cached and managed by Netty.

                Practical significance

The upper limit of the direct memory capacity determines the peak number of client requests and admin API requests that the Turms server can handle at the same time.

                Main users
• Network I/O operations based on Netty. For example: third-party dependencies such as mongo-driver-java and Lettuce, and the client-facing TCP/HTTP servers that the Turms server itself implements.
• Log printing. The logging implementation developed by Turms writes Java primitive data directly into direct memory blocks, which are then written to the file descriptor.

In other words, for essentially every memory area that the system kernel needs to access, Turms uses direct memory to avoid pointless heap memory copies.

Note: on Linux, the direct memory used by Turms is still in user space, so writing it to a device (such as the network card or hard disk) still requires two copies, from user space to kernel space and from kernel space to the device, and these two copies cannot be avoided by the upper-layer server.

Life cycle

In the Turms server, the life cycle of direct memory closely follows the life cycle of client requests and admin API requests, so a piece of direct memory usually exists only during part or all of a request's life cycle. Specifically, the life cycle is roughly as follows:

• A request's life cycle begins when Netty slices the TCP byte stream. Netty slices the stream according to the varint-encoded header (whose value is the length of the payload), and when this slice of memory is produced (note: no memory copy happens here), the life cycle of the piece of direct memory representing the request begins.

• After the Turms server parses this memory into a specific request model, it determines whether this type of request needs to keep using the underlying direct memory. If the request's processing logic does not need it, the memory is immediately returned to Netty's memory pool. Otherwise (for example, for requests that forward user messages), the memory is not reclaimed immediately, and Turms then runs the business logic of the request.

• During business processing, other network I/O operations (such as sending requests to MongoDB/Redis) or log printing may be involved. Both kinds of operations take new direct memory blocks from the memory pool managed by Netty, for encoding MongoDB/Redis client requests and decoding their responses, or for log printing.

• After the Turms server finally flushes the direct memory of the response to the network card, all the other direct memory involved in the process is also reclaimed, apart from the direct memory representing log records.

  The only exception: if the direct memory of a request needs to be forwarded to multiple clients, Turms uses reference counting to decouple the life cycle of the direct memory from the life cycle of the request, so that the same piece of direct memory can be forwarded to multiple clients without memory copies (a sketch of this follows this section).

  Notice:

  1. The direct memory "reclamation" mentioned above does not return the memory to the operating system, but to the memory pool managed by Netty; the memory is not actually freed at that point.
  2. Direct memory is mainly freed as follows: when a pooled ByteBuf is released, Netty checks whether the Chunk it belongs to has become idle (0% usage); if so, the memory is actually freed by io.netty.buffer.PoolArena#destroyChunk.

Due to this life cycle, the real usage of heap memory and direct memory is correlated: heap memory usage rises mainly because of the series of logic the Turms server runs after receiving client requests or admin API requests, while direct memory usage rises because of request decoding and response encoding, the encoding and decoding of network I/O operations, and log printing within that logic. When the life cycle of a request ends, both the heap memory and the direct memory it used can be reclaimed.
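The following is a hypothetical sketch (not the actual Turms code) of how one request payload can be forwarded to several clients without copying it, relying on Netty's reference counting as described above.

import io.netty.buffer.ByteBuf;
import io.netty.channel.Channel;
import java.util.List;

final class ForwardingSketch {
    static void forward(ByteBuf payload, List<Channel> recipients) {
        for (Channel channel : recipients) {
            // retain() increments the reference count so the buffer stays valid until
            // every recipient has flushed it; writing releases each recipient's reference.
            channel.writeAndFlush(payload.retain());
        }
        // Drop the reference held on behalf of the original request; the memory returns to
        // Netty's pool only when the reference count reaches zero.
        payload.release();
    }
}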

                Memory health check

                Configuration

                Configuration class: im.turms.server.common.infra.property.env.common.healthcheck.MemoryHealthCheckProperties

As mentioned above, it is very difficult, even unrealistic, for operations staff to estimate exactly how much memory the server should use, especially since the memory occupied by some key kernel structures (such as TCP connections) changes dynamically. Therefore MemoryHealthCheckProperties not only provides configurations such as maxAvailableMemoryPercentage and maxAvailableDirectMemoryPercentage, which cap the memory the Turms server may use, but also provides minFreeSystemMemoryBytes, which makes the Turms server monitor the system's available physical memory in real time and do its best to keep that amount of memory free.

                Memory monitoring implementation - MemoryHealthChecker

Effects:

• When it detects that the system is low on physical memory, it notifies the upper-layer services to refuse to process user sessions and requests, doing its best to ensure that physical memory is not exhausted and swap memory is not used
• If it detects that the system is low on physical memory and the used heap memory exceeds heapMemoryGcThresholdPercentage, it calls System.gc() to suggest that the JVM perform a Full GC

Notes

• As mentioned above, the life cycle of direct memory closely follows the life cycle of requests, so even if MemoryHealthChecker detects that the total used memory has exceeded the configured limit, it does not actively try to release direct memory; it waits for Netty's internal memory management to release it
• In summary, although the Turms server does its best not to exhaust physical memory, under a large and extremely sudden burst of requests it may still run out of physical memory, at which point swap memory will be used. If swap is disabled by the system or is insufficient, the Turms server will throw an OutOfMemoryError. Swap memory therefore acts as the last line of defense, so it is not recommended to disable it in a production environment.

                About the Valhalla project - Codes like a class, works like an int

Java's memory usage has always been criticized. For example, the object header of an Integer object (12 bytes on a 64-bit system with compressed pointers enabled) is several times larger than the actual int data it holds, and such design flaws force workarounds in everyday programming, such as the JVM preferring the object cache in java.lang.Integer.IntegerCache when Integer objects are used. Compared with many C++ server projects (such as Nginx and Redis) that pursue performance optimization down to the register level, Java's waste of memory, caused by its own design flaws and conservatism, feels a bit self-defeating. Worse, this attitude has spread throughout the Java ecosystem: reading the source code of many well-known Java projects reveals a mindset of "the feature works, the code is comfortable to write, the performance is good enough, and the GC will clean up anyway", with repeated memory copies (for instance, the ubiquitous String and StringBuilder are usually copied back and forth many times in practice, and the source code is startling). Only a very few projects, such as Netty, show real awareness of performance optimization and craftsmanship. We have already covered this point in other chapters, so we will not repeat it here.

The Valhalla project restructures the existing Java object model. In the new model, the original Object is called IdentityObject, and Object becomes the parent of IdentityObject and ValueObject (note: the Valhalla team has not finalized the design, so these concepts may still change); the two are somewhat similar to C#'s reference types and value types. ValueObject is further divided into two categories, primitive classes and value classes. A primitive class lets developers define data structures that are as efficient as the eight traditional Java primitive types: no object header, no pointer dereference, stack allocation, and naturally no GC, while such classes can still declare fields and define methods. The traditional eight primitive types will also be redesigned on top of the new object model. For example, int becomes a primitive class (a kind of value class whose value cannot be null), while its wrapper class Integer, and the possibly supported int.ref, become value classes (whose value can be null), so the concept of wrapper classes will disappear.

For example, an instance of the primitive class Point { private double x; private double y; } only needs the space of two doubles, i.e. 16 bytes, and requires no object header.

After the Valhalla project releases its preview version, we will introduce ValueObject and migrate code such as DTO objects and various wrapper classes (such as Date and ByteArrayWrapper) to greatly reduce memory overhead and the number of objects, and to speed up GC. And because we have been waiting for this project for several years and are very familiar with its design, we expect to complete the adaptation and testing within a week. This is also the only preview feature we will green-light.

Supplement:

• In fact, the development history of Java also confirms the point we made elsewhere that "rich IM features come at a fatal price": the very characteristics a project is proud of may hide an abyss behind them.

  Java was once proud of "Everything is an object", and emphasized that "Java has no structures or unions as complex data types. You don't need structures and unions when you have classes" (quoted from Sun's 1995 Java white paper, in the section Simple, Object Oriented, and Familiar) to promote the idea that Java is far easier to use than C and C++.

  (Additional note: looking back at Java's history, developers will also marvel at the vitality Java has shown in continuously adapting to the times, adjusting its own direction, and overcoming one obstacle after another.)

  But in today's programming practice, advocating "everything is an object" without providing structs feels more like a curse. For example, when we put an int into a List<Integer>, we have to allocate a new object and pay for its object header. In other words, as long as we use the common data structures Java provides, such as List and Map, a lot of memory is wasted for nothing, and these collection classes are unavoidable in real projects. (Supplement: internal data structures such as HashSet and LinkedList waste even more memory than most developers imagine; the memory occupied by object headers exceeds that of the actual data, which is why we used the word "shocking" above.)

  Today, the Valhalla project hopes to change this situation by introducing the primitive/value class language features. But because it must remain compatible with the huge existing Java ecosystem while freeing Java from the traditional curse of "everything is an object", the Valhalla project has been treading on thin ice: the design drafts alone have been overturned many times, it has taken nearly 8 years to reach a Preview feature, and developers will still need a long time to re-learn the new Java language model. It is clear that a feature a project is proud of at the beginning may become a "curse" in the middle and late stages of its development, causing headaches for both maintainers and users.

  The same is true for the design of IM features. A design with strong vitality should follow the philosophy of "less is more". "Rich IM features" looks like something to be proud of: at the beginning, developers think an open source IM project already has every feature they need and that there is basically nothing left to do. But there is a price behind this, and the extensibility of such a project may be extremely poor; if you need to extend it in the middle or late stages, it is often better to rewrite it yourself.

• If the Java world did not have the Valhalla project, the Turms server might well have been written in C# from the start.

                Reference document: Java language model under the Valhalla project

Threads

Since the Turms server has no blocking I/O, network requests such as RPC, MongoDB, and Redis are all implemented asynchronously on top of Netty; going one level lower, on Linux they all come down to epoll-related operations. The number of threads required is therefore far smaller than that of a traditional Java web application.

Taking a 16-core CPU as an example, the peak number of threads of turms-gateway and turms-service ranges from about 80 to 140 (including JVM internal threads). The exact peak depends on the number of CPU cores and on which servers are running (for example, one turms-gateway can start the TCP, WebSocket, and UDP servers at the same time).

It is particularly worth mentioning that the peak number of threads in Turms has nothing to do with the number of concurrent online users or the request QPS.

Supplement: Because the Turms server itself uses few threads relative to the number of CPU cores, in a few places we simply use ThreadLocal to cache relatively large, thread-unsafe objects; and compared with traditional servers, Turms also greatly reduces the overhead caused by thread context switching.
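A minimal sketch of the ThreadLocal caching pattern described above (SimpleDateFormat is used here only as a stand-in for a "relatively large, thread-unsafe object"; it is not necessarily what Turms caches):

java
import java.text.SimpleDateFormat;
import java.util.Date;

public final class ThreadLocalCacheSketch {
    // One instance per long-lived thread: no per-request allocation and no locking,
    // which is cheap because the server only runs a small, fixed set of threads.
    private static final ThreadLocal<SimpleDateFormat> DATE_FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

    public static String format(Date date) {
        return DATE_FORMAT.get().format(date);
    }

    public static void main(String[] args) {
        System.out.println(format(new Date()));
    }
}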

                CPU health monitoring

                Configuration class: im.turms.server.common.infra.property.env.common.healthcheck.CpuHealthCheckProperties

Function: monitor the CPU usage. If the CPU usage exceeds the threshold N times, set the node's isHealthy flag to false, share this state with the other nodes, and refuse to provide services until the CPU usage becomes healthy again. For the specific configuration, see the configuration class above.
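The sketch below shows one way such a check could work; it is a simplified illustration with made-up threshold values, not the actual CpuHealthCheckProperties-driven implementation:

java
import com.sun.management.OperatingSystemMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public final class CpuHealthCheckSketch {
    private static final double UNHEALTHY_THRESHOLD = 0.95; // hypothetical threshold
    private static final int MAX_TIMES_ABOVE_THRESHOLD = 5; // hypothetical "N times"

    private static volatile boolean isHealthy = true;
    private static int timesAboveThreshold;

    public static void main(String[] args) {
        OperatingSystemMXBean os =
                (OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();
        Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {
            double load = os.getCpuLoad(); // system-wide CPU usage in [0, 1]; JDK 14+
            if (load > UNHEALTHY_THRESHOLD) {
                timesAboveThreshold++;
                if (timesAboveThreshold >= MAX_TIMES_ABOVE_THRESHOLD) {
                    // A real node would also share this state with the other cluster nodes
                    // and start refusing to serve requests.
                    isHealthy = false;
                }
            } else {
                timesAboveThreshold = 0;
                isHealthy = true;
            }
        }, 0, 1, TimeUnit.SECONDS);
    }
}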

                Turms thread list

Scope of use | Category | Thread name | Quantity | Function
General | Admin HTTP server | turms-admin-http-acceptor | 1 | Admin HTTP server Acceptor thread
 | | turms-admin-http-worker | Number of CPU cores | Admin HTTP server Worker threads
 | User blocklist | turms-client-blocklist-sync | 1 | Synchronizes blocklist data across the cluster
 | Health checker | turms-health-checker | 1 |
 | Logging | turms-log-processor | 1 | Log formatting and output
 | Shutdown | turms-shutdown | 1 | Schedules the shutdown tasks of each component when the server shuts down
 | Scheduled tasks | turms-task-manager | 1 | Schedules recurring tasks
 | Cluster implementation | turms-node-connection-client-io | Number of CPU cores | Node communication I/O threads
 | | turms-node-connection-keepalive | 1 | Periodically sends heartbeats between nodes and removes peer nodes whose heartbeats have expired
 | | turms-node-connection-retry | 1 | Node connection retry thread
 | | turms-node-connection-server-acceptor | 1 | Node connection server Acceptor thread
 | | turms-node-connection-server-worker | Number of CPU cores | Node connection server Worker threads
 | | turms-node-discovery-change-notifier | 1 | Notifies node addition, removal, and update events
 | | turms-node-discovery-heartbeat-refresher | 1 | Used by the Leader node to refresh heartbeat times in the service registry
 | Redis client | lettuce-event-loop | | Redis client I/O threads
 | MongoDB | turms-mongo-change-watcher | 1 | Executes MongoDB Change Stream callbacks
 | | mongo-event-loop | | MongoDB client I/O threads
turms-gateway | Fake clients | turms-fake-client | Number of CPU cores | Fake Turms client I/O threads
 | | turms-fake-client-manager | 1 | Schedules fake Turms clients to send requests
 | | turms-client-heartbeat-refresher | 1 | Periodically refreshes client heartbeats in batches
 | Gateway server | turms-gateway-udp-acceptor | 1 | UDP server Acceptor thread
 | | turms-gateway-udp-worker | Number of CPU cores | UDP server Worker threads
 | | turms-gateway-tcp-acceptor | 1 | TCP server Acceptor thread
 | | turms-gateway-tcp-worker | Number of CPU cores | TCP server Worker threads
 | | turms-gateway-ws-acceptor | 1 | WebSocket server Acceptor thread
 | | turms-gateway-ws-worker | Number of CPU cores | WebSocket server Worker threads
 | | turms-gateway-idle-connection-timeout-timer | 1 | Monitors and closes network connections that have not established an application-layer user session for a long time
 | Client rate limiting and anti-abuse | turms-ip-request-token-bucket-cleaner | 1 | Clears expired token bucket data

                Threading Model

                (Related documents: Linux System Reference Configuration, source code-network configuration)

                Business processing TCP/WebSocket server and HTTP background management API server

Both the business TCP/WebSocket servers and the HTTP background management API server adopt the master-slave Reactor multithreading model. Specifically, they use one Acceptor thread (the main Reactor group, Boss EventLoopGroup) and a Worker thread group whose size equals the number of CPU cores (the slave Reactor group, Worker EventLoopGroup); a minimal Netty sketch of this setup follows the list below. In this model:

• The Acceptor thread listens for connection events of TCP clients on the ServerSocketChannel through the io.netty.channel.nio.NioEventLoop#run function, creates a corresponding SocketChannel for each connected TCP client, and assigns it to a Worker thread for subsequent processing.

                  Acceptor thread name: turms-gateway-tcp-acceptor, turms-gateway-ws-acceptor or turms-admin-http-acceptor.

                  Mainly related to Linux system configuration: net.core.somaxconn (the maximum length of the TCP accept queue).

• A Worker thread can be bound to and process multiple SocketChannels. It uses io.netty.channel.nio.NioEventLoop#run to continuously monitor SocketChannel read events and pending write tasks, and when reading and writing the byte stream it executes the chain of encoding and decoding ChannelHandlers in the ChannelPipeline to complete byte encoding and decoding.

  After the Worker thread finishes decoding a client request, it executes the client request handling logic in the source code (note: no thread switch is needed here). In the processing of a business request, the most time-consuming parts are the Protobuf decoding of the client request and the encoding of the MongoDB and Redis requests, while the IM logic itself only schedules the IM business logic and is therefore not time-consuming. In particular, if a string needs sensitive-word filtering during request processing and the MASK_TEXT strategy is used, its cost is roughly equal to Java's String#getBytes("UTF-8"), so it is not time-consuming either.

                  Worker thread name: turms-gateway-tcp-worker, turms-gateway-ws-worker or turms-admin-http-worker.

                  Main Linux system configuration: net.ipv4.tcp_mem, net.ipv4.tcp_rmem, net.ipv4.tcp_wmem
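A minimal, self-contained Netty sketch of this master-slave Reactor setup (for illustration only; the actual Turms servers are built on top of Reactor Netty, and the port 11510 here is just an example):

java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public final class MasterSlaveReactorSketch {
    public static void main(String[] args) throws InterruptedException {
        // Main Reactor group: a single Acceptor thread (Boss EventLoopGroup).
        NioEventLoopGroup acceptorGroup = new NioEventLoopGroup(1);
        // Slave Reactor group: one Worker thread per CPU core (Worker EventLoopGroup).
        NioEventLoopGroup workerGroup =
                new NioEventLoopGroup(Runtime.getRuntime().availableProcessors());
        try {
            new ServerBootstrap()
                    .group(acceptorGroup, workerGroup)
                    .channel(NioServerSocketChannel.class)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        protected void initChannel(SocketChannel ch) {
                            // Decoders, encoders, and business handlers added here run on the
                            // Worker thread that the accepted SocketChannel is bound to.
                        }
                    })
                    .bind(11510)
                    .sync()
                    .channel()
                    .closeFuture()
                    .sync();
        } finally {
            acceptorGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }
    }
}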

                Node server and client

                TODO

                Lettuce and MongoDB client

                TODO

How to determine which thread group any line of code runs on

Once the threading model of the Turms server described above is understood, readers can easily determine on which thread group any line of Turms server code will execute.

Taking the processing of a client business request as an example: starting from the moment a Netty Worker thread reads the TurmsRequest byte stream sent by a Turms client, the entire business handling process runs on that Worker thread, and once the logic finishes, the thread returns to process other business requests.

During business processing, the Worker thread may trigger various network I/O operations, such as sending MongoDB and Redis client requests. When these I/O operations complete, a series of business-related callbacks need to run, and those callbacks execute on the NIO threads of the MongoDB or Redis client.

In short, all non-callback business code that developers see in the Service layer runs on Worker threads, while the callback business code usually runs on the NIO threads of the MongoDB or Redis clients. The same applies to the admin API. The sketch below illustrates this.
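The following runnable sketch simulates this with reactor-core (the "MongoDB driver" is faked with a dedicated scheduler, so the printed thread names only mimic the real turms-gateway-tcp-worker / mongo-event-loop split described above):

java
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

public final class ThreadGroupDemo {
    public static void main(String[] args) {
        // Simulate the MongoDB driver completing a query on its own NIO thread.
        Mono<String> fakeMongoQuery = Mono.just("user-1")
                .publishOn(Schedulers.newSingle("mongo-event-loop-sim", true));

        // Code before the asynchronous call runs on the calling thread
        // (on the Turms server: the Netty worker that decoded the request,
        // e.g. turms-gateway-tcp-worker).
        System.out.println("assemble query on: " + Thread.currentThread().getName());

        fakeMongoQuery
                .map(user -> {
                    // The callback runs on the thread that completed the query
                    // (on the Turms server: the MongoDB client NIO thread, mongo-event-loop).
                    System.out.println("handle result on: " + Thread.currentThread().getName());
                    return user;
                })
                .block();
    }
}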

                About the Loom project - Codes like sync, works like async

Background

On the one hand, many long-lived technical solutions benefit from their rich ecosystems; on the other hand, that same richness makes them too big to change course, and they eventually exit the stage of history because they cannot adapt to the times. In the Java ecosystem, the blocking implementations of various technical solutions are a major obstacle to Java's development in the new era, and the blocking design of JDBC is the biggest obstacle to Java's asynchronous ecosystem. One of the reasons why Turms did not adopt a traditional SQL database is that there was no mature asynchronous JDBC implementation in the Java ecosystem at the time; some projects even abandoned Java for languages such as Go or C#, leaving only one remark: "Java's threading model is not 'cloud-native' enough, and its ecosystem is too backward."

The revolutionary aspect of the Loom project is that it officially introduces virtual threads into the Java world, allowing seemingly synchronous code to execute asynchronously.
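For reference, this is what the virtual-thread model looks like in practice (a generic JDK 21+ sketch, not Turms code):

java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class VirtualThreadSketch {
    public static void main(String[] args) {
        // Each submitted task gets its own virtual thread; a blocking call inside the task
        // only parks the virtual thread, while the underlying carrier (OS) thread is reused.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                int id = i;
                executor.submit(() -> {
                    Thread.sleep(100); // looks blocking, but does not pin an OS thread
                    return id;
                });
            }
        } // close() waits for the submitted tasks to finish
    }
}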

Our attitude towards the Loom project, from the perspective of the Turms server

Although the revolutionary nature of the Loom project is acknowledged above, the Turms project will not adopt the coroutines provided by Loom, because for the Turms server they only add new problems (such as stack copying) without solving existing ones. The specific reasons are as follows:

• The revolutionary point of coroutines is that they try to address the heavy use of blocking APIs (such as JDBC) in the Java ecosystem by letting seemingly synchronous code execute asynchronously. However, the Turms server does not perform blocking I/O when processing client business requests, so this revolution has no effect on the Turms server. And if a third-party library does use blocking I/O, we usually doubt the technical level of its author and will not use that implementation.

• The Loom project introduces coroutines based on stack copying, which must save the call stack to the heap when parked and restore it from the heap when unparked and thawed. This is superfluous for the Turms server: since the Turms server does not perform blocking I/O when processing client business requests, it never needs to park. Articles promoting the Loom project often mention the advantage that "even tens of thousands of coroutines only occupy a small amount of memory", but the Turms server simply opens zero coroutines and achieves the same effect with zero extra bytes of memory.

  In addition, although saving the call stack can address a major fatal shortcoming of reactor-core, namely "exception stack traces are basically useless and hard to debug", reactor-core has already overcome this shortcoming under the optimizations of the Turms server (see Supplement: Disadvantages of reactor-core below).

• The learning difficulty of coroutines is "1+1>2", and their learning curve is actually steeper than that of reactor-core. We say the difficulty is "1+1>2" because developers must master the use, principles, and optimization of both threads and coroutines at the same time, and must also ensure that traditional code modeled on threads still runs correctly inside coroutines, whereas mastering reactor-core only requires the most basic knowledge of threads.

  Some developers may feel that reactor-core is more complicated to use than coroutines, but such statements usually reflect a beginner's perspective. For junior engineers, the superficial use of either coroutines or reactor-core is actually very simple as long as they do not study the underlying principles. Only in the early learning stage do coroutines ensure that junior engineers can easily write high-performance Java code, while reactor-core code is best written by senior engineers together with junior ones, otherwise the code may end up hard to maintain, slow, or even logically wrong. But once this short initial stage has passed, learning coroutines runs into the "1+1>2" problem just mentioned, while reactor-core still only requires engineers to master the most basic thread knowledge.

  As described in How to determine which thread group any line of code runs on, for any line of code on the Turms server (including third-party libraries), we only need the most basic thread knowledge to accurately infer which thread group the line will execute on, who created that thread group, where and why it was created, and what its life cycle is.

  In addition, when we write Turms server code, we hardly ever think about "how to write asynchronous code with reactor-core", just as many developers do not think about "how to write synchronous code".

• The compatibility of coroutines with the Java ecosystem is still an open question. The Loom project itself has a long way to go, and a large number of projects are needed to exercise and validate it. If a fundamental network library such as Netty, which is closely tied to threads, exhibits any performance regression, explicit error, or hidden unexpected behavior when interacting with coroutines, the impact on upper-layer applications will be far-reaching.

• Coroutines introduce a new abstraction layer, and for the Turms server this layer is redundant: it only increases resource overhead and learning cost. Especially when writing performance-critical code, we usually write the Java-layer code from the perspective of system calls; Java merely wraps the system calls in a layer of skin, and the thinner that skin, the better. This way we can quickly understand which syscalls the JVM issues, and evaluate whether our Java-layer code is efficient enough and whether there is room for optimization.

• Java has accumulated roughly ten asynchronous-programming solutions so far, but no matter how the surface of Java's asynchronous model is reworked, how the ecosystem changes, or how "revolutionary" the new model claims to be, the system-level calls remain the same: epoll is still epoll, and off-heap memory is still off-heap memory. There is no need for the Turms server to adopt coroutines and introduce an extra abstraction layer just because coroutines are more "fashionable".

• reactor-core not only implements asynchronous calls, but also has stronger expressive power than coroutines. For example, if we want to know metrics such as the success rate and execution time of a pipeline, we only need to call a function such as metrics(...); to retry automatically, we only need to call retry(...); and to switch the thread on which the data stream executes, we only need to call a function like publishOn(...), with the thread scheduling logic remaining fully under our control. A sketch of this style follows below.
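A runnable sketch of this kind of declarative composition (a generic reactor-core example, not Turms code; elapsed() is used here for timing instead of metrics(...), whose availability varies by reactor-core version):

java
import java.time.Duration;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

public final class ReactorExpressivenessDemo {
    public static void main(String[] args) {
        Mono.fromCallable(() -> "query result")        // e.g. some I/O-bound lookup
                .timeout(Duration.ofSeconds(3))         // declarative timeout
                .retry(2)                               // automatic retries in one operator call
                .elapsed()                              // measure how long the step took
                .publishOn(Schedulers.boundedElastic()) // switch the downstream execution thread
                .doOnNext(tuple -> System.out.println(
                        "took " + tuple.getT1() + " ms on " + Thread.currentThread().getName()
                                + ": " + tuple.getT2()))
                .block();
    }
}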

To sum up: stackful coroutines cannot play any role on the Turms server and will not perform better there than reactor-core; the ecosystem still has countless pitfalls that need to be discovered and verified by real projects; the abstraction coroutines add is meaningless for the Turms server and only increases the learning burden; and reactor-core's expressive power is also better than that of coroutines. It is hard to find a reason for the Turms server to use coroutines.

Of course, the points above are mainly about the Turms server project. For most Java projects, the benefits of Loom outweigh its drawbacks; in particular, third-party library authors will no longer need to maintain both a synchronous and an asynchronous implementation.

                Supplement: Disadvantages of reactor-core

As mentioned in the chapter on our use of dependent libraries, the fatal shortcoming of an asynchronous library like reactor-core is that when it is combined with dependencies that advocate "more encapsulation, more abstraction, users do not need to care about the implementation", developers can only hope the server keeps running normally. Once a bug is hit, they soon cannot help asking: "Can an asynchronous framework like reactor-core really be used in production? I cannot even find where the exception was thrown. Can such code really be maintained?" This is why some teams regret adopting reactor-core, and some even rewrite their Java projects in other languages such as Go.

For example, suppose the console reports the error "Netty: the reference count of the ByteBuf has reached 0, and the release operation cannot be performed again". Note that no useful log information has been omitted here: this is all the useful information developers can really get from the log. Even the stack trace has been stripped out, because it is misleading: if developers debug according to that stack trace, they will never find the real root cause. Can developers tell from this single line of log why the exception occurred and which module caused it? This was a real bug in Turms, and the hardest one to find the root cause of: it took more than 6 hours of reading all the network I/O related source code that Turms depends on to discover that Turms was using a stale buffer reference to release a pooled buffer that had already been recycled, causing a memory leak.

                In short, to use reactor-core well, three conditions must be met:

1. All key code must be controllable; otherwise, when something goes wrong, you can only hope that:
• The developers of the third-party library have a high technical level and solid code design skills. If the third-party dependency is itself based on asynchronous programming, the requirements are even higher: its author must be able to anticipate the exceptions upper-layer developers may encounter and propagate them to the upper-layer application through asynchronous means.

• The third-party library is not complicated, so you can quickly read the relevant source code.

  A good example is reactor-netty: its developers have a high technical level and solid design skills, and the code is relatively lean and easy to read.

2. Exceptions must be propagated and logs printed in a standardized manner. Even with asynchronous programming, as long as exceptions are propagated and logs are printed consistently, we can see the cause of most bugs from a single log line; only a few bugs require correlating multiple logs. If this cannot be done, you can only resign yourself to fate when things go wrong.

3. The team must have engineers who are proficient in asynchronous programming.

If any one of the above conditions is missing, developers will sooner or later run into bugs as hard to troubleshoot as "Netty: the reference count of the ByteBuf has reached 0 and it cannot be released again". Therefore, for general technical teams, we recommend the Loom project rather than reactor-core; it may even be advisable to switch programming languages. The Turms project, however, can now meet all of the above conditions, so the situation of "extremely difficult to debug" no longer exists.

                Extras:

• Some articles claim that asynchronous frameworks like reactor-core easily lead to callback hell. However, as mentioned above, reactor-core itself is highly expressive, and in practice developers can write exactly as many levels of nesting as they choose to design. In other words, if the deepest call level of a function is 5, reactor-core can be used to write it with 5, 4, 3, 2, or 1 levels. In practice, the nested callbacks in the Turms server are all nested deliberately, to reduce intermediate objects or to achieve stack allocation (rather than heap allocation); see the Turms server source code for details.
• When developing the turms-admin management system, we try to avoid using await/async. The reason is that turms-admin is eventually transpiled to ES5 syntax, and functions transformed by await/async are very difficult to debug once source maps are turned off, so we avoid await/async as much as possible.
                - + \ No newline at end of file diff --git a/docs/server/module/xmpp.html b/docs/server/module/xmpp.html index 7d397dcc..5a99ce95 100644 --- a/docs/server/module/xmpp.html +++ b/docs/server/module/xmpp.html @@ -17,8 +17,8 @@ -

                XMPP

                Background

                XMPP is an open instant messaging protocol based on XML.

                Turms does not use the XMPP protocol itself because:

                • It is very inefficient:
                  • The data format uses redundant and inefficient XML, and its metadata is often larger than the actual transmitted data.
                  • In XMPP's process design, there are many inefficient designs, such as converting user avatar images into Base64 text for transmission, and the server needs to actively push the modified personal information of a user to other users who subscribe to their presence in the roster.
                • It has poor scalability. Some articles may say that XMPP has strong scalability, but this "strong scalability" is only relative to those protocols with little scalability. A protocol with truly strong scalability is definitely a self-developed one.

                However, considering the following two points, we plan to adapt the Turms server to support the XMPP protocol in the near future:

• The XMPP ecosystem happens to make up for a deficiency of Turms that some developers have pointed out under the Turms project: it is still quite complicated to implement a customized IM application from scratch based on Turms, especially since they need to implement UI interfaces and adapt APIs by themselves. Therefore, Turms is more suitable for teams that want to delve into IM research and development rather than for quick product releases.

                  XMPP has a relatively rich client-side ecosystem, so as long as the Turms server is slightly adapted, it can provide services to XMPP clients. This allows users to quickly offer services through various UI-based XMPP clients while enjoying the benefits of Turms. When users want to create their own dedicated IM application, they can gradually phase out XMPP clients and transition to the Turms client.

                  Note: Due to Turms' positioning, we do not consider tasks related to "providing UI-based clients" in our long-term plans. In other words, we will only consider providing UI-based clients after we have released customized stress testing platforms, data analysis platforms, and implemented various extensions and bug fixes for Turms. Therefore, the priority of this task is very low.

• Most well-known XMPP open source server projects not only have outdated technical architectures and stacks, but also poor code quality and engineering capabilities. For example, the Tigase project, as an open source project that has been developed for decades, still makes a large number of rookie mistakes, such as comparing strings using == or mixing data models with business logic without any code design, which is astonishing.

  Although some open source XMPP servers may advertise their "scalable" architecture, their scalability is not comparable to that of Turms. Turms is a project that tries to push every aspect (including scalability) to the extreme, from the architecture to the code implementation to the database design, so in the field of medium-to-large-scale IM, Turms can decisively outperform them.

                Note: In fact, we do not have a plan to replace other XMPP servers with Turms server because the positioning of XMPP servers and Turms server are very different. One of the main goals of XMPP servers is to achieve open communication for instant messaging (just like email), but the support of the XMPP protocol in Turms server is mainly to allow users to quickly communicate with Turms server using XMPP clients, so as to provide services to the world quickly. Moreover, we do not have a plan to support the communication between Turms servers and other XMPP servers.

                Implementation Principle

                • The turms-gateway server first implements a customized XMPP server internally.

                  Note: Customization is necessary because Turms does not need some of the features specified by the XMPP protocol, so there is no need to implement them. However, the customized XMPP server can still be compatible with standard XMPP clients.

                • When the XMPP server receives requests from XMPP clients, it will convert these requests into corresponding Turms service calls. Therefore, from the perspective of subsequent calls, XMPP client requests and Turms client requests follow similar logic, ultimately achieving interoperability between XMPP clients and Turms clients.

                  Note:

                  • Both use "similar logic" because their business processes are slightly different and not a one-to-one relationship.
                  • XMPP and Turms clients share the same account system, so one account can be used to log in to both XMPP and Turms clients.
                  • XMPP clients do not know about the Turms clients, and vice versa. The reason why they can communicate with each other is that the turms-gateway will convert the data into the protocol format they can understand before sending it.
                + \ No newline at end of file diff --git a/docs/turms-admin.html b/docs/turms-admin.html index e39a4527..77e701cc 100644 --- a/docs/turms-admin.html +++ b/docs/turms-admin.html @@ -17,8 +17,8 @@ -

                turms-admin

turms-admin is a customized backend administration single-page application (SPA) for the Turms project, consisting of five major sections: cluster management (cluster monitoring, cluster configuration), content management, client blacklist, permission control, and client terminal.

Note: turms-admin is positioned only as a visual web application for the Turms server-side Admin API, so turms-admin itself does not provide any data collection, data analysis, or alerting functions.

                Deployment Overview

Turms adopts a design that separates the front end from the back end, so the Turms server is not even "aware" of the existence of the turms-admin front-end project. Users can even open turms-admin directly in the browser as local static HTML files and interact with the Turms server. However, to make operation and deployment easier for developers, the turms-admin project also provides the following two deployment options.

Docker image

shell
docker run -d -p 6510:6510 ghcr.io/turms-im/turms-admin

The image serves the turms-admin static resources through a built-in Nginx server. After running the command, you can access the page at http://localhost:6510.

                Simple web server

The turms-admin project itself also provides a simple web server based on Node.js. This web server serves the turms-admin static resources over HTTP and uses PM2 for turms-admin process management by default.

Installation and Startup Steps

1. Install Node.js
2. In the turms-admin directory, run the npm run quickstart command, which consists of npm install && npm run build && npm run start and covers dependency installation, the front-end build, and server startup. Wait until PM2 reports that the status of turms-admin is online, which indicates that the turms-admin server process has started
3. Open the browser and visit http://localhost:6510

                Common operations and maintenance commands

start: Start the turms-admin server process

stop: Stop the turms-admin server process

delete: Stop the turms-admin server process and delete its process record in PM2

restart: Restart the turms-admin server

reload: Reload the turms-admin server configuration

For more commands and server configurations, please refer to the PM2 documentation

Module Introduction

Cluster management:

• Cluster monitoring: view the real-time operational status of the cluster, as well as the detailed information and metric data of a particular server
• Cluster configuration: corresponds to the global configuration function of the Turms server and can modify the Turms server configuration in real time with zero downtime
• Cluster flight logger: manage the flight logger of each node in the cluster
• Cluster plugins: manage the plugins of each node in the cluster

Content management: create, delete, update, and query various business data

Client blacklist: corresponds to the global blacklist mechanism of the Turms server and is used to add, delete, and query blacklist records

Permission control: used to add, delete, and update administrator information and permissions

Client terminal: a built-in turms-client-js client implementation that lets administrators quickly test real client requests and server responses

                TODO: post GIF demo image

                + \ No newline at end of file diff --git a/docs/zh-CN/client/api.html b/docs/zh-CN/client/api.html index b9106ec9..5f17526a 100644 --- a/docs/zh-CN/client/api.html +++ b/docs/zh-CN/client/api.html @@ -12,12 +12,12 @@ - + -

                接口

                Turms客户端目前支持JavaScript、Kotlin、Swift与Dart这四种语言,对外暴露一致的接口,并且表现为一致的行为。各语言版本之间的部分接口参数可能出现不完全一致的情况,这主要体现在:1. 接口采用更贴近当前语言特性及习惯的参数与语法;2. turms-client-js独有的参数与接口。

                由于Turms各语言客户端行为具有高度的一致性,因此如果您基于上述任意一种语言进行业务开发,您可以在代码逻辑不做改变的情况下,轻松将已写好的业务代码翻译为另外三种语言(具体可参考在本文结尾处的示例)。

                客户端的对外逻辑结构

                • TurmsClient:Turms客户端唯一直接对外暴露的类,一个TurmsClient实例代表着一个客户端与服务端之间的会话连接。以下变量是TurmsClient对外的成员变量。

                  • driver:TurmsClient的运行驱动。负责连接的开起关闭、底层数据的发送接收与心跳控制等基础性操作。以下介绍到的Service层类都基于driver运作。

                  • userService:用户相关服务。负责如用户登陆、添加好友、添加关系人分组、发送/处理好友请求、查询附近的用户等操作。

                  • groupService:群组相关服务。负责如创建群组、变更群主、修改群成员角色、修改群信息等操作。

                  • messageService:消息相关服务。负责如发送消息、修改已发送消息、查询各类消息与其状态、撤回消息等操作。

                  • notificationService:通知相关服务。负责接受与响应业务层面上的通知(比如:其他用户向该用户发送好友请求、群组成员上下线等通知)。 提醒:消息(message)不算做业务层面上的“通知”(notification),因此notificationService不会处理用户消息,用户消息仅由messageService进行处理。而driver中TurmsNotification的“通知”概念指的是网络层面上的Turms服务端给Turms客户端的通知,因此notificationService也不会处理底层的TurmsNotification数据。

                    补充:关于通知功能的开启与关闭,您可以在turms服务端im.turms.server.common.infra.property.env.service.business.NotificationProperties处,实时地进行修改。

                  • storageService:存储相关服务(可选拓展)。负责用户头像、群组头像与消息附件的上传与下载操作。补充:该服务为Turms的拓展服务,因此若您希望使用该功能,您需要将turms-plugin-minio或您自行实现的存储插件集成到turms服务端当中。

                Service中方法的返回值

                与Turms服务端交互的所有Turms客户端接口都基于异步模型编写。turms-client-js使用Promise模型,turms-client-kotlin使用Coroutines模型,而turms-client-swift使用Promise模型(由PromiseKit提供)。

                各种Service可以对Turms所提供的业务数据进行增删改查操作。您需要了解其返回值种类,以开发您自己的业务代码。

                对于状态码为10xx的响应(拓展知识)

                • 对于增加业务数据的方法,如果该方法的返回值被声明为一个异步模型(如:Promise<Response<string>>),则返回的泛型(如前文的string类型)的值必定不为空,否则会抛出一个状态码为INVALID_RESPONSE的错误ResponseErrorResponseException,表明本应该存在的数据丢失。若出现该错误,则意味着Turms服务端或客户端自身存在行为不一致的Bug。

                • 对于删除与更新业务数据的方法,它们均返回被异步模型包裹的Void类型(如:Promise<Response<Void>>)。

                • 对于查找业务数据的方法:

                  如果该类方法返回被异步模型包裹的List类型,则当服务端返回空数据时,该查找操作方法会返回一个空List,而非null或undefined。

                  如果被包裹的类型不是List类型,则当服务端返回空数据时,该查找操作方法会返回一个undefined(JavaScript)或null(Kotlin)或nil(Swift)。特例:answerGroupQuestions方法可以算做查询方法,但其返回数据永不为空。

                对于状态非10xx的响应(拓展知识)

                这类响应均被认作是“错误”状态响应。Service中的方法会通过异步模型抛出ResponseErrorResponseException,并且这些错误或异常实例均会携带具体的响应状态码与错误原因。

                主要接口差异(拓展知识)

                通常情况下,您并不需要关心各客户端接口之间的差异,但如果您的团队需要由一名开发者基于多个Turms客户端进行上层的开发工作,或者您需要对照您项目的上层客户端代码实现的异同,您可以了解一下客户端间主要接口的不同。

                在早期Turms客户端实现中,各客户端之间的接口参数与数据模型是尽量保持统一的参数配置与含义,如时间相关的参数。但这种强行统一的写法不符合目标语言习惯。同时考虑到在大部分情况下,各客户端的上层业务代码通常有专人负责,而非全由一名开发者负责,统一含义意义不大,并且这些差异也符合目标语言习惯,故不进行强制统一。

                客户端主要接口的差异如下表:

                JavaScript客户端Kotlin客户端Swift客户端Dart客户端示例
                时间单位一律为毫秒一律为毫秒采用TimeInterval(即秒)一律为毫秒connectTimeout
                响应异常模型ResponseError(继承自Error)ResponseException(继承自RuntimeException)ResponseError(继承自Error)ResponseException(继承自Exception)
                异步模型PromiseCoroutines由PromiseKit提供的PromiseFuture

                补充:对于对外暴露的回调函数实现,Turms的Swift客户端没有采用Swift常见的delegate代理模式,而是和其他语言客户端一样通过函数传递逃逸闭包。

                理解接口(重点)

                Turms所有客户端的接口都非常容易理解与使用。开发者甚至不需要看Turms客户端有什么接口,只需要凭借基本的IM业务知识就能反推Turms会有什么接口。

                开发者一般只需要记住:

                • 通过new TurmsClient(...)创建Turms客户端实例
                • 在上文客户端的对外逻辑结构提到的:Turms客户端分为五个服务:userService(用户相关服务)、groupService(群组相关服务)、messageService(消息相关服务)、notificationService(通知相关服务)、storageService(存储相关服务、可选拓展)。

                之后我们就能凭借业务知识反推Turms客户端会有什么接口了,比如:

                • 用户首先要能登陆,于是先想到其对应的服务userService用户相关服务。既然是登陆所以找找有没有login方法,于是自然地就找到了client.userService.login(...)方法。
                • 登陆后,用户需要能够发消息,那就先想到messageService消息相关服务,再看看有没有类似sendMessage的方法,于是找到了client.messageService.sendMessage(...)方法。
                • 既然能发消息,那有什么方法能监听收到的消息呢?既然跟消息有关,那依旧想到的是messageService,于是想到方法可能是onMessagesubscribeMessageaddMessageListener,代码里找一找,找到了client.messageService.addMessageListener(...)
                • 既然能监听收到的消息,那怎么监听接收到的通知呢?既然跟通知有关,那想到的就是notificationService通知相关类服务,并且既然监听收到的消息的方法叫addMessageListener,那监听通知的方法就应该是addNotificationListener了,于是找到了client.notification.addNotificationListener

                综上,开发者一般只需凭借基本的业务知识就能反推Turms客户端提供的接口,甚至不需要读Turms客户端的源码。

                而对于高级开发者,Turms客户端也开放了driver对象,让开发者自行实现一些相对底层的操作。另外,如在会话的生命周期提到的,Turms客户端是故意设计的清晰易懂,故意不提供诸如自动重连、自动路由跳转等操作,因为一方面开发者可以很容易地自行实现该类逻辑,另一方面,这类“隐藏”的内部逻辑会使得上层开发者难以把控底层驱动行为,在一些时候反而会成为绊脚石。

                具体示例

                以下示例包括turms-client-js/kotlin/swift/dart四个版本,并且其作用等价。具体包括了以下业务操作:初始化客户端、登录、监听会话连接断开(下线)、监听通知、监听新消息、查询附近的用户、发送消息、创建群组操作。

                体验示例前的服务端准备工作

                • 方案一:无需在本地搭建Turms服务端,用户直接在本地通过客户端API连接Playground上的turms-gateway服务端(WebSocket端口:http://playground.turms.im:10510;TCP端口:http://playground.turms.im:11510)。但注意及时将本地客户端升级到最新版本,以避免出现因为服务端侧的接口更新,导致数据不一致的问题。
                • 方案二:在application.yaml配置文件中更新以下配置:
                  1. turms.gateway.session.enable-authentication设置为false(取消用户登录认证)
                  2. turms.service.message.allow-sending-messages-to-stranger设置为true(允许没有用户关系的用户互相发送消息)
                • 方案三:使用自带dev profile配置。因为Turms提供的devprofile已做了上述配置。默认情况下,Turms发布包中的application.yamlprofile字段为空,即默认的profile不是dev,需要您手动配置为dev

                代码示例

                javascript
                // Initialize client
                 const client = new TurmsClient(); // new TurmsClient('ws://any-turms-gateway-server.com');
                 
                 // Listen to the offline event
                @@ -372,7 +372,7 @@
                         intro: 'nope'))
                     .data;
                 print('group $groupId has been created');
                - + \ No newline at end of file diff --git a/docs/zh-CN/client/communication-protocol.html b/docs/zh-CN/client/communication-protocol.html index d9214cb3..7086f9ce 100644 --- a/docs/zh-CN/client/communication-protocol.html +++ b/docs/zh-CN/client/communication-protocol.html @@ -17,8 +17,8 @@ -

Data Format Used When Communicating with the Server

For general requests and responses:

• Clients based on raw TCP: varint-encoded body length + body (a Protobuf-encoded TurmsNotification or TurmsRequest)
• Clients based on WebSocket: body only (a Protobuf-encoded TurmsNotification or TurmsRequest); the byte length of the body is carried by the underlying WebSocket frame

For heartbeat requests (a framing sketch follows this list):

• Clients based on raw TCP: a one-byte array [0]. The value 0 here means "the payload length, encoded as a one-byte varint, is 0", i.e. the payload is 0 bytes.
• Clients based on WebSocket: a Binary message with an empty (0-byte) body
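A minimal sketch (our own illustration, not the actual client code) of the TCP framing just described:

java
import java.io.ByteArrayOutputStream;

public final class TcpFramingSketch {

    // Base-128 varint encoding of a non-negative length, as used by Protobuf.
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // A normal frame: varint-encoded body length followed by the Protobuf-encoded
    // TurmsRequest/TurmsNotification bytes.
    static byte[] frameBody(byte[] protobufEncodedBody) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, protobufEncodedBody.length);
        out.writeBytes(protobufEncodedBody);
        return out.toByteArray();
    }

    // A heartbeat is just an empty payload, i.e. the single varint byte "0": [0].
    static byte[] heartbeatFrame() {
        return new byte[]{0};
    }

    public static void main(String[] args) {
        System.out.println(frameBody(new byte[]{1, 2, 3}).length); // 4: 1 length byte + 3 body bytes
        System.out.println(heartbeatFrame().length);               // 1
    }
}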

Supplement: The reasons why Turms does not implement heartbeats via WebSocket PING/PONG are:

• Browsers' WebSocket implementations send PING messages at different intervals
• Upper-layer code cannot control PING/PONG behavior, or even detect that it has happened
• Network-level heartbeat logic should not be coupled with application-level heartbeats
                - +
                Skip to content

                与服务端通信时使用的数据格式

                对于一般请求与响应而言:

                • 基于纯TCP协议实现的客户端:varint编码的正文长度 + 正文(Protobuf编码的TurmsNotificationTurmsRequest
                • 基于WebSocket协议实现的客户端:正文(Protobuf编码的TurmsNotificationTurmsRequest)。正文的字节长度信息通过底层的WebSocket Frame传输

                对于心跳请求而言:

                • 基于纯TCP协议实现的客户端:一个长度为一字节的[0]字节数组。这里的数值0其实是指“该Payload的长度在varint编码下为一字节长度的0”,即Payload为0字节。
                • 基于WebSocket协议实现的客户端:一个正文为空(0字节)的Binary类型消息
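下面用一段JavaScript示意上述纯TCP传输格式(仅为概念示意,并非turms-client的实际实现;WebSocket客户端无需自行编码长度,长度信息由底层的WebSocket Frame承载):

javascript
// 将非负整数编码为varint(Protobuf的base-128小端编码)
function encodeVarint(value) {
    const bytes = [];
    while (value > 0x7F) {
        bytes.push((value & 0x7F) | 0x80);
        value >>>= 7;
    }
    bytes.push(value);
    return Uint8Array.from(bytes);
}

// 纯TCP客户端的一般请求帧:varint编码的正文长度 + Protobuf编码的正文
function frameRequest(protobufBody) {
    const header = encodeVarint(protobufBody.length);
    const frame = new Uint8Array(header.length + protobufBody.length);
    frame.set(header, 0);
    frame.set(protobufBody, header.length);
    return frame;
}

// 纯TCP客户端的心跳:单个字节0,即“0字节正文”的varint长度编码
const HEARTBEAT = Uint8Array.of(0);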

                补充:Turms不通过WebSocket的PING/PONG来实现心跳的原因是:

                • 各浏览器WebSocket实现的PING消息发送时间间隔不同
                • 上层代码无法控制PING/PONG的行为,甚至无法感知行为的发生
                • 网络层面的心跳逻辑不应该和应用层的心跳耦合
diff --git a/docs/zh-CN/client/metrics.html b/docs/zh-CN/client/metrics.html
index 7afdee29..a9c7413c 100644

                度量数据

                参考文档:可观测性体系

                网络连接度量

                Turms各客户端会提供网络连接相关的度量数据。开发者可以通过turmsClient.driver.connectionMetrics获取该度量数据对象。该对象包含以下数据:

• addressResolverTime(单位:毫秒):域名解析用时。turms-client-js不提供该数据
• connectTime(单位:毫秒):对于非turms-client-js客户端,该数据指TCP握手用时;对于turms-client-js客户端,该数据指域名解析、TCP握手、TLS握手、建立WebSocket连接的总用时
• tlsHandshakeTime(单位:毫秒):TLS握手用时。turms-client-js/swift不提供该数据
• dataReceived(单位:字节):对于非turms-client-js客户端,该数据指TCP接收到的数据字节数;对于turms-client-js客户端,该数据指WebSocket连接接收到的Binary帧Payload数据字节数
• dataSent(单位:字节):对于非turms-client-js客户端,该数据指TCP已发送的数据字节数;对于turms-client-js客户端,该数据指WebSocket连接已发送的Binary帧Payload数据字节数
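以turms-client-js为例,读取并打印上述度量数据的方式大致如下(字段名取自上表;度量对象的具体结构以实际版本的类型定义为准):

javascript
// 仅为示意:连接建立后读取连接相关的度量数据
const metrics = turmsClient.driver.connectionMetrics;
console.log(`connectTime: ${metrics.connectTime} ms`);       // 对turms-client-js而言是建立WebSocket连接的总用时
console.log(`dataReceived: ${metrics.dataReceived} bytes`);  // 已接收的Binary帧Payload字节数
console.log(`dataSent: ${metrics.dataSent} bytes`);          // 已发送的Binary帧Payload字节数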

                业务请求度量

                TODO

diff --git a/docs/zh-CN/client/quick-start.html b/docs/zh-CN/client/quick-start.html
index 5ad86c7c..c74d42fe 100644

                        Quick Start

                        1. 克隆Turms仓库(目前客户端代码均未发布到公开的依赖仓库中)。参考命令:git clone --depth 1 https://github.com/turms-im/turms.git

                        2. 在您的项目中,引入对应的客户端实现。具体操作如下:

                          • 对于使用turms-client-js的项目:

                            首先进入到turms-client-js子项目的目录,执行命令npm run quickbuild,该命令会安装相关依赖并编译turms-client-js的发布包。然后:

                            • 对于使用模块的项目:
                              • 安装:在package.jsondependencies下添加:"turms-client-js": "file:<YOUR_OWN_PATH>/turms-client-js"即可
                              • 使用:通过import TurmsClient from 'turms-client-js'引入Turms客户端实现
                            • 对于不使用模块的项目:在HTML上添加:<script type="text/javascript" src="<YOUR_OWN_PATH>/turms-client-js/dist/turms-client.iife.js"></script>,并直接使用全局对象TurmsClient
                          • 对于使用turms-client-kotlin的项目:

                            • 安装:在turms-client-kotlin子项目的目录下,执行命令mvn clean install,该命令会将turms-client-kotlin编译并安装其JAR包到本地Maven仓库。

                            • 使用:

                              • 对于Maven项目,添加:

                                xml
                                <dependency>
                                     <groupId>im.turms</groupId>
                                     <artifactId>turms-client-kotlin</artifactId>
                                     <version>0.10.0-SNAPSHOT</version>
                                @@ -40,7 +40,7 @@
                                     path: <YOUR_OWN_DIR>/turms_client_dart
                                dependencies:
                                   turms_client_dart:
                                     path: <YOUR_OWN_DIR>/turms_client_dart
                            • 编写业务逻辑代码
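以使用模块、采用turms-client-js的项目为例,安装完成后的最小业务代码大致如下(登录参数仅为示意):

javascript
import TurmsClient from 'turms-client-js';

async function main() {
    // 创建客户端实例(也可传入自定义的turms-gateway地址)
    const client = new TurmsClient();
    // 登录(用户ID与密码仅为示意)
    await client.userService.login({
        userId: '1',
        password: '123'
    });
    console.log('login succeeded');
}

main();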

diff --git a/docs/zh-CN/client/requirements.html b/docs/zh-CN/client/requirements.html
index 6a3c8d39..b34c6af1 100644

                        版本要求

                        Turms客户端对版本的最低要求,主要是根据:平台全球市场占有率、平台TLSv1.2最低支持版本与代码实现的优雅程度,三个因素来考量。另外,Turms不提供对TLSv1与TLSv1.1等被时代淘汰协议的官方支持。

• Android:最低支持21+。原因:考虑到21+的市场占有率与代码实现优雅程度,故支持21+
• iOS:最低支持12.0+。原因:考虑到iOS 12.0+在全球的市场占有率以及苹果产品用户的习惯,turms-client-swift采用NWConnection实现TCP协议,因此设备版本的要求等同于支持NWConnection设备的版本要求。另外,turms-client-swift不会考虑用古老的CFStreamCreatePairWithSocketToHost来实现TCP协议
• 浏览器:支持WebSocket协议的浏览器。对于IE浏览器,turms-client-js仅对IE 11提供官方支持。另外,turms-client-js不会将WebSocket降级为轮询机制
• 桌面端:turms-client-kotlin(JDK 8+)、turms-client-js(Node.js 8+)。如果您采用turms-client-kotlin实现,则要求JDK版本为8(+),因为JDK 8+默认提供对TLSv1.2的支持;如果您采用turms-client-js实现,则Turms提供对Node.js 8+的官方支持

                        补充

                        • turms-client-kotlin采用的是Socket,而非SocketChannel。其中最主要的原因是:Android SDK不对SocketChannel提供一套标准的TLS协议实现,需要自行实现。考虑到安卓系统的五花八门且系统功能本身就比较受限(尤其相比服务端实现),自行实现TLS协议极易导致各种意料之外的Bugs,故使用Socket以采用官方的TLS协议实现。
diff --git a/docs/zh-CN/client/session.html b/docs/zh-CN/client/session.html
index 10deb3ba..e3a79bae 100644

                        会话的生命周期

                        Turms客户端的会话生命周期比较容易理解,具体而言:先通过driver.connect(...)进行网络层的连接,而后通过userService.login(...)进行业务层面上的登录操作,在登录成功后,对应的会话就建立了。最后再通过userService.logout(...)方法向服务端发送会话关闭通知,同时也会关闭网络层连接。

为了保持逻辑简单,也为了方便上层开发者自行组合各种逻辑,Turms不提供诸如自动重连、自动路由跳转等操作:一方面开发者可以很容易地自行实现该类逻辑;另一方面,这类“隐藏”的内部逻辑会使得上层开发者难以把控底层驱动行为,在一些时候反而会成为绊脚石。

                        拓展:如同WebSocket基于关闭帧的会话关闭机制,Turms服务端在关闭会话时,也会通过一个会话关闭信令来通知客户端该会话已关闭,并在信令被Flushed后,通知底层WebSocket/TCP关闭连接。Turms服务端不需要等待客户端对会话关闭信令的任何响应,客户端也不会向服务端发送有关会话关闭信令的响应。

                        生命周期回调钩子

• 网络层 driver.addOnConnectedListener:当网络层连接建立时调用。提醒:通常您并不需要通过addOnConnectedListener来添加连接监听事件,而是在driver.connect(...)异步执行成功后,执行自定义代码
• 网络层 driver.addOnDisconnectedListener:当网络层连接断开时调用
• 业务逻辑层 userService.addOnOnlineListener:当会话建立,即用户上线时调用。提醒:通常您并不需要通过addOnOnlineListener来添加上线监听事件,而是在userService.login(...)异步执行成功后,执行自定义代码
• 业务逻辑层 userService.addOnOfflineListener:当会话断开,即用户下线时调用
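结合上述生命周期与回调钩子,一个典型的“登录、监听下线、登出”流程大致如下(以turms-client-js为例;连接与登录参数仅为示意):

javascript
// 仅为示意:会话的建立、下线监听与关闭
const client = new TurmsClient();

// 会话断开(即用户下线)时触发
client.userService.addOnOfflineListener(info => {
    console.log('offline:', info);
});

async function run() {
    // 先建立网络层连接(连接参数按实际接口传入,此处省略)
    await client.driver.connect();
    // 再进行业务层登录,登录成功即会话建立(参数仅为示意)
    await client.userService.login({ userId: '1', password: '123' });
    // ……业务逻辑……
    // 最后通知服务端关闭会话,同时也会关闭网络层连接
    await client.userService.logout();
}

run();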
diff --git a/docs/zh-CN/client/turms-chat-demo.html b/docs/zh-CN/client/turms-chat-demo.html
new file mode 100644
index 00000000..a132137e

                        Turms Chat Demo

                        背景

                        最初,我们是计划先通过让turms-gateway支持XMPP协议来让用户能够自行复用世界上已有的XMPP客户端。但是不管是收费,还是免费的XMPP客户端质量基本都不高,主要体现在:

                        1. 大多XMPP客户端项目代码质量差,尤其是很多早期客户端工程师的代码功底很差,甚至会把复杂的UI逻辑与业务逻辑杂糅在一起写(比如著名开源项目JMeter),二次开发不如自己重写。
                        2. 不管是商业还是开源的UI设计水平基本都停留在业余爱好者水平。如果一个客户端项目没有专业的UI,我们会对其团队的前端工程师与UI设计师的能力表示怀疑(团队中只要有一位靠谱的、中级水平的前端工程师,就应该有独立设计单一产品UI的能力),也不会推荐用户去用他们的方案。
                        3. 几乎没有一个开源的XMPP客户端支持完整的跨平台方案。
                        4. 很多质量不高的XMPP客户端甚至需要收费。

考虑到提供一套跨桌面端与移动端IM应用的开发难度不高,主要是体力活,并且IM应用的UI与功能通用性强(在市面上找10款IM商业应用调研,会发现至少有9款IM的UI与功能是基本类似的),因此决定先提供IM客户端Demo(turms-chat-demo-flutter),让Turms的用户能够自己使用或二次开发,之后再支持XMPP协议。

                        RoadMap

• 2023年11月~12月:完成桌面端UI设计;搭建Flutter项目框架;完成桌面端基础组件开发与测试;完成Windows桌面端UI开发与测试。
                        • 2023年12月~2024年1月:完成MacOS桌面端的UI适配工作;完成移动端基础组件开发与测试;完成Android手机端的UI开发与测试。
                        • 2024年1月~2024年2月:完成iOS手机端的UI适配工作。
                        • 2024年2月~3月:完成Web端的UI开发。
                        • 2024年3月~4月:集成turms-client-dart与实现IM业务逻辑(上述任务只有UI开发与测试,不包括业务逻辑)。

                        另外:

                        • 考虑到Turms的其他任务、节假日与工作情况,上述时间可能会略有变动。
                        • 无计划支持小程序。

                        简介

                        我们想着重提醒项目名中的一词——demo。该词主要有以下几种含义:

1. 不管是从产品角度,还是从技术角度,该客户端demo都只不过是其中可能的方案之一,用户不应该因为该demo而限制设计自身IM产品的能力,尤其不要认为Turms的服务端是为该demo定制的,正如Turms文档中反复提及的,Turms是一个通用IM解决方案,致力于解决各种IM场景。
                        2. 为用户的二次开发做准备。这主要分为三个方面:
                          1. UI与业务逻辑分离。方便需要二次开发的团队复用UI来实现自己的业务逻辑。
                          2. 依旧采用宽松的Apache 2.0,而不是客户端开源项目常见的、更加严格的GPL协议。
3. 由于全球范围的IM应用的UI设计都非常类似,因此该demo也会实现大部分IM的通用UI与逻辑,一般不提供更为定制化的逻辑,以方便其他团队二次开发。

                        注意:demo没有质量低的含义,这点读者之后看代码质量与UI设计就可明白。

diff --git a/docs/zh-CN/client/turms-client-js.html b/docs/zh-CN/client/turms-client-js.html
index 08ceb1ea..0f7a1d48 100644

                        turms-client-js共享上下文

                        背景

                        由于Turms服务端不支持也没计划支持:一个用户在同一个端同时建立多个会话。因此如果一个用户在浏览器打开多个标签页,并试图以相同的用户ID与设备类型进行登陆时,那么有且仅有一个会话可以建立成功。从浏览器角度来看,就是有且仅有一个标签页能够登陆成功。该场景适用于一般的社交应用。

                        适用场景

                        但部分IM场景需要支持:从用户角度来看,用户只需在一个页面登陆一次,那么其他标签页也就处于已登陆状态了,在所有标签页里的Turms客户端都能以相同的用户身份,发送请求、接收消息与通知。适用于客服系统等场景。

                        为了支持上述场景,需要使用共享上下文。具体而言,对于在不同标签页的同域(同协议;同域名;同端口)、同用户ID且同设备类型的Turms客户端,它们可以共享与Turms服务端的WebSocket连接与一些已登陆用户的信息。

                        提醒:因为只有同域名、同用户ID且同设备类型的Turms客户端才共享上下文,因此您的客户端可以以不同的用户身份登陆不同的标签页,以支持类似“部分标签页共享A用户的会话,部分标签页共享B用户的会话”的特性。

                        使用方法

                        turms-client-js默认不开启共享上下文功能,而如果您的应用需要使用该功能,可以通过在创建TurmsClient实例时,传递一个useSharedContext: true开启。具体代码如下:

                        javascript
                        var client = new TurmsClient({
                             useSharedContext: true
                         });
                        var client = new TurmsClient({
                             useSharedContext: true
                        @@ -83,8 +83,8 @@
                             userId: 1,
                             password: "123",
                             deviceType: DeviceType.ANDROID
});
diff --git a/docs/zh-CN/community/index.html b/docs/zh-CN/community/index.html
index 13ae3558..0090271e 100644

                        社区

                        FAQ

                        为什么Issues使用英文?

                        最根本的原因:Issues使用单语言书写,方便搜索。在Issues的使用过程中,最怕遇到使用多语言的开源项目,因为当要搜索一个问题时,比如“Turms服务端的黑名单机制是如何实现的”,对于中英文双语项目,我们通常需要搜索“黑名单”与“blocklist”这两个关键词,换言之,需要搜索至少两次才能保证能搜全相关的Issues,用户搜索体验极差。而如果Issues只有英文,那用户只需搜索“blocklist”关键词。

                        次要原因:使用英文方便在全球开源与推广,而使用非英文语言就与我们开源的宗旨背道而驰了。

                        另外,我们不排斥用户使用非英语语言提Issue,只是鼓励用户多用英文。但我们回复时一定是使用英文。

                        为什么不建立QQ群、微信群、Slack频道或其他群?

                        使用各种群做Issues管理与讨论是一种非常糟糕的实践,Issues管理本来就应该优先使用GitHub的Issues板块。原因如下:

                        Issues板块:

                        • 可以针对一个问题进行集中讨论
                        • 方便后来的用户对各种问题进行搜寻
                        • 开发人员可以通过Issues做任务追踪
                        • 用户可以通过Issues查看各种任务的进度,公开透明

                        而各种群显然做不到上述功能。相反地,各种群是项目信息闭塞的表现,与开源的宗旨背道而驰。部分开源项目会故意靠阻塞信息流通,以赚取咨询费或服务费,但这就不是Turms的开源宗旨了。

                        在实践过程中,群甚至是视频会议更多地用于开发人员内部进行快速讨论,尤其是早期草案的讨论,但最终的讨论结果与其中涉及到的关键问题其实还是会记录在Issue或文档上,以方便用户与开发人员明白一个问题的来龙去脉。

                        可以提“新手问题”吗?

                        Turms项目中没有所谓的“新手问题”,只有“与Turms项目相关的问题”与“与Turms项目不相关的问题”。每个人在接触新领域时,都可能表现地“不是很专业”,我们作为新人更多地希望在这个领域的人多些善意,多些包容。同理地,只要是与Turms项目相关的问题,我们都会答复。并且在遇到“很基础的问题”时,我们通常想得不是“这个问题很糟糕”,而是“可不可以补充些文档,或优化下文档,给新用户多些引导”,因此用户不必担心提出所谓的“新手问题”。

                        另外就是态度问题,只要大家互相尊重,那什么问题都可以讨论,而提问时展示尊重有一个很简单且实用的判断方法,就是根据“一个人在提问时,展示了ta在这个问题上花了多少的时间与精力”,而不可取的常见态度有:1. 自己不读文档、不先查Issues、也不肯思考,直接开问;2. 居高临下。

                        当然,学习如何提问也是件很有意思的事情,具体可以参考:提问的智慧

                        可以用类ChatGPT生成的回答来参与讨论吗?

ChatGPT是一个优秀的背诵者,但它对各种技术方案的分析都很天真。用ChatGPT直接参与问题的讨论只会体现出该人:对自己的发言缺乏思考,对项目缺乏负责的态度。因此我们是否回复这类内容,取决于对方的发言在去除ChatGPT生成的部分之后,还剩下多少自己的思考。

                        这里提一下为什么这么关注“态度”的问题。其实有过工作经验的工程师应该都有过类似的体验:自己的工作需要依靠其他组的配合,尽管某件事情在技术上很简单的事情,但由于其他组成员的工作懒散与消极配合导致这件事情一直推不动,进而导致你自己的项目举步维艰。因此在需要团队配合的项目中,自己可控的技术问题通常是最容易解决的事情,而督促各方项目组配合,并在截止日期完成工作,才是最难与最费劲的事情。

一些还没入行的工程师,会把技术当作工程师的第一生存要义,但其实负责任的态度才是在工作或社区中的第一生存要义(当然,一个人如果真的对项目认真负责,那ta的技术水平也不会差)。除了少数需要特定领域知识的项目,对于大部分项目,大部分合格水平工程师展现出来的技术水平都大差不差,大家真正能体现出差异的,更多的是自己对一个项目是否认真负责的态度。

                        因此为了表现出自己负责任的态度,请不要直接使用类ChatGPT生成的回答来参与讨论。

                        如何判断是不是类ChatGPT回复

                        1. 由于GPT生成的文风过于明显,通常人工直接识别即可。
                        2. 通过Hugging Face开源的Hello-SimpleAI/chatgpt-detector-roberta来在线识别。
                        3. 即便随着GPT发展,展现出来的文风更多。但如今预训练语言模型繁荣,各种语料库也很多,因此基于迁移学习训练一个检测GPT回复的新模型,快得话只需要1天,慢的话也就2~3天。

                        关于“上游优先”

                        开发者直接与开源社区进行互动,并在源头上解决问题的办法,被称为上游优先。

                        对于Turms而言,上游优先主要涉及两个方面:沟通与代码回馈。

                        • 沟通:做特性或改Bug前,最好事先在GitHub开Issue。有些特性看起来好像很常见,也容易实现,但Turms目前却没有实现,那有可能这个“看起来”简单的特性通常会涉及很多细节,比如:

                          • 这一个需求有其他相关需求或拓展需求吗?
                          • 这一个需求可以这么实现,那这类相关特性都可以这么实现吗?是需要一个个单独实现,还有该代码实现是通用的,这一个模板可以实现几乎所有相关需求?
                          • 不论是在单机与分布式场景都可以这么实现吗?
                          • 换个业务视角或技术视角,还有更优秀的设计与实现吗?

                          因此一个“看起来”简单的需求,其背后都可能涉及大量需求分析与技术分析,如果开发者在本地默默地就把一些特性给实现了,则在回馈代码时,还将面临上述一系列问题,而如果这时才发现实现中存在一些重大设计问题,那可能之前的一部分精力就浪费了(当然,还是有收获的,至少知道“当前这个方案还有优化空间”)。因此,开发者在面对复杂特性时,最好做好“设计可能被反复推翻”的心理准备

为了尽量避免这种情况,开发者在设计与实现可能复杂的特性时,最好事先在Issues区开一个新的讨论,以尽量减少设计被推翻的次数,节约开发者的时间与精力。

                          注意:其实有时候就算前期设计完了,在实现的过程中又会发现更精妙的设计,越是复杂的功能,通常也伴随着更多的设计迭代。但是,这些“推翻/半推翻”级别的迭代最好在代码发布之前,就反复讨论与开发完成,而不是代码发布了才发现。

额外补充:正是因为需求的复杂性,所以Turms很多“看起来简单”的Issues是处于“悬而未决”的状态。Turms GitHub Issues区中,有大量Open Issues,很多特性相关的Issues只是一个种子,需要开发者自行做更细致的需求分析、设计与编码,而其中最难的通常就是需求分析,需要弄清楚“到底要做什么”,开发者既要考虑现在的需求,也得考虑未来的需求,还得防止过度设计,这也是为什么Turms文档中好几次提到类似“IM业务功能的设计与实现其实远比技术中间件的设计与实现难得多”。

                        • 降低自己的维护成本,方便持续性地合并上游更新。如果开发者Fork Turms项目做复杂的二次开发,那将面临一个长期维护的问题:如果开发者想要使用上游的新代码,就需要不断地在自己的分支上做适配,而上游Turms服务端更新得越快,开发者的适配工作量就越大。甚至还有可能出现逻辑上的冲突,但开发者没意识到。

                          相反的,如果开发者将代码回馈给上游,那就不会出现这类问题。因为我们不仅会一起来维护这些被回馈的代码,而且在为Turms设计其他新的相关功能模块时,也会考虑这些新设计与这些被回馈的代码在设计上是否一致。

• 减少维护冲突,避免反复推翻本地实现。可能开发者自己在本地添加了一些新特性或者修复了一些Bugs,但都没有回馈。过一段时间之后,开发者可能会发现上游比自己实现的功能考虑得更周全且完善,一些Bugs的修复更精妙(关于Turms服务端Bugs的难度介绍,读者可以阅读关于任务难度),最终开发者不得不把自己原来做的工作全Revert,然后再重新拉取上游代码,从头做一遍适配。这其中的工作量想想就令人感觉痛苦,开发者在本地改得越多,冲突也就可能越多。

                        关于想找Turms作者私聊与做定制化开发

                        如果读者的团队是想自己做二次开发,可以直接看文章关于二次开发

一些用户想付费找Turms作者做定制化开发,但Turms作者一般只接受为通用需求做无偿开发(是的,一般只接受免费帮社区做开发)。其中的原因很简单:Turms作者不缺钱,即便Turms项目持续每年亏损几万人民币,我们也能保证Turms这个项目持续运营下去,因为我们从始至终压根没打算盈利。所以要么只接受用户很高的报价,高到令人难以拒绝,要么就只接受为社区做免费的开发。

                        因此除非您已经准备好要报一个很高的价格,否则不要尝试私聊Turms的作者帮您做定制化开发。如果您真得很想让Turms作者尽早排期完成您的需求,您可以把自己的需求描述清楚并发到Issues区,然后我们会根据需求的性价比,以及您对您自己所提需求的尊重,来进行排期。

                        当然,如果您甚至愿意付Turms作者很高的定制费用,那我也建议您直接考虑用商业化方案,尽管ta们的开发水平、工作态度与工作责任心大概率不如Turms作者。当然,这主要取决于您决定采取哪个国家与公司提供的解决方案。

                        相比于免费开发,定制开发的区别在于:

                        • 会给出完整的、阶段性的工程排期,如设计、开发、测试、交付等。

                        • 帮忙设计需求。读者可能会好奇,既然都定制了,为啥还需要Turms作者来设计需求。这其实就像是亨利福特说的“如果我问人们他们想要什么,他们会说更快的马”。用户要求的不一定是他们内心真正所需的,而时刻洞察用户真正的需求正是工程师所需的必备技能之一。

                        • 定长的工作时长保证。在这段时间里,可以做仅限于项目相关的定制化设计、开发、测试、部署、解答各种疑问等。

                          当然,以上的内容都是Turms作者在下班时间进行的。

一些用户可能担心,因为自己没付钱,Turms作者就会故意拖慢自己想要特性的开发与发布进度,但这也不会发生,因为Turms作者不缺钱,也没打算靠做开源盈利,因此没有故意拖慢的动机。

diff --git a/docs/zh-CN/design/architecture.html b/docs/zh-CN/design/architecture.html
index cf70f8a2..8e98e111 100644

                        架构设计

                        架构特性

                        通用架构特性

                        1. (敏捷性)支持在用户无感知的情况下,对Turms服务端进行停机更新,为快速迭代提供可能
                        2. (可伸缩性)无状态架构,Turms集群支持弹性扩展与异地多活的部署实现,用户可通过DNS就近接入
                        3. (可部署性)支持容器化部署,方便与云服务对接,以实现全自动化部署与运维
                        4. (可观测性)具备相对完善的可观测性体系设计,为业务统计与错误排查提供可能
                        5. (可拓展性)能同时支持中大型即时通讯场景,即便用户体量由小变大也无需重构(当然,对于大型运用场景还有很多优化的工作需要做,但当前架构不影响后期的无痛升级)
                        6. (安全性)提供限流防刷机制与用户/IP黑名单机制,以抵御大部分CC攻击
                        7. (简单性)核心架构“轻量”,方便学习与二次开发(原因请查阅 Turms架构设计
                        8. Turms使用MongoDB分片架构,以支持请求路由(如读写分离),同时也支持跨地域多活部署与数据主主同步,为大规模跨国部署提供实际操作的可能

                        架构说明

                        参考架构图

                        与其他IM项目的架构区别

                        跟Turms服务端的代码实现一样,Turms的架构设计也是非常“抠门”的,能尽量不拆的服务就不拆,能尽量不引入的外部服务就不引入,具体体现在:

                        • 在部分IM项目的架构设计中,它们会把turms-gateway的会话管理中转消息缓存消息发送三大块功能独立分成三个服务,来实现业务解耦与流量削峰。但其相比Turms的架构而言,多增加了两个故障点,增加了开发与运维难度,且需要使用RPC操作,吞吐量也更差。具体而言:

在业务解耦方面,部分IM项目会通过中转消息缓存的消息队列,让下游消费者异步消费消息,以实现各种统计功能。但通过消费消息队列中的数据来实现消息的统计是很糟糕的设计,更全面、更专业与更简易的实现是分布式采集与分析业务日志(如基于AWS厂商的CloudWatch Logs => Kinesis Firehose => S3 => Athena/QuickSight方案),这点在可观察体系的日志小节有具体说明。而turms-gateway的会话管理与消息发送之间的逻辑并不复杂,解耦的意义不大,故没这方面需求。

                          在流量削峰方面,如今早已是云服务的天下,弹性伸缩服务(Auto Scaling)相比消息队列(如Kafka、RocketMQ或其他云服务)更适合实现流量削峰。各云服务厂商均提供资源监控功能,而弹性伸缩服务能根据各种系统指标(如CPU/内存利用率)与自定义的其他指标(如在线用户数)自动弹性伸缩,在资源闲置的时候又能自动释放,更符合现代运维模式。以AWS云服务为例,运维人员可以使用CloudWatch监听上述的Turms服务端度量数据,并配合Application Auto Scaling做自动化的服务器资源扩缩容。如果运维人员熟悉这些操作,从零开始购买这些云服务到配置完成,大概只需3~10分钟。

在高可用方面,部分IM架构会使用高可用(多可用区部署)的消息队列云服务与自研的消息发送服务来消费该队列,以保证通知不丢。但在Turms的架构设计中,就算Turms的消息推送服务端turms-gateway被强制关闭(如硬件故障,服务器直接宕机),Turms服务端集群也能自愈。并且因为在Turms的流程设计中,基于Turms客户端开发的应用本身每次在与Turms重连时(对应turmsClient.userService.addOnOnlineListener(...)这一回调函数)需要发送请求与新连接上的Turms服务端做数据同步,因此消息与状态也不会因turms-gateway的宕机或网络断开而丢失(该重连后的数据同步流程,可参考本节列表之后的简化示意)。

                          一些IM项目之所以强行进行解耦,引入消息队列,甚至是在同时在线用户数只有或不足数十万的时候就引入消息队列,只是因为部分人员为了给自己的简历润色、提升自己的不可替代性,而徒增项目所需技术栈,对项目进行过度设计。

                          一般只有基于Serverless架构,给中小型IM场景的云架构设计的时候,消息队列才能最大地发挥作用。依旧以上述场景与AWS为例,用户可以将通知发送给AWS SQS,保证消息服务的高可用,再基于Lambda函数做消息推送,保证通知不丢。在这样的架构设计中,用户没有自研的服务。

                          另外,之所以说Serverless架构在IM场景中最多只适用于中小型IM场景是因为:

                          • Lambda服务是有很多额度限制的,参考Lambda quotas

                          • 相比基于Serverless架构做开发,设计与实现自研的IM服务会简单与可控地非常多。盲目追求更“时尚”的Serverless架构,不一定进步,也可能是退步。

• 在部分IM项目的架构设计中,它们会把会话管理再拆成网络连接管理与会话逻辑管理两个服务,来实现停机更新会话逻辑管理服务时,客户端不需要断开与网络连接管理服务的连接。但考虑到turms-gateway几乎没什么会话业务逻辑,既有的业务逻辑也很固定,主要的业务逻辑都是在turms-service里实现的,因此turms-gateway很少有停机更新业务逻辑的需要。综上,将网络连接与会话逻辑拆分成两个独立服务对Turms而言还为时过早,既增加了故障点,性能折损也大,又没什么收益,故Turms架构暂不对会话管理再进行拆分。
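针对上文“在高可用方面”提到的重连后由上层应用自行同步数据的流程,下面给出一个简化示意(其中syncFromServer与lastSyncTime均为假设的自定义逻辑,具体要同步哪些数据由上层业务决定):

javascript
// 仅为示意:每次会话重新建立(上线)时,由上层应用主动与服务端同步数据
let lastSyncTime = new Date(0);

turmsClient.userService.addOnOnlineListener(async () => {
    // syncFromServer为假设的自定义函数,
    // 通常会按lastSyncTime拉取离线期间产生的消息、通知与状态
    await syncFromServer(lastSyncTime);
    lastSyncTime = new Date();
});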

                        额外补充:

                        • Turms服务端的代码实现也非常“抠门”的原因:开发基本规约
                        • 其实Turms在早期设计中,考虑过连Redis这样的分布式内存服务也不用,而是采用另一种也很常见的分布式内存实现方案,即:采用类似于Hazelcast的分布式MapIgnite的分布式Cache的设计实现,让Turms服务端之间自行通过分布式Map做分布式数据同步,从而减少对外部服务的依赖。但考虑到集群的高可用设计、Turms服务端自身的发布流程设计等等,因此最终还是引入了Redis服务来实现分布式内存。

                        Turms架构与云架构的关系

                        由于目前(2022年)AWS仍是全球云计算市场份额排名第一的云厂商,因此下文主要基于AWS云进行讲解。

                        • Turms的架构设计既要保证其技术方案不强制依赖任何云服务,以保持技术中立,避免技术栈与任何厂商绑定,让不上云的用户也可以轻松部署一整套Turms服务端(如基于Kubernetes),又要保证Turms所使用的技术方案都必须要有云厂商的支持,以此保证上云的用户可以通过各个厂商的云服务轻松部署一整套高可用的Turms服务端。

                          对于Turms服务端的核心IM功能,该需求并不怎么影响Turms发布核心特性,因为上不上云这些功能都是一样的实现方式。

但对于一些IM拓展功能,如文件存储与数据分析等功能,它们的实现就比较麻烦了,因为我们得把各种方案都考虑、设计与实现一遍。以业务数据分析为例,如果Turms绑定AWS厂商做架构设计,那业务数据分析功能实现起来就非常简单了,大体来说就是基于Turms服务端提供的业务日志,提供一套CloudFormation配置,其中根据不同用户的需求与配置,发布(最省事,但不省钱)CloudWatch Logs Insights、(基于S3省钱,但不实时)CloudWatch Logs => S3 => Athena/QuickSight、(基于S3省钱,且引入Kinesis Firehose保证数据实时推送)CloudWatch Logs => Kinesis Firehose => S3 => Athena/QuickSight,或其他数据分析方案实现。但Turms又得满足不上云或者不想用其他第三方服务的用户需求,所以后期还得自研一套数据分析方案。因此工作量就会大得多,拓展功能的发布速度也就会慢得多。

                          但是如上所述,如果用户有条件使用第三方服务对Turms提供的业务日志做专业地数据分析,也就不必等待Turms提供解决方案了。

                        • Turms的云架构设计很简单。

                          • Turms的云架构只是云架构的子集。相比中大型混合云的企业云架构设计(企业云架构设计不仅包括各个项目的部署架构设计,也包括组织架构设计、混合云网络架构设计等等),虽然Turms在开源界可以算是中大型项目了,但给这样体量的项目做云架构设计还是相当简单的,对云服务有基本了解的用户都应该能理解Turms的云架构设计。

• Turms的云架构很常规。如果用户部署过其他常规Web服务的云架构,部署Turms起来也差不多,尤其是Turms提供了多种部署方案,甚至还有基于Terraform的方案,来帮助用户自动购买与配置云服务。

                            Turms的云架构中相对麻烦的一点是:部分云厂商不直接支持MongoDB服务。比如AWS就不直接支持高版本的MongoDB服务,尽管AWS有提供兼容低版本MongoDB的DocumentDB服务,但由于MongoDB公司与AWS厂商的竞争关系,AWS目前也只能将DocumentDB兼容的最新MongoDB版本锁死在4.0版本号,且维护力度也比较低。总体而言,DocumentDB服务有些鸡肋且发展前景不好,更推荐直接用MongoDB Atlas服务。

                            但因为MongoDB是AWS的合作伙伴,所以用户还是可以通过VPC Peering的方式轻松地将MongoDB Atlas企业级服务集成进AWS当中,部署起来。

                        客户端访问服务端的一般流程

                        该流程为客户端访问服务端的一般流程,也是Turms架构实现水平扩展的过程,您可以根据实际情况进行调整。

                        • 当客户端需要与turms-gateway服务端建立TCP连接时,客户端可以通过DNS服务来查询接入层服务端域名对应的IP地址,而该IP地址指向SLB/ELB服务(通常基于LVS与Nginx)、全球加速服务、或turms-gateway,具体如何搭配要根据您实际应用的需求与规模而定。该DNS服务端可以配置一个或多个公网IP地址(在生产环境中,切勿配置服务端自身的公网IP地址,以缓解DDoS攻击),并通过轮询或其他策略返回给客户端一个IP地址。补充:

                          • 无论Turms客户端使用的是纯TCP连接,还是上层的WebSocket连接,turms-gateway的上游服务(DNS/SLB等)都应该根据客户端IP地址进行TCP连接的负载均衡。

                          • 强烈建议您开启SLB服务的Sticky Session功能,让会话始终与一个turms-gateway服务端进行连接。这么做的好处是能缓解很大一部分DDoS攻击。因为turms-gateway提供客户端自动封禁机制,能够迅速在本地检测并封禁有异常行为的IP或用户,但turms-gateway服务端之间同步封禁客户端数据默认时间间隔约10~15秒,因此如果关闭了Sticky Session功能,黑客就能利用封禁数据同步间隔这段时间,切换与turms-gateway的TCP连接,进行DDoS攻击。

                          • 通常情况下,您应该将SSL证书放在turms-gateway的上游服务端,即上游的SLB服务或Nginx服务端等。

                          • 由于turms-gateway采用了无状态的架构设计,因此任意客户端可以连接到任意一个turms-gateway服务端上,您也可以弹性增删turms-gateway节点,以实现弹性水平拓展;状态(即用户会话信息)被转移到了分布式内存Redis服务端当中。

                        • 客户端拿到IP地址,并与turms-gateway成功建立TCP连接之后,turms-gateway会检测该IP是否已被封禁,或者turms-gateway自身负载是否过大,如果是,则主动断开TCP连接。否则,放行TCP连接。

                        • 如果turms-gateway放行TCP连接,

                          • 对于使用纯TCP连接的Turms客户端,客户端可以开始发起TurmsRequest的Protobuf数据流。该数据流由ZigZag编码的正文长度头,与Protobuf编码的正文,这两部分组成。
                          • 对于使用WebSocket连接的Turms客户端,客户端会在TCP连接建立成功后,向turms-gateway发起HTTP Upgrade请求,请求将HTTP Upgrade成WebSocket协议。如果升级成功,客户端就可以把Protobuf编码的TurmsRequest数据放在WebSocket Binary Frame的正文中,并发送给turms-gateway。

                          注意:这时Turms客户端只是与turms-gateway建立的网络层连接,但用户尚未登陆,也并没有建立会话信息

                        • 该数据流经过负载均衡服务端(可选)的转发后,会先到达turms-gateway。turms-gateway会先对该数据流进行简单的Protobuf格式校验(不校验具体业务请求的合法性,是为了与turms-service服务端进行业务逻辑解耦,以实现turms-service服务端对业务请求格式进行更新后,turms-gateway不需要停机),如果是非法数据流,则直接断开TCP连接。

                          否则,若为合法请求,则会对其进行部分解析,以确认turms-gateway能否自行处理这个请求。举例来说,对于登陆登出这两个请求,turms-gateway就能自行处理。

                        • 如果turms-gateway能够自行处理,则在处理后返回响应。如果无法处理,则再检测用户是否已在本服务端登陆,如果没有登陆,则拒绝执行请求,并发回响应。如果已登陆,则先根据负载均衡策略从可用的turms-service服务端列表中选出一个turms-service服务端,再通过自研的RPC框架将请求转发给该turms-service服务端,让其进行处理。

                          • 如果turms-gateway检测到该客户端请求是登陆请求,则turms-gateway会根据用户ID与登陆请求中指定的设备类型构成一个会话ID,并根据Redis或本地缓存中的用户会话信息,判断该会话ID是否与已登陆会话冲突。如果发生冲突,则拒绝其进行上线操作,并发回响应,告知客户端被拒绝登陆的原因。否则,将当前用户会话信息注册到Redis,并发回登陆成功响应。此时,用户进入了在线状态。

                            注意:

                            • 一个会话ID(用户ID+设备)在同一时刻只会与一个turms-gateway服务端构成用户会话,与一个turms-gateway服务端构成TCP连接。用户后续的所有业务请求都是在这一个会话与TCP连接中完成的,直到会话关闭、用户下线。

                            • 一个用户ID下的不同设备可以在同一时刻与不同的turms-gateway服务端构成用户会话,无论这些设备是否来自不同的IP。

                              但推荐让一个用户ID下的所有设备始终与一个turms-gateway连接,因为:

                              1. 如果登陆到同个turms-gateway,服务端在转发消息或通知给一个用户时,只需把其字节流发送给一个turms-gateway服务端,而不是多个,以减低系统资源开销、增加吞吐量;
                              2. 在同个turms-gateway的同一用户的所有设备会共享会话的心跳时钟,因此可以减少turms-gateway发送给Redis的TTL心跳刷新的请求数;
                              3. 如果服务端开启了用户状态缓存,在转发消息或通知时可能使用的是尚未更新的用户状态,因此新消息可能不会马上发送给新登陆的设备。
                          • 如果turms-gateway无法处理该客户端请求,则通过RPC服务将客户端请求下发给turms-service。turms-service服务端在收到客户端请求后,会对请求进行校验与处理,并触发ClientRequestHandler插件以协助开发者实现自定义逻辑(如敏感词过滤),另外在处理过程中通常也会向mongos发送对应的CRUD请求。等客户端请求处理完毕后,turms-service会将产生的响应,发回给turms-gateway。对于处理过程中产生的通知,turms-service会先根据被通知用户的ID,向Redis或本地缓存查询该批用户所连接的turms-gateway的节点ID,并通过RPC服务将通知发送给这批turms-gateway,让其进行通知下推操作。

                            补充:Turms采用MongoDB的分片副本架构。mongos收到CRUD请求后,会根据配置进行CRUD请求路由。

                          • 无论turms-gateway接收到的是响应还是通知,turms-gateway都不会对其进行合法性校验,而是直接透传给用户。在通知下推过程中,turms-gateway会触发NotificationHandler插件方法以协助开发者实现自定义逻辑(如离线用户的消息推送)。

                          (值得一提的是,Turms的所有网络IO操作都是基于Netty实现的,即以上所有RPC、数据库调用均是异步非阻塞的)

diff --git a/docs/zh-CN/design/schema.html b/docs/zh-CN/design/schema.html
index a0a2e5dc..7b97ae92 100644

                        集合结构设计

                        需求分析与集合结构设计

                        在做架构设计的时候常说“关键需求决定架构设计,次要需求验证架构”(这里指的“需求”包括功能需求、质量属性需求与约束性需求)。但由于Turms作为一款通用即时通讯项目,其需求并不像具体的即时通讯项目那样明确与清晰。因此,面对无穷无尽的业务需求与各种可能的约束性条件,Turms不可能也不应该针对每种场景都做设计。因此,在做Turms设计时,我们尽可能得“以关键的普适即时通讯需求为主要需求”为准则来设计Turms的功能。

                        而将各种纷繁复杂的需求抽象为实际的业务模型时,就需要搞清楚需求间的主次关系,并最终以集合结构的形式作为这些需求关系在技术架构落地时最为重要的体现。因此务必根据您产品自身需求对Turms默认提供的集合结构进行审阅与必要调整。

                        默认的集合索引设计

要点(如果您的团队需要基于Turms做开发,请牢记以下几点)

                        • 集合索引主要是针对分布式数据分片的特点与约束条件,并根据多查少写、以关键的普适即时通讯需求为主要需求而设计的
                        • 集合索引不针对数据分析做设计(具体请查阅 Turms数据分析
                        • 集合索引不针对管理员接口做设计(避免不必要的索引开销,代价就是管理员接口的灵活性相对差)
                        • Turms不采用辅助索引集合来满足拓展的业务功能(因此如果您的项目有拓展业务功能,您就需要基于Turms进行二次开发。当然,实现起来也很简单,合格的中高级工程师都应该有这样的能力)

                        这里特别要强调的就是“以关键的普适即时通讯需求为主”,因为它提醒了集合的设计不仅需要开发人员注意,甚至还需要产品经理与甲方的注意。对于涉及到分布式数据分片的场景,一些看似“实现简单”的功能在实际落地时会带来大量的资源消耗并提高开发与运维的难度,因此针对这样“吃力不讨好”的功能,请务必多方确认该需求是否合理,是否必要,是否能承担相应的风险与成本。在确认是否需要实现、能否经过多次迭代后再实现等现实因素后,再考虑是否需要对集合做弹性设计,以方便后期更新,降低推翻重构的风险。

这里以“查询某用户已加入的群组”功能为例。Turms中的GroupMember集合用于管理群组与用户的关系,该集合在设计上默认是对群组ID进行数据分片,因此若需要在分布式数据库服务端中根据群组ID查找群组相关信息,这对数据库而言是很轻松的事(targeted queries)。但反过来,在不创建一个新的辅助集合的前提下,根据某用户ID查找该用户已加入的群组就是非常吃力的事情(scatter gather queries)。因为数据库无法根据用户ID定位相关群组的数据,因此会将该查询请求发送给所有数据库服务端,造成大量无效且冗余的请求,有效请求仅占很小的比例,最终导致分布式数据库架构的有效吞吐量甚至不如单机。

                        并且随着用户规模的增加,最终要么因为错误判断主次需求而导致架构需要推翻重做,要么在现有基础上进行自定义拓展(如像ShardingSphere那样,自行实现一个辅助表来帮助做数据定位,但这样的实现很可能又会导致大量的冗余数据与事务操作)。因此务必深入理解Turms默认的集合索引设计,并牢记“默认索引设计主要是针对分布式数据分片的特点与约束条件,并根据多查少写、以关键的普适即时通讯需求为主要需求而设计的”。

                        功能丰富的致命代价

                        在您深刻理解了Turms默认的集合索引设计后,您会发现为什么那么多的大中型即时通讯应用不提供、也不应该提供一些看似“实现简单”的功能,也会更加理解即时通讯在实际落地时需要注意到的点。另一方面,您也应该警惕那些以提供“业务功能丰富”口号的即时通讯技术方案,因为它们很可能只是适用于上百人或上千人的用户规模。若后期您的产品需要扩容,您会发现一些已有的表设计与数据分片设计背道而驰,很可能需要从schema设计层面开始重构,进而导致整个技术实现上的重构,到头来只能另起炉灶踏上自研之路,悔不当初。

                        这里以“为了限制每位用户可创建的群组数量,需要服务端具有快速查找该用户所拥有的群组数量的能力”这个功能为例子进行讲解。这看似是一个很“简单”实现的功能。但由于上述所说的Turms默认索引设计原则,Turms默认只给Group群组的ID做数据分片,以实现群组成员快速查找群组信息。

                        因此我们无法根据群组拥有者ID通过targeted query来快速查询其所拥有的群组数量。要想实现相对可行的方案大致只有以下三种方案(特别注意,以下三种方案您可以通过举一反三运用到其他拓展功能设计上):

                        • 为群组拥有者ID专门创建一个单列索引。虽然无法实现targeted query,但仍可在scatter gather query后通过索引相对快速查询。(注意:这类实现方案是Turms为拓展功能提供的默认实现,但这些实现在默认配置中均关闭)

                        • 维度建模,创建辅助索引集合,用于专门记录群组拥有者ID与对应群组的ID。可以实现targeted query,但一些关键操作为保证数据的一致性需要使用分布式事务,并且仍有数据冗余。

                        • 使用静态统计表专门记录每位用户已拥有的群组数量。该方案效率最高、冗余最小。但仍需要分布式事务,并且可拓展性最差。

                        很明显,为了实现一个很“简单”的功能,我们的三个实现方案不仅对系统资源有着截然不同的要求,甚至连查询的时间复杂度也并非在一个级别上。

                        因此要时刻警惕打着“业务功能丰富”口号的即时通讯解决方案。
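以上文的方案一为例,其对应的MongoDB操作大致如下(仅为示意:集合名group与字段名ownerId均为假设,实际名称以Turms的集合定义为准):

javascript
// 仅为示意(mongosh):为群组拥有者ID建立单列索引
db.group.createIndex({ ownerId: 1 });

// 统计某用户拥有的群组数量:
// 由于分片键是群组ID而非群组拥有者ID,该查询仍是scatter-gather查询,
// 但每个分片内部可以利用上面的单列索引加速
db.group.countDocuments({ ownerId: 1 });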

                        集合结构

                        Turms的集合结构中可能有您产品压根用不上的字段,但这些不被使用的字段并不会存储在数据库中,您无需担心它们会增加数据库开销。

                        Turms的集合结构是如何设计出来的

                        Turms的集合结构并不是某一个commit或某几天就设计完成的,而是经过长时间的迭代分析与实践,最终整理出来的。步骤大体如下:

1. 分析业务需求,把握业务之间错综复杂的逻辑并分清需求的主次关系,并且不仅要把握现有的所有需求,也要尽可能预测未来需要的业务需求,以及比较确定不需要的业务需求
                        2. 分析业务实现的具体代码逻辑,确定需要的字段
                        3. 确定字段ID。特别一提的是,复合ID内部又可以独立建立索引。如GroupMember集合的复合ID是group ID + user ID,这两个字段自己又有独立的索引用来实现其他业务功能
                        4. 建立索引。首先分别考虑各个字段是否确实需要索引、是否可以做成可选索引,然后再考虑某几个字段可不可以合并成复合索引(包括分析:记录的基数、复合索引的使用频率、查询条件是否能够始终遵循最左匹配原则、是否能够顺便避免回表查询)
                        5. 判断集合是否需要做分片设计(Sharding),包括分析集合是否需要做数据冷热分离。如果需要做分片设计,那是否能够基于上述的索引“顺便”对数据进行分片

                        集合详解

                        概要

                        下述内容只是基本的理论,如同我们在Turms的集合结构是如何设计出来的提到的,实际业务更为复杂多变,因此面对具体的集合索引设计,还需要结合其实际应用场景做分析与设计。

                        数据分片

                        除了诸如管理员(Admin)、群组类型(GroupType)等小集合不需要做数据分片外,其他大部分集合都做了数据分片的支持,比如用户(User)、群组(Group)与消息(Message)等集合,以实现给mongos发送CRUD请求时,mongos能自行做负载均衡、平衡数据负载,同时也是为了支持冷热数据分离。

                        记录创建时间索引

                        不少集合的复合索引都带上了记录创建时间字段,这是为了配合Turms的拉模式,以支持快速查询某时间区间的记录,并避免客户端重复查询。这也是为什么Turms客户端大部分查询语句都可以带上一个查询时间区间的参数,而如果客户端请求没带上这参数,那么Turms服务端就会默认赋予一个查询时间区间,以保证查询性能。

                        ID只使用B-tree索引

                        我们禁止给记录的ID用Hashed索引,这是因为MongoDB不支持通过Hashed索引保证唯一性约束,只能通过B-tree索引保证记录的唯一性,因此就算我们给记录的ID加上了一个Hashed索引,MongoDB也会自动再额外创建一个B-tree索引,得不偿失。

                        可选字段与索引

                        Turms集合中有几十个可选但默认不开启的索引,这是因为:

• 虽然很多IM业务需求都很典型,但却是彼此冲突的,比如“消息或请求的发送人能查询他自己发送的消息或请求”与“消息或请求的发送人不能查询他自己发送的消息或请求(默认实现)”这两种需求就彼此冲突。
                        • 又或者一些IM业务需求虽然典型,但并不是那么常见,比如入群请求的处理者是否能查询他处理过的请求。用于支持该类拓展IM功能的可选索引占大头。
                        • 如果默认开启这些可选索引,那就是往小型IM应用做设计了,对于大点的IM应用而言,那就是犯了我们上面说的“功能丰富的致命代价”的错误。

                        而我们选择默认实现方案的原则是:选择不需要额外加字段或索引、存储成本最低且能跟其他IM业务需求保持逻辑一致的方案。而如果您的应用确实需要支持另一个方案,我们一般也提供多套备选方案,需要用户自行配置以替换默认实现。

                        您只要把握住这个基本原则,就能反推Turms集合各索引为什么那么设计了。另外,在代码中各模型、各字段其实也都有索引相关的注释,用来指导用户:什么字段,在什么场景下适合有索引,为什么一些字段不使用索引。用户可以参考该注释做设计。

                        注意:极个别可选索引是默认开启的,因为这些索引对应的场景非常通用,只有极少应用不需要使用这些场景。另外,Turms目前尚未对未开启这些可选索引的场景做优化,因此目前建议您不要手动关闭它们。

                        补充:

                        • 这些可选索引可以通过配置turms.service.mongo.[服务名].optional-index.[集合名].[字段名]=true开启,如turms.service.mongo.message.optional-index.message.sender-id=true

                          提醒:IntelliJ IDEA支持配置自动补全

• 用户也能自行直接向MongoDB服务建立自己想用的索引(见下方示意),并且MongoDB增删索引或字段非常简单,因此就算用户配漏了,或者前期需求不清晰,后期有新需求来了,也无需担心没法加新索引或字段。

                          额外补充:MongoDB每个版本都会发布一些非常实用的新特性,可能早期一些我们需要完全自研的复杂功能,但在MongoDB的新版本中只需要执行一条命令就能实现了,极大地降低开发与运维难度,并提升功能的可靠性,因此非常推荐您尽可能部署新版本MongoDB。
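承接上一点,如果想绕过配置、直接在MongoDB服务端建立索引,大致操作如下(集合名与字段名仅为示意,请以实际的Turms集合结构为准):

js
// 为消息集合的发送者字段建立单列索引(对应上文sender-id可选索引的场景)
db.message.createIndex({ senderId: 1 });

// 确认索引已建立
db.message.getIndexes();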

                        默认不给请求模型的请求发送者字段加索引

诸如好友请求与入群请求这两个集合,默认是不给请求发送者加索引的。换言之,一旦用户发送完请求,他就无法再查询他已经发送过的请求了,需要客户端本地自行记录。如果您的产品确实需要服务端记录并查询用户发送过的请求,则需要自行配置上述的可选索引,让turms-service在初次建表时添加该索引,或者您也能自行直接在MongoDB服务端中向集合建立索引。

                        消息(Message)

                        消息是目前唯一支持冷热数据分离存储的模型。而冷热数据分离能极大地节省数据库服务器成本,比如将热数据放到16核128G服务器中,把冷数据放到4核8G服务器中。另外,其他模型目前均没有冷热数据分离存储的意义,因此其他模型不支持。

                        索引
                        • 业务场景:是否需要支持消息发送人能够查询他自己发送的消息

                          • 方案一(默认方案):不支持该特性,使用消息发送时间 + 收信人ID复合索引

                            由于消息需要支持冷热数据分离,因此消息的复合索引是:消息发送时间 + 收信人ID,并且分片键是消息发送时间,以保证之后我们能把不同时间区间的Zones分配给不同的Shards,并实现消息的冷热分离存储。

                            (如果消息不需要支持冷热数据分离,那Turms的消息模型的复合索引应该是:收信人ID + 消息发送时间,并且分片键是收信人ID,以保证MongoDB既能对读写请求都做负载均衡,又能保证发给同一个收信人的消息都尽量分在相同的Chunks中,以提升查询速度)

补充:至于为什么没给“添加好友请求”、“群组邀请”等请求类集合做冷热数据分离,这是因为虽然这些请求在业务表现上确实与创建时间紧密相关,比如添加好友请求过了一段时间后,在业务上看就是请求已过期、处于不可处理的状态了。但是,对于请求的接收人而言,就算是过期的请求,用户也经常需要通过查询语句快速查询其接收过的所有请求,其访问次数并不会随着时间而递减。举例来说,比如一个用户今年接收到了20个好友请求,去年接收到20个好友请求,客户端每次查询至多50个请求,那数据库就更应该以收信人ID为维度,把相同请求接收者的数据都分在一个Chunk里,而不是根据请求创建时间,把相同请求接收者的数据分到不同的Chunks,并负载到不同的数据库中。因此,我们不对这些集合做冷热数据分离支持。而对于这类集合,我们一般采用请求接收者ID + 请求创建时间这样的复合索引,并以请求接收者ID为分片键,尽可能将一个请求接收者收到的所有请求都放在相同的Chunk中。

                          • 方案二:支持该特性,使用消息发送时间 + 会话ID复合索引

                            如果您的产品需要这套方案,那您只需在turms-service服务端初次启动时,配置turms.service.message.use-conversation-id=true。只是特别注意:如果您已经采用了方案一的方式在数据库中建好了表并创建了消息记录,则Turms服务端目前并不会创建消息发送时间 + 会话ID复合索引,也不会刷一遍消息数据,给消息填充会话ID。

补充知识:私聊会话ID是16字节长的字节数组,其值由消息发送者ID与消息接收者ID组成。群聊会话ID是一个8字节长的字节数组,其值由群ID组成(下文附有一个构造会话ID的示意)。

                          • 方案三:支持该特性,但通常不推荐,Turms也不提供支持。该方案是:在消息发送时间 + 收信者ID复合索引的方案下,给发送者ID开启可选索引。

                            之所以不推荐这方案是因为:用户查询一个会话内的消息是极为常见的场景,而这个方案在查询一个会话的消息时,需要查询两次:一次是查询对方发送的消息,一次是查询自己发送的消息,如此低效,因此Turms不提供支持。

                        • 消息删除时间B-tree索引。如果您的产品需要支持逻辑删除,则在“删除”消息时,turms-service会填充该字段的值,否则该字段是不会被用到的。
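针对上文“补充知识”中会话ID的构成,下面给出一个构造会话ID字节数组的示意(假设两个用户ID按从小到大的顺序、以大端序拼接;实际的拼接顺序与字节序请以Turms服务端实现为准):

js
// 仅为示意:构造私聊会话ID(16字节)与群聊会话ID(8字节)
function buildPrivateConversationId(userId1, userId2) {
    // 假设较小的用户ID在前,以保证同一对用户总能得到相同的会话ID
    const [first, second] = userId1 < userId2 ? [userId1, userId2] : [userId2, userId1];
    const buffer = new ArrayBuffer(16);
    const view = new DataView(buffer);
    view.setBigInt64(0, BigInt(first));
    view.setBigInt64(8, BigInt(second));
    return new Uint8Array(buffer);
}

function buildGroupConversationId(groupId) {
    const buffer = new ArrayBuffer(8);
    new DataView(buffer).setBigInt64(0, BigInt(groupId));
    return new Uint8Array(buffer);
}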

                        TODO

diff --git a/docs/zh-CN/design/status-aware.html b/docs/zh-CN/design/status-aware.html

                        状态感知

状态感知分为两大类,一类是用户在线状态感知,另一类是业务数据变化感知(如收到新消息、群成员发生变化)。

                        由于状态感知的具体实现与具体的产品需求有着密切关系,因此需要您能够把握住以下两点:

                        1. 判断产品需求是否合理。通常不合理的需求,诸如:一个群内可以有10000名用户,当一名用户发送消息时,要保证这条消息能100%地传送给其他9999名用户,并且用户能够拉取几年前的聊天信息。
                        2. 分清主次需求,尽可能在质量属性之间取得平衡。IM服务的实现细节繁多,是否真的有必要为了兼容极端情况,而设计大量的兜底策略(如消息会话级自增ID),既大幅度地增加了开发成本与故障点,也让服务端总体的吞吐量下降。

                        用户在线状态感知

                        简而言之,Turms通过心跳包来检测用户TCP连接的健康状态并以此判断用户是否“在线”。另外,如果您不关心底层实现,您仅需阅读:客户端API——会话的生命周期

                        具体原理(拓展知识)

                        背景

                        从网络传输层来看,TCP只是一个虚拟的连接,需要通过双向的消息传递与消息确认来模拟物理连接,因此如果客户端与服务端之间的连接实际上断开了,但在没有完成四次挥手(即没有完成指定的消息传递与确认)的情况下,TCP仍然判定该连接属于保持状态(如果此时试图从该TCP连接中读取数据,则会抛出带有类似于“An existing connection was forcibly closed by the remote host”消息的异常)。因此对于基于TCP协议开发的上层即时通讯应用而言,如果我们不做额外的工作,服务端就只能错误认为“该用户处于在线状态”。

                        TCP没完成四次挥手的常见原因

                        • 客户端:客户端应用被强制关闭
                        • 服务端:负载持续过高无法响应;服务器直接宕机,导致服务端应用被强制关闭
                        • 链路中间路由:意外中断(如:移动接入网NAT超时)

                        应对异常断开连接的方案

                        为了保证服务端能感知到“用户下线”了的状态,Turms客户端会在上一个任意类型请求(如发送消息请求)的定长时间间隔后(暂不支持根据网络状况配置智能心跳),向服务端发送心跳包来维护其“在线状态”。服务端在收到客户端发来的心跳包或者其他业务请求后,都会在Redis服务端处刷新客户端的在线状态,以此来保活。

                        业务数据变化感知

                        为了让用户能感知业务数据的变化(增删改),Turms支持推模式(服务端主动通知)、拉模式(客户端主动拉取机制。支持按Timeline拉取)以及推拉结合模式,以在实时性与资源消耗之间取得平衡,并让开发者能够自行调整实时性与资源消耗之间的权重。

                        感知方式

                        方式一:推模式(服务端主动通知)

                        推模式指的是:当某个业务模型发生变化时(由于增删改操作),服务端将主动通知相关在线用户该事件的发生。而当客户端收到通知时,Turms客户端会触发NotificationService中的onNotification回调函数。该函数的参数为TurmsRequest对象,表明触发该事件的请求。

                        通知相关行为可以根据:im.turms.server.common.infra.property.env.service.business.NotificationProperties类进行配置。每一种通知类型都可以单独配置,并且所有通知相关配置均可在集群运行时进行动态更新。
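以turms-client-js为例,注册通知监听的写法大致如下(监听接口的具体名称与回调参数请以客户端API文档为准,此处的addNotificationListener只是参照本文消息监听示例风格的假设写法):

js
turmsClient.notificationService.addNotificationListener((notification) => {
    // notification对应触发该事件的TurmsRequest(如更新群组信息的请求),
    // 开发者可据此更新本地缓存或UI
    console.log(notification);
});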

                        示例

以im.turms.server.common.infra.property.env.service.business.NotificationProperties#notifyMembersAfterGroupUpdated这个属性为例。该属性用于控制“当群组信息发生变化时,是否通知群组成员”。这里的群组信息指的是:群组名称、群组类型、群组禁言时间等这样具有全局性的群组信息。

                        如果您将该属性值设置为true,则当群组信息发生变化时,群组成员的客户端都将收到触发该变化的通知。否则,群组成员客户端不会收到任何通知。

                        评价

                        通知机制可以保证通知能实时地传递给相关用户,但其缺点就在于它很容易导致无意义的资源消耗(以具体业务场景为准)。比如用户A已经加入了100个群组,但该用户平时只查看其中3个群组的信息。这种场景下,如果100个群组的所有状态变化都开启了通知机制,则不管是服务端还是客户端都需要浪费大量资源去处理这些无意义的通知(因为该用户从来不看这些通知)。

                        为了解决该类问题,以及满足其他常见需求(如:要求当时离线的用户在上线时也能检测到业务模型是否发生变化;要求在线用户在通知被关闭的情况下也能感知业务模型的变化),Turms还提供了拉模式(客户端主动拉取)让用户来感知业务模型的变化。

                        方式二:拉模式(客户端主动拉取。支持按Timeline拉取)

                        为了弥补上述提到的推模式的不足,Turms还提供了拉模式。

                        大概实现

Turms的每个业务模型都带有一个版本信息,这个版本信息记录了该业务模型最后一次更新的时间。当客户端向服务端请求资源时,可以携带客户端最后一次更新该业务模型的时间(也可以不带),Turms服务端会对这个版本信息与当前业务模型的版本信息进行比对,如果客户端发来的版本信息早于当前业务模型的版本信息,则Turms服务端会返回最新的业务模型数据,否则返回状态码NO_CONTENT,在客户端处则会收到空数据。
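一个客户端侧的拉取示意如下(以查询已加入的群组信息为例;接口名queryJoinedGroupInfos与返回结构、以及loadLastUpdatedDate等本地辅助函数均为假设,实际请以客户端API文档为准):

js
const lastUpdatedDate = loadLastUpdatedDate(); // 假设的本地函数:读取上次更新时间,首次可为null
turmsClient.groupService.queryJoinedGroupInfos(lastUpdatedDate)
    .then((groups) => {
        if (groups && groups.length) {
            saveGroups(groups);              // 假设的本地函数:服务端数据更新,覆盖本地数据
            saveLastUpdatedDate(new Date()); // 假设的本地函数:记录本次更新时间
        }
        // 若服务端返回NO_CONTENT,说明本地数据已是最新,此处收到的即为空数据
    });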

                        常见拉取时机(同步时机)
                        • 当您的应用被切换到前台时
                        • 会话重新连接上时
                        • 根据具体业务而定(看下文示例)
                        示例

                        继续以上述的案例为例。假设我们希望群组成员之间能够实时感知其他群组成员资料信息的变化。那如果我们采用通知机制,假设每个群除了用户A还有其他100名在线用户,则用户A的资料信息变化,需要向其他10000(100群*100人/群)名群组成员发送通知,这在实际运用中是绝对不可取的。

                        在实际运用中,通常会在特定时机(比如在用户打开某名用户的个人信息UI界面时,或者打开和某人的聊天窗口时),才让客户端主动请求服务端该用户的信息。同时,通过版本对比,减少无意义的资源浪费。

                        这种时刻注意实时性与资源消耗的设计要牢记在心中,以免设计出不切实际的应用场景。

                        客户端对用户行为感知的实时性与服务端延迟

以拉黑用户的相关实现为例,Turms默认对用户关系进行1分钟的缓存,以避免频繁查询数据库,这是合理的行为。如果此时用户A“拉黑”了用户B,那么可能会出现:虽然用户A拉黑了用户B,但在有缓存的这段时间里,用户B仍然有可能可以给用户A发送消息(因为Turms服务端是分布式集群,关系缓存与接收拉黑请求操作的服务端不一定是同一个服务端)。这种行为对Turms服务端而言是可以接受的,而不是Bug。

其合理且理想的参考解决方案是:在客户端的业务层面上(业务逻辑由您控制,而非由Turms客户端控制),就算Turms服务端将消息下发给了Turms客户端,您的客户端也应该根据您产品自身的业务逻辑,再做一次“是否已拉黑该用户”的判断,如果是,则隐藏不显示。

                        消息感知

                        读扩散与写扩散

Turms的架构是基于读扩散消息模型而设计的。下面对读扩散与写扩散各自的优劣势进行比较,供读者参考:

• 含义
  • 读扩散:1. 每名用户跟与其聊天的其他用户或群都有一个独立会话(也叫做信箱或Timeline);2. 当用户发送消息时,无论私聊还是群聊,数据库都只需存储一份消息记录;3. 当用户查询消息时,客户端需要向服务端发送一个请求来拉取指定会话ID列表的消息;或者(重点)先通过一个请求不指定会话ID列表,来拉取所有私聊会话的消息,再通过一个请求指定群聊会话ID列表来拉取群聊消息。
  • 写扩散:1. 每名用户有且仅有一个信箱;2. 当用户发送消息时,需要把这条消息写到该会话内的所有成员信箱中,即若群聊中有其他成员100人,则需要把这条消息写100次;3. 当用户查询消息时,客户端无需指定会话ID列表,只需要向服务端发送一个请求读取自己信箱中的消息即可。
• 优势场景
  • 读扩散:用户会话(私聊会话与群聊会话)相对少、群人数多的场景。注意:如果应用只有私聊会话,没有群聊会话,那么在Turms服务端的实现下,读扩散与写扩散的优劣势场景其实并没什么太大区别,因为两个消息模型都只要求用户发消息时,数据库写一次消息;用户读消息时,数据库根据索引查一次表(Turms采用的是消息发送时间 + 收信人ID复合索引,具体见消息集合设计)。
  • 写扩散:为了避免太多的消息拷贝,写扩散相对更适合群聊多、但群成员少的场景。
• 劣势场景
  • 读扩散:因为客户端需要指定群聊会话的ID列表,因此读扩散的劣势场景是:群聊会话数多,且用户频繁读消息。提醒:Turms服务端是通过一条MongoDB客户端请求,并基于索引来完成上述的查询操作的,因此性能其实也很高效,只是相对于写扩散而言,该场景对于读扩散是劣势场景。
  • 写扩散:因为群成员越多,消息被拷贝的次数越多,因此写扩散的劣势场景是:单个群的成员数多,且群成员频繁发消息。
• 技术实现
  • 读扩散:1. 可以通过MongoDB的分片副本架构,对读请求进行负载均衡;2. 所有的读请求都是基于索引实现的,性能高效。
  • 写扩散:1. 写操作难以进行负载均衡;2. 更新消息、撤回消息等IM功能的实现成本巨大,需要考虑分布式一致性问题与消息风暴。
• 消息可靠性
  • 读扩散:如产品对消息可靠性有较高的要求,即保证消息不丢、保证消息内容一致,那么读扩散对应的实现简单得多,因为数据库只需存储一条消息,用户也只需读取这一条消息。
  • 写扩散:因为要保证消息写入到每位群成员的信箱中,因此需要引入弱分布式一致性事务(或强分布式一致性事务),否则消息可能丢失,而分布式一致性事务会导致吞吐量低下。
• 总评
  • 读扩散:1. 读扩散适用的产品极广,对于写扩散实现成本巨大的特性,基于读扩散实现,通常只需要客户端自定义好查询条件,并向Turms服务端发送一条查询语句即可实现(如群组新成员消息分享、多端消息同步),服务端不需要改一行代码,且这些查询任务都是基于索引完成的;2. 读扩散在劣势场景下依然能够依靠索引保证较高的效率。
  • 写扩散:由于写扩散需要写入大量的消息,如有任何更新操作(撤回/更新)还需要使用分布式事务,并且IM功能特性(如群组新成员消息分享、多端同步)的实现很复杂。综上,写扩散的业务拓展性极差,其使用场景基本限定在:应用基本都是私聊、没有群聊,且业务功能简单;但对于只有私聊的应用,如上所述,读扩散或写扩散的性能表现都差不多。如果您团队的产品经理要求添加业务功能,您的开发团队很快就会体会到只支持写扩散对IM系统是多么致命的设计:读扩散可以高效且容易地实现的功能,对于写扩散而言,就成了低效且高难度实现的功能。

                        再次特别强调:除非您非常明确您的产品的用例就如上述简单且局限(私聊会话数多少无所谓,但群聊会话数多且群成员少),且未来业务需求也基本不变,否则用了写扩散消息模型基本就意味着您的产品终有一天需要重构回读扩散模型,或同时支持读写两套模型。当然,写扩散也可以作为“技术负债”长期保留。

                        提醒:

                        • 从写扩散实现改成读扩散实现几乎意味着要把整个项目的设计与实现都从头重现一遍。也因为消息模型对IM架构的影响是如此巨大,我们在谈Turms的架构时,第一句永远是Turms的架构是基于读扩散消息模型而设计的
                        • 在Turms服务端的实现中,“撤回消息”也是一条消息,即一种特殊的系统消息。

                        消息接收、消息更新与消息撤回

                        Turms基于上述的“推模式”与“拉模式”实现客户端的消息接收、更新与撤回。其中:

                        • 结合上述的常见拉取时机与下文的关于消息的可达性、有序性与重复性,Turms是可以实现100%消息必达、消息一致性排序与去重

• 消息更新与撤回的通知本质上也是一条消息,即一条特别的系统消息。Turms服务端在接收到用户发出的消息更新或撤回请求后,会先进行该功能是否开启、用户是否有权限、是否在允许的时间区间内等判断,如果验证通过,则会(下文以撤回消息的流程为例,更新消息同理):

                          • Turms服务端先对存储在数据库的目标原消息记录做修改,给它标记上“消息被撤回”的时间戳。

                          • 然后再生成一条“撤回消息”的系统消息(注意它是message,不是通知notification),并插入到消息集合中。

                          • 最后再将上述的“撤回消息”的系统消息,发送给对应的在线用户,以告知这些客户端:之前某些消息已被撤回了。

                            开发者需要在客户端接收到这系统消息后,自行做对应业务层上的处理(Turms客户端除了解析哪些消息被撤回外,自身不会做其他任何的逻辑处理),比如在本地物理删除该消息,或只是将其隐藏,或是将被撤回的消息替换成类似“该消息已在XX时间被撤回”等等。

                            补充:如上所述,目前Turms服务端处理撤回消息时,会发送给对应的在线客户端一个“撤回消息”的系统消息,以保证在线的客户端能迅速撤回本地已接收到的消息,但之后还会添加配置项,以支持不想让Turms服务端主动发该系统消息的应用。

                          • 如果用户已经下线了,而没有接收到这个“撤回消息”的系统消息,那么在用户下次登陆时,由于它仍需要去主动拉取离线时收到的消息,所以在拉取的过程中也会顺带把上面插入的“撤回消息”的系统消息拉取下来,开发者在检测到这类系统消息时,再做具体的业务层处理即可。

                            提醒:开发者可以通过客户端侧所提供的消息服务中的addMessageListener接口,来判断接收到的消息是否为“撤回消息”的系统消息,以turms-client-js客户端为例:

                            js
                            turmsClient.messageService.addMessageListener((message, addition) => {
                                     if (addition.recalledMessageIds.length) {
                                         // is a system message to recall messages
                                     } else {
                                         // not
                                     }
                                 });

                            另外:

                            • 关于Turms服务端删除消息的流程,Turms服务端目前只是对消息做对应的软删除或硬删除,并不会执行任何“撤回消息”相关的逻辑。我们之后会给Turms添加对应的配置项,以支持希望删除消息时,也执行撤回消息的应用。
                            • 目前Turms服务端对“更新消息”并没有提供如“撤回消息”那样完整的支持,这部分的优化会在近期完成。

                            关于消息的可达性、有序性与重复性

                            架构设计永远是平衡的艺术,盲目承诺消息100%必达只是一种销售的说辞。好比大部分互联网应用在分布式事务的技术实现上,只会采用性能更好的弱分布式事务,而非虽然更可靠但性能低下的强分布式事务。是否需要实现100%的消息必达还是需要根据业务场景而定。如在直播聊天室场景,不仅不要求消息必达,甚至还会要求服务端要能根据负载情况与消息优先级,主动丢弃用户消息,或者只将消息发送给一部分用户。

直播场景也可能不强制要求消息有序性,而是要求“怎样消息吞吐量大,怎样设计。尽量保证消息的有序性,但不提供额外辅助资源进行支持”。一些IM应用在设计上也可以“为了取得高吞吐量与高可达性间的平衡,对免费群采用非消息必达机制,对VIP群采用消息必达机制”。实际应用的需求永远是五花八门的。

                            因此再次强调:做功能设计时,要分清主次需求,尽可能在质量属性之间取得平衡。切忌脱离业务场景,闭门造车。

                            总结

                            由于下文各种消息特性的具体实现对比相对复杂,该总结部分为您快速归纳最终方案。

在大原则上,Turms在设计时遵循“能由客户端自己实现的,Turms服务端就不实现”,以获得最大的吞吐量与更灵活的业务实现。如果特性必须由服务端实现,且对吞吐量影响不大,则默认开启,否则默认关闭。具体而言:

                            • 可达性

                              • 方案一:如果您希望实现几乎100%的消息必达,您可以开启turms.service.message.sequence-id下的use-sequence-id-for-group-conversationuse-sequence-id-for-private-conversation(默认配置下,均关闭),该机制会在每次生成消息记录时,向Redis请求一个会话级别的自增sequence ID,并将这个ID赋给当前消息记录上,客户端可以通过这个ID的自增性与消息发送时间判断消息是否丢失(需要判断消息发送时间是因为:如果Redis宕机,序列号数据丢失,序列ID会从头开始计算,而当客户端检测到序列号变小时,则可以再根据消息发送时间判断哪条是最新的消息)。

                                注意:sequence IDmessage ID没有任何关系。

                              • 方案二(默认实现):如果您不要求消息必须100%必达,则关闭上述配置,从而获得更大的消息推送吞吐量。

                            • 有序性

                              • 顺序最终一致性

                                • 方案一:借助上述提到的自增sequence ID“顺便”实现消息的有序性
                                • 方案二:(默认实现)使用服务端时间保证消息顺序。提醒:不仅仅是消息需要使用系统时间,Turms服务端各个功能模块也重度使用系统时间,如基于Snowflake算法生成的ID、日志的时间戳与基于时间戳的限流防刷机制。
                              • 接收顺序一致性:部分IM系统会通过延迟发送消息或客户端延迟展示消息,来尽可能避免“客户端先接收到在后发送的消息、再接收到在前发送的消息”,导致消息UI需要重排。但Turms暂未计划提供相关支持

                              • 因果一致性:客户端发送消息时,可以携带preMessageId字段,用于指示在消息发送客户端UI上显示的上一条消息ID是什么。该记录对Turms自身没任何实际作用,但其他客户端可参考该值做上层的消息UI展示,以实现客户端之间消息逻辑的因果一致。

                                注意:preMessageId跟“消息可达性”的实现没有任何关系,它仅仅用于您产品进行消息UI排序

• 重复性。Turms服务端在这方面只是提供全局ID唯一的消息记录,消息的去重工作需要开发人员自行在客户端实现:如果您的应用需要实现100%的消息去重,则需要考虑落盘存储已接收的消息ID。如果您的应用只需要保证一个应用的生命周期内消息去重,那就只需要在内存中存储已接收的消息ID,每当服务端推送来新消息时,只需判断该ID的消息是否已处理过即可(下文附有一个结合丢失检测与去重的客户端示意)。

                              提醒:通常只需要存储本地最近时间(如最近1天)的消息ID即可,没有必要进行全量存储
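结合上文“可达性”的方案一与“重复性”的说明,下面给出一个客户端侧检测消息丢失与去重的简化示意(纯客户端逻辑;sequenceId字段是否存在取决于上述配置,pullMissingMessages等函数为假设的业务层实现):

js
const lastSequenceIdByConversation = new Map(); // 会话ID -> 已收到的最大sequence ID
const handledMessageIds = new Set();            // 只需保存近期的消息ID,无需全量存储

function onMessageReceived(conversationId, message) {
    // 去重:同一ID的消息只处理一次
    if (handledMessageIds.has(message.id)) {
        return;
    }
    handledMessageIds.add(message.id);

    // 丢失检测:sequence ID出现空洞,说明中间有消息未收到,可向服务端补拉
    const lastId = lastSequenceIdByConversation.get(conversationId);
    if (lastId != null && message.sequenceId > lastId + 1) {
        pullMissingMessages(conversationId, lastId + 1, message.sequenceId - 1); // 假设的补拉函数
    }
    // 若sequenceId反而变小,可能是Redis重置导致序列号从头计算,此时需结合消息发送时间判断新旧
    lastSequenceIdByConversation.set(conversationId, Math.max(lastId ?? 0, message.sequenceId));

    renderMessage(message); // 假设的业务层处理函数
}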

                            另外,下文会把一个业界常见但却通常非常失败的设计方案,即采用需要服务端参与的消息确认机制方案作为反面案例进行讲解。它用最高的成本实现了最差的“可达性”与“重复性”的效果,并且性能与拓展性也都极差。(TODO:尚未更新该部分文档)

                            消息确认机制(Acknowledge)

                            值得注意的是:

                            1. Turms的消息确认机制并不需要Turms服务端的参与
                            2. 消息确认机制与业务层面“消息已读”功能是完全独立的,二者没有关联关系。
• 需要服务端参与的Ack机制
  • 介绍:部分即时通讯架构设计中,会要求客户端在接收到消息后,间隔一定时间(如5秒、10秒等),向服务端发送消息确认请求(而不是一接收到消息就确认。一是为了提高确认处理效率,二是减少因网络延迟问题丢消息的概率)。服务端记录每个会话最新的确认时间,以实现用户在对所有会话进行消息拉取时(如用户上线时),可以通过一个简单的请求去拉取确认时间至今的所有消息。
  • 优点:1. 客户端实现简单,无需在本地存储会话信息。
  • 缺点:1. 服务端需要先查一次所有会话的确认时间,再根据确认时间拉取消息,性能相对差;2. 对于收到的每一条消息,客户端都需要向服务端发送确认请求,然后服务端更新对应的消息状态,性能低下。
• 不需要服务端参与的Ack机制(Turms采用的方案,下文附有一个客户端侧的简单示意)
  • 介绍:客户端本地存储每个会话的最后确认时间,客户端如果想获得任意其所属的会话消息,则向服务端发送对应的会话ID与确认时间,服务端会返回确认时间至今的所有消息。
  • 优点:1. 客户端可以自定义消息拉取范围,业务适用面更广,可以很轻松支持多端消息同步功能;2. 服务端不需要先查一次所有会话的确认时间,再根据Ack时间拉取消息,性能更优;3. 不需要客户端定时发送确认请求给服务端,能够完全省去大量确认操作带来的性能开销。
  • 缺点:1. 客户端发请求时,需要携带所有欲请求消息的会话ID与其对应的确认时间,请求体相对较大(但也对应了上述优点2);2. 需要开发者自行实现客户端本地数据库(如:Realm数据库。Turms未来可能会以拓展形式,帮助开发者实现本地存储功能)。
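下面给出“不需要服务端参与的Ack机制”的一个客户端侧简化示意(本地用localStorage存储仅为示意,queryMessages的时间区间参数名deliveryDateAfter为假设,实际参数请以客户端API文档为准):

js
// 本地持久化每个会话的最后确认时间
function saveLastAckTime(conversationId, date) {
    localStorage.setItem(`ack:${conversationId}`, String(date.getTime()));
}

function loadLastAckTime(conversationId) {
    const value = localStorage.getItem(`ack:${conversationId}`);
    return value ? new Date(Number(value)) : null;
}

// 登录或重连后,按会话拉取“最后确认时间”之后的消息
async function pullPrivateMessagesSince(targetUserId) {
    const lastAckTime = loadLastAckTime(targetUserId);
    const messages = await turmsClient.messageService.queryMessages({
        areGroupMessages: false,
        fromIds: [targetUserId],        // 对方用户ID,参数形式仅为示意
        deliveryDateAfter: lastAckTime  // 假设的时间区间参数名
    });
    if (messages.length) {
        saveLastAckTime(targetUserId, new Date());
    }
    return messages;
}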

                            关于消息的可达性

                            架构设计永远是平衡的艺术,盲目承诺消息100%必达只是一种销售的说辞。好比大部分互联网应用在分布式事务的技术实现上,只会采用性能更好的弱分布式事务,而非虽然更可靠但性能低下的强分布式事务。是否需要实现100%的消息必达还是根据业务场景而定(如在直播聊天室场景,不仅不要求消息必达,甚至还会要求服务端能主动根据负载情况,抛弃用户消息)。

                            实现消息100%必达的方案也比较简单,可以通过Redis实现一个会话级别的自增ID生成服务器,保证消息ID在一个会话内递增。客户端能通过ID的递增性自行判断是否有消息丢失,如果发现消息丢失,则发请求向服务端拿取指定消息即可。

                            Turms会同时支持上述的会话级消息自增ID实现来保证消息100%必达(TODO),同时也提供基于Snowflake算法的全局自增ID实现来提供最佳的吞吐量(代价就是消息不能保证100%必达)。

                            关于未读消息数的实现

                            业务需求

                            • 作为应用桌面角标(Badge Number)时,显示未读消息总数(iOS必须服务端计算总数)。需要支持离线更新,或不需要支持离线更新
                            • 作为应用内的会话角标时,显示各个会话的未读消息数

                            方案

                            不支持离线消息推送时携带未读消息数(默认实现)支持离线消息推送时携带未读消息数(TODO)
                            实现客户端在接收消息与拉消息时,自行发送请求让服务端实时计算“未读消息数”。
                            在这个方案中,Turms服务端其实并没有未读消息数这个概念,服务端只是根据客户端请求去计算某个消息发送时间区间内的消息数
                            使用Redis,支持离线消息推送时携带未读消息数:携带会话未读消息数与总消息未读数;只携带总未读消息数
                            大致实现是:服务端接收到消息时,将对应的收信人在Redis的未读消息数记录加1,总数也加1
                            用户读取消息时,或用户或群组被删除时,则在Redis记录中做相反的减操作
                            注意:总未读消息数必须由服务端计算
                            优点1. 实现简单且可以灵活地支持各种业务需求,无需专门引入Redis服务端
                            2. 发送消息时,无需向Redis发送请求去计算消息未读数,写吞吐量更高
                            1. 支持离线消息推送时携带未读消息数
                            2. 读取未读消息时,不需要实时计算,读吞吐量更高
                            缺点1. 不支持离线消息推送时携带未读消息数
                            2. 客户端读取未读消息数时,需要实时计算,读吞吐量更低(补充:有索引支持)
                            1. 需要引入Redis服务端,增加运维成本与难度
2. 服务端每次接收到新消息,都需要向Redis发送请求去计算消息未读数,写吞吐量更低
                            与未读消息的关系未读消息未读消息数都是以端为维度,由客户端自行通过上述的客户端向服务发送本地消息最后确认时间,来获取这个时间点之后的“未读”消息与“未读”消息数。
                            因此不同端得到的未读消息未读消息数可能是不一致的
                            未读消息仍是以端为维度,但未读消息数则以用户为维度。如果消息A在桌面端被“读”了,那手机端仍可以认为其“未读”,但推送给该用户所有客户端的未读消息数都统一减了1
                            因此不同端得到的未读消息可能是不一致的,但未读消息数是一致的
                            补充如上文所述,该方案其实也能“强行”支持离线消息推送时携带未读消息数。
                            但因为这方案并不是为频繁读取未读消息数而设计的,因此如果每次推送消息时,都让服务端自行实时计算未读消息数,其性能明显是不可取的,因此实践上是不支持的
                            上述方案各有优劣,具体用哪个方案,取决于具体应用的业务需求。不需要支持离线消息推送时携带未读消息数,则采用左侧的方案,需要支持则采用右侧的方案。
                            如果客户在这两个方案基础上,还有额外需求,则需要自行做二次开发
                            TODO:该实现将在近期支持

                            具体实现

                            TODO

                            关于离线推送的实现

                            对于在线用户,开发者可以通过notification属性来配置是否让服务端主动推送消息给在线用户(默认为true)。对于离线用户,离线推送的实现通常需要借助手机运营商提供的推送SDK,通过其通道进行离线推送。

                            但由于Turms本身不接入任何运营商,也没计划接入,因此您需要通过NotificationHandler插件来实现自定义的离线推送逻辑。该Handler提供一个handle函数,并接受消息信息、在线用户ID、离线用户ID与可选的未读消息数这四个参数,您可以自行通过该函数调用厂商提供的推送SDK,来实现离线推送逻辑。

                            消息批量拉取

                            TODO:暂不支持。由于消息拉取是由客户端自行控制的,因此该功能可以很容易地高效且灵活实现,我们会在正式发布之前提供支持。

                            特大群

                            特大群的实现其实并不难,只是它的业务需求与场景跟一般社交应用的很不一样,所以要有一套专门的策略来支持特大群。

                            策略(TODO)

                            1. 消息按照优先级发送
                            2. 智能限制消息峰值,主动根据服务端状况与消息优先级丢消息
                            3. 分桶(分小群)发消息
                            4. 通常不需要消息漫游功能

diff --git a/docs/zh-CN/feature/group.html b/docs/zh-CN/feature/group.html

                            群组相关功能

                            群成员类型包括:群主、管理员、普通成员、游客、匿名游客

                            相关路径与模型

                            • 管理员API路径:/groups。具体API细节请参考OpenAPI文档
                            • 客户端接口:请查阅GroupServiceController类。
                            • 底层请求模型:请查阅https://github.com/turms-im/proto/tree/master/request/group目录下的接口描述文件
                            • 配置类:im.turms.server.common.infra.property.env.service.business.group.GroupProperties

                            功能列表

                            功能
                            描述相关配置属性名
                            新建群组新建群组turms.service.group.activate-group-when-created
                            群主解散群群主可以解散群turms.service.group.delete-group-logically-by-default
                            主动退群除群主外,其他用户均可以主动退群。群主需先将群转让给其他群成员才可以进行退群操作
                            群主转让群群主可以将群的拥有者权限转给群内的其他成员,转移后, 被转让者变为新的群主,原群主变为普通成员。群主还可以选择在转让的同时,直接退出该群
                            修改群组资料支持群组名,群组头像,群组介绍,群组通知,群组类型等字段
                            群组禁言群组普通成员在禁言时段无法发送消息,仅有群主与管理员能发送消息
                            获取群组信息根据过滤条件(如群组ID),查找群组
                            增加群组成员增加群组成员
                            发送入群邀请拥有邀请权限角色的群组成员可向指定用户发送入群邀请turms.service.group.invitation.content-limit
                            turms.service.group.invitation.expire-after-seconds
                            turms.service.group.invitation.expired-invitations-cleanup-cron
                            turms.service.group.invitation.delete-expired-invitations-when-cron-triggered
                            撤销入群邀请群主、管理员与入群邀请发起者可撤销入群邀请turms.service.group.invitation.allow-recall-pending-invitation-by-owner-and-manager
                            发送入群请求turms.service.group.join-request.content-limit
                            turms.service.group.join-request.expire-after-seconds
                            turms.service.group.join-request.expired-join-requests-cleanup-cron
                            turms.service.group.join-request.delete-expired-join-requests-when-cron-triggered
                            撤销入群请求turms.service.group.join-request.allow-recall-join-request-sent-by-oneself
                            设置入群问题对于入群策略为“入群请求者回答问题正确后加入”的群组,群主与管理员可以设置入群问题。入群问题可以有多个,一个问题可以多个答案turms.service.group.question.answer-content-limit
                            turms.service.group.question.max-answer-count
                            turms.service.group.question.question-content-limit
                            删除入群问题删除入群问题
                            移除群组成员群主和管理员可以移除群组成员,且管理员不能移除群主和其他管理员
                            更新群组成员信息根据对应的“群组类型”,指定角色的群组成员可以修改其他群组成员的成员信息(如:群主为群组成员赋予管理员角色)
                            群组成员禁言禁言用户可以在群组内,但无法发送消息
                            群组成员坐标实时共享群组成员可以将自己的坐标实时地分享给其他群组成员
                            群组黑名单用户被拉黑后,将无法再进入群组。如果被拉黑用户在被拉黑之前是当前群组成员,则在拉黑后该用户会自动在群组成员列表中移除

                            群组类型配置

                            在群组配置方面,Turms使用了“群组类型”这一概念。默认情况下,Turms提供了一种通用的群组类型,同时您也可以通过对“群组类型”做增删改查操作,以满足您定制化的群组类型需求。

                            对应的管理员API路径:/groups/types。具体API细节请查阅OpenAPI文档 对应的配置模型:im.turms.service.domain.group.po.GroupType

                            配置列表

                            属性描述配置属性名
                            群成员上限人数有效值为1~∞groupSizeLimit
                            邀请入群策略支持配置:
                            ①仅群主可邀请:OWNEROWNER_REQUIRING_APPROVAL
                            ②群主+管理员可邀请:OWNER_MANAGEROWNER_MANAGER_REQUIRING_APPROVAL
                            ③群主+管理员与群成员可邀请:OWNER_MANAGER_MEMBEROWNER_MANAGER_MEMBER_REQUIRING_APPROVAL
                            ④所有人可邀请:ALLALL_REQUIRING_APPROVAL
                            invitationStrategy
                            被邀请人同意模式支持配置:
                            ①需要被邀请人同意:邀请者给被邀请者发送邀请。如果被邀请者同意邀请,则自动加入群:带_REQUIRING_APPROVAL的策略;
                            ②不需要被邀请人同意:邀请者禁止给被邀请者发送邀请。邀请者可以直接把被邀请者加入群中:不带_REQUIRING_APPROVAL的策略
                            invitationStrategy
                            入群策略支持配置:
                            ①在群主或管理员批准入群请求后,入群请求者方可加入:JOIN_REQUEST
                            ②入群请求者回答问题正确后,自动加入:QUESTION
                            ③允许未被拉黑的用户主动加入:MEMBERSHIP_REQUEST
                            ④不允许任何用户主动加入,需要群主或管理员发送邀请或直接拉入群中:INVITATION
                            joinStrategy
                            群信息更新策略支持配置:
                            ①仅群主可修改;
                            ②群主+管理员可修改;
                            ③群主+管理员+群成员可修改;
                            ④所有人可修改
                            groupInfoUpdateStrategy
                            群成员信息更新策略群主可以修改所有人的在群组内的成员信息,管理员只能修改群组中普通成员的成员信息memberInfoUpdateStrategy
                            游客发言可禁止、可允许guestSpeakable
                            群成员修改自身信息可禁止、可允许selfInfoUpdatable
                            群消息已读回执可开启、可关闭enableReadReceipt
                            修改已发送消息可开启、可关闭messageEditable

                            提醒:

                            • 上述的“邀请入群策略”、“被邀请人同意模式”与“入群策略”之间没有互斥关系,都是彼此兼容的,因此开发者可以根据自身的应用场景,对其进行搭配。

                            • 如果管理员修改了一个群组类型的邀请策略或入群策略,进而导致群组所对应的策略发生变化,那么原本对应旧策略的数据会被封存,而不会被系统删除,但原本就有权限的用户仍然可以删除、修改与查询这些数据。

举例而言,一个群原本是基于“审批入群请求”策略让新用户入群的,并且该群已经接收到了一些入群请求,如果此时系统管理员(注意:用户是没权限修改群组类型的)将群组策略修改为“基于问答”策略让新用户入群,那么之前收到的入群请求并不会被系统删除。当群管理员试图批准这些入群请求时,服务端也会告知群策略已发生变化,并拒绝批准。但是群管理员仍然可以删除、修改与查询这些入群请求。

                              额外一提,可能部分用户会觉得Turms的群组策略比较复杂,但这种“复杂”跟用户没什么关系,用户只需要按照自己的应用场景做配置即可,使用起来非常简单,只是Turms的开发者实现这些动态的组合策略比较复杂。

                            • 咱无计划支持“用户拉黑群组,以拒绝接收入群邀请与被拉入群中”特性。

                            场景介绍

                            用户加入一个群

                            1. 客户端通过turmsClient.groupService.queryGroups(...)查询指定群的群信息。

                            2. 基于本地硬编码的群类型ID与群类型信息的关系,获得群类型信息。

                              补充:

                              • 这里不支持客户端动态查询群组类型信息是因为大部分应用的群组类型很固定,没有动态拉取信息的必要。
                              • 如果您的应用本来就只使用一种群类型,那直接在客户端硬编码群类型信息就可以了,直接跳过①②两个步骤,直接进入下一个步骤。
3. 根据群类型信息中的入群策略,判断需要调用哪个客户端API加群(完整流程示意见本节末尾):

                              • 如果是JOIN_REQUEST策略,则需要调用turmsClient.groupService.createJoinRequest(...)来发送入群请求,并等待群管理员审批。
                              • 如果是QUESTION策略,则需要调用turmsClient.groupService.queryGroupJoinQuestions(...)查询群问题,再通过turmsClient.groupService.answerGroupQuestions(...)来回答群问题,当分值达到群管理员设置的入群分值门槛后,即可自动加入群中。
                              • 如果是MEMBERSHIP_REQUEST策略,则调用turmsClient.groupService.joinGroup(...)即可直接加入群中,不需要任何审批。
                              • 如果是INVITATION策略,则需要等待群管理员给当前用户发送入群邀请。
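将上述流程串起来,一个基于turms-client-js的加群流程示意如下(各接口的参数形式仅为示意,buildAnswers为假设的业务层函数,实际请以客户端API文档为准):

js
async function joinGroupByStrategy(groupId, joinStrategy) {
    switch (joinStrategy) {
        case 'JOIN_REQUEST':
            // 发送入群请求,等待群管理员审批
            await turmsClient.groupService.createJoinRequest(groupId, 'hello');
            break;
        case 'QUESTION': {
            // 先查询入群问题,再回答问题,分值达到门槛后自动入群
            const questions = await turmsClient.groupService.queryGroupJoinQuestions(groupId, false);
            const questionIdToAnswer = buildAnswers(questions); // 假设的业务层函数:问题ID -> 答案
            await turmsClient.groupService.answerGroupQuestions(questionIdToAnswer);
            break;
        }
        case 'MEMBERSHIP_REQUEST':
            // 无需审批,直接加入
            await turmsClient.groupService.joinGroup(groupId);
            break;
        case 'INVITATION':
            // 只能等待群管理员发送入群邀请,客户端无需主动操作
            break;
    }
}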

diff --git a/docs/zh-CN/feature/index.html b/docs/zh-CN/feature/index.html

                            业务功能介绍

                            1. 在业务功能列表处,有部分功能标注了“✍”图标,该图标用于表明:是否执行该业务功能点的判定逻辑,需要您结合自身业务应用场景,自行判断并调用相关API。因为Turms自身无法判定当前上下文是否满足触发该功能点的条件。
                            2. 此功能列表参考了:网易云信、环信、融云、LeanCloud、腾讯云通讯等商用即时通信服务。Turms提供了几乎所有这些商业服务所提供的业务功能,并在很多方面更上一层楼。
                            3. Turms的功能配置参数极其自由,您甚至可以配置一个群组上限成员数量为10,000,单个消息上限100MB,关闭大部分业务功能等等的配置,拓展将消息转发给所有的用户等等的功能,Turms服务端不会干涉您满足任何的业务场景。 Turms只是为您提供了最通用且合理的默认配置,如默认一个群的上限人数为500,单个消息最大可为1MB等等。
                            4. 如果您未在此列表中找到您所需要的功能,请先检查是否您的需求仅需配置Turms参数即可实现。确认无法通过Turms配置参数实现后,请再在Issue区域提出。Turms会根据“性价比”进行评估,并尽可能满足您的需求。
                            5. Turms的版本号设计并不完全遵循 Semantic Versioning,Turms的大版本号主要由关键功能的引入而推动。在涉及Breaking Changes的部分会单独提出。

                            注意

                            • 对于一些功能点,Turms服务端或是客户端本身并不直接提供一些业务功能点。以“阅后即焚”功能为例,Turms实际做的事情仅仅是在消息的基础上,多传递了一个burnAfter的参数,阅后怎么“焚”,什么时间点“焚”,要不要用户的本地数据库里的消息也给“焚”了等等业务实现细节都是上层应用实现者要考虑的事情,Turms不予干预。
                            • 做功能设计时,要牢记国家的相关法律法规,避免设计出与国家管理要求相悖的设计。如《互联网交互式服务安全管理要求 第4部分:即时通信服务》

diff --git a/docs/zh-CN/feature/message.html b/docs/zh-CN/feature/message.html

                            消息相关功能

                            相关路径与模型

                            • 管理员API路径:/messages。具体API细节请参考OpenAPI文档
                            • 客户端接口:请查阅MessageServiceController
                            • 底层请求模型:请查阅https://github.com/turms-im/proto/tree/master/request/message目录下的接口描述文件
                            • 配置类:im.turms.server.common.infra.property.env.service.business.message.MessageProperties

                            功能列表

                            消息功能
                            功能描述相关配置
                            离线消息实现思路:您可以在Turms客户端每次登陆时,都<主动>向Turms服务端请求关于<该用户在离线状态时,收到的所有私聊与群聊各自具体的离线消息数量,以及各自具体的最后N条消息(默认为1条)>的数据,以此同时兼顾消息的实时性与服务的性能。 默认情况下,Turms服务端<不会>定时删除寄存在Turms服务端的任何离线消息turms.service.message.default-available-messages-number-with-total
                            漫游消息✍在新设备登录时,由开发者自行调用Turms客户端的消息查询接口,指定数量与时段等条件,向Turms服务端请求漫游消息。
                            漫游消息的实现本质与“历史消息”的实现一样
                            (✍原因:Turms无法自行判断什么是“新设备登陆”)
                            多端同步当一名用户有多客户端同时在线时,Turms服务端会将消息下发给该用户所有在线的客户端
                            历史消息支持查询用户的历史消息。默认Turms永久存储消息(包括用户消息或系统消息)
                            历史消息的实现本质与“漫游消息”的实现一样
                            turms.service.message.message-retention-period-hours
                            turms.service.message.expired-messages-cleanup-cron
                            发送消息turms.service.message.time-type
                            turms.service.message.persist-message
                            turms.service.message.persist-record
                            turms.service.message.persist-pre-message-id
                            turms.service.message.persist-sender-ip
                            turms.service.message.check-if-target-active-and-not-deleted
                            turms.service.message.max-text-limit
                            turms.service.message.max-records-size-bytes
                            turms.service.message.allow-send-messages-to-oneself
                            turms.service.message.allow-send-messages-to-stranger
                            turms.service.message.delete-message-logically-by-default
                            turms.service.message.send-message-to-other-sender-online-devices
                            turms.service.message.use-conversation-id
                            turms.service.message.sequence-id.use-sequence-id-for-group-conversation
                            turms.service.message.sequence-id.use-sequence-id-for-private-conversation
                            消息撤回撤回投递成功的消息,默认允许发信人撤回距投递成功时间 5 分钟内的消息turms.service.message.allow-recall-message
                            turms.service.message.available-recall-duration-seconds
                            消息编辑编辑已发送成功的消息turms.service.message.allow-edit-message-by-sender
                            阅后即焚收信人接收到发信人的消息后,收信人客户端会根据发信人预先设定(或默认)的时间按时自动销毁
                            已读回执✍通知私聊对象或群组成员中,当前用户已读某条消息
                            查看私聊、群组会话中对方的已读/未读状态
                            (✍原因:Turms无法得知您的用户在什么情况下算是“已读某条消息”。开发者需要自行调用turmsClient.messageService.readMessage()来告知对方,当前用户已读某条消息)
                            turms.service.conversation.read-receipt.enabled
                            allow-move-read-date-forward
                            turms.service.conversation.read-receipt.update-read-date-after-message-sent
                            turms.service.conversation.read-receipt.update-read-date-when-user-querying-message
                            turms.service.conversation.read-receipt.use-server-time
                            消息转发将消息转发给其他用户或群组
                            @某人用于特别提醒某用户。如果Turms客户端检测到已接收的消息中被@的用户为当前登陆中的用户,Turms客户端则会触发@回调函数。开发者可自行实现后续相关业务逻辑。常用于给被@的用户提醒通知。
                            群内 @ 消息与普通消息没有本质区别,仅是在被 @ 的人在收到消息时,需要做特殊处理(触发回调函数)
                            正在输入✍当通信中的一方正在键入文本时,告知收信人(一名或多名用户),该用户正在输入消息
                            (✍原因:Turms无法得知您的用户是否正在键入文本)
                            turms.service.conversation.typing-status.enabled

                            查询会话消息时的注意事项

                            默认配置下,Turms不支持“在私聊会话中,消息发送者能够查询他自己发送的消息”(具体原因:消息索引设计。注意:在群聊会话中,消息发送者始终能够查询他自己发送的消息),开发者可以通过在turms-service服务端的配置文件中配置turms.service.message.use-conversation-id=true来启用会话ID

之后turmsClient.messageService.queryMessages({areGroupMessages: false, fromIds: [10,11,12]})的语义会由原来的“查询私聊会话中,由用户ID为10、11与12的用户发给当前用户的消息”变为“查询私聊会话中,由用户ID为10、11与12的用户发给当前用户的消息,与当前用户发送给用户ID为10、11与12的用户的消息”。

                            业务消息类型

                            从开发者角度看,Turms客户端在发送消息时内部有且仅使用一种数据模型,即CreateMessageRequest。由于它带有string与List<byte[]>类型的字段,因此您实际上能在发送消息时传递任何形式的数据。只是Turms为方便开发者快速实现各种业务消息类型,Turms客户端对常见消息类型做了划分,以方便开发者快速上手。

                            提醒:Turms的消息(所有业务类型的消息)均可以标记为系统消息。但系统消息只能通过turms管理员API发送,Turms客户端无法发送系统消息。
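例如,基于turms-client-js发送一条同时携带文本与二进制数据的消息,大致写法如下(参数名与参数形式仅为示意,实际请以客户端API文档为准):

js
// records可携带任意二进制数据,此处以一条自定义“红包”消息的JSON编码为例
const payload = new TextEncoder().encode(JSON.stringify({ type: 'red-envelope', amount: 100 }));
turmsClient.messageService.sendMessage({
    isGroupMessage: false,
    targetId: '456',
    text: 'hello turms',
    records: [payload],
    burnAfter: 30 // 阅后即焚的时间,仅当您的业务层实现了对应逻辑时才有意义
})
    .then((messageId) => console.log(`message ${messageId} has been sent`));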

                            业务消息类型
                            描述
                            文本消息消息内容为文本
                            提醒:文本也可以是JSON,编码成Base64的二进制数据
                            图片消息消息内容为描述部分(可选):图片 URL 地址、尺寸、图片大小
                            图片数据(可选)
                            语音消息消息内容为描述部分(可选):语音文件的 URL 地址、时长、大小、格式
                            语音数据(可选)
                            视频消息消息内容为描述部分(可选):视频文件的 URL 地址、时长、大小、格式
                            视频数据(可选)
                            文件消息消息内容为描述部分(可选):文件的 URL 地址、大小、格式
                            文件数据(可选)
                            地理位置消息消息内容为地理位置标题、地址、经度、纬度信息
                            组合消息消息内容为文本信息与任意个数的其他任意内容类消息类型的消息(如:一条消息既包含了文本,也包含了图片与音频)
                            自定义消息Turms在传输时仅使用一种数据结构,它自身可以携带string与List<byte[]>数据结构。因此开发者可以自由实现任意的自定义消息类型,例如红包消息、石头剪子布等形式的消息

                            二进制数据的传输实现

                            二进制数据(文件)的传输实现方案主要有以下两种:

                            使用Turms客户端发送消息API的records字段(极不推荐)使用对象存储服务(AWS S3、阿里云OSS等)
简介Turms默认支持传递与存储消息附带的二进制数据records,因此您可以将图片、视频、文件等二进制数据存储在records当中。您应用的客户端(注意:这里的“客户端”不是Turms的客户端,是您IM应用的客户端)向您的业务服务端程序请求OSS操作许可Token,由客户端带着这个Token找到OSS服务并上传文件至OSS,接着拿着从OSS返回的文件URL传递给Turms服务端,由Turms保存这个URL文本,而不保留文件的二进制数据。
                            由于Turms插件支持开发者自行实现文件管理服务,因此您也可以通过实现插件的方式实现该功能。比如Turms官方提供的MinIO对象存储服务端的集成实现turms-plugin-minio就是基于Turms插件实现的,供您参考
                            优点实现简单无限容量;
                            支持CDN加速,优化用户体验;
                            支持UI可视化管理,并提供各种运维管理功能。云存储服务一般都支持诸如冗余存储、服务器端加密、冷热数据分层存储(极大地减低数据存储成本)等实用功能特性
                            缺点一个Turms客户端有且仅与服务端建立一个TCP连接,因此如果用户使用Turms客户端自带的records字段传输较大的文件,则会阻塞其他业务请求的数据传输;
                            MongoDB在查询消息数据时,会把整条消息记录加载到内存中,极大地拖慢消息查询速度

                            参考资料:存储服务

                            + \ No newline at end of file diff --git a/docs/zh-CN/feature/simultaneous-login.html b/docs/zh-CN/feature/simultaneous-login.html index 15fa7d07..3e503105 100644 --- a/docs/zh-CN/feature/simultaneous-login.html +++ b/docs/zh-CN/feature/simultaneous-login.html @@ -17,8 +17,8 @@ -
                            + \ No newline at end of file diff --git a/docs/zh-CN/feature/user.html b/docs/zh-CN/feature/user.html index 8962ebb2..997e5fb7 100644 --- a/docs/zh-CN/feature/user.html +++ b/docs/zh-CN/feature/user.html @@ -17,8 +17,8 @@ -

User-related features

Related paths and models

• Admin API path: /users. For the specific API details, see the OpenAPI documentation
• Client API: see UserServiceController
• Underlying request models: see the interface description files under https://github.com/turms-im/proto/tree/master/request/user
• Configuration class: im.turms.server.common.infra.property.env.service.business.user.UserProperties

User profile features

Feature | Description | Related configurations
Add user | | turms.service.user.activate-user-when-added
Delete user | | turms.service.user.delete-user-logically
Update user profile | A user updates their own nickname, intro and profile picture URL |
Query user profiles | A user views their own profile or other users' profiles |
Set profile access permissions | A user can set the access permission of each item of their profile: visible to everyone, visible to friends, or visible only to the user |
User permission groups | Admins can grant different permissions to different users | Model: im.turms.service.domain.user.po.UserPermissionGroup

User relationship hosting

Concepts:

• Relationship: relationships are divided into one-way and two-way relationships. A one-way relationship means the relationship Owner has some specific relationship with a Related User, such as a "one-way friend" (the other party is allowed to send messages and friend requests) or a "blocked user" (the other party is forbidden from sending messages, friend requests, etc.). Establishing a one-way relationship requires no permission check. A two-way relationship means user A has a one-way relationship with user B and user B also has a one-way relationship with user A; for example, user A may block user B while user B chooses not to block user A.
• Related Users: users who have a one-way or two-way relationship (marked as a friend or as a blocked user). If two users have no relationship of any kind, they are Strangers.
• Related-user groups: a related-user group consists of a group name and a set of related users, and every relationship belongs to at least one related-user group. If the client does not assign a relationship to a group when creating it, the relationship is put into the user's default group. Note in particular that a single related-user group can therefore contain both "friends" and "blocked" users; of course, you can restrict a group to one kind of related user through your own business rules.

Additional note: in fact, the Turms domain model has no such concept as "friend/blocked user"; in essence it is just a bool named "isBlocked".

Feature | Description | Related configurations
Query relationships | Queries the current user's relationships, with optional filters (e.g. specific user IDs, "is a related user", "is a friend/blocked user") and grouping conditions |
Add a related user (+ send a friend request) | ① When adding a related user as a "friend", depending on your Turms server configuration, the user can either add the "friend" relationship directly, or first send a friend request, with the "friend" relationship established automatically once the recipient approves it.
② When adding a related user as a "blocked user", no approval is needed and it takes effect immediately; the user will no longer receive any messages or requests from the blocked user. |
turms.service.user.friend-request.content-limit
turms.service.user.friend-request.delete-expired-requests-when-cron-triggered
turms.service.user.friend-request.allow-send-request-after-declined-or-ignored-or-expired
turms.service.user.friend-request.friend-request-expire-after-seconds
turms.service.user.friend-request.expired-user-friend-requests-cleanup-cron
Accept/decline a friend request | A user can accept or decline a friend request. If the request is accepted, a two-way "friend" relationship is established between the two users |
Delete related users | Deletes a category of related users or specific related users according to optional conditions (e.g. "is/is not a related user", "is a friend/blocked user"). deleteTwoSidedRelationships |
Update the relationship with a related user | Updates the relationship (friend/blocked user) information. When changing a relationship to "friend", a friend request is required first by default (this step can be disabled) |
Create a related-user group | When creating a group, the group name and the related users to add can be specified together. The same related user can be added to multiple groups |
Delete a related-user group | Deletes a related-user group; optionally, the related users in the deleted group can be moved to another group (if not specified, they are moved to the "default group") |
Rename a related-user group | Renames a related-user group |
Query the user's own related-user groups | Queries the user's own related-user group information |
Add a related user to a group | Adds/moves a related user to a related-user group. The operation fails if the group does not exist |
Remove a related user from a group | Removes a related user from a related-user group |
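
As a rough client-side sketch of the friend-request flow (the method and option names below are illustrative assumptions, not the verified client API; check UserServiceController and your client's UserService for the exact signatures):

js
// Illustrative only: user 20001 sends a friend request, and user 20002 accepts it.
await turmsClient.userService.sendFriendRequest({ recipientId: 20002, content: 'Hi, this is 20001' }); // assumed method name
// ... on user 20002's client, after the friend-request notification arrives:
const requestId = 1; // the ID carried by the received friend-request notification
await turmsClient.userService.replyFriendRequest({ requestId, responseAction: 'ACCEPT_REQUEST' }); // assumed method name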

Location features

Configuration class: im.turms.server.common.infra.property.env.common.location.LocationProperties

Feature | Description | Related configurations
User location recording | Periodically records users' locations | turms.location.enabled
turms.location.treat-user-id-and-device-type-as-unique-user
Nearby users | Searches for other nearby users based on the current real-time coordinates | turms.location.users-nearby-request.default-max-available-nearby-users-number
turms.location.users-nearby-request.default-max-distance-meters
turms.location.users-nearby-request.max-available-users-nearby-number-limit
turms.location.users-nearby-request.max-distance-meters

Statistics features

Configuration class: im.turms.server.common.infra.property.env.service.env.StatisticsProperties

Although Turms provides some basic statistics features, we recommend collecting statistics through cloud services such as Amazon CloudWatch.

Feature | Description | Related configurations
Online user count statistics | The master node of a Turms cluster periodically logs the number of online users in the cluster | turms.service.statistics.log-online-users-number
turms.service.statistics.online-users-number-logging-cron
                            + \ No newline at end of file diff --git a/docs/zh-CN/index.html b/docs/zh-CN/index.html index cc0b3d61..ae9155f3 100644 --- a/docs/zh-CN/index.html +++ b/docs/zh-CN/index.html @@ -17,7 +17,7 @@ -

What is Turms

Turms is among the most advanced open-source instant messaging engines in the world, designed for applications with 100K to 10M concurrently online users.

To learn about the Turms project in detail, please read the Turms documentation. The following is an overview of the project.

Playground

(Server versions of the current demo: ghcr.io/turms-im/turms-admin:latest, ghcr.io/turms-im/turms-gateway:latest, ghcr.io/turms-im/turms-service:latest)

You can use any turms-client-(java/js/swift) client to send requests to the turms-gateway server and interact with other users.

In addition, the Playground is set up fully automatically by a single command: ENV=dev,demo docker compose -f docker-compose.standalone.yml --profile monitoring up --force-recreate -d

Quick Start

The following commands set up a complete minimal Turms cluster (turms-gateway, turms-service and turms-admin) and its dependent servers (a MongoDB sharded cluster and Redis) locally, fully automatically:

                            sh
                            git clone --depth 1 https://github.com/turms-im/turms.git
                             cd turms
                             docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions
                             docker compose -f docker-compose.standalone.yml up --force-recreate
                            @@ -38,7 +38,7 @@
Upstream project: turms-im/turms
Upstream project URL: https://github.com/turms-im/turms
Upstream project documentation: https://turms-im.github.io/docs

                            Q & A

1. How does the Turms project make money?

  We currently do not need to make money. Of course, we are not against making money either, but we will not deliberately write poor documentation or do a poor job on the project in order to earn consulting, training or similar fees. As a side note, there are indeed quite a few (semi-)open-source projects that earn support fees precisely by not writing good documentation and not doing the open-source project well.

2. If a for-profit organization, such as a training institution or a company, quotes the Turms documentation, or even sells the Turms project as a SaaS service, is there anything it needs to pay attention to?

  We do not care at all whether your team plans to profit from the Turms project; your team only needs to comply with the Apache License 2.0 and credit the upstream Turms project information listed above.

3. The Turms project is well suited to being offered as SaaS, so why doesn't it use the AGPL or SSPL license?

  We currently do not need to make money, nor do we plan to; we only require users to comply with the Apache License 2.0.

4. If the Turms project is not for profit, how good is its quality?

  Our documentation and source code answer this question for us, and so far no open-source IM project in the world can compete with Turms in medium-to-large IM application scenarios. As a side note, being commercial does not imply high quality; in fact, the documentation and code quality of quite a few commercial projects is shocking.

5. Does Turms use dual licensing or have hidden charges?

  No. Some open-source projects are free for personal use but charge for commercial use, adopt dual licensing, or come with many hidden charges, whereas the entire Turms project uses one and only one license, the Apache License 2.0, and has no paid component anywhere. Some projects claim to be open-source software but actually are not; for the real definition of open-source software, see the Chinese version and the English version.

Special thanks

The Turms project is developed mainly in two IDEs: IntelliJ IDEA and CLion.

Thanks to the JetBrains Community Support Team for the licenses it provides to non-commercial open-source projects.

                            - + \ No newline at end of file diff --git a/docs/zh-CN/reference/admin-api.html b/docs/zh-CN/reference/admin-api.html index 771390ad..99d7bdea 100644 --- a/docs/zh-CN/reference/admin-api.html +++ b/docs/zh-CN/reference/admin-api.html @@ -17,8 +17,8 @@ -

Admin API

The Turms server API documentation follows the OpenAPI 3.0 standard, and each server exposes its own OpenAPI documentation through an HTTP service.

To browse the API documentation, start a Turms server and visit http://localhost:<port>/openapi/ui. If you need the documentation as JSON, visit http://localhost:<port>/openapi/docs. The default port of the turms-gateway admin HTTP server is 9510, while turms-service uses port 8510.

Note: when deploying Turms servers to production, you usually should not expose the Admin API ports of the Turms servers to the public network, to avoid unnecessary attacks.

API design principles

So that the APIs are self-explanatory and developers can understand them at a glance, the design of the Turms Admin API takes the RESTful style as a reference, with further optimization and unification. Specifically, it follows these principles:

• The path part of a URL represents the target resource, such as /users/relationships, or a representation of the resource, such as /users/relationships/page, which returns the resource in pages. A given URI only ever returns responses in one format.

• The POST method is used to Create resources, DELETE to Delete resources, PUT to Update resources, GET to Query resources, and the more special HEAD method to Check resources (similar to GET, but with no response body; it interacts only through HTTP status codes)

• The query string of a request is used to locate resources, such as ?ids=1,2,3, or to carry extra instructions, such as ?reset=true (see the sketch after this list)

  Note: unlike the RESTful style, Turms servers do not use the request URL path for resource location. For example, for the GET /flight-recordings/jfr API that downloads JFR files, the RESTful style would be GET /flight-recordings/jfr/{id}, whereas Turms servers use GET /flight-recordings/jfr?id={id}

• The request body describes the data to create or update
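
A minimal sketch of calling one of these endpoints from Node.js 18+ (which ships a global fetch). Authentication is omitted; supply whatever credentials your deployment requires (see the "Admin API security" section below):

js
// Query users 1, 2 and 3 through the turms-service admin API (default port 8510).
const response = await fetch('http://localhost:8510/users?ids=1,2,3');
const body = await response.json();
console.log(response.status, body);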

Consumers of the admin API

• Your front-end admin system or your back-end servers, by sending HTTP(S) requests

• turms-admin, the web project for back-office administration

Note: the admin API is not meant for end users; it is called internally by your own team. Therefore, you usually do not need to expose a public IP and port for the turms-service server.

Categories

Non-business categories

Monitoring

Category | Controller | Path | Supported by
Log management | LogController | /logs | Both servers
Metrics management | MetricsController | /metrics | Both servers
Flight recording management | FlightRecordingController | /flight-recordings | Both servers

Plugins

Category | Controller | Path | Supported by
Plugin management | PluginController | /plugins | Both servers

Admins

Category | Controller | Path | Supported by | Notes
Admin management | AdminController | /admins | turms-service | Each Turms cluster has a default account with the ROOT role whose account name and password are both turms
Admin role management | AdminRoleController | /admins/roles | turms-service | Each Turms cluster has a default super-admin role named ROOT, which has all permissions

Cluster

Category | Controller | Path | Supported by
Cluster member management | MemberController | /cluster/members | turms-service
Cluster settings management | SettingController | /cluster/settings | turms-service

Blocklists

Category | Controller | Path | Supported by
IP blocklist management | IpBlocklistController | /blocked-clients/ips | turms-service
User blocklist management | UserBlocklistController | /blocked-clients/users | turms-service

User sessions

Category | Controller | Path | Supported by
User session management | SessionController | /sessions | turms-gateway

Business categories

All the APIs in the tables below exist only on turms-service servers; turms-gateway servers do not provide them.

Users

Category | Controller | Path
User management | UserController | /users
User online status management | UserOnlineInfoController | /users/online-infos
User permission group management | UserPermissionGroupController | /users/permission-groups
User relationship management | UserRelationshipController | /users/relationships
User relationship group management | UserRelationshipGroupController | /users/relationships/groups
User friend request management | UserFriendRequestController | /users/relationships/friend-requests

Groups

Category | Controller | Path
Group management | GroupController | /groups
Group type management | GroupTypeController | /groups/types
Group join question management | GroupQuestionController | /groups/questions
Group member management | GroupMemberController | /groups/members
Group blocklist management | GroupBlocklistController | /groups/blocked-users
Group invitation management | GroupInvitationController | /groups/invitations
Group join request management | GroupJoinRequestController | /groups/join-requests

Conversations

Category | Controller | Path
Conversation management | ConversationController | /conversations

Messages

Category | Controller | Path
Message management | MessageController | /messages

Statistics

Most of the currently exposed statistics APIs are legacy APIs and are not recommended; we will adjust and refactor them later. See the Data Analysis chapter for the specific reasons.

Admin API security

Every HTTP request a user sends to a Turms server goes through the server's authentication and authorization process; see Admin Security for details.

                            + \ No newline at end of file diff --git a/docs/zh-CN/reference/status-code.html b/docs/zh-CN/reference/status-code.html index 5a6d6fdb..0a20e47e 100644 --- a/docs/zh-CN/reference/status-code.html +++ b/docs/zh-CN/reference/status-code.html @@ -17,8 +17,8 @@ -

Status codes

There are two kinds of status codes developers should know about: ResponseStatusCode and SessionCloseStatus. You do not need to memorize the tables below; it is enough to know where to look up a status code you do not recognize.

ResponseStatusCode

ResponseStatusCode indicates the processing status of a request, similar to an HTTP status code.

Every response carries a ResponseStatusCode. For the specific declarations, see the class im.turms.client.model.ResponseStatusCode in the turms-client-kotlin project.
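
A rough sketch of inspecting the code of a failed request on the JavaScript client (whether the thrown ResponseError exposes the status code as error.code should be verified against your turms-client-js version):

js
// Non-10xx responses surface as a thrown ResponseError; the `code` property name
// below is an assumption to be checked against your client version.
try {
    await turmsClient.messageService.sendMessage({ isGroupMessage: false, targetId: 20002, text: 'hi' });
} catch (error) {
    if (error.code === 5001) { // MESSAGE_SENDER_NOT_IN_CONTACTS_OR_BLOCKED
        console.warn('The recipient only accepts messages from related users');
    } else {
        throw error;
    }
}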

Client-only status codes

Client-only status codes never appear on a Turms server; they indicate that a client request was rejected locally by the client.

• Connection: CONNECT_TIMEOUT (1)
• Requests: INVALID_REQUEST (100), CLIENT_REQUESTS_TOO_FREQUENT (101), REQUEST_TIMEOUT (102), ILLEGAL_ARGUMENT (103)
• Notifications: INVALID_NOTIFICATION (200), INVALID_RESPONSE (201)
• Sessions: CLIENT_SESSION_ALREADY_ESTABLISHED (300), CLIENT_SESSION_HAS_BEEN_CLOSED (301)
• Messages: MESSAGE_IS_REJECTED (400)
• Storage: QUERY_PROFILE_URL_TO_UPDATE_BEFORE_LOGIN (500)

Common status codes

• Success responses: OK (1000), NO_CONTENT (1001), ALREADY_UP_TO_DATE (1002)
• Client request errors: INVALID_REQUEST_FROM_SERVER (1100), CLIENT_REQUESTS_TOO_FREQUENT_FROM_SERVER (1101), ILLEGAL_ARGUMENT_FROM_SERVER (1102), RECORD_CONTAINS_DUPLICATE_KEY (1103), REQUESTED_RECORDS_TOO_MANY (1104), SEND_REQUEST_FROM_NON_EXISTING_SESSION (1105), UNAUTHORIZED_REQUEST (1106)
• Server errors: SERVER_INTERNAL_ERROR (1200), SERVER_UNAVAILABLE (1201)
• User login errors: UNSUPPORTED_CLIENT_VERSION (2000), LOGIN_TIMEOUT (2010), LOGIN_AUTHENTICATION_FAILED (2011), LOGGING_IN_USER_NOT_ACTIVE (2012), LOGIN_FROM_FORBIDDEN_DEVICE_TYPE (2013)
• User session errors: SESSION_SIMULTANEOUS_CONFLICTS_DECLINE (2100), SESSION_SIMULTANEOUS_CONFLICTS_NOTIFY (2101), SESSION_SIMULTANEOUS_CONFLICTS_OFFLINE (2102), CREATE_EXISTING_SESSION (2103), UPDATE_NON_EXISTING_SESSION_HEARTBEAT (2104)
• User location errors: USER_LOCATION_RELATED_FEATURES_ARE_DISABLED (2200), QUERYING_NEAREST_USERS_BY_SESSION_ID_IS_DISABLED (2201)
• User profile errors: UPDATE_INFO_OF_NON_EXISTING_USER (2300), USER_PROFILE_NOT_FOUND (2301), PROFILE_REQUESTER_NOT_IN_CONTACTS_OR_BLOCKED (2302), PROFILE_REQUESTER_HAS_BEEN_BLOCKED (2303)
• User permission group errors: QUERY_PERMISSION_OF_NON_EXISTING_USER (2400)
• User relationship errors: ADD_NOT_RELATED_USER_TO_GROUP (2500), CREATE_EXISTING_RELATIONSHIP (2501)
• Friend request errors: REQUESTER_NOT_FRIEND_REQUEST_RECIPIENT (2600), CREATE_EXISTING_FRIEND_REQUEST (2601), FRIEND_REQUEST_SENDER_HAS_BEEN_BLOCKED (2602)
• Group info errors: UPDATE_INFO_OF_NON_EXISTING_GROUP (3000), NOT_OWNER_TO_UPDATE_GROUP_INFO (3001), NOT_OWNER_OR_MANAGER_TO_UPDATE_GROUP_INFO (3002), NOT_MEMBER_TO_UPDATE_GROUP_INFO (3003)
• Group type errors: NO_PERMISSION_TO_CREATE_GROUP_WITH_GROUP_TYPE (3100), CREATE_GROUP_WITH_NON_EXISTING_GROUP_TYPE (3101)
• Group ownership errors: NOT_ACTIVE_USER_TO_CREATE_GROUP (3200), NOT_OWNER_TO_TRANSFER_GROUP (3201), NOT_OWNER_TO_DELETE_GROUP (3202), SUCCESSOR_NOT_GROUP_MEMBER (3203), OWNER_QUITS_WITHOUT_SPECIFYING_SUCCESSOR (3204), MAX_OWNED_GROUPS_REACHED (3205), TRANSFER_NON_EXISTING_GROUP (3206)
• Group join question errors: NOT_OWNER_OR_MANAGER_TO_CREATE_GROUP_QUESTION (3300), NOT_OWNER_OR_MANAGER_TO_DELETE_GROUP_QUESTION (3301), NOT_OWNER_OR_MANAGER_TO_UPDATE_GROUP_QUESTION (3302), NOT_OWNER_OR_MANAGER_TO_ACCESS_GROUP_QUESTION_ANSWER (3303), CREATE_GROUP_QUESTION_FOR_INACTIVE_GROUP (3304), CREATE_GROUP_QUESTION_FOR_GROUP_USING_JOIN_REQUEST (3305), CREATE_GROUP_QUESTION_FOR_GROUP_USING_INVITATION (3306), CREATE_GROUP_QUESTION_FOR_GROUP_USING_MEMBERSHIP_REQUEST (3307), GROUP_QUESTION_ANSWERER_HAS_BEEN_BLOCKED (3308), MEMBER_CANNOT_ANSWER_GROUP_QUESTION (3309), ANSWER_INACTIVE_QUESTION (3310), ANSWER_QUESTION_OF_INACTIVE_GROUP (3311)
• Group member errors: ADD_USER_TO_GROUP_REQUIRING_INVITATION (3400), ADD_USER_TO_INACTIVE_GROUP (3401), ADD_USER_WITH_ROLE_HIGHER_THAN_REQUESTER (3402), ADD_BLOCKED_USER_TO_GROUP (3403), ADD_BLOCKED_USER_TO_INACTIVE_GROUP (3404), NOT_OWNER_OR_MANAGER_TO_REMOVE_GROUP_MEMBER (3405), NOT_OWNER_TO_REMOVE_GROUP_OWNER_OR_MANAGER (3406), NOT_OWNER_TO_UPDATE_GROUP_MEMBER_INFO (3407), NOT_OWNER_OR_MANAGER_TO_UPDATE_GROUP_MEMBER_INFO (3408), NOT_MEMBER_TO_QUERY_MEMBER_INFO (3409)
• Group blocklist errors: NOT_OWNER_OR_MANAGER_TO_ADD_BLOCKED_USER (3500), NOT_OWNER_OR_MANAGER_TO_REMOVE_BLOCKED_USER (3501)
• Group join request errors: GROUP_JOIN_REQUEST_SENDER_HAS_BEEN_BLOCKED (3600), NOT_JOIN_REQUEST_SENDER_TO_RECALL_REQUEST (3601), NOT_OWNER_OR_MANAGER_TO_ACCESS_GROUP_REQUEST (3602), RECALL_NOT_PENDING_GROUP_JOIN_REQUEST (3603), SEND_JOIN_REQUEST_TO_INACTIVE_GROUP (3604), SEND_JOIN_REQUEST_TO_GROUP_USING_MEMBERSHIP_REQUEST (3605), SEND_JOIN_REQUEST_TO_GROUP_USING_INVITATION (3606), SEND_JOIN_REQUEST_TO_GROUP_USING_QUESTION (3607), RECALLING_GROUP_JOIN_REQUEST_IS_DISABLED (3608)
• Group invitation errors: GROUP_INVITER_NOT_MEMBER (3700), GROUP_INVITEE_ALREADY_GROUP_MEMBER (3701), NOT_OWNER_OR_MANAGER_TO_RECALL_INVITATION (3702), NOT_OWNER_OR_MANAGER_TO_ACCESS_INVITATION (3703), NOT_OWNER_TO_SEND_INVITATION (3704), NOT_OWNER_OR_MANAGER_TO_SEND_INVITATION (3705), NOT_MEMBER_TO_SEND_INVITATION (3706), INVITEE_HAS_BEEN_BLOCKED (3707), RECALLING_GROUP_INVITATION_IS_DISABLED (3708), SEND_GROUP_INVITATION_TO_GROUP_NOT_REQUIRE_INVITATION (3709), RECALL_NOT_PENDING_GROUP_INVITATION (3710)
• Conversation errors: UPDATING_TYPING_STATUS_IS_DISABLED (4000), UPDATING_READ_DATE_IS_DISABLED (4001), MOVING_READ_DATE_FORWARD_IS_DISABLED (4002)
• Message sending errors: MESSAGE_RECIPIENT_NOT_ACTIVE (5000), MESSAGE_SENDER_NOT_IN_CONTACTS_OR_BLOCKED (5001), PRIVATE_MESSAGE_SENDER_HAS_BEEN_BLOCKED (5002), GROUP_MESSAGE_SENDER_HAS_BEEN_BLOCKED (5003), SEND_MESSAGE_TO_INACTIVE_GROUP (5004), SEND_MESSAGE_TO_MUTED_GROUP (5005), SENDING_MESSAGES_TO_ONESELF_IS_DISABLED (5006), MUTED_MEMBER_SEND_MESSAGE (5007), GUESTS_HAVE_BEEN_MUTED (5008), MESSAGE_IS_ILLEGAL (5009)
• Message update errors: UPDATING_MESSAGE_BY_SENDER_IS_DISABLED (5100), NOT_SENDER_TO_UPDATE_MESSAGE (5101), NOT_MESSAGE_RECIPIENT_TO_UPDATE_MESSAGE_READ_DATE (5102)
• Message recall errors: RECALL_NON_EXISTING_MESSAGE (5200), RECALLING_MESSAGE_IS_DISABLED (5201), MESSAGE_RECALL_TIMEOUT (5202)
• Message query errors: NOT_MEMBER_TO_QUERY_GROUP_MESSAGES (5300)
• Storage errors: STORAGE_NOT_IMPLEMENTED (6000)

SessionCloseStatus

SessionCloseStatus indicates why a session was closed.

For the specific declarations, see the class im.turms.server.common.access.common.SessionCloseStatus.

Category | Name | Status code | Meaning
Illegal client behavior | ILLEGAL_REQUEST | 100 | Illegal request
| HEARTBEAT_TIMEOUT | 110 | Heartbeat timeout
| LOGIN_TIMEOUT | 111 | Login timeout
| SWITCH | 112 | The TCP or WebSocket session switched to UDP and entered the dormant keep-alive state
Server behavior | SERVER_ERROR | 200 | Internal server error
| SERVER_CLOSED | 201 | The server is shutting down
| SERVER_UNAVAILABLE | 202 | The service is unavailable
Network errors | CONNECTION_CLOSED | 300 | No close frame was received; the connection was closed forcibly at the network layer
Unknown errors | UNKNOWN_ERROR | 400 | An unknown server or client behavior error
Closed by the user | DISCONNECTED_BY_CLIENT | 500 | The current user actively requested to close the session
| DISCONNECTED_BY_OTHER_DEVICE | 501 | The session was closed because another device of the current user came online
Closed by an admin | DISCONNECTED_BY_ADMIN | 600 | An admin closed the session through the API
User status change | USER_IS_DELETED_OR_INACTIVATED | 700 | The user account was deleted or deactivated
| USER_IS_BLOCKED | 701 | The user's IP or user ID was blocked
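
A sketch of reacting to a closed session on the JavaScript client (the listener-registration call below is an assumed name, not the verified API; check your client's documentation for the actual way to observe disconnects and their close status):

js
// Illustrative only: notify the user when the session was closed because another
// device of the same user came online (SessionCloseStatus 501).
turmsClient.userService.addOnOfflineListener(info => { // assumed method name
    if (info.closeStatus === 501) { // DISCONNECTED_BY_OTHER_DEVICE
        console.warn('You have been signed in on another device.');
    }
});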
                            + \ No newline at end of file diff --git a/docs/zh-CN/server/deployment/config.html b/docs/zh-CN/server/deployment/config.html index 33c96bed..7dffd8ff 100644 --- a/docs/zh-CN/server/deployment/config.html +++ b/docs/zh-CN/server/deployment/config.html @@ -12,12 +12,12 @@ - + -

Configuration parameters

Importance

Instant messaging covers a wide range of business scenarios, so different businesses have drastically different hardware requirements (for example, an architecture that needs a database versus one that does not). To use server resources effectively, make sure you carefully study the configuration parameters provided by the Turms servers.

• Scenario 1: 100% message deliverability vs. actively discarding messages

  • Social applications generally require 100% message deliverability. In contrast, in a live-streaming chat room application, the server may even deliberately discard user messages, or deliver a message only to part of the users in the room, based on message priority and server load.
  • For the former, Turms uses Redis to fetch conversation-level increasing sequence IDs to guarantee that messages are always delivered. For the latter, Turms actively discards messages based on the messages in memory and the server load. The two have completely different, yet both reasonable, requirements for message deliverability, so their implementations also place completely different demands on hardware.
• Scenario 2: read-diffusion message storage vs. zero message storage

  • Application A is an IM application mainly for business customers. It has this requirement: when a user sends a message in a business group, the user can know whether every other member of the group has read it; even if the user goes offline right after sending, they can still query the other members' read status after coming back online.

    Therefore, if a business group has 100 members, when one of them sends a message, Turms needs to store 1 Message and 1 Conversation record (Turms uses a read-diffusion message model; also note that this Conversation record carries the last read dates of the other 99 members).

  • Application B is a live-streaming bullet-chat application that handles messages very casually. After a user sends a message in a live channel, the user neither needs the read status of other users nor even requires the message itself to be stored (i.e. there is no offline-message requirement).

    Therefore, if a live channel has 100 users, when one of them sends a message, Turms needs to store 0 Message and 0 Conversation records.

  • By comparison, application A needs message storage while application B does not, so application B's architecture does not even need the collection that stores messages (although in practice user messages are usually still stored for user behavior analysis). The two therefore also have completely different hardware requirements.

Local configuration vs. global configuration

Turms servers have two major categories of configuration, local and global:

 | Local configuration | Global configuration
Scope | Takes effect only on the current node | Takes effect on all nodes in the cluster
Storage | Stored in the local application-[profile].yaml file | Stored in the turms-config/shared-cluster-properties collection in the MongoDB database
Mutability | Properties annotated with MutableProperty can be updated in real time, with zero downtime, through the admin-only API while the Turms cluster is running | Same as for local configuration

Configuration categories

Configuration falls into two major categories: JVM configuration and Turms server configuration.

JVM configuration

The default JVM configuration file of turms-gateway is turms-gateway/dist/config/jvm.options

The default JVM configuration file of turms-service is turms-service/dist/config/jvm.options

Users can generally just use the default JVM configuration and do not need to modify it.

If you do want to modify the JVM configuration, there are two approaches:

1. Set the environment variable TURMS_GATEWAY_JVM_CONF (for turms-gateway) or TURMS_SERVICE_JVM_CONF (for turms-service) to point to a custom JVM configuration file, so that a fully custom JVM configuration is used. Taking turms-gateway as an example:

  1. If you start through the run.sh script, you can set the environment variable and start with something like export TURMS_GATEWAY_JVM_CONF=<your-jvm-options-file-path> && sh run.sh -f.

  2. If you start through the Docker image, you can use something like:

                                shell
                                docker run -d --name turms-gateway --ulimit nofile=1048576 \
                                       --memory-swappiness=0 \
                                       -p 7510:7510 -p 9510:9510 -p 10510:10510 -p 11510:11510 -p 12510:12510 \
                                       --health-cmd="curl -I --silent $${HOST}:9510/health || exit 1" \
                                    @@ -53,12 +53,12 @@
                                       --health-retries=3 \
                                       --health-start-period=60s \
                                       -v <your-jvm-options-file-path>:/opt/turms/turms-gateway/config/jvm.options:ro \
   ghcr.io/turms-im/turms-gateway
3. If you start through Docker Compose, you can use something like:

  shell
  TURMS_GATEWAY_JVM_CONF=<your-jvm-options-file-path> docker compose -f docker-compose.standalone.yml up --force-recreate
  powershell
  $env:TURMS_GATEWAY_JVM_CONF=<your-jvm-options-file-path>;docker compose -f docker-compose.standalone.yml up --force-recreate

  Note: the TURMS_GATEWAY_JVM_CONF path above refers to a path inside the image, not a path on the host. To use a configuration file on the host, modify the docker-compose.standalone.yml file to use Docker's mount mechanism, e.g.:

  yaml
  turms-gateway:
                                       volumes:
                                         - <your-jvm-options-file-path>:/opt/turms/turms-gateway/config/jvm.options:ro
2. Set the environment variable TURMS_GATEWAY_JVM_OPTS (for turms-gateway) or TURMS_SERVICE_JVM_OPTS (for turms-service) to append custom JVM options on top of the JVM configuration file and override options already declared there. The ways to set it are the same as above, so they are not repeated here.

  Note: the format of the variable is -D<name>=<value> -D<name>=<value>, e.g. -Dspring.profiles.active=DEV -Dturms.cluster.discovery.address.advertise-host=myturms

Turms server configuration

Turms configuration falls into four major categories:

• Turms Gateway configuration: configuration specific to the turms-gateway server
• Turms Service configuration: configuration specific to the turms-service server
• Common configuration: configuration shared by the turms-gateway and turms-service servers
• Plugin configuration: configuration provided by Turms server plugins themselves

How to configure

1. The TURMS_GATEWAY_JVM_CONF and TURMS_SERVICE_JVM_CONF variables mentioned above, as well as TURMS_GATEWAY_JVM_OPTS and TURMS_SERVICE_JVM_OPTS, can also be used to configure Turms server parameters.
2. Modify the configuration in application.yaml. Specifically:
  1. Directly modify the application.yaml file under a server module in the repository. Because modifying the source configuration file means you can no longer use the official Turms Docker images and have to build the JAR and the image yourself, this approach is generally only used for local development and testing, not for production.
  2. Use the Docker mount approach mentioned above to mount a custom server configuration file at /opt/turms/turms-gateway/config/application.yaml.
3. Call the admin HTTP API at PUT /cluster/settings (a sketch follows below).

Reminder: plugin configuration is configured in the same way as Turms server configuration: apart from not yet supporting dynamic modification through the admin HTTP API, it can be configured through approaches ① and ② above. For example, if a plugin is used by the turms-gateway server, you can put the plugin's configuration into the turms-gateway server's TURMS_GATEWAY_JVM_OPTS environment variable.
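
A rough illustration of approach 3 above, updating the turms.service.conversation.typing-status.enabled property mentioned earlier at runtime. The JSON shape of the body and the required credentials are assumptions; check the OpenAPI documentation of SettingController for the exact schema:

js
// Update a global (cluster-wide) property through the turms-service admin API
// (default port 8510). Authentication is omitted; add whatever your deployment requires.
const response = await fetch('http://localhost:8510/cluster/settings', {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
        turms: { service: { conversation: { typingStatus: { enabled: false } } } } // assumed body shape
    })
});
console.log(response.status);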

Profiles

If developers need to configure and switch between different sets of configuration for the same Turms server, they can use profiles.

By default, the configuration hard-coded in the Turms server source code together with the configuration specified in application.yaml is the default production configuration. To switch to another profile, change the spring.profiles.active setting in application.yaml.

A common use case: to switch from the production configuration to the default development configuration while developing and debugging locally, set spring.profiles.active in application.yaml to dev. The Turms server will then use the configuration specified in both application.yaml and application-dev.yaml (the default development configuration), with application-dev.yaml taking precedence and overriding the defaults.

Overview of the configuration parameters

Since Turms servers provide hundreds of configuration items, this section only briefly introduces the configuration categories. To look up specific items, read the configuration classes under the im.turms.server.common.infra.property package, or continue to the configuration item descriptions below.

Reminder: after you build the turms/turms-gateway server projects locally, the compiler generates the file target/classes/META-INF/spring-configuration-metadata.json. IntelliJ IDEA detects this file automatically and provides hints and completion when you type Turms-related configuration.

                                Tumrs Service配置
                                类别字段名描述补充
                                管理员APIAdminApiPropertiesadminApi管理员API接口相关配置
                                客户端APIClientApiPropertiesclientApi客户端API接口相关配置
                                Fake数据FakePropertiesfakeFake数据相关配置
                                数据源MongoPropertiesmongoMongoDB数据库相关配置Turms完全复用MongoDB的URI配置。参考文档:
                                https://docs.mongodb.com/manual/reference/connection-string/
                                TurmsRedisPropertiesredisRedis数据库相关配置
                                统计StatisticsPropertiesstatistics统计相关配置
                                通知NotificationPropertiesnotification通知相关配置
                                文件存储StoragePropertiesstorage存储相关配置
                                业务行为UserPropertiesuser用户相关配置
                                GroupPropertiesgroup群组相关配置
                                ConversationPropertiesconversation消息会话服务相关配置
                                MessagePropertiesmessage消息服务相关配置
Turms Gateway Configuration

| Category | Class | Field name | Description |
| --- | --- | --- | --- |
| Admin API | AdminApiProperties | adminApi | Configuration for the admin API |
| Client API | ClientApiProperties | clientApi | Configuration for the client-facing HTTP access layer (i.e. the configuration for ReasonController) |
| | NotificationLoggingProperties | notificationLogging | Configuration for notification logging |
| Service interfaces | UdpProperties | udp | Configuration for the UDP server |
| | TcpProperties | tcp | Configuration for the TCP server |
| | WebSocketProperties | websocket | Configuration for the WebSocket server |
| | DiscoveryProperties | serviceDiscovery | Configuration for service discovery |
| Fake data | FakeProperties | fake | Configuration for fake data |
| Data source | MongoProperties | mongo | Configuration for MongoDB |
| | TurmsRedisProperties | redis | Configuration for Redis |
| Business behavior | SimultaneousLoginProperties | simultaneousLogin | Configuration for simultaneous (multi-device) login |
| | SessionProperties | session | Configuration for sessions |
Common Configuration

| Class | Field name | Description |
| --- | --- | --- |
| ClusterProperties | cluster | Cluster-related configuration, including the information of the current node, service discovery and registration, the configuration center, and RPC parameters |
| HealthCheckProperties | healthCheck | Monitoring of node health status |
| IpProperties | ip | Configuration for public IP detection |
| LocationProperties | location | Configuration for user locations |
| LoggingProperties | logging | Basic logging configuration |
| PluginProperties | plugin | Configuration for plugins |
| SecurityProperties | security | Configuration for user and admin password encryption |
| UserStatusProperties | userStatus | Configuration for user session (connection) status |
Plugin Configuration

If you want to look up the configuration items of an official Turms server plugin, read the documentation of that plugin, which lists the configuration items the plugin provides.

Server Port Configuration

| Server | Configuration item | Port | Purpose |
| --- | --- | --- | --- |
| turms-admin | | 6510 (HTTP) | Serves the web pages of the admin management system |
| turms-service/turms-gateway | turms.cluster.connection.server.port | 7510 (TCP) | RPC between turms-service and turms-gateway servers |
| turms-service | turms.service.admin-api.http.port | 8510 (HTTP) | Serves the admin API and metrics API |
| turms-gateway | turms.gateway.admin-api.http.port | 9510 (HTTP) | Serves the metrics API |
| turms-gateway | turms.gateway.websocket.port | 10510 (WebSocket) | Interacts with turms-client-js clients |
| turms-gateway | turms.gateway.tcp.port | 11510 (TCP) | Interacts with clients |
| turms-gateway | turms.gateway.udp.port | 12510 (UDP) | Interacts with clients (not supported by any client yet). Note: the UDP server is experimental and is not in the first release plan |
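A minimal application.yaml sketch for overriding some of these ports (the values are examples only):

```yaml
# application.yaml: override the default ports (example values only)
turms:
  gateway:
    websocket:
      port: 10510   # WebSocket port for turms-client-js clients
    tcp:
      port: 11510   # TCP port for clients
  service:
    admin-api:
      http:
        port: 8510  # admin API and metrics API port
```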

Configuration Items

Note: the table below does not include the configuration of Turms server plugins.

Configuration item | Global property | Mutable property | Data type | Default value | Description
                                turms.cluster.connection.client.keepalive-interval-secondsint5
                                turms.cluster.connection.client.keepalive-timeout-secondsint15
                                turms.cluster.connection.client.reconnect-interval-secondsint15
                                turms.cluster.connection.server.hoststring0.0.0.0
                                turms.cluster.connection.server.portint7510
                                turms.cluster.connection.server.port-auto-incrementbooleanfalse
                                turms.cluster.connection.server.port-countint100
                                turms.cluster.discovery.address.advertise-hoststringThe advertise address of the local node exposed to admins. (e.g. 100.131.251.96)
                                turms.cluster.discovery.address.advertise-strategyenumPRIVATE_ADDRESSThe advertise strategy is used to decide which type of address should be used so that admins can access admin APIs and metrics APIs
                                turms.cluster.discovery.address.attach-port-to-hostbooleantrueWhether to attach the local port to the host. e.g. The local host is 100.131.251.96, and the port is 9510 so the service address will be 100.131.251.96:9510
turms.cluster.discovery.delay-to-notify-members-change-secondsint3Delay notifying listeners of member changes. Waits the specified number of seconds to avoid a thundering herd
                                turms.cluster.discovery.heartbeat-interval-secondsint10
                                turms.cluster.discovery.heartbeat-timeout-secondsint30
                                turms.cluster.idstringturms
                                turms.cluster.node.active-by-defaultbooleantrue
turms.cluster.node.idstringThe node ID must start with a letter or underscore, and match zero or more characters of [a-zA-Z0-9_] after the first character. e.g. "turms001", "turms_002"
                                turms.cluster.node.leader-eligiblebooleantrueOnly works when it is a turms-service node
                                turms.cluster.node.priorityint0The priority to be a leader
                                turms.cluster.node.zonestringe.g. "us-east-1" and "ap-east-1"
                                turms.cluster.rpc.request-timeout-millisint30000The timeout for RPC requests in milliseconds
                                turms.flight-recorder.closed-recording-retention-periodint0A closed recording will be retained for the given period and will be removed from the file system after the retention period. 0 means no retention. -1 means unlimited retention.
                                turms.gateway.admin-api.address.advertise-hoststringThe advertise address of the local node exposed to admins. (e.g. 100.131.251.96)
                                turms.gateway.admin-api.address.advertise-strategyenumPRIVATE_ADDRESSThe advertise strategy is used to decide which type of address should be used so that admins can access admin APIs and metrics APIs
                                turms.gateway.admin-api.address.attach-port-to-hostbooleantrueWhether to attach the local port to the host. e.g. The local host is 100.131.251.96, and the port is 9510 so the service address will be 100.131.251.96:9510
                                turms.gateway.admin-api.enabledbooleantrueWhether to enable the APIs for administrators
                                turms.gateway.admin-api.http.hoststring0.0.0.0
                                turms.gateway.admin-api.http.max-request-body-size-bytesint10485760
                                turms.gateway.admin-api.http.portint9510
                                turms.gateway.admin-api.log.enabledbooleantrueWhether to log API calls
                                turms.gateway.admin-api.log.log-request-paramsbooleantrueWhether to log the parameters of requests
                                turms.gateway.admin-api.rate-limiting.capacityint50The maximum number of tokens that the bucket can hold
                                turms.gateway.admin-api.rate-limiting.initial-tokensint50The initial number of tokens for new session
                                turms.gateway.admin-api.rate-limiting.refill-interval-millisint1000The time interval to refill. 0 means never refill
                                turms.gateway.admin-api.rate-limiting.tokens-per-periodint50Refills the bucket with the specified number of tokens per period if the bucket is not full
turms.gateway.admin-api.use-authenticationbooleantrueWhether to use authentication. If false, all HTTP requesters will impersonate the root user and all HTTP requests will be passed. You may set it to false when you want to manage authentication via security groups, NACL, etc
                                turms.gateway.client-api.logging.excluded-notification-categoriesSet-enum[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                                turms.gateway.client-api.logging.excluded-notification-typesSet-enum[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                                turms.gateway.client-api.logging.excluded-request-categoriesSet-enum[]Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                                turms.gateway.client-api.logging.excluded-request-typesSet-enum[]Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                                turms.gateway.client-api.logging.heartbeat-sample-ratefloat0
                                turms.gateway.client-api.logging.included-notification-categoriesLinkedHashSet-LoggingCategoryProperties[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                                turms.gateway.client-api.logging.included-notificationsLinkedHashSet-LoggingRequestProperties[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                                turms.gateway.client-api.logging.included-request-categoriesLinkedHashSet-LoggingCategoryProperties[
                                {
                                "category": "ALL",
                                "sampleRate": 1
                                }
                                ]
                                Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                                turms.gateway.client-api.logging.included-requestsLinkedHashSet-LoggingRequestProperties[]Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                                turms.gateway.client-api.max-request-size-bytesint16384The client session will be closed and may be blocked if it tries to send a request larger than the size. Note: The average size of turms requests is 16~64 bytes
                                turms.gateway.client-api.rate-limiting.capacityint50The maximum number of tokens that the bucket can hold
                                turms.gateway.client-api.rate-limiting.initial-tokensint50The initial number of tokens for new session
                                turms.gateway.client-api.rate-limiting.refill-interval-millisint1000The time interval to refill. 0 means never refill
                                turms.gateway.client-api.rate-limiting.tokens-per-periodint1Refills the bucket with the specified number of tokens per period if the bucket is not full
turms.gateway.client-api.return-reason-for-server-errorbooleanfalseWhether to return the reason for the server error to the client. Note: 1. It may reveal sensitive data like the IP of internal servers if true; 2. turms-gateway never returns stack trace information regardless of whether this is true or false.
                                turms.gateway.fake.enabledbooleanfalseWhether to fake clients. Note that faking only works in non-production environments
                                turms.gateway.fake.first-user-idlong100
turms.gateway.fake.request-count-per-intervalint10The number of requests to send per interval. If requestIntervalMillis is 1000, requestCountPerInterval is effectively the TPS
                                turms.gateway.fake.request-interval-millisint1000The interval to send request
turms.gateway.fake.user-countint10Run the specified number of real clients as fake users with IDs in [firstUserId, firstUserId + userCount) to connect to turms-gateway. So please ensure you have set "turms.service.fake.userCount" to a number larger than or equal to (firstUserId + userCount)
                                turms.gateway.notification-logging.enabledbooleanfalseWhether to parse the buffer of TurmsNotification to log. Note that the property has an impact on performance
                                turms.gateway.service-discovery.advertise-hoststringThe advertise address of the local node exposed to the public. The property can be used to advertise the DDoS Protected IP address to hide the origin IP address (e.g. 100.131.251.96)
turms.gateway.service-discovery.advertise-strategyenumPRIVATE_ADDRESSThe advertise strategy is used to help clients or load balancing servers to access the local node. Note: For security, do NOT use "PUBLIC_ADDRESS" in production to avoid exposing the origin IP address to DDoS attacks.
turms.gateway.service-discovery.attach-port-to-hostbooleantrueWhether to attach the local port to the host. For example, if the local host is 100.131.251.96 and the port is 10510, the service address will be 100.131.251.96:10510
                                turms.gateway.service-discovery.identitystringThe identity of the local node will be sent to clients as a notification if identity is not blank and "turms.gateway.session.notifyClientsOfSessionInfoAfterConnected" is true (e.g. "turms-east-0001")
                                turms.gateway.session.client-heartbeat-interval-secondsint60The client heartbeat interval. Note that the value will NOT change the actual heartbeat behavior of clients, and the value is only used to facilitate related operations of turms-gateway
                                turms.gateway.session.close-idle-session-after-secondsint180A session will be closed if turms server does not receive any request (including heartbeat request) from the client during closeIdleSessionAfterSeconds. References: https://mp.weixin.qq.com/s?__biz=MzAwNDY1ODY2OQ==&mid=207243549&idx=1&sn=4ebe4beb8123f1b5ab58810ac8bc5994&scene=0#rd
                                turms.gateway.session.device-details.expire-after-secondsint2592000Device details information will expire after the specified time has elapsed. 0 means never expire
                                turms.gateway.session.device-details.itemsList-DeviceDetailsItemProperties[]
                                turms.gateway.session.identity-access-management.enabledbooleantrueWhether to authenticate and authorize users when logging in. Note that user ID is always required even if enabled is false. If false at startup, turms-gateway will not connect to the MongoDB server for user records
                                turms.gateway.session.identity-access-management.http.authentication.response-expectation.body-fieldsMap{
                                "authenticated": true
                                }
                                turms.gateway.session.identity-access-management.http.authentication.response-expectation.headersMap{}
                                turms.gateway.session.identity-access-management.http.authentication.response-expectation.status-codesSet-string[
                                "2??"
                                ]
                                turms.gateway.session.identity-access-management.http.request.headersMap{}
                                turms.gateway.session.identity-access-management.http.request.http-methodenumGET
                                turms.gateway.session.identity-access-management.http.request.timeout-millisint30000
                                turms.gateway.session.identity-access-management.http.request.urlstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa256.p12.file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa256.p12.key-aliasstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa256.p12.passwordstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa256.pem-file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa384.p12.file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa384.p12.key-aliasstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa384.p12.passwordstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa384.pem-file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa512.p12.file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa512.p12.key-aliasstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa512.p12.passwordstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ecdsa512.pem-file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.hmac256.file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.hmac256.p12.file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.hmac256.p12.key-aliasstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.hmac256.p12.passwordstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.hmac384.file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.hmac384.p12.file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.hmac384.p12.key-aliasstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.hmac384.p12.passwordstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.hmac512.file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.hmac512.p12.file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.hmac512.p12.key-aliasstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.hmac512.p12.passwordstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ps256.p12.file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ps256.p12.key-aliasstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ps256.p12.passwordstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ps256.pem-file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ps384.p12.file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ps384.p12.key-aliasstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ps384.p12.passwordstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ps384.pem-file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ps512.p12.file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ps512.p12.key-aliasstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ps512.p12.passwordstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.ps512.pem-file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.rsa256.p12.file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.rsa256.p12.key-aliasstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.rsa256.p12.passwordstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.rsa256.pem-file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.rsa384.p12.file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.rsa384.p12.key-aliasstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.rsa384.p12.passwordstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.rsa384.pem-file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.rsa512.p12.file-pathstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.rsa512.p12.key-aliasstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.rsa512.p12.passwordstring
                                turms.gateway.session.identity-access-management.jwt.algorithm.rsa512.pem-file-pathstring
                                turms.gateway.session.identity-access-management.jwt.authentication.expectation.custom-payload-claimsMap{
                                "authenticated": true
                                }
                                turms.gateway.session.identity-access-management.jwt.verification.audiencestring
                                turms.gateway.session.identity-access-management.jwt.verification.custom-payload-claimsMap{}
                                turms.gateway.session.identity-access-management.jwt.verification.issuerstring
                                turms.gateway.session.identity-access-management.typeenumPASSWORDNote that if the type is not PASSWORD, turms-gateway will not connect to the MongoDB server for user records
                                turms.gateway.session.min-heartbeat-interval-secondsint18The minimum interval to refresh the heartbeat status by client requests to avoid refreshing the heartbeat status frequently
                                turms.gateway.session.notify-clients-of-session-info-after-connectedbooleantrueWhether to notify clients of the session information after connected with the server
                                turms.gateway.session.switch-protocol-after-secondsint540If the turms server only receives heartbeat requests from the client during switchProtocolAfterSeconds, the TCP/WebSocket connection will be closed with the close status "SWITCH" to indicate the client should keep sending heartbeat requests over UDP if they want to keep online. Note: 1. The property only works if UDP is enabled; 2. For browser clients, UDP is not supported
                                turms.gateway.simultaneous-login.allow-device-type-others-loginbooleantrueWhether to allow the devices of DeviceType.OTHERS to login
                                turms.gateway.simultaneous-login.allow-device-type-unknown-loginbooleantrueWhether to allow the devices of DeviceType.UNKNOWN to login
                                turms.gateway.simultaneous-login.login-conflict-strategyenumDISCONNECT_LOGGED_IN_DEVICESThe login conflict strategy is used for servers to know how to behave if a device is logging in when there are conflicted and logged-in devices
                                turms.gateway.simultaneous-login.strategyenumALLOW_ONE_DEVICE_OF_EACH_DEVICE_TYPE_ONLINEThe simultaneous login strategy is used to control which devices can be online at the same time
turms.gateway.tcp.backlogint4096The maximum number of connection requests waiting in the backlog queue. Set it large enough to handle bursts and GC pauses, but not too large, to mitigate SYN flood attacks
                                turms.gateway.tcp.close-idle-connection-after-secondsint300A TCP connection will be closed on the server side if a client has not established a user session in a specified time. Note that the developers on the client side should take the responsibility to close the TCP connection according to their business requirements
                                turms.gateway.tcp.connection-timeoutint30
                                turms.gateway.tcp.enabledbooleantrue
                                turms.gateway.tcp.hoststring0.0.0.0
                                turms.gateway.tcp.portint-1
                                turms.gateway.tcp.wiretapbooleanfalse
                                turms.gateway.udp.enabledbooleantrue
                                turms.gateway.udp.hoststring0.0.0.0
                                turms.gateway.udp.portint-1
turms.gateway.websocket.backlogint4096The maximum number of connection requests waiting in the backlog queue. Set it large enough to handle bursts and GC pauses, but not too large, to mitigate SYN flood attacks
                                turms.gateway.websocket.close-idle-connection-after-secondsint300A WebSocket connection will be closed on the server side if a client has not established a user session in a specified time. Note that the developers on the client side should take the responsibility to close the WebSocket connection according to their business requirements
                                turms.gateway.websocket.connect-timeoutint30Used to mitigate the Slowloris DoS attack by lowering the timeout for the TCP connection handshake
                                turms.gateway.websocket.enabledbooleantrue
                                turms.gateway.websocket.hoststring0.0.0.0
                                turms.gateway.websocket.portint-1
                                turms.health-check.check-interval-secondsint3
                                turms.health-check.cpu.retriesint5
                                turms.health-check.cpu.unhealthy-load-threshold-percentageint95
                                turms.health-check.memory.direct-memory-warning-threshold-percentageint50Log warning messages if the used direct memory exceeds the max direct memory of the percentage
                                turms.health-check.memory.heap-memory-gc-threshold-percentageint60If the used memory has used the reserved memory specified by maxAvailableMemoryPercentage and minFreeSystemMemoryBytes, try to start GC when the used heap memory exceeds the max heap memory of the percentage
                                turms.health-check.memory.heap-memory-warning-threshold-percentageint95Log warning messages if the used heap memory exceeds the max heap memory of the percentage
                                turms.health-check.memory.max-available-direct-memory-percentageint95The server will refuse to serve when the used direct memory exceeds the max direct memory of the percentage to try to avoid OutOfMemoryError
                                turms.health-check.memory.max-available-memory-percentageint95The server will refuse to serve when the used memory (heap memory + JVM internal non-heap memory + direct buffer pool) exceeds the physical memory of the percentage. The server will try to reserve max(maxAvailableMemoryPercentage of the physical memory, minFreeSystemMemoryBytes) for kernel and other processes. Note that the max available memory percentage does not conflict with the usage of limiting memory in docker because docker limits the memory of the container, while this memory percentage only limits the available memory for JVM
                                turms.health-check.memory.min-free-system-memory-bytesint134217728The server will refuse to serve when the free system memory is less than minFreeSystemMemoryBytes
                                turms.health-check.memory.min-heap-memory-gc-interval-secondsint10
                                turms.health-check.memory.min-memory-warning-interval-secondsint10
                                turms.ip.cached-private-ip-expire-after-millisint60000The cached private IP will expire after the specified time has elapsed. 0 means no cache
                                turms.ip.cached-public-ip-expire-after-millisint60000The cached public IP will expire after the specified time has elapsed. 0 means no cache
                                turms.ip.public-ip-detector-addressesList-string[
                                "https://checkip.amazonaws.com",
                                "https://whatismyip.akamai.com",
                                "https://ifconfig.me/ip",
                                "https://myip.dnsomatic.com"
                                ]
The public IP detectors will only be used to query the public IP of the local node if needed (e.g. if the node discovery property "advertiseStrategy" is "PUBLIC_ADDRESS"). Note that the HTTP response body must be an IP string rather than JSON
                                turms.location.enabledbooleantrueWhether to handle users' locations
                                turms.location.nearby-user-request.default-max-distance-metersint10000The default maximum allowed distance in meters
                                turms.location.nearby-user-request.default-max-nearby-user-countshort20The default maximum allowed number of nearby users
                                turms.location.nearby-user-request.max-distance-metersint10000The maximum allowed distance in meters
                                turms.location.nearby-user-request.max-nearby-user-countshort100The maximum allowed number of nearby users
                                turms.location.treat-user-id-and-device-type-as-unique-userbooleanfalseWhether to treat the pair of user ID and device type as a unique user when querying users nearby. If false, only the user ID is used to identify a unique user
                                turms.logging.console.enabledbooleanfalse
                                turms.logging.console.levelenumINFO
                                turms.logging.file.compression.enabledbooleantrue
                                turms.logging.file.enabledbooleantrue
                                turms.logging.file.file-pathstring@HOME/@SERVICE_TYPE_NAME.log
                                turms.logging.file.levelenumINFO
                                turms.logging.file.max-file-size-mbint32
                                turms.logging.file.max-filesint320
                                turms.plugin.dirstringpluginsThe relative path of plugins
                                turms.plugin.enabledbooleantrueWhether to enable plugins
                                turms.plugin.java.allow-savebooleanfalseWhether to allow to save plugins using HTTP API
                                turms.plugin.js.allow-savebooleanfalseWhether to allow to save plugins using HTTP API
                                turms.plugin.js.debug.enabledbooleanfalseWhether to enable debugging
                                turms.plugin.js.debug.inspect-hoststringlocalhostThe inspect host
                                turms.plugin.js.debug.inspect-portint24242The inspect port
                                turms.plugin.network.pluginsList-NetworkPluginProperties[]
                                turms.plugin.network.proxy.connect-timeout-millisint60000The HTTP proxy connect timeout in millis
                                turms.plugin.network.proxy.enabledbooleanfalseWhether to enable HTTP proxy
                                turms.plugin.network.proxy.hoststringThe HTTP proxy host
                                turms.plugin.network.proxy.passwordstringThe HTTP proxy password
                                turms.plugin.network.proxy.portint8080The HTTP proxy port
                                turms.plugin.network.proxy.usernamestringThe HTTP proxy username
                                turms.security.blocklist.ip.auto-block.corrupted-frame.block-levelsList-BlockLevel[
                                {
                                "blockDurationSeconds": 600,
                                "goNextLevelTriggerTimes": 1,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                },
                                {
                                "blockDurationSeconds": 1800,
                                "goNextLevelTriggerTimes": 1,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                },
                                {
                                "blockDurationSeconds": 3600,
                                "goNextLevelTriggerTimes": 0,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                }
                                ]
turms.security.blocklist.ip.auto-block.corrupted-frame.block-trigger-timesint5Block the client when the block condition has been triggered the specified number of times
                                turms.security.blocklist.ip.auto-block.corrupted-frame.enabledbooleanfalse
                                turms.security.blocklist.ip.auto-block.corrupted-request.block-levelsList-BlockLevel[
                                {
                                "blockDurationSeconds": 600,
                                "goNextLevelTriggerTimes": 1,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                },
                                {
                                "blockDurationSeconds": 1800,
                                "goNextLevelTriggerTimes": 1,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                },
                                {
                                "blockDurationSeconds": 3600,
                                "goNextLevelTriggerTimes": 0,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                }
                                ]
turms.security.blocklist.ip.auto-block.corrupted-request.block-trigger-timesint5Block the client when the block condition has been triggered the specified number of times
                                turms.security.blocklist.ip.auto-block.corrupted-request.enabledbooleanfalse
                                turms.security.blocklist.ip.auto-block.frequent-request.block-levelsList-BlockLevel[
                                {
                                "blockDurationSeconds": 600,
                                "goNextLevelTriggerTimes": 1,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                },
                                {
                                "blockDurationSeconds": 1800,
                                "goNextLevelTriggerTimes": 1,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                },
                                {
                                "blockDurationSeconds": 3600,
                                "goNextLevelTriggerTimes": 0,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                }
                                ]
turms.security.blocklist.ip.auto-block.frequent-request.block-trigger-timesint5Block the client when the block condition has been triggered the specified number of times
                                turms.security.blocklist.ip.auto-block.frequent-request.enabledbooleanfalse
                                turms.security.blocklist.ip.enabledbooleantrue
                                turms.security.blocklist.ip.sync-blocklist-interval-millisint10000
                                turms.security.blocklist.user-id.auto-block.corrupted-frame.block-levelsList-BlockLevel[
                                {
                                "blockDurationSeconds": 600,
                                "goNextLevelTriggerTimes": 1,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                },
                                {
                                "blockDurationSeconds": 1800,
                                "goNextLevelTriggerTimes": 1,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                },
                                {
                                "blockDurationSeconds": 3600,
                                "goNextLevelTriggerTimes": 0,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                }
                                ]
turms.security.blocklist.user-id.auto-block.corrupted-frame.block-trigger-timesint5Block the client when the block condition has been triggered the specified number of times
                                turms.security.blocklist.user-id.auto-block.corrupted-frame.enabledbooleanfalse
                                turms.security.blocklist.user-id.auto-block.corrupted-request.block-levelsList-BlockLevel[
                                {
                                "blockDurationSeconds": 600,
                                "goNextLevelTriggerTimes": 1,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                },
                                {
                                "blockDurationSeconds": 1800,
                                "goNextLevelTriggerTimes": 1,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                },
                                {
                                "blockDurationSeconds": 3600,
                                "goNextLevelTriggerTimes": 0,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                }
                                ]
turms.security.blocklist.user-id.auto-block.corrupted-request.block-trigger-timesint5Block the client when the block condition has been triggered the specified number of times
                                turms.security.blocklist.user-id.auto-block.corrupted-request.enabledbooleanfalse
                                turms.security.blocklist.user-id.auto-block.frequent-request.block-levelsList-BlockLevel[
                                {
                                "blockDurationSeconds": 600,
                                "goNextLevelTriggerTimes": 1,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                },
                                {
                                "blockDurationSeconds": 1800,
                                "goNextLevelTriggerTimes": 1,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                },
                                {
                                "blockDurationSeconds": 3600,
                                "goNextLevelTriggerTimes": 0,
                                "reduceOneTriggerTimeIntervalMillis": 60000
                                }
                                ]
turms.security.blocklist.user-id.auto-block.frequent-request.block-trigger-timesint5Block the client when the block condition has been triggered the specified number of times
                                turms.security.blocklist.user-id.auto-block.frequent-request.enabledbooleanfalse
                                turms.security.blocklist.user-id.enabledbooleantrue
                                turms.security.blocklist.user-id.sync-blocklist-interval-millisint10000
                                turms.security.password.admin-password-encoding-algorithmenumBCRYPTThe password encoding algorithm for admins
                                turms.security.password.initial-root-passwordstringThe initial password of the root user
                                turms.security.password.user-password-encoding-algorithmenumSALTED_SHA256The password encoding algorithm for users
                                turms.service.admin-api.address.advertise-hoststringThe advertise address of the local node exposed to admins. (e.g. 100.131.251.96)
                                turms.service.admin-api.address.advertise-strategyenumPRIVATE_ADDRESSThe advertise strategy is used to decide which type of address should be used so that admins can access admin APIs and metrics APIs
                                turms.service.admin-api.address.attach-port-to-hostbooleantrueWhether to attach the local port to the host. e.g. The local host is 100.131.251.96, and the port is 9510 so the service address will be 100.131.251.96:9510
turms.service.admin-api.allow-delete-without-filterbooleanfalseWhether to allow administrators to delete data without any filter. It is better to keep this false to prevent administrators from accidentally deleting all data
                                turms.service.admin-api.default-available-records-per-requestint10The default available records per query request
                                turms.service.admin-api.enabledbooleantrueWhether to enable the APIs for administrators
                                turms.service.admin-api.http.hoststring0.0.0.0
                                turms.service.admin-api.http.max-request-body-size-bytesint10485760
                                turms.service.admin-api.http.portint8510
                                turms.service.admin-api.log.enabledbooleantrueWhether to log API calls
                                turms.service.admin-api.log.log-request-paramsbooleantrueWhether to log the parameters of requests
                                turms.service.admin-api.max-available-online-users-status-per-requestint20The maximum available online users' status per query request
                                turms.service.admin-api.max-available-records-per-requestint1000The maximum available records per query request
                                turms.service.admin-api.max-day-difference-per-count-requestint31The maximum day difference per count request
                                turms.service.admin-api.max-day-difference-per-requestint90The maximum day difference per query request
                                turms.service.admin-api.max-hour-difference-per-count-requestint24The maximum hour difference per count request
                                turms.service.admin-api.max-month-difference-per-count-requestint12The maximum month difference per count request
                                turms.service.admin-api.rate-limiting.capacityint50The maximum number of tokens that the bucket can hold
                                turms.service.admin-api.rate-limiting.initial-tokensint50The initial number of tokens for new session
                                turms.service.admin-api.rate-limiting.refill-interval-millisint1000The time interval to refill. 0 means never refill
                                turms.service.admin-api.rate-limiting.tokens-per-periodint50Refills the bucket with the specified number of tokens per period if the bucket is not full
turms.service.admin-api.use-authenticationbooleantrueWhether to use authentication. If false, all HTTP requesters will impersonate the root user and all HTTP requests will be passed. You may set it to false when you want to manage authentication via security groups, NACL, etc
                                turms.service.client-api.disabled-endpointsSet-enum[]The disabled endpoints for client requests. Return ILLEGAL_ARGUMENT if a client tries to access them
                                turms.service.client-api.logging.excluded-notification-categoriesSet-enum[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                                turms.service.client-api.logging.excluded-notification-typesSet-enum[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                                turms.service.client-api.logging.excluded-request-categoriesSet-enum[]Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                                turms.service.client-api.logging.excluded-request-typesSet-enum[]Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                                turms.service.client-api.logging.included-notification-categoriesLinkedHashSet-LoggingCategoryProperties[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                                turms.service.client-api.logging.included-notificationsLinkedHashSet-LoggingRequestProperties[]Turms will get the notifications to log from the union of "includedNotificationCategories" and "includedNotifications" except the notifications included in "excludedNotificationCategories" and "excludedNotificationTypes"
                                turms.service.client-api.logging.included-request-categoriesLinkedHashSet-LoggingCategoryProperties[
                                {
                                "category": "ALL",
                                "sampleRate": 1
                                }
                                ]
                                Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                                turms.service.client-api.logging.included-requestsLinkedHashSet-LoggingRequestProperties[]Turms will get the requests to log from the union of "includedRequestCategories" and "includedRequests" except the requests included in "excludedRequestCategories" and "excludedRequestTypes"
                                turms.service.conversation.read-receipt.allow-move-read-date-forwardbooleanfalseWhether to allow to move the last read date forward
                                turms.service.conversation.read-receipt.enabledbooleantrueWhether to allow to update the last read date
                                turms.service.conversation.read-receipt.update-read-date-after-message-sentbooleantrueWhether to update the read date after a user sent a message
                                turms.service.conversation.read-receipt.update-read-date-when-user-querying-messagebooleanfalseWhether to update the read date when a user queries messages
                                turms.service.conversation.read-receipt.use-server-timebooleantrueWhether to use the server time to set the last read date when updating
                                turms.service.conversation.typing-status.enabledbooleantrueWhether to notify users of typing statuses sent by other users
                                turms.service.fake.clear-all-collections-before-fakingbooleanfalseWhether to clear all collections before faking at startup
                                turms.service.fake.enabledbooleanfalseWhether to fake data. Note that faking only works in non-production environments
                                turms.service.fake.fake-if-collection-existsbooleanfalseWhether to fake data even if the collection has already existed
turms.service.fake.user-countint1000The total number of users to fake
                                turms.service.group.activate-group-when-createdbooleantrueWhether to activate a group when created by default
                                turms.service.group.delete-group-logically-by-defaultbooleantrueWhether to delete groups logically by default
                                turms.service.group.invitation.allow-recall-pending-invitation-by-owner-and-managerbooleanfalseWhether to allow the owner and managers of a group to recall pending group invitations
                                turms.service.group.invitation.delete-expired-invitations-when-cron-triggeredbooleanfalseWhether to delete expired group invitations when the cron expression is triggered
                                turms.service.group.invitation.expire-after-secondsint2592000A group invitation will become expired after the specified time has passed
                                turms.service.group.invitation.expired-invitations-cleanup-cronstring0 15 2 * * *Clean the expired group invitations when the cron expression is triggered if "deleteExpiredInvitationsWhenCronTriggered" is true
                                turms.service.group.invitation.max-content-lengthint200The maximum allowed length for the text of a group invitation
                                turms.service.group.join-request.allow-recall-join-request-sent-by-oneselfbooleanfalseWhether to allow users to recall the join requests sent by themselves
                                turms.service.group.join-request.delete-expired-join-requests-when-cron-triggeredbooleanfalseWhether to delete expired group join requests when the cron expression is triggered
                                turms.service.group.join-request.expire-after-secondsint2592000A group join request will become expired after the specified time has elapsed
                                turms.service.group.join-request.expired-join-requests-cleanup-cronstring0 30 2 * * *Clean the expired group join requests when the cron expression is triggered if "deleteExpiredJoinRequestsWhenCronTriggered" is true
                                turms.service.group.join-request.max-content-lengthint200The maximum allowed length for the text of a group join request
                                turms.service.group.member-cache-expire-after-secondsint15The group member cache will expire after the specified seconds. If 0, no group member cache
                                turms.service.group.question.answer-content-limitint50The maximum allowed length for the text of a group question's answer
                                turms.service.group.question.max-answer-countint10The maximum number of answers for a group question
                                turms.service.group.question.question-content-limitint200The maximum allowed length for the text of a group question
                                turms.service.message.allow-edit-message-by-senderbooleantrueWhether to allow the sender of a message to edit the message
                                turms.service.message.allow-recall-messagebooleantrueWhether to allow users to recall messages. Note: To recall messages, more system resources are needed
                                turms.service.message.allow-send-messages-to-oneselfbooleanfalseWhether to allow users to send messages to themselves
                                turms.service.message.allow-send-messages-to-strangerbooleantrueWhether to allow users to send messages to a stranger
                                turms.service.message.available-recall-duration-secondsint300The available recall duration for the sender of a message
                                turms.service.message.cache.sent-message-cache-max-sizeint10240The maximum size of the cache of sent messages.
turms.service.message.cache.sent-message-expire-afterint30The retention period of sent messages in the cache. For better performance, it is good practice to keep the value greater than the allowed recall duration
                                turms.service.message.check-if-target-active-and-not-deletedbooleantrueWhether to check if the target (recipient or group) of a message is active and not deleted
                                turms.service.message.default-available-messages-number-with-totalint1The default available messages number with the "total" field that users request
                                turms.service.message.delete-message-logically-by-defaultbooleantrueWhether to delete messages logically by default
                                turms.service.message.expired-messages-cleanup-cronstring0 45 2 * * *Clean the expired messages when the cron expression is triggered
                                turms.service.message.is-recalled-message-visiblebooleanfalseWhether to respond with recalled messages to clients' message query requests
                                turms.service.message.max-records-size-bytesint15728640The maximum allowed size for the records of a message
                                turms.service.message.max-text-limitint500The maximum allowed length for the text of a message
                                turms.service.message.message-retention-period-hoursint0A message will be retained for the given period and will be removed from the database after the retention period
turms.service.message.persist-messagebooleantrueWhether to persist messages in databases. Note: If false, senders will not get the message ID after the message has been sent and cannot edit it
                                turms.service.message.persist-pre-message-idbooleanfalseWhether to persist the previous message ID of messages in databases
                                turms.service.message.persist-recordbooleanfalseWhether to persist the records of messages in databases
                                turms.service.message.persist-sender-ipbooleanfalseWhether to persist the sender IP of messages in databases
                                turms.service.message.sequence-id.use-sequence-id-for-group-conversationbooleanfalseWhether to use the sequence ID for group conversations so that the client can be aware of the loss of messages. Note that the property has a significant impact on performance
                                turms.service.message.sequence-id.use-sequence-id-for-private-conversationbooleanfalseWhether to use the sequence ID for private conversations so that the client can be aware of the loss of messages. Note that the property has a significant impact on performance
                                turms.service.message.time-typeenumLOCAL_SERVER_TIMEThe time type for the delivery time of message
                                turms.service.message.use-conversation-idbooleanfalseWhether to use conversation ID so that a user can query the messages sent by themselves in a conversation quickly
                                turms.service.mongo.admin.optional-index.admin.registration-datebooleanfalse
                                turms.service.mongo.admin.optional-index.admin.role-idbooleanfalse
                                turms.service.mongo.group.optional-index.group-blocked-user.block-datebooleanfalse
                                turms.service.mongo.group.optional-index.group-blocked-user.requester-idbooleanfalse
                                turms.service.mongo.group.optional-index.group-invitation.group-idbooleantrue
                                turms.service.mongo.group.optional-index.group-invitation.inviter-idbooleanfalse
                                turms.service.mongo.group.optional-index.group-invitation.response-datebooleanfalse
                                turms.service.mongo.group.optional-index.group-join-request.creation-datebooleanfalse
                                turms.service.mongo.group.optional-index.group-join-request.group-idbooleantrue
                                turms.service.mongo.group.optional-index.group-join-request.responder-idbooleanfalse
                                turms.service.mongo.group.optional-index.group-join-request.response-datebooleanfalse
                                turms.service.mongo.group.optional-index.group-member.join-datebooleanfalse
                                turms.service.mongo.group.optional-index.group-member.mute-end-datebooleanfalse
                                turms.service.mongo.group.optional-index.group.creation-datebooleanfalse
                                turms.service.mongo.group.optional-index.group.creator-idbooleanfalse
                                turms.service.mongo.group.optional-index.group.deletion-datebooleantrue
                                turms.service.mongo.group.optional-index.group.mute-end-datebooleanfalse
                                turms.service.mongo.group.optional-index.group.owner-idbooleantrue
                                turms.service.mongo.group.optional-index.group.type-idbooleanfalse
                                turms.service.mongo.message.optional-index.message.deletion-datebooleantrue
                                turms.service.mongo.message.optional-index.message.reference-idbooleanfalse
                                turms.service.mongo.message.optional-index.message.sender-idbooleanfalse
                                turms.service.mongo.message.optional-index.message.sender-ipbooleantrue
                                turms.service.mongo.message.tiered-storage.auto-range-updater.cronstring0 0 3 * * *
                                turms.service.mongo.message.tiered-storage.auto-range-updater.enabledbooleantrue
                                turms.service.mongo.message.tiered-storage.enabledbooleantrue
                                turms.service.mongo.message.tiered-storage.tiersLinkedHashMap{
                                "cold": {
                                "days": 270,
                                "enabled": true,
                                "shards": [
                                ""
                                ]
                                },
                                "frozen": {
                                "days": 0,
                                "enabled": true,
                                "shards": [
                                ""
                                ]
                                },
                                "hot": {
                                "days": 30,
                                "enabled": true,
                                "shards": [
                                ""
                                ]
                                },
                                "warm": {
                                "days": 60,
                                "enabled": true,
                                "shards": [
                                ""
                                ]
                                }
                                }
                                The storage properties for tiers from hot to cold. Note that the order of the tiers is important
                                turms.service.mongo.user.optional-index.user-friend-request.recipient-idbooleanfalse
                                turms.service.mongo.user.optional-index.user-friend-request.requester-idbooleanfalse
                                turms.service.mongo.user.optional-index.user-friend-request.response-datebooleanfalse
                                turms.service.mongo.user.optional-index.user-relationship-group-member.group-indexbooleanfalse
                                turms.service.mongo.user.optional-index.user-relationship-group-member.join-datebooleanfalse
                                turms.service.mongo.user.optional-index.user-relationship-group-member.related-user-idbooleanfalse
                                turms.service.mongo.user.optional-index.user-relationship.establishment-datebooleanfalse
                                turms.service.notification.friend-request-created.notify-friend-request-recipientbooleantrueWhether to notify the recipient when the requester has created a friend request
                                turms.service.notification.friend-request-created.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have created a friend request
                                turms.service.notification.friend-request-replied.notify-friend-request-requesterbooleantrueWhether to notify the requester when a recipient has replied to the friend request sent by the requester
                                turms.service.notification.friend-request-replied.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have replied to a friend request
                                turms.service.notification.group-blocked-user-added.notify-blocked-userbooleanfalseWhether to notify the user when they have been blocked by a group
                                turms.service.notification.group-blocked-user-added.notify-group-membersbooleanfalseWhether to notify group members when a user has been blocked by a group
                                turms.service.notification.group-blocked-user-added.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have added a blocked user to a group
                                turms.service.notification.group-blocked-user-removed.notify-group-membersbooleanfalseWhether to notify group members when a user is unblocked by a group
                                turms.service.notification.group-blocked-user-removed.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have removed a blocked user from a group
                                turms.service.notification.group-blocked-user-removed.notify-unblocked-userbooleanfalseWhether to notify the user when they are unblocked by a group
                                turms.service.notification.group-conversation-read-date-updated.notify-other-group-membersbooleanfalseWhether to notify other group members when a group member has updated their read date in a group conversation
                                turms.service.notification.group-conversation-read-date-updated.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have updated the read date in a group conversation
                                turms.service.notification.group-created.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have created a group
turms.service.notification.group-deleted.notify-group-membersbooleantrueWhether to notify group members when a group owner has deleted their group
                                turms.service.notification.group-deleted.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have deleted a group
                                turms.service.notification.group-invitation-added.notify-group-membersbooleanfalseWhether to notify group members when a user has been invited
                                turms.service.notification.group-invitation-added.notify-group-owner-and-managersbooleantrueWhether to notify the group owner and managers when a user has been invited
                                turms.service.notification.group-invitation-added.notify-inviteebooleantrueWhether to notify the user when they have been invited by a group member
                                turms.service.notification.group-invitation-added.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have invited a user to a group
                                turms.service.notification.group-invitation-recalled.notify-group-membersbooleanfalseWhether to notify group members when an invitation has been recalled
                                turms.service.notification.group-invitation-recalled.notify-group-owner-and-managersbooleantrueWhether to notify the group owner and managers when an invitation has been recalled
                                turms.service.notification.group-invitation-recalled.notify-inviteebooleantrueWhether to notify the invitee when a group member has recalled their received group invitation
                                turms.service.notification.group-invitation-recalled.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have recalled a group invitation
                                turms.service.notification.group-join-request-created.notify-group-membersbooleanfalseWhether to notify group members when a user has created a group join request for their group
                                turms.service.notification.group-join-request-created.notify-group-owner-and-managersbooleantrueWhether to notify the group owner and managers when a user has created a group join request for their group
                                turms.service.notification.group-join-request-created.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have created a group join request
                                turms.service.notification.group-join-request-recalled.notify-group-membersbooleanfalseWhether to notify group members when a user has recalled a group join request for their group
                                turms.service.notification.group-join-request-recalled.notify-group-owner-and-managersbooleantrueWhether to notify the group owner and managers when a user has recalled a group join request for their group
                                turms.service.notification.group-join-request-recalled.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have recalled a group join request
                                turms.service.notification.group-member-added.notify-added-group-memberbooleantrueWhether to notify the group member when added by others
                                turms.service.notification.group-member-added.notify-other-group-membersbooleantrueWhether to notify other group members when a group member has been added
                                turms.service.notification.group-member-added.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have added a group member
                                turms.service.notification.group-member-info-updated.notify-other-group-membersbooleanfalseWhether to notify other group members when a group member's information has been updated
                                turms.service.notification.group-member-info-updated.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have updated their group member information
                                turms.service.notification.group-member-info-updated.notify-updated-group-memberbooleanfalseWhether to notify the group member when others have updated their group member information
                                turms.service.notification.group-member-online-status-updated.notify-group-membersbooleanfalseWhether to notify other group members when a member's online status has been updated
                                turms.service.notification.group-member-removed.notify-other-group-membersbooleantrueWhether to notify other group members when a group member has been removed
                                turms.service.notification.group-member-removed.notify-removed-group-memberbooleantrueWhether to notify the group member when removed by others
turms.service.notification.group-member-removed.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have removed a group member
                                turms.service.notification.group-updated.notify-group-membersbooleantrueWhether to notify group members when the group owner or managers have updated their group
                                turms.service.notification.group-updated.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have updated a group
                                turms.service.notification.message-created.notify-message-recipientsbooleantrueWhether to notify the message recipients when a sender has created a message to them
                                turms.service.notification.message-created.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have created a message
                                turms.service.notification.message-updated.notify-message-recipientsbooleantrueWhether to notify the message recipients when a sender has updated a message sent to them
                                turms.service.notification.message-updated.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have updated a message
turms.service.notification.one-sided-relationship-group-deleted.notify-relationship-group-membersbooleanfalseWhether to notify members when a one-sided relationship group owner has deleted the group
                                turms.service.notification.one-sided-relationship-group-deleted.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have deleted a relationship group
                                turms.service.notification.one-sided-relationship-group-member-added.notify-new-relationship-group-memberbooleanfalseWhether to notify the new member when a user has added them to their one-sided relationship group
                                turms.service.notification.one-sided-relationship-group-member-added.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have added a new member to their one-sided relationship group
                                turms.service.notification.one-sided-relationship-group-member-removed.notify-removed-relationship-group-memberbooleanfalseWhether to notify the removed member when a user has removed them from their one-sided relationship group
turms.service.notification.one-sided-relationship-group-member-removed.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have removed a member from their one-sided relationship group
turms.service.notification.one-sided-relationship-group-updated.notify-relationship-group-membersbooleanfalseWhether to notify members when a one-sided relationship group owner has updated the group
                                turms.service.notification.one-sided-relationship-group-updated.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have updated a relationship group
                                turms.service.notification.one-sided-relationship-updated.notify-related-userbooleanfalseWhether to notify the related user when a user has updated a one-sided relationship with them
                                turms.service.notification.one-sided-relationship-updated.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have updated a one-sided relationship
                                turms.service.notification.private-conversation-read-date-updated.notify-contactbooleanfalseWhether to notify another contact when a contact has updated their read date in a private conversation
                                turms.service.notification.private-conversation-read-date-updated.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have updated the read date in a private conversation
                                turms.service.notification.user-info-updated.notify-non-blocked-related-usersbooleanfalseWhether to notify non-blocked related users when a user has updated their information
                                turms.service.notification.user-info-updated.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have updated their information
                                turms.service.notification.user-online-status-updated.notify-non-blocked-related-usersbooleanfalseWhether to notify non-blocked related users when a user has updated their online status
                                turms.service.notification.user-online-status-updated.notify-requester-other-online-sessionsbooleantrueWhether to notify the requester's other online sessions when they have updated their online status
                                turms.service.push-notification.apns.bundle-idstring
                                turms.service.push-notification.apns.enabledbooleanfalse
                                turms.service.push-notification.apns.key-idstring
                                turms.service.push-notification.apns.sandbox-enabledbooleanfalse
                                turms.service.push-notification.apns.signing-keystring
                                turms.service.push-notification.apns.team-idstring
                                turms.service.push-notification.fcm.credentialsstring
                                turms.service.push-notification.fcm.enabledbooleanfalse
turms.service.statistics.log-online-users-numberbooleantrueWhether to log the number of online users
                                turms.service.statistics.online-users-number-logging-cronstring0/15 * * * * *The cron expression to specify the time to log online users' number
                                turms.service.storage.group-profile-picture.allowed-content-typestringimage/*The allowed "Content-Type" of the resource that the client can upload
                                turms.service.storage.group-profile-picture.allowed-referrersList-string[]Restrict access to the resource to only allow the specific referrers (e.g. "https://github.com/turms-im/turms/*")
                                turms.service.storage.group-profile-picture.download-url-expire-after-secondsint300The presigned URLs are valid only for the specified duration. 0 means no expiration
turms.service.storage.group-profile-picture.expire-after-daysint0Delete the resource the specified number of days after creation. 0 means no expiration
                                turms.service.storage.group-profile-picture.max-size-bytesint1048576The maximum size of the resource that the client can upload. 0 means no limit
                                turms.service.storage.group-profile-picture.min-size-bytesint0The minimum size of the resource that the client can upload. 0 means no limit
                                turms.service.storage.group-profile-picture.upload-url-expire-after-secondsint300The presigned URLs are valid only for the specified duration. 0 means no expiration
                                turms.service.storage.message-attachment.allowed-content-typestring/The allowed "Content-Type" of the resource that the client can upload
                                turms.service.storage.message-attachment.allowed-referrersList-string[]Restrict access to the resource to only allow the specific referrers (e.g. "https://github.com/turms-im/turms/*")
                                turms.service.storage.message-attachment.download-url-expire-after-secondsint300The presigned URLs are valid only for the specified duration. 0 means no expiration
turms.service.storage.message-attachment.expire-after-daysint0Delete the resource the specified number of days after creation. 0 means no expiration
                                turms.service.storage.message-attachment.max-size-bytesint1048576The maximum size of the resource that the client can upload. 0 means no limit
                                turms.service.storage.message-attachment.min-size-bytesint0The minimum size of the resource that the client can upload. 0 means no limit
                                turms.service.storage.message-attachment.upload-url-expire-after-secondsint300The presigned URLs are valid only for the specified duration. 0 means no expiration
                                turms.service.storage.user-profile-picture.allowed-content-typestringimage/*The allowed "Content-Type" of the resource that the client can upload
                                turms.service.storage.user-profile-picture.allowed-referrersList-string[]Restrict access to the resource to only allow the specific referrers (e.g. "https://github.com/turms-im/turms/*")
                                turms.service.storage.user-profile-picture.download-url-expire-after-secondsint300The presigned URLs are valid only for the specified duration. 0 means no expiration
turms.service.storage.user-profile-picture.expire-after-daysint0Delete the resource the specified number of days after creation. 0 means no expiration
                                turms.service.storage.user-profile-picture.max-size-bytesint1048576The maximum size of the resource that the client can upload. 0 means no limit
                                turms.service.storage.user-profile-picture.min-size-bytesint0The minimum size of the resource that the client can upload. 0 means no limit
                                turms.service.storage.user-profile-picture.upload-url-expire-after-secondsint300The presigned URLs are valid only for the specified duration. 0 means no expiration
                                turms.service.user.activate-user-when-addedbooleantrueWhether to activate a user when added by default
                                turms.service.user.delete-two-sided-relationshipsbooleanfalseWhether to delete the two-sided relationships when a user requests to delete a relationship
                                turms.service.user.delete-user-logicallybooleantrueWhether to delete a user logically
                                turms.service.user.friend-request.allow-send-request-after-declined-or-ignored-or-expiredbooleanfalseWhether to allow resending a friend request after the previous request has been declined, ignored, or expired
turms.service.user.friend-request.delete-expired-requests-when-cron-triggeredbooleanfalseWhether to delete expired friend requests when the cron expression is triggered
                                turms.service.user.friend-request.expired-user-friend-requests-cleanup-cronstring0 0 2 * * *Clean expired friend requests when the cron expression is triggered if deleteExpiredRequestsWhenCronTriggered is true
                                turms.service.user.friend-request.friend-request-expire-after-secondsint2592000A friend request will become expired after the specified time has elapsed
                                turms.service.user.friend-request.max-content-lengthint200The maximum allowed length for the text of a friend request
                                turms.service.user.max-intro-lengthint100The maximum allowed length for a user's intro
                                turms.service.user.max-name-lengthint20The maximum allowed length for a user's name
                                turms.service.user.max-password-lengthint16The maximum allowed length for a user's password
                                turms.service.user.max-profile-picture-lengthint100The maximum allowed length for a user's profile picture
                                turms.service.user.min-password-lengthint-1The minimum allowed length for a user's password. If 0, it means the password can be an empty string "". If -1, it means the password can be null
                                turms.service.user.respond-offline-if-invisiblebooleanfalseWhether to respond to client with the OFFLINE status if a user is in INVISIBLE status
                                turms.shutdown.job-timeout-millislong120000Wait for a job 2 minutes at most for extreme cases by default. Though it is a long time, graceful shutdown is usually better than force shutdown.
                                turms.user-status.cache-user-sessions-statusbooleantrueWhether to cache the user sessions status
                                turms.user-status.user-sessions-status-cache-max-sizeint-1The maximum size of the cache of users' sessions status
                                turms.user-status.user-sessions-status-expire-afterint60The life duration of each remote user's sessions status in the cache. Note that the cache will make the presentation of users' sessions status inconsistent during the time
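
To make the dotted property names above concrete, here is a minimal sketch of overriding a few of them in config/application.yaml, written as a shell heredoc and assuming the usual Spring-Boot-style mapping of dotted names to nested YAML keys (the chosen properties come from the table above; the values and the heredoc approach are purely illustrative assumptions, not recommendations):

bash
# Illustrative only: override a few turms.service.* properties in config/application.yaml
cat > config/application.yaml <<'EOF'
turms:
  service:
    message:
      max-text-limit: 1000
      message-retention-period-hours: 720
    user:
      friend-request:
        friend-request-expire-after-seconds: 604800
EOF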
                                - + \ No newline at end of file diff --git a/docs/zh-CN/server/deployment/distribution.html b/docs/zh-CN/server/deployment/distribution.html index e2505f48..d75e04f5 100644 --- a/docs/zh-CN/server/deployment/distribution.html +++ b/docs/zh-CN/server/deployment/distribution.html @@ -17,7 +17,7 @@ -

Distribution

Directory structure of the server distribution package

The directory structure of the turms-gateway and turms-service server distribution packages is as follows:

├─bin
                                 │  └─run.sh
                                 ├─config
                                 │  ├─application.yaml
                                @@ -200,7 +200,7 @@
                                 net.ipv4.tcp_moderate_rcvbuf = 1
 # Default: 1. TCP uses 16 bits to record the window size, so the maximum is 65535 bytes; if a larger window is needed, the tcp_window_scaling mechanism must be enabled
                                 net.ipv4.tcp_window_scaling = 1

After the configuration is done, run sudo sysctl -p to load the latest sysctl configuration.

One thing worth mentioning: in System Resource Management we mentioned that the Turms server reserves part of the memory for the system kernel, and that memory mainly refers to the TCP connection buffers described above.

Initial congestion window (initcwnd) configuration

Keep the default value: 10 MSS.

Reference documents:

                                - + \ No newline at end of file diff --git a/docs/zh-CN/server/deployment/getting-started.html b/docs/zh-CN/server/deployment/getting-started.html index 8ce44b97..bb238637 100644 --- a/docs/zh-CN/server/deployment/getting-started.html +++ b/docs/zh-CN/server/deployment/getting-started.html @@ -17,7 +17,7 @@ -

Setup and Startup

Automatic setup and startup

Standalone environment

Applicable scenarios: the setup process is quick and convenient, but it cannot meet requirements such as disaster recovery, elastic scaling, zero-downtime upgrades, and load balancing. It is mainly intended for building demos for showcasing, and for users whose services have no SLA requirements.

Based on Docker Compose

With the following commands, you can fully automatically set up a complete minimal Turms cluster (consisting of turms-gateway, turms-service and turms-admin) together with the servers it depends on (a MongoDB sharded cluster and Redis):

                                bash
                                git clone --depth 1 https://github.com/turms-im/turms.git
                                 cd turms
                                 docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions
                                 # Or "ENV=dev,demo docker compose -f docker-compose.standalone.yml --profile monitoring up --force-recreate -d" to run with sidecar services in dev profile
                                @@ -108,7 +108,7 @@
                                 docker run -p 6510:6510 ghcr.io/turms-im/turms-admin
                                 docker run -p 7510:7510 -p 8510:8510 ghcr.io/turms-im/turms-service
                                 docker run --ulimit nofile=102400:102400 -p 7510:7510 -p 9510:9510 -p 10510:10510 -p 11510:11510 -p 12510:12510 ghcr.io/turms-im/turms-gateway

In addition, you can use a custom application.yaml and jvm.options by mounting a volume, e.g. by configuring -v /your-custom-config-dir:/opt/turms/turms/config (see the sketch below).
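
A minimal sketch of this volume mount for turms-service, assuming your custom configuration files live in /your-custom-config-dir (the image name, ports, and container config path are taken from the commands and the note above):

bash
# Mount a local directory containing a custom application.yaml and jvm.options
# into the config directory of the container
docker run -p 7510:7510 -p 8510:8510 \
    -v /your-custom-config-dir:/opt/turms/turms/config \
    ghcr.io/turms-im/turms-service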

Option 2: Download and unzip the Turms server package (since v.0.10.0 has not been published on the release page yet, this option is not available for now), and run it according to the following steps:

• (You can skip this step if you have installed both MongoDB and Redis locally with their default configurations) Configure config/jvm.options and config/application.yaml according to your needs (here you can configure Turms' custom configuration parameters, and you can also configure the addresses of multiple MongoDB or mongos servers. For details, see: https://docs.mongodb.com/manual/reference/connection-string).

• (Ansible is recommended) On every system that needs to run a Turms server, run the run.sh script under the bin directory (it runs the Thin package by default; to run the Fat package instead, add the -f argument when executing the script, e.g. sh run.sh -f; see the sketch after this list), and then run the turms-gateway server. The turms-gateway and turms-service servers automatically discover the other server nodes through MongoDB (used as the service registry), and the Turms cluster starts working from there.

Option 3: Clone the Turms repository source code and run the turms-gateway and turms-service servers directly from your IDE. (Reference command: git clone --depth 1 https://github.com/turms-im/turms.git)
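
As a rough sketch of the script step in Option 2 above, assuming the distribution package has been unzipped and the commands are run from its bin directory (which contains run.sh, as shown in the Distribution document):

bash
cd bin
# Run the Thin package (the default)
sh run.sh
# Or run the Fat package by adding the -f argument
sh run.sh -f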

Reminders

• When the turms-service server starts, it automatically checks whether a super administrator account with the ROOT role and the account name turms already exists in the database. If it does not exist, the turms-service server automatically creates an administrator account with the ROOT role, the name turms, and the password turms.security.password.initial-root-password (default: turms). In production environments, be sure to change the default password.
• The above operations are mainly for your first experience with a Turms cluster. If you need to deploy Turms in a production environment, be sure to consult the Wiki manual to understand the meaning of the various configuration parameters, so that you can tailor your own business requirements and feature combinations with minimal resource consumption.

General startup and shutdown flow of the Turms servers

Startup flow

1. Connect to and verify the mongos and Redis servers.
2. Check whether the MongoDB collections have already been created. If they have, skip this step; otherwise create the collections, add indexes, add shard keys, and add Zones for the tiered (hot and cold) storage of data. If MongoDB fake data is enabled, turms-service automatically generates fake data in MongoDB for development and testing.
3. For the turms-service server, check whether a super administrator account with the ROOT role and the account name turms already exists in MongoDB. If not, create an administrator account in MongoDB with the ROOT role, the name turms, and the password turms.security.password.initial-root-password (default: turms).
4. Register the local Node with the service registry. If registration succeeds, pull and apply the cluster-wide global configuration and set up the RPC server to accept connections from RPC clients. If it fails, throw an exception and exit the process.
5. Start the Admin HTTP server to accept administrator API requests. In addition, for turms-gateway, also start the gateway servers (such as TCP/WebSocket) to accept client connections and requests.
6. For turms-gateway, if fake clients are enabled, create real client connections and randomly send real client requests (with random request types and random request parameters) for development and testing.

At this point, the server startup is complete.

Shutdown flow

(For turms-gateway)

1. Reject new client network connections and client requests.
2. Shut down the fake clients and close the established client session connections.
3. Shut down the servers handling TCP/UDP/WebSocket client connections, as well as the HTTP administrator API server.

(For both turms-gateway and turms-service)
4. Shut down the blocklist synchronization mechanism.
5. Shut down the cluster services (such as the connections between RPC nodes, and the service registration and discovery service).
6. Shut down the plugin mechanism.
7. After the pending Redis/MongoDB client requests have been sent, close the network connections from the Turms server to Redis and MongoDB.
8. After the remaining logs have been written, shut down the logging service.

At this point, the server shutdown is complete.

                            - + \ No newline at end of file diff --git a/docs/zh-CN/server/development/code.html b/docs/zh-CN/server/development/code.html index fd5ba5bb..7cb57c7f 100644 --- a/docs/zh-CN/server/development/code.html +++ b/docs/zh-CN/server/development/code.html @@ -17,7 +17,7 @@ -

Source Code

This article explains the package structure of the Turms servers and the rough source code implementation of the main functional modules, to help developers read the source code and understand the related flows faster.

Reminders:

1. The Turms servers make heavy use of reactor-core, a reactive framework. This article assumes that readers are already proficient in reactive programming; if not, it is recommended to learn and master reactor-core first.
2. Turms optimizes its code from time to time, and some function names or implementations may change slightly, but the ideas behind them do not change.
3. The source code of each module usually does much more than what is described below, but to make it easier to understand, this article only covers the main flows and omits a lot of details. Readers who are interested in those details can read the source code for the concrete implementation details after finishing the explanations in this article and getting a general picture of the main flows.

Project structure

We often say that code is documentation: code lets readers understand the implementation details and logical relationships of each feature from a micro perspective, while packages are like the table of contents of that documentation. Good package partitioning should clearly show the hierarchy and structure of the "documentation" at the macro level so that readers can understand it. This article explains the package structure of the Turms servers to help developers better understand the relationships and hierarchy between packages.

Background (extended content)

No matter which package-partitioning philosophy is adopted, there are really only four basic partitioning categories: by feature, by type, by layer, and no partitioning at all. The various higher-level design philosophies are just different combinations of these basic categories.

Moreover, even for the same project, different package structures usually suit different stages of its development. We often say that architecture evolves, and package organization also needs to evolve. For example, the Turms servers had only a few modules in their early stages; applying today's approach of partitioning a pile of modules to the early Turms servers would only have made the package structure less readable, which is design for design's sake, i.e. over-engineering.

Package-partitioning goals (extended content)

Package-partitioning design must have clear goals; otherwise it is easy to fall into forcing the code into a particular package design, such as projects whose Service layer writes interface classes before implementation classes without thinking about why such interfaces are needed, or projects that force a DDD layering template onto the code without considering whether some of the resulting designs already seriously violate well-established conventions, which ends up tying the developers' hands.

The main package-partitioning goals of the Turms server projects are:

• Keep feature modules as highly cohesive as possible and reduce module complexity. This is mainly for maintainability, to avoid the very common "by feature + by type" or "by feature + by type + by layer" mixed designs, because mixed designs make code ownership ambiguous and reduce the readability of the package structure by using different partitioning strategies under the same package level, which is bad for long-term maintenance.
• Keep business subdomains as independent as possible. This is mainly to draw clear business boundaries and make each module easy to read and easy to change (incidentally, turms-service will support deployment in various combinations of business domains in the future; for example, turms-service could be deployed as a service for the user domain, for the message domain, or for the user + message domains, and so on).
• The feature modules of the supporting domains must be separated from the business modules. This is mainly to draw a clear boundary between the problem domains and the supporting domains.
• Let developers infer the upstream/downstream relationships between packages from the package structure as much as possible. This is mainly for readability: in long-term practice, when we look at a medium or large project whose package structure is not layered, we may have to go through the packages or code several times before we can guess the likely upstream/downstream relationships between packages.
• Keep the package hierarchy as shallow as possible while keeping the logic clear.

In addition, when reading the package structures of various excellent open-source projects, we find that most well-known medium and large open-source server projects may not use a layered design at all, and usually adopt a mixed design partitioned mainly by feature and secondarily by type, or follow conventional MVC or DDD layering. Our general comment on these approaches is "conventional and in line with common practice, but only barely satisfactory", because they do not meet the package-partitioning goals above very well; many developers fall into "not knowing where to start reading" when reading the source code of such projects, and often run into ambiguous code-ownership problems when coding.

Package-partitioning approach

The various package-partitioning philosophies usually only provide ideas for ideal scenarios and must not be applied blindly. When designing the package structure of the Turms servers, we mainly referred to: the various package-partitioning design philosophies, the practices and actual results of excellent open-source projects, well-established conventions, the project scale, the project type, the package scale, and the experience gained from long-term programming practice.

Therefore, the various package-partitioning philosophies are only "suggestions for reference"; in practice it ultimately takes long-term programming practice and experience to judge whether a design suits the actual situation of the Turms project, and we combined the strengths of the different philosophies to make up for their weaknesses. Readers can also see the shadow of many design philosophies, and even of DDD-based microservice design, in the package structure of the Turms server projects. Specifically, the package-partitioning diagram of the Turms servers is as follows:

The names in the boxes of the diagram above are the actual package names in the Turms servers, and the connecting lines are the logical relationships between packages. Specifically:

The first level is partitioned by layer into the four layers access, domain, storage and infra, where:

• access: the access layer, responsible for session management and request dispatching for administrators and clients. This layer dispatches user requests to the access layer inside the domain layer.

• domain: the business domain layer, responsible for handling the logic of the various business domains. Internally, the domain layer follows the common layered package design and is divided into the three layers access, service and repository.

  • The access layer inside the domain layer is relatively special. Above service there is not only the Controller layer admin that dispatches administrator HTTP requests, but also the Controller layer client or servicerequest that dispatches client requests (the client package for turms-gateway, and the servicerequest package for turms-service). The two share the Service layer, so a separate access layer is used to cover both.

  • Why a separate model package

    The dto (Data Transfer Object) / bo (Business Object) / po (Persistent Object) packages in the diagram above all hold anemic models; only model holds rich models, which not only store state (data) but also carry some behavior (logic) to handle various highly cohesive logic. Since they are special, they get their own package.

  • About the rpc package

    Some domains have their own RPC requests, and those RPC requests are grouped under the corresponding domain, such as the RPC request im.turms.server.common.domain.session.rpc.SetUserOfflineRequest under the Session domain.

    Incidentally, the cluster RPC implementation is under the im.turms.server.common.infra.cluster.service.rpc package.

• storage: the storage layer, which provides MongoDB client management and Redis client management, corresponding to the mongo package and the redis package respectively.

• infra: the infrastructure layer, responsible for providing basic capabilities such as logging and configuration management for the access and domain layers. Internally, the infra layer is partitioned by feature.

In summary, the package partitioning of the Turms servers is actually designed quite cleverly:

• With the four layers access, domain, storage and infra, developers can quickly understand the source code hierarchy of the Turms servers based on their existing knowledge of MVC layering, and can clearly see from the package names how each layer relates to user sessions and user requests.
• The business domains inside the domain layer help developers quickly see which business domains each Turms server has, and since the inside of each domain again follows the common MVC layered design, developers can quickly understand the internal upstream/downstream logic of a business domain based on their previous knowledge.
• The infra layer helps developers understand which feature modules the supporting domains of the Turms servers include.

Therefore, this package hierarchy is fairly clear and conducive to long-term maintenance. Readers may also have recognized the shadow of many package-partitioning philosophies in the practice described above; Turms only used these philosophies as references for its design and does not need to follow them strictly.

Additional notes:

• As for why the first-level packages are not split into modules (Java Modules): there is no need to split them into modules at this stage, and doing so would increase the complexity of the project structure. Entities should not be multiplied without necessity.
• Most of the anemic models in the Turms servers are represented by Java records, but some anemic models are still represented by classes for performance reasons (whether changing one field requires creating a new object).

How request handling flows between packages

After understanding the Turms package-partitioning design above, readers should already have a fairly clear picture of how the Turms servers handle requests. Here we take the most classic flow, "client login", and briefly walk through it from the package perspective (readers can follow the package diagram above), to help readers understand the layered package design more clearly.

• When a client logs in, it first establishes a raw TCP or WebSocket connection with the turms-gateway server. The layer that handles the network connection at this point is the access layer; since it is a client connection, it is the access/client layer (rather than access/admin).

• After the network connection has been established, when the client sends a login request to turms-gateway, turms-gateway passes the parsed request through the access layer to the Controller in the domain/session/access/client/controller layer, and the Controller hands the concrete business logic to the Service in the domain/session/access/client/service layer. The Service in turn: 1. hands the related MongoDB query operations to the Repository in the domain/session/access/client/repository layer, where the Repository only assembles the related CRUD statements, and these statements are passed to the MongoDB client implementation in the storage/mongo layer, which sends the final requests to the MongoDB server; 2. hands the related Redis operations to the storage/redis layer.

  Once the request has been handled, the result is returned level by level through callbacks, following the upstream/downstream relationships between the packages.

• As for the infra layer and its various sub-packages, most of them simply provide supporting capabilities for the layers above, such as the infra/logging logging package and the infra/cluster cluster implementation package.

The handling flows of the other kinds of requests (administrator HTTP requests, and client business requests over TCP/WebSocket connections) are roughly the same as above, and readers can extrapolate by themselves.

The following sections continue to explain the handling flow of client requests from a more detailed source code perspective.

Client request handling flow

Before reading further, it is recommended to read the standard flow for clients to access the servers first, to understand the design ideas behind it from an architectural perspective, so that you will not easily get "lost" when reading the source code.

Request model: im.turms.server.common.access.client.dto.request.TurmsRequest

Response and notification model: im.turms.server.common.access.client.dto.notification.TurmsNotification

UML sequence diagram

turms-gateway

Introduction: it maintains the network connections with clients, maintains the application-layer sessions, and forwards most business requests to the turms-service servers.

Network layer configuration

1. Start the servers that accept client requests

  TCP server: im.turms.gateway.access.client.tcp.TcpServerFactory#create

  WebSocket server: im.turms.gateway.access.client.websocket.WebSocketServerFactory#create

  These two functions mainly do the usual server work based on the reactor-netty library: bind the server listening addresses, configure the EventLoop thread pools, (optionally) configure SSL, enable the related metrics, and so on.

2. For raw TCP connections (rather than the preliminary connections of WebSocket), attach the codec handlers to newly established TCP connections

  Inside the im.turms.gateway.access.client.tcp.TcpServerFactory#create function, the following callback attaches the corresponding TurmsRequest and TurmsNotification codec instances to each new TCP connection.

                                java
                                .doOnConnection(connection -> {
                                     // Inbound
                                     connection.addHandlerLast("varintLengthBasedFrameDecoder", CodecFactory.getExtendedVarintLengthBasedFrameDecoder(maxFrameLength));
                                     // Outbound
                                @@ -492,7 +492,7 @@
                                     }
                                     return sink.asMono();
                                 }

At this point, the handling flow on the RPC sender side is finished.

One thing worth mentioning: the reason why the request ID is not encoded further upstream is that some RPC requests may be sent to multiple RPC receivers; for example, group messages are often forwarded to multiple turms-gateway servers. By encoding separately, the byte data passed from upstream can be shared without memory copies, which greatly improves memory usage efficiency. This is also one of the reasons why Turms implements its own RPC service.

The RPC receiver side of HandleServiceRequest

                                TODO

                              - + \ No newline at end of file diff --git a/docs/zh-CN/server/development/plugin.html b/docs/zh-CN/server/development/plugin.html index 43af8be8..81cb9b1e 100644 --- a/docs/zh-CN/server/development/plugin.html +++ b/docs/zh-CN/server/development/plugin.html @@ -17,7 +17,7 @@ -

Custom Plugins

List of plugin extension points

Category | Extension point | Description
Admin | AdminActionHandler | Handler for administrator actions. Used to listen to administrators' API operations
User | UserAuthenticator | User login authentication. When a client requests to log in to turms-gateway, turms-gateway calls this plugin to implement custom login authentication logic. With this plugin, you do not need to (optionally) synchronize the user information of your existing business systems into Turms
 | UserOnlineStatusChangeHandler | Handler for user online status changes. turms-gateway calls this interface whenever any user comes online or goes offline
Request | ClientRequestHandler | Handler for client business requests. Used to modify request parameters (the request can even be turned into another business request) and to customize request handling. Turms calls this handler when it receives a client business request. With this plugin, you can implement features such as sensitive word filtering
Notification and message | NotificationHandler | Notification handler. When some action requires notifying related users, turms-gateway calls this handler. Commonly used to integrate custom third-party push services
 | ExpiredMessageDeletionNotifier | Handler for the automatic deletion of expired messages. When Turms periodically deletes expired messages, the Turms server calls this interface to tell the plugin implementer all the messages that are about to be deleted. Commonly used by developers to back up messages
Service implementation | StorageServiceProvider | Storage service provider. The Turms project itself has no concrete storage service implementation and only exposes the storage-service-related interfaces for this plugin to implement. (See turms-plugin-minio)
Business model lifecycle | (TODO) |

How plugins are loaded

• Local loading: the Turms server checks whether the JAR files ending with .jar and the JavaScript files ending with .js under the plugins directory of the distribution package are plugin implementations; if they are plugins, they are loaded when the Turms server starts.

  Note: the Turms server does not load plugins placed in the lib directory.

  Further reading: Directory structure of the Turms server distribution package

• Loading via the HTTP API:

  • API endpoint for adding a Java plugin: POST /plugins/java
  • API endpoint for adding a JavaScript plugin: POST /plugins/js

  Further reading: Plugin-related API endpoints

• Loading via turms-admin (built on top of "loading via HTTP"): on the /cluster/plugin page, administrators can also upload Java plugins and JavaScript plugins through the UI.

Plugin implementation

The Turms servers support plugin implementations based on JVM languages or JavaScript.

 | JVM language plugins | JavaScript plugins
Language version | Java 21 (Bytecode 65.0) | ECMAScript 2022
Pros | Suitable for features with complex logic, such as turms-plugin-antispam, the official sensitive-word-filtering plugin of the Turms project | You only need to create a JavaScript file to write custom logic directly, with no compilation and no packaging; hot updates are easy to support
Cons | Even for a little custom logic, you still need to set up a plugin project and package the code into a JAR with a build tool, which is tedious | For complex logic it is not as good as a Java plugin; the memory overhead is larger than that of a Java plugin; interpreted execution, so runtime efficiency is lower
Summary | Better suited to heavyweight plugins with complex and relatively fixed implementations; such a plugin is more like a "project" | Better suited to small, lightweight plugins that need hot updates; such a plugin is more like a "small patch"

JVM language version (taking Java as an example)

Implementation steps

1. Install the JAR dependencies of the Turms project for compiling your plugin

  • Clone the Turms repository. Reference command: git clone --depth 1 https://github.com/turms-im/turms.git
  • In the root directory of the Turms project (i.e. the parent directory of the .git directory), run mvn install -DskipUTs -DskipITs -DskipSTs to compile the Turms project source code and automatically install the generated JARs into your local Maven repository for compiling your plugin
2. Set up the plugin project

  • Option 1 (recommended): clone the turms/turms-plugin-demo directory locally and develop based on this template. This option reduces unnecessary, repetitive configuration work.

  • Option 2: set it up manually. The specific steps are as follows:

    1. Create a new Maven project and add the dependency to pom.xml (add the turms-gateway dependency to implement a plugin for the turms-gateway server, or the turms-service dependency to implement a plugin for turms-service):

                                          xml
                                          <dependency>
                                               <groupId>im.turms</groupId>
                                               <artifactId>turms-gateway</artifactId>
                                               <version>0.10.0-SNAPSHOT</version>
                                          @@ -222,7 +222,7 @@
                                           }
                                           
                                           export default MyTurmsPlugin;

Where:

• The MyTurmsExtension class is the developer's custom TurmsExtension extension, and developers can choose the class name. In it:

  • The getExtensionPoints function must exist; it returns the names of the plugin extension points that this extension class implements. If a developer specifies an extension point but does not implement its interface functions, the Turms server skips the plugin when executing the plugin callbacks and does not report an error.
• The MyTurmsPlugin class is the developer's custom TurmsPlugin plugin, and developers can choose the class name. In it:

  • The getDescriptor function must exist; the object it returns is the descriptor of the plugin:

    • The id field is used to distinguish plugins. It has no format requirements, but it must not be empty.

    • The other fields are descriptive, currently have no practical effect, and can all be empty.

  • The getExtensions function must exist; the object it returns is an array of extension classes, such as MyTurmsExtension above.

• export default is used to export the developer's custom plugin, such as MyTurmsPlugin above.

Notes:

• The Turms server only checks whether the files ending with .js under the plugins directory are plugin implementations, so if you put plugin JARs into the lib directory, those plugins will not be recognized or used.
• Turms does not apply access control to plugins, so you need to make sure yourself that a plugin contains no malicious code. Note: a malicious plugin can not only call functions to forcibly shut down the Turms server, but can even take control of the operating system.
• The context is per plugin, i.e. each plugin has its own independent context, and all functions of a plugin share that context. In other words, a function executed later can see the changes that a previously executed function made to the context.
• JavaScript plugins can also access the Java classes and instances of the Turms server just like Java plugins, and can even call System.exit() directly; it is just not recommended to write complex plugins in JavaScript
• Calling Node.js modules is not supported.

Main global objects

• The load function is a global function of GraalVM, used to load external JavaScript resources.
• The turms object. The following are mounted on it:
  • The log object, for logging
  • The fetch function, for sending HTTP requests

                                          TODO

Plugin debugging steps

In debug mode (set turms.plugin.js.debug.enabled to true to enable it):

1. When the plugin host, the Turms Java server, invokes a JavaScript plugin function implementation that has been proxied by a Java Proxy class (the proxy implementation source code is in im.turms.server.common.infra.plugin.JsExtensionPointInvocationHandler), the WebSocket debugger server listening on the JavaScript plugin waits for the developer to attach the Chrome debugger, so that the JavaScript plugin code only starts executing after the debugger has been attached. At this point the Java thread that invokes the JavaScript plugin function enters the WAITING state and waits for the JavaScript plugin function to finish executing.

2. To debug the JavaScript plugin code, the developer needs to open the Chrome browser and enter the listening address of the WebSocket debugger server for the JavaScript plugin; on that page, the developer can set breakpoints in the JavaScript plugin code. The listening address is printed by the Turms server on the console, similar to:

                                            Debugger listening on ws://127.0.0.1:24242/bd62b7be-bdec-48a6-9ad0-9314af33d531 For help, see: https://www.graalvm.org/tools/chrome-debugger E.g. in Chrome open: devtools://devtools/bundled/js_app.html?ws=127.0.0.1:24242/bd62b7be-bdec-48a6-9ad0-9314af33d531

Here, devtools://devtools/bundled/js_app.html?ws=127.0.0.1:24242/bd62b7be-bdec-48a6-9ad0-9314af33d531 is the listening address.

3. After the Chrome debugger is attached, the JavaScript plugin function starts executing

4. Once the JavaScript plugin function has finished, the Java calling thread returns to the RUNNABLE state, and the Java proxy function then returns the data that the JavaScript plugin function returned.

Configuration Items

• turms.plugin.enabled (default: true): whether to enable the plugin mechanism
• turms.plugin.dir (default: plugins): the local plugin directory; the Turms server loads plugins from this directory
• turms.plugin.network.proxy.enabled (default: false): whether to use an HTTP proxy when downloading network plugins
• turms.plugin.network.proxy.username: the HTTP proxy username
• turms.plugin.network.proxy.password: the HTTP proxy password
• turms.plugin.network.proxy.host: the HTTP proxy host
• turms.plugin.network.proxy.port (default: 8080): the HTTP proxy port
• turms.plugin.network.proxy.connect-timeout-millis (default: 60_000): the HTTP proxy connect timeout in milliseconds
• turms.plugin.network.plugins[?].url: the plugin URL
• turms.plugin.network.plugins[?].type (default: AUTO): the plugin type.
  When the value is AUTO, the Turms server detects the plugin type from the URL path: a URL ending with .jar is treated as a Java plugin, a URL ending with .js as a JavaScript plugin, and otherwise the Turms server throws an exception for an unrecognizable plugin type.
  When the value is JAVA, the plugin is treated as a Java plugin.
  When the value is JAVA_SCRIPT, the plugin is treated as a JavaScript plugin.
• turms.plugin.network.plugins[?].use-local-cache (default: false): whether to use the local plugin cache. If false, the Turms server re-downloads the plugin on every startup
• turms.plugin.network.plugins[?].download.http-method (default: GET): the HTTP method used to request the plugin URL
• turms.plugin.network.plugins[?].download.timeout-millis (default: 60_000): the timeout in milliseconds for downloading a plugin

Plugin-Related API Endpoints

OpenAPI address: http://playground.turms.im:8510/openapi/ui#/plugin-controller

Controller: PluginController
• GET /plugins: query plugins
• PUT /plugins: update plugins
• DELETE /plugins: delete plugins
• POST /plugins/java: add Java plugins
• POST /plugins/js: add JavaScript plugins
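For example, a plugin query could be issued against this admin API with any HTTP client. A minimal Java sketch, assuming a locally running turms-service server whose admin HTTP API listens on port 8510 (add whatever admin authentication headers your deployment requires):

java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public final class ListPluginsExample {
    public static void main(String[] args) throws Exception {
        // Assumption: the admin HTTP API is reachable at localhost:8510 without extra headers.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:8510/plugins"))
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}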
                                    - + \ No newline at end of file diff --git a/docs/zh-CN/server/development/redevelopment.html b/docs/zh-CN/server/development/redevelopment.html index b0ef758a..0e0f5073 100644 --- a/docs/zh-CN/server/development/redevelopment.html +++ b/docs/zh-CN/server/development/redevelopment.html @@ -17,7 +17,7 @@ -

About Redevelopment

Reasons for Redevelopment Based on Turms

Objective Reasons

• Uniqueness. Turms is the only open-source instant-messaging solution in the world that is built on a modern architecture and modern engineering practices and is suitable for medium- to large-scale deployments, while dozens of other open-source IM projects remain in a primitive state, mostly emphasizing enterprise communication or end-to-end security and therefore appealing mainly to enterprise users. Apart from Turms, the global open-source world has no medium-to-large IM project designed for ordinary consumer internet applications.

• Conformity to standards. Turms's architecture is a variant of the standard commercial instant-messaging architecture, so if your professional team designs to common commercial standards, the architecture it produces will not differ much from what Turms already has, and there is no need to start over from scratch.

• Simplicity. The overall architecture and the implementation of each module are fairly concise and lightweight, so redevelopment is not difficult.

• Controllability. Turms is developed under the Apache 2.0 license, is 100% open source, and implements much of its foundational middleware itself, which keeps the underlying technology under control and prevents the project from running out of momentum later on.

• Complete documentation. This includes design documents for modules such as message awareness, the observability system, sensitive-word filtering, throttling and anti-abuse, the global blocklist, and so on. We write the Turms documentation with the attitude of "afraid it isn't explained clearly enough": it covers not only "what is done" and "how it is done" but also "why it is done this way", helping developers understand each functional module through design rationale, ideas and key points, which is actually rare in the open-source world. Meanwhile, people behind some open-source IM projects, in order to earn consulting fees or for fear of being copied, write with the attitude of "afraid the user might understand" and are therefore unwilling to write good documentation.

  Reminder: the importance of design documents to developers and architects goes without saying. When evaluating open-source IM projects, readers can check for themselves whether a project's documentation was written "afraid it isn't explained clearly enough" or "afraid the user might understand".

• An IM system is full of details, and developer skill varies widely, so the quality of a home-grown project is hard to guarantee. Making it possible for user A to send a message to user B or group B covers at most 1% of an IM system's functionality, and these functional modules are not general-purpose libraries that can be plugged in at will; they have to be purpose-built, such as Turms's sensitive-word filtering based on a double-array-trie Aho-Corasick automaton, and the implementations interlock tightly (even the Turms documents cross-reference one another), so every module has to be built in-house, which demands strong fundamentals from both designers and developers.

  (To see how many detailed features a complete IM system involves, keep reading the Turms documentation. Of course, an IM system can be even richer than that; as we said above, IM is not just complex, it can be almost endlessly complex.)

  Turms, on the other hand, has essentially implemented a complete IM server system. Almost everything users can think of, and much they have not, is either already implemented or has its groundwork laid, and for the features we deliberately do not implement we usually explain why, for the sake of transparency.

  Also, some of Turms's solutions may look "obvious", but by the time we design and implement one we have usually already rejected many alternatives; behind it lies a great deal of reasoning and practice. Users only see the final solution and then feel it was "the obvious one". The design documents of each module explain this.

• High code quality. The Turms server consistently maintains the level expected of a senior engineer and strikes a balance between performance and readability; see the server source code and the module design documents. We dare to claim that the Turms server can reach the limits of the Java ecosystem not only because its own implementation is highly efficient, but also because we have refactored many inefficient yet critical dependencies (such as mongo-java-driver and lettuce) and even replaced some with our own implementations (such as the logging and cluster implementations) to guarantee top performance.

  As an aside, some open-source projects claim excellent performance, but one look at the code gives them away. Here are three fairly general, quick and practical ways to judge an open-source author's coding level, for your reference:

  • (Basic) Sensible use of syntax, data structures and programming paradigms.
  • (Intermediate) Look at class, variable and function names to gauge the author's vocabulary and precision of wording. Vocabulary and precision are very hard to fake, and they usually make it easy to infer the author's technical background, skill level and coding experience. An author with a rich vocabulary and accurate naming is usually a decent coder.
  • (Advanced) Anti-idiomatic design (going against design patterns, against conventional algorithms, using Unsafe operations, and so on). Sensible use of design patterns shows whether the author can think in terms of design, while daring to go against the idioms usually means the author has a clear coding goal in mind, knows the relevant designs and low-level code intimately, has seen the shortcomings of the conventional design, and is prepared to answer the question "why didn't you design it the standard way".

  Of course, the above is only for reference; in practice there are many more things to look at.

• Forward-looking technical choices. One thing we feel deeply as software engineers is that a technology celebrated today can become yesterday's news, that is, "technical debt", tomorrow: think Hadoop on the server side, or Bootstrap, Backbone.js and Ember.js on the web side. When making technical choices, Turms considers not only the current state of affairs, such as the cluster design and implementation, but also where technology is heading, such as Project Valhalla and Project Loom mentioned in the system resource management document.

• The market demand for in-house IM services is large. Even a quick search for IM engineer positions on job sites shows that many companies at home and abroad are still hiring IM talent; companies spend millions building IM services from scratch or on top of ancient open-source IM projects, duplicating one another's work with poor use of resources overall.

In addition, if you are still hesitating over other open-source IM projects, we strongly recommend comparing them with Turms; once you have skimmed the documentation and source code of Turms and of the alternatives, the answer should be clear.

Subjective Reasons

• Your project's core business is related to instant messaging, or you plan to invest deeply in instant-messaging features.
• Your project needs extended features that Turms does not yet provide, especially features that require auxiliary index collections (see the Turms collection design for details on auxiliary index collections).
• Your project has many IM implementation details of its own. Although Turms offers hundreds of configuration items, they are still general-purpose ones. Concrete IM features vary enormously with business requirements, and Turms cannot ship every relatively niche business feature directly, or its code size would explode; such features require your own redevelopment.

Importing the Project

1. Pull the Turms repository: git clone https://github.com/turms-im/turms.git

2. Since the proto model files of the Turms sub-projects live in a separate repository, you also need to run the following commands in the root directory of the Turms project to pull the submodule code.

                                    git submodule update --init --recursive
                                     git submodule foreach git pull origin master
3. (Optional) If you use IntelliJ IDEA, you can import the whole Turms project via File -> New -> Project from Existing Source. IDEA will recognize the project's directory structure automatically and import the corresponding Maven dependencies.

Setting Up the Development Environment

Apart from the Turms server, setting up the other Turms sub-projects is routine and simple, so it is not covered here.

Setting up the Turms server development environment is also very simple. The steps are:

1. Install JDK 21 to develop the Turms server

2. Download, install and start a Redis server. Taking RHEL/CentOS as an example:

                                      bash
                                      yum install epel-release
                                       yum update
                                      @@ -28,7 +28,7 @@
                                       yum install redis
                                       systemctl start redis
                                       systemctl enable redis

  On Windows, you can download a Windows build from tporadowski/redis for local development and testing.

3. Download, install and start a MongoDB sharded cluster

  • Download MongoDB 4.4
  • Start the MongoDB sharded cluster: we recommend installing mtools to set up a MongoDB sharded cluster fully automatically; install it with pip3 install mtools[mlaunch]. After installing mtools, run the single command mlaunch init --replicaset --sharded 1 --nodes 1 --config 1 --hostname localhost --port 27017 --mongos 1 and wait a few seconds, and the sharded cluster is up

4. Once the Redis server and the MongoDB sharded cluster are confirmed to be running, you can start the Turms servers

Additional notes:

• Redis and MongoDB can be configured to start on boot so that you do not have to set them up manually after every reboot. Even done manually, after a few tries the whole Redis plus MongoDB sharded cluster setup takes only about 10~30 seconds; the procedure is very simple.
• For server development, we recommend changing spring.profiles.active=prod to dev in the application.yaml of both turms-gateway and turms-service, because:
  • With the default production configuration, the Turms server does not print logs to the console, which is inconvenient for debugging
  • Under the dev profile, turms-service automatically generates fake data in MongoDB, and turms-gateway automatically creates TCP-based fake clients that send real client requests to turms-gateway at random (random request types and random parameters), which makes testing easier.
• If you want to change the MongoDB server port, simply replace 27017 globally with your target port across the Turms project.

About Task Difficulty

Teams preparing to do redevelopment based on Turms (i.e. modifying the Turms source code itself) can use the task-difficulty table below when assigning work to members.

Task difficulty is rated from 0 to 10, where:

• 0 means extremely easy
• 1~3 means easy
• 4~6 means medium
• 7~9 means hard
• 10 means impossible

Server

"Code implementation difficulty" is judged from two angles: logical complexity, and workload (how tedious it is, i.e. how much sheer "manual labor" it takes). For example, building an equivalent of spring-webflux from scratch has a logical complexity of about 1~3 but a workload of about 5~6, so overall it counts as 5~6; algorithm implementations, by contrast, are usually high in logical complexity and low in workload.

• Basic IM business features
  • Requirements analysis: 3~7. Every new IM feature must be checked for logical consistency with all other IM features and for whether it can be implemented efficiently (deriving or constraining the IM requirements from the implementation), etc.
  • Related process design: 4~6 in the early stage, e.g. choosing read diffusion, write diffusion or a hybrid for messages, and push, pull or push-pull hybrids for the various notifications; 1~2 at the current stage
  • Code implementation difficulty (precondition: the code must be efficient): 1~3. The vast majority is routine CRUD. The occasional 3 is hard because it has to balance elegant code against an efficient implementation, which is more of a code-design problem.

• Extended features
  • Requirements analysis: 2~5
  • Related process design: 3~4 in the early stage; 1~2 at the current stage
  • Code implementation difficulty: 2 for the throttling and anti-abuse mechanism; 4~5 for the global blocklist; 7~8 for the sensitive-word filter

• Middleware and base libraries
  • Requirements analysis: 1~3
  • Related process design: 1~3
  • Code implementation difficulty: 1~4. 1 for things like metrics and the distributed snowflake ID generator; 2~3 for things like logging and the distributed configuration center; 3~4 for things like the plugin mechanism, RPC, and service registration and discovery

• Fixing bugs
  • Requirements analysis: 0~3
  • Related process design: 0~3
  • Code implementation difficulty: 1 for the vast majority of ordinary bugs. Turms rarely fixes a bug in isolation: before fixing it we usually re-examine whether the business flow that produced the bug is reasonable and whether it can be improved, and only then fix it. Hard-to-fix bugs usually have little to do with the code itself; they are hard because the process design has a hole. For example, if the architecture is wrong, say write diffusion was used where read diffusion was called for, then patching the upper layers only scratches the surface.

• Custom algorithms and data structures
  • Requirements analysis: 1
  • Related process design: 1~2
  • Code implementation difficulty: 1 for ordinary custom data structures, such as im.turms.server.common.infra.collection.FastEnumMap; 2 for lock-free thread-safe custom data structures, such as im.turms.server.common.infra.collection.ConcurrentEnumMap and im.turms.server.common.infra.throttle.TokenBucket; 4~5 for lock-free thread-safe growable custom data structures, such as im.turms.server.common.infra.collection.SpscGrowableLongRingBuffer; 8 for im.turms.plugin.antispam.ac.AhoCorasickDoubleArrayTrie in the sensitive-word filter

Overall:

• The hard part of IM features is requirements analysis and high-level design. Adding a new IM feature means checking whether it is logically consistent with the other IM features, whether the current architecture can implement it efficiently, whether distributed transactions are needed, whether new fields must be added to database collections, and many other questions. Package and layer organization was fairly complex in the early stage, but those problems are now solved and stable, so new tasks rarely hit difficulties in code-flow design. The concrete code implementation is usually routine, though the occasional implementation can be tedious.
• Implementing custom middleware and base libraries has basically no hard parts; what deserves relative attention is, again, requirements analysis (which is of course far simpler than requirements analysis for IM business features).
• Most bugs are not hard in themselves, but you need the ability to reason back to the root cause and to consider whether the business flow that produced the bug leaves room for improvement (so ultimately the difficulty is still requirements analysis)
• Apart from the Aho-Corasick automaton built on a double-array trie, which is fairly hard to implement, most custom algorithms are easy to implement, and very few algorithms and data structures need to be customized at all, so a redevelopment team should not run into algorithm or data-structure problems.

One more note: deciding not to build a feature also requires requirements analysis. Some Turms features have had their flows fully designed and their code fully written, but because the requirement might conflict logically with other requirements, or because it costs noticeable performance while being dispensable, those features stay in a dangling state: implemented but never released.

turms-admin

turms-admin has no technical difficulties of its own; its code structure and implementation are fairly tidy, and it does not suffer from the legacy problem common in medium and large front-end projects of many heterogeneous nested sub-projects (for example, a root project using Backbone with nested sub-projects mixing Vue, Angular, React and so on, plus assorted dependency version conflicts), so a junior front-end engineer should be able to pick it up and redevelop it.

The typical time split for a new UI feature is: requirements analysis (40%) > UI design (30%) >= code implementation (30%)

turms-client

turms-client has no technical difficulties of its own; its code structure and implementation are fairly tidy, so a junior engineer should be able to pick it up and redevelop it.

The relatively hard part of turms-client is API design: making the interfaces as self-explanatory as possible while still guaranteeing that developers can extend the lower layers.

                                    - + \ No newline at end of file diff --git a/docs/zh-CN/server/development/rules.html b/docs/zh-CN/server/development/rules.html index 4dd2aa37..624ba3f6 100644 --- a/docs/zh-CN/server/development/rules.html +++ b/docs/zh-CN/server/development/rules.html @@ -17,7 +17,7 @@ -

Basic Development Rules

Conservative Design vs Aggressive Design

Java itself is a very conservative language, and its wider ecosystem is just as conservative. Its design principle is "provide a set of safe APIs such that no matter how users call them, Java's internals cannot be corrupted" (the Unsafe class aside), hence the various access-control mechanisms, internal memory copies and repeated locking. The guiding principle of Turms server code, by contrast, is generally "write it however makes the program run fastest; if the caller dares to pass or use data carelessly, we simply report an error or simply ignore it". For example, Turms's StringUtil obtains the byte[] inside a String via jdk.internal.misc.Unsafe#getReference to avoid a memory copy, and the caller must guarantee on its own not to misbehave; Java's own String#getBytes(), to guarantee that callers cannot modify the internal byte[], copies that byte[] and hands the copy to the caller.

So in string handling, for an ordinary web application built on Spring, once an HTTP request has been sliced out of the TCP byte stream it may be converted and concatenated back and forth among String, StringBuilder, byte[], HeapByteBuffer, DirectByteBuffer and so on; it is common for a single business-level String to end up being copied 5~30 times by third-party libraries and by Java itself.

Take a concrete application: suppose we use Spring to create a Controller bean and define in it an API method with a String return type that returns metrics in Prometheus format. Writing this in the "most elegant" way, we need at least 4 memory copies of this in-memory object (not counting the kernel flushing the data to the NIC; with its optimizations, Turms needs only one copy, from heap memory to off-heap memory; this metrics payload is about 8 KB in practice):

1. Write the raw Java data into a StringBuilder: a heap-to-heap copy
2. StringBuilder#toString(): another heap copy
3. String#getBytes(): at least one more heap copy
4. Write the byte[] into an off-heap DirectByteBuffer so the kernel can perform the write

The effective memory utilization is extremely low, and note that this is only the simplest possible "return a String from an API" feature; real flows are far more complex, so a string being copied 5~30 times over one flow is entirely common. That is why, when an HTTP server is built on its language's mainstream ecosystem, an ordinary Java web application may use tens or even a hundred times the memory of an equivalent C++ HTTP server.

Besides the various network APIs, the logging implementation also has to deal with String constantly. Here Turms is far more memory-efficient than the generic implementations: it allocates cached off-heap memory directly via PooledByteBufAllocator.DEFAULT and writes the raw Java data straight into the off-heap memory block, avoiding Java's own inefficient implementations throughout and thus avoiding pointless heap-to-heap copies.
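A minimal sketch of the two paths described above, assuming Netty is on the classpath; the metric line is made up and this is not Turms's actual logging code:

java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;

public final class CopyPathsExample {

    // The conventional path: every step copies the same payload again.
    static ByteBuffer conventionalPath(long value) {
        StringBuilder builder = new StringBuilder();           // 1. raw data -> heap
        builder.append("metric_total ").append(value).append('\n');
        String text = builder.toString();                      // 2. heap -> heap
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);  // 3. heap -> heap
        ByteBuffer buffer = ByteBuffer.allocateDirect(bytes.length);
        buffer.put(bytes).flip();                              // 4. heap -> off-heap
        return buffer;
    }

    // The direct path: write straight into a pooled off-heap buffer.
    static ByteBuf directPath(long value) {
        ByteBuf buffer = PooledByteBufAllocator.DEFAULT.directBuffer(64);
        buffer.writeCharSequence("metric_total ", StandardCharsets.US_ASCII);
        buffer.writeCharSequence(Long.toString(value), StandardCharsets.US_ASCII);
        buffer.writeByte('\n');
        // The caller must release() the buffer once the I/O has completed.
        return buffer;
    }
}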

In summary, although Java itself is conservative, Turms leans aggressive and puts performance first rather than "code elegance", making good use of the Unsafe class when necessary. Our "aggressiveness" does have limits, though: 1. we never replace Java's internal class implementations; 2. we avoid writing JNI and C code wherever possible

Additional notes:

1. For practices at the level of Java syntactic sugar, our attitude is "it hardly matters": both for (X x : Collection<X>) (which creates an iterator object and costs at least a few dozen extra bytes) and the more efficient for (int i = 0; i < length; i++) are allowed
2. Besides its conservative bent, the Java community has another odd habit: "selective blindness when optimizing". On one hand it tolerates the String/StringBuilder copying that lets a single API flow copy several Strings dozens of times; on the other hand it painstakingly studies JVM memory tuning. Faced with the many possible optimizations, Turms simply prioritizes by cost-effectiveness and optimizes the high-payoff parts first, instead of barking up the wrong tree.

Basic Rules for Server Development

Priorities of Coding Strategies

General rule: performance (low time and space complexity) > code readability > design patterns

• Performance > code readability. For example, we use long rather than java.util.Date or java.time.Instant to represent time, to avoid creating new objects and converting between time types; likewise, the nextIncreasingId and nextLargeGapId functions of im.turms.server.common.infra.cluster.service.idgen.SnowflakeIdGenerator duplicate about 10 lines of code, but we do not extract the common code, to avoid an extra stack frame (ignoring the JVM's eventual inlining).
• Performance > design patterns. Example scenarios (see the sketch after this list):
  • Iterating over the char[] elements of a String. With a chain-of-responsibility design, each category of processing logic would live in its own Handler class. That keeps the logic tidy, but every Handler has to walk the char[] once, giving a processing time of O(n*m) (n being the char[] length and m the number of Handlers); code with that complexity is forbidden in the Turms server. Instead, write against the pattern: put as much of the processing as possible into a single pass, preferably without splitting the logic into separate functions (this part is optional), and separate the different pieces of logic with comment blocks to avoid stack-frame overhead.
  • Protobuf's model design is widely praised for its efficiency, but the official Java implementation is conservative and inefficient. A Protobuf model is immutable and only its Builder is mutable, so modifying a model means calling toBuilder() to get a Builder and then building a brand-new model instance, which makes poor use of memory (its string decoding is also very inefficient: to stay compatible with old Java versions it encodes via char[], while the String of modern Java stores only byte[] internally, so an extra conversion is needed). In the code we control, we avoid Builders whenever possible, to avoid pointless memory consumption.
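A minimal sketch of the single-pass style mentioned above; the validation rules themselves are made up for illustration:

java
/**
 * Checks a string in one pass instead of one pass per rule.
 * The rules are hypothetical; only the single-pass structure matters.
 */
public final class SinglePassValidator {

    public static boolean isValid(String input) {
        int digitCount = 0;
        for (int i = 0, length = input.length(); i < length; i++) {
            char c = input.charAt(i);
            // Rule 1: reject control characters.
            if (c < ' ') {
                return false;
            }
            // Rule 2: reject more than 8 digits in total.
            if (c >= '0' && c <= '9' && ++digitCount > 8) {
                return false;
            }
            // Rule 3: reject whitespace at the start or the end.
            if (c == ' ' && (i == 0 || i == length - 1)) {
                return false;
            }
        }
        return true;
    }
}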

Exception: in a very few cases, code readability takes priority over performance. Take the rule below that forbids reflection while handling client requests and admin API requests: even with that rule, when a request needs to create an Entity object for the database driver, we still create and populate the object via reflection, because without reflection we would have to hand-write serialization and deserialization logic for hundreds of fields, a huge and error-prone amount of work. The payoff of reflection is high here, so reflection is allowed.

There are many more examples like the above; see the Turms server code. When adding new code, you only need to ensure that the new code leaves almost no room for optimization in time or space. If there is still room but the payoff is low and the implementation complex, the optimization may be deferred.

Threads and Locks

• Elastic thread pools are forbidden; creating any new thread requires a dedicated code review

• While handling client requests and admin API requests, avoid synchronized and Lock operations (including reentrant locks) as far as possible. If a critical section is truly needed, prefer restructuring the code flow or replacing the lock with CAS techniques (see the sketch below).
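A minimal sketch of replacing a lock with CAS, assuming a simple bounded counter; the class is made up for illustration:

java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Admits at most "capacity" concurrent requests without synchronized or Lock:
 * the check-and-increment is done with a single CAS loop.
 */
public final class CasPermitCounter {

    private final AtomicInteger permits = new AtomicInteger();
    private final int capacity;

    public CasPermitCounter(int capacity) {
        this.capacity = capacity;
    }

    public boolean tryAcquire() {
        while (true) {
            int current = permits.get();
            if (current >= capacity) {
                return false;
            }
            if (permits.compareAndSet(current, current + 1)) {
                return true;
            }
            // Another thread won the race; re-read and retry.
        }
    }

    public void release() {
        permits.decrementAndGet();
    }
}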

Memory and GC

• Copying a ByteBuf is forbidden

• For network I/O, unpooled memory and heap memory are forbidden; only pooled direct memory is allowed

• Avoid creating new objects; prefer object pools. A common design splits the data models of different layers into DTOs and BOs for logical separation; in such cases Turms tries to use a single data model and customizes Jackson's serialization logic to produce responses that match the DTO model

  Note: this rule will change once Project Valhalla ships; in particular, we will remove most of the existing object pools

• Avoid creating objects with many unused fields. For example, Turms replaced MongoDB's FindOptions model with its own QueryOptions model partly because FindOptions is used frequently yet carries dozens of unused fields

• While handling client requests and admin API requests, Stream is forbidden

• On the question "why do some functions that look like they could take primitive parameters still use wrapper classes?": although some parameters may look like candidates for primitives, those values will most likely end up being passed to Java collection implementations (such as Map<Long, Object>), to functions that accept only objects (Object, Long, generics, and so on), or stored in Object fields of a class. So if a function uses a primitive in isolation, the value will probably be converted back and forth between the wrapper and the primitive several times over the whole flow. For this reason the Turms server uses wrapper classes uniformly in most cases to avoid the repeated conversions, and uses primitives uniformly only when we can guarantee the value will never be boxed.

  This is also why, in the article on Project Valhalla, we say the "everything is an object" philosophy "lingers like a curse": in complex logic it is hard for a primitive not to get boxed, and the pointless objects waste a lot of memory. It is likewise why we keep waiting for Project Valhalla to do away with wrapper classes and support types such as List<int>.
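A minimal sketch of the boxing churn described above; the class and method names are made up:

java
import java.util.HashMap;
import java.util.Map;

public final class BoxingChurnExample {

    private final Map<Long, String> sessionNames = new HashMap<>();

    // Takes a primitive, but the value is boxed at every collection call below.
    String renameWithPrimitive(long sessionId, String name) {
        sessionNames.put(sessionId, name);   // long -> Long (boxing)
        return sessionNames.get(sessionId);  // long -> Long (boxing again)
    }

    // Takes the wrapper once, so no further conversions happen along the flow.
    String renameWithWrapper(Long sessionId, String name) {
        sessionNames.put(sessionId, name);
        return sessionNames.get(sessionId);
    }
}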

Proxies and Reflection

• Dynamic proxy techniques (Java dynamic proxies, CGLib, Spring AOP, etc.) are forbidden; avoid proxies altogether where possible, or replace them with compile-time techniques (such as Lombok).

  The only exception: the Turms server's plugin mechanism uses Java dynamic proxies to proxy plugins written in JavaScript.

• While handling client requests and admin API requests, reflection is forbidden except where avoiding it would require a large amount of tedious code. For example, Turms uses reflection when serializing and deserializing the hundreds of fields of its MongoDB entity models.

Additionally, if a third-party dependency violates the above principles, we schedule a refactor of that dependency according to its cost-effectiveness.
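A minimal sketch of the kind of reflective entity population the exception above refers to; the helper and its types are made up, and Turms's real implementation differs:

java
import java.lang.reflect.Field;
import java.util.Map;

public final class ReflectiveEntityReader {

    /**
     * Populates an entity from a field-name -> value map (standing in for a BSON document),
     * instead of hand-writing per-field deserialization for hundreds of fields.
     * Assumes the entity class has an accessible no-arg constructor.
     */
    public static <T> T read(Class<T> entityClass, Map<String, Object> document)
            throws ReflectiveOperationException {
        T entity = entityClass.getDeclaredConstructor().newInstance();
        for (Field field : entityClass.getDeclaredFields()) {
            Object value = document.get(field.getName());
            if (value != null) {
                field.setAccessible(true);
                field.set(entity, value);
            }
        }
        return entity;
    }
}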

Text Formats

toString() Text Format

The text formats produced by toString() implementations in Java projects vary wildly; even Java's own internal code mixes inconsistent styles. Just in terms of brackets, there is the default [key=value] format of Java records, the (key=value) format generated by Lombok, and the {key=value} format generated by Google AutoValue.

To unify the text format, the Turms server projects uniformly adopt the following:

• Use { and } as the prefix and suffix of the text, rather than [ ] or ( ), because in Turms's text format design [] denotes an array and () denotes something that needs special marking to make important information stand out. See the server log and exception text format rules below for the specific rules

• Use the mainstream = between keys and values, rather than :

• Wrap string values in ""; render all other non-array values with their toString(); wrap the values of an array in [].

  For example: ClassName{key1=value, key2=[value1, value2]}

Note: the Turms server has not yet unified its toString() text format; the above describes the direction of future improvement.
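A minimal sketch of a toString() following the format above; the class and its fields are made up:

java
import java.util.Arrays;

public final class UserSession {

    private final long userId;
    private final String deviceType;
    private final String[] tags;

    public UserSession(long userId, String deviceType, String[] tags) {
        this.userId = userId;
        this.deviceType = deviceType;
        this.tags = tags;
    }

    @Override
    public String toString() {
        // Produces e.g. UserSession{userId=1, deviceType="ANDROID", tags=[new, vip]}
        return "UserSession{"
                + "userId=" + userId
                + ", deviceType=\"" + deviceType + '"'
                + ", tags=" + Arrays.toString(tags)
                + '}';
    }
}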

Server Log and Exception Text Format

Because the text format of logs and exceptions involves a great many details, because many common practices conflict with one another, and because the Java world has no single best practice, almost no medium or large open-source project (including Java's own source code) keeps its text format consistent; formats are mixed, and which one gets used mostly depends on how the engineer "feels" at that moment.

This section therefore explains which text formats the Turms server uses, and why it does not use some other common formats, to reduce confusion in practice.

Why a Unified Format Matters

For some of these rules you may not notice any difference while reading a single log line, but once you have to scan dozens, hundreds or even thousands of assorted log lines, you will appreciate how much reading effort a standardized, unified text format saves.

Specific Rules

In short:

• Put the important information at the end of the sentence where possible. The important information is usually a variable.
• When the important information is at the end of the sentence, separate it from the rest of the text with ": ". For example, use Could not find the class: my.company.Main rather than The class (my.company.Main) could not be found
• Do not omit the articles a, an and the. This is stressed because most well-known medium and large open-source projects tend to omit them.
• For noun phrases, prefer a restrictive appositive over an attributive noun. For example, restrictive appositive: The collection "message", The setting "turms.whatever.min"; attributive noun: The "message" collection, The "turms.whatever.min" setting
• The purposes and usage of special symbols:

  • Array values: use [ , ].
    • Mid-sentence: use the [value] format, e.g. Detected illegal operations [CREATE, DELETE] on the collection "message"
    • With ": ": use the ": [value]" format, e.g. Detected illegal operations: [CREATE, DELETE]
  • Ranges: use [..] for closed intervals and (..) for open intervals, e.g. [1..2]
  • Information that needs to be set apart for emphasis (objects, enum values, paths, domain names, field references): use ().
    • Mid-sentence: use the (value) format, e.g. The path (/turms/1.txt/) is illegal
    • With ": ": () is unnecessary; use the ": value" format, e.g. Could not find any resource from the path: /turms/1.txt
    • With arrays: () is unnecessary; use the [value] format, e.g. The paths [/1.txt, /2.txt] are illegal
  • Key-value pairs: use {}.
    • Mid-sentence: use the {key=value} format
    • With ": ": use the ": {key=value}" format
    • With arrays: use the [{key=value}, {key=value}] format
  • Names or string values (field names, parameter names, database collection names): use "".
    • Mid-sentence: use the "value" format, e.g. The property "turms.whatever.min" must be greater than 0 and The setting name "abc123" should not contain any digit
    • With ": ": use the ": "value"" format, e.g. Unknown property: "turms.whatever.min"
    • With arrays: use the ["value", "value"] format, e.g. The properties ["turms.whatever.min", "turms.whatever.max"] are unknown
• The difference between a name and a reference

  Start with a relatively easy example, the name versus the reference of a field. Suppose a class com.abc.Song (a song) has a field name. The field's name is name, and when a name is used in a sentence it must be wrapped in double quotes "", as in The field "name" contains illegal characters. The field's reference is com.abc.Song#name, and when a reference is used in a sentence it must be wrapped in parentheses (), as in The field (com.abc.Song#name) should be accessible.

  In real development, however, many strings can be interpreted in more than one way. For example, a class may be named com.my.Main, and that string can be read either as the class's name or as the class's reference. Since class names do not suffer the severe ambiguity that the field names above can, and since most well-known medium and large open-source projects do not wrap class names in "" either, Turms uniformly interprets a class name as the class's reference rather than its name, so class references follow the usage rules for () rather than those for "".

The next subsection will explain why Turms designed things this way and why it does not use some other common designs.

TODO: to be updated later

On the Use of Dependencies

Many libraries are keen on abstracting and wrapping the underlying implementation so that "the internal logic is transparent and users need not care what happens behind the scenes". Such a design is practical for applications with simple logic, tight launch deadlines and no performance ambitions. But the further a project develops and the deeper it optimizes, the more this uncontrollable abstraction layer becomes a stumbling block for troubleshooting, performance optimization and feature customization. Problems caused by abstraction layers include:

• Feature iteration and version updates lag badly. Suppose our project uses dependency A, an abstraction layer that wraps dependency B. To add a feature or fix a bug in B, the usual flow is: we open an issue with B's community; with luck we get a reply in 2~4 days on average; with more luck, they agree to make the change; assuming the change is small, the PR is merged a week later; then perhaps 2 weeks, a month or even several months pass before B finally releases a new version; then we still have to wait for A to bump its B version, which may take another 2 weeks, a month or several months. By the time we can actually use the new feature, months may have passed. More often, though, B's maintainers simply refuse to change the code at all.

• The vast majority of well-known libraries care only about implementing features, not about performance; the attitude is essentially "the feature works, the performance will do". (Turms has solved most of the problems below by refactoring the dependency code.) For example:

  • mongo-java-driver creates large numbers of intermediate objects over and over during API calls, and does not even cache the default configuration objects.
  • Lettuce repeatedly grows buffers when serializing the command arguments passed to Redis, and data that could be cached is not cached.
  • Log4j2 actually reads string data via getBytes and concatenates log records with StringBuilder (by contrast, Turms's logging implementation uses the byte[] value inside String directly and uses Netty's io.netty.buffer.AbstractByteBufAllocator#directBuffer to assemble and output log records). (If you are interested in logging, read the logging-implementation document to see how Turms implements it.)
  • In the official Java implementation of Protobuf, string decoding is also very inefficient: to stay compatible with old Java versions it encodes via char[], while the String of modern Java stores only byte[] internally, so a pointless memory copy is needed (note that strings are the largest data in client requests).
  • Spring is the poster child of inefficient code, for example:
    • When handling UTF-8 strings, org.springframework.core.codec.CharSequenceEncoder allocates the output DirectByteBuffer at 3 bytes per character, so for the roughly 8 KB Prometheus payload above, this step alone allocates about three times the memory actually needed. And Spring is even less efficient than that, because String#getBytes(...) copies the string yet again.
    • When exporting huge heap-dump files, spring-boot-actuator:v2.6.6 astonishingly does not support zero copy (see org.springframework.boot.actuate.management.HeapDumpWebEndpoint.TemporaryFileSystemResource#isFile)
    • Spring AOP is commonly used to proxy Controller-layer method calls, for example to capture the parsed parameters for logging (a WebFilter cannot obtain the parsed parameters). But AOP adds 19 extra stack frames to a method and uses reflection heavily; the time from entering the AOP proxy to reaching the Controller method can exceed Turms's entire internal business processing time (as an aside, AOP is a very poor design here; Spring should have used a chain-of-responsibility design for the Controller layer).

  In short, the code quality of many well-known Java libraries is not high; their performance and quality can be worrying, and reading their source can be alarming. By contrast, readers can look at how the Turms server is coded to push every implementation detail to the limit.

• When abstraction-focused libraries are combined with reactive programming, troubleshooting becomes a hellish experience for developers, especially when a bug involves memory that must be released manually. In ordinary troubleshooting we can usually pinpoint a problem quickly from the stack trace, but in reactive programming that rarely works, and we rely far more on logical deduction: reading the upstream and downstream code thoroughly (including the code inside dependencies) and tracing every path the code might take.

  If the code has few abstraction layers and a flat call graph, this process is easy: a glance over a few dozen lines in a single class is often enough to see roughly what went wrong. But if the flow runs through a pile of "wrapped and abstracted, users need not care about the underlying logic" libraries, the hellish experience begins. Logic that could have lived in a single function of a few dozen lines now, built on abstraction libraries, forces us during troubleshooting to read abstract class A (A1, A2, A3...) -> abstract class B (B1, B2, B3...) -> abstract class C (C1, C2, C3...) -> ..., jumping among dozens of classes and dozens of methods while reasoning it out.

  The most typical contrast: Turms's im.turms.gateway.access.client.websocket.WebSocketServerFactory#getHttpRequestHandler implements a complete set of WebSocket handshake logic in a single function of a few dozen lines. If Spring implemented the same logic, it would cobble together classes from various packages and scatter the logic among them; add some manually released memory to the mix and the troubleshooting experience becomes hellish. What a few dozen lines can solve takes a library like Spring thousands of lines; WebFlux, for instance, carries multiple sets of underlying web implementations internally, all in the name of "wrapping and abstraction so users need not care about the underlying logic".

• Some libraries silently suppress exceptions in places, and the application code above cannot tell. When something goes wrong, the library code and the application code are in most cases running on different stacks, so unless the library supports a global exception callback, the upper layer cannot even tell that an exception occurred. For trivial errors that does not matter, but for exceptions the application deeply cares about (such as an abnormal disconnect of an RPC TCP connection), it becomes the fuse that throws the whole system into disorder.

• The developers of some well-known libraries even lack the most basic security sense. Log4j's developers, for instance, added code that automatically detects a ${jndi} pattern in strings about to be logged and, if present, calls the corresponding JNDI service, with the feature enabled by default. That the developers of a dedicated logging library could lack such basic security awareness, and that the change passed PR review, says a lot.

Building things ourselves, on the other hand, avoids all of the problems above; it increases control over the code while greatly reducing development and troubleshooting difficulty and improving performance and resource utilization.

In summary, when Turms adopts a library it usually pulls in only the implementation library, not the abstraction wrapper (such as Spring). Points in a dependency that need performance or logic improvements are refactored directly inside the Turms project. Weighing the difficulty of building things ourselves against code controllability, in most cases we choose to build in-house whenever feasible.

Note: although the Java ecosystem is prosperous, genuinely high-quality libraries are rare, so most medium and large Java open-source projects that care about performance also tend to build their own functional modules rather than use third-party libraries, for example Elasticsearch, Cassandra and Ignite. In the entire Java ecosystem, the only library whose developers' skill we currently trust is Netty.

Catching and Logging Exceptions

Purpose: understanding the Turms server's principles for catching and logging exceptions helps developers locate an exception quickly and find its root cause.

The most criticized aspect of reactive programming is that exceptions are usually very hard to locate and their stack traces are largely useless. If developers log exceptions haphazardly in a reactive code base, whoever is debugging may not even be able to tell from the logs where an exception was thrown, let alone reconstruct the code path that led to it.

Good exception-logging principles and practices are actually quite simple, and if you follow them, locating an exception usually takes seconds or minutes. The basic principle: the most downstream code throws the exception without logging it; if mid-stream code needs to translate the exception, it translates it and keeps throwing it upward, still without logging; the most upstream code catches the exception and logs it. As for what counts as "most upstream": the code that calls subscribe() does. The principle is easy to apply; exception catching in reactive programming only "looks" complicated. For example, in the turms-service server, the im.turms.service.access.servicerequest.dispatcher.ServiceRequestDispatcher#dispatch0 function contains an operation that "sends notifications to the relevant users based on the result of the Service layer", whose code is as follows:

                                    java
                                    return result
                                             .name(CLIENT_REQUEST_NAME)
                                             .tag(CLIENT_REQUEST_TAG_TYPE, requestType.name())
                                             .metrics()
                                    @@ -60,7 +60,7 @@
                                                         });
                                             })
                                             ...

As described above, this code delivers notifications via the notifyRelatedUsersOfAction function. We do not care about its internal implementation; we only need to make sure that the most upstream code, via subscribe(...), catches any exception it may throw and logs it.
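A minimal sketch of the principle (the most downstream code throws, mid-stream code translates, the subscriber logs), assuming Reactor is on the classpath and using made-up method names rather than Turms's actual classes:

java
import reactor.core.publisher.Mono;

public final class ExceptionFlowExample {

    // Most downstream: just signal the error (never log here).
    static Mono<Void> sendNotification(long recipientId) {
        return Mono.error(new IllegalStateException("The recipient is offline: " + recipientId));
    }

    // Mid-stream: translate the exception if needed and keep propagating, still no logging.
    static Mono<Void> notifyRelatedUsers(long recipientId) {
        return sendNotification(recipientId)
                .onErrorMap(e -> new RuntimeException("Failed to notify the user: " + recipientId, e));
    }

    public static void main(String[] args) {
        // Most upstream: the code that subscribes catches and logs.
        notifyRelatedUsers(123L)
                .subscribe(unused -> {},
                        e -> System.err.println("Caught at the top: " + e));
    }
}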

Only Custom Exceptions That Extend RuntimeException

In the Turms server projects, custom exception classes extend RuntimeException and nothing else; defining custom exceptions that extend Exception (checked exceptions) is forbidden.

Whether to use checked or unchecked exceptions is still debated, but plenty of articles now flatly criticize checked exceptions as a design failure of Java; later languages such as Kotlin, Scala and C# do not even have the concept, and most well-known medium and large open-source projects today define only RuntimeException subclasses rather than checked-exception subclasses.

Common reasons why checked exceptions are considered bad design include:

• For a third-party library or downstream code, checked exceptions create versioning and compatibility problems for interface signatures.

• In a medium or large project where every submodule uses checked exceptions, an upstream interface can end up declaring dozens of exceptions, and any addition, removal or change to those declarations ripples through everything.

• Java's own code has conflicting exception designs. The lambdas in the Java Streams design, for example, cannot throw checked exceptions themselves, so a lambda inside a Stream must either handle the exception on the spot (usually a bad practice) or convert it into an unchecked exception (losing the point of checked exceptions); Java even had to introduce UncheckedIOException because of this.

• In practice, people routinely defeat the purpose checked exceptions were designed for, to the point where not using them at all would be better, for example:

  • Catching Exception wholesale
  • Translating the checked exception into a RuntimeException, as in try { ... } catch (Exception e) { throw new RuntimeException(e); }
  • Because the call stack is deep, catching the exception pointlessly downstream to avoid polluting upstream code, sometimes even the outright wrong catch (Exception e) { /* do nothing */ }
• Many developers misunderstand exception design and then define custom exceptions wrongly. A common belief is to use a RuntimeException subclass for exceptions the upstream code can avoid, and a checked exception for exceptions it cannot. Views like that are blindly optimistic and betray a lack of real project and coding experience, because whether an exception thrown downstream can be handled depends on the upstream code's logic, not on the downstream code's guesswork.

  For example, when the Turms server's plugin module loads a plugin, the plugin's class loader may throw NoClassDefFoundError. By the early Java team's wording, "An Error is a subclass of Throwable that indicates serious problems that a reasonable application should not try to catch", the code above the plugin module should not catch the Error. But Turms, as a server, cannot be allowed to fail just because it loaded a plugin with a broken class, so the truly reasonable thing for the upstream code to do is to catch such Errors rather than let the server crash into an abnormal state.

For the Turms server projects, the only scenario where checked exceptions genuinely help is this: occasionally, when designing a downstream module, it is already known that the upstream caller must distinguish among the various exceptions the downstream throws, and a checked exception guarantees that the upstream does not forget to handle any of them. But such scenarios are very rare, and designing downstream code around the upstream caller's logic is itself a very bad practice.

Therefore, to avoid the problems that checked exceptions bring, to unify the exception design style, and to avoid wasting time on inconsequential arguments like "these modules are of the same kind, so why does module A use one kind of exception and module B another", the Turms server projects define custom exceptions that extend only RuntimeException and forbid custom exceptions that extend Exception (checked exceptions).
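A minimal sketch of the convention; the exception and loader classes are made up, not Turms's real ones:

java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Custom exceptions extend RuntimeException only; checked exceptions from
// lower layers are wrapped instead of being declared on method signatures.
final class PluginLoadException extends RuntimeException {
    PluginLoadException(String message, Throwable cause) {
        super(message, cause);
    }
}

final class PluginLoader {
    static InputStream openPlugin(Path path) {
        try {
            return Files.newInputStream(path);
        } catch (IOException e) {
            throw new PluginLoadException("Failed to open the plugin: " + path, e);
        }
    }
}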

                                    - + \ No newline at end of file diff --git a/docs/zh-CN/server/development/testing.html b/docs/zh-CN/server/development/testing.html index a4f48d69..a45d8203 100644 --- a/docs/zh-CN/server/development/testing.html +++ b/docs/zh-CN/server/development/testing.html @@ -17,8 +17,8 @@ -

Testing

About Load Testing

Why the Turms Server Does Not Provide Benchmark Reports

For two simple Java functions that do the same thing, we can easily benchmark them with JMH. For a somewhat larger project like Turms, no such silver bullet exists. The complexity shows mainly in the following aspects:

1. Turms supports several different architectures, and many features can be switched on or off.

  For example, the configuration-parameters chapter mentions that some use cases do not even need data storage; all else being equal, an application that stores nothing naturally has higher throughput than one that stores data.

  Similarly, as described in the chapter on message delivery, ordering and duplication, the Turms server disables guaranteed (100%) message delivery by default, because the guarantee has a cost: it needs at least one Redis server to hand out session-level sequence numbers, and every message sent must request one, so throughput is naturally lower than without the guarantee.

  Another example is whether the server pushes user-status updates when a user logs in; when no push is needed, the server's load is naturally far lower than when it is.

  Yet another example: as the observability-logging chapter explains, the Turms server by default, and by recommendation, samples 100% of user requests for logging, and 100% sampling requires a lot of I/O, so throughput naturally cannot match a setup with no sampling at all.

2. For the vast majority of business requests, most of the queries the Turms server sends to MongoDB are for permission checks and data validation, and only a small part actually performs the business operation the user asked for. For example, the operation "user A bans user B in group 123" must validate the states of user A, user B and group 123, and only bans user B once all checks pass. Skipping these checks would of course yield much higher throughput, but no real project skips them; only toy projects do.

3. When deleting a piece of business data, the Turms server usually deletes the related data inside a distributed transaction. Skipping distributed transactions naturally gives higher throughput, but it easily produces dirty data.

4. Turms provides many caching features and will support more in the future. Caching is the classic trade of space for time. Take the group-member cache: when a group member sends a message, the Turms server needs the group's member list. With the cache, the server can query a local Map, giving far higher throughput than sending a query to the database, while the uncached setup has the advantage that the member list is more up to date.

5. Single-node and distributed benchmark results are also completely different. The Turms server will even support Unix domain sockets for single-node deployments, so that no TCP connection is needed at all.

In summary, if Turms only wanted a benchmark report that looks good, the server could skip all data storage, drop guaranteed delivery, skip user-status pushes, skip permission checks, data validation and log sampling for user requests, run every business operation without transactions, cache all data for long periods, and so on. The resulting throughput would of course be very high, but such a report would be a castle in the air, since hardly any real deployment runs that configuration. This is both why we, as developers of a medium-to-large application, are reluctant to publish simplistic benchmark reports, and why we rarely trust the benchmark reports that other medium and large applications publish.

When we look at any application's performance, fast or slow, the real question is "why is it this fast/slow?". For instance, when studying why the JVM uses so much memory, if we only look at Java's highly redundant and ubiquitous object headers, we conclude "so it's a redundancy problem, the author's poor design, no wonder it uses so much memory"; but once we also look at how the JVM designs and uses the code cache, we conclude instead "so it's space traded for time, the author's painstaking care, no wonder it uses so much memory". The verdicts point in opposite directions.

In the end, laymen watch the show while insiders watch the craft. Even for a small Java library, let alone a medium or large application, we cannot fully trust its performance report. Log4j2, for example, shows excellent numbers in its performance report, but reading its source reveals that its implementation is not actually efficient; the implementation that really pushes performance to the limit is Turms's own Netty-based logging (see the design document and the code at im.turms.server.common.logging.core.logger.AsyncLogger#doLog). Comparing the two sources shows they are not in the same league of optimization. To help users see the craft for themselves, Turms keeps its documentation detailed and points out where the key code lives, so users can judge for themselves whether Turms suits their scenario.

The turms-performance-testing Project (Preview Documentation)

Although Turms has no plan to publish ready-made benchmark reports, we will soon build a custom distributed load-testing platform for the Turms server. Its UI and report analysis will be handled by turms-admin, while node management and task execution will be handled by the Controller nodes and Agent nodes of turms-performance-testing respectively.

Worth noting: Turms can customize and develop so many platforms quickly thanks to the "controllability" mentioned in the reasons for redevelopment based on Turms: "the Turms project is 100% open source and implements much of its foundational middleware itself, which keeps the underlying technology under control and prevents the project from running out of momentum later on". New projects are therefore never held hostage by third-party dependencies, and we have plenty of momentum.


                                    + \ No newline at end of file diff --git a/docs/zh-CN/server/module/anti-spam.html b/docs/zh-CN/server/module/anti-spam.html index 9da77c12..bad3c35f 100644 --- a/docs/zh-CN/server/module/anti-spam.html +++ b/docs/zh-CN/server/module/anti-spam.html @@ -17,7 +17,7 @@ -

Sensitive-Word Filtering

Turms does not support, and will not support, anti-spam detection for images, video or audio; everything below concerns text detection only.

Feature Comparison

Realistically, the biggest strengths of commercial sensitive-word filtering are a rich word list, timely updates, and multi-language support; its main drawbacks are per-detection pricing and a network request for every check. The biggest strengths of turms-plugin-antispam are that it is free and performs extremely fast local detection, needing only a single pass over the target string; its main drawback is that it provides no word list. In detail:

Note in particular: given that abuse operations objectively exist, the real cost of "pay per detection" may be larger than you expect.

Commercial anti-spam services (including sensitive-word filtering) vs turms-plugin-antispam (each item lists the commercial services first, then turms-plugin-antispam):

• Free: no, billed per detection / yes
• Open source: no, fully closed source / yes, fully open source
• Matching speed: requires a network request per check, orders of magnitude slower than turms-plugin-antispam / extremely fast local matching (an Aho-Corasick automaton built on a double-array trie); the matching overhead is negligible.
  In NORMALIZATION mode, matching is O(n), where n is the length of the input string.
  In NORMALIZATION_TRANSLITERATION mode, transliteration is O(n), where n is the length of the input string, and matching the transliterated result is O(m), where m is the length of the transliterated string.
  Note: transliterating Chinese characters means converting them to pinyin
• Text denoising (e.g. stripping punctuation, normalizing letters and digits): partially supported / partially supported
• Similar-glyph matching (e.g. "Martian" text): partially supported / TODO (planned for 1.1)
• Split-character matching: partially supported / TODO (planned for 1.2)
• Exact near-homophone matching: supported / supported
• Fuzzy near-homophone matching: supported / TODO (planned for 1.1)
• Polyphone matching: supported / TODO (planned for 1.1)
• Word list: closed source, but rich and updated promptly / not provided; see below for the reasons
• Multi-language/dialect support: many languages and dialects supported / users must collect their own word lists. Some projects also call a "translation API" to translate the source language into one specific language before matching, but turms-plugin-antispam does not provide such an implementation
• Rare-character support: partially supported / partially supported. turms-plugin-antispam recognizes code points within the Unicode Basic Multilingual Plane (BMP), covering more than twenty thousand Chinese characters (the latest edition of the Xinhua Dictionary includes only a bit over ten thousand).
  Since most IM applications do not need to display extremely rare characters (such as "𤳵"), we recommend that your UI front end simply replace code points outside the BMP with a placeholder such as "?".
  turms-plugin-antispam has no plan to support code points outside the BMP
• Compound sensitive words: supported / TODO (planned for 1.1)
• Vertical text detection: not supported / not supported
• Querying word-list metadata: rich metadata, such as the category of a sensitive word (pornography, politics, terrorism, prohibited content, abuse, flooding, advertising, advertising-law violations, values, etc.) / TODO (1.0). Even after Turms supports this feature, Turms still will not provide a word list
• Allowlist: supported / TODO (planned for 1.1)
• Region-specific services: partially supported / not supported
• Manual review system: partially supported / not supported

                                    敏感词检测的复杂性

                                    • 并不是什么文本都能检测的。以字符串“Turms是一个优秀的IM开源项目”为例,如果我们采用常规的竖排明文显示。那么如果敏感词检测系统不支持特征提取,那么该系统就无法检测该类文本:

                                      text
                                      ╔═╤═╤═╤═╤═╗
                                      +    
                                      Skip to content

                                      敏感词过滤

                                      Turms不支持且未来也不会支持图片、视频与语音的反垃圾检测功能,下文所有内容仅在文本检测范围内进行说明。

                                      功能特性对比

                                      结合现实情况,商用敏感词过滤功能的最大优点是:词库丰富,更新及时,支持多语言。最主要缺点是:按检测次数收费、每次检测都需要发送网络请求;turms-plugin-antispam的最大优点是:免费、本地极速检测,只需遍历一遍目标串。最主要缺点是:不提供词库。具体而言:

                                      特别一提的是:由于黑产的客观存在,“按检测次数收费”的实际开销可能会比您预期的开销大。

对比项 | 商业反垃圾服务(含敏感词过滤) | turms-plugin-antispam
免费 | 否。按检测次数收费 | 是
开源 | 否。完全闭源 | 是。完全开源
匹配速度 | 需要发送网络请求,比turms-plugin-antispam的匹配速度慢了几个数量级 | 本地极速匹配(基于双数组Trie的AC自动机算法实现),您可以忽略匹配时带来的性能开销。在NORMALIZATION模式下,匹配的时间复杂度为O(n),n为输入字符串长度;在NORMALIZATION_TRANSLITERATION模式下,音译的时间复杂度为O(n),n为输入字符串长度,匹配音译结果的时间复杂度为O(m),m为音译结果字符串长度。补充:汉字音译指将汉字转换成拼音
文本去噪(如去标点符号、字母与数字标准化) | 部分支持 | 部分支持
形近字匹配(如火星文) | 部分支持 | TODO(1.1支持)
拆字匹配 | 部分支持 | TODO(1.2支持)
音近字精确匹配 | 支持 | 支持
音近字模糊匹配 | 支持 | TODO(1.1支持)
多音字匹配 | 支持 | TODO(1.1支持)
词库 | 闭源,但是词库丰富,更新及时 | 不提供。具体原因见下文
多语言/方言支持 | 支持多种语言与方言 | 需要用户自行采集词库。另外,也有项目通过调用“翻译API”,将源语言翻译成某特定语言再进行匹配,但turms-plugin-antispam不提供该类实现
生僻字支持 | 部分支持 | 部分支持。turms-plugin-antispam能够识别Unicode基本多语言平面(BMP)内的code points,支持识别两万多个汉字(《新华字典》最新版仅收录一万多个汉字)。由于大部分IM应用都不要求一定能显示特别生僻的字(如“𤳵”字),建议您的UI前端应用直接用如“?”的占位符对BMP之外的code points进行替换。turms-plugin-antispam没有计划支持BMP以外的code points
组合敏感词 | 支持 | TODO(1.1支持)
文字竖排检测 | 不支持 | 不支持
查询词库附加信息 | 附加信息丰富。如敏感词类别(涉黄、涉政、暴恐、违禁、谩骂、灌水、广告、广告法、涉价值观等) | TODO(1.0)。另外,虽然Turms之后会支持该功能,但Turms依旧不提供敏感词库
白名单 | 支持 | TODO(1.1支持)
地区差异化服务 | 部分支持 | 不支持
人工审核系统 | 部分支持 | 不支持

                                      敏感词检测的复杂性

• 并不是什么文本都能检测的。以字符串“Turms是一个优秀的IM开源项目”为例,如果我们采用常规的竖排明文显示,而敏感词检测系统又不支持特征提取,那么该系统就无法检测出该类文本:

                                        text
                                        ╔═╤═╤═╤═╤═╗
                                         ║┊│项│的│是│T║
                                         ║┊│目│I│一│u║
                                         ║┊│┊│M│个│r║
                                        @@ -40,7 +40,7 @@
                                         
                                         안녕하세요,,,,,,,,,,,,,,,,,,,,,,,,,,,
                                         こんにちは

                                        配置讲解

                                        配置类:im.turms.plugin.antispam.property.AntiSpamProperties

                                        配置前缀:turms.plugin.antispam

                                        配置项

配置名 | 默认值 | 作用
enabled | true | 是否启动反垃圾功能
dictParsing.binFilePath | null | 词库的二进制文件路径。该文件保存了词库文本解析后的数据,用于避免每次服务端启动时都从头解析词库文本。如果用户同时配置了“textFilePath”与“binFilePath”,则会优先使用“binFilePath”
dictParsing.textFilePath | null | 词库的文本文件路径
dictParsing.textFileCharset | "UTF-8" | 词库编码格式。推荐统一使用“UTF-8”编码
dictParsing.skipInvalidCharacter | true | 解析词库文本时,是否自动跳过非法字符。如果为false且在解析过程中遇到非法字符,则会抛出异常
dictParsing.extendedWords.enabled | true | 是否需要支持拓展词库功能。如果为true,则解析并使用词库中的所有数据;如果为false,则仅解析与使用word字段数据,以大幅度减少内存开销
textParsingStrategy | NORMALIZATION_TRANSLITERATION | 词典文本与用户输入文本的解析策略。NORMALIZATION:对输入文本进行标准化,如:⑩HELLO(你{}好./ -> 10hello你好;NORMALIZATION_TRANSLITERATION:对输入文本进行标准化并音译,如:⑩HELLO(你{}好./ -> 10hellonihao
unwantedWordHandleStrategy | REJECT_REQUEST | 非法文本处理策略。REJECT_REQUEST:向客户端返回“MESSAGE_IS_ILLEGAL”错误状态码;MASK_TEXT:替换非法字符,并继续正常处理请求
mask | '*' | 当“unwantedWordHandleStrategy”为“MASK_TEXT”时,所采用的掩码字符
maxNumberOfUnwantedWordsToReturn | 0 | 当处理策略为REJECT_REQUEST且该值大于0时,被检测为非法文本的字符串,将以ASCII 0x1E(Record Separator)字符作为分隔符,通过异常的描述字符串来表示。该异常文本最终会被客户端接收
textTypes | 所有其他用户可见的文本 | 配置哪些请求的哪些文本字段需要进行检测
silentIllegalTextTypes | | 配置当检测到这些请求的这些文本字段包含非法字符时,服务端会以“OK”状态码响应客户端,但实际并没有继续处理该请求。在实际业务场景中,该值除了通常为空外,还通常为CREATE_MESSAGE_REQUEST_TEXT,用于静默拒绝用户发送的消息

                                        Admin API

                                        TODO

                                        不使用其他开源实现的原因

在全球开源圈子内,目前可找到的开源实现的质量都非常低,主要体现在:代码质量低(高空间复杂度与时间复杂度)、很多匹配功能都不支持、作者不具备工程设计能力,甚至还有收费的半开源IM项目通过遍历词库来进行匹配。暂未有像turms-plugin-antispam这样算法与代码质量都优秀的实现,且传统反垃圾方案(不涉及机器学习)的总体实现难度不大,因此Turms选择自研,也为后期众多拓展做足准备。具体而言:

                                        • 会算法的不会工程设计,会工程设计的不会算法。一方面,实现基于双数组Trie的AC自动机算法的难度较高,且Java的数据结构设计的都比较保守,如StringStringBuilder为了保证内部数据与外部数据隔离,很多函数都会涉及内存拷贝工作,能够在算法实现中避开各种Java的“坑”就需要工程师有基本的优化意识。另一方面,Turms里的反垃圾设计与算法实现的逻辑都是统一的,都是为了Turms这个IM项目设计的,为实际IM需求服务的。因此能保证“能想到的功能就能做到,不需要的功能就不需要提供,以免不必要的时间与空间开销”。
• 自研可以根据项目需求,定制算法实现与算法的上下游代码,以保证绝对的高效(把空间复杂度压到O(1),时间复杂度压到O(n),保证遍历一遍字符串即可完成敏感词匹配)。举个例子,在AC自动机标准算法实现中,并没有涉及“跳过某字符进行匹配”的逻辑。那么如果我们想要实现“只检测BMP内的code points”,就需要在把原始char[]传递给标准AC算法实现之前,先自行过滤并拷贝一个新的char[],再传递给AC自动机进行匹配。这种频繁的内存拷贝工作无疑是非常低效且不必要的,尤其是“用户文本消息”本身就是所有用户请求中最占内存也是出现最频繁的数据。而采用定制实现的话,我们只需在AC自动机进行匹配时,加一个if判断条件直接跳过该字符即可(见下方示意代码)。既实现简单清晰,又无需开辟新的内存空间,空间效率高。
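下面给出一个仅作示意的Java草稿(并非turms-plugin-antispam的真实实现,其中的AhoCorasickAutomaton接口及nextState、isMatched等名字均为假设),用于说明“在匹配循环中通过一个if判断直接跳过BMP之外的code point(即Java中的代理项字符),而无需先过滤并拷贝新的char[]”这一思路:

java
// 仅作示意:在匹配循环中直接跳过BMP之外的code point(代理项字符),
// 而不是先拷贝出过滤后的char[]再交给自动机。
// AhoCorasickAutomaton及其方法均为假设的名字,并非Turms的真实实现。
public final class SkippingMatcherExample {

    public static boolean containsUnwantedWord(AhoCorasickAutomaton automaton, char[] text) {
        int state = AhoCorasickAutomaton.ROOT_STATE;
        for (char c : text) {
            // BMP之外的code point在Java字符串中以代理项对(surrogate pair)表示,
            // 这里直接跳过,无需开辟新的内存空间
            if (Character.isSurrogate(c)) {
                continue;
            }
            state = automaton.nextState(state, c);
            if (automaton.isMatched(state)) {
                return true;
            }
        }
        return false;
    }

    // 假设的自动机接口,仅用于说明调用方式
    interface AhoCorasickAutomaton {
        int ROOT_STATE = 0;

        int nextState(int currentState, char c);

        boolean isMatched(int state);
    }
}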

                                      聊天机器人

                                      turms-plugin-rasa

                                      简介

                                      turms-plugin-rasa是一个基于开源对话式AI框架Rasa而开发的turms-service聊天机器人实现插件。

                                      turms-plugin-rasa的工作流程很简单,即:将用户发送的消息转发给Rasa服务端,等Rasa服务端返回响应后,Turms服务端再将响应以消息的形式发送给用户。

                                      安装

                                      配置

配置项 | 默认值 | 说明
turms-plugin.rasa.enabled | true | 是否启动插件
turms-plugin.rasa.instances[?].chatbot-user-id | 0 | 当用户发送消息给该用户ID时,将消息转发给Rasa服务端
turms-plugin.rasa.instances[?].url | http://localhost:5005/webhooks/rest/webhook | 用于接收用户消息的Rasa服务端地址
turms-plugin.rasa.instances[?].request.timeoutMillis | 60_000 | 请求超时时长(毫秒)
turms-plugin.rasa.instances[?].response.format | PLAIN | 为PLAIN时,Rasa服务端响应中的text文本字段将会被直接作为消息,发送给用户;为JSON时,Rasa服务端响应会先被序列化成JSON格式文本,再作为消息,发送给用户。JSON具体格式见下文
turms-plugin.rasa.instances[?].response.delimiter | \n | 当上述format为PLAIN,且用户发送给Rasa服务端一条消息,而Rasa服务端返回多个响应时,使用该字符串作为响应text文本字段之间的分隔符
turms-plugin.rasa.instances[?].response.persist | DEFAULT | 是否存储基于Rasa服务端响应生成的消息。为TRUE时,表示存储;为FALSE时,表示不存储;为DEFAULT时,表示基于属性turms.service.message.persist-message判断

                                      发送给用户的消息的JSON文本格式为:

                                      json
                                      [
                                           {
                                               "text": <string>,
                                               "image": <string>
                                      @@ -30,7 +30,7 @@
                                           },
                                           ...
                                       ]
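下面给出一个仅作示意的Java草稿,演示“把用户消息POST给Rasa的REST webhook,并拿到其响应”这一转发流程;它并非turms-plugin-rasa的真实实现,其中手写字符串拼接JSON只是为了少引入依赖,请求与响应的字段以Rasa REST channel的公开约定为准:

java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// 仅作示意:把用户消息转发给Rasa的REST webhook。
// 真实实现会解析返回的JSON数组,并按response.format与response.delimiter
// 决定以PLAIN文本还是JSON文本的形式把结果作为消息发送给用户。
public final class RasaForwardExample {

    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    public static String forward(String rasaUrl, long senderId, String text) throws Exception {
        String requestBody = "{\"sender\":\"" + senderId + "\",\"message\":\"" + text + "\"}";
        HttpRequest request = HttpRequest.newBuilder(URI.create(rasaUrl))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(requestBody))
                .build();
        HttpResponse<String> response = CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        // Rasa通常返回形如[{"recipient_id":"123","text":"..."}]的JSON数组,
        // 这里直接返回原始响应体,仅演示转发流程
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        String reply = forward("http://localhost:5005/webhooks/rest/webhook", 123L, "你好");
        System.out.println(reply);
    }
}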

                                      集群的设计与实现

                                      Turms的集群代码实现比较清晰,也很容易理解。代码实现包为:src/main/java/im/turms/server/common/infra/cluster;配置包为:src/main/java/im/turms/server/common/infra/property/env/common/cluster

                                      纯自研的原因

对比项 | 自研 | 第三方服务
定制化功能 | Turms有很多定制化的细节需求,各功能环环相扣,自研的话可以保证新需求立马实现,完成一个新需求所需的时间大致为5~60分钟,也无需写Hacky代码 | 别人不一定给做定制化功能。就算给做,通常也是几周、几个月、甚至几年后,才把新功能发布到新版本。如此低的效率是绝对不能接受的
学习难度 | 服务划分清晰,代码精简,方便快速学习与掌握。花10~30分钟对有点基础的新人进行培训,新人即可掌握Turms集群服务 | 像ZooKeeper或Eureka这样仅仅关于微服务某个功能的项目,其源码就已经远远多于Turms下述的六大服务源码之和。且第三方服务项目还涉及一些相对复杂但对Turms又完全无用的累赘功能,如ZooKeeper的Zab协议,徒增学习难度。要想把握其实现细节需要大量的实践与源码阅读
实现难度 | 集群服务代码实现难度低。举例而言,Turms的集群服务实现的难度总和远远低于“敏感词过滤”功能里提到的“基于双数组Trie的AC自动机算法”。另外,其实集群服务的实现难度也远低于IM业务逻辑的实现 | 需要针对第三方服务特点,编写Adaptor代码。难度虽低,但由于第三方服务的源码复杂,要想保证Adaptor代码永远如预期那样执行并不是件容易的事情(利用学习各种第三方服务的源码+编写Adaptor代码的时间,已经能从零自研几套集群服务实现了)
部署与运维难度 | 在Turms的集群服务中,仅有“配置中心服务”与“服务注册中心”需要MongoDB服务进行部署,二者共用MongoDB服务。因此:1. 由于业务数据存储也采用MongoDB服务,运维人员可以选择共用一个MongoDB服务,无需额外部署;2. 国内外云厂商均提供MongoDB部署服务,仅需点点鼠标即可部署单例或集群MongoDB服务,并且直接实现同城容灾 | 国内外云厂商支持的“配置中心服务”与“服务注册中心”服务大多与具体厂商相绑定,部署灵活性极差。另一方面,如果Turms采用诸如Eureka这样的开源方案,由于各厂商又不提供Eureka这类开源方案的云服务,运维人员还得自己采购云服务器进行部署与运维,相比自研方案,极大地增加了运维难度
性能 | Turms能结合业务代码特点,让集群服务的实现互相照应配合,保证整个流程下来无冗余数据的生成。同时所有网络操作都基于Netty实现,性能极高 | 由于第三方服务都是基于通用需求设计的,我们一方面要写大量的Adaptor代码,徒增资源开销,也增加了学习难度;另一方面,其自身的实现无法保证极致的高效,甚至还有些服务竟然还使用阻塞API

                                      综上,使用第三方服务几乎无任何优势可言,因此Turms采用纯自研方案。另外,其实稍有实力且稍有点定制化需求的公司都会选择自研,原因同上。

                                      节点

                                      实现类:im.turms.server.common.infra.cluster.node.Node

                                      配置类:im.turms.server.common.infra.property.env.common.cluster.NodeProperties

                                      每个服务端有且仅有一个节点类实例。节点类对内管理节点信息与节点生命周期事件,并调度各节点服务。对外承接用户自定义配置,并暴露节点服务与提供一些常用的Util函数供业务实现代码使用。

                                      服务

                                      分布式配置中心服务(Config)

                                      服务类:im.turms.server.common.infra.cluster.service.config.SharedConfigService

                                      配置类:im.turms.server.common.infra.property.env.common.cluster.SharedConfigProperties

如今微服务领域的基础服务实现方案百花齐放。以配置中心的实现方案为例,其实现方案就有:K8S的ConfigMaps、云服务厂商的配置服务(如AWS的AppConfig)、开源实现(如ZooKeeper)。Turms作为一个技术中立的开源项目,其技术栈绝不能被厂商所绑定;但与此同时,又要保证这些实现能够很方便地获得云服务厂商的支持,让运维人员“点点鼠标就能完成部署”;同时还要满足容灾、高可用、可监控、易操作等多种关键特性。因此Turms基于MongoDB自研了配置中心实现,以满足上述的所有要求。

                                      具体配置的增删改查操作实现即为常规的MongoDB数据库的增删改查操作,非常常规,故不赘述。唯一值得特别注意的是:Turms通过MongoDB的Change Stream机制来监听配置的变化,而官方客户端实现mongo-java-driver采用轮询机制来监听配置变化,而不是MongoDB服务端主动通知MongoDB的客户端。
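下面给出一个仅作示意的Java草稿,演示如何用mongo-java-driver通过Change Stream监听一个配置集合的变更事件;其中的数据库名"turms-config"与集合名"shared-config"均为假设,驱动用法与Turms的真实实现并不相同:

java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

// 仅作示意:通过Change Stream监听配置集合的增删改事件。
// 数据库名与集合名均为假设,并非Turms的真实命名。
public final class ConfigWatchExample {

    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> collection = client
                    .getDatabase("turms-config")
                    .getCollection("shared-config");
            // watch()返回的Change Stream在驱动内部由轮询实现,
            // 每当配置文档发生变化时,这里就会收到一条变更事件
            collection.watch().forEach(change ->
                    System.out.println("operation=" + change.getOperationType()
                            + ", document=" + change.getFullDocument()));
        }
    }
}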

                                      补充:

                                      • 因为服务注册中心的“服务信息”本质上来说也是一种配置,因此下述的服务注册与发现也是基于该配置中心实现的。
                                      • MongoDB集群自身的配置中心也是基于MongoDB服务端,即Config服务端实现的。

                                      相关Admin API

                                      TODO

                                      服务注册与发现服务(Discovery)

                                      服务类:im.turms.server.common.infra.cluster.service.discovery.DiscoveryService

                                      配置类:im.turms.server.common.infra.property.env.common.cluster.DiscoveryProperties

                                      职责

                                      该服务主要负责:

                                      • 尽最大努力,保证当前节点注册在服务注册中心中。每个节点在服务端启动时,都会向服务注册中心注册当前节点的信息。如果启动时注册失败(如该节点信息已被注册),则会主动关闭服务端进程并报告失败的异常信息。如果在节点运行过程中,其注册信息被服务注册中心异常删除(如管理员错误地删除了数据),则该节点会自动重新注册其信息
                                      • 服务端被优雅关闭时,在服务注册中心删除当前节点的注册信息。注意:如果服务端被强制关闭(如系统直接断电时),则该节点的注册信息并不会由当前节点删除,而是在服务注册中心检测到60秒的心跳超时后,自动移除其注册信息。另外,这期间其他节点仍会不断地尝试与该节点建立TCP连接,直到其注册信息被服务注册中心移除
                                      • 监听服务注册中心的节点增删改事件,以通知“网络连接服务”去连接或断开对应的TCP连接
                                      • 选举Leader

                                      注册节点的记录格式

                                      注册节点的记录格式一共有两种类型:Member与Leader

                                      Member

                                      类:im.turms.server.common.infra.cluster.service.config.domain.discovery.Member

字段类别 | 字段名 | 描述
Key | clusterId | 集群ID
Key | nodeId | 节点ID
一般信息 | zone | 节点所在区域。用于充当雪花ID算法中的数据中心ID
一般信息 | nodeVersion | 节点版本号。用来保证节点之间的操作能够版本兼容
一般信息 | nodeType | 节点类型。用来保证RPC请求能够被发送给正确的节点
一般信息 | isSeed | 如果一个节点的lastHeartbeatDate超时60秒,且isSeed为false,则该节点会被自动移出服务注册中心。如果isSeed为true,则就算心跳超时,该节点也不会被移除
一般信息 | registrationDate | 节点注册时间
一般信息 | isLeaderEligible | 用于判断节点是否可以参与选举
一般信息 | priority | 优先级。主要用于在Leader选举时,priority值更高的节点能被优先选举为Leader
RPC地址信息 | memberHost | RPC主机号。用于保证其他节点能够通过该主机号,与其进行通信
RPC地址信息 | memberPort | RPC端口号。用于保证其他节点能够通过该端口号,与其进行通信
补充地址信息 | adminApiAddress | 无实际作用。仅用于管理员能够通过Admin API得知Admin API的地址信息
补充地址信息 | wsAddress | 无实际作用。仅用于管理员能够通过Admin API得知客户端WebSocket服务的地址信息
补充地址信息 | tcpAddress | 无实际作用。仅用于管理员能够通过Admin API得知客户端TCP服务的地址信息
补充地址信息 | udpAddress | 无实际作用。仅用于管理员能够通过Admin API得知客户端UDP服务的地址信息
状态信息 | hasJoinedCluster | 为True时表示该节点成功完成心跳刷新操作。该字段并无实际作用,仅仅作为指示器表明节点心跳健康状态。即便一个节点处于不健康状态,它仍然可以处理客户端请求。另外,各集群节点的该字段值由Leader节点根据各节点的lastHeartbeatDate进行更新
状态信息 | isHealthy | 为False时拒绝服务。具体而言:如果是turms-gateway服务端,则拒绝新会话的建立与用户请求的处理;如果是turms-service服务端,则拒绝处理turms-gateway服务端发来的RPC请求;在RPC发送端挑选RPC响应服务端时,只从健康的节点中进行选择
状态信息 | isActive | 为False时表明禁止该节点处理客户端请求。该字段的值有且仅能通过Admin API进行更新。可用于在灰度发布时,先将节点逐步断流,再进行停机更新操作
状态信息 | lastHeartbeatDate | 记录上一次心跳刷新时间,用于Leader节点根据该值更新hasJoinedCluster信息

Leader

字段类别 | 字段名 | 描述
Key | clusterId | 集群ID
Key | nodeId | Leader节点ID
一般信息 | renewDate | 租约刷新时间。如果超过60秒未进行刷新,则服务注册中心会自动删除该Leader记录信息
一般信息 | generation | 代。主要用于拒绝前代Leader因为没有检测到新Leader的诞生,而尝试进行续约操作的行为

                                      Leader选举

                                      节点参与选举的条件:

                                      • 节点类型必须为turms-service,而不是turms-gateway。这是因为一些Leader行为只能由turms-service执行,turms-gateway没有能力执行这些操作。
                                      • im.turms.server.common.infra.property.env.common.cluster.NodeProperties#leaderEligibletrue(默认为true
                                      • 节点状态必须为active
                                      自动选举

                                      每个具有选举资格的节点:1. 在服务端启动时;2. 在通过Change Stream监听到服务注册中心的Leader信息被删除时;3. 发现自己的isLeaderEligible信息由False变为True时:

                                      当前节点首先会拉取此时此刻服务注册中心内的所有节点信息,并找出一批priority最高的且具有选举资格的节点。如果当前节点在这批节点内,且本地的节点信息快照中不存在Leader,则向服务注册中心发送Leader注册请求,尝试将自己选为Leader。如果服务注册中心确实不存在Leader,则注册成功。否则,注册失败。

                                      注意:如果一个priority更高的节点加入集群中,该节点并不会抢夺Leader角色。
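下面给出一个仅作示意的Java草稿,演示“尝试将自己注册为Leader”的核心思想:Leader记录以clusterId作为唯一键,多个节点同时尝试插入时,只有第一个插入成功的节点当选,其余节点会因为唯一键冲突而注册失败。其中的集合名与字段组织方式均为假设,并非Turms的真实数据模型:

java
import java.util.Date;

import com.mongodb.MongoWriteException;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

// 仅作示意:以clusterId作为_id(唯一键)插入Leader记录,实现“谁先插入成功谁当选”。
public final class LeaderElectionExample {

    public static boolean tryRegisterAsLeader(MongoCollection<Document> leaderCollection,
                                              String clusterId,
                                              String nodeId) {
        Document leader = new Document("_id", clusterId)
                .append("nodeId", nodeId)
                .append("renewDate", new Date())
                .append("generation", 0);
        try {
            // 如果已存在相同clusterId的Leader记录,插入会因唯一键冲突而失败
            leaderCollection.insertOne(leader);
            return true;
        } catch (MongoWriteException e) {
            // 已有其他节点当选Leader,本节点注册失败
            return false;
        }
    }
}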

                                      手动选举(Admin API)

                                      API接口POST /cluster/members/leader允许强制集群重新选举Leader。该API带有一个id参数,如果id参数为空,则强制选举当前集群中具有最高priority且具有选举资格的节点为Leader。如果id参数不为空,则将节点ID为id的节点选为Leader,不论其priority。如果该节点不存在或不具备选举资格,则抛出异常。

                                      Leader的职责

                                      总体而言,需要保证只有一个节点触发或执行的动作,通常就由Leader节点执行。另外,该类行为在一些服务端实现中,会通过节点抢占分布式锁来实现,但该实现的可靠性、可控性与性能都远不如使用统一Leader的方案,故Turms不采用抢占分布式锁方案。

                                      具体动作而言:

                                      • Leader最重要的动作之一就是根据其他节点在服务注册中心(MongoDB)的心跳刷新时间,来更新各节点的最新状态(具体代码在:im.turms.server.common.infra.cluster.service.discovery.LocalNodeStatusManager#updateMembersStatus
                                      • “定期cron向Redis发送清除过期黑名单记录的指令”这一动作只需一个节点,即Leader来定期执行。
• “定期cron删除过期数据库数据的操作,如用户消息”,也有且仅会被Leader执行(补充:这类操作的代码其实是“历史遗留代码”,“顺便”保留的。毕竟极少有应用会真的删除用户数据,因此默认处于disabled状态,可以忽略)

                                      相关Admin API

                                      TODO

                                      网络连接服务(Connection)

                                      服务类:im.turms.server.common.infra.cluster.service.connection.ConnectionService

                                      配置类:im.turms.server.common.infra.property.env.common.cluster.connection.ConnectionProperties

                                      在Turms服务端集群实现中,Connection是介于TransportRPC之间的一个概念,因为Connection一方面需要维护节点之间的TCP连接,另一方面又需要通过RpcService来完成节点之间的心跳操作(用于检测节点之间的TCP连接是否健康)。之所以没把ConnectionServiceRpcService合并成一个Service是因为二者都有大量自己的逻辑,为尽可能遵循单一职责的原则,以避免大量TCP连接维护与RPC能力实现的逻辑混在一起,因此两个服务没进行合并。

                                      职责

                                      • 根据服务注册与发现服务的请求,基于TCP连接其他集群节点。注意:两个节点之间有且仅会存在一个TCP连接
                                      • 如果意外地与其他集群节点断连,则尽最大努力进行重连操作
                                      • 发送心跳请求,以确认节点之间的TCP连接确实有效

                                      网络连接的生命周期

                                      • 建立TCP连接
                                      • 进行应用层的握手操作,交换节点的基础必要信息,如节点ID,以得知TCP对端是哪个节点。注意:这里的握手不是TCP协议里的握手。
                                      • 在握手成功后,节点之间即可进行网络数据的收发操作
                                      • 在关闭TCP网络连接之前,先发送应用层的挥手操作,通知对端该节点要主动与其断连,以区别TCP意外断连。注意:这里的挥手不是TCP协议里的挥手。
                                      • 关闭TCP连接

                                      编解码服务(Codec)

                                      服务类:im.turms.server.common.infra.cluster.service.codec.CodecService

                                      该服务主要为RPC服务提供数据的编解码实现。特别地,Turms并没有采用反射机制来统一实现序列化与反序列化逻辑,而是为每个数据定制实现,这主要是因为:1. 定制化实现,保证绝对地高效。如Set<DeviceType>可以用一个Byte,按Bit表示值的存在与否,而不是用一组Byte表示;2. 避免反射,保证高效;3. 代码所见即所得,避免隐晦操作的存在
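以Set&lt;DeviceType&gt;为例,下面给出一个仅作示意的Java草稿,演示“用一个byte按bit表示每个枚举值是否存在”的编码思路;这里的DeviceType枚举值仅为举例,与Turms的真实定义不必完全一致:

java
import java.util.EnumSet;
import java.util.Set;

// 仅作示意:把Set<DeviceType>编码为一个byte,每个bit表示一个枚举值是否存在。
public final class DeviceTypeSetCodecExample {

    enum DeviceType { ANDROID, IOS, DESKTOP, BROWSER, KIOSK, OTHERS }

    public static byte encode(Set<DeviceType> deviceTypes) {
        byte bits = 0;
        for (DeviceType type : deviceTypes) {
            // 将第ordinal位置1,表示该设备类型存在
            bits |= (byte) (1 << type.ordinal());
        }
        return bits;
    }

    public static Set<DeviceType> decode(byte bits) {
        Set<DeviceType> deviceTypes = EnumSet.noneOf(DeviceType.class);
        for (DeviceType type : DeviceType.values()) {
            if ((bits & (1 << type.ordinal())) != 0) {
                deviceTypes.add(type);
            }
        }
        return deviceTypes;
    }
}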

                                      RPC服务

                                      服务类:im.turms.server.common.infra.cluster.service.rpc.RpcService

                                      配置类:im.turms.server.common.infra.property.env.common.cluster.RpcProperties

                                      该服务基于“网络连接服务”提供的底层TCP网络连接与“编解码服务”提供的数据序列化与反序列化能力,来实现RPC操作的相关逻辑。

                                      编码格式

                                      RPC请求的组成部分:

                                      1. Varint编码的正文长度,用于在TCP字节流中区分每个RPC请求数据所在的字节区间。对大部分RPC请求而言,该部分通常占1~2 bytes。
                                      2. 请求头:数据类型ID(2 bytes) + 请求ID(4 bytes)
                                      3. 请求体:不同请求的编码方式不同,但都采用定制编码,以保证极致的高效。另外,请求体中最大的数据为“用户自定义文本”,如“聊天消息”

                                      RPC响应的组成部分:

                                      1. Varint编码的正文长度,用于在TCP字节流中区分每个RPC响应数据所在的字节区间。对大部分RPC响应而言,该部分通常占1 byte。
                                      2. 响应头:数据类型ID(2 bytes) + 被响应的请求ID(4 bytes)
3. 响应体:响应体可以分为两大类:正常响应与异常响应。正常响应即各种数据类型,如八大基本类型与其他组合的数据类型。异常响应本质上也仅仅是一种“组合的数据类型”,它的表现形式为RpcException数据类型,通过RpcErrorCode、ResponseStatusCode与description (String)字段,来描述异常信息。
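下面给出一个仅作示意的Java草稿,按上述“Varint正文长度 + 数据类型ID(2字节) + 请求ID(4字节) + 请求体”的布局,用Netty的ByteBuf写出一帧RPC请求;组帧细节仅为示意,并非Turms的真实编码实现:

java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;

// 仅作示意:按“Varint长度 + 数据类型ID(2字节) + 请求ID(4字节) + 请求体”写出一帧。
public final class RpcFrameExample {

    public static ByteBuf encodeRequest(short dataTypeId, int requestId, byte[] body) {
        int payloadLength = Short.BYTES + Integer.BYTES + body.length;
        ByteBuf buffer = Unpooled.buffer();
        writeVarint(buffer, payloadLength); // 正文长度,对大部分请求通常占1~2字节
        buffer.writeShort(dataTypeId);      // 请求头:数据类型ID
        buffer.writeInt(requestId);         // 请求头:请求ID
        buffer.writeBytes(body);            // 请求体
        return buffer;
    }

    // 无符号Varint:每次写7个bit,最高位为1表示后面还有字节
    private static void writeVarint(ByteBuf buffer, int value) {
        while ((value & ~0x7F) != 0) {
            buffer.writeByte((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        buffer.writeByte(value);
    }
}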

                                      补充

                                      • 部分请求(如“通知”中的用户聊天消息)会被发送给多个不同的RPC节点,它们的请求体都是共享堆外直接内存的,不需要进行内存拷贝

                                      • Turms目前没计划对RPC请求与响应数据采用压缩技术,这主要是因为:各种压缩算法的压缩率都不太理想,而压缩与解压需要消耗大量的内存与CPU资源。总体下来,压缩的性价比太低,得不偿失,故不采用压缩技术。

                                        额外一提的是,对于服务端与客户端之间的数据传递,未来会考虑支持压缩,其根本动机是:以更多的内存与CPU占用为代价(压缩/解压时要开辟新的内存空间)通过压缩数据,来提升数据的可达性(尤其是在弱网环境下)

                                      背压

turms-gateway服务端对turms-service服务端的背压实现比较取巧,具体而言:每个节点都会根据当前节点的CPU与内存负载状态,判断当前节点的健康状态,并通过“服务注册中心”向其他节点同步该健康信息。turms-gateway会从已知的turms-service节点列表中,找出“isHealthy”为True的节点,向其发送RPC请求。如果turms-gateway发现当前所有turms-service的“isHealthy”均为False,则不再进行RPC下发,而是直接抛出异常。

                                      失败转移(Failover)

对于无特定目标的RPC请求,如果一个Turms服务端向另一个Turms服务端发送了RPC请求,并且对端响应异常,发送方会自动再向另一个Turms服务端发送该RPC请求。举例而言,如果客户端发送了一个请求给turms-gateway,turms-gateway会先随机挑选一个turms-service来处理该用户请求;如果该turms-service响应异常,则turms-gateway会自动再搜寻另一个turms-service来处理该用户请求。

                                      分布式ID生成服务(IdGen)

                                      服务类:im.turms.server.common.infra.cluster.service.idgen.IdService

                                      分布式ID生成器用于为各业务场景快速提供集群唯一的ID。生成一个集群唯一的ID只需要节点进行本地运算操作(具体代码:im.turms.server.common.infra.cluster.service.idgen.SnowflakeIdGenerator#nextLargeGapId),效率极高。

                                      原理

                                      Turms的分布式ID生成器基于主流的雪花ID算法实现,生成的ID为long数据类型,具体而言:

                                      • 最高位(1 bit)始终为0,表示正数
                                      • 41 bits表示以毫秒为单位的时间戳,可表示约69年时间。具体UTC时间区间为:[2020-10-13, 2090-06-19]2020-10-13为硬编码的Epoch时间,如果您想修改该时间,修改im.turms.server.common.infra.cluster.service.idgen.SnowflakeIdGenerator#EPOCH的值即可
                                      • 4 bits表示数据中心ID,ID区间为[0, 15]。在实际运用中,该ID通常以云服务中的区域划分,即每个区域都有一个ID。Turms会根据节点的NodeProperties#zone“区域名”,自动将区域名映射为[0, 15]区间中的值。注意:如果有16个以上的区域名,虽然这些区域名仍会被映射为[0, 15]区间中的值,但这也意味着会出现重复的数据中心ID,有集群节点生成相同ID的风险。并且,被降级处理的节点会打印警告日志,提醒有生成相同ID的风险。
                                      • 8 bits表示工作节点ID,ID区间为[0, 255]。Turms会根据节点的im.turms.server.common.infra.property.env.common.cluster.NodeProperties#zone“区域名”,自动将区域名映射为[0, 255]区间中的值。注意:如果在一个数据中心中有256个以上的节点,虽然这些节点ID仍会被映射为[0, 255]区间中的值,但这也意味着会出现重复的工作节点ID,有集群节点生成相同ID的风险。并且,被降级处理的节点会打印警告日志,提醒有生成相同ID的风险。
                                      • 10 bits表示序列号。在单位时间戳字段内(1毫秒)可表示至多1024个序列号,即1毫秒中最多可生成1024个唯一ID。换言之,1秒内至多可以表示1024000个唯一ID,因此在实际使用中,是不可能出现重复ID的情况。

补充:根据节点信息,更新数据中心ID与工作节点ID信息的代码在:im.turms.server.common.infra.cluster.service.idgen.IdService#IdService构造函数中注册的addOnMembersChangeListener监听器内

                                      变种实现

                                      具体实现:im.turms.server.common.infra.cluster.service.idgen.SnowflakeIdGenerator#nextLargeGapId

                                      常规雪花算法生成的ID是单调递增的。但在大部分情况下,Turms的业务实现采用的是大间距ID,以避免ID单调递增。这么做是因为:使用大间距ID,以保证当这些数据存储到MongoDB数据库时,MongoDB能够根据这些ID,生成足够多的Chunks,并将这些Chunks负载均衡分配给各MongoDB服务端,让其进行存储。而单调递增ID会导致所有新数据始终分配到唯一的热点MongoDB服务端,导致数据库的负载均衡失效。

                                      大间距ID的实现也很简单,仅仅是把各字段进行重排,具体顺序为:序列号、时间戳、数据中心ID、工作节点ID(常规雪花算法的ID顺序为时间戳、数据中心ID、工作节点ID、序列号)。由于序列号占据ID的最高位,且生成的序列号在区间[0, 1023]内单调递增,因此能保证生成的ID快速占据大范围的数值,并被MongoDB分为多个Chunks负载均衡存储在不同的MongoDB服务端内。
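下面给出一个仅作示意的Java草稿,按上述位宽(1位符号位、10位序列号、41位时间戳、4位数据中心ID、8位工作节点ID)演示大间距ID的位重排方式;Epoch取值、时钟回拨与同毫秒内序列号溢出等细节均被简化,并非SnowflakeIdGenerator的真实实现:

java
// 仅作示意:大间距雪花ID的位布局(从高位到低位):
// 符号位(1) | 序列号(10) | 时间戳(41) | 数据中心ID(4) | 工作节点ID(8)
// 序列号位于高位,使相邻生成的ID在数值上相距很远,便于MongoDB分片的负载均衡。
public final class LargeGapIdExample {

    // 2020-10-13 00:00:00 UTC对应的毫秒时间戳,仅为示意
    private static final long EPOCH = 1602547200000L;

    private final long dataCenterId; // [0, 15]
    private final long workerId;     // [0, 255]
    private long sequence;           // [0, 1023],循环自增

    public LargeGapIdExample(long dataCenterId, long workerId) {
        this.dataCenterId = dataCenterId;
        this.workerId = workerId;
    }

    public synchronized long nextLargeGapId() {
        long timestamp = System.currentTimeMillis() - EPOCH;
        long currentSequence = sequence;
        sequence = (sequence + 1) & 1023;
        return (currentSequence << 53) // 序列号:第53~62位
                | (timestamp << 12)    // 时间戳:第12~52位
                | (dataCenterId << 8)  // 数据中心ID:第8~11位
                | workerId;            // 工作节点ID:第0~7位
    }
}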

                                      数据分析

                                      在给小型的即时通讯场景做表结构设计时,由于不需要考虑数据模型的分片设计,并且可以直接将业务模型与统计模型融为一体,因此对于小型业务场景,可以通过代码快速实现开箱即用且执行高效的基础数据分析功能,并延伸提供基于索引字段实现的常用统计API。

                                      但Turms项目是针对中大型即时通讯场景设计的,数据分析与业务实现必然需要进行架构层面上的分离,其中也包括将业务模型与数据模型分离。如果您需要进行数据分析,则您可以采集turms-gateway与turms-service服务端生成的度量或埋点日志,并使用云服务或自研实现对其进行分析。

                                      另外,考虑到确实有很多常用且通用的IM相关统计数据,因此我们之后会新开一个项目turms-data来负责数据分析,并配合Turms服务端与turms-admin来实现:日志与数据库数据的采集、数据仓库搭建、分析统计业务指标、结果可视化等功能。

注意:由于早期Turms主要是为小型即时通讯场景而设计,当时所有API查询字段都是基于索引实现的,可以保证查询的高效性。但后来转为面向中大型场景做设计,很多索引也因此被移除了,但相应API(尤其是统计API)的查询字段并没有被移除,因此现在还有一些API(尤其是统计API)的查询参数的实现会用到全表扫描,属于Legacy代码。我们之后会根据实现性能,对这些API进行分类,以保证一些低效的API不会被误用。

                                      身份与访问管理

登录的认证与授权

                                      Turms既提供了内置的身份与访问管理机制,也支持用户基于插件自定义身份与访问管理实现。

                                      相关配置

配置名 | 默认值 | 说明
turms.gateway.session.identity-access-management.enabled | true | 是否开启身份与访问管理机制。如果该值为false,则同时关闭Turms内置的身份与访问管理机制与用户基于插件自定义的身份与访问管理实现,并允许任意用户登录,与授权其发送任意类型的请求
turms.gateway.session.identity-access-management.type | password | 使用的Turms内置身份与访问管理机制类型,其类型可以为noop、password、jwt或http。具体见下文

                                      内置的身份与访问管理机制

                                      1. NOOP

关闭内置的身份与访问管理机制,并允许任意用户登录,与授权其发送任意类型的请求。

                                      相关配置项
                                      • turms.gateway.session.identity-access-management.type=noop

2. 基于密码认证

                                      基于Turms服务端自建的MongoDB中的user集合中的密码做用户认证。暂不支持授权实现。

                                      相关配置项
                                      • turms.gateway.session.identity-access-management.type=password

                                      3. 基于JWT认证

                                      JWT令牌中包含了该用户的认证与授权信息。

                                      工作流程
                                      • 客户端应用向您的服务端申请JWT令牌
• 客户端应用拿到JWT令牌后,通过Turms客户端登录接口turmsClient.userService.login中的password字段将JWT字符串发送给turms-gateway服务端
• turms-gateway服务端拿到JWT令牌后,根据JWT令牌中指定的算法与开发者在turms-gateway服务端配置的公钥配置(非对称算法:RS256、RS384、RS512、PS256、PS384、PS512、ES256、ES384、ES512)或密钥配置(对称算法:HS256、HS384、HS512)对JWT令牌进行校验。
                                      • 如果开发者未在turms-gateway服务端配置JWT指定的算法密钥配置,则向客户端返回对应的错误信息,以告知客户端该算法不被支持
                                      • 如果JWT令牌校验通过,则根据JWT令牌的认证与授权信息对用户进行认证与授权
                                      • 如果JWT令牌校验失败,则向客户端返回对应的错误信息
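下面给出一个仅作示意的Java草稿,演示用配置好的RSA公钥校验RS256签名的JWT并读取其声明;这里使用auth0的java-jwt库仅为举例,并不代表turms-gateway内部就是用该库实现校验的:

java
import java.security.interfaces.RSAPublicKey;

import com.auth0.jwt.JWT;
import com.auth0.jwt.JWTVerifier;
import com.auth0.jwt.algorithms.Algorithm;
import com.auth0.jwt.interfaces.DecodedJWT;

// 仅作示意:用RSA公钥校验RS256签名的JWT,校验通过后即可读取声明(claims)。
public final class JwtVerifyExample {

    public static DecodedJWT verify(RSAPublicKey publicKey, String token) {
        JWTVerifier verifier = JWT.require(Algorithm.RSA256(publicKey, null))
                .build();
        // 签名无效或令牌过期等情况会抛出JWTVerificationException
        return verifier.verify(token);
    }

    public static void printSubject(RSAPublicKey publicKey, String token) {
        DecodedJWT jwt = verify(publicKey, token);
        System.out.println("sub=" + jwt.getSubject());
    }
}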
                                      JWT正文(Payload)格式
                                      json
                                      {
                                           "iss": string, // issuer
                                           "sub": string, // subject
                                           "aud": array<string>, // audience
                                      @@ -92,7 +92,7 @@
                                               "resources": "*" // a string of ["*", "USER", "GROUP_BLOCKED_USER", ...], or an array that contains these strings
                                           }]
                                       }

                                      authenticatedstatements两个字段的含义与上文JWT正文中对应声明的含义相同,故不赘述。

                                      相关配置项
配置名 | 默认值 | 说明
turms.gateway.session.identity-access-management.type | password | 设置为http以开启基于外部HTTP响应的身份与访问管理机制
turms.service.message.check-if-target-active-and-not-deleted | true | 使用HTTP机制时,需要将该配置项设置成false,否则因为Turms的数据库中并不存在该用户,用户将无法发送消息
turms.gateway.session.identity-access-management.http.request.url | "" | 请求URL
turms.gateway.session.identity-access-management.http.request.headers | true | 附加的请求头
turms.gateway.session.identity-access-management.http.request.http-method | GET | 请求方法
turms.gateway.session.identity-access-management.http.request.timeout-millis | 30000 | 请求超时时长
turms.gateway.session.identity-access-management.http.authentication.response-expectation.status-codes | "2??" | 在响应状态码中匹配该值,如果匹配成功,则继续进行其他匹配,否则认证失败
turms.gateway.session.identity-access-management.http.authentication.response-expectation.headers | | 在响应头中匹配该值,如果匹配成功,则继续进行其他匹配,否则认证失败
turms.gateway.session.identity-access-management.http.authentication.response-expectation.body-fields | | 在响应正文中匹配该值,如果匹配成功,则继续进行其他匹配,否则认证失败

                                      基于插件的自定义身份与访问管理实现

                                      认证插件接口:im.turms.gateway.infra.plugin.extension.UserAuthenticator

                                      授权插件接口:TODO

                                      读者可以参考插件实现,实现上述插件接口。

                                      业务逻辑的认证与授权

                                      对于客户端发来的权限信息,Turms服务端的态度是“客户端传来的权限信息均不可信”,因此Turms服务端会根据您在Turms服务端处所设定的业务配置,自行做各种必要的权限判断。

                                      以“修改已发送消息”功能为例,该行为会触发一系列判定逻辑。Turms会先判断目标消息是否确实是由该用户发出的,再根据您在Turms服务端配置的allowEditMessageBySender(默认为true),来判断是否允许用户修改已发送消息,若您设置其为false,则在客户端处会捕获到一个ResponseException(Kotlin)或ResponseError(JavaScript/Swift)对象,而它由业务状态码模型ResponseStatusCode表示(由codereason描述信息组成)。

再比如对于一个“简单”的“发送消息”请求,Turms服务端就会判断该消息发送用户是否处于激活状态、是否设置了“允许发送消息给陌生人(非关系人)”、消息发送者是否在黑名单中;如果接收方是群组,还会判断消息发送者是否是群成员、是否处于禁言状态等等。而您仅仅只需调用一个sendMessage(...)接口即可。

                                      可观测性体系

                                      为了实现系统的高可靠、让系统具备容量可预估与异常可排查(如检测DDoS攻击)的能力,系统的可观测性体系建设至关重要。一个服务端如果没有对可观测性体系提供支持,那无论其功能有多么丰富,也只是玩具项目。

                                      并且,在可观测性体系下生成的衍生产物也是企业的一项重要资产,企业经营者如果无视可观测性体系的建设,就无法有效分析用户行为与喜好,经营策略的优化就更无从谈起,同时也意味着企业放弃了一笔可观的财富。

Turms与其他常规服务端一样,将可观测性的具体实现分为三类,即:度量(可聚合数值)、日志(事件)与链路追踪(面向请求)

                                      度量

度量由可聚合数值构成,总体分为系统度量、应用度量与业务度量。系统度量用于观察系统或容器的运行状态与趋势;应用度量用于观察JVM与Turms应用层的运行状态和趋势;业务度量用于观察业务发展的状态和趋势。在默认不落盘采样的情况下,度量只占用极小部分的内存空间。

                                      另外,在具体的代码实现上,Turms的度量体系基于主流度量采样库Micrometer来实现。并提供接口/metrics来导出JSON格式,/metrics/prometheus来导出OpenMetrics格式,/metrics/csv来导出CSV格式。
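下面给出一个仅作示意的Java草稿,演示Micrometer最基本的用法:向MeterRegistry注册Counter与Timer并记录数据;这里的度量名与tag均为假设,并非Turms的真实度量名:

java
import java.util.concurrent.TimeUnit;

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

// 仅作示意:注册一个Counter与一个Timer并记录数据。
public final class MetricsExample {

    public static void main(String[] args) {
        MeterRegistry registry = new SimpleMeterRegistry();

        Counter requestCounter = Counter.builder("demo.request.count")
                .tag("type", "CREATE_MESSAGE_REQUEST")
                .register(registry);
        Timer requestTimer = Timer.builder("demo.request.duration")
                .tag("type", "CREATE_MESSAGE_REQUEST")
                .register(registry);

        requestCounter.increment();
        requestTimer.record(5, TimeUnit.MILLISECONDS);

        // 注册到registry中的度量最终可由/metrics、/metrics/prometheus等接口导出
        registry.getMeters().forEach(meter -> System.out.println(meter.getId().getName()));
    }
}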

注意:还有一类实现起来相对耗系统资源的统计数据,诸如日活/周活/月活、用户留存率等,这些功能的实现都很常规。但这类相对高维度的功能适合专门的日志服务或产品来实现,故Turms不直接提供该类数据。

                                      系统度量

云服务厂商也都有提供该类度量,并且其度量点通常更丰富,存储、展示与分析等功能也是开箱即用。Turms提供以下重要度量主要是尽一个服务端应尽的责任,满足不上云的用户以及部分用户的定制化需求。对于能使用云服务的用户,应该优先考虑使用云服务。

                                      类别名称类型含义
                                      Uptime(运行时间)process.uptimeTimeGauge进程已运行时长
                                      process.start.timeTimeGauge进程启动时间
                                      Processor(处理器)system.cpu.countGauge进程可用CPU核数
                                      system.load.average.1mGauge最近一分钟系统CPU负载
                                      system.cpu.usageGauge最近系统CPU使用率
                                      process.cpu.usageGauge最近进程CPU使用率
                                      Memory(内存)system.memory.totalGauge系统物理内存大小
                                      system.memory.freeGauge系统可用物理内存大小
                                      system.memory.swap.totalGauge系统Swap内存大小
                                      system.memory.swap.freeGauge系统可用Swap内存大小
                                      Storage(存储)disk.totalGauge总存储容量
                                      disk.freeGauge可用存储容量
                                      FileDescriptorprocess.files.openGauge打开的文件描述符数
                                      process.files.maxGauge可打开的最大文件描述符数

                                      应用度量

                                      JVM度量

                                      以下基于HotSpot虚拟机进行含义描述,Turms不对其他虚拟机提供官方支持。

                                      类别名称类型含义
                                      GCjvm.gc.max.data.sizeGauge老年代最大可用堆内内存
                                      jvm.gc.live.data.sizeGaugeGC后,老年代占用的内存空间
                                      jvm.gc.memory.allocatedCounterEden区一共被分配的内存空间
                                      jvm.gc.memory.promotedCounter老年代一共被分配的内存空间
                                      jvm.gc.pauseTimerGC耗时
                                      Memoryjvm.buffer.countGauge各内存缓冲区池内,内存缓冲区的个数
                                      jvm.buffer.memory.usedGauge各内存缓冲区池的已使用内存
                                      注意:Turms应用层使用的堆外内存都记录在这
                                      jvm.buffer.total.capacityGauge各内存缓冲区池的总容量
                                      jvm.memory.usedGauge各内存池的已使用内存
                                      注意:Turms应用层使用的堆外内存不会被记录在这
                                      jvm.memory.committedGauge各内存池的可用内存
                                      jvm.memory.maxGauge各内存池的最大内存
                                      Threadjvm.threads.peakGauge峰值线程数
                                      jvm.threads.daemonGauge守护线程数
                                      jvm.threads.liveGauge当前活跃线程数
                                      jvm.threads.statesGauge各线程状态下的线程数
                                      Classjvm.classes.loadedGauge已加载classes数
                                      jvm.classes.unloadedCounter已卸载classes数

                                      注意:Turms在进行网络IO操作时,使用的都是内存池中的堆外内存(即通过Netty的PooledByteBufAllocator分配堆外内存),通过故意不释放堆外内存,并将这些堆外内存缓存起来,来避免低效的堆外内存分配与释放操作,因此Turms的内存占用率会持续走高,并且总体没有下降趋势。这不是内存泄漏,只是Turms在缓存这些堆外内存。

                                      集群间TCP连接度量

                                      在连接度量中,因为服务端的节点数有限,所以每个度量都会把TCP端的远程地址作为tag,来区分每个TCP端各自的度量数据,以更细致地观察节点之间的通信情况。

                                      TCP服务端
                                      类型名称类型含义
                                      Connection(连接)turms.node.tcp.server.data.receivedDistributionSummary已接收字节数
                                      turms.node.tcp.server.data.sentDistributionSummary已发送字节数
                                      turms.node.tcp.server.errorsCounter连接异常触发次数
                                      turms.node.tcp.server.tls.handshake.timeTimerTLS握手用时
                                      ByteBufAllocator(内存)TODO
                                      TCP客户端
                                      类型名称类型含义
                                      Connection(连接)turms.node.tcp.client.data.receivedDistributionSummary已接收字节数
                                      turms.node.tcp.client.data.sentDistributionSummary已发送字节数
                                      turms.node.tcp.client.errorsCounter连接异常触发次数
                                      turms.node.tcp.client.tls.handshake.timeTimerTLS握手用时
                                      turms.node.tcp.client.connect.timeTimerTCP连接建立用时
                                      turms.node.tcp.client.address.resolverTimer地址解析用时
                                      ByteBufAllocator(内存)TODO

                                      RPC度量

                                      名称类型含义
                                      rpc.request.subscribedCounter某类型RPC请求的已处理次数
                                      rpc.request.flow.durationTimer某类型RPC请求的处理时长

                                      Admin API度量

                                      因为管理员的IP可以无限多,所以每个度量不会把对端的远程地址作为tag,来区分每个端各自的度量数据。

                                      类型名称类型含义
                                      Connection(连接)admin.api.data.receivedDistributionSummary已接收字节数
                                      admin.api.data.sentDistributionSummary已发送字节数
                                      admin.api.errorsCounter连接异常触发次数
                                      admin.api.tls.handshake.timeTimerTLS握手用时

                                      Turms客户端度量

                                      在连接度量中,因为客户端的数量无限多,所以每个度量不会把对端的远程地址作为tag,来区分每个端各自的度量数据。另外,连接度量通过tag uri来区分TCP/UDP/WebSocket三类连接各自的度量数据。

                                      类型名称类型含义
                                      Connection(连接)turms.client.network.data.receivedDistributionSummary已接收字节数
                                      turms.client.network.data.sentDistributionSummary已发送字节数
                                      turms.client.network.errorsCounter连接异常触发次数
                                      turms.client.network.tls.handshake.timeTimerTLS握手用时
                                      turms.client.network.connect.timeTimer连接建立用时
                                      turms.client.network.address.resolverTimer域名解析用时
                                      Request(请求)turms.client.request.subscribedCounter某类型客户端请求的已处理次数
                                      turms.client.request.flow.durationTimer某类型客户端请求的处理时长
                                      ConnectionProvider(连接池)TODO
                                      ByteBufAllocator(内存)TODO

                                      业务度量

                                      服务端名称类型含义
                                      turms-gatewayuser.logged_inCounter登录用户数
                                      user.onlineGauge在线用户数
                                      turms-serviceuser.registeredCounter注册用户数
                                      user.deletedCounter注销用户数
                                      group.createdCounter创建群组数
                                      group.deletedCounter注销群组数
                                      message.sentCounter已发送消息数

                                      日志

                                      每条日志都对应着Turms服务端运行时发生的事件,用于追踪系统的运行状态与生成高纬度的统计数据。Turms中的日志分类两大类,即应用日志业务日志。应用运行日志本身数量不多,占用空间不大,遵循精与准原则。但为业务分析而设计的客户端API访问日志则不同,它是大部分统计数据的基础数据,是企业的重要资产,因此Turms默认且推荐对其进行100%采样,存储消耗巨大。

                                      注意

                                      • Turms的所有日志、度量与链路追踪的数据格式设计,都是兼顾“简单快捷,方便快速查询”与“精准采样,方便日志服务分析”设计的,但Turms本身不提供任何日志分析功能。

                                      • Turms的日志时间戳与日志切割都是根据UTC时间,而非系统时间。

                                      • 当Turms出现FATAL级别的日志时,需要人工介入修复。目前已有的FATAL级别日志类型有:

                                        • 检测到数据库的表被删除,或被重命名。

                                        • 检测到存储日志的文件系统已满,无法继续打印日志。

                                          注意:当检测到文件系统已满时,Turms就已经无法继续打印日志了,因此在用户没有腾出足够空间之前,Turms其实是不会打印这条FATAL级别的日志的。Turms之后会对这点进行优化,以保证该日志能及时地被打印出来。当然,由于现在的系统都配备了监控系统,因此运维人员在接到存储空间超过自定义阈值的警告时,就应该事先进行处理了。

                                      • Turms会不断地打印日志,并将日志打印成文件,以存储在文件系统当中。当文件系统存储空间不足时,Turms服务端会停止打印日志,但不会丢弃日志,而会将日志堆积在内存当中,所以当内存中堆积的日志过多而导致内存不足时,又会触发Turms服务端的自动保护机制,拒绝所有的用户请求,以避免Turms服务端因为内存不足而宕机。所以运维人员务必要保证Turms服务端所在的系统时刻有足够的存储空间。

                                        拓展阅读:Turms服务端的内存健康检测机制

                                      自研实现(拓展知识)

                                      原因

                                      1. Turms默认且非常推荐对客户端API进行100%采样,需要Logging的实现高效
                                      2. 第三方Logging实现过于冗余,性能低下且内存占用高
                                      3. 避免第三方Logging的开发人员由于缺乏安全常识,写出类似Remote code injection in Log4j的Critical bug
                                      4. Turms的日志实现通过“几乎什么功能都没实现”,并且实现了的功能也照着几乎最高性能标准实现(我们直接将Java的基础数据写入DirectByteBuf,并直接写入文件描述符,不存在字符串拷贝),因此该实现的吞吐量能比log4j2 async logger高数倍,同时内存开销小数倍

                                      具体实现

                                      Turms日志实现非常精简,大概只实现了标准日志库的百分之几的核心功能,打印日志的主要步骤为:

                                      对于常规日志:

                                      • 调用im.turms.server.common.infra.logging.core.logger.AsyncLogger#doLog函数
                                      • doLog函数内部通过PooledByteBufAllocator.DEFAULT分配一块堆外内存,并遍历一遍message,将非占位符直接写入该内存,跳过占位符并写入具体参数,最后将这块内存放到日志处理的MPSC队列中(基于jctools的MpscUnboundedArrayQueue
                                      • 日志处理线程检测到有新的日志(即ByteBuffer对象)时,会将该堆外内存写入NIO包的FileChannel(可以是控制台、也可以是文件)中,该对象在Linux系统下,会最终调用pwrite直接将堆外内存写入文件描述符中

                                      对于各种API日志(如客户端API日志),我们采用了更为定制的实现,即:

                                      • 调用方直接将API信息(如客户端IP、请求大小等)写入DirectByteBuf中,并将这个Buffer传递给AsyncLogger#doLog函数
                                      • doLog函数将日志通用的模板信息(如时间戳、节点ID等)写入另一个DirectByteBuf,并与上述的DirectByteBuf,拼接成一个CompositeByteBuf
                                      • 日志处理线程检测到有新的日志(即CompositeByteBuf对象)时,会将该堆外内存写入NIO包的FileChannel(可以是控制台、也可以是文件)中,该对象在Linux系统下,会最终调用pwrite直接将堆外内存写入文件描述符中

                                      理所当然的,Turms写日志的性能能达到极致。

                                      补充

                                      • 虽然还有更高效的写法,即跨过Java实现,不使用NIO包的FileChannel,而是直接调用底层JNI实现,如在Linux操作系统下,直接通过Linux的pwriteDirectByteBuffer写入文件描述符中。但考虑到代码的可维护性,且Java默认不开放这些底层函数,故不采纳该写法。

                                      • 上述中提到的内存都是通过PooledByteBufAllocator.DEFAULT分配的,且没限制内存使用上限,并且“敢”用MpscUnboundedArrayQueue存储日志,而没限制最大容量。这是因为Turms服务端自己有一套内存管理机制,它能保证内存使用的上限,同时又让使用了的内存逐步释放。

                                      • Turms不支持且未来也不会支持:添加控制台文本样式。因为给控制台文本加样式需要使用ANSI escape codes,而日志文件不需要存储这些字符,因此若要实现该功能,我们需要给控制台与日志文件分别维护一个ByteBuf,一条日志需要消耗双倍的内存,故不考虑该实现。

                                        另外,开发者可以自行使用第三方工具或插件,如Intellij IDEAGrep Console插件,给Turms服务端控制台的日志添加样式。

                                      • 关于“为什么打印非ASCII字符时,会出现乱码”,这是因为:

                                        背景:

                                        • Java 21 String类内部的byte[] value有且仅会存储LATIN-1UTF-16编码的数据
                                        • Turms服务端自身有且仅打印ASCII字符(Turms服务端不会打印任何用户或管理员输入的文本)
                                        • 日志打印这种频繁使用的功能,无意义的内存拷贝是绝对禁止的。

                                        在上述背景下,Turms在打印String时,并不是通过getBytes("UTF-8")取其字节数据,而是通过Unsafe直接获取String的内部LATIN-1UTF-16编码的字节数据,因此日志文件可能是LATIN-1UTF-16混合编码。

                                        而当用户以UTF-8编码查看日志文件时,LATIN-1编码中的ASCII字符可以正确显示,UTF-16编码中的ASCII字符也能显示,只是每个ASCII字符会多带上一个空字符(二进制编码0000 0000),对于其他编码不兼容的字符,则会以乱码形式显示,因此如果Turms服务端打印了非ASCII字符,则用户会看到乱码。

                                        另外,除非Java未来支持存储UTF-8编码的字节数据,否则Turms服务端不会考虑使用getBytes("UTF-8")这样低效的实现。

                                      综上补充内容,也再次验证了我们在各篇章中反复提及的:“功能多”对于追求性能表现的服务端而言,很可能是缺点。

                                      不使用JSON格式的原因

                                      随着微服务的发展,JSON格式日志逐渐流行,比如MongoDB就在4.4版本时开始支持JSON格式日志。使用JSON格式主要有以下三大优点:

                                      • 极大地统一了各服务端的日志格式。尤其对于具有数十/百/千个异构服务端的公司而言,是必须强制要求各项目使用JSON日志格式的
                                      • 各编程语言均对JSON有良好支持,日志打印与解析几乎无难度可言
                                      • 各云厂商的日志服务对JSON格式日志都有着良好的支持,可以实现开箱即用

                                      Turms服务端不使用JSON格式的原因是:

                                      • Turms服务端构成很简单,不需要通过JSON来统一日志格式。
                                      • JSON序列化需要占用额外内存与CPU资源,且存储开销大,如果使用压缩技术,还要额外占用CPU资源。特别是,序列化加上压缩时所需的CPU资源甚至比Turms服务端处理业务请求所需CPU资源还高,这对Turms来说是难以接受的。
                                      • JSON格式其实在原始数据可读性上并不好。因为原始日志是以单行形式进行展示,一行即表明一个事件。JSON格式在单行显示时,会带来大量“噪音”,大量的JSON元数据、JSON键与JSON值纵横交错,直接阅读原始数据的话就比较费力。而Turms服务端的客户端API访问日志通过|分隔符拆分各字段。用户初次只需要多看几个日志,之后就能反应出各字段是代表什么信息。

                                      当然,采用传统的单行格式会造成云服务解析相对复杂,且配置不灵活。但考虑到这种东西配一次即一劳永逸,综合考虑以上情况,Turms服务端日志不采用JSON格式,而仍采用传统的单行格式。

                                      类别

                                      GC日志

                                      用于JVM性能测试、分析调优、排查定位问题。

                                      turms-gateway的服务端JVM GC配置为:-Xlog:gc*,gc+age=trace,safepoint:file=${TURMS_GATEWAY_HOME}/log/turms-gateway-gc.log:utctime,pid,tags:filecount=32,filesize=32m

                                      turms-service的服务端JVM GC配置为:-Xlog:gc*,gc+age=trace,safepoint:file=${TURMS_SERVICE_HOME}/log/turms-service-gc.log:utctime,pid,tags:filecount=32,filesize=32m

                                      服务端运行日志

                                      描述Turms服务端内发生的主要事件,如RPC连接状态的转变、请求处理中服务端错误的发生等。

                                      文件名:turms-gateway.log(turms-gateway服务端);turms-service.log(turms-service服务端)

                                      构成:事件发送时间、日志等级、服务端类型、节点ID、Trace ID、线程、类、消息。其中,服务端信息的主要作用是在分布式日志采集过程中,用于区分日志的来源节点。其他类型日志也都使用这样的日志格式(除了客户端API访问日志与通知日志不记录“类”信息),它们只是在“消息”部分使用了定制化的消息格式。

                                      格式:%d{${sys:LOG_DATEFORMAT_PATTERN}}{GMT+0} ${sys:LOG_LEVEL_PATTERN} ${myctx:NODE_TYPE} ${myctx:NODE_ID} %-19.19X{traceId} %t %-40.40c{1.} : %m%n${sys:LOG_EXCEPTION_CONVERSION_WORD}

                                      解析Regex:(?P<time>\d{4}-\d{2}-\d{2}\s\d{1,2}\:\d{2}\:\d{2}\.\d{3})\s+(?P<level>[A-Z]{4,5})\s+(?P<node_type>[A-Z])\s+(?P<node_id>\S*)\s+\[(?P<trace_id>.{19})\]\s+(?P<thread>\S*)\s+(?P<class>\S*)\s+:\s(?P<msg>.*)

                                      示例:

                                      spreadsheet
                                      2021-08-08 09:52:15.602 ERROR S idanvacg 6404110606919452669 AsyncGetter-1-thread-1 i.t.s.c.c.s.r.RpcService                 : Cannot send response to disposed connection: ServiceResponse{dataForRequester=null, code=SERVER_INTERNAL_ERROR, reason='The pool is closed'}
                                      +    
                                      Skip to content

                                      可观测性体系

                                      为了实现系统的高可靠、让系统具备容量可预估与异常可排查(如检测DDoS攻击)的能力,系统的可观测性体系建设至关重要。一个服务端如果没有对可观测性体系提供支持,那无论其功能有多么丰富,也只是玩具项目。

                                      并且,在可观测性体系下生成的衍生产物也是企业的一项重要资产,企业经营者如果无视可观测性体系的建设,就无法有效分析用户行为与喜好,经营策略的优化就更无从谈起,同时也意味着企业放弃了一笔可观的财富。

Turms与其他常规服务端一样,将可观测性的具体实现分为三类,即:度量(可聚合数值)、日志(事件)与链路追踪(面向请求)。

                                      度量

度量由可聚合数值构成,总体分为系统度量、应用度量与业务度量。系统度量用于观察系统或容器的运行状态与趋势;应用度量用于观察JVM与Turms应用层的运行状态和趋势;业务度量用于观察业务发展的状态和趋势。在默认不落盘采样的情况下,度量只占用极小部分的内存空间。

另外,在具体的代码实现上,Turms的度量体系基于主流度量采样库Micrometer来实现,并提供接口/metrics来导出JSON格式、/metrics/prometheus来导出OpenMetrics格式、/metrics/csv来导出CSV格式。

注意:还有一类实现起来相对耗系统资源的统计数据,诸如日活/周活/月活、用户留存率等,这些功能的实现都很常规,但这类相对高维度的功能更适合由专门的日志服务或产品来实现,故Turms不直接提供该类数据。
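为便于理解下文提到的Counter、Gauge、Timer等度量类型,下面给出一段基于Micrometer的极简示例。它只演示Micrometer本身的基本用法,其中的度量名仅为举例,并非Turms注册度量的实际源码:

```java
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

import java.util.concurrent.TimeUnit;

public class MicrometerDemo {
    public static void main(String[] args) {
        MeterRegistry registry = new SimpleMeterRegistry();

        // Counter:只增不减的计数,对应“已发送消息数”这类业务度量
        Counter messageSent = Counter.builder("message.sent").register(registry);
        messageSent.increment();

        // Timer:记录耗时分布,对应“某类型请求的处理时长”这类度量
        Timer requestTimer = Timer.builder("client.request.duration").register(registry);
        requestTimer.record(5, TimeUnit.MILLISECONDS);

        // 打印当前已注册的度量
        registry.getMeters().forEach(meter -> System.out.println(meter.getId()));
    }
}
```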

                                      系统度量

云服务厂商也都有提供该类度量,并且其度量点通常更丰富,存储、展示与分析等功能也是开箱即用。Turms提供以下重要度量,主要是尽一个服务端应尽的责任,以满足不上云的用户以及部分用户的定制化需求。对于能使用云服务的用户,应该优先考虑使用云服务。

| 类别 | 名称 | 类型 | 含义 |
|---|---|---|---|
| Uptime(运行时间) | process.uptime | TimeGauge | 进程已运行时长 |
| | process.start.time | TimeGauge | 进程启动时间 |
| Processor(处理器) | system.cpu.count | Gauge | 进程可用CPU核数 |
| | system.load.average.1m | Gauge | 最近一分钟系统CPU负载 |
| | system.cpu.usage | Gauge | 最近系统CPU使用率 |
| | process.cpu.usage | Gauge | 最近进程CPU使用率 |
| Memory(内存) | system.memory.total | Gauge | 系统物理内存大小 |
| | system.memory.free | Gauge | 系统可用物理内存大小 |
| | system.memory.swap.total | Gauge | 系统Swap内存大小 |
| | system.memory.swap.free | Gauge | 系统可用Swap内存大小 |
| Storage(存储) | disk.total | Gauge | 总存储容量 |
| | disk.free | Gauge | 可用存储容量 |
| FileDescriptor | process.files.open | Gauge | 打开的文件描述符数 |
| | process.files.max | Gauge | 可打开的最大文件描述符数 |

                                      应用度量

                                      JVM度量

                                      以下基于HotSpot虚拟机进行含义描述,Turms不对其他虚拟机提供官方支持。

| 类别 | 名称 | 类型 | 含义 |
|---|---|---|---|
| GC | jvm.gc.max.data.size | Gauge | 老年代最大可用堆内内存 |
| | jvm.gc.live.data.size | Gauge | GC后,老年代占用的内存空间 |
| | jvm.gc.memory.allocated | Counter | Eden区一共被分配的内存空间 |
| | jvm.gc.memory.promoted | Counter | 老年代一共被分配的内存空间 |
| | jvm.gc.pause | Timer | GC耗时 |
| Memory | jvm.buffer.count | Gauge | 各内存缓冲区池内,内存缓冲区的个数 |
| | jvm.buffer.memory.used | Gauge | 各内存缓冲区池的已使用内存。注意:Turms应用层使用的堆外内存都记录在这 |
| | jvm.buffer.total.capacity | Gauge | 各内存缓冲区池的总容量 |
| | jvm.memory.used | Gauge | 各内存池的已使用内存。注意:Turms应用层使用的堆外内存不会被记录在这 |
| | jvm.memory.committed | Gauge | 各内存池的可用内存 |
| | jvm.memory.max | Gauge | 各内存池的最大内存 |
| Thread | jvm.threads.peak | Gauge | 峰值线程数 |
| | jvm.threads.daemon | Gauge | 守护线程数 |
| | jvm.threads.live | Gauge | 当前活跃线程数 |
| | jvm.threads.states | Gauge | 各线程状态下的线程数 |
| Class | jvm.classes.loaded | Gauge | 已加载classes数 |
| | jvm.classes.unloaded | Counter | 已卸载classes数 |

                                      注意:Turms在进行网络IO操作时,使用的都是内存池中的堆外内存(即通过Netty的PooledByteBufAllocator分配堆外内存),通过故意不释放堆外内存,并将这些堆外内存缓存起来,来避免低效的堆外内存分配与释放操作,因此Turms的内存占用率会持续走高,并且总体没有下降趋势。这不是内存泄漏,只是Turms在缓存这些堆外内存。
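下面用一段极简示例演示上述池化堆外内存的行为:内存被release后只是归还给内存池,而不会立即归还给操作系统(仅作示意,与Turms源码无关):

```java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;

import java.nio.charset.StandardCharsets;

public class PooledBufferDemo {
    public static void main(String[] args) {
        PooledByteBufAllocator allocator = PooledByteBufAllocator.DEFAULT;

        // 从内存池分配一块堆外内存并写入数据
        ByteBuf buffer = allocator.directBuffer(1024);
        buffer.writeCharSequence("hello", StandardCharsets.US_ASCII);

        // release()只是把内存归还给内存池,不会立即把堆外内存还给操作系统,
        // 因此进程的内存占用会持续处于高位,但这并不是内存泄漏
        buffer.release();

        // 通过allocator的metric可以观察池化堆外内存的使用量
        System.out.println("used direct memory: " + allocator.metric().usedDirectMemory());
    }
}
```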

                                      集群间TCP连接度量

                                      在连接度量中,因为服务端的节点数有限,所以每个度量都会把TCP端的远程地址作为tag,来区分每个TCP端各自的度量数据,以更细致地观察节点之间的通信情况。

TCP服务端

| 类别 | 名称 | 类型 | 含义 |
|---|---|---|---|
| Connection(连接) | turms.node.tcp.server.data.received | DistributionSummary | 已接收字节数 |
| | turms.node.tcp.server.data.sent | DistributionSummary | 已发送字节数 |
| | turms.node.tcp.server.errors | Counter | 连接异常触发次数 |
| | turms.node.tcp.server.tls.handshake.time | Timer | TLS握手用时 |
| ByteBufAllocator(内存) | TODO | | |

TCP客户端

| 类别 | 名称 | 类型 | 含义 |
|---|---|---|---|
| Connection(连接) | turms.node.tcp.client.data.received | DistributionSummary | 已接收字节数 |
| | turms.node.tcp.client.data.sent | DistributionSummary | 已发送字节数 |
| | turms.node.tcp.client.errors | Counter | 连接异常触发次数 |
| | turms.node.tcp.client.tls.handshake.time | Timer | TLS握手用时 |
| | turms.node.tcp.client.connect.time | Timer | TCP连接建立用时 |
| | turms.node.tcp.client.address.resolver | Timer | 地址解析用时 |
| ByteBufAllocator(内存) | TODO | | |

                                      RPC度量

| 名称 | 类型 | 含义 |
|---|---|---|
| rpc.request.subscribed | Counter | 某类型RPC请求的已处理次数 |
| rpc.request.flow.duration | Timer | 某类型RPC请求的处理时长 |

                                      Admin API度量

                                      因为管理员的IP可以无限多,所以每个度量不会把对端的远程地址作为tag,来区分每个端各自的度量数据。

| 类别 | 名称 | 类型 | 含义 |
|---|---|---|---|
| Connection(连接) | admin.api.data.received | DistributionSummary | 已接收字节数 |
| | admin.api.data.sent | DistributionSummary | 已发送字节数 |
| | admin.api.errors | Counter | 连接异常触发次数 |
| | admin.api.tls.handshake.time | Timer | TLS握手用时 |

                                      Turms客户端度量

                                      在连接度量中,因为客户端的数量无限多,所以每个度量不会把对端的远程地址作为tag,来区分每个端各自的度量数据。另外,连接度量通过tag uri来区分TCP/UDP/WebSocket三类连接各自的度量数据。

| 类别 | 名称 | 类型 | 含义 |
|---|---|---|---|
| Connection(连接) | turms.client.network.data.received | DistributionSummary | 已接收字节数 |
| | turms.client.network.data.sent | DistributionSummary | 已发送字节数 |
| | turms.client.network.errors | Counter | 连接异常触发次数 |
| | turms.client.network.tls.handshake.time | Timer | TLS握手用时 |
| | turms.client.network.connect.time | Timer | 连接建立用时 |
| | turms.client.network.address.resolver | Timer | 域名解析用时 |
| Request(请求) | turms.client.request.subscribed | Counter | 某类型客户端请求的已处理次数 |
| | turms.client.request.flow.duration | Timer | 某类型客户端请求的处理时长 |
| ConnectionProvider(连接池) | TODO | | |
| ByteBufAllocator(内存) | TODO | | |

                                      业务度量

| 服务端 | 名称 | 类型 | 含义 |
|---|---|---|---|
| turms-gateway | user.logged_in | Counter | 登录用户数 |
| | user.online | Gauge | 在线用户数 |
| turms-service | user.registered | Counter | 注册用户数 |
| | user.deleted | Counter | 注销用户数 |
| | group.created | Counter | 创建群组数 |
| | group.deleted | Counter | 注销群组数 |
| | message.sent | Counter | 已发送消息数 |

                                      日志

每条日志都对应着Turms服务端运行时发生的事件,用于追踪系统的运行状态与生成高维度的统计数据。Turms中的日志分为两大类,即应用日志与业务日志。应用运行日志本身数量不多、占用空间不大,遵循精与准原则。但为业务分析而设计的客户端API访问日志则不同,它是大部分统计数据的基础数据,是企业的重要资产,因此尽管存储消耗巨大,Turms仍默认且推荐对其进行100%采样。

                                      注意

                                      • Turms的所有日志、度量与链路追踪的数据格式设计,都是兼顾“简单快捷,方便快速查询”与“精准采样,方便日志服务分析”设计的,但Turms本身不提供任何日志分析功能。

                                      • Turms的日志时间戳与日志切割都是根据UTC时间,而非系统时间。

                                      • 当Turms出现FATAL级别的日志时,需要人工介入修复。目前已有的FATAL级别日志类型有:

                                        • 检测到数据库的表被删除,或被重命名。

                                        • 检测到存储日志的文件系统已满,无法继续打印日志。

                                          注意:当检测到文件系统已满时,Turms就已经无法继续打印日志了,因此在用户没有腾出足够空间之前,Turms其实是不会打印这条FATAL级别的日志的。Turms之后会对这点进行优化,以保证该日志能及时地被打印出来。当然,由于现在的系统都配备了监控系统,因此运维人员在接到存储空间超过自定义阈值的警告时,就应该事先进行处理了。

                                      • Turms会不断地打印日志,并将日志打印成文件,以存储在文件系统当中。当文件系统存储空间不足时,Turms服务端会停止打印日志,但不会丢弃日志,而会将日志堆积在内存当中,所以当内存中堆积的日志过多而导致内存不足时,又会触发Turms服务端的自动保护机制,拒绝所有的用户请求,以避免Turms服务端因为内存不足而宕机。所以运维人员务必要保证Turms服务端所在的系统时刻有足够的存储空间。

                                        拓展阅读:Turms服务端的内存健康检测机制

                                      自研实现(拓展知识)

                                      原因

                                      1. Turms默认且非常推荐对客户端API进行100%采样,需要Logging的实现高效
                                      2. 第三方Logging实现过于冗余,性能低下且内存占用高
                                      3. 避免第三方Logging的开发人员由于缺乏安全常识,写出类似Remote code injection in Log4j的Critical bug
                                      4. Turms的日志实现通过“几乎什么功能都没实现”,并且实现了的功能也照着几乎最高性能标准实现(我们直接将Java的基础数据写入DirectByteBuf,并直接写入文件描述符,不存在字符串拷贝),因此该实现的吞吐量能比log4j2 async logger高数倍,同时内存开销小数倍

                                      具体实现

                                      Turms日志实现非常精简,大概只实现了标准日志库的百分之几的核心功能,打印日志的主要步骤为:

                                      对于常规日志:

                                      • 调用im.turms.server.common.infra.logging.core.logger.AsyncLogger#doLog函数
                                      • doLog函数内部通过PooledByteBufAllocator.DEFAULT分配一块堆外内存,并遍历一遍message,将非占位符直接写入该内存,跳过占位符并写入具体参数,最后将这块内存放到日志处理的MPSC队列中(基于jctools的MpscUnboundedArrayQueue
                                      • 日志处理线程检测到有新的日志(即ByteBuffer对象)时,会将该堆外内存写入NIO包的FileChannel(可以是控制台、也可以是文件)中,该对象在Linux系统下,会最终调用pwrite直接将堆外内存写入文件描述符中

                                      对于各种API日志(如客户端API日志),我们采用了更为定制的实现,即:

                                      • 调用方直接将API信息(如客户端IP、请求大小等)写入DirectByteBuf中,并将这个Buffer传递给AsyncLogger#doLog函数
                                      • doLog函数将日志通用的模板信息(如时间戳、节点ID等)写入另一个DirectByteBuf,并与上述的DirectByteBuf,拼接成一个CompositeByteBuf
                                      • 日志处理线程检测到有新的日志(即CompositeByteBuf对象)时,会将该堆外内存写入NIO包的FileChannel(可以是控制台、也可以是文件)中,该对象在Linux系统下,会最终调用pwrite直接将堆外内存写入文件描述符中

也正因如此,Turms写日志的性能能够达到极致。
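下面给出一个模仿上述“常规日志”打印流程的极简示意:调用方线程把日志编码进堆外内存并放入MPSC队列,由单独的日志线程写入FileChannel。该示例省略了占位符解析、批量写入、错误处理与优雅停机等细节,仅帮助理解流程,并非Turms的实际实现:

```java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;
import org.jctools.queues.MpscUnboundedArrayQueue;

import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MiniAsyncLogger {

    private final MpscUnboundedArrayQueue<ByteBuf> queue = new MpscUnboundedArrayQueue<>(1024);
    private final FileChannel channel;

    public MiniAsyncLogger(Path file) throws Exception {
        channel = FileChannel.open(file, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        Thread consumer = new Thread(this::drain, "log-processor");
        consumer.setDaemon(true);
        consumer.start();
    }

    // 调用方线程:把日志编码进池化的堆外内存并入队,几乎不阻塞调用方
    public void log(String message) {
        ByteBuf buffer = PooledByteBufAllocator.DEFAULT.directBuffer(message.length() + 1);
        buffer.writeCharSequence(message, StandardCharsets.US_ASCII);
        buffer.writeByte('\n');
        queue.offer(buffer);
    }

    // 日志线程:从MPSC队列取出堆外内存,写入FileChannel(即文件描述符)
    private void drain() {
        for (;;) {
            ByteBuf buffer = queue.poll();
            if (buffer == null) {
                Thread.onSpinWait();
                continue;
            }
            try {
                channel.write(buffer.nioBuffer());
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                buffer.release();
            }
        }
    }
}
```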

                                      补充

                                      • 虽然还有更高效的写法,即跨过Java实现,不使用NIO包的FileChannel,而是直接调用底层JNI实现,如在Linux操作系统下,直接通过Linux的pwriteDirectByteBuffer写入文件描述符中。但考虑到代码的可维护性,且Java默认不开放这些底层函数,故不采纳该写法。

                                      • 上述中提到的内存都是通过PooledByteBufAllocator.DEFAULT分配的,且没限制内存使用上限,并且“敢”用MpscUnboundedArrayQueue存储日志,而没限制最大容量。这是因为Turms服务端自己有一套内存管理机制,它能保证内存使用的上限,同时又让使用了的内存逐步释放。

                                      • Turms不支持且未来也不会支持:添加控制台文本样式。因为给控制台文本加样式需要使用ANSI escape codes,而日志文件不需要存储这些字符,因此若要实现该功能,我们需要给控制台与日志文件分别维护一个ByteBuf,一条日志需要消耗双倍的内存,故不考虑该实现。

                                        另外,开发者可以自行使用第三方工具或插件,如Intellij IDEAGrep Console插件,给Turms服务端控制台的日志添加样式。

                                      • 关于“为什么打印非ASCII字符时,会出现乱码”,这是因为:

                                        背景:

  • Java 21 String类内部的byte[] value有且仅会存储LATIN-1或UTF-16编码的数据
                                        • Turms服务端自身有且仅打印ASCII字符(Turms服务端不会打印任何用户或管理员输入的文本)
                                        • 日志打印这种频繁使用的功能,无意义的内存拷贝是绝对禁止的。

  在上述背景下,Turms在打印String时,并不是通过getBytes("UTF-8")取其字节数据,而是通过Unsafe直接获取String内部以LATIN-1或UTF-16编码的字节数据,因此日志文件可能是LATIN-1与UTF-16的混合编码。

                                        而当用户以UTF-8编码查看日志文件时,LATIN-1编码中的ASCII字符可以正确显示,UTF-16编码中的ASCII字符也能显示,只是每个ASCII字符会多带上一个空字符(二进制编码0000 0000),对于其他编码不兼容的字符,则会以乱码形式显示,因此如果Turms服务端打印了非ASCII字符,则用户会看到乱码。

                                        另外,除非Java未来支持存储UTF-8编码的字节数据,否则Turms服务端不会考虑使用getBytes("UTF-8")这样低效的实现。
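  关于上面提到的“UTF-16编码中的ASCII字符会多带一个空字符”,可以用下面的小例子直观地观察这一现象(仅作示意):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingDemo {
    public static void main(String[] args) {
        // LATIN-1编码下,每个ASCII字符占1个字节
        System.out.println(Arrays.toString("OK".getBytes(StandardCharsets.ISO_8859_1))); // [79, 75]
        // UTF-16LE编码下,每个ASCII字符都会多带一个空字节(二进制0000 0000)
        System.out.println(Arrays.toString("OK".getBytes(StandardCharsets.UTF_16LE)));   // [79, 0, 75, 0]
    }
}
```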

                                      综上补充内容,也再次验证了我们在各篇章中反复提及的:“功能多”对于追求性能表现的服务端而言,很可能是缺点。

                                      不使用JSON格式的原因

                                      随着微服务的发展,JSON格式日志逐渐流行,比如MongoDB就在4.4版本时开始支持JSON格式日志。使用JSON格式主要有以下三大优点:

                                      • 极大地统一了各服务端的日志格式。尤其对于具有数十/百/千个异构服务端的公司而言,是必须强制要求各项目使用JSON日志格式的
                                      • 各编程语言均对JSON有良好支持,日志打印与解析几乎无难度可言
                                      • 各云厂商的日志服务对JSON格式日志都有着良好的支持,可以实现开箱即用

                                      Turms服务端不使用JSON格式的原因是:

                                      • Turms服务端构成很简单,不需要通过JSON来统一日志格式。
                                      • JSON序列化需要占用额外内存与CPU资源,且存储开销大,如果使用压缩技术,还要额外占用CPU资源。特别是,序列化加上压缩时所需的CPU资源甚至比Turms服务端处理业务请求所需CPU资源还高,这对Turms来说是难以接受的。
                                      • JSON格式其实在原始数据可读性上并不好。因为原始日志是以单行形式进行展示,一行即表明一个事件。JSON格式在单行显示时,会带来大量“噪音”,大量的JSON元数据、JSON键与JSON值纵横交错,直接阅读原始数据的话就比较费力。而Turms服务端的客户端API访问日志通过|分隔符拆分各字段。用户初次只需要多看几个日志,之后就能反应出各字段是代表什么信息。

                                      当然,采用传统的单行格式会造成云服务解析相对复杂,且配置不灵活。但考虑到这种东西配一次即一劳永逸,综合考虑以上情况,Turms服务端日志不采用JSON格式,而仍采用传统的单行格式。

                                      类别

                                      GC日志

                                      用于JVM性能测试、分析调优、排查定位问题。

                                      turms-gateway的服务端JVM GC配置为:-Xlog:gc*,gc+age=trace,safepoint:file=${TURMS_GATEWAY_HOME}/log/turms-gateway-gc.log:utctime,pid,tags:filecount=32,filesize=32m

                                      turms-service的服务端JVM GC配置为:-Xlog:gc*,gc+age=trace,safepoint:file=${TURMS_SERVICE_HOME}/log/turms-service-gc.log:utctime,pid,tags:filecount=32,filesize=32m

                                      服务端运行日志

                                      描述Turms服务端内发生的主要事件,如RPC连接状态的转变、请求处理中服务端错误的发生等。

                                      文件名:turms-gateway.log(turms-gateway服务端);turms-service.log(turms-service服务端)

                                      构成:事件发送时间、日志等级、服务端类型、节点ID、Trace ID、线程、类、消息。其中,服务端信息的主要作用是在分布式日志采集过程中,用于区分日志的来源节点。其他类型日志也都使用这样的日志格式(除了客户端API访问日志与通知日志不记录“类”信息),它们只是在“消息”部分使用了定制化的消息格式。

                                      格式:%d{${sys:LOG_DATEFORMAT_PATTERN}}{GMT+0} ${sys:LOG_LEVEL_PATTERN} ${myctx:NODE_TYPE} ${myctx:NODE_ID} %-19.19X{traceId} %t %-40.40c{1.} : %m%n${sys:LOG_EXCEPTION_CONVERSION_WORD}

                                      解析Regex:(?P<time>\d{4}-\d{2}-\d{2}\s\d{1,2}\:\d{2}\:\d{2}\.\d{3})\s+(?P<level>[A-Z]{4,5})\s+(?P<node_type>[A-Z])\s+(?P<node_id>\S*)\s+\[(?P<trace_id>.{19})\]\s+(?P<thread>\S*)\s+(?P<class>\S*)\s+:\s(?P<msg>.*)

                                      示例:

2021-08-08 09:52:15.602 ERROR S idanvacg 6404110606919452669 AsyncGetter-1-thread-1 i.t.s.c.c.s.r.RpcService                 : Cannot send response to disposed connection: ServiceResponse{dataForRequester=null, code=SERVER_INTERNAL_ERROR, reason='The pool is closed'}
2021-08-08 14:02:53.123  INFO S xyzjjrhv                     parallel-2 i.t.s.c.c.s.c.ConnectionService          : [Client] Connecting to member: fqfgnyop[192.168.3.2:7511]. Retry times: 0
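下面给出一段用Java解析上述运行日志行的示意代码。注意:这里把原Regex中Python风格的命名分组(?P<name>)改成了Java风格的(?<name>),并对trace ID等字段做了简化匹配,实际解析请以上文给出的Regex为准:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RunLogParser {

    // 对原Regex的简化改写:命名分组改为Java风格,trace ID按非空白字符匹配
    private static final Pattern LOG_PATTERN = Pattern.compile(
            "(?<time>\\d{4}-\\d{2}-\\d{2}\\s\\d{1,2}:\\d{2}:\\d{2}\\.\\d{3})\\s+"
                    + "(?<level>[A-Z]{4,5})\\s+"
                    + "(?<nodetype>[A-Z])\\s+"
                    + "(?<nodeid>\\S*)\\s+"
                    + "(?<traceid>\\S*)\\s+"
                    + "(?<thread>\\S*)\\s+"
                    + "(?<clazz>\\S*)\\s+:\\s(?<msg>.*)");

    public static void main(String[] args) {
        String line = "2021-08-08 09:52:15.602 ERROR S idanvacg 6404110606919452669 "
                + "AsyncGetter-1-thread-1 i.t.s.c.c.s.r.RpcService : "
                + "Cannot send response to disposed connection";
        Matcher matcher = LOG_PATTERN.matcher(line);
        if (matcher.matches()) {
            System.out.println(matcher.group("level") + " | " + matcher.group("traceid")
                    + " | " + matcher.group("msg"));
        }
    }
}
```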

                                      Admin API访问日志(审计日志)

                                      记录管理员对Turms服务端的各种操作。

                                      文件名:turms-service-admin-api.log

                                      格式:管理员账号|管理员IP|请求ID|请求时间|请求API|请求参数|处理结果|处理时间|处理异常信息。其中:

                                      • 会话信息:管理员账号、管理员IP
                                      • 请求信息:请求ID、请求时间、请求API、请求参数。其中,管理员可以通过HTTP响应中的Header X-Request-ID获得请求ID,并配合日志来进行故障排查或行为追踪
                                      • 响应信息:处理结果、处理时间、处理异常信息

                                      示例:

2021-09-02 07:19:27.219  INFO S wzocsebz 3501287524626242885 Thread-28 : turms|0:0:0:0:0:0:0:1|db612e82-199|2021-09-02 07:30:30.414|updateUser|1|{ids=[1], updateUserDTO=UpdateUserDTO[password=******, name=null, intro=null, profileAccess=null, permissionGroupId=null, registrationDate=null, isActive=null]}|TRUE|

                                      客户端API访问日志

由于客户端API访问日志数据是企业的重要资产,因此再次强调:该日志看似简单常规,但其衍生出的运营数据可以高达上百项,既是企业的宝库,也是指引产品发展方向的灯塔。宁可因为100%采样落盘导致服务端吞吐量大减,也不建议您修改相关配置,除非您明确知道且能承受修改参数后带来的后果。

                                      turms-gateway服务端

                                      文件名:turms-gateway-client-api.log

                                      格式:会话ID|用户ID|设备|版本|IP|请求ID|请求类型|请求大小|请求时间|响应状态码|响应数据类型|响应大小|处理时间。其中:

                                      • 会话信息:会话ID、用户ID、设备、版本、IP
                                      • 请求信息:请求ID、请求类型、请求大小、请求时间
                                      • 响应信息:响应状态码、响应数据类型、响应大小、处理时间

                                      示例:

                                      2021-08-17 13:21:10.082  INFO G ocnpinxk 4073578036035627538 gateway-tcp-worker-18-2 : 1669286372|100|DESKTOP|1|0:0:0:0:0:0:0:1|6275734689527119988|CREATE_GROUP_MEMBER_REQUEST|32|2021-08-17 13:21:10.079|1201||21|3
                                       2021-08-17 13:21:10.086  INFO G ocnpinxk 8485909300068121199 gateway-tcp-worker-18-1 : 315622910|101|DESKTOP|1|0:0:0:0:0:0:0:1|8981788720014999664|QUERY_GROUP_JOIN_REQUESTS_REQUEST|17|2021-08-17 13:21:10.082|1201||21|4
通知日志

turms-service服务端

                                      文件名:turms-service-notification.log

                                      格式:通知触发用户ID|发送状态|通知目标用户数|会话关闭状态码|通知大小|通知转发的请求ID|通知转发的请求类型。其中:

                                      • 通知触发用户信息:通知触发用户ID
                                      • 通知接收用户信息:通知接收用户数量、在线的通知接收用户数量
                                      • 通知信息:会话关闭状态码、通知大小
                                      • 通知转发的请求信息:通知转发的请求ID、通知转发的请求类型

                                      示例:

2021-09-03 00:08:22.537  INFO S hkivjeav 3166178398923546492 -client-io-15-3 : 149|1|1||75|4971734074638762694|UPDATE_FRIEND_REQUEST_REQUEST
2021-09-03 00:08:37.636  INFO S hkivjeav 8332948877634499289 -client-io-15-3 : 190|1|0||19|6469201046445182337|UPDATE_TYPING_STATUS_REQUEST

                                      慢日志

                                      TODO

                                      采集与分析

                                      Turms只提供原始数据,不提供也没计划提供日志采集与分析功能。

                                      原因

• 现在云厂商都支持日志的采集、解析、存储、检索、分析与报警等高级服务,可以通过SQL检索来获取各种高维度统计数据与图表(诸如:日活、月活、日消息发送量、会话存留时长、新会话占比、留存率等运营数据)。正是因为该方案已成为行业最佳实践之一,所以Turms自身不提供这些相对复杂、更适合大数据项目来做的功能。
                                      • 日志收集相关技术都很常规。但从商业价值角度去合理规划什么日志应该收集,什么字段应该索引、什么日志应该实时分析、什么日志应该离线分析,这些与商业价值与成本直接挂钩的问题才是难点所在。因此在商业价值考量方面,Turms只能给建议,而非直接插手干预。
                                      • 日志相关服务与产品百家争鸣,而Turms服务端的日志相关实现应当保持中立,因此Turms服务端自身不接入任何的SDK,只提供原始日志供日志相关服务采集。
                                      • 从微服务职责划分的角度来看,Turms服务端的功能也不应该过于耦合。

                                      链路追踪

                                      作用

                                      面向请求,用于快速追踪请求在节点之间与具体节点内的执行情况。

                                      实现

                                      在链路追踪实现规范OpenTracing中,其规定了要使用Trace与Span作为链路追踪的单位。但与动辄数十个、上百个甚至上千个微服务应用相比,Turms的调用链路极为简单,完全不需要通过Span信息来追踪请求。并且,如果Turms采用标准OpenTracing实现,那么很多请求的链路追踪附加信息甚至会比大部分的RPC请求正文还大。

                                      因此,Turms仅仅是在所有日志中添加了一个用于表示trace ID的字段,开发者在进行链路追踪时,仅需要通过查询trace ID字段,即可明白该请求经过的所有节点,与在节点内的执行情况。

                                      监控与报警

                                      在可观测体系中,系统需要根据度量与日志来实时监控服务端运行状态,并在发现系统异常时进行报警通知。

                                      Turms不提供且也没计划提供报警功能。一方面,诸如AWS CloudWatch这样的云服务或其他相关产品都提供了极为丰富、成熟且开箱即用的度量与日志的采集、分析与报警等功能。如果用户熟悉云服务产品,从头开始购买云服务并实现Turms的监控与报警,通常也只需要3~10分钟。另一方面,从微服务职责划分的角度来看,Turms服务端的功能也不应该过于耦合,没必要把这些监控报警功能都集成进来。

即便用户没有计划使用云服务,也可以使用诸如Prometheus Alertmanager这样专业且成熟的开源技术方案。如果用户熟悉相关操作,从零搭建这样的一个系统通常也只需要10~60分钟。

                                        安全

                                        客户端安全

                                        出于安全原因,本文不对Turms暂未提供专门抵御机制的CC攻击进行说明。

                                        客户端黑名单机制

                                        服务端对封禁客户端的处理

                                        当turms-gateway检测到有新的IP或用户ID被封禁时,会首先向已建立且被封禁的会话发送Turms业务层的关闭通知,该通知带有USER_IS_BLOCKED状态码,告知客户端它被封禁了。当数据Flush之后,Turms服务端再自动断开底层TCP连接。

当turms-gateway检测到新建立的TCP连接的对端IP已被封禁,或检测到发送登录请求的用户ID已被封禁时,默认情况下turms-gateway会直接关闭与其的TCP连接,并且不会发送连接关闭原因的通知(如“您的IP/User ID已被封禁XX时间”)。

                                        其中有两点需要注意:

                                        • turms-gateway自身无法在TCP连接建立之前,拒绝与被封禁的IP进行连接。如果您希望在TCP握手之前,服务端就能拒绝对被封禁的IP进行连接,您可通过Turms之后提供的:封禁用户时的回调插件,来通知云服务安全系统封禁IP,从而彻底实现IP封禁。

                                          另外,我们之所以不调用系统服务来彻底封禁IP,这是因为:服务端被强制关闭时,被封禁的IP将不会被自动移除;自行修改底层网络配置可能会和云服务自身的网络管理服务发生冲突,造成服务器异常。

                                        • 在客户端连接或登陆时,turms-gateway会主动断开与封禁的IP或用户的连接,但是并不会发送连接关闭原因的通知。这么做的好处是:1. 云服务的带宽是按出网带宽收费的,入网带宽不收费,因此turms-gateway不发送业务层上的响应,可以减缓被DDoS攻击时带来的带宽费用开销;2. 减少信息暴露,尽量不要给黑客提供有效信息

                                        自动封禁机制

                                        目前支持自动检测并封禁客户端的时机有:

                                        • 当用户发送请求频繁,并达到一定次数时

                                        • 当用户发送的WebSocket帧不符合规范或过大,并达到一定次数时。请求的大小依据WebSocket Frame Header中的Payload Length值

                                        • 当用户发送的Turms客户端请求无法解析或过大,并达到一定次数时。请求的大小依据TCP字节流中客户端请求Header的Payload Length值

                                          补充:

                                          • 服务端检测到数据帧或客户端请求“过大”时,不会继续解析其后续的Payload部分。如果客户端的Payload Length与实际Payload长度不符,则判定为非法请求
                                          • 具体请求大小限制可通过turms.gateway.client-api.max-request-size-bytes配置

                                        换言之,在TCP连接建立后,用户的任何行为都可能触发封禁。

                                        Turms的自动封禁机制采用分级制度,默认提供3个等级,这3个等级的封禁时长分别是:1分钟、30分钟、60分钟。默认配置下,当客户端触发5次非法行为,则服务端会以等级1的配置封禁客户端的IP与用户ID,如果在封禁时间内,又触发了一定次数的非法行为,则进入下一个封禁等级,以此类推。

如果您想要修改默认配置,可以通过turms.security.blocklist.ip.auto-block与turms.security.blocklist.user-id.auto-block前缀,并配合IDEA的智能提示对默认配置进行修改。其具体的配置项声明在im.turms.server.common.infra.property.env.common.security.AutoBlockItemProperties类中。

                                        封禁相关API

管理员可以通过API:/blocked-clients/ips与/blocked-clients/users,分别对封禁IP与封禁用户ID做增删改查操作,具体操作遵循Turms HTTP接口设计的一般规则,故不赘述。

                                        封禁实现原理(拓展知识)

                                        封禁客户端数据的同步实现原理与常见的分布式Replicated Map实现类似。即每个服务端都持有该Map的弱一致的副本,又有一个或多个Redis服务端存有一个基准副本,并且还记录了每个封禁与解封行为的logs,用于各服务端做增量同步。当新服务端上线或某服务端本地logs数据滞后100,000个记录时,这些服务端会向Redis请求全量同步,否则服务端只需以默认的10秒时间间隔向Redis请求增量logs以同步本地副本。

                                        另外Turms目前采用的因果一致性实现是:封禁与解封动作的先后顺序以在Redis的封禁logs队列的插入顺序为基准,各服务端基于该队列的logs顺序,进行因果同步,保证封禁客户端数据的最终一致性。
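下面是对上述“基准副本 + 增量logs”同步机制的一个概念性示意。其中的接口、字段与方法名均为假设(仅全量同步阈值100,000与10秒的增量同步间隔取自上文),并非Turms或Redis的真实数据结构与接口:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class BlocklistReplica {

    /** 假想的远端基准副本接口,对应文中由Redis承担的角色 */
    interface RemoteBlocklist {
        long latestLogId();
        Map<String, Long> fullSnapshot();      // 全量:被封禁ID -> 解封时间戳
        List<BlockLog> logsSince(long logId);  // 增量:从某个log偏移开始的封禁/解封记录
    }

    record BlockLog(long id, String target, boolean block, long unblockTimestamp) {}

    private static final long MAX_LAG = 100_000;

    private final Map<String, Long> localCopy = new ConcurrentHashMap<>();
    private long syncedLogId;

    /** 默认每10秒执行一次:滞后过多则全量同步,否则按顺序回放增量logs */
    void sync(RemoteBlocklist remote) {
        long latest = remote.latestLogId();
        if (latest - syncedLogId > MAX_LAG) {
            localCopy.clear();
            localCopy.putAll(remote.fullSnapshot());
        } else {
            for (BlockLog log : remote.logsSince(syncedLogId)) {
                if (log.block()) {
                    localCopy.put(log.target(), log.unblockTimestamp());
                } else {
                    localCopy.remove(log.target());
                }
            }
        }
        syncedLogId = latest;
    }

    boolean isBlocked(String target) {
        Long until = localCopy.get(target);
        return until != null && until > System.currentTimeMillis();
    }
}
```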

                                        为什么不使用Bloom Filter

                                        基于Bloom Filter实现黑名单功能的理论方案广为人知,但其实Bloom Filter在这场景下有非常多的陷阱,具体而言:

                                        • Bloom Filter支持的功能特性与工程实践都很受限。诸如:

                                          • 在分布式环境下,如何判断“封禁操作”与“解封操作”的先后顺序,并且如何保证最终一致性
                                          • 如何给不同的封禁用户设定不同的封禁时长(如五分钟/半个钟)
  • 如何给被封禁用户附加额外信息,比如附加被拉黑的原因
                                          • 节点间黑名单列表如何同步,如何做增量同步
                                          • 如何实现“取消拉黑操作”,代价是什么

                                          综上,Bloom Filter在分布式环境下,连黑名单系统最为基础的功能都无法实现,就算Bloom Filter配合其他工程实践勉强实现,那Bloom Filter自身的优势也就不存在了。

                                        • 被拉黑用户数据量本身很小,Bloom Filter无法发挥其优势。而且如果只是判断用户是否被拉黑,我们按100万的被封禁的用户ID来看,一共也才需要12MiB或61.4MiB内存(额外补充:这个例子也印证了我们在关于Valhalla项目篇章中提及到的:Java对内存的浪费就让人感觉有些“自暴自弃”了)。因为在实际编程中通常都使用线程安全的集合,且大部分线程安全的Set内部一般都是基于Map实现的,因此下文统一使用的是线程安全的Map:

```java
public static void main(String[] args) {
    int number = 1_000_000;
    var map1 = ConcurrentHashMap.newKeySet((int) (number / 0.75F + 1.0F));
    var map2 = new NonBlockingHashMapLong<>(number);
    // ……
}
```

内存占用测量输出(节选):

     1        40        40   org.jctools.maps.NonBlockingHashMapLong
     1        64        64   org.jctools.maps.NonBlockingHashMapLong$CHM
    11            12583464   (total)
                                        • 存在误差

                                        客户端接口防刷限流

turms-gateway的限流实现采用的是主流的令牌桶算法(如AWS的API Gateway提供的流量整形实现用的就是令牌桶算法)。

                                        基础知识

                                        无论什么算法,其根本都需要计算“被允许的请求数”,下文为统一说明,均用“令牌”(Token)一词指代“被允许的请求数”。另外,下表为该类算法的一般实现,其变种并不会影响其算法的本质,故不进行讨论。

| | 固定时间窗口算法 | 滑动时间窗口算法 | 令牌桶算法 | 漏桶算法 |
|---|---|---|---|---|
| 令牌上限 | 固定或动态令牌上限(通常固定上限) | 固定或动态令牌上限(通常固定上限) | 固定或动态令牌上限(通常固定上限) | 固定或动态令牌上限(通常固定上限) |
| 当前可用令牌数 | 通过单个时间区间来计算 | 通过多个时间区间来计算 | 通过当前存量令牌数来计算 | 通过当前存量令牌数计算 |
| 令牌发放间隔 | 强调粗颗粒度间隔发放(如间隔1分钟) | 强调细颗粒度间隔发放(如间隔15秒) | 强调细颗粒度间隔发放(如间隔1秒) | 强调细颗粒度间隔放行(如间隔1秒) |
| 令牌发放时清空计数 | 是 | 是,但一般只对最早的几个窗口进行清空 | 否 | 否 |
| 资源开销 | 无需定时器,开销极小 | 无需定时器,开销极小 | 无需定时器,开销极小 | 每个会话都需要维护一个MPSC同步队列,与一个定时器来定时Poll队列,开销很大 |
| 实现难度 | 非常简单 | 非常简单 | 非常简单 | 相对麻烦 |
| 总评 | 由于需要清空计数,且颗粒太大,客户端可以在每次令牌发放前突发大量请求,造成“双倍突发流量”的问题 | 避免了“双倍突发流量”的问题,但因为有“清空计数”的操作,所以其控制精度不如令牌桶算法与漏桶算法 | 既可以通过存量令牌来处理突发请求,又可以通过细颗粒度间隔的令牌发放来平滑地对请求进行限流。其实云服务的CPU积分机制就与此类似 | 篇幅略长,见下文 |

                                        漏桶算法与令牌桶算法都具有处理突发请求与平滑地对请求进行限流的能力。但漏桶算法的一个特别作用就是能对下游服务(最主要的就是数据库)进行限流。但对下游进行限流也是有代价的,它要求运维人员能够精准地估算下游服务吞吐量,否则可能造成下游服务一边处于空闲状态,上游服务却在限流的情况。

另外,利用MPSC队列缓存请求,既降低了吞吐量,又增加了内存开销与GC次数,导致常规用户体验更差,并加剧了DDoS攻击效果,这与我们引入防刷限流的目的背道而驰。(补充:通过阅读Turms服务端源码,您会发现Turms在处理客户端请求的流程中,代码都尽可能极致地“轻”,因此为每个用户会话都维护一个MPSC队列算是很重的操作了)

                                        综上,Turms服务端最终使用令牌桶算法
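下面给出令牌桶算法的一个极简实现示意,便于理解“存量令牌吸收突发请求 + 细颗粒度补充令牌”的思路。其中的容量与补充速率均为假设值,并非turms-gateway的实际实现:

```java
public class TokenBucket {

    private final int capacity;            // 桶内令牌上限
    private final int refillPerInterval;   // 每个时间间隔补充的令牌数
    private final long intervalMillis;     // 补充令牌的时间间隔(细颗粒度,如1秒)

    private double tokens;
    private long lastRefillTime;

    public TokenBucket(int capacity, int refillPerInterval, long intervalMillis) {
        this.capacity = capacity;
        this.refillPerInterval = refillPerInterval;
        this.intervalMillis = intervalMillis;
        this.tokens = capacity;
        this.lastRefillTime = System.currentTimeMillis();
    }

    /** 每收到一个客户端请求调用一次:有令牌则放行,没有令牌则限流 */
    public synchronized boolean tryAcquire() {
        refill();
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;
    }

    // 惰性补充令牌:按已经过去的时间间隔数补充,且不超过桶的容量
    private void refill() {
        long now = System.currentTimeMillis();
        long elapsedIntervals = (now - lastRefillTime) / intervalMillis;
        if (elapsedIntervals > 0) {
            tokens = Math.min(capacity, tokens + elapsedIntervals * refillPerInterval);
            lastRefillTime += elapsedIntervals * intervalMillis;
        }
    }

    public static void main(String[] args) {
        TokenBucket bucket = new TokenBucket(60, 1, 1000); // 假设值:容量60,每秒补充1个令牌
        System.out.println(bucket.tryAcquire()); // true:存量令牌可以吸收突发请求
    }
}
```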

                                        特别一提的是:相比于传统HTTP服务端,其接收并处理一次常规HTTP请求与响应的CPU与内存所需系统资源可能百倍于Turms服务端与其客户端交互所需系统资源(如:除开网络层协议头,Turms客户端一个请求的平均大小约32B)。因此并不需要把少部分用户的突发Turms客户端请求太当回事,可能处理上百个Turms客户端请求所用系统资源就跟处理一个HTTP请求差不多(当然,还有其他形态的CC攻击会造成大量资源消耗)。

                                        其他:

• turms-gateway不支持并且目前也没计划支持全局的限流实现,原因是:全局限流通常是过度设计,它为了能时刻缓解DDoS攻击,需要引入Redis等额外故障点,并拉低整个系统的请求处理吞吐量,很多时候顾此失彼、得不偿失
                                        • Turms暂不支持给不同类型的请求赋予不同的权重,如登录请求需要3个令牌,发送消息请求需要1个令牌
                                        • turms-gateway支持运行时零停机更新令牌桶算法的配置

                                        用户信息安全

                                        对于大部分国内稍微有些网龄的群体,除非其具有很强的安全意识,他们的明文密码极有可能已经泄漏了(具体内容可以通过社工库进行了解)。结合大部分用户使用的密码都比较固定,因此不管服务端再怎么加密,其实“密码”的安全性还是偏低。

                                        TODO

                                        管理员安全

                                        管理员认证与授权

                                        认证(Authentication)

                                        认证:服务端基于常见的HTTP Basic authentication实现,确认HTTP请求的发送者是哪位管理员。

配置项:turms.security.password.admin-password-encoding-algorithm,其可选值为:bcrypt(默认)、salted_sha256与noop

支持的密码加密算法
                                        • BCrypt。其cost为硬编码的10(2^10 rounds),用于避免被脱库时,黑客通过彩虹表轻松破解出明文密码。

                                          其具体算法实现可查看turms-server-common子项目下Fork的Bouncy Castle源码实现:org.bouncycastle.crypto.generators.BCrypt#generate

                                        • 加盐SHA-256

                                        • NOOP(明文存储)

                                        特别一提:admin集合里的password字段,其存储形式并不是string(如常见的Base64编码的字符串),而是原始的byte[]字节数据。
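下面是一段使用Bouncy Castle的BCrypt生成密码哈希的示例,演示cost=10(即2^10轮)与128位随机salt的用法。该示例仅演示库的用法,并非Turms的实际加密流程:

```java
import org.bouncycastle.crypto.generators.BCrypt;

import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.HexFormat;

public class PasswordHashDemo {
    public static void main(String[] args) {
        byte[] password = "turms".getBytes(StandardCharsets.UTF_8);

        // BCrypt要求128位(16字节)的随机salt
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);

        // cost为10,即2^10轮,用于提高通过彩虹表破解明文密码的成本
        byte[] hash = BCrypt.generate(password, salt, 10);

        // 存储时需同时保存salt与hash;文中提到admin集合的password字段存储的是原始byte[]数据
        System.out.println(HexFormat.of().formatHex(hash));
    }
}
```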

                                        授权(Authorization)

                                        授权:服务端确认HTTP请求的发送者有什么权限做什么事

                                        由于Turms自身权限管理的需求很简单,因此其设计与实现也比较简单,比如没有用户组、组角色、角色继承等概念,没有用户与角色的多对多关系。具体而言,Turms采用RBAC(基于角色的访问控制)设计方案。

                                        Turms的RBAC模型

Turms的RBAC模型由管理员(Admin)、角色(Role)以及权限(Permission)这三个主体构成。一个管理员只可以有一个角色,一个角色可以有多个权限。其中:

                                        • 每个角色还具有一个字段rank,只有相对高rank的管理员可以增、删与修改相对低rank的管理员账号信息,如密码。
                                        • 权限用于描述角色可以对什么资源进行什么操作,如对用户资源进行增删改查操作
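结合上述RBAC模型,下面给出一个概念性的权限与rank校验示意(类型、字段与方法名均为示意,并非Turms源码):

```java
import java.util.Set;

public class RbacSketch {

    enum Permission { USER_CREATE, USER_DELETE, USER_UPDATE, USER_QUERY }

    record Role(long id, int rank, Set<Permission> permissions) {}

    record Admin(String account, Role role) {}

    /** 是否有权限执行某操作:只看该管理员唯一角色所拥有的权限 */
    static boolean hasPermission(Admin admin, Permission permission) {
        return admin.role().permissions().contains(permission);
    }

    /** 是否可以管理另一位管理员:只有rank相对更高的角色才能增删改rank更低的管理员 */
    static boolean canManage(Admin operator, Admin target) {
        return operator.role().rank() > target.role().rank();
    }

    public static void main(String[] args) {
        Role root = new Role(1, 100, Set.of(Permission.values()));
        Role auditor = new Role(2, 1, Set.of(Permission.USER_QUERY));
        Admin turms = new Admin("turms", root);
        Admin guest = new Admin("guest", auditor);
        System.out.println(hasPermission(guest, Permission.USER_DELETE)); // false
        System.out.println(canManage(turms, guest));                      // true
    }
}
```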
                                        特殊角色——Root

                                        Root是Turms内置的管理员角色,拥有所有管理员权限,并且不能被修改与删除。

                                        特殊根账号——turms

根账号turms拥有Root根角色的权限,其账号名暂不支持修改(但可以通过修改硬编码的im.turms.server.common.domain.admin.constant.AdminConst#ROOT_ADMIN_ACCOUNT值来修改根账号名),其初始密码默认为turms;用户也可以通过配置项turms.security.password.initial-root-password,在admin集合尚未创建、turms-service启动时应用自定义的初始密码。

                                        日志脱敏

                                        TODO


                                        存储服务

                                        Turms自身并不直接提供存储服务,而是在服务端侧开放了存储服务中常见的接口,以供开发者自行实现,而Turms客户端也提供了相对应的存储服务turmsClient.storageService的API,以供开发者自行调用。

                                        注意:

                                        • 开发者完全可以不用Turms客户端与服务端提供的任何接口,而是自己实现一套应用客户端与您自己服务端的交互存储逻辑。Turms只是自己维护了一套常见存储服务的实现,这样大部分开发者就不用自己从零开发了。即便开发者不打算用Turms的存储实现,由于各存储服务实现都是大同小异的,开发者也可以参考Turms的存储实现流程来实现自己的存储逻辑,以节省自研的时间。
                                        • Turms客户端存储服务提供的功能是Turms服务端官方存储服务插件功能的超集,即:Turms客户端存储服务被设计成既可以与Turms服务端官方存储服务插件进行交互,也可以被拓展与其他第三方插件进行交互。

                                        插件接口与配置

                                        存储资源目前一共分为三个类型,分别是:User Profile Picture(用户资料图片)、Group Profile Picture(群组资料图片)与Message Attachment(消息附件)。而每个资源都有其对应的增(改)删查三个函数接口,以供开发者实现。

                                        接口

                                        插件接口:im.turms.service.infra.plugin.extension.StorageServiceProvider

                                        接口函数介绍:

| 资源类型 | 函数名 | 预期作用 | 返回值说明 |
|---|---|---|---|
| 用户资料图片 | deleteUserProfilePicture | 删除用户资料图片 | |
| | queryUserProfilePictureUploadInfo | 查询用户资料图片上传信息 | 返回值格式为Map<String, String>,插件实现者可以自定义任意返回值 |
| | queryUserProfilePictureDownloadInfo | 查询用户资料图片下载信息 | 返回值格式为Map<String, String>,插件实现者可以自定义任意返回值 |
| 群组资料图片 | deleteGroupProfilePicture | 删除群组资料图片 | |
| | queryGroupProfilePictureUploadInfo | 查询群组资料图片上传信息 | 返回值格式为Map<String, String>,插件实现者可以自定义任意返回值 |
| | queryGroupProfilePictureDownloadInfo | 查询群组资料图片下载信息 | 返回值格式为Map<String, String>,插件实现者可以自定义任意返回值 |
| 消息附件 | deleteMessageAttachment | 删除消息附件 | |
| | shareMessageAttachmentWithUser | 将消息附件分享给指定用户 | |
| | shareMessageAttachmentWithGroup | 将消息附件分享给指定群组 | |
| | unshareMessageAttachmentWithUser | 不再将消息附件分享给指定用户 | |
| | unshareMessageAttachmentWithGroup | 不再将消息附件分享给指定群组 | |
| | queryMessageAttachmentUploadInfo | 查询消息附件上传信息 | 返回值格式为Map<String, String>,插件实现者可以自定义任意返回值 |
| | queryMessageAttachmentUploadInfoInPrivateConversation | 查询私聊会话中的消息附件上传信息 | |
| | queryMessageAttachmentUploadInfoInGroupConversation | 查询群聊会话中的消息附件上传信息 | |
| | queryMessageAttachmentDownloadInfo | 查询消息附件下载信息 | 返回值格式为Map<String, String>,插件实现者可以自定义任意返回值 |
| | queryMessageAttachmentInfosUploadedByRequester | 查询请求者上传的消息附件 | |
| | queryMessageAttachmentInfosInPrivateConversations | 查询私聊会话中的消息附件 | |
| | queryMessageAttachmentInfosInGroupConversations | 查询群聊会话中的消息附件 | |

                                        通用配置

| 配置项 | 默认值 | 说明 |
|---|---|---|
| turms.service.storage.user-profile-picture.expire-after-days | 0 | 自创建时间开始,资源的有效时长(天)。0值代表不会过期 |
| turms.service.storage.user-profile-picture.allowed-referrers | | 只允许指定的Referrers访问资源 |
| turms.service.storage.user-profile-picture.allowed-content-type | */* | 允许上传的资源Content-Type。*/*值代表无限制 |
| turms.service.storage.user-profile-picture.min-size-bytes | 0 | 允许上传的资源最小值。0值代表无限制 |
| turms.service.storage.user-profile-picture.max-size-bytes | 1MB | 允许上传的资源最大值。0值代表无限制 |
| turms.service.storage.user-profile-picture.download-url-expire-after-seconds | 300 | 资源下载URL的有效时长(秒) |
| turms.service.storage.user-profile-picture.upload-url-expire-after-seconds | 300 | 资源上传URL的有效时长(秒) |
| turms.service.storage.group-profile-picture... | | 同turms.service.storage.user-profile-picture |
| turms.service.storage.message-attachment... | | 同turms.service.storage.user-profile-picture |

                                        官方插件实现

                                        Bucket的基础设计准则

                                        由于对象存储服务提供的功能都大同小异,Turms当前与未来提供的基于对象存储服务的官方插件都会遵循下述的Bucket设计准则。

如上所述,Turms目前包括三类存储资源,分别是User Profile Picture(用户资料图片)、Group Profile Picture(群组资料图片)与Message Attachment(消息附件),它们各自所对应的Bucket名分别为user-profile-picture、group-profile-picture与message-attachment。其中:

• user-profile-picture与group-profile-picture为公开Buckets。对于这些资源的URL,Turms既支持生成规律的URL,以支持客户端自行预测资源URL,避免向Turms服务端发送查询资源URL的请求;也支持生成不规律的URL,以用于反爬虫。具体您的应用需要使用哪种URL,则要根据您产品自身的需求决定。
                                        • message-attachment为私有Bucket,通过Presigned URL为授权的用户提供临时访问消息附件用的URL。
                                        • 所有资源的上传流程都是基于通过Presigned URL为授权的用户提供临时的Multipart Upload接口实现的。

当然,以上只是默认配置,当前主流对象存储服务都支持许多实用特性,如数据冷热分离存储(如Amazon S3 Intelligent-Tiering Storage Class)、加密、复杂的权限控制等等,用户可以在Turms创建的Buckets的基础上,再自行通过对象存储服务进行进一步的配置。
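上文提到,资源的上传与下载均基于Presigned URL实现。下面是一段使用MinIO Java SDK生成预签名URL的极简示例,其中端点与账号取自下文配置表的默认值,Object Key为假设值;该示例仅演示SDK的用法,并非turms-plugin-minio的实际实现:

```java
import io.minio.GetPresignedObjectUrlArgs;
import io.minio.MinioClient;
import io.minio.http.Method;

public class PresignedUrlDemo {
    public static void main(String[] args) throws Exception {
        // 端点与账号取自下文配置表中的默认值
        MinioClient client = MinioClient.builder()
                .endpoint("http://localhost:9000")
                .credentials("minioadmin", "minioadmin")
                .build();

        // 为假设的Object Key "123456789" 生成一个300秒内有效的上传(PUT)预签名URL
        String uploadUrl = client.getPresignedObjectUrl(
                GetPresignedObjectUrlArgs.builder()
                        .method(Method.PUT)
                        .bucket("message-attachment")
                        .object("123456789")
                        .expiry(300)
                        .build());
        System.out.println(uploadUrl);
    }
}
```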

                                        turms-plugin-minio

                                        简介

                                        turms-plugin-minio是一个基于开源对象存储服务MinIO而开发的turms-service存储服务实现插件。

                                        安装

                                        当插件在服务端Start之后,客户端即可调用turmsClient.storageService下对应的API,对存储资源进行增删改查操作。

                                        客户端调用存储相关接口时的注意事项

                                        由于Turms客户端的存储接口采用的是通用接口设计,并不是为turms-plugin-minio定制的,因此在调用客户端API时,需要注意以下事项:

                                        • 当调用queryMessageAttachment接口时,参数fetchDownloadInfo必须为true;当调用queryMessageAttachmentDownloadInfo接口时,参数fetch必须为true

                                        业务功能

消息附件功能

上传消息附件

| 功能 | 支持 |
|---|---|
| 不指定任何会话,上传消息附件 | TODO |
| 上传消息附件给指定单个私聊会话 | |
| 上传消息附件给指定多个私聊会话 | |
| 上传消息附件给指定单个群聊会话 | |
| 上传消息附件给指定多个群聊会话 | |

删除消息附件

| 功能 | 支持 |
|---|---|
| 删除任意会话中的消息附件 | TODO |

分享与取消分享

| 功能 | 支持 |
|---|---|
| 分享已上传的消息附件给单个私聊会话 | |
| 分享已上传的消息附件给多个私聊会话 | |
| 分享已上传的消息附件给单个群聊会话 | |
| 分享已上传的消息附件给多个群聊会话 | |
| 取消分享已上传的消息附件给单个私聊会话 | TODO |
| 取消分享已上传的消息附件给多个私聊会话 | |
| 取消分享已上传的消息附件给单个群聊会话 | TODO |
| 取消分享已上传的消息附件给多个群聊会话 | |

                                        对于更高级的分享功能,诸如细致的权限控制、自定义分享时长、加密分享等功能,近期暂无计划支持。

                                        查询
| 功能 | 支持 |
|---|---|
| 指定单个私聊会话中,对方分享给我的附件 | |
| 指定单个私聊会话中,我发送给对方的附件 | |
| 指定单个私聊会话中,对方分享给我的附件与我发送给对方的附件 | |
| 指定多个私聊会话中,对方分享给我的附件 | |
| 指定多个私聊会话中,我发送给对方的附件 | |
| 指定多个私聊会话中,对方分享给我的附件与我发送给对方的附件 | |
| 所有私聊会话中,对方分享给我的附件 | |
| 所有私聊会话中,我发送给对方的附件 | 不支持“只查询私聊会话中,我发送给对方的附件”,但支持“在所有会话中,我分享的附件” |
| 所有私聊会话中,对方分享给我的附件与我发送给对方的附件 | |
| 指定单个群聊会话中,指定单个用户(可以是我自己)分享的附件 | |
| 指定单个群聊会话中,指定多个用户(可以包括我自己)分享的附件 | |
| 指定单个群聊会话中,所有用户分享(包括我自己)的附件 | |
| 指定多个群聊会话中,指定单个用户(可以是我自己)分享的附件 | |
| 指定多个群聊会话中,指定多个用户(可以包括我自己)分享的附件 | |
| 指定多个群聊会话中,所有用户分享(包括我自己)的附件 | |
| 所有群聊会话中,指定单个用户分享的附件 | 不支持“所有群聊会话中,指定我分享的附件”,但支持“在所有会话中,我分享的附件” |
| 所有群聊会话中,指定多个用户(可以包括我自己)分享的附件 | |
| 所有群聊会话中,所有用户分享(包括我自己)的附件 | |
| 在所有会话中,我分享的附件 | |
| 在所有会话中,其他各种查询对象 | |

Permission control

• Viewing message attachments

  • Users who send message attachments always have permission to query the attachments they uploaded, regardless of whether they have left the private or group conversation.

    Moreover, even if the uploader leaves the conversation, all other users in that conversation still have permission to view the attachments that user uploaded.

  • Users can view message attachments shared by other users only in private or group conversations they have joined. In other words, if a user joins a conversation and later leaves it, they can no longer view the attachments in that conversation; only after rejoining the conversation do they regain permission to view its attachments.

Security

Upload limits: TODO

Validating stored file data

If the validation of stored file data is built on cloud services, the logic is relatively simple. On AWS, for example, S3 event notifications can trigger a custom Lambda function to validate what users upload, or a Lambda@Edge function listening to the origin-response event can be added on the CloudFront side to perform the validation; apart from writing a little custom validation code, almost everything else can be done with a few mouse clicks.

However, since MinIO, as a standalone storage service, does not support serverless features such as Lambda functions, implementing low-cost, highly available validation logic on top of MinIO's event mechanism is much more troublesome than a serverless solution. Turms therefore does not yet validate stored file data; support will be added later.
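Purely to illustrate the AWS-side approach described above (this is not something Turms or turms-plugin-minio provides), a hypothetical S3-triggered Lambda handler might look roughly like the following; the class name, the size rule and the 1 MB limit are assumptions for the example.

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.S3Event;

// A hypothetical validator triggered by S3 "ObjectCreated" event notifications.
public class UploadValidationHandler implements RequestHandler<S3Event, Void> {

    private static final long MAX_SIZE_BYTES = 1024 * 1024;

    @Override
    public Void handleRequest(S3Event event, Context context) {
        for (var record : event.getRecords()) {
            String bucket = record.getS3().getBucket().getName();
            String key = record.getS3().getObject().getKey();
            long size = record.getS3().getObject().getSizeAsLong();
            // Illustrative rule: flag uploads that exceed the allowed size;
            // a real validator could delete the object or move it to quarantine.
            if (size > MAX_SIZE_BYTES) {
                context.getLogger().log("Invalid upload: " + bucket + "/" + key);
            }
        }
        return null;
    }
}
```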

Configuration

| Configuration item | Default | Description |
|---|---|---|
| turms-plugin.minio.enabled | true | Whether to enable the plugin |
| turms-plugin.minio.endpoint | "http://localhost:9000" | Address of the MinIO server |
| turms-plugin.minio.region | "" | Region of the MinIO server |
| turms-plugin.minio.access-key | minioadmin | Access key for the MinIO server |
| turms-plugin.minio.secret-key | minioadmin | Secret key for the MinIO server |
| turms-plugin.minio.retry.enabled | true | Whether to retry when bucket initialization fails |
| turms-plugin.minio.retry.initial-interval-millis | 30_000 | Initial retry interval when bucket initialization fails |
| turms-plugin.minio.retry.interval-millis | 30_000 | Retry interval when bucket initialization fails |
| turms-plugin.minio.retry.max-attempts | 3 | Maximum number of retries when bucket initialization fails |
| turms-plugin.minio.resource-id.mac.enabled | false | Whether to apply a MAC algorithm to the object key of a resource to generate unpredictable URLs as an anti-crawler measure. If disabled, users can derive the corresponding picture URL from a user ID or group ID. The final resource URL is <bucket>/<base62(object key)><base62(mac(object key))>, e.g. user-profile-picture/123456789 => user-profile-picture/8M0kX1aEllpuvXRV09grkIEtD4R. Note: if the MAC algorithm is enabled, clients must set the parameter fetch to true when calling the queryXXXDownloadInfo APIs, and set fetchDownloadInfo to true when calling the queryXXX APIs |
| turms-plugin.minio.resource-id.mac.base64-key | "AHR1cm1zLWltL3R1cm1zgA==" | Base64-encoded key of the MAC algorithm |
| turms-plugin.minio.resource-id.base62.enabled | false | Whether to encode the object key of a resource with Base62 to shorten the URL. The final resource URL is <bucket>/<base62(object key)>, or <bucket>/<base62(object key)><base62(mac(object key))>, e.g. user-profile-picture/123456789 => user-profile-picture/8M0kX, or user-profile-picture/123456789 => user-profile-picture/8M0kX1aEllpuvXRV09grkIEtD4R. Note: 1. When turms-plugin.minio.resource-id.mac.enabled is true, Base62 encoding is always applied. 2. If Base62 encoding is enabled, clients must set fetch to true when calling the queryXXXDownloadInfo APIs, and set fetchDownloadInfo to true when calling the queryXXX APIs |
| turms-plugin.minio.resource-id.base62.charset | ... | Character set of the Base62 algorithm |
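To make the URL composition in the resource-id rows above concrete, here is a hypothetical sketch of deriving base62(object key) + base62(mac(object key)). The HMAC-SHA256 algorithm and the way the key bytes are fed into the MAC are assumptions for illustration only; only the 0-9A-Za-z alphabet is inferred from the 123456789 => 8M0kX example above.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative only: the plugin's real algorithm and charset are controlled
// by its own configuration, not by this sketch.
public class ResourceIdSketch {
    private static final String ALPHABET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    public static void main(String[] args) throws Exception {
        long objectKey = 123456789L; // e.g. a user ID
        String encodedKey = base62(BigInteger.valueOf(objectKey)); // "8M0kX"

        byte[] key = Base64.getDecoder().decode("AHR1cm1zLWltL3R1cm1zgA==");
        Mac mac = Mac.getInstance("HmacSHA256"); // assumed MAC algorithm
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        byte[] digest = mac.doFinal(Long.toString(objectKey).getBytes(StandardCharsets.UTF_8));

        String resourceId = encodedKey + base62(new BigInteger(1, digest));
        System.out.println("user-profile-picture/" + resourceId);
    }

    // Encodes a non-negative integer with the assumed Base62 alphabet.
    private static String base62(BigInteger value) {
        if (value.signum() == 0) {
            return "0";
        }
        BigInteger base = BigInteger.valueOf(62);
        StringBuilder sb = new StringBuilder();
        while (value.signum() > 0) {
            BigInteger[] divRem = value.divideAndRemainder(base);
            sb.append(ALPHABET.charAt(divRem[1].intValue()));
            value = divRem[0];
        }
        return sb.reverse().toString();
    }
}
```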

System Resource Management

The importance of memory and CPU resources to the servers is self-evident. Each Turms module uses memory and CPU rather aggressively; see the documentation and code of each module for details. On the other hand, to keep the servers running properly, they also provide an internal health-check mechanism which, combined with the upper-layer "reject service" mechanism, makes a best effort to keep the servers healthy.

Turms provides the system resource monitoring configuration class im.turms.server.common.infra.property.env.common.healthcheck.HealthCheckProperties, which lets users configure the allowed memory usage and CPU usage. The HealthCheckManager of the Turms servers keeps checking the available physical memory and the CPU usage; if it detects that the available physical memory is too low or the CPU usage is too high, it will:

• Mark its own isHealthy flag in the service registry as false. Since RPC senders only pick RPC responders among servers whose isHealthy is true, this achieves a backpressure-like effect.
• Refuse to serve. Specifically: a turms-gateway server rejects new session establishment and user request handling; a turms-service server rejects RPC requests sent by turms-gateway servers (note: even in the "unhealthy" state, turms-service still serves the admin API).

Memory Management

JVM memory basics

The memory areas of the HotSpot JVM can be divided into:

• Heap memory: the Eden space, the Survivor spaces, and the old generation

• Non-heap memory

  • Direct memory: the direct buffer pool
  • JVM-internal memory: native method stacks, Metaspace, the code cache, etc.

  Note in particular: the non-heap memory returned by java.lang.management.MemoryMXBean#getNonHeapMemoryUsage does NOT include the direct buffer pool. Specifically, in JDK 21 this method covers the following memory spaces (see the snippet below):

  • CodeHeap 'non-nmethods'
  • CodeHeap 'non-profiled nmethods'
  • CodeHeap 'profiled nmethods'
  • Compressed Class Space
  • Metaspace

Reference: How to Monitor VM Internal Memory
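To see this distinction in practice, the following minimal snippet (JDK 9+; not Turms' internal monitoring code) reads heap memory, non-heap memory and the direct buffer pool separately:

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class MemoryRegions {
    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        // Heap: Eden + Survivor + old generation
        System.out.println("Heap used: " + memory.getHeapMemoryUsage().getUsed());
        // "Non-heap" here means Metaspace, Compressed Class Space, the CodeHeap segments...
        System.out.println("Non-heap used: " + memory.getNonHeapMemoryUsage().getUsed());
        // ...but NOT the direct buffer pool, which must be read separately:
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.println(pool.getName() + " used: " + pool.getMemoryUsed());
        }
    }
}
```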

Use of managed memory

The managed memory of the Turms servers refers to two areas: heap memory and direct memory.

Heap memory

Practical significance

The practical guideline for heap memory is easy to understand: configure as large a heap as possible to reduce the number of GC runs and stop-the-world events.

Configuration

The default JVM heap configuration is:

-XX:MaxRAMPercentage=75
-XX:InitialRAMPercentage=75

Where:

• InitialRAMPercentage and MaxRAMPercentage specify how much memory to reserve, but page faults still occur when the Turms server touches that memory. The JVM can turn reserved memory directly into committed memory via AlwaysPreTouch to avoid page faults at runtime, but with that option enabled it becomes hard to monitor how much heap is actually being used, so adding it is currently not recommended.
• Setting InitialRAMPercentage and MaxRAMPercentage to the same value is mainly to keep the memory as contiguous as possible and to prevent the server from repeatedly triggering GC and stop-the-world operations while growing and shrinking the heap.
• The heap is not configured close to 100% so that the remaining physical memory is left to the JVM's own off-heap memory (direct memory, the largest consumer, plus the code cache, Metaspace, etc.), the OS kernel (e.g. buffers for maintaining TCP connections), and sidecar services (e.g. a log collection service).

In addition, it is recommended not to allocate more than 32 GB of memory to a single Turms server in production, because this:

• keeps the JVM's compressed-pointer optimization enabled, reducing unnecessary memory usage
• avoids a single server carrying too much load, which softens the thundering-herd effect during shutdown and improves user experience

Direct memory

All direct memory described below is, in the actual code, allocated by PooledByteBufAllocator.DEFAULT, i.e. it is direct memory cached and managed by Netty.

Practical significance

The upper limit of direct memory determines the peak number of client requests and admin API requests a Turms server can handle at the same time.

Main consumers
• Network I/O operations, e.g. the Netty-based third-party drivers such as mongo-driver-java and Lettuce, and the Turms servers' own client-facing TCP/HTTP server implementations.
• Logging. Turms' home-grown logging implementation writes Java primitive data directly into direct memory blocks and then writes them to file descriptors.

In other words, for virtually every memory area that the OS kernel needs to access, we use direct memory directly to avoid pointless copies from the heap (see the sketch below).

Note: on Linux, the direct memory used by Turms still lives in user space, so writing it to a device (such as the NIC or disk) still requires two copies, from user space to kernel space and from kernel space to the device, and these two copies cannot be avoided by an application-level server.
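A tiny sketch (illustrative, not Turms source code) of allocating from the same pooled allocator and observing how much direct memory Netty currently holds:

```java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;

public class DirectMemoryExample {
    public static void main(String[] args) {
        PooledByteBufAllocator allocator = PooledByteBufAllocator.DEFAULT;
        // Allocate a pooled direct buffer, as the I/O and logging paths do.
        ByteBuf buffer = allocator.directBuffer(4096);
        buffer.writeBytes("hello".getBytes());
        System.out.println("Used direct memory: " + allocator.metric().usedDirectMemory());
        // Return the buffer to the pool; the memory is cached by Netty, not freed to the OS.
        buffer.release();
    }
}
```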

Lifecycle

In the Turms servers, the lifecycle of direct memory closely follows the lifecycle of client requests and admin API requests: a block of direct memory usually exists only during part or all of a single request's lifecycle. Roughly, the lifecycle is as follows:

• A request's lifecycle starts when Netty slices the TCP byte stream. Netty splits the stream according to the varint-encoded header (whose value is the payload length); when that slice is produced (note: no memory copy happens here), the lifecycle of the direct memory representing the request begins.

• After the Turms server parses that memory into a concrete request model, it decides whether this type of request needs the direct memory that represents it. If the handling logic does not need it, the memory is immediately returned to Netty's memory pool; otherwise, for requests such as "forward a user message" that do need it, the memory is not recycled right away. Turms then runs the business logic for the request.

• During business processing, other network I/O operations (such as requests to MongoDB/Redis) or logging may be involved. Both take new direct memory blocks from the Netty-managed pool to encode MongoDB/Redis client requests and decode their responses, or to write logs.

• Once the Turms server finally flushes the direct memory of the response to the NIC, all direct memory involved in the process is recycled, except the direct memory representing log records.

  The only exception: if the direct memory of a request needs to be forwarded to multiple clients, Turms decouples the lifecycle of the request from that of its direct memory via a reference counter, so that the same block of direct memory can be forwarded to multiple clients without copying (see the sketch below).

  Note:

  1. The direct memory "recycling" described above does not return memory to the OS; it returns it to the Netty-managed pool, and the memory is not actually freed at that point.
  2. Direct memory is mainly released as follows: when a pooled ByteBuf is released, Netty checks whether the chunk it belongs to has become idle (0% usage); if so, it truly frees the memory via io.netty.buffer.PoolArena#destroyChunk.

Because of this lifecycle, the real usage of heap memory and direct memory is correlated. Heap usage grows mainly because of the logic run after the Turms server receives a client request or admin API request, while during the same process direct memory usage rises because of request decoding and response encoding, the encoding/decoding of the network I/O inside the logic, and logging. When the request's lifecycle ends, both the heap memory and the direct memory can be reclaimed.
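The reference-counting trick mentioned in the exception above can be pictured with this small sketch (illustrative only; in a real server channel.writeAndFlush performs the release that release() stands in for here):

```java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;

public class FanOutExample {
    public static void main(String[] args) {
        ByteBuf notification = PooledByteBufAllocator.DEFAULT.directBuffer(64)
                .writeBytes("a message to forward".getBytes());
        int recipients = 3;
        // Take one extra reference per additional recipient so the buffer
        // is only returned to the pool after the last write completes.
        notification.retain(recipients - 1);
        for (int i = 0; i < recipients; i++) {
            // Each write releases one reference when it completes.
            notification.release();
        }
        System.out.println("refCnt: " + notification.refCnt()); // 0: back in the pool
    }
}
```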

Memory health check

Configuration

Configuration class: im.turms.server.common.infra.property.env.common.healthcheck.MemoryHealthCheckProperties

As mentioned above, it is very hard, even unrealistic, for operators to estimate precisely how much memory a server should use, especially because the memory taken by some critical kernel facilities (such as TCP connections) changes dynamically. Therefore, besides options such as maxAvailableMemoryPercentage and maxAvailableDirectMemoryPercentage that cap the memory Turms servers may use, MemoryHealthCheckProperties also provides minFreeSystemMemoryBytes, which lets the Turms server watch the system's free physical memory in real time and make a best effort to keep that much memory free.

Memory monitoring implementation: MemoryHealthChecker

Responsibilities:

• When it detects that the system is short of physical memory, it tells the upper-layer services to reject user sessions and requests, making a best effort not to exhaust physical memory and to avoid using swap.
• If physical memory is short and the used heap memory exceeds heapMemoryGcThresholdPercentage, it calls System.gc() to suggest that the JVM run a full GC (a simplified sketch of this strategy follows these notes).

Note in particular

• As described above, the lifecycle of direct memory closely follows the lifecycle of requests, so even if MemoryHealthChecker detects that the total used memory has exceeded the limit, it does not actively try to free direct memory; it waits for Netty's internal memory management to release it.
• In summary, although the Turms servers make a best effort not to exhaust physical memory, an extreme burst of requests can still exhaust it, in which case swap memory is used. If swap is disabled by the system or insufficient, the Turms server will throw an OutOfMemoryError. Swap can therefore be regarded as the last line of defense, and disabling swap in production is strongly discouraged.
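A deliberately simplified sketch of that strategy follows. The thresholds and the class below are illustrative and are not Turms' MemoryHealthChecker; it assumes HotSpot and JDK 14+ for getFreeMemorySize.

```java
import com.sun.management.OperatingSystemMXBean;
import java.lang.management.ManagementFactory;

// A toy health checker: mark the node unhealthy when free physical memory
// drops below a reserved floor, and suggest a full GC when the heap is nearly full.
public class NaiveMemoryHealthChecker {
    private static final long MIN_FREE_SYSTEM_MEMORY_BYTES = 128L * 1024 * 1024;
    private static final double HEAP_GC_THRESHOLD = 0.95;

    public static boolean isHealthy() {
        OperatingSystemMXBean os =
                (OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();
        if (os.getFreeMemorySize() >= MIN_FREE_SYSTEM_MEMORY_BYTES) {
            return true;
        }
        Runtime runtime = Runtime.getRuntime();
        long usedHeap = runtime.totalMemory() - runtime.freeMemory();
        if (usedHeap > runtime.maxMemory() * HEAP_GC_THRESHOLD) {
            System.gc(); // only a hint to the JVM
        }
        return false; // the caller should start rejecting new sessions and requests
    }
}
```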

On Project Valhalla: Codes like a class, works like an int

Java's memory footprint has long been criticized. For example, the object header of an Integer (12 bytes on a 64-bit system with compressed pointers enabled) is several times larger than the actual int data, and this design flaw also forces workarounds, such as the JVM preferring the cached objects in java.lang.Integer.IntegerCache when Integer objects are used. Compared with many C++ server projects that chase performance, even register-level optimization (such as Nginx and Redis), Java's waste of memory, rooted in its own conservative design flaws, feels almost self-defeating, and worse, that attitude has spread through the whole Java ecosystem. Reading the source code of many well-known Java projects reveals a mindset of "it works, it is comfortable to write, the performance is good enough, and the JVM will GC for us anyway": places that could easily be cached are not cached, basic data structures are misused, and memory is copied back and forth (the most common example being String and StringBuilder, which in practice are copied so many times that the source code is shocking to read). Only a very few projects, such as Netty, keep refining performance, which we have already covered in other chapters, so it is not repeated here.

Project Valhalla restructures the existing Java object model. The original Object is called IdentityObject in the new model, while the new Object becomes the parent of IdentityObject and ValueObject (note: the Valhalla team has not finalized the design, so these concepts may still change); the two are somewhat similar to C#'s reference types and value types. ValueObject is further divided into primitive classes and value classes. Primitive classes let developers define data structures as efficient as Java's traditional eight primitive types: no object header, no pointer dereference on access, stack allocation and therefore no GC, while still being able to declare fields and define methods. Java's traditional eight primitive types will also be redesigned on top of the new object model: a primitive type such as int becomes a primitive class (primitive classes are a kind of value class whose values cannot be null), while its wrapper class Integer and the possibly supported int.ref become value classes (whose values can be null), so the concept of wrapper classes will disappear.

For example, a primitive instance of the class primitive class Point { private double x; private double y; } occupies only the bytes of two doubles, i.e. 16 bytes, with no object header.

Once Project Valhalla ships a preview release, we will adopt ValueObject and rework implementations such as DTO objects and various wrapper classes (e.g. Date and ByteArrayWrapper) to drastically reduce memory overhead and object counts and to speed up GC. Since we have waited for this project for years and know its design well, the adaptation and testing can be finished within a week. This is also the only feature for which we would green-light a preview feature.

Additional notes:

• Java's own history actually confirms a point we have made elsewhere, that "rich IM features come at a fatal price": a feature a project is proud of may hide an abyss behind it.

  Java was once proud of "Everything is an object" and stressed that "Java has no structures or unions as complex data types. You don't need structures and unions when you have classes" (quoted from Sun's 1995 Java white paper: Simple, Object Oriented, and Familiar) to advertise Java as far simpler and easier to use than C and C++.

  (As an aside: looking at Java's history, developers also marvel at the vitality it has shown by repeatedly adapting its direction to the times and fighting its way through one challenge after another.)

  In today's programming practice, however, preaching "everything is an object" while providing no structures is more like a curse. For instance, putting an int into a List<Integer> requires allocating a new object with a useless object header. In other words, as soon as we use common data structures provided by Java, such as List and Map, we waste a great deal of memory, and since these collection classes are unavoidable in real projects, the curse never goes away. (In fact, the internal data structures of HashSet and LinkedList waste even more memory than most developers imagine; their object headers take more memory than the actual data, which is why we describe their source code as shocking.)

  Today, Project Valhalla hopes to change this by introducing the primitive/value class language features. But because it must stay compatible with the huge existing Java ecosystem while freeing Java from the traditional everything-is-an-object curse, its progress has been like walking on thin ice: the design drafts alone have been scrapped many times, nearly eight years have passed without a preview feature, and it will still take a long time for developers to relearn the new Java language model. Clearly, a feature a project is proud of in its early days can become a "curse" in its middle and later stages, a headache for both maintainers and users.

  IM feature design works the same way: a design with lasting vitality should follow the "less is more" philosophy. "Rich IM features" looks like something to be proud of, and developers initially assume the open-source IM project has built everything for them and they hardly need to do anything themselves. But there is always a price: the project's extensibility may be terrible, and in the middle and later stages extending it may be harder than rewriting it yourself.

• If Java did not have Project Valhalla, the Turms servers might originally have been started in C#.

Reference: the Java language model under Project Valhalla

Threads

Since the Turms servers have no blocking I/O (network requests such as RPC, MongoDB and Redis are all implemented asynchronously on top of Netty, which on Linux ultimately boils down to epoll-related operations), the number of threads they need is far smaller than that of a traditional Java web application.

Taking a 16-core CPU as an example, the peak thread count of turms-gateway and turms-service ranges roughly between 80 and 140 (including JVM-internal threads); the exact peak depends on the number of CPU cores and the number of servers running (e.g. one turms-gateway can start TCP, WebSocket and UDP servers at the same time).

Notably, the peak thread count of Turms is independent of the number of concurrently online users and the request QPS.

Note: precisely because the number of threads the Turms servers use is small relative to the number of CPU cores, some code simply uses ThreadLocal to cache relatively large, non-thread-safe objects, and compared with traditional servers Turms also greatly reduces the overhead of thread context switching.

CPU health monitoring

Configuration class: im.turms.server.common.infra.property.env.common.healthcheck.CpuHealthCheckProperties

Purpose: monitor CPU usage. If CPU usage is detected above the threshold N times, the node's isHealthy is set to false, the state is shared with the other nodes, and the node refuses to serve until the CPU usage is healthy again. See the configuration class above for the specific options.

Turms thread list

| Scope | Category | Thread name | Count | Purpose |
|---|---|---|---|---|
| Common | Admin HTTP server | turms-admin-http-acceptor | 1 | Acceptor thread of the admin HTTP server |
| | | turms-admin-http-worker | Number of CPU cores | Worker threads of the admin HTTP server |
| | Client blocklist | turms-client-blocklist-sync | 1 | Synchronizes blocklist data across the cluster |
| | Health check | turms-health-checker | 1 | |
| | Logging | turms-log-processor | 1 | Formats and writes logs |
| | Shutdown | turms-shutdown | 1 | Schedules the shutdown tasks of each component when the server shuts down |
| | Scheduled tasks | turms-task-manager | 1 | Schedules recurring tasks |
| | Cluster implementation | turms-node-connection-client-io | Number of CPU cores | Node communication I/O threads |
| | | turms-node-connection-keepalive | 1 | Sends heartbeats between nodes periodically and evicts peer nodes whose heartbeats have expired |
| | | turms-node-connection-retry | 1 | Node connection retry thread |
| | | turms-node-connection-server-acceptor | 1 | Acceptor thread of the node connection server |
| | | turms-node-connection-server-worker | Number of CPU cores | Worker threads of the node connection server |
| | | turms-node-discovery-change-notifier | 1 | Notifies node add/remove/update events |
| | | turms-node-discovery-heartbeat-refresher | 1 | Used by the leader node to refresh heartbeat times in the service registry |
| | Redis client | lettuce-event-loop | | Redis client I/O threads |
| | MongoDB | turms-mongo-change-watcher | 1 | Runs MongoDB Change Stream callbacks |
| | | mongo-event-loop | | MongoDB client I/O threads |
| turms-gateway | Fake clients | turms-fake-client | Number of CPU cores | Fake Turms client I/O threads |
| | | turms-fake-client-manager | 1 | Schedules fake Turms clients to send requests |
| | | turms-client-heartbeat-refresher | 1 | Refreshes client heartbeats periodically in batches |
| | Gateway servers | turms-gateway-udp-acceptor | 1 | Acceptor thread of the UDP server |
| | | turms-gateway-udp-worker | Number of CPU cores | Worker threads of the UDP server |
| | | turms-gateway-tcp-acceptor | 1 | Acceptor thread of the TCP server |
| | | turms-gateway-tcp-worker | Number of CPU cores | Worker threads of the TCP server |
| | | turms-gateway-ws-acceptor | 1 | Acceptor thread of the WebSocket server |
| | | turms-gateway-ws-worker | Number of CPU cores | Worker threads of the WebSocket server |
| | | turms-gateway-idle-connection-timeout-timer | 1 | Watches for and closes network connections that have not established an application-layer user session for a long time |
| | Client throttling and anti-abuse | turms-ip-request-token-bucket-cleaner | 1 | Cleans up expired token bucket data |

Thread model

(Related documentation: Linux System Reference Configuration, Source Code: Network Configuration)

The business TCP/WebSocket servers and the HTTP admin API server

The business TCP/WebSocket servers and the HTTP admin API server are all implemented with the main/sub reactor multi-threading model. Specifically, each uses one acceptor thread (the main reactor group, i.e. the boss EventLoopGroup) and a worker thread group with as many threads as CPU cores (the sub reactor group, i.e. the worker EventLoopGroup); a minimal sketch in plain Netty follows this list. In detail:

• The acceptor thread uses io.netty.channel.nio.NioEventLoop#run to listen for TCP client connection events on the ServerSocketChannel, creates the corresponding SocketChannel for each connected TCP client, and assigns it to a worker thread for subsequent processing.

  The acceptor threads are named turms-gateway-tcp-acceptor, turms-gateway-ws-acceptor and turms-admin-http-acceptor.

  Main related Linux setting: net.core.somaxconn (the maximum length of the TCP accept queue).

• A worker thread can bind and handle multiple SocketChannels. Through io.netty.channel.nio.NioEventLoop#run it keeps listening for read events on its SocketChannels and handles pending write tasks, and while reading and writing the byte streams it runs the encode/decode functions of the ChannelHandlers in the ChannelPipeline to do the byte encoding and decoding.

  After a worker thread finishes decoding a client request, it directly executes the Turms server's client request handling logic (see Source Code: Client Request Handling); note that no thread switch is needed here. In this business processing, the most time-consuming parts are the Protobuf decoding of the client request and the encoding of the MongoDB and Redis requests, while the IM logic itself only orchestrates the IM business logic and is therefore cheap. Notably, if a string needs sensitive-word filtering with the MASK_TEXT strategy during request handling, its performance is roughly equivalent to Java's String#getBytes("UTF-8"), so it is cheap as well.

  The worker threads are named turms-gateway-tcp-worker, turms-gateway-ws-worker and turms-admin-http-worker.

  Main related Linux settings: net.ipv4.tcp_mem, net.ipv4.tcp_rmem, net.ipv4.tcp_wmem.
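The sketch referenced above shows the same acceptor/worker split in plain Netty. It is illustrative only: Turms actually builds its servers on top of Reactor Netty, and the port number and empty pipeline here are placeholders.

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public class MainSubReactorServer {
    public static void main(String[] args) throws InterruptedException {
        // Main reactor: 1 acceptor thread handling ServerSocketChannel accept events.
        EventLoopGroup acceptorGroup = new NioEventLoopGroup(1);
        // Sub reactors: one worker thread per CPU core; each worker owns many SocketChannels.
        EventLoopGroup workerGroup =
                new NioEventLoopGroup(Runtime.getRuntime().availableProcessors());
        try {
            new ServerBootstrap()
                    .group(acceptorGroup, workerGroup)
                    .channel(NioServerSocketChannel.class)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        protected void initChannel(SocketChannel ch) {
                            // Decoders/encoders and the request handler run on the worker
                            // thread that owns this channel; no extra thread switch is needed.
                        }
                    })
                    .bind(11510)
                    .sync()
                    .channel()
                    .closeFuture()
                    .sync();
        } finally {
            acceptorGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }
    }
}
```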

The node server and clients

TODO

The Lettuce and MongoDB clients

TODO

How to determine which thread group any line of code runs on

Once you understand the Turms server thread model above, you can easily determine which thread group any line of Turms server code runs on.

Take handling a client business request as an example: from the moment a Netty worker thread finishes reading the TurmsRequest byte stream sent by a Turms client, the entire business processing flow runs on that worker thread, and the thread returns to handling other requests once the business logic completes.

During that flow the worker thread may trigger various network I/O operations, such as sending MongoDB and Redis client requests. When those I/O operations complete, a series of business-related callbacks need to run, and they all run on the MongoDB or Redis client NIO threads.

In short, all non-callback business code you see in the Service layer runs on the worker threads, while callback-style business code usually runs on the MongoDB or Redis client NIO threads. The same applies to the admin API.

On Project Loom: Codes like sync, works like async

Background

Many long-lived technologies owe their longevity to a rich ecosystem, yet that very ecosystem can make them too unwieldy to adapt, and they eventually leave the stage because they cannot keep up with the times. In the Java ecosystem, the blocking implementations of various technologies are exactly the kind of roadblock that threatens Java's development in the new era. Among them, the blocking JDBC implementation has been the biggest obstacle to an asynchronous Java ecosystem. One reason Turms does not use a traditional SQL database is that there was no mature asynchronous JDBC implementation in the Java ecosystem at the time; some projects even chose not to start with Java because of this and switched to languages such as Go or C#, leaving only the remark that "Java's thread model is not 'cloud native' enough and its ecosystem lags behind".

The revolutionary aspect of Project Loom is that it officially brings coroutines (virtual threads) into the Java world, letting code that looks synchronous execute asynchronously.

Our attitude toward Project Loom from the Turms servers' perspective

Despite Loom's revolutionary aspects described above, the Turms project will not adopt Loom's coroutines in the future either, because for the Turms servers coroutines only add new problems (such as stack copying) without solving existing ones. The specific reasons are:

• The revolution of coroutines lies in tackling the Java ecosystem's heavy use of blocking APIs (such as JDBC) by letting synchronous-looking code run asynchronously. But the Turms servers have no blocking I/O when handling client business requests, so that revolution has nothing to offer them. Moreover, if a third-party library uses blocking I/O, we usually start doubting its author's skills and will not use the implementation.

• Project Loom introduces stack-copying coroutines: when parked, a coroutine must save its call stack to the heap, and when unparked and thawed it must fetch the call stack back from the heap. For the Turms servers this is superfluous, because they have no blocking I/O while handling client requests and therefore never need to park. Articles promoting Loom like to point out that "even tens of thousands of coroutines only take this little memory", but the Turms servers can achieve the same effect with zero coroutines and zero extra bytes.

  Also, although saving call stacks does address one fatal drawback of reactor-core ("exception stack traces are largely useless and hard to debug"), reactor-core as tuned in the Turms servers has already overcome this drawback (see "Additional notes: drawbacks of reactor-core" below).

• The learning cost of coroutines is "1 + 1 > 2", and the learning curve is actually steeper than reactor-core's. It is "1 + 1 > 2" because developers must master the use, principles and tuning of both threads and coroutines, and must also make sure traditional thread-based code runs correctly inside coroutines, whereas mastering reactor-core only requires the most basic knowledge of threads.

  Some developers think reactor-core is more complex to use than coroutines, but that view usually only holds from a beginner's perspective. For junior engineers, both coroutines and reactor-core are superficially easy to use if they skip the underlying principles. Early on, coroutines guarantee at the Java level that a junior engineer can easily write high-performance code, while with reactor-core a senior engineer had better guide the juniors, otherwise the code may become unmaintainable or even logically wrong. But once past that brief beginner stage, learning coroutines runs into the "1 + 1 > 2" problem just mentioned, while reactor-core still only requires the most basic thread knowledge.

  As described in "How to determine which thread group any line of code runs on", for any line of Turms server code (including third-party libraries), the most basic thread knowledge is enough to infer precisely which thread group the line runs on, as well as what that thread group is, where and why it was created, and what its lifecycle is.

  Besides, when writing Turms server code we almost never think about "how to write asynchronous code with reactor-core", just as most developers never think about "how to write synchronous code".

• Coroutine compatibility with the wider Java ecosystem is still a question mark. Project Loom itself has a long way to go and needs many projects to hit the potholes and validate it. If a thread-sensitive foundational network library like Netty shows any performance regression, explicit error, or hidden unexpected behavior when interacting with coroutines, the impact on upper-layer applications will be earth-shaking.

• Coroutines introduce a new abstraction layer, which is redundant for the Turms servers and only adds resource overhead and learning cost. Especially when writing performance-critical code, we usually write the Java code from the perspective of system calls; Java merely wraps the syscalls, and that wrapper should be as thin as possible so we can quickly tell which syscalls the JVM actually makes, judge whether our Java code is efficient enough, and see whether there is room for optimization.

• Java has had roughly ten asynchronous programming schemes so far, but however much the async wrapper layer is reshuffled, however the ecosystem changes, however "revolutionary" it claims to be, the system-level calls stay the same: epoll is still epoll, off-heap memory is still off-heap memory. There is no reason for the Turms servers to adopt coroutines and an extra abstraction layer just because coroutines are more fashionable.

• reactor-core not only implements asynchronous calls but also has stronger expressiveness than coroutines. For example, to get metrics such as the success rate and execution time of a pipeline, one call to metrics(...) suffices; to automatically retry a failing stream a certain number of times under certain conditions, just call retry(...); to switch the thread a stream executes on, just call publishOn(...), keeping the thread scheduling fully under control (see the sketch at the end of this section).

To sum up: stackful coroutines have nothing to offer the Turms servers, will not outperform reactor-core as used in the Turms servers, still have countless ecosystem potholes waiting for projects to hit and verify, add an abstraction that is meaningless and redundant for the Turms servers while increasing the learning cost, and are less expressive than reactor-core, so the Turms servers have little reason to use them.

Of course, the above mainly applies to the Turms server project. For the vast majority of Java projects, Loom's benefits outweigh its costs, especially because third-party library authors no longer need to maintain both a synchronous and an asynchronous implementation.
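As a concrete illustration of the expressiveness mentioned in the last bullet above, here is a minimal reactor-core sketch (illustrative only, not Turms code) that chains a retry, a timeout and a scheduler switch onto one pipeline:

```java
import java.time.Duration;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

public class ReactorChainExample {
    public static void main(String[] args) {
        String result = Mono.fromCallable(() -> "record loaded from MongoDB")
                // Resubscribe to the upstream up to 3 times on error.
                .retry(3)
                // Fail the chain if it takes longer than 2 seconds overall.
                .timeout(Duration.ofSeconds(2))
                // Hop to another scheduler for the downstream operators.
                .publishOn(Schedulers.boundedElastic())
                .map(record -> "processed: " + record)
                .block();
        System.out.println(result);
    }
}
```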

Additional notes: drawbacks of reactor-core

As already mentioned in the chapter on using dependencies, the most fatal drawback of an asynchronous library like reactor-core is that, when it is combined with libraries that advocate "more encapsulation, more abstraction, users need not care about the implementation details", developers can only hope the server keeps running normally. Otherwise, as soon as they hit a bug, they quickly start asking themselves a string of questions: "Can an async framework like reactor-core really be used in production? I cannot even find where the exception was thrown; is such code really maintainable?" Some teams regret adopting reactor-core because of this, or even rewrite the Java project in another language such as Go.

For example, suppose the console now reports "Netty: the ByteBuf's reference count is already 0 and it cannot be released again". Note that no useful log information has been omitted here; this is literally all the useful information a developer gets from the log. The log even strips out the misleading part, namely the stack trace: debugging by that stack trace would never lead to the real root cause. Could a developer, from this single line, understand why the exception happened and locate the module that caused it? This is a real bug that happened in Turms, and the only one that took more than six hours of reading all network-I/O-related source code of all of Turms' dependencies to track down the root cause: Memory leaks when Turms uses the previous buffer reference to release a recycled pooled buffer.

In short, using reactor-core well requires three conditions:

1. All critical code must be under your control; otherwise, when something goes wrong you can only hope that:

  • The third-party library's developers are highly skilled with solid design fundamentals. If the dependency is itself asynchronous, the bar is even higher: its authors must anticipate the exceptions upper-layer developers may hit and propagate them to the upper-layer application through asynchronous means.

  • The third-party library is simple enough that its relevant source code can be read quickly.

    An excellent example is reactor-netty: its developers are highly skilled with solid design fundamentals, and the code is fairly lean and easy to read.

2. Exceptions must be propagated and logs written in a disciplined way. Even with asynchronous programming, as long as exceptions are propagated and logged properly, a single log line reveals the cause of the vast majority of bugs, and only a few bugs require correlating multiple log lines. Without this, you can only trust to luck when things go wrong.

3. The team must have engineers who are proficient in asynchronous programming.

If even one of these conditions is missing, developers will sooner or later hit a bug as hard as the "the ByteBuf's reference count is already 0" one above. For an average technical team we therefore recommend Project Loom over reactor-core, or perhaps even switching programming languages. The Turms project, however, now satisfies all the conditions above, so "exceptions are hard to debug" is no longer an issue.

Further notes:

• Some articles claim that async frameworks like reactor-core easily lead to callback hell. But as described above, reactor-core is highly expressive, and in practice developers can produce exactly as many nesting levels as they design for: if a function's deepest call level is 5, reactor-core code can be written at level 5, 4, 3, 2 or 1. In practice, the nested callbacks in the Turms servers exist only to reduce intermediate objects or to achieve stack (rather than heap) allocation; see the Turms server source code for details.
• When developing the turms-admin management system, we also usually avoid await/async, because turms-admin is eventually transpiled to ES5 syntax, and functions marked with await/async are very hard to debug once source maps are turned off, hence the avoidance.

XMPP

Background

XMPP is an open instant messaging protocol based on XML.

Turms itself does not adopt the XMPP protocol because:

• The design is very inefficient:
  • The data format uses the redundant and inefficient XML, whose metadata is often larger than the actual payload being transferred.
  • XMPP's workflows contain many inefficient designs, such as converting user avatar images to Base64 text for transfer, or requiring the server to proactively push a user's updated personal information to every contact who subscribes to their presence.
• Poor extensibility. Some articles claim XMPP is highly extensible, but only relative to protocols with barely any extensibility; a truly extensible protocol is inevitably a self-developed one.

However, considering the following two points, we plan to adapt the Turms servers in the near future to support the XMPP protocol:

• The XMPP ecosystem happens to make up for one of Turms' shortcomings. Some developers have given feedback in the Turms project that implementing a custom IM application from scratch on top of Turms is still fairly complex, especially since their own team has to implement the UI and adapt the interfaces, so Turms is currently better suited to teams that want to invest deeply in IM rather than to shipping a product quickly.

  XMPP, on the other hand, has a fairly rich client ecosystem, and with a little adaptation the Turms servers can serve XMPP clients. Users can then quickly go live with various XMPP clients that come with UIs while still enjoying Turms' advantages, and later, once they want to build their own dedicated IM application, gradually phase out the XMPP clients and migrate to the Turms clients.

  Note: given Turms' positioning, tasks related to "providing client implementations with UIs" are not on our long-term plan. Put differently, we would only start considering UI clients after releasing platforms such as a load-testing platform and a data-analysis platform tailored to Turms and after implementing various extended features and bug fixes, so the priority of that task is extremely low.

• Most well-known open-source XMPP server projects not only have dated architectures, stale technology stacks and poor performance, but also low code quality and engineering capability. Take Tigase as an example: as an open-source project that has been around for decades, it still makes plenty of rookie mistakes such as comparing strings with ==, or heavily mixing data models with business logic, showing a lack of code design ability that is quite staggering.

  Although some open-source XMPP servers advertise their architecture as highly "scalable", their scalability pales in comparison with Turms. Turms genuinely tries to push every aspect (including scalability) to the limit in its architecture, its own code, and its database design, so in the medium-to-large-scale IM space Turms outclasses them outright.

Note: we do not actually plan for the Turms servers to replace other XMPP servers, because their positioning is very different. A major purpose of XMPP servers is open interworking of instant messaging (open federation, like email), whereas the Turms servers support the XMPP protocol mainly so that users can quickly use XMPP clients to talk to Turms servers and go live quickly; we have no plan to support interworking between Turms servers and other XMPP servers.

Implementation principle

• The turms-gateway server first implements a customized XMPP server internally.

  Note: customization is needed because the Turms servers do not use some features defined by the XMPP protocol, so there is no need to implement them; the customized XMPP server is still compatible with standard XMPP clients.

• After receiving requests from XMPP clients, this XMPP server converts them into the corresponding Turms service calls. From the perspective of the downstream calls, XMPP client requests and Turms client requests therefore go through similar logic, which ultimately lets XMPP clients and Turms clients interoperate.

  Notes:

  • They share "similar logic" rather than identical logic because their business flows differ slightly and are not a 100% one-to-one mapping.
  • XMPP clients and Turms clients share the same account system, so one account can log in with either an XMPP client or a Turms client.
  • XMPP clients know nothing about the concept of Turms clients, and vice versa. They can interoperate because turms-gateway converts the data into a protocol format each side understands before sending it.

turms-admin

turms-admin is a back-office management single-page application (SPA) tailored to the Turms project. It consists of five modules: cluster management (cluster monitoring and cluster configuration), content management, client blocklist, access control, and client terminal.

Note: turms-admin is positioned purely as a visual web application for the Turms servers' admin API, so turms-admin itself does not provide any data collection, data analysis, alerting or similar features.

Deployment overview

Turms uses a front-end/back-end separated design; from the Turms servers' point of view, they are not even "aware" that the turms-admin front-end project exists. The turms-admin front end only serves static resources such as JavaScript, CSS and images, so users can even open turms-admin directly in a browser from local static HTML files and interact with the Turms servers. For the convenience of operations and deployment, however, the turms-admin project also provides the following two deployment options.

Docker image (recommended)

shell
docker run -d -p 6510:6510 ghcr.io/turms-im/turms-admin

The image serves turms-admin's static resources through a built-in Nginx server. After running the command, you can open http://localhost:6510.

Simple web server

The turms-admin project also ships a simple Node.js-based web server that serves turms-admin's static resources over HTTP and uses PM2 by default for process management.

Installation and run steps

1. Install Node.js.
2. In the turms-admin directory, run npm run quickstart, which is composed of npm install && npm run build && npm run start and covers dependency installation, the front-end build, and starting the server. Wait until PM2 reports that turms-admin's status is online, which means the turms-admin server process has started.
3. Open a browser and visit http://localhost:6510.

Common operations commands

start: start the turms-admin server process

stop: stop the turms-admin server process

delete: stop the turms-admin server process and remove its process record from PM2

restart: restart the turms-admin server

reload: reload the turms-admin server configuration

For more commands and server configuration, see the PM2 documentation.

Modules

Cluster management:

• Cluster monitoring: view the real-time running state of the cluster, and the detailed information and metrics of a specific server
• Cluster configuration: corresponds to the global configuration feature of the Turms servers; Turms server configuration can be changed in real time with zero downtime
• Cluster flight recorder: manage the flight recorders of the cluster nodes
• Cluster plugins: manage the plugins of the cluster nodes

Content management: create, delete, update and query all kinds of business data

Client blocklist: corresponds to the global blocklist mechanism of the Turms servers; used to create, delete, update and query blocklist records

Access control: used to create, delete and update administrators' information and permissions

Client terminal: ships with the turms-client-js client implementation so that administrators can quickly test real client requests and server responses

TODO: add GIF demos
