docs: SMMF introduction and usage (#878)

Co-authored-by: junewgl <[email protected]>
eosphoros-ai · Dec 1, 2023 · a35b612
1 parent 5cc24b7
commit a35b612
Showing 44 changed files with 276 additions and 1,259 deletions.
diff --git a/.env.template b/.env.template
@@ -190,6 +190,13 @@ TONGYI_PROXY_API_KEY={your-tongyi-sk}
 #BAICHUAN_PROXY_API_KEY={your-baichuan-sk}
 #BAICHUAN_PROXY_API_SECRET={your-baichuan-sct}
 
+# Xunfei Spark
+#XUNFEI_SPARK_API_VERSION={version}
+#XUNFEI_SPARK_APPID={your_app_id}
+#XUNFEI_SPARK_API_KEY={your_api_key}
+#XUNFEI_SPARK_API_SECRET={your_api_secret}
+
+
 
 #*******************************************************************#
 #**    SUMMARY_CONFIG                                             **#
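
The Xunfei Spark block added above ships commented out, so it is easy to enable the proxy and leave one of the four values unset. A minimal sketch of a pre-flight check, assuming the variables are exported into the process environment; `missing_spark_settings` is a hypothetical helper for illustration and not part of DB-GPT:

```python
import os
from typing import Mapping

# Variable names copied from the .env.template hunk above.
SPARK_KEYS = (
    "XUNFEI_SPARK_API_VERSION",
    "XUNFEI_SPARK_APPID",
    "XUNFEI_SPARK_API_KEY",
    "XUNFEI_SPARK_API_SECRET",
)


def missing_spark_settings(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the Spark variables that are unset or blank (hypothetical helper)."""
    return [key for key in SPARK_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_spark_settings()
    if missing:
        print("Missing Xunfei Spark settings:", ", ".join(missing))
    else:
        print("All Xunfei Spark settings are present.")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise without touching the real environment.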

diff --git a/.gitignore b/.gitignore
@@ -9,7 +9,6 @@ __pycache__/
 message/
 
 .env
-.idea
 .vscode
 .idea
 .chroma

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -10,7 +10,8 @@ git clone https://github.com/<YOUR-GITHUB-USERNAME>/DB-GPT
 ```
 3. Install the project requirements
 ```
-pip install -r requirements/dev-requirements.txt
+pip install -e ".[default]"
+
 ```
 4. Install pre-commit hooks
 ```

diff --git a/MANIFEST.in b/MANIFEST.in
@@ -1,3 +1,3 @@
-include README.md
 include LICENSE
+include README.md
 include requirements.txt
diff --git a/README.md b/README.md
@@ -102,7 +102,11 @@ At present, we have introduced several key features to showcase our current capa
 
   We offer extensive model support, including dozens of large language models (LLMs) from both open-source and API agents, such as LLaMA/LLaMA2, Baichuan, ChatGLM, Wenxin, Tongyi, Zhipu, and many more. 
 
-  - [Current Supported LLMs](http://docs.dbgpt.site/docs/modules/smmf)
+  - News
+    - 🔥🔥🔥  [qwen-72b-chat](https://huggingface.co/Qwen/Qwen-72B-Chat)
+    - 🔥🔥🔥  [Yi-34B-Chat](https://huggingface.co/01-ai/Yi-34B-Chat)
+  - [More Supported LLMs](http://docs.dbgpt.site/docs/modules/smmf)
+
 - **Privacy and Security**
 
   We ensure the privacy and security of data through the implementation of various technologies, including privatized large models and proxy desensitization.

diff --git a/README.zh.md b/README.zh.md
@@ -114,27 +114,10 @@ DB-GPT是一个开源的数据库领域大模型框架。目的是构建大模
 
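   Extensive model support, including dozens of large language models from open-source and API-proxy sources, such as LLaMA/LLaMA2, Baichuan, ChatGLM, Wenxin, Tongyi, and Zhipu. The following models are currently supported: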
 
-  - [Vicuna](https://huggingface.co/Tribbiani/vicuna-13b)
-  - [vicuna-13b-v1.5](https://huggingface.co/lmsys/vicuna-13b-v1.5)
-  - [LLama2](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)
-  - [baichuan2-13b](https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat)
-  - [baichuan2-7b](https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat)
-  - [chatglm-6b](https://huggingface.co/THUDM/chatglm-6b)
-  - [chatglm2-6b](https://huggingface.co/THUDM/chatglm2-6b)
-  - [chatglm3-6b](https://huggingface.co/THUDM/chatglm3-6b)
-  - [falcon-40b](https://huggingface.co/tiiuae/falcon-40b)
-  - [internlm-chat-7b](https://huggingface.co/internlm/internlm-chat-7b)
-  - [internlm-chat-20b](https://huggingface.co/internlm/internlm-chat-20b)
-  - [qwen-7b-chat](https://huggingface.co/Qwen/Qwen-7B-Chat)
-  - [qwen-14b-chat](https://huggingface.co/Qwen/Qwen-14B-Chat)
-  - [qwen-72b-chat](https://huggingface.co/Qwen/Qwen-72B-Chat)
-  - [wizardlm-13b](https://huggingface.co/WizardLM/WizardLM-13B-V1.2)
-  - [orca-2-7b](https://huggingface.co/microsoft/Orca-2-7b)
-  - [orca-2-13b](https://huggingface.co/microsoft/Orca-2-13b)
-  - [openchat_3.5](https://huggingface.co/openchat/openchat_3.5)
-  - [zephyr-7b-alpha](https://huggingface.co/HuggingFaceH4/zephyr-7b-alpha)
-  - [mistral-7b-instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)
-  - [Yi-34B-Chat](https://huggingface.co/01-ai/Yi-34B-Chat)
+  - Newly supported models
+    - 🔥🔥🔥  [qwen-72b-chat](https://huggingface.co/Qwen/Qwen-72B-Chat)
+    - 🔥🔥🔥  [Yi-34B-Chat](https://huggingface.co/01-ai/Yi-34B-Chat)
+  - [More open-source models](https://www.yuque.com/eosphoros/dbgpt-docs/iqaaqwriwhp6zslc#qQktR)
 
   - Supported online proxy models
     - [x] [OpenAI·ChatGPT](https://api.openai.com/)
@@ -148,28 +131,8 @@ DB-GPT是一个开源的数据库领域大模型框架。目的是构建大模
 
   We ensure data privacy and security through multiple technologies, including privatized large models and proxy desensitization.
 
-- Supported data sources
+- [Supported data sources](https://www.yuque.com/eosphoros/dbgpt-docs/rc4r27ybmdwg9472)
 
-| DataSource                                                                      | support     | Notes                                       |
-| ------------------------------------------------------------------------------  | ----------- | ------------------------------------------- |
-| [MySQL](https://www.mysql.com/)                                                 | Yes         |                                             |
-| [PostgresSQL](https://www.postgresql.org/)                                      | Yes         |                                             |
-| [Spark](https://github.com/apache/spark)                                        | Yes         |                                             |
-| [DuckDB](https://github.com/duckdb/duckdb)                                      | Yes         |                                             |
-| [Sqlite](https://github.com/sqlite/sqlite)                                      | Yes         |                                             |
-| [MSSQL](https://github.com/microsoft/mssql-jdbc)                                | Yes         |                                             |
-| [ClickHouse](https://github.com/ClickHouse/ClickHouse)                          | Yes         |                                             |
-| [Oracle](https://github.com/oracle)                                             | No          |           TODO                              |
-| [Redis](https://github.com/redis/redis)                                         | No          |           TODO                              |
-| [MongoDB](https://github.com/mongodb/mongo)                                     | No          |           TODO                              |
-| [HBase](https://github.com/apache/hbase)                                        | No          |           TODO                              |
-| [Doris](https://github.com/apache/doris)                                        | No          |           TODO                              |
-| [DB2](https://github.com/IBM/Db2)                                               | No          |           TODO                              |
-| [Couchbase](https://github.com/couchbase)                                       | No          |           TODO                              |
-| [Elasticsearch](https://github.com/elastic/elasticsearch)                       | No          |           TODO                              |
-| [OceanBase](https://github.com/OceanBase)                                       | No          |           TODO                              |
-| [TiDB](https://github.com/pingcap/tidb)                                         | No          |           TODO                              |
-| [StarRocks](https://github.com/StarRocks/starrocks)                             | No          |           TODO                              |
 
 ## Architecture
 The overall architecture of DB-GPT is shown in the figure below.
@@ -266,6 +229,7 @@ The MIT License (MIT)
   - [x] Sqlite
   - [x] MSSQL
   - [x] ClickHouse
+  - [x] StarRocks
   - [ ] Oracle
   - [ ] Redis
   - [ ] MongoDB
@@ -276,7 +240,7 @@ The MIT License (MIT)
   - [ ] Elasticsearch
   - [ ] OceanBase
   - [ ] TiDB
-  - [ ] StarRocks
+
 
 ### Multi-model management and inference optimization
 - [x] [Cluster deployment](https://db-gpt.readthedocs.io/en/latest/getting_started/install/cluster/vms/index.html)

diff --git a/assets/wechat.jpg b/assets/wechat.jpg
diff --git a/docs/blog/authors.yml b/docs/blog/authors.yml
diff --git a/docs/blog/welcome/index.md b/docs/blog/welcome/index.md
diff --git a/...ion_manual/advanced_tutorial/debugging.md → ...pplication/advanced_tutorial/debugging.md b/...ion_manual/advanced_tutorial/debugging.md → ...pplication/advanced_tutorial/debugging.md
diff --git a/...plication_manual/advanced_tutorial/rag.md → ...docs/application/advanced_tutorial/rag.md b/...plication_manual/advanced_tutorial/rag.md → ...docs/application/advanced_tutorial/rag.md
diff --git a/docs/docs/application/advanced_tutorial/smmf.md b/docs/docs/application/advanced_tutorial/smmf.md
@@ -0,0 +1,64 @@
+# SMMF
+
+The DB-GPT project provides service-oriented multi-model management capabilities. Developers interested in these capabilities can read the [SMMF](/docs/modules/smmf) module documentation. This page focuses on how to use multiple LLMs.
+
+
+This page mainly covers usage through the web interface. Developers interested in the command line can refer to the [cluster deployment](/docs/installation/model_service/cluster) documentation. Open the DB-GPT-Web frontend service and click `Model Management` to enter the multi-model management interface.
+
+
+## List Models
+Open the model management interface to see the list of currently deployed models, as shown below.
+
+<p align="left">
+  <img src={'/img/module/model_list.png'} width="720px"/>
+</p>
+
+## Use Models
+Once the models are deployed, you can switch and use the corresponding model on the multi-model interface.
+
+<p align="left">
+  <img src={'/img/module/model_use.png'} width="720px"/>
+</p>
+
+## Stop Models
+As shown in the figure below, click `Model Management` to enter the model list interface. Select a specific model and click the red `Stop Model` button to stop it.
+
+<p align="left">
+  <img src={'/img/module/model_stop.png'} width="720px"/>
+</p>
+
+After the model is stopped, the display in the upper right corner will change.
+
+<p align="left">
+  <img src={'/img/module/model_stopped.png'} width="720px"/>
+</p>
+
+## Model Deployment
+
+1. Open the web page, click the `Model Management` button on the left to enter the model list page, click `Create Model` in the upper left corner, and then select the model you want to deploy in the pop-up dialog box. Here we choose `vicuna-7b-v1.5`, as shown in the figure.
+
+    <p align="left">
+    <img src={'/img/module/model_vicuna-7b-1.5.png'} width="720px"/>
+    </p>
+
+
+2. Select the appropriate parameters for the model being deployed (if you are not sure, the defaults are sufficient), then click the `Submit` button at the bottom left of the dialog box, and wait until the model is deployed successfully.
+
+3. After the new model is deployed, it appears on the model page, as shown in the figure.
+
+    <p align="left">
+    <img src={'/img/module/model_vicuna_deployed.png'} width="720px"/>
+    </p>
+
+# Operations and Observability
+
+Operations and observability are important components of a production system. In addition to the common management functions available in the web interface, DB-GPT provides a command-line tool called `dbgpt` for operations and management. The `dbgpt` command-line tool offers the following functionality:
+
+- Starting and stopping various services
+- Knowledge base management (batch import, custom import, viewing, and deleting knowledge base documents)
+- Model management (viewing, starting, stopping models, and conducting dialogues for debugging)
+- Observability tools (viewing and analyzing observability logs)
+
+We won't go into detail about the usage of the command-line tool here. You can use the `dbgpt --help` command to obtain specific usage documentation. Additionally, you can check the documentation for individual subcommands. For example, you can use `dbgpt start --help` to view the documentation for starting a service. For more information, please refer to the document provided below.
+
+- [Debugging](/docs/application/advanced_tutorial/debugging)
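
The help lookups described in this section can also be scripted. A minimal sketch, assuming only the `dbgpt --help` and `dbgpt start --help` invocations named in the text; `help_command` and `show_help` are hypothetical helpers for illustration and are not part of the `dbgpt` tool:

```python
import shutil
import subprocess
from typing import Optional


def help_command(subcommand: Optional[str] = None) -> list[str]:
    """Build the argv for `dbgpt --help` or `dbgpt <subcommand> --help`."""
    argv = ["dbgpt"]
    if subcommand:
        argv.append(subcommand)
    argv.append("--help")
    return argv


def show_help(subcommand: Optional[str] = None) -> Optional[str]:
    """Return the help text, or None when `dbgpt` is not on PATH."""
    if shutil.which("dbgpt") is None:
        return None
    result = subprocess.run(
        help_command(subcommand), capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    text = show_help("start")
    print(text if text is not None else "dbgpt is not installed in this environment")
```

Building the argument list separately from running it keeps the command easy to inspect or log before execution.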
diff --git a/..._manual/fine_tuning_manual/text_to_sql.md → ...ication/fine_tuning_manual/text_to_sql.md b/..._manual/fine_tuning_manual/text_to_sql.md → ...ication/fine_tuning_manual/text_to_sql.md
diff --git a/...lication_manual/started_tutorial/agent.md → ...ocs/application/started_tutorial/agent.md b/...lication_manual/started_tutorial/agent.md → ...ocs/application/started_tutorial/agent.md
diff --git a/...manual/started_tutorial/chat_dashboard.md → ...cation/started_tutorial/chat_dashboard.md b/...manual/started_tutorial/chat_dashboard.md → ...cation/started_tutorial/chat_dashboard.md
diff --git a/...tion_manual/started_tutorial/chat_data.md → ...application/started_tutorial/chat_data.md b/...tion_manual/started_tutorial/chat_data.md → ...application/started_tutorial/chat_data.md
diff --git a/...cation_manual/started_tutorial/chat_db.md → ...s/application/started_tutorial/chat_db.md b/...cation_manual/started_tutorial/chat_db.md → ...s/application/started_tutorial/chat_db.md
diff --git a/...ion_manual/started_tutorial/chat_excel.md → ...pplication/started_tutorial/chat_excel.md b/...ion_manual/started_tutorial/chat_excel.md → ...pplication/started_tutorial/chat_excel.md
diff --git a/...manual/started_tutorial/chat_knowledge.md → ...cation/started_tutorial/chat_knowledge.md b/...manual/started_tutorial/chat_knowledge.md → ...cation/started_tutorial/chat_knowledge.md
diff --git a/docs/docs/application_manual/advanced_tutorial/smmf.md b/docs/docs/application_manual/advanced_tutorial/smmf.md
diff --git a/docs/docs/changelog/doc.md b/docs/docs/changelog/doc.md
@@ -1 +1,3 @@
-# Documentation Description
+# ChangeLog 
+
+Our version release information is maintained on GitHub. For more details, please visit [ReleaseNotes](https://github.com/eosphoros-ai/DB-GPT/releases)
diff --git a/docs/docs/installation/sourcecode.md b/docs/docs/installation/sourcecode.md
@@ -73,7 +73,7 @@ import TabItem from '@theme/TabItem';
     {label: 'Open AI', value: 'openai'},
     {label: 'Qwen', value: 'qwen'},
     {label: 'ChatGLM', value: 'chatglm'},
-    {label: 'ERNIE Bot', value: 'erniebot'},
+    {label: 'WenXin', value: 'erniebot'},
   ]}>
   <TabItem value="openai" label="open ai">
   Install dependencies
@@ -180,7 +180,7 @@ LLM_MODEL=wenxin_proxyllm
 PROXY_SERVER_URL={your_service_url}
 WEN_XIN_MODEL_VERSION={version}
 WEN_XIN_API_KEY={your-wenxin-sk}
-WEN_XIN_SECRET_KEY={your-wenxin-sct}
+WEN_XIN_API_SECRET={your-wenxin-sct}
 ```
   </TabItem>
 </Tabs>
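
The hunk above renames `WEN_XIN_SECRET_KEY` to `WEN_XIN_API_SECRET` in the documentation, so an existing `.env` file may still carry the old name. A minimal sketch of a scan for the stale key, assuming a plain `KEY=value` file; `find_renamed_keys` is a hypothetical helper, and this diff does not show which name the runtime actually reads:

```python
# Old name -> new name, as shown in the documentation hunk above.
RENAMED_KEYS = {"WEN_XIN_SECRET_KEY": "WEN_XIN_API_SECRET"}


def find_renamed_keys(env_text: str) -> dict[str, str]:
    """Map each stale key that is actively set in env_text to its new name."""
    found: dict[str, str] = {}
    for line in env_text.splitlines():
        stripped = line.strip()
        # Skip blanks, comments, and lines that are not assignments.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        name = stripped.split("=", 1)[0].strip()
        if name in RENAMED_KEYS:
            found[name] = RENAMED_KEYS[name]
    return found


if __name__ == "__main__":
    sample = "WEN_XIN_API_KEY=sk\nWEN_XIN_SECRET_KEY=sct\n"
    for old, new in find_renamed_keys(sample).items():
        print(f"{old} should be renamed to {new}")
```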
@@ -218,7 +218,7 @@ mkdir models and cd models
 
 # embedding model
 git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
-或者
+or
 git clone https://huggingface.co/moka-ai/m3e-large
 
 # llm model, if you use openai or Azure or tongyi llm api service, you don't need to download llm model

diff --git a/docs/docs/modules/connections.md b/docs/docs/modules/connections.md
@@ -1,2 +1,25 @@
 # Connections
-The connections module supports connecting to various structured, semi-structured, and unstructured data storage engines. Bring multi-dimensional data into the framework and realize the interaction between natural language and multi-dimensional data
+The connections module supports connecting to various structured, semi-structured, and unstructured data storage engines, bringing multi-dimensional data into the framework and enabling natural-language interaction with that data.
+
+The list of data sources we currently support is as follows.
+
+| DataSource                                                                      | Supported   | Notes |
+| ------------------------------------------------------------------------------  | ----------- | ----- |
+| [MySQL](https://www.mysql.com/)                                                 | Yes         | MySQL is the world's most popular open source database. |
+| [PostgreSQL](https://www.postgresql.org/)                                       | Yes         | The World's Most Advanced Open Source Relational Database |
+| [Spark](https://github.com/apache/spark)                                        | Yes         | Unified Engine for large-scale data analytics |
+| [DuckDB](https://github.com/duckdb/duckdb)                                      | Yes         | DuckDB is an in-process SQL OLAP database management system |
+| [SQLite](https://github.com/sqlite/sqlite)                                      | Yes         |  |
+| [MSSQL](https://github.com/microsoft/mssql-jdbc)                                | Yes         |  |
+| [ClickHouse](https://github.com/ClickHouse/ClickHouse)                          | Yes         | ClickHouse is the fastest and most resource efficient open-source database for real-time apps and analytics. |
+| [Oracle](https://github.com/oracle)                                             | No          | TODO |
+| [Redis](https://github.com/redis/redis)                                         | No          | The Multi-model NoSQL Database |
+| [MongoDB](https://github.com/mongodb/mongo)                                     | No          | MongoDB is a source-available cross-platform document-oriented database program |
+| [HBase](https://github.com/apache/hbase)                                        | No          | Open-source, distributed, versioned, column-oriented store |
+| [Doris](https://github.com/apache/doris)                                        | No          | Apache Doris is an easy-to-use, high performance and unified analytics database. |
+| [DB2](https://github.com/IBM/Db2)                                               | No          | TODO |
+| [Couchbase](https://github.com/couchbase)                                       | No          | TODO |
+| [Elasticsearch](https://github.com/elastic/elasticsearch)                       | No          | Free and Open, Distributed, RESTful Search Engine |
+| [OceanBase](https://github.com/OceanBase)                                       | No          | OceanBase is a distributed relational database. |
+| [TiDB](https://github.com/pingcap/tidb)                                         | No          | TODO |
+| [StarRocks](https://github.com/StarRocks/starrocks)                             | Yes         | StarRocks is a next-gen, high-performance analytical data warehouse |