docs: readme update & contact (#1097)

eosphoros-ai · Jan 22, 2024 · 1484981
1 parent 4f83363
commit 1484981

Showing 6 changed files with 93 additions and 196 deletions.
diff --git a/README.md b/README.md
@@ -33,42 +33,71 @@
   </p>
 
 
-[**简体中文**](README.zh.md) | [**Discord**](https://discord.gg/7uQnPuveTY) | [**Documents**](https://docs.dbgpt.site) | [**Wechat**](https://github.com/eosphoros-ai/DB-GPT/blob/main/README.zh.md#%E8%81%94%E7%B3%BB%E6%88%91%E4%BB%AC) | [**Community**](https://github.com/eosphoros-ai/community) | [**Paper**](https://arxiv.org/pdf/2312.17449.pdf)
+[**简体中文**](README.zh.md) | [**Discord**](https://discord.gg/7uQnPuveTY) | [**Documents**](https://docs.dbgpt.site) | [**微信**](https://github.com/eosphoros-ai/DB-GPT/blob/main/README.zh.md#%E8%81%94%E7%B3%BB%E6%88%91%E4%BB%AC) | [**Community**](https://github.com/eosphoros-ai/community) | [**Paper**](https://arxiv.org/pdf/2312.17449.pdf)
 
 </div>
 
 ## What is DB-GPT?
 
-DB-GPT is an open-source framework designed for the realm of large language models (LLMs) within the database field. Its primary purpose is to provide infrastructure that simplifies and streamlines the development of database-related applications. This is accomplished through the development of various technical capabilities, including:
+DB-GPT is an open-source, data-domain large model framework. Its purpose is to build the infrastructure for the large model domain by developing a variety of technical capabilities, including multi-model management, Text2SQL performance optimization, RAG framework and optimization, and Multi-Agents framework collaboration. These capabilities aim to simplify and facilitate the construction of large model applications around databases.
 
-1. **SMMF(Service-oriented Multi-model Management Framework)**
-2. **Text2SQL Fine-tuning**
-3. **RAG(Retrieval Augmented Generation) framework and optimization**
-4. **Data-Driven Agents framework collaboration**
-5. **GBI(Generative Business intelligence)**
-
-DB-GPT simplifies the creation of these applications based on large language models (LLMs) and databases. 
-
-In the era of Data 3.0, enterprises and developers can take the ability to create customized applications with minimal coding, which harnesses the power of large language models (LLMs) and databases.
+In the Data 3.0 era, based on models and databases, enterprises and developers can build their own bespoke applications with less code.
 
+### Data Agents
+![data agents](https://github.com/eosphoros-ai/DB-GPT/assets/17919400/ced393b4-9180-437a-90c5-b43633cda8cb)
 
 ## Contents
-- [Install](#install)
-- [Demo](#demo)
 - [Introduction](#introduction)
+- [Install](#install)
 - [Features](#features)
 - [Contribution](#contribution)
-- [Roadmap](#roadmap)
 - [Contact](#contact-information)
 
-[DB-GPT Youtube Video](https://www.youtube.com/watch?v=f5_g0OObZBQ)
+## Introduction 
+The architecture of DB-GPT is shown in the following figure:
+
+<p align="center">
+  <img src="./assets/dbgpt.png" width="800" />
+</p>
+
+The core capabilities include the following parts:
+
+- **RAG (Retrieval Augmented Generation)**: RAG is currently the area with the most practical deployments and the most urgent demand. DB-GPT has already implemented a RAG-based framework, allowing users to build knowledge-based applications using the RAG capabilities of DB-GPT.
+
+- **GBI (Generative Business Intelligence)**: Generative BI is one of the core capabilities of the DB-GPT project, providing the foundational data intelligence technology to build enterprise report analysis and business insights.
+
+- **Fine-tuning Framework**: Model fine-tuning is an indispensable capability for any enterprise deploying models in vertical and niche domains. DB-GPT provides a complete fine-tuning framework that integrates seamlessly with the DB-GPT project. In recent fine-tuning efforts, accuracy on the Spider dataset has reached 82.5%.
+
+- **Data-Driven Multi-Agents Framework**: DB-GPT offers a data-driven self-evolving fine-tuning framework, aiming to continuously make decisions and execute based on data.
+
+- **Data Factory**: The Data Factory is mainly about cleaning and processing trustworthy knowledge and data in the era of large models.
+
+- **Data Sources**: Integrating various data sources to seamlessly connect production business data to the core capabilities of DB-GPT.
+
+### SubModule
+- [DB-GPT-Hub](https://github.com/eosphoros-ai/DB-GPT-Hub) Text-to-SQL workflow with high performance by applying Supervised Fine-Tuning (SFT) on Large Language Models (LLMs).
+
+#### Text2SQL Finetune
+- support llms
+  - [x] LLaMA
+  - [x] LLaMA-2
+  - [x] BLOOM
+  - [x] BLOOMZ
+  - [x] Falcon
+  - [x] Baichuan
+  - [x] Baichuan2
+  - [x] InternLM
+  - [x] Qwen
+  - [x] XVERSE
+  - [x] ChatGLM2
+
+-  SFT Accuracy
+As of October 10, 2023, through the fine-tuning of an open-source model with 13 billion parameters using this project, we have achieved execution accuracy on the Spider dataset that surpasses even GPT-4!
 
-## Demo
-##### Chat Data
-![chatdata](https://github.com/eosphoros-ai/DB-GPT/assets/13723926/1f77079e-d018-4eee-982b-9b6a66bf1063)
+[More Information about Text2SQL finetune](https://github.com/eosphoros-ai/DB-GPT-Hub)
 
-##### Chat Excel
-![excel](https://github.com/eosphoros-ai/DB-GPT/assets/13723926/3044e83b-a71e-41fe-a1e2-98e479e0ab59)
+- [DB-GPT-Plugins](https://github.com/eosphoros-ai/DB-GPT-Plugins) DB-GPT Plugins that can run Auto-GPT plugin directly
+- [GPT-Vis](https://github.com/eosphoros-ai/GPT-Vis) Visualization protocol
 
 ## Install 
 ![Docker](https://img.shields.io/badge/docker-%230db7ed.svg?style=for-the-badge&logo=docker&logoColor=white)
@@ -120,26 +149,7 @@ At present, we have introduced several key features to showcase our current capa
 - Support Datasources
   - [Datasources](http://docs.dbgpt.site/docs/modules/connections)
 
-## Introduction 
-The architecture of DB-GPT is shown in the following figure:
 
-<p align="center">
-  <img src="./assets/DB-GPT.png" width="800" />
-</p>
-
-The core capabilities primarily consist of the following components:
-1. Multi-Models: We support multiple Large Language Models (LLMs) such as LLaMA/LLaMA2, CodeLLaMA, ChatGLM, QWen, Vicuna, and proxy models like ChatGPT, Baichuan, Tongyi, Wenxin, and more.
-2. Knowledge-Based QA: Our system enables high-quality intelligent Q&A based on local documents such as PDFs, Word documents, Excel files, and other data sources.
-3. Embedding: We offer unified data vector storage and indexing. Data is embedded as vectors and stored in vector databases, allowing for content similarity search.
-4. Multi-Datasources: This feature connects different modules and data sources, facilitating data flow and interaction.
-5. Multi-Agents: Our platform provides Agent and plugin mechanisms, empowering users to customize and enhance the system's behaviour.
-6. Privacy & Security: Rest assured that there is no risk of data leakage, and your data is 100% private and secure.
-7. Text2SQL: We enhance Text-to-SQL performance through Supervised Fine-Tuning (SFT) applied to Large Language Models (LLMs).
-
-### SubModule
-- [DB-GPT-Hub](https://github.com/eosphoros-ai/DB-GPT-Hub) Text-to-SQL workflow with high performance by applying Supervised Fine-Tuning (SFT) on Large Language Models (LLMs).
-- [DB-GPT-Plugins](https://github.com/eosphoros-ai/DB-GPT-Plugins) DB-GPT Plugins that can run Auto-GPT plugin directly
-- [DB-GPT-Web](https://github.com/eosphoros-ai/DB-GPT-Web)  ChatUI for DB-GPT  
 
 ## Image
 🌐 [AutoDL Image](https://www.codewithgpu.com/i/eosphoros-ai/DB-GPT/dbgpt)
@@ -151,106 +161,8 @@ The core capabilities primarily consist of the following components:
 ## Contribution
 
 - Please run `black .` before submitting the code.
-- To check detailed guidelines for new contributions, please refer to [how to contribute](https://github.com/csunny/DB-GPT/blob/main/CONTRIBUTING.md)
-
-## RoadMap
-
-<p align="left">
-  <img src="./assets/roadmap.jpg" width="800px" />
-</p>
-
-### KBQA RAG optimization
-- [x] Multi Documents
-  - [x] PDF
-  - [x] Excel, CSV
-  - [x] Word
-  - [x] Text
-  - [x] MarkDown
-  - [ ] Code
-  - [ ] Images 
-
-- [x] RAG
-- [ ] Graph Database
-  - [ ] Neo4j Graph
-  - [ ] Nebula Graph
-- [x] Multi-Vector Database
-  - [x] Chroma
-  - [x] Milvus
-  - [x] Weaviate
-  - [x] PGVector
-  - [ ] Elasticsearch
-  - [ ] ClickHouse
-  - [ ] Faiss 
-
-- [ ] Testing and Evaluation Capability Building
-  - [ ] Knowledge QA datasets
-  - [ ] Question collection [easy, medium, hard]:
-  - [ ] Scoring mechanism
-  - [ ] Testing and evaluation using Excel + DB datasets
-
-### Multi Datasource Support
-
-- Multi Datasource Support 
-  - [x] MySQL
-  - [x] PostgreSQL
-  - [x] Spark
-  - [x] DuckDB
-  - [x] Sqlite
-  - [x] MSSQL
-  - [x] ClickHouse
-  - [ ] Oracle
-  - [ ] Redis
-  - [ ] MongoDB
-  - [ ] HBase
-  - [x] Doris
-  - [ ] DB2
-  - [ ] Couchbase
-  - [ ] Elasticsearch
-  - [ ] OceanBase
-  - [ ] TiDB
-  - [ ] StarRocks
-
-### Multi-Models And vLLM
-- [x] [Cluster Deployment](https://docs.dbgpt.site/docs/installation/model_service/cluster)
-- [x] [Fastchat Support](https://github.com/lm-sys/FastChat)
-- [x] [vLLM Support](https://docs.dbgpt.site/docs/installation/advanced_usage/vLLM_inference)
-- [ ] Cloud-native environment and support for Ray environment
-- [ ] Service Registry(eg:nacos)
-- [ ] Compatibility with OpenAI's interfaces
-- [ ] Expansion and optimization of embedding models
-
-### Agents market and Plugins
-- [x] multi-agents framework
-- [x] custom plugin development 
-- [x] plugin market
-- [ ] Integration with CoT
-- [ ] Enrich plugin sample library
-- [ ] Support for AutoGPT protocol
-- [ ] Integration of multi-agents and visualization capabilities, defining LLM+Vis new standards
-
-### Cost and Observability
-- [x] [debugging](https://docs.dbgpt.site/docs/application_manual/advanced_tutorial/debugging)
-- [ ] Observability
-- [ ] cost & budgets
-
-### Text2SQL Finetune
-- support llms
-  - [x] LLaMA
-  - [x] LLaMA-2
-  - [x] BLOOM
-  - [x] BLOOMZ
-  - [x] Falcon
-  - [x] Baichuan
-  - [x] Baichuan2
-  - [x] InternLM
-  - [x] Qwen
-  - [x] XVERSE
-  - [x] ChatGLM2
+- To check detailed guidelines for new contributions, please refer to [how to contribute](https://github.com/eosphoros-ai/DB-GPT/blob/main/CONTRIBUTING.md)
 
--  SFT Accuracy
-As of October 10, 2023, through the fine-tuning of an open-source model with 13 billion parameters using this project, we have achieved execution accuracy on the Spider dataset that surpasses even GPT-4!
-
-[More Information about Text2SQL finetune](https://github.com/eosphoros-ai/DB-GPT-Hub)
 
 ## Licence
 The MIT License (MIT)
@@ -272,8 +184,4 @@ If you find `DB-GPT` useful for your research or development, please cite the fo
 We are working on building a community, if you have any ideas for building the community, feel free to contact us.
 [![](https://dcbadge.vercel.app/api/server/7uQnPuveTY?compact=true&style=flat)](https://discord.gg/7uQnPuveTY)
 
-<p align="center">
-  <img src="./assets/wechat.jpg" width="300px" />
-</p>
-
 [![Star History Chart](https://api.star-history.com/svg?repos=csunny/DB-GPT&type=Date)](https://star-history.com/#csunny/DB-GPT)
diff --git a/README.zh.md b/README.zh.md
@@ -8,19 +8,19 @@
 <div align="center">
   <p>
     <a href="https://github.com/eosphoros-ai/DB-GPT">
-        <img alt="stars" src="https://img.shields.io/github/stars/csunny/db-gpt?style=social" />
+        <img alt="stars" src="https://img.shields.io/github/stars/eosphoros-ai/db-gpt?style=social" />
     </a>
     <a href="https://github.com/eosphoros-ai/DB-GPT">
-        <img alt="forks" src="https://img.shields.io/github/forks/csunny/db-gpt?style=social" />
+        <img alt="forks" src="https://img.shields.io/github/forks/eosphoros-ai/db-gpt?style=social" />
     </a>
     <a href="https://opensource.org/licenses/MIT">
       <img alt="License: MIT" src="https://img.shields.io/badge/License-MIT-yellow.svg" />
     </a>
      <a href="https://github.com/eosphoros-ai/DB-GPT/releases">
-      <img alt="Release Notes" src="https://img.shields.io/github/release/csunny/DB-GPT" />
+      <img alt="Release Notes" src="https://img.shields.io/github/release/eosphoros-ai/DB-GPT" />
     </a>
     <a href="https://github.com/eosphoros-ai/DB-GPT/issues">
-      <img alt="Open Issues" src="https://img.shields.io/github/issues-raw/csunny/DB-GPT" />
+      <img alt="Open Issues" src="https://img.shields.io/github/issues-raw/eosphoros-ai/DB-GPT" />
     </a>
     <a href="https://discord.gg/7uQnPuveTY">
       <img alt="Discord" src="https://dcbadge.vercel.app/api/server/7uQnPuveTY?compact=true&style=flat" />
@@ -33,39 +33,56 @@
     </a>
   </p>
 
-[**English**](README.md) | [**Discord**](https://discord.gg/7uQnPuveTY) | [**Documents**](https://www.yuque.com/eosphoros/dbgpt-docs/bex30nsv60ru0fmx) | [**WeChat**](https://github.com/csunny/DB-GPT/blob/main/README.zh.md#%E8%81%94%E7%B3%BB%E6%88%91%E4%BB%AC) | [**Community**](https://github.com/eosphoros-ai/community) | [**Paper**](https://arxiv.org/pdf/2312.17449.pdf)
+[**English**](README.md) | [**Discord**](https://discord.gg/7uQnPuveTY) | [**Documents**](https://www.yuque.com/eosphoros/dbgpt-docs/bex30nsv60ru0fmx) | [**WeChat**](https://github.com/eosphoros-ai/DB-GPT/blob/main/README.zh.md#%E8%81%94%E7%B3%BB%E6%88%91%E4%BB%AC) | [**Community**](https://github.com/eosphoros-ai/community) | [**Paper**](https://arxiv.org/pdf/2312.17449.pdf)
 </div>
 
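 ## What is DB-GPT?
-DB-GPT is an open-source large model framework for the database domain. Its purpose is to build infrastructure for the large model field: by developing technical capabilities such as multi-model management, Text2SQL performance optimization, a RAG framework and its optimization, and Multi-Agents framework collaboration, it makes building large model applications around databases simpler and more convenient.
-
+DB-GPT is an open-source large model framework for the data domain. Its purpose is to build infrastructure for the large model field: by developing technical capabilities such as multi-model management, Text2SQL performance optimization, a RAG framework and its optimization, and Multi-Agents framework collaboration, it makes building large model applications around databases simpler and more convenient.
 In the Data 3.0 era, enterprises and developers can build their own bespoke applications on top of models and databases with less code.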
 
-## Contents
+## Demo
 
-- [Install](#安装)
-- [Demo](#效果演示)
+### Data Agents 
+![data agents](https://github.com/eosphoros-ai/DB-GPT/assets/17919400/ced393b4-9180-437a-90c5-b43633cda8cb)
+
+
+## Contents
 - [Architecture](#架构方案)
+- [Install](#安装)
 - [Features](#特性一览)
 - [Contribution](#贡献)
 - [Roadmap](#路线图)
 - [Contact Us](#联系我们)
 
-[DB-GPT video introduction](https://www.bilibili.com/video/BV1au41157bj/?spm_id_from=333.337.search-card.all.click&vd_source=7792e22c03b7da3c556a450eb42c8a0f)
+## Architecture
 
-## Demo
+<p align="center">
+  <img src="./assets/dbgpt.png" width="800px" />
+</p>
 
-##### Chat Data
-![chatdata](https://github.com/eosphoros-ai/DB-GPT/assets/13723926/1f77079e-d018-4eee-982b-9b6a66bf1063)
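+The core capabilities consist mainly of the following parts:
+- **RAG (Retrieval Augmented Generation)**: RAG is currently the area with the most practical deployments and the most urgent demand. DB-GPT has already implemented a RAG-based framework, and users can build knowledge-based applications on DB-GPT's RAG capabilities.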
 
-##### Chat Excel
-![excel](https://github.com/eosphoros-ai/DB-GPT/assets/13723926/3044e83b-a71e-41fe-a1e2-98e479e0ab59)
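+- **GBI**: Generative BI is one of the core capabilities of the DB-GPT project, providing the foundational data intelligence technology for building enterprise report analysis and business insights.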
 
-#### Generating analysis charts from natural-language conversation
-<p align="left">
-  <img src="./assets/dashboard.png" width="800px" />
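+- **Fine-tuning framework**: Model fine-tuning is an indispensable capability for any enterprise deploying models in vertical and niche domains. DB-GPT provides a complete fine-tuning framework that integrates seamlessly with the DB-GPT project; in recent fine-tuning runs, accuracy on the Spider dataset has reached 82.5%.
+
+- **Data-driven Multi-Agents framework**: DB-GPT provides a data-driven, self-evolving fine-tuning framework whose goal is to make decisions and execute them continuously based on data.
+
+- **Data factory**: The data factory is mainly about cleaning and processing trustworthy knowledge and data in the era of large models.
+
+- **Data sources**: Connects to all kinds of data sources so that production business data flows seamlessly into DB-GPT's core capabilities.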
+
+### RAG production practice architecture
+<p align="center">
+  <img src="./assets/RAG-IN-ACTION.jpg" width="800px" />
 </p>
 
+### SubModule
+- [DB-GPT-Hub](https://github.com/eosphoros-ai/DB-GPT-Hub) Continuously improves Text2SQL performance through fine-tuning
+- [DB-GPT-Plugins](https://github.com/eosphoros-ai/DB-GPT-Plugins) DB-GPT plugin repository, compatible with Auto-GPT
+- [GPT-Vis](https://github.com/eosphoros-ai/DB-GPT-Web) Visualization protocol
+
 ## Install
 
 ![Docker](https://img.shields.io/badge/docker-%230db7ed.svg?style=for-the-badge&logo=docker&logoColor=white)
@@ -84,7 +101,7 @@ DB-GPT是一个开源的数据库领域大模型框架。目的是构建大模
   - [**Chat Excel**](https://www.yuque.com/eosphoros/dbgpt-docs/prugoype0xd2g4bb)
   - [**Chat DB**](https://www.yuque.com/eosphoros/dbgpt-docs/wswpv3zcm2c9snmg)
   - [**Report analysis**](https://www.yuque.com/eosphoros/dbgpt-docs/vsv49p33eg4p5xc1)
-  - [**Plugins**](https://www.yuque.com/eosphoros/dbgpt-docs/pom41m7oqtdd57hm)
+  - [**Agents**](https://www.yuque.com/eosphoros/dbgpt-docs/pom41m7oqtdd57hm)
 - [**Model service deployment**](https://www.yuque.com/eosphoros/dbgpt-docs/vubxiv9cqed5mc6o)
   - [**Stand-alone deployment**](https://www.yuque.com/eosphoros/dbgpt-docs/kwg1ed88lu5fgawb)
   - [**Cluster deployment**](https://www.yuque.com/eosphoros/dbgpt-docs/gmbp9619ytyn2v1s)
@@ -137,34 +154,6 @@ DB-GPT是一个开源的数据库领域大模型框架。目的是构建大模
 - [Supported data sources](https://www.yuque.com/eosphoros/dbgpt-docs/rc4r27ybmdwg9472)
 
 
-## Architecture
-The overall architecture of DB-GPT is shown in the following figure
-<p align="center">
-  <img src="./assets/DB-GPT_zh.png" width="800px" />
-</p>
-
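-The core capabilities consist mainly of the following parts:
-- **RAG (Retrieval Augmented Generation)**: RAG is currently the area with the most practical deployments and the most urgent demand. DB-GPT has already implemented a RAG-based framework, and users can build knowledge-based applications on DB-GPT's RAG capabilities.
-
-- **GBI**: Generative BI is one of the core capabilities of the DB-GPT project, providing the foundational data intelligence technology for building enterprise report analysis and business insights.
-
-- **Fine-tuning framework**: Model fine-tuning is an indispensable capability for any enterprise deploying models in vertical and niche domains. DB-GPT provides a complete fine-tuning framework that integrates seamlessly with the DB-GPT project; in recent fine-tuning runs, accuracy on the Spider dataset has reached 82.5%.
-
-- **Data-driven Multi-Agents framework**: DB-GPT provides a data-driven, self-evolving fine-tuning framework whose goal is to make decisions and execute them continuously based on data.
-
-- **Data factory**: The data factory is mainly about cleaning and processing trustworthy knowledge and data in the era of large models.
-
-- **Data sources**: Connects to all kinds of data sources so that production business data flows seamlessly into DB-GPT's core capabilities.
-
-### RAG production practice architecture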
-<p align="center">
-  <img src="./assets/RAG-IN-ACTION.jpg" width="800px" />
-</p>
-
-### SubModule
-- [DB-GPT-Hub](https://github.com/csunny/DB-GPT-Hub) Continuously improves Text2SQL performance through fine-tuning
-- [DB-GPT-Plugins](https://github.com/csunny/DB-GPT-Plugins) DB-GPT plugin repository, compatible with Auto-GPT
-- [DB-GPT-Web](https://github.com/csunny/DB-GPT-Web)  Multi-platform interactive front-end UI
 
 ## Image
 
@@ -180,7 +169,11 @@ DB-GPT是一个开源的数据库领域大模型框架。目的是构建大模
 
 ### Multi-model usage
 
-[Usage guide](https://www.yuque.com/eosphoros/dbgpt-docs/huzgcf2abzvqy8uv)
+- [Usage guide](https://www.yuque.com/eosphoros/dbgpt-docs/huzgcf2abzvqy8uv)
+
+### Data Agents usage
+
+- [Data Agents](https://www.yuque.com/eosphoros/dbgpt-docs/gwz4rayfuwz78fbq)
 
 # Contribution
 > Please run `black .` before submitting code.
@@ -193,10 +186,6 @@ The MIT License (MIT)
 
 # Roadmap
 
-<p align="left">
-  <img src="./assets/roadmap.jpg" width="800px" />
-</p>
-
 ### Knowledge base RAG retrieval optimization
 
 - [x] Multi Documents

diff --git a/assets/dbgpt.png b/assets/dbgpt.png
diff --git a/assets/roadmap.jpg b/assets/roadmap.jpg
diff --git a/assets/wechat.jpg b/assets/wechat.jpg