Commit 2641e9b: Adding an example of setting up local embedding model (#322)
ZiTao-Li committed Jul 2, 2024
1 parent 401fe3a commit 2641e9b
Showing 2 changed files with 212 additions and 0 deletions.
docs/sphinx_doc/en/source/tutorial/210-rag.md (105 additions, 0 deletions)
@@ -190,6 +190,111 @@ RAG agent is an agent that can generate answers based on the retrieved knowledge
Your agent will be equipped with a list of knowledge according to the `knowledge_id_list`.
You can decide how to use the retrieved content and even update and refresh the index in your agent's `reply` function.

## (Optional) Setting up a local embedding model service

For those interested in setting up a local embedding service, we provide the following example based on the
`sentence_transformers` package, a popular library specialized for embedding models (built on the `transformers` package and compatible with both HuggingFace and ModelScope models).
In this example, we use `gte-Qwen2-7B-instruct`, one of the state-of-the-art embedding models.

* Step 1: Follow the instructions on [HuggingFace](https://huggingface.co/Alibaba-NLP/gte-Qwen2-7B-instruct) or [ModelScope](https://www.modelscope.cn/models/iic/gte_Qwen2-7B-instruct) to download the embedding model.
(If you cannot access HuggingFace directly, you may want to use a HuggingFace mirror by running the bash command
`export HF_ENDPOINT=https://hf-mirror.com`, or by adding `os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"` to your Python code.)
* Step 2: Set up the server. The following code is for reference.

```python
import argparse
import datetime

from flask import Flask, request
from sentence_transformers import SentenceTransformer


def create_timestamp(format_: str = "%Y-%m-%d %H:%M:%S") -> str:
    """Get the current timestamp."""
    return datetime.datetime.now().strftime(format_)


app = Flask(__name__)

# `model` is created in the __main__ block below before the app starts.
model = None


@app.route("/embedding/", methods=["POST"])
def get_embedding() -> dict:
    """Receive a POST request and return the embedding response."""
    json_body = request.get_json()
    inputs = json_body.pop("inputs")

    # Accept both a single string and a list of strings.
    if isinstance(inputs, str):
        inputs = [inputs]

    embeddings = model.encode(inputs)

    return {
        "data": {
            "completion_tokens": 0,
            "messages": {},
            "prompt_tokens": 0,
            "response": {
                "data": [
                    {
                        "embedding": emb.astype(float).tolist(),
                    }
                    for emb in embeddings
                ],
                "created": "",
                "id": create_timestamp(),
                "model": "flask_model",
                "object": "text_completion",
                "usage": {
                    "completion_tokens": 0,
                    "prompt_tokens": 0,
                    "total_tokens": 0,
                },
            },
            "total_tokens": 0,
            "username": "",
        },
    }


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_name_or_path", type=str, required=True)
    parser.add_argument("--device", type=str, default="auto")
    parser.add_argument("--port", type=int, default=8000)
    args = parser.parse_args()

    print("Setting up the embedding model...")
    # "auto" lets sentence_transformers pick the best available device.
    model = SentenceTransformer(
        args.model_name_or_path,
        device=None if args.device == "auto" else args.device,
    )

    app.run(port=args.port)
```

* Step 3: Start the server.
```bash
python setup_ms_service.py --model_name_or_path {$PATH_TO_gte_Qwen2_7B_instruct}
```
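
Once the server is running, it can also be queried over plain HTTP, without AgentScope. The sketch below only builds the JSON payload the Flask handler above expects (a single string or a list of strings under the `inputs` key); the commented `requests.post` line shows how it would be sent, assuming the default port of 8000.

```python
import json

# The handler above reads the "inputs" field, which may be a single
# string or a list of strings.
payload = {"inputs": ["hello world", "an embedding test"]}

# With the server running, the payload could be sent with the
# `requests` package, e.g.:
#   requests.post("http://127.0.0.1:8000/embedding/", json=payload)
print(json.dumps(payload))
```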


Test whether the service is running successfully:
```python
from agentscope.models.post_model import PostAPIEmbeddingWrapper

model = PostAPIEmbeddingWrapper(
    config_name="test_config",
    api_url="http://127.0.0.1:8000/embedding/",
    json_args={
        "max_length": 4096,
        "temperature": 0.5,
    },
)

print(model("testing"))
```
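
If you call the endpoint directly instead of going through the wrapper, the JSON you get back has the nested shape built by the Flask handler above. A small helper (hypothetical, not part of AgentScope) can pull the raw vectors out of it:

```python
def extract_embeddings(response: dict) -> list:
    """Collect the embedding vectors from the nested response payload."""
    return [item["embedding"] for item in response["data"]["response"]["data"]]

# A minimal response in the shape the Flask server above produces.
sample = {
    "data": {
        "response": {
            "data": [
                {"embedding": [0.1, 0.2]},
                {"embedding": [0.3, 0.4]},
            ],
        },
    },
}
print(extract_embeddings(sample))  # → [[0.1, 0.2], [0.3, 0.4]]
```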

[[Back to the top]](#210-rag-en)

docs/sphinx_doc/zh_CN/source/tutorial/210-rag.md (107 additions, 0 deletions)
@@ -174,6 +174,113 @@ A RAG agent is an agent that can generate answers based on retrieved knowledge.
**Build your own RAG agent.** As long as your agent configuration contains a `knowledge_id_list`, you can pass the agent together with this list to `KnowledgeBank.equip`; the agent will then be equipped with the corresponding knowledge.
You can decide in your own `reply` function how to extract and use information from the `Knowledge` objects, and even modify the knowledge base through `Knowledge`.


## (Optional) Setting up your own embedding model service

For users interested in hosting a local embedding model, we also provide the following example.
It is based on the `sentence_transformers` package, which is popular for embedding models (built on the `transformers` package and compatible with both HuggingFace and ModelScope models).
In this example, we use `gte-Qwen2-7B-instruct`, one of the state-of-the-art text embedding models.


* Step 1: Follow the instructions on [HuggingFace](https://huggingface.co/Alibaba-NLP/gte-Qwen2-7B-instruct) or [ModelScope](https://www.modelscope.cn/models/iic/gte_Qwen2-7B-instruct) to download the model.
(If you cannot download the model directly from HuggingFace, you can use a HuggingFace mirror by running the bash command `export HF_ENDPOINT=https://hf-mirror.com`, or by adding `os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"` to your Python code.)
* Step 2: Set up the server. The following code is for reference.

```python
import argparse
import datetime

from flask import Flask, request
from sentence_transformers import SentenceTransformer


def create_timestamp(format_: str = "%Y-%m-%d %H:%M:%S") -> str:
    """Get the current timestamp."""
    return datetime.datetime.now().strftime(format_)


app = Flask(__name__)

# `model` is created in the __main__ block below before the app starts.
model = None


@app.route("/embedding/", methods=["POST"])
def get_embedding() -> dict:
    """Receive a POST request and return the embedding response."""
    json_body = request.get_json()
    inputs = json_body.pop("inputs")

    # Accept both a single string and a list of strings.
    if isinstance(inputs, str):
        inputs = [inputs]

    embeddings = model.encode(inputs)

    return {
        "data": {
            "completion_tokens": 0,
            "messages": {},
            "prompt_tokens": 0,
            "response": {
                "data": [
                    {
                        "embedding": emb.astype(float).tolist(),
                    }
                    for emb in embeddings
                ],
                "created": "",
                "id": create_timestamp(),
                "model": "flask_model",
                "object": "text_completion",
                "usage": {
                    "completion_tokens": 0,
                    "prompt_tokens": 0,
                    "total_tokens": 0,
                },
            },
            "total_tokens": 0,
            "username": "",
        },
    }


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_name_or_path", type=str, required=True)
    parser.add_argument("--device", type=str, default="auto")
    parser.add_argument("--port", type=int, default=8000)
    args = parser.parse_args()

    print("Setting up the embedding model...")
    # "auto" lets sentence_transformers pick the best available device.
    model = SentenceTransformer(
        args.model_name_or_path,
        device=None if args.device == "auto" else args.device,
    )

    app.run(port=args.port)
```

* Step 3: Start the server.
```bash
python setup_ms_service.py --model_name_or_path {$PATH_TO_gte_Qwen2_7B_instruct}
```
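
Once the server is running, it can also be queried over plain HTTP, without AgentScope. The sketch below only builds the JSON payload the Flask handler above expects (a single string or a list of strings under the `inputs` key); the commented `requests.post` line shows how it would be sent, assuming the default port of 8000.

```python
import json

# The handler above reads the "inputs" field, which may be a single
# string or a list of strings.
payload = {"inputs": ["hello world", "an embedding test"]}

# With the server running, the payload could be sent with the
# `requests` package, e.g.:
#   requests.post("http://127.0.0.1:8000/embedding/", json=payload)
print(json.dumps(payload))
```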


Test whether the service has started successfully:
```python
from agentscope.models.post_model import PostAPIEmbeddingWrapper

model = PostAPIEmbeddingWrapper(
    config_name="test_config",
    api_url="http://127.0.0.1:8000/embedding/",
    json_args={
        "max_length": 4096,
        "temperature": 0.5,
    },
)

print(model("testing"))
```
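
If you call the endpoint directly instead of going through the wrapper, the JSON you get back has the nested shape built by the Flask handler above. A small helper (hypothetical, not part of AgentScope) can pull the raw vectors out of it:

```python
def extract_embeddings(response: dict) -> list:
    """Collect the embedding vectors from the nested response payload."""
    return [item["embedding"] for item in response["data"]["response"]["data"]]

# A minimal response in the shape the Flask server above produces.
sample = {
    "data": {
        "response": {
            "data": [
                {"embedding": [0.1, 0.2]},
                {"embedding": [0.3, 0.4]},
            ],
        },
    },
}
print(extract_embeddings(sample))  # → [[0.1, 0.2], [0.3, 0.4]]
```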

[[Back to the top]](#210-rag-zh)


