rerank: app.py hangs at runtime with GPU utilization at 100% and never returns a result #1879

Closed
1 of 7 tasks
xiaoToby opened this issue Jun 28, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@xiaoToby

xiaoToby commented Jun 28, 2024

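Routine checks

  • I have confirmed that there is currently no similar issue
  • I have read the project README and the project documentation in full
  • I used my own key and confirmed that it works properly
  • I understand and am willing to follow up on this issue, help with testing, and provide feedback
  • I understand and accept the above, and I understand that the maintainers have limited time, so issues that do not follow the rules may be ignored or closed directly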

Your version
v4.8

  • Public cloud version
  • Self-hosted version, specific version number:

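Problem description, log screenshots
First, I used the image registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-v2-m3:v0.1 to start the model and connect it to FastGPT.
config.json:
image
docker-compose.yml:
image
FastGPT error message:
image

Next, I started a new container and copied the model files, together with the app.py file from registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-v2-m3:v0.1, into that container.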
app.py:

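import os

import uvicorn
from fastapi import FastAPI, Security, HTTPException
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from FlagEmbedding import FlagReranker
from pydantic import Field, BaseModel, validator
from typing import Optional, List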

app = FastAPI()
security = HTTPBearer()
env_bearer_token = 'ACCESS_TOKEN'

class QADocs(BaseModel):
    query: Optional[str]
    documents: Optional[List[str]]


class Singleton(type):
    def __call__(cls, *args, **kwargs):
        if not hasattr(cls, '_instance'):
            cls._instance = super().__call__(*args, **kwargs)
        return cls._instance


RERANK_MODEL_PATH = os.path.join(os.path.dirname(__file__), "bge-reranker-v2-m3")

class ReRanker(metaclass=Singleton):
    def __init__(self, model_path):
        self.reranker = FlagReranker(model_path, use_fp16=False)

    def compute_score(self, pairs: List[List[str]]):
        if len(pairs) > 0:
            result = self.reranker.compute_score(pairs, normalize=True)
            if isinstance(result, float):
                result = [result]
            return result
        else:
            return None

class Chat(object):
    def __init__(self, rerank_model_path: str = RERANK_MODEL_PATH):
        self.reranker = ReRanker(rerank_model_path)

    def fit_query_answer_rerank(self, query_docs: QADocs) -> List:
        if query_docs is None or len(query_docs.documents) == 0:
            return []

        pair = [[query_docs.query, doc] for doc in query_docs.documents]
        scores = self.reranker.compute_score(pair)

        new_docs = []
        for index, score in enumerate(scores):
            new_docs.append({"index": index, "text": query_docs.documents[index], "score": score})
        results = [{"index": documents["index"], "relevance_score": documents["score"]} for documents in list(sorted(new_docs, key=lambda x: x["score"], reverse=True))]
        return results

@app.post('/v1/rerank')
async def handle_post_request(docs: QADocs, credentials: HTTPAuthorizationCredentials = Security(security)):
    token = credentials.credentials
    if env_bearer_token is not None and token != env_bearer_token:
        raise HTTPException(status_code=401, detail="Invalid token")
    chat = Chat()
    try:
        results = chat.fit_query_answer_rerank(docs)
        return {"results": results}
    except Exception as e:
        print(f"报错:\n{e}")
        return {"error": "重排出错"}

if __name__ == "__main__":
    token = os.getenv("ACCESS_TOKEN")
    if token is not None:
        env_bearer_token = token
    try:
        uvicorn.run(app, host='0.0.0.0', port=7013)
    except Exception as e:
        print(f"API启动失败!\n报错:\n{e}")
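The ranking step in fit_query_answer_rerank can be checked without loading the model or touching a GPU. The following is a minimal sketch, assuming made-up scores in place of FlagReranker output; rank_documents and its top_n parameter are hypothetical names that do not appear in app.py. Note that QADocs declares only query and documents, so a "top_n" field in the request body appears to be ignored by the handler as posted.

```python
from typing import List, Optional


def rank_documents(scores: List[float], top_n: Optional[int] = None) -> List[dict]:
    """Sort document indices by score, highest first, in the response shape used by app.py."""
    # enumerate() pairs each score with the position of its document in the request.
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    results = [{"index": index, "relevance_score": score} for index, score in ranked]
    # Hypothetical extension: truncate to top_n when the caller asks for it.
    return results if top_n is None else results[:top_n]


# Made-up scores standing in for the model output.
fake_scores = [0.02, 0.10, 0.55, 0.98, 0.01]
top_three = rank_documents(fake_scores, top_n=3)
print(top_three)
```

If this step behaves as expected, a hang is more likely to come from model loading or inference than from the sorting logic.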

Test file test.py:

import requests

url = f"http://localhost:7013/v1/rerank"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer sk-tidukjjinarerank"
}

data = {
    "model": "bge-reranker-v2-m3",
    "query": "Organic skincare products for sensitive skin",
    "documents": [
        "Eco-friendly kitchenware for modern homes",
        "Biodegradable cleaning supplies for eco-conscious consumers",
        "Organic cotton baby clothes for sensitive skin",
        "Natural organic skincare range for sensitive skin",
        "Tech gadgets for smart homes: 2024 edition",
        "Sustainable gardening tools and compost solutions",
        "Sensitive skin-friendly facial cleansers and toners",
        "Organic food wraps and storage solutions",
        "All-natural pet food for dogs with allergies",
        "Yoga mats made from recycled materials"
    ],
    "top_n": 3
}

response = requests.post(url, headers=headers, json=data)
print(response.json())
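Because requests.post has no timeout by default, a stalled server makes test.py block indefinitely. A minimal sketch of the same call with a timeout follows, so that a hang surfaces as an error on the client side. The helper names build_rerank_request and post_rerank, the token placeholder, and the 5 second connect / 60 second read limits are all assumptions chosen for illustration.

```python
def build_rerank_request(query, documents, token, base_url="http://localhost:7013"):
    """Assemble keyword arguments for requests.post, including a timeout."""
    return {
        "url": f"{base_url}/v1/rerank",
        "headers": {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        "json": {"query": query, "documents": documents},
        # (connect timeout, read timeout) in seconds; values are illustrative.
        "timeout": (5, 60),
    }


def post_rerank(request_kwargs):
    """Send the request; return parsed JSON, or None if the server does not answer in time."""
    import requests  # imported here so the builder above can be used without the package

    try:
        response = requests.post(**request_kwargs)
    except requests.exceptions.Timeout:
        print("rerank request timed out; the server may be stuck")
        return None
    return response.json()


if __name__ == "__main__":
    kwargs = build_rerank_request(
        "Organic skincare products for sensitive skin",
        ["Natural organic skincare range for sensitive skin"],
        token="your-access-token",  # placeholder, replace with the real ACCESS_TOKEN
    )
    print(post_rerank(kwargs))
```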

The result is as follows:
image
image
image

@c121914yu @nongmo677 @lijiajun1997
Steps to reproduce

Expected result

Related screenshots

@xiaoToby xiaoToby added the bug Something isn't working label Jun 28, 2024
@xiaoToby
Author

#1111 (comment)

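I switched to a different image and replaced the original app.py file inside the container.
The result is the same: there is no output, and utilization on all three GPUs is at 100%. @c121914yu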

@Essence9999

image
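Modify the file path in app.py
Build the image with docker build
Create the container with docker run
Configure oneapi
Set up config by following the official documentation
Configure docker compose
Start the services
Run docker logs reranker to check whether the service started properly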
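Alongside checking the container logs, it can help to confirm that something is actually listening on the rerank port before sending requests. The following is a minimal sketch using only the standard library; is_port_open is a hypothetical helper, and port 7013 is taken from the app.py posted in this issue.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or resolution failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 7013 is the port passed to uvicorn.run in app.py.
    print(is_port_open("localhost", 7013))
```

An open port only shows that uvicorn is accepting connections. It does not show that the model finished loading or that inference will return.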

@xiaoToby
Author

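image Modify the file path in app.py; build the image with docker build; create the container with docker run; configure oneapi; set up config by following the official documentation; configure docker compose; start the services; run docker logs reranker to check whether the service started properly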

image

@xiaoToby
Author

Solved.
My fix was to add the environment variable CUDA_VISIBLE_DEVICES to the docker-compose file.
image
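The fix was applied in docker-compose, and the value used is only visible in the screenshot. For anyone running app.py outside Compose, the same variable can be set from Python, provided this happens before any library initializes CUDA. A minimal sketch follows, assuming device index "0" as an illustrative value (not the one from this thread) and a hypothetical helper named pin_visible_gpus.

```python
import os


def pin_visible_gpus(device_ids: str = "0") -> str:
    """Restrict the GPUs that CUDA can see for this process.

    Must be called before torch or FlagEmbedding is imported, because the
    variable is read when CUDA is initialized.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = device_ids
    return os.environ["CUDA_VISIBLE_DEVICES"]


# "0" is an assumed value for illustration; use the index of the GPU you want.
visible = pin_visible_gpus("0")
print(f"CUDA_VISIBLE_DEVICES={visible}")

# Import GPU libraries only after the variable is set, for example:
# from FlagEmbedding import FlagReranker
```

With a single index set, the process sees exactly one GPU, which it addresses as device 0 regardless of the physical index.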
