
自定义模型 Custom Models

Keldos edited this page May 10, 2024 · 5 revisions

Chuanhu Chat lets users add custom models. For example, if you run a local large-model inference service with Ollama, you can use this feature to create a custom model, set its `api_host` and `model_name` fields, and then use that model through Chuanhu Chat. You can also set a different `system` prompt and other parameters to quickly apply a custom model to tasks such as text translation. Custom models are stored as a JSON dictionary in the `extra_model_metadata` field of `config.json`, where each key is the custom model's display name and each value is a dictionary of that model's parameters. For example:

{
    ...(other content in config.json)

    "extra_model_metadata": {
        "My Chat Model": {
            "model_name": "llama3-70b-chat",
            "description": "This is meta's powerful open source LLM.",
            "model_type": "OpenAIVision",
            "multimodal": false,
            "api_host": "https://www.example.com",
            "token_limit": 4096,
            "max_generation": 4096
        }
    }
}


键值设置 Key-Value Configurations


The `model_type` field specifies the API format the custom model uses; for example, if it follows the same API as OpenAI Chat models, it should be set to `OpenAIVision`. All OpenAI Chat models should use the `OpenAIVision` type. If the model supports multimodal input, set `multimodal` to true. The supported types are: ChatGLM, LLaMA, XMChat, StableLM, MOSS, YuanAI, Minimax, ChuanhuAgent, GooglePaLM, LangchainChat, Midjourney, Spark, OpenAIInstruct, Claude, Qwen, OpenAIVision, ERNIE, DALLE3, GoogleGemini, GoogleGemma, Ollama, Groq.
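For instance, a minimal entry for a model served locally by Ollama through an OpenAI-compatible endpoint might look like the following sketch (the host, port, and model name here are illustrative assumptions; adjust them to match your deployment):

```json
"extra_model_metadata": {
    "My Local Llama": {
        "model_name": "llama3",
        "model_type": "OpenAIVision",
        "api_host": "http://localhost:11434",
        "token_limit": 8192
    }
}
```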

Here are all the applicable configuration settings:

| Key | Description |
| --- | --- |
| `repo_id` | HuggingFace repository ID, for models that are downloaded from HuggingFace and run locally. |
| `filelist` | List of files to download from the repository; currently only `.gguf` files are supported. |
| `model_name` | The model name used by the API service, as opposed to the display name. |
| `description` | Description of the model, shown when the cursor hovers over the info icon in the chatbot header. |
| `placeholder` | Placeholder content for the chat area, shown when no message is present. An optional dictionary with the following supported keys:<br>`logo`: path to the model's logo image,<br>`slogan`: the model's slogan or greeting,<br>`question_1`: candidate example question 1,<br>`question_2`: candidate example question 2,<br>`question_3`: candidate example question 3,<br>`question_4`: candidate example question 4.<br>`question_3` and `question_4` are hidden when the screen width is small. |
| `model_type` | Model type, which determines the model's behavior. If not set, it is inferred from the model name. |
| `multimodal` | Whether the model is multimodal. |
| `api_host` | API host of the model's API. Supported for the model types `XMChat`, `OpenAIInstruct`, `Claude`, `OpenAIVision`, `DALLE3`, and `Groq`; not yet supported for `YuanAI`, `Minimax`, `ChuanhuAgent`, `GooglePaLM`, `Midjourney`, `Spark`, `ERNIE`, `GoogleGemini`, and `Ollama`. More model types will support this setting in the future. |
| `api_key` | API key for the model's API. |
| `system` | The model's system prompt. |
| `token_limit` | Size of the context window. |
| `single_turn` | Whether single-turn dialogue is enabled. |
| `temperature` | Sampling temperature used during generation. |
| `top_p` | Nucleus-sampling `top_p`: only tokens within the top cumulative probability mass are considered. |
| `n_choices` | Number of completions to generate; usually does not need to be set. |
| `stop` | Character or sequence at which generation stops. |
| `max_generation` | Maximum number of tokens per generation. |
| `presence_penalty` | Penalizes tokens that have already appeared. |
| `frequency_penalty` | Penalizes repeated vocabulary in the output. |
| `logit_bias` | Weights applied to certain tokens during generation. |
| `metadata` | Additional metadata for the model. |
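As a sketch of the `repo_id`/`filelist` mechanism, an entry for a locally run GGUF model might look like the following (the repository and file name here mirror the built-in Llama-2 entry later on this page; the `token_limit` and `system` values are illustrative assumptions — verify the repository contents on HuggingFace before use):

```json
"extra_model_metadata": {
    "My Local Llama-2": {
        "repo_id": "TheBloke/Llama-2-7b-Chat-GGUF",
        "filelist": ["llama-2-7b-chat.Q6_K.gguf"],
        "token_limit": 4096,
        "system": "You are a helpful assistant."
    }
}
```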

After adding a custom model, you also need to add its name to the `extra_models` list field, or alternatively set the `available_models` field. Note that what goes into `extra_models` or `available_models` is the model's display name (for instance, `My Chat Model` in the example above), not the value of its `model_name` field. For example:

"available_models": ["GPT3.5 Turbo", "GPT4 Turbo", "GPT4 Vision"], // list of available models; overrides the default list
"extra_models": ["Model Name 3", "Model Name 4", ...], // extra models, appended after the available model list

示例 Example

    "available_models": ["GPT3.5 Turbo", "GPT4 Turbo", "Mean CEO"],

    "extra_model_metadata": {
        "Mean CEO": {
            "model_name": "gpt-3.5-turbo",
            "description": "An expert of mean jokes, who answers all user questions in a humorous and mean way.",
            "model_type": "OpenAIVision",
            "multimodal": false,
            "api_host": "http://localhost:11434",
            "token_limit": 4096,
            "max_generation": 4096,
            "system": "You are an expert of mean jokes. You Answer all user questions in a humorous and mean way, pretending you are Elon Musk.",
            "placeholder": {
                "logo": "https://wallpaperaccess.com/full/9170621.png",
                "slogan": "I am NOT politically correct",
                "question_1": "How can I get X Premium+?",
                "question_2": "Roast me pls",
                "question_3": "What's the true nature of the universe",
                "question_4": "Ai is biased"
            }
        }
    }
(Example screenshot)

内置模型 Built-in Models


The built-in models and their metadata are listed below. Some of them do not appear in the default model list and can be added manually:

{
    "Llama-2-7B":{
        "repo_id": "TheBloke/Llama-2-7B-GGUF",
        "filelist": ["llama-2-7b.Q6_K.gguf"],
    },
    "Llama-2-7B-Chat":{
        "repo_id": "TheBloke/Llama-2-7b-Chat-GGUF",
        "filelist": ["llama-2-7b-chat.Q6_K.gguf"],
    },
    "Qwen 7B": {
        "repo_id": "Qwen/Qwen-7B-Chat-Int4",
    },
    "Qwen 14B": {
        "repo_id": "Qwen/Qwen-14B-Chat-Int4",
    },
    "GPT3.5 Turbo": {
        "model_name": "gpt-3.5-turbo",
        "description": "gpt3.5turbo_description",
        "token_limit": 4096,
    },
    "GPT3.5 Turbo Instruct": {
        "model_name": "gpt-3.5-turbo-instruct",
        "description": "gpt3.5turbo_instruct_description",
        "token_limit": 4096,
    },
    "GPT3.5 Turbo 16K": {
        "model_name": "gpt-3.5-turbo-16k",
        "description": "gpt3.5turbo_16k_description",
        "token_limit": 16384,
    },
    "GPT3.5 Turbo 0301": {
        "model_name": "gpt-3.5-turbo-0301",
        "token_limit": 4096,
    },
    "GPT3.5 Turbo 0613": {
        "model_name": "gpt-3.5-turbo-0613",
        "token_limit": 4096,
    },
    "GPT3.5 Turbo 1106": {
        "model_name": "gpt-3.5-turbo-1106",
        "token_limit": 16384,
    },
    "GPT4": {
        "model_name": "gpt-4",
        "description": "gpt4_description",
        "token_limit": 8192,
    },
    "GPT4 32K": {
        "model_name": "gpt-4-32k",
        "description": "gpt4_32k_description",
        "token_limit": 32768,
    },
    "GPT4 Turbo": {
        "model_name": "gpt-4-turbo",
        "description": "gpt4turbo_description",
        "token_limit": 128000,
        "multimodal": true
    },
    "Claude 3 Haiku": {
        "model_name": "claude-3-haiku-20240307",
        "description": "claude3_haiku_description",
        "token_limit": 200000,
        "max_generation": 4096,
        "multimodal": true
    },
    "Claude 3 Sonnet": {
        "model_name": "claude-3-sonnet-20240229",
        "description": "claude3_sonnet_description",
        "token_limit": 200000,
        "max_generation": 4096,
        "multimodal": true
    },
    "Claude 3 Opus": {
        "model_name": "claude-3-opus-20240229",
        "description": "claude3_opus_description",
        "token_limit": 200000,
        "max_generation": 4096,
        "multimodal": true
    },
    "川虎助理": {"model_name": "川虎助理"},
    "川虎助理 Pro": {"model_name": "川虎助理 Pro"},
    "DALL-E 3": {"model_name": "dall-e-3"},
    "ERNIE-Bot-turbo": {
        "model_name": "ERNIE-Bot-turbo",
        "token_limit": 1024,
    },
    "ERNIE-Bot": {
        "model_name": "ERNIE-Bot",
        "token_limit": 1024,
    },
    "ERNIE-Bot-4": {
        "model_name": "ERNIE-Bot-4",
        "token_limit": 1024,
    },
    "Gemini Pro": {
        "model_name": "gemini-pro",
        "token_limit": 30720,
    },
    "Gemini Pro Vision": {
        "model_name": "gemini-pro-vision",
        "token_limit": 30720,
    },
    "Ollama": {
        "model_name": "ollama",
        "token_limit": 4096,
    },
    "Gemma 2B": {
        "repo_id": "google/gemma-2b-it",
        "model_name": "gemma-2b-it",
        "token_limit": 8192,
    },
    "Gemma 7B": {
        "repo_id": "google/gemma-7b-it",
        "model_name": "gemma-7b-it",
        "token_limit": 8192,
    },
    "Groq LLaMA3 8B": {
        "model_name": "llama3-8b-8192",
        "description": "groq_llama3_8b_description",
        "token_limit": 8192,
    },
    "Groq LLaMA3 70B": {
        "model_name": "llama3-70b-8192",
        "description": "groq_llama3_70b_description",
        "token_limit": 8192,
    },
    "Groq Mixtral 8x7B": {
        "model_name": "mixtral-8x7b-32768",
        "description": "groq_mixtral_8x7b_description",
        "token_limit": 32768,
    },
    "Groq Gemma 7B": {
        "model_name": "gemma-7b-it",
        "description": "groq_gemma_7b_description",
        "token_limit": 8192,
    },
    "GooglePaLM": {"model_name": "models/chat-bison-001"},
    "xmchat": {"model_name": "xmchat"},
    "Azure OpenAI": {"model_name": "azure-openai"},
    "yuanai-1.0-base_10B": {"model_name": "yuanai-1.0-base_10B"},
    "yuanai-1.0-translate": {"model_name": "yuanai-1.0-translate"},
    "yuanai-1.0-dialog": {"model_name": "yuanai-1.0-dialog"},
    "yuanai-1.0-rhythm_poems": {"model_name": "yuanai-1.0-rhythm_poems"},
    "minimax-abab5-chat": {"model_name": "minimax-abab5-chat"},
    "midjourney": {"model_name": "midjourney"},
    "讯飞星火大模型V3.0": {"model_name": "讯飞星火大模型V3.0"},
    "讯飞星火大模型V2.0": {"model_name": "讯飞星火大模型V2.0"},
    "讯飞星火大模型V1.5": {"model_name": "讯飞星火大模型V1.5"},
}
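For example, to surface one of the built-in models that is absent from the default model list, add its display name to `extra_models` in `config.json` (the name below is used for illustration; any display name from the metadata above works the same way):

```json
"extra_models": ["GPT3.5 Turbo 0301"]
```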