This repository has been archived by the owner on May 28, 2024. It is now read-only.
When I try to load a local model, an error is raised: ValueError: Tokenizer class BaichuanTokenizer does not exist or is not currently imported. I have set trust_remote_code=True.
I previously served this model with vLLM directly, and it worked well.
(ServeController pid=67769) Unhandled error (suppress with 'RAY_IGNORE_UNHANDLED_ERRORS=1'): ray::ServeReplica:ray-llm-myapp-baichuan:VLLMDeployment:pai--myapp-baichuan2-13b-chat-2.initialize_and_get_metadata() (pid=72307, ip=172.17.0.2, actor_id=438f9032f1ec94c824d6519d01000000, repr=<ray.serve._private.replica.ServeReplica:ray-llm-myapp-baichuan:VLLMDeployment:pai--myapp-baichuan2-13b-chat-2 object at 0x7f4e97177c70>)
(ServeController pid=67769) File "/home/ray/anaconda3/lib/python3.9/concurrent/futures/_base.py", line 439, in result
(ServeController pid=67769) return self.__get_result()
(ServeController pid=67769) File "/home/ray/anaconda3/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
(ServeController pid=67769) raise self._exception
(ServeController pid=67769) File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/replica.py", line 442, in initialize_and_get_metadata
(ServeController pid=67769) raise RuntimeError(traceback.format_exc()) from None
(ServeController pid=67769) RuntimeError: Traceback (most recent call last):
(ServeController pid=67769) File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/replica.py", line 430, in initialize_and_get_metadata
(ServeController pid=67769) await self._initialize_replica()
(ServeController pid=67769) File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/replica.py", line 190, in initialize_replica
(ServeController pid=67769) await sync_to_async(_callable.__init__)(*init_args, **init_kwargs)
(ServeController pid=67769) File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/server/vllm/vllm_deployment.py", line 37, in __init__
(ServeController pid=67769) await self.engine.start()
(ServeController pid=67769) File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/llm/vllm/vllm_engine.py", line 78, in start
(ServeController pid=67769) pg, runtime_env = await self.node_initializer.initialize_node(self.llm_app)
(ServeController pid=67769) File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/llm/vllm/vllm_node_initializer.py", line 52, in initialize_node
(ServeController pid=67769) await self._initialize_local_node(engine_config)
(ServeController pid=67769) File "/home/ray/anaconda3/lib/python3.9/concurrent/futures/thread.py", line 58, in run
(ServeController pid=67769) result = self.fn(*self.args, **self.kwargs)
(ServeController pid=67769) File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/llm/vllm/vllm_node_initializer.py", line 72, in _initialize_local_node
(ServeController pid=67769) _ = AutoTokenizer.from_pretrained(engine_config.actual_hf_model_id)
(ServeController pid=67769) File "/home/ray/anaconda3/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 748, in from_pretrained
(ServeController pid=67769) raise ValueError(
(ServeController pid=67769) ValueError: Tokenizer class BaichuanTokenizer does not exist or is not currently imported.
model yaml

enabled: true
deployment_config:
  autoscaling_config:
    min_replicas: 1
    initial_replicas: 1
    max_replicas: 2
    target_num_ongoing_requests_per_replica: 1.0
    metrics_interval_s: 10.0
    look_back_period_s: 30.0
    smoothing_factor: 1.0
    downscale_delay_s: 300.0
    upscale_delay_s: 90.0
  ray_actor_options:
    num_cpus: 4
engine_config:
  model_id: pai/myapp-baichuan2-13b-chat-2
  hf_model_id: /opt/models/myapp-baichuan2-13b-chat-2/
  engine_kwargs:
    trust_remote_code: true
  runtime_env:
    env_vars:
      YOUR_ENV_VAR: "your_value"
  generation:
    prompt_format:
      system: "{instruction}\n" # System message. Will default to default_system_message
      assistant: "### Response:\n{instruction}\n" # Past assistant message. Used in chat completions API.
      trailing_assistant: "### Response:\n" # New assistant message. After this point, model will generate tokens.
      user: "### Instruction:\n{instruction}\n" # User message.
      default_system_message: "Below is an instruction that describes a task. Write a response that appropriately completes the request." # Default system message.
      system_in_user: false # Whether the system prompt is inside the user prompt. If true, the user field should include '{system}'
      add_system_tags_even_if_message_is_empty: false # Whether to include the system tags even if the user message is empty.
      strip_whitespace: false # Whether to automatically strip whitespace from left and right of user supplied messages for chat completions
    stopping_sequences: ["### Response:", "### End"]
scaling_config:
  num_workers: 1
  num_gpus_per_worker: 1
  num_cpus_per_worker: 4