FEAT: Support Phi-1 & Phi-1.5 #506

Draft
wants to merge 3 commits into base: main

Conversation

Bojun-Feng
Contributor

Resolve #462

@XprobeBot XprobeBot added this to the v0.5.2 milestone Oct 5, 2023
@Bojun-Feng Bojun-Feng changed the title FEAT: Phi-1.5 Support FEAT: Support Phi-1.5 Oct 5, 2023
@Bojun-Feng
Contributor Author

Hmm, it seems that Phi-1.5 cannot be added directly as a PyTorch model and run; some additional glue code might be needed.

I got the following error when trying to run the model with default settings:
ModuleNotFoundError: [address=127.0.0.1:56946, pid=42111] No module named 'transformers_modules.phi-1'

Full Log
2023-10-05 17:09:40,791 xinference   42089 INFO     Xinference successfully started. Endpoint: http://127.0.0.1:9997
2023-10-05 17:09:40,792 xinference.core.worker 42089 DEBUG    Worker actor initialized with main pool: 127.0.0.1:21605
2023-10-05 17:09:40,792 xinference.core.supervisor 42089 DEBUG    Enter add_worker, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, '127.0.0.1:21605'), kwargs: {}
2023-10-05 17:09:40,792 xinference.core.supervisor 42089 INFO     Worker 127.0.0.1:21605 has been added successfully
2023-10-05 17:09:40,792 xinference.core.supervisor 42089 DEBUG    Leave add_worker, elapsed time: 0 ms
2023-10-05 17:09:40,793 xinference.deploy.worker 42089 INFO     Xinference worker successfully started.
2023-10-05 17:09:41,139 xinference.core.supervisor 42089 DEBUG    Enter list_model_registrations, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM'), kwargs: {}
2023-10-05 17:09:41,139 xinference.core.supervisor 42089 DEBUG    Leave list_model_registrations, elapsed time: 0 ms
2023-10-05 17:09:41,207 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'baichuan'), kwargs: {}
2023-10-05 17:09:41,207 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,208 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'baichuan-2'), kwargs: {}
2023-10-05 17:09:41,208 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,209 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'baichuan-2-chat'), kwargs: {}
2023-10-05 17:09:41,209 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,210 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'baichuan-chat'), kwargs: {}
2023-10-05 17:09:41,210 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,211 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'chatglm'), kwargs: {}
2023-10-05 17:09:41,211 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,211 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'chatglm2'), kwargs: {}
2023-10-05 17:09:41,211 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,212 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'chatglm2-32k'), kwargs: {}
2023-10-05 17:09:41,212 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,213 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'code-llama'), kwargs: {}
2023-10-05 17:09:41,213 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,214 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'code-llama-instruct'), kwargs: {}
2023-10-05 17:09:41,214 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,218 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'code-llama-python'), kwargs: {}
2023-10-05 17:09:41,218 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,219 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'falcon'), kwargs: {}
2023-10-05 17:09:41,219 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,220 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'falcon-instruct'), kwargs: {}
2023-10-05 17:09:41,220 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,220 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'glaive-coder'), kwargs: {}
2023-10-05 17:09:41,220 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,221 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'gpt-2'), kwargs: {}
2023-10-05 17:09:41,221 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,221 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'internlm-20b'), kwargs: {}
2023-10-05 17:09:41,221 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,221 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'internlm-7b'), kwargs: {}
2023-10-05 17:09:41,221 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,227 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'internlm-chat-20b'), kwargs: {}
2023-10-05 17:09:41,228 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,228 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'internlm-chat-7b'), kwargs: {}
2023-10-05 17:09:41,228 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,229 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'llama-2'), kwargs: {}
2023-10-05 17:09:41,229 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,229 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'llama-2-chat'), kwargs: {}
2023-10-05 17:09:41,229 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,230 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'OpenBuddy'), kwargs: {}
2023-10-05 17:09:41,230 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,230 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'opt'), kwargs: {}
2023-10-05 17:09:41,230 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,230 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'orca'), kwargs: {}
2023-10-05 17:09:41,230 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,233 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'phi-1.5'), kwargs: {}
2023-10-05 17:09:41,233 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,234 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'qwen-chat'), kwargs: {}
2023-10-05 17:09:41,234 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,234 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'starchat-beta'), kwargs: {}
2023-10-05 17:09:41,234 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,235 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'starcoder'), kwargs: {}
2023-10-05 17:09:41,235 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,235 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'starcoderplus'), kwargs: {}
2023-10-05 17:09:41,235 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,235 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'tiny-llama'), kwargs: {}
2023-10-05 17:09:41,235 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,240 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'vicuna-v1.3'), kwargs: {}
2023-10-05 17:09:41,240 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,243 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'vicuna-v1.5'), kwargs: {}
2023-10-05 17:09:41,243 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,244 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'vicuna-v1.5-16k'), kwargs: {}
2023-10-05 17:09:41,244 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,245 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'wizardlm-v1.0'), kwargs: {}
2023-10-05 17:09:41,245 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:41,245 xinference.core.supervisor 42089 DEBUG    Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'LLM', 'wizardmath-v1.0'), kwargs: {}
2023-10-05 17:09:41,245 xinference.core.supervisor 42089 DEBUG    Leave get_model_registration, elapsed time: 0 ms
2023-10-05 17:09:48,019 xinference.core.supervisor 42089 DEBUG    Enter launch_builtin_model, model_uid: e5cf40e0-63cb-11ee-b038-c1055c423403, model_name: phi-1.5, model_size: 1, model_format: pytorch, quantization: none, replica: 1
2023-10-05 17:09:48,019 xinference.core.worker 42089 DEBUG    Enter get_model_count, args: (<xinference.core.worker.WorkerActor object at 0x1597319d0>,), kwargs: {}
2023-10-05 17:09:48,019 xinference.core.worker 42089 DEBUG    Leave get_model_count, elapsed time: 0 ms
2023-10-05 17:09:48,019 xinference.core.worker 42089 DEBUG    Enter launch_builtin_model, args: (<xinference.core.worker.WorkerActor object at 0x1597319d0>,), kwargs: {'model_uid': 'e5cf40e0-63cb-11ee-b038-c1055c423403-1-0', 'model_name': 'phi-1.5', 'model_size_in_billions': 1, 'model_format': 'pytorch', 'quantization': 'none', 'model_type': 'LLM', 'n_gpu': 'auto'}
2023-10-05 17:09:48,019 xinference.core.supervisor 42089 DEBUG    Enter is_local_deployment, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>,), kwargs: {}
2023-10-05 17:09:48,019 xinference.core.supervisor 42089 DEBUG    Leave is_local_deployment, elapsed time: 0 ms
2023-10-05 17:09:48,024 xinference.model.llm.llm_family 42089 INFO     Caching from Hugging Face: microsoft/phi-1_5
2023-10-05 17:09:48,043 urllib3.connectionpool 42089 DEBUG    Starting new HTTPS connection (1): huggingface.co:443
2023-10-05 17:09:48,243 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "GET /api/models/microsoft/phi-1_5/revision/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef HTTP/1.1" 200 2363
Fetching 14 files:   0%|                                                           | 0/14 [00:00<?, ?it/s]2023-10-05 17:09:48,271 urllib3.connectionpool 42089 DEBUG    Starting new HTTPS connection (1): huggingface.co:443
2023-10-05 17:09:48,272 urllib3.connectionpool 42089 DEBUG    Starting new HTTPS connection (1): huggingface.co:443
2023-10-05 17:09:48,273 urllib3.connectionpool 42089 DEBUG    Starting new HTTPS connection (1): huggingface.co:443
2023-10-05 17:09:48,275 urllib3.connectionpool 42089 DEBUG    Starting new HTTPS connection (1): huggingface.co:443
2023-10-05 17:09:48,276 urllib3.connectionpool 42089 DEBUG    Starting new HTTPS connection (1): huggingface.co:443
2023-10-05 17:09:48,278 urllib3.connectionpool 42089 DEBUG    Starting new HTTPS connection (1): huggingface.co:443
2023-10-05 17:09:48,280 urllib3.connectionpool 42089 DEBUG    Starting new HTTPS connection (1): huggingface.co:443
2023-10-05 17:09:48,280 urllib3.connectionpool 42089 DEBUG    Starting new HTTPS connection (1): huggingface.co:443
2023-10-05 17:09:48,399 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "HEAD /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/configuration_mixformer_sequential.py HTTP/1.1" 200 0
2023-10-05 17:09:48,399 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "HEAD /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/config.json HTTP/1.1" 200 0
2023-10-05 17:09:48,400 filelock     42089 DEBUG    Attempting to acquire lock 5798238032 on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/c2b5ff89977b9726d5c3e54c28e17aa36d83f268.lock
2023-10-05 17:09:48,400 filelock     42089 DEBUG    Attempting to acquire lock 5798347600 on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/8cc2d51cba96dbebf98898e731cca1d9c5977f71.lock
2023-10-05 17:09:48,400 filelock     42089 DEBUG    Lock 5798238032 acquired on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/c2b5ff89977b9726d5c3e54c28e17aa36d83f268.lock
2023-10-05 17:09:48,400 filelock     42089 DEBUG    Lock 5798347600 acquired on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/8cc2d51cba96dbebf98898e731cca1d9c5977f71.lock
2023-10-05 17:09:48,404 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "HEAD /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/merges.txt HTTP/1.1" 200 0
2023-10-05 17:09:48,404 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "HEAD /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/generation_config.json HTTP/1.1" 200 0
2023-10-05 17:09:48,404 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "HEAD /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/README.md HTTP/1.1" 200 0
2023-10-05 17:09:48,405 filelock     42089 DEBUG    Attempting to acquire lock 5795770704 on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/6f26581545cae8f8f375c5f0f90d956c194a20fd.lock
2023-10-05 17:09:48,405 filelock     42089 DEBUG    Lock 5795770704 acquired on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/6f26581545cae8f8f375c5f0f90d956c194a20fd.lock
2023-10-05 17:09:48,409 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "HEAD /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/Research%20License.docx HTTP/1.1" 200 0
2023-10-05 17:09:48,409 filelock     42089 DEBUG    Attempting to acquire lock 5788588304 on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/390505dd1ab349a07cf9764b9dc733d28ea28385.lock
2023-10-05 17:09:48,409 filelock     42089 DEBUG    Lock 5788588304 acquired on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/390505dd1ab349a07cf9764b9dc733d28ea28385.lock
2023-10-05 17:09:48,413 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "HEAD /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/.gitattributes HTTP/1.1" 200 0
Fetching 14 files:   7%|███▋                                               | 1/14 [00:00<00:01,  7.05it/s]2023-10-05 17:09:48,444 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "GET /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/config.json HTTP/1.1" 200 707
Downloading (…)0e7049ef/config.json: 100%|███████████████████████████████| 707/707 [00:00<00:00, 6.15MB/s]
2023-10-05 17:09:48,445 filelock     42089 DEBUG    Attempting to release lock 5798238032 on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/c2b5ff89977b9726d5c3e54c28e17aa36d83f268.lock
2023-10-05 17:09:48,445 filelock     42089 DEBUG    Lock 5798238032 released on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/c2b5ff89977b9726d5c3e54c28e17aa36d83f268.lock
2023-10-05 17:09:48,446 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "GET /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/configuration_mixformer_sequential.py HTTP/1.1" 200 1860
Downloading (…)former_sequential.py: 100%|███████████████████████████| 1.86k/1.86k [00:00<00:00, 28.2MB/s]
2023-10-05 17:09:48,447 filelock     42089 DEBUG    Attempting to release lock 5798347600 on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/8cc2d51cba96dbebf98898e731cca1d9c5977f71.lock
2023-10-05 17:09:48,447 filelock     42089 DEBUG    Lock 5798347600 released on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/8cc2d51cba96dbebf98898e731cca1d9c5977f71.lock
2023-10-05 17:09:48,451 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "HEAD /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/modeling_mixformer_sequential.py HTTP/1.1" 200 0
2023-10-05 17:09:48,451 filelock     42089 DEBUG    Attempting to acquire lock 5798359312 on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/7d4f7229ad6e5f85e7ff4fba20847d4052bb74d2.lock
2023-10-05 17:09:48,451 filelock     42089 DEBUG    Lock 5798359312 acquired on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/7d4f7229ad6e5f85e7ff4fba20847d4052bb74d2.lock
2023-10-05 17:09:48,452 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "GET /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/README.md HTTP/1.1" 200 8001
Downloading (…)5a0e7049ef/README.md: 100%|███████████████████████████| 8.00k/8.00k [00:00<00:00, 46.5MB/s]
2023-10-05 17:09:48,454 filelock     42089 DEBUG    Attempting to release lock 5795770704 on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/6f26581545cae8f8f375c5f0f90d956c194a20fd.lock
2023-10-05 17:09:48,454 filelock     42089 DEBUG    Lock 5795770704 released on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/6f26581545cae8f8f375c5f0f90d956c194a20fd.lock
2023-10-05 17:09:48,454 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "HEAD /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/pytorch_model.bin HTTP/1.1" 302 0
2023-10-05 17:09:48,457 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "GET /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/Research%20License.docx HTTP/1.1" 200 38892
                      2023-10-05 17:09:48,460 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "HEAD /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/special_tokens_map.json HTTP/1.1" 200 0
Downloading (…)earch%20License.docx: 100%|███████████████████████████| 38.9k/38.9k [00:00<00:00, 17.6MB/s]
2023-10-05 17:09:48,461 filelock     42089 DEBUG    Attempting to release lock 5788588304 on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/390505dd1ab349a07cf9764b9dc733d28ea28385.lock
2023-10-05 17:09:48,461 filelock     42089 DEBUG    Lock 5788588304 released on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/390505dd1ab349a07cf9764b9dc733d28ea28385.lock
2023-10-05 17:09:48,479 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "HEAD /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/added_tokens.json HTTP/1.1" 200 0
2023-10-05 17:09:48,489 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "HEAD /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/tokenizer.json HTTP/1.1" 200 0
2023-10-05 17:09:48,491 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "HEAD /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/tokenizer_config.json HTTP/1.1" 200 0
2023-10-05 17:09:48,496 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "HEAD /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/vocab.json HTTP/1.1" 200 0
2023-10-05 17:09:48,500 urllib3.connectionpool 42089 DEBUG    https://huggingface.co:443 "GET /microsoft/phi-1_5/resolve/b6a7e2fe15c21f5847279f23e280cc5a0e7049ef/modeling_mixformer_sequential.py HTTP/1.1" 200 28749
Downloading (…)former_sequential.py: 100%|████████████████████████████| 28.7k/28.7k [00:00<00:00, 142MB/s]
2023-10-05 17:09:48,501 filelock     42089 DEBUG    Attempting to release lock 5798359312 on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/7d4f7229ad6e5f85e7ff4fba20847d4052bb74d2.lock
2023-10-05 17:09:48,501 filelock     42089 DEBUG    Lock 5798359312 released on /Users/bojunfeng/.cache/huggingface/hub/models--microsoft--phi-1_5/blobs/7d4f7229ad6e5f85e7ff4fba20847d4052bb74d2.lock
Fetching 14 files: 100%|██████████████████████████████████████████████████| 14/14 [00:00<00:00, 61.05it/s]
2023-10-05 17:09:48,502 xinference.model.llm.core 42089 DEBUG    Launching e5cf40e0-63cb-11ee-b038-c1055c423403-1-0 with PytorchModel
2023-10-05 17:09:50,561 xinference.core.supervisor 42089 DEBUG    Enter terminate_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'e5cf40e0-63cb-11ee-b038-c1055c423403'), kwargs: {'suppress_exception': True}
2023-10-05 17:09:50,561 xinference.core.supervisor 42089 DEBUG    Leave terminate_model, elapsed time: 0 ms
2023-10-05 17:09:50,561 xinference.core.restful_api 42089 ERROR    [address=127.0.0.1:56946, pid=42111] No module named 'transformers_modules.phi-1'
Traceback (most recent call last):
  File "/Users/bojunfeng/cs/inference/xinference/core/restful_api.py", line 404, in launch_model
    model_uid = await self._supervisor_ref.launch_builtin_model(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 288, in __pyx_actor_method_wrapper
  File "xoscar/core.pyx", line 422, in _handle_actor_result
  File "xoscar/core.pyx", line 465, in _run_actor_async_generator
  File "xoscar/core.pyx", line 466, in xoscar.core._BaseActor._run_actor_async_generator
  File "xoscar/core.pyx", line 471, in xoscar.core._BaseActor._run_actor_async_generator
  File "/Users/bojunfeng/cs/inference/xinference/core/supervisor.py", line 227, in launch_builtin_model
    yield _launch_one_model(rep_model_uid)
  File "xoscar/core.pyx", line 476, in xoscar.core._BaseActor._run_actor_async_generator
  File "xoscar/core.pyx", line 422, in _handle_actor_result
  File "xoscar/core.pyx", line 465, in _run_actor_async_generator
  File "xoscar/core.pyx", line 466, in xoscar.core._BaseActor._run_actor_async_generator
  File "xoscar/core.pyx", line 471, in xoscar.core._BaseActor._run_actor_async_generator
  File "/Users/bojunfeng/cs/inference/xinference/core/supervisor.py", line 206, in _launch_one_model
    yield worker_ref.launch_builtin_model(
  File "xoscar/core.pyx", line 476, in xoscar.core._BaseActor._run_actor_async_generator
  File "xoscar/core.pyx", line 396, in _handle_actor_result
  File "xoscar/core.pyx", line 284, in __pyx_actor_method_wrapper
  File "xoscar/core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper
  File "/Users/bojunfeng/cs/inference/xinference/core/utils.py", line 27, in wrapped
    ret = await func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/bojunfeng/cs/inference/xinference/core/worker.py", line 187, in launch_builtin_model
    await model_ref.load()
  File "/Users/bojunfeng/anaconda3/lib/python3.11/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/bojunfeng/anaconda3/lib/python3.11/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/Users/bojunfeng/anaconda3/lib/python3.11/site-packages/xoscar/backends/pool.py", line 657, in send
    result = await self._run_coro(message.message_id, coro)
    ^^^^^^^^^^^^^^^^^
  File "/Users/bojunfeng/anaconda3/lib/python3.11/site-packages/xoscar/backends/pool.py", line 368, in _run_coro
    return await coro
  File "/Users/bojunfeng/anaconda3/lib/python3.11/site-packages/xoscar/api.py", line 306, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 558, in __on_receive__
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
  File "xoscar/core.pyx", line 524, in xoscar.core._BaseActor.__on_receive__
  File "/Users/bojunfeng/cs/inference/xinference/core/model.py", line 117, in load
    self._model.load()
    ^^^^^^^^^^^^^^^^^
  File "/Users/bojunfeng/cs/inference/xinference/model/llm/pytorch/core.py", line 205, in load
    self._model, self._tokenizer = self._load_model(kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/Users/bojunfeng/cs/inference/xinference/model/llm/pytorch/core.py", line 124, in _load_model
    model = AutoModelForCausalLM.from_pretrained(
    ^^^^^^^^^^^^^^^^^
  File "/Users/bojunfeng/anaconda3/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 482, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
    ^^^^^^^^^^^^^^^^^
  File "/Users/bojunfeng/anaconda3/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1016, in from_pretrained
    config_class = get_class_from_dynamic_module(class_ref, pretrained_model_name_or_path, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/Users/bojunfeng/anaconda3/lib/python3.11/site-packages/transformers/dynamic_module_utils.py", line 497, in get_class_from_dynamic_module
    return get_class_in_module(class_name, final_module.replace(".py", ""))
      ^^^^^^^^^^^^^^^^^
  File "/Users/bojunfeng/anaconda3/lib/python3.11/site-packages/transformers/dynamic_module_utils.py", line 199, in get_class_in_module
    module = importlib.import_module(module_path)
      ^^^^^^^^^^^^^^^^^
  File "/Users/bojunfeng/anaconda3/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
      ^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1126, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1126, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1140, in _find_and_load_unlocked
ModuleNotFoundError: [address=127.0.0.1:56946, pid=42111] No module named 'transformers_modules.phi-1'
2023-10-05 17:09:50,572 urllib3.connectionpool 42089 DEBUG    Starting new HTTP connection (1): 127.0.0.1:9997
2023-10-05 17:09:50,573 xinference.core.supervisor 42089 DEBUG    Enter describe_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x1596a79b0>, 'e5cf40e0-63cb-11ee-b038-c1055c423403'), kwargs: {}
2023-10-05 17:09:50,573 xinference.core.restful_api 42089 ERROR    Model not found in the model list, uid: e5cf40e0-63cb-11ee-b038-c1055c423403
Traceback (most recent call last):
  File "/Users/bojunfeng/cs/inference/xinference/core/restful_api.py", line 361, in describe_model
    return await self._supervisor_ref.describe_model(model_uid)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 284, in __pyx_actor_method_wrapper
  File "xoscar/core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper
  File "/Users/bojunfeng/cs/inference/xinference/core/utils.py", line 27, in wrapped
    ret = await func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/bojunfeng/cs/inference/xinference/core/supervisor.py", line 300, in describe_model
    raise ValueError(f"Model not found in the model list, uid: {model_uid}")
ValueError: Model not found in the model list, uid: e5cf40e0-63cb-11ee-b038-c1055c423403
2023-10-05 17:09:50,573 urllib3.connectionpool 42089 DEBUG    http://127.0.0.1:9997 "GET /v1/models/e5cf40e0-63cb-11ee-b038-c1055c423403 HTTP/1.1" 400 89
2023-10-05 17:09:50,573 xinference.core.restful_api 42089 ERROR    Failed to get the model description, detail: Model not found in the model list, uid: e5cf40e0-63cb-11ee-b038-c1055c423403
Traceback (most recent call last):
  File "/Users/bojunfeng/cs/inference/xinference/core/restful_api.py", line 453, in build_interface
    gr.mount_gradio_app(self._app, interface.build(), f"/{model_uid}")
                                   ^^^^^^^^^^^^^^^^^
  File "/Users/bojunfeng/cs/inference/xinference/core/chat_interface.py", line 36, in build
    model = self.client.get_model(self.model_uid)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/bojunfeng/cs/inference/xinference/client.py", line 883, in get_model
    raise RuntimeError(
RuntimeError: Failed to get the model description, detail: Model not found in the model list, uid: e5cf40e0-63cb-11ee-b038-c1055c423403

@UranusSeven
Contributor

Yeah. I've also tried to run this model with PytorchModel, and instead of the error, I got garbled text. You may want to check the version of transformers and write a separate generate method, similar to the one used for falcon.

@Bojun-Feng
Contributor Author

Bojun-Feng commented Oct 7, 2023

I found the solution. We were encountering two different issues.

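Regarding the ModuleNotFoundError: I named the model Phi-1.5 instead of Phi-1_5. At one point (link), Hugging Face Transformers converts the path separators in the directory string into periods to build a module name, so the period in "phi-1.5" is parsed as one more package separator. The import then looks for 'transformers_modules.phi-1', which does not exist.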
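
To make the naming problem concrete, here is a small sketch of how a period in a directory name leaks into a dotted module path. It is a simplified illustration, not the Transformers code; `to_dotted_module` and `sanitize_model_dir` are hypothetical helpers, and the file name is taken from the log above.

```python
import os


def to_dotted_module(*parts: str) -> str:
    """Hypothetical helper: join path parts, drop '.py', swap separators for dots."""
    path = os.path.join(*parts)
    if path.endswith(".py"):
        path = path[: -len(".py")]
    return path.replace(os.path.sep, ".")


def sanitize_model_dir(name: str) -> str:
    """Hypothetical workaround: keep periods out of the directory name."""
    return name.replace(".", "_")


config_file = "configuration_mixformer_sequential.py"

# A directory called "phi-1.5" contributes an extra dot to the module path.
bad = to_dotted_module("transformers_modules", "phi-1.5", config_file)

# Naming it "phi-1_5" keeps the directory as a single package component.
good = to_dotted_module("transformers_modules", sanitize_model_dir("phi-1.5"), config_file)

# The import system resolves one dotted component at a time, so the first
# package it looks for under "transformers_modules" is "phi-1".
print(bad.split("."))
print(good.split("."))
```

With the period in place, the second component is `phi-1`, which matches the `No module named 'transformers_modules.phi-1'` message in the log.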

Regarding the garbled text, the recent Transformers 4.34.0 release (link) fixes bugs in the tokenizers and addresses the issue. Given the same function name and docstring, the two versions behave drastically differently:

Example Generations

Transformers version 4.32.1:

def print_prime(n):
   """
   Print all primes between 1 and n
   """
 self. While others like that she discovered that he wanted to the other important

Transformers version 4.34.0:

def print_prime(n):
   """
   Print all primes between 1 and n
   """
   # Initialize an array of all numbers
   all_numbers = [i for i in range(1, n + 1)]
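
One way to act on the version difference above would be to check the installed Transformers version before launching the model. This is a minimal sketch, assuming 4.34.0 is the first release with the tokenizer fix, as the comparison above suggests. `parse_version` and `tokenizer_fix_available` are hypothetical helpers and are not part of xinference or Transformers.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Parse the leading numeric 'X.Y.Z' part of a version string."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            # Stop at the first non-numeric component, e.g. "dev0".
            break
        parts.append(int(digits))
    return tuple(parts)


# Assumption: taken from the two generations compared in this comment.
MIN_TRANSFORMERS = "4.34.0"


def tokenizer_fix_available(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """Return True if the installed version is at least the assumed minimum."""
    return parse_version(installed) >= parse_version(minimum)


print(tokenizer_fix_available("4.32.1"))  # False
print(tokenizer_fix_available("4.34.0"))  # True
```

In practice the installed version string could come from `transformers.__version__`.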

@Bojun-Feng Bojun-Feng changed the title FEAT: Support Phi-1.5 FEAT: Support Phi-1 & Phi-1.5 Oct 7, 2023
@Bojun-Feng
Contributor Author

Actually, after some more testing with prompts from the Phi-1.5 paper, it seems that the tokenizer problem is still not fully solved. There is still an open PR in the transformers repo working on it. If we keep using the same library, we might have to wait for another update.

@XprobeBot XprobeBot modified the milestones: v0.5.2, v0.6.0 Oct 16, 2023
@XprobeBot XprobeBot modified the milestones: v0.6.0, v0.6.1, v0.6.2, v0.6.3 Nov 3, 2023
@XprobeBot XprobeBot modified the milestones: v0.6.3, v0.6.4, v0.6.5 Nov 21, 2023
@XprobeBot XprobeBot modified the milestones: v0.6.5, v0.6.6, v0.7.0 Dec 1, 2023
@XprobeBot XprobeBot modified the milestones: v0.7.1, v0.7.2 Dec 12, 2023
@XprobeBot XprobeBot modified the milestones: v0.10.0, v0.10.1 Mar 29, 2024
@XprobeBot XprobeBot modified the milestones: v0.10.1, v0.10.2 Apr 12, 2024
@XprobeBot XprobeBot modified the milestones: v0.10.2, v0.10.3, v0.11.0 Apr 19, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.0, v0.11.1, v0.11.2 May 11, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.2, v0.11.3 May 24, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.3, v0.11.4, v0.12.0, v0.12.1 May 31, 2024
@XprobeBot XprobeBot modified the milestones: v0.12.1, v0.12.2 Jun 14, 2024
@XprobeBot XprobeBot modified the milestones: v0.12.2, v0.12.4, v0.13.0, v0.13.1 Jun 28, 2024
@XprobeBot XprobeBot modified the milestones: v0.13.1, v0.13.2 Jul 12, 2024
@XprobeBot XprobeBot modified the milestones: v0.13.2, v0.13.4 Jul 26, 2024
@XprobeBot XprobeBot modified the milestones: v0.14, v0.15 Sep 3, 2024
Successfully merging this pull request may close these issues.

FEAT: support phi-1.5
3 participants