This repository has been archived by the owner on May 28, 2024. It is now read-only.
I built the image and ran the container:
docker run -d -it --gpus all --shm-size 1g -p 8000:8000 -e HF_HOME=~/data -v $cache_dir:/home/ray/data anyscale/ray-llm:latest
Inside the container, I then ran:
serve run ~/serve_configs/amazon--LightGPT.yaml
It fails with the following error:
2024-03-25 02:40:33,460 INFO scripts.py:411 -- Running config file: '/home/ray/serve_configs/amazon--LightGPT.yaml'.
2024-03-25 02:40:35,709 WARNING services.py:1996 -- WARNING: The object store is using /tmp instead of /dev/shm because /dev/shm has only 1073741824 bytes available. This will harm performance! You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you can increase /dev/shm size by passing '--shm-size=10.24gb' to 'docker run' (or add it to the run_options list in a Ray cluster config). Make sure to set this to more than 30% of available RAM.
2024-03-25 02:40:36,866 INFO worker.py:1715 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265
(ServeController pid=22583) WARNING 2024-03-25 02:40:39,253 controller 22583 logging_utils.py:247 - 'RAY_SERVE_ENABLE_JSON_LOGGING' is deprecated, please use 'LoggingConfig' to enable json format.
(ProxyActor pid=22664) WARNING 2024-03-25 02:40:40,795 proxy 172.17.0.2 logging_utils.py:247 - 'RAY_SERVE_ENABLE_JSON_LOGGING' is deprecated, please use 'LoggingConfig' to enable json format.
(ProxyActor pid=22664) INFO 2024-03-25 02:40:40,795 proxy 172.17.0.2 proxy.py:1141 - Proxy actor 28469263dc5907e200fa9fe201000000 starting on node 87617dc5750c5f36331a1ea5935a849259fee4d4a42262c695d9e0ca.
(ProxyActor pid=22664) INFO 2024-03-25 02:40:40,801 proxy 172.17.0.2 proxy.py:1346 - Starting HTTP server on node: 87617dc5750c5f36331a1ea5935a849259fee4d4a42262c695d9e0ca listening on port 8000
(ProxyActor pid=22664) INFO: Started server process [22664]
(ProxyActor pid=22664) WARNING 2024-03-25 02:40:40,824 proxy 172.17.0.2 logging_utils.py:247 - 'RAY_SERVE_ENABLE_JSON_LOGGING' is deprecated, please use 'LoggingConfig' to enable json format.
2024-03-25 02:40:40,848 SUCC scripts.py:480 -- Submitted deploy config successfully.
(ServeController pid=22583) INFO 2024-03-25 02:40:40,841 controller 22583 application_state.py:414 - Building application 'ray-llm'.
(build_serve_application pid=21125) There was a problem when trying to write in your cache folder (/home/adk/data/hub). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.
(ServeController pid=22583) WARNING 2024-03-25 02:40:48,038 controller 22583 application_state.py:742 - Deploying app 'ray-llm' failed with exception:
(ServeController pid=22583) Traceback (most recent call last):
(ServeController pid=22583) File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/application_state.py", line 994, in build_serve_application
(ServeController pid=22583) app = call_app_builder_with_args_if_necessary(import_attr(import_path), args)
(ServeController pid=22583) File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/_private/utils.py", line 1182, in import_attr
(ServeController pid=22583) module = importlib.import_module(module_name)
(ServeController pid=22583) File "/home/ray/anaconda3/lib/python3.9/importlib/__init__.py", line 127, in import_module
(ServeController pid=22583) return _bootstrap._gcd_import(name[level:], package, level)
(ServeController pid=22583) File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
(ServeController pid=22583) File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
(ServeController pid=22583) File "<frozen importlib._bootstrap>", line 972, in _find_and_load_unlocked
(ServeController pid=22583) File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
(ServeController pid=22583) File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
(ServeController pid=22583) File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
(ServeController pid=22583) File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
(ServeController pid=22583) File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
(ServeController pid=22583) File "<frozen importlib._bootstrap_external>", line 850, in exec_module
(ServeController pid=22583) File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
(ServeController pid=22583) File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/__init__.py", line 1, in <module>
(ServeController pid=22583) from rayllm.backend.observability.tracing import setup_tracing
(ServeController pid=22583) File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/__init__.py", line 1, in <module>
(ServeController pid=22583) from rayllm.backend.server.run import router_application
(ServeController pid=22583) File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/server/run.py", line 10, in <module>
(ServeController pid=22583) from rayllm.backend.llm.vllm.vllm_engine import VLLMEngine
(ServeController pid=22583) File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/llm/vllm/vllm_engine.py", line 15, in <module>
(ServeController pid=22583) from rayllm.backend.llm.vllm.vllm_compatibility import AviaryAsyncLLMEngine
(ServeController pid=22583) File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/llm/vllm/vllm_compatibility.py", line 31, in <module>
(ServeController pid=22583) init_hf_modules()
(ServeController pid=22583) File "/home/ray/anaconda3/lib/python3.9/site-packages/transformers/dynamic_module_utils.py", line 52, in init_hf_modules
(ServeController pid=22583) os.makedirs(HF_MODULES_CACHE, exist_ok=True)
(ServeController pid=22583) File "/home/ray/anaconda3/lib/python3.9/os.py", line 215, in makedirs
(ServeController pid=22583) makedirs(head, exist_ok=exist_ok)
(ServeController pid=22583) File "/home/ray/anaconda3/lib/python3.9/os.py", line 215, in makedirs
(ServeController pid=22583) makedirs(head, exist_ok=exist_ok)
(ServeController pid=22583) File "/home/ray/anaconda3/lib/python3.9/os.py", line 225, in makedirs
(ServeController pid=22583) mkdir(name, mode)
(ServeController pid=22583) PermissionError: [Errno 13] Permission denied: '/home/adk'
(ServeController pid=22583)
(build_serve_application pid=21125) [8b5cff370a16:21125] [[48252,1],0] ORTE_ERROR_LOG: Unreachable in file runtime/ompi_mpi_finalize.c at line 262
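The path in the log (`/home/adk/data/hub`) looks like a host home directory rather than anything under `/home/ray`, which suggests the `~` in `-e HF_HOME=~/data` was expanded by the host shell before `docker` received it. The sketch below only illustrates that behaviour for bash; the variable names `expanded`, `container_cache`, and `fixed_arg` are hypothetical, and the suggested value is an untested assumption based on the `-v $cache_dir:/home/ray/data` mount in the original command.

```shell
# bash (outside POSIX mode) expands "~" after "=" in arguments that look like
# assignments, so docker receives the host user's home directory.
expanded=$(bash -c 'printf "%s" HF_HOME=~/data')
echo "docker receives: -e $expanded"

# Assumption: point HF_HOME at the container-side mount target instead,
# written as an absolute path so the host shell has nothing to expand.
container_cache=/home/ray/data
fixed_arg="HF_HOME=$container_cache"
echo "suggested:       -e $fixed_arg"
```

If this is the cause, passing `-e HF_HOME=/home/ray/data` (or quoting the value so the tilde is not expanded on the host) may be worth trying; whether the mounted directory is writable by the `ray` user is a separate question.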