
Jetson Orin 8GB gets stuck and restarts #62

Open
rs1990 opened this issue Jan 6, 2025 · 7 comments
rs1990 commented Jan 6, 2025

Hello,

Every time I try to run the following, my system hangs and restarts. Could you help, please? I'm running JetPack 6.1:

jetson-containers run $(autotag dustynv/nano_llm)   python3 -m nano_llm.agents.video_query --api=mlc     --model Efficient-Large-Model/VILA-2.7b     --max-context-len 256     --max-new-tokens 32     --video-input /dev/video0     --video-output webrtc://@:8554/output
Namespace(packages=['nano_llm'], prefer=['local', 'registry', 'build'], disable=[''], user='dustynv', output='/tmp/autotag', quiet=False, verbose=False)
Namespace(packages=['dustynv/nano_llm'], prefer=['local', 'registry', 'build'], disable=[''], user='dustynv', output='/tmp/autotag', quiet=False, verbose=False)
-- L4T_VERSION=36.4.0  JETPACK_VERSION=6.1  CUDA_VERSION=12.6
-- Finding compatible container image for ['dustynv/nano_llm']
dustynv/nano_llm:r36.4.0
V4L2_DEVICES:  --device /dev/video0  --device /dev/video1 
### DISPLAY environmental variable is already set: ":1"
localuser:root being added to access control list
xauth:  file /tmp/.docker.xauth does not exist
+ docker run --runtime nvidia -it --rm --network host --shm-size=8g --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /var/run/dbus:/var/run/dbus --volume /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket --volume /var/run/docker.sock:/var/run/docker.sock --volume /home/mav/jetson-containers/data:/data -v /etc/localtime:/etc/localtime:ro -v /etc/timezone:/etc/timezone:ro --device /dev/snd -e PULSE_SERVER=unix:/run/user/1000/pulse/native -v /run/user/1000/pulse:/run/user/1000/pulse --device /dev/bus/usb -e DISPLAY=:1 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth --device /dev/video0 --device /dev/video1 --device /dev/i2c-0 --device /dev/i2c-1 --device /dev/i2c-2 --device /dev/i2c-4 --device /dev/i2c-5 --device /dev/i2c-7 --device /dev/i2c-9 -v /run/jtop.sock:/run/jtop.sock --name jetson_container_20250105_191129 dustynv/nano_llm:r36.4.0 python3 -m nano_llm.agents.video_query --api=mlc --model Efficient-Large-Model/VILA-2.7b --max-context-len 256 --max-new-tokens 32 --video-input /dev/video0 --video-output webrtc://@:8554/output
/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:124: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1142: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Fetching 10 files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 6887.20it/s]
Fetching 12 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 21950.13it/s]
19:12:07 | INFO | loading /data/models/huggingface/models--Efficient-Large-Model--VILA-2.7b/snapshots/2ed82105eefd5926cccb46af9e71b0ca77f12704 with MLC
19:12:22 | INFO | NumExpr defaulting to 6 threads.
19:12:24 | WARNING | AWQ not installed (requires JetPack 6 / L4T R36) - AWQ models will fail to initialize
['/data/models/mlc/dist/VILA-2.7b/ctx256/VILA-2.7b-q4f16_ft/mlc-chat-config.json', '/data/models/mlc/dist/VILA-2.7b/ctx256/VILA-2.7b-q4f16_ft/params/mlc-chat-config.json']
19:12:30 | INFO | running MLC quantization:

python3 -m mlc_llm.build --model /data/models/mlc/dist/models/VILA-2.7b --quantization q4f16_ft --target cuda --use-cuda-graph --use-flash-attn-mqa --sep-embed --max-seq-len 256 --artifact-path /data/models/mlc/dist/VILA-2.7b/ctx256 --use-safetensors 


Using path "/data/models/mlc/dist/models/VILA-2.7b" for model "VILA-2.7b"
Target configured: cuda -keys=cuda,gpu -arch=sm_87 -max_num_threads=1024 -max_shared_memory_per_block=49152 -max_threads_per_block=1024 -registers_per_block=65536 -thread_warp_size=32
Automatically using target for weight quantization: cuda -keys=cuda,gpu -arch=sm_87 -max_num_threads=1024 -max_shared_memory_per_block=49152 -max_threads_per_block=1024 -registers_per_block=65536 -thread_warp_size=32
Start computing and quantizing weights... This may take a while.
Get old param:   1%|█                                                             | 2/197 [00:04<05:44,  1.77s/tensors]
Set new param:   0%|▎                                                             | 1/327 [00:04<22:51,  4.21s/tensors]
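On an 8 GB Orin, the MLC weight-quantization step above is the usual point where memory runs out and the board locks up or reboots. A minimal sketch for watching memory headroom from a second terminal while the command runs (`tegrastats` is Jetson-specific; `free` is generic):

```shell
# Watch memory every 2 seconds while the quantization runs.
# If "available" trends toward zero and swap is exhausted, an
# OOM kill (or a hard hang) is the likely cause of the reboot.
free -h -s 2

# Jetson-specific alternative: RAM, swap, and GPU load in one line.
# tegrastats --interval 2000
```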

@JIA-HONG-CHU

No memory?

rs1990 commented Jan 6, 2025 via email

@kevindowling

Do you have an SSD with a swap file?
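For reference, a swap file on an SSD can be set up roughly like this; the path `/ssd/16GB.swap` and the 16 GB size are illustrative examples, not values from this thread:

```shell
# Allocate and enable a 16 GB swap file (example path and size).
sudo fallocate -l 16G /ssd/16GB.swap
sudo chmod 600 /ssd/16GB.swap
sudo mkswap /ssd/16GB.swap
sudo swapon /ssd/16GB.swap

# Make it persistent across reboots.
echo '/ssd/16GB.swap none swap sw 0 0' | sudo tee -a /etc/fstab
```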

rs1990 commented Jan 20, 2025 via email


kevindowling commented Jan 22, 2025


rs1990 commented Jan 23, 2025

I'm using a MicroSD card. I tried the same, but it fails. I did create a swap partition while flashing; however, it doesn't seem to be mounting.
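Whether the swap partition actually came up can be checked directly. A sketch: an empty `swapon` listing (or a zero "Swap:" row) means nothing is mounted, in which case the partition needs a matching `/etc/fstab` entry to activate at boot:

```shell
# List active swap devices; no output means no swap is mounted.
swapon --show

# The "Swap:" row should be non-zero if swap is active.
free -h

# Confirm the partition has an fstab entry so it mounts at boot.
grep swap /etc/fstab
```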

@kevindowling

Are you getting any errors?
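If the board reboots on its own, the kernel log is the place to confirm an out-of-memory kill rather than guessing. A sketch (the `journalctl` line needs systemd-journald with persistent logging enabled):

```shell
# Kernel messages from the current boot that mention OOM kills.
sudo dmesg | grep -iE "oom|out of memory"

# With persistent journald logging, the previous boot's kernel log:
# journalctl -k -b -1 | grep -iE "oom|out of memory"
```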
