# [Support]: embeddings GPU acceleration error 0.15 dev builds #14355

---
### Describe the problem you are having

I fully understand that dev builds are a WIP and nothing is guaranteed, but I've been wanting to test GPU acceleration for embeddings and keep running into the error below. I'm curious whether anyone else has hit this and what to try. I tried googling the most relevant part of the error but couldn't find anything obvious about its source. I pruned my config, but I can sanitize and post the full version if it's needed.

### Version

3879fde-tensorrt

### What browser(s) are you using?

No response

### Frigate config file

```yaml
detectors:
  coral_pci:
    type: edgetpu
    device: pci

ffmpeg:
  hwaccel_args: preset-nvidia-h265

semantic_search:
  enabled: true
  reindex: true
  model_size: large

genai:
  enabled: true
  provider: gemini
  api_key: '{FRIGATE_GEMINI_API_KEY}'
  model: gemini-1.5-flash
```

### Relevant Frigate log output

```
2024-10-14 19:38:10.404454306 Token indices sequence length is longer than the specified maximum sequence length for this model (8255 > 8192). Running this sequence through the model will result in indexing errors
2024-10-14 19:38:27.498050394 2024-10-14 19:38:27.497698130 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Einsum node. Name:'/transformer/encoder/layers.0/mixer/inner_attn/Einsum' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104670758400
2024-10-14 19:38:27.498136455
2024-10-14 19:38:29.114510272 Process embeddings_manager:
2024-10-14 19:38:29.114532092 Traceback (most recent call last):
2024-10-14 19:38:29.114536732   File "/usr/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
2024-10-14 19:38:29.114540252     self.run()
2024-10-14 19:38:29.114544322   File "/opt/frigate/frigate/util/process.py", line 41, in run_wrapper
2024-10-14 19:38:29.114547892     return run(*args, **kwargs)
2024-10-14 19:38:29.114551789   File "/usr/lib/python3.9/multiprocessing/process.py", line 108, in run
2024-10-14 19:38:29.114555266     self._target(*self._args, **self._kwargs)
2024-10-14 19:38:29.114569629   File "/opt/frigate/frigate/embeddings/__init__.py", line 59, in manage_embeddings
2024-10-14 19:38:29.114575853     maintainer = EmbeddingMaintainer(
2024-10-14 19:38:29.114581453   File "/opt/frigate/frigate/embeddings/maintainer.py", line 50, in __init__
2024-10-14 19:38:29.114661998     self.embeddings.reindex()
2024-10-14 19:38:29.114669078   File "/opt/frigate/frigate/embeddings/embeddings.py", line 262, in reindex
2024-10-14 19:38:29.114674961     self.batch_upsert_description(batch_descs)
2024-10-14 19:38:29.114681471   File "/opt/frigate/frigate/embeddings/embeddings.py", line 177, in batch_upsert_description
2024-10-14 19:38:29.114761370     embeddings = self.text_embedding(list(event_descriptions.values()))
2024-10-14 19:38:29.114768083   File "/opt/frigate/frigate/embeddings/functions/onnx.py", line 199, in __call__
2024-10-14 19:38:29.114773753     embeddings = self.runner.run(onnx_inputs)[0]
2024-10-14 19:38:29.114779160   File "/opt/frigate/frigate/util/model.py", line 116, in run
2024-10-14 19:38:29.114783157     return self.ort.run(None, input)
2024-10-14 19:38:29.114789260   File "/usr/local/lib/python3.9/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 220, in run
2024-10-14 19:38:29.114793790     return self._sess.run(output_names, input_feed, run_options)
2024-10-14 19:38:29.114887075 onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Einsum node. Name:'/transformer/encoder/layers.0/mixer/inner_attn/Einsum' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104670758400
```

### Relevant go2rtc log output

n/a

### FFprobe output from your camera

n/a

### Frigate stats

n/a

### Install method

Proxmox via Docker

### docker-compose file or Docker CLI command

```yaml
services:
  frigate:
    container_name: frigate
    privileged: true
    restart: unless-stopped
    image: ghcr.io/blakeblackshear/frigate:3879fde-tensorrt
    runtime: nvidia
    shm_size: "500mb"
    devices:
      - /dev/apex_0:/dev/apex_0
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    ports:
      - 5000:5000
      - 8554:8554
      - 8555:8555
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /var/docks/frigate/config:/config
      - /nvr:/media/frigate
      - type: tmpfs
        target: /tmp/cache
        tmpfs:
          size: 1000000000
```

### Object Detector

Coral

### Network connection

Wired

### Camera make and model

n/a

### Screenshots of the Frigate UI's System metrics pages

No response

### Any other information that may be helpful

No response
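For intuition about the 104 GB figure in the log: self-attention memory grows with the square of the sequence length, so a token sequence of 8255 (over the 8192 limit also flagged in the log) makes the Einsum's score tensor enormous. Below is a back-of-envelope sketch; the sequence length comes from the log, but the combined batch-times-heads factor of 384 is an assumption chosen to match the reported buffer size, not a value taken from the model's config.

```python
# Rough sanity check: does an fp32 attention-score tensor of shape
# (batch * heads, seq, seq) account for the failed allocation?
# NOTE: batch_times_heads = 384 is a hypothetical value, not confirmed.
seq_len = 8255            # token count reported in the log
batch_times_heads = 384   # assumed; chosen to match the logged size
fp32_bytes = 4

buffer_bytes = batch_times_heads * seq_len * seq_len * fp32_bytes
print(buffer_bytes)  # 104670758400 -- the exact size in the error
```

Whatever the real batch and head counts are, the quadratic `seq_len * seq_len` term is why an over-long sequence blows past GPU memory rather than failing gracefully.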
Replies: 3 comments 5 replies

---
What if you don't reindex? It looks like your descriptions might be too large for the current batch size. I'll have to think about this.
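A minimal sketch of what smaller batching could look like — `batched` here is a hypothetical helper for illustration, not Frigate's actual code:

```python
def batched(items, batch_size):
    """Yield successive fixed-size slices of a list."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# Hypothetical usage: embed a few descriptions at a time instead of
# feeding the whole reindex backlog to the model in one call.
descriptions = [f"event {n} description" for n in range(10)]
chunks = list(batched(descriptions, 4))  # batches of 4, 4 and 2
```

Smaller batches cap the peak attention buffer per inference call, trading a little throughput for bounded GPU memory.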

---
This is because we are batching during reindexing, and the maximum input length of the Jina text model is 8192 tokens. It shouldn't be difficult to resolve this.
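In the simplest form, respecting that limit means clamping any tokenized input before it reaches the model. This is only a sketch of the idea — the helper name is hypothetical, and the actual fix lives in the PR linked at the end of the thread:

```python
MAX_TOKENS = 8192  # Jina text model limit mentioned above

def truncate_token_ids(token_ids, max_tokens=MAX_TOKENS):
    """Clamp a token-id sequence to the model's maximum length.

    Hypothetical helper: truncation loses the tail of very long
    descriptions, but keeps the attention buffer bounded.
    """
    return token_ids[:max_tokens]
```

The log's warning ("8255 > 8192") shows exactly the case this guards against: a batch whose tokenized length slightly exceeds the model's window.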

---
Did this make it into beta2? I'm still having it crash even though I have reindex set to false. Is there an easy way to just clear out the descriptions?
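For anyone wanting to clear descriptions by hand, here is a heavily hedged sketch of editing the database directly. The database path, table name, and the assumption that descriptions live under a `description` key in a JSON `data` column are all guesses about the 0.15 schema, not confirmed details — stop Frigate and back up the file before trying anything like this:

```python
# Hypothetical sketch: blank out stored event descriptions in Frigate's
# sqlite database. DB_PATH, the "event" table, and the "data" JSON column
# are ASSUMPTIONS about the schema -- back up the file first.
import json
import sqlite3

DB_PATH = "/config/frigate.db"  # assumed default location

def clear_descriptions(db_path=DB_PATH):
    con = sqlite3.connect(db_path)
    try:
        for event_id, raw in con.execute("SELECT id, data FROM event").fetchall():
            data = json.loads(raw) if raw else {}
            if data.pop("description", None) is not None:
                con.execute(
                    "UPDATE event SET data = ? WHERE id = ?",
                    (json.dumps(data), event_id),
                )
        con.commit()
    finally:
        con.close()
```

If the schema differs, the same idea applies: find where descriptions are stored and null them out so the next reindex has nothing oversized to embed.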
Fixed in #14364