# [Support]: embeddings GPU acceleration error 0.15 dev builds #14355

---
### Describe the problem you are having

I fully understand that dev builds are a WIP and nothing is guaranteed, but I've been wanting to test GPU acceleration for embeddings and keep running into the error below. I'm curious whether anyone else has hit this and what to try. I tried googling the most relevant part of the error but couldn't find anything obvious about its source. I pruned my config, but I can sanitize and post the full version if it's needed.

### Version

3879fde-tensorrt

### What browser(s) are you using?

No response

### Frigate config file

```yaml
detectors:
  coral_pci:
    type: edgetpu
    device: pci

ffmpeg:
  hwaccel_args: preset-nvidia-h265

semantic_search:
  enabled: true
  reindex: true
  model_size: large

genai:
  enabled: true
  provider: gemini
  api_key: '{FRIGATE_GEMINI_API_KEY}'
  model: gemini-1.5-flash
```

### Relevant Frigate log output

```
2024-10-14 19:38:10.404454306 Token indices sequence length is longer than the specified maximum sequence length for this model (8255 > 8192). Running this sequence through the model will result in indexing errors
2024-10-14 19:38:27.498050394 2024-10-14 19:38:27.497698130 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Einsum node. Name:'/transformer/encoder/layers.0/mixer/inner_attn/Einsum' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104670758400
2024-10-14 19:38:27.498136455
2024-10-14 19:38:29.114510272 Process embeddings_manager:
2024-10-14 19:38:29.114532092 Traceback (most recent call last):
2024-10-14 19:38:29.114536732   File "/usr/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
2024-10-14 19:38:29.114540252     self.run()
2024-10-14 19:38:29.114544322   File "/opt/frigate/frigate/util/process.py", line 41, in run_wrapper
2024-10-14 19:38:29.114547892     return run(*args, **kwargs)
2024-10-14 19:38:29.114551789   File "/usr/lib/python3.9/multiprocessing/process.py", line 108, in run
2024-10-14 19:38:29.114555266     self._target(*self._args, **self._kwargs)
2024-10-14 19:38:29.114569629   File "/opt/frigate/frigate/embeddings/__init__.py", line 59, in manage_embeddings
2024-10-14 19:38:29.114575853     maintainer = EmbeddingMaintainer(
2024-10-14 19:38:29.114581453   File "/opt/frigate/frigate/embeddings/maintainer.py", line 50, in __init__
2024-10-14 19:38:29.114661998     self.embeddings.reindex()
2024-10-14 19:38:29.114669078   File "/opt/frigate/frigate/embeddings/embeddings.py", line 262, in reindex
2024-10-14 19:38:29.114674961     self.batch_upsert_description(batch_descs)
2024-10-14 19:38:29.114681471   File "/opt/frigate/frigate/embeddings/embeddings.py", line 177, in batch_upsert_description
2024-10-14 19:38:29.114761370     embeddings = self.text_embedding(list(event_descriptions.values()))
2024-10-14 19:38:29.114768083   File "/opt/frigate/frigate/embeddings/functions/onnx.py", line 199, in __call__
2024-10-14 19:38:29.114773753     embeddings = self.runner.run(onnx_inputs)[0]
2024-10-14 19:38:29.114779160   File "/opt/frigate/frigate/util/model.py", line 116, in run
2024-10-14 19:38:29.114783157     return self.ort.run(None, input)
2024-10-14 19:38:29.114789260   File "/usr/local/lib/python3.9/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 220, in run
2024-10-14 19:38:29.114793790     return self._sess.run(output_names, input_feed, run_options)
2024-10-14 19:38:29.114887075 onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Einsum node. Name:'/transformer/encoder/layers.0/mixer/inner_attn/Einsum' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104670758400
```

### Relevant go2rtc log output

n/a

### FFprobe output from your camera

n/a

### Frigate stats

n/a

### Install method

Proxmox via Docker

### docker-compose file or Docker CLI command

```yaml
services:
  frigate:
    container_name: frigate
    privileged: true
    restart: unless-stopped
    image: ghcr.io/blakeblackshear/frigate:3879fde-tensorrt
    runtime: nvidia
    shm_size: "500mb"
    devices:
      - /dev/apex_0:/dev/apex_0
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    ports:
      - 5000:5000
      - 8554:8554
      - 8555:8555
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /var/docks/frigate/config:/config
      - /nvr:/media/frigate
      - type: tmpfs
        target: /tmp/cache
        tmpfs:
          size: 1000000000
```

### Object Detector

Coral

### Network connection

Wired

### Camera make and model

n/a

### Screenshots of the Frigate UI's System metrics pages

No response

### Any other information that may be helpful

No response
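For intuition about the 104 GB figure in the log: self-attention memory grows with the square of the sequence length, so a token sequence of 8255 (over the 8192 limit also flagged in the log) makes the Einsum's score tensor enormous. Below is a back-of-envelope sketch; the sequence length comes from the log, but the combined batch-times-heads factor of 384 is an assumption chosen to match the reported buffer size, not a value taken from the model's config.

```python
# Rough sanity check: does an fp32 attention-score tensor of shape
# (batch * heads, seq, seq) account for the failed allocation?
# NOTE: batch_times_heads = 384 is a hypothetical value, not confirmed.
seq_len = 8255            # token count reported in the log
batch_times_heads = 384   # assumed; chosen to match the logged size
fp32_bytes = 4

buffer_bytes = batch_times_heads * seq_len * seq_len * fp32_bytes
print(buffer_bytes)  # 104670758400 -- the exact size in the error
```

Whatever the real batch and head counts are, the quadratic `seq_len * seq_len` term is why an over-long sequence blows past GPU memory rather than failing gracefully.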
Replies: 3 comments 5 replies

---
What if you don't reindex? It looks like your descriptions might be too large for the current batch size. I'll have to think about this.
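A minimal sketch of what smaller batching could look like — `batched` here is a hypothetical helper for illustration, not Frigate's actual code:

```python
def batched(items, batch_size):
    """Yield successive fixed-size slices of a list."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# Hypothetical usage: embed a few descriptions at a time instead of
# feeding the whole reindex backlog to the model in one call.
descriptions = [f"event {n} description" for n in range(10)]
chunks = list(batched(descriptions, 4))  # batches of 4, 4 and 2
```

Smaller batches cap the peak attention buffer per inference call, trading a little throughput for bounded GPU memory.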

---
This is because we are batching during reindexing, and the maximum input length of the Jina text model is 8192 tokens. It shouldn't be difficult to resolve this.
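In the simplest form, respecting that limit means clamping any tokenized input before it reaches the model. This is only a sketch of the idea — the helper name is hypothetical, and the actual fix lives in the PR linked at the end of the thread:

```python
MAX_TOKENS = 8192  # Jina text model limit mentioned above

def truncate_token_ids(token_ids, max_tokens=MAX_TOKENS):
    """Clamp a token-id sequence to the model's maximum length.

    Hypothetical helper: truncation loses the tail of very long
    descriptions, but keeps the attention buffer bounded.
    """
    return token_ids[:max_tokens]
```

The log's warning ("8255 > 8192") shows exactly the case this guards against: a batch whose tokenized length slightly exceeds the model's window.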

---
Did this make it into beta2? I'm still having it crash even though I have reindex set to false. Is there an easy way to just clear out the descriptions?
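For anyone wanting to clear descriptions by hand, here is a heavily hedged sketch of editing the database directly. The database path, table name, and the assumption that descriptions live under a `description` key in a JSON `data` column are all guesses about the 0.15 schema, not confirmed details — stop Frigate and back up the file before trying anything like this:

```python
# Hypothetical sketch: blank out stored event descriptions in Frigate's
# sqlite database. DB_PATH, the "event" table, and the "data" JSON column
# are ASSUMPTIONS about the schema -- back up the file first.
import json
import sqlite3

DB_PATH = "/config/frigate.db"  # assumed default location

def clear_descriptions(db_path=DB_PATH):
    con = sqlite3.connect(db_path)
    try:
        for event_id, raw in con.execute("SELECT id, data FROM event").fetchall():
            data = json.loads(raw) if raw else {}
            if data.pop("description", None) is not None:
                con.execute(
                    "UPDATE event SET data = ? WHERE id = ?",
                    (json.dumps(data), event_id),
                )
        con.commit()
    finally:
        con.close()
```

If the schema differs, the same idea applies: find where descriptions are stored and null them out so the next reindex has nothing oversized to embed.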
Fixed in #14364