
[Question]: Improving Performance #4673

Open
Vova-Ponomarenko1 opened this issue Jan 28, 2025 · 2 comments
Labels
question Further information is requested

Comments

Vova-Ponomarenko1 commented Jan 28, 2025

Hi, I noticed that during chunk search, RAG doesn’t fully utilize the CPU, which results in prolonged information retrieval.

As a suggestion, you could implement multithreading.

(This becomes noticeable when working with multiple databases containing large volumes of data or small chunk sizes.)

Thank you!

Vova-Ponomarenko1 added the question label Jan 28, 2025
yingfeng (Member) commented:

Hi, if you mean retrieval performance, you can click the bulb icon above the response. If the retrieval latency is too long, say several thousand or even tens of thousands of milliseconds, it might be caused by the reranker; you could try removing the reranker, because it does not improve the ranking much. By default, the reranker runs on the CPU, which is quite slow.

If you mean chunking performance, you can configure more task executors in entrypoint.sh to increase parallelism.
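A rough sketch of what raising the task-executor parallelism in entrypoint.sh could look like. The variable name, loop, and script path below are illustrative assumptions, not the actual file contents; check the entrypoint.sh shipped with your RAGFlow version for the real worker loop before editing it.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: spawn several task-executor workers
# instead of one. WS and the task_executor.py path are assumptions.
WS=4   # number of chunking workers (hypothetically 1 by default)

for ((i = 0; i < WS; i++)); do
  # Each worker gets its own index so jobs can be sharded across them.
  python rag/svr/task_executor.py "$i" &
done
wait   # keep the container's entrypoint alive while workers run
```

More workers help only if the host has spare CPU cores; oversubscribing can slow chunking down instead.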

Regarding multithreading during search: it's not necessary, because in most cases the retrieval itself finishes within a single second. So you first need to figure out which component is actually taking the time.
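To figure out which component is slow, a minimal sketch of per-stage wall-clock timing. The stage functions below (embed_query, vector_search, rerank) are placeholders, not RAGFlow APIs; substitute the real calls from your pipeline.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    # Measure wall-clock time of the enclosed block and print it.
    start = time.perf_counter()
    yield
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed * 1000:.1f} ms")

# Hypothetical pipeline stages; replace with your actual calls.
def embed_query(q):
    time.sleep(0.01)

def vector_search(q):
    time.sleep(0.02)

def rerank(hits):
    time.sleep(0.05)

with timed("embedding"):
    embed_query("example question")
with timed("vector search"):
    vector_search("example question")
with timed("rerank"):
    rerank([])
```

Whichever stage dominates the total is the one worth optimizing; a two-minute "Searching..." usually points at one stage, not at a lack of threads.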

Vova-Ponomarenko1 (Author) commented:

Well, I'm not using a reranker; I just processed the files, connected the database to the assistant, and asked a question, and the "Searching..." step took almost 2 minutes.

P.S. The databases are really large; some files contain up to 45k chunks.
