Context window limitation #138
-
If you really want to use the output of such a command, you should store it in a file and use grep or some RAG tool to retrieve what you want. You can temporarily fix your conversation by removing the massive line from the conversation log.

Looking at the command, though, you seem to be dumping an entire node_modules into context; you could use something less noisy instead.

I recently changed the truncation behavior you describe: instead of keeping the first and last lines, it now takes the first 2000 + last 8000 tokens: 8a62859

RAG in general is tracked in #59. I was recently working on a local RAG thing in #135, but I'm not sure it'll get merged anytime soon.
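For illustration, here is a minimal sketch of that first-2000 + last-8000 token truncation. This is not the actual `reduce.py` code; the tokenizer choice (tiktoken's `cl100k_base`) and the function name are assumptions made for the example.

```python
# Hedged sketch of head+tail token truncation (not gptme's reduce.py).
import tiktoken

def truncate_middle(text: str, head: int = 2000, tail: int = 8000) -> str:
    """Keep the first `head` and last `tail` tokens, eliding the middle."""
    enc = tiktoken.get_encoding("cl100k_base")  # assumed tokenizer
    tokens = enc.encode(text)
    if len(tokens) <= head + tail:
        return text  # already fits within the budget, nothing to cut
    # Replace the middle with the same `[...]` marker the thread mentions.
    return enc.decode(tokens[:head]) + "\n[...]\n" + enc.decode(tokens[-tail:])
```

Truncating by tokens rather than lines means a single pathologically long line (like a dumped `ls -R`) can no longer blow past the budget on its own.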
-
I understand, but I need to dump the entire code block or output into the context without any truncation. Initially, I thought I could achieve this using RAG. However, I've realized it's not feasible to store it in a vector database and retrieve it later, as we have already reached the model's maximum context window size. Do you know of any good solutions for handling this kind of situation seamlessly?
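To illustrate the "store it to a file and use grep" suggestion from the reply above: the key is that the full dump stays on disk, and only a small matching slice ever re-enters the conversation. The file name and search term below are made up for the example.

```python
# Sketch: run the huge command once, keep its output out of the chat,
# and later pull back only the lines that are actually needed.
import subprocess
from pathlib import Path

# 1. Redirect the command's output to a file instead of into context.
listing = subprocess.run(["ls", "-R"], capture_output=True, text=True).stdout
Path("ls_output.txt").write_text(listing)  # hypothetical file name

# 2. Retrieve just the relevant lines (grep-style filtering).
matches = [line for line in Path("ls_output.txt").read_text().splitlines()
           if "config" in line]  # hypothetical search term
print("\n".join(matches[:50]))   # cap the slice that re-enters context
```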
-
Hi, I tried to run the `/shell ls -R` command to read all directories. Its token usage was 289,628, which exceeds the `claude-3.5-sonnet` model's context window. I reviewed your `reduce.py`, and it truncates message codeblocks to the first and last `X` lines, replacing the rest with `[...]`. I think we should keep all commands and codeblocks during `reduce` processing; my shell command was essential for achieving my goals with gptme.

To address this problem, I attempted to store that codeblock in a vector database and reuse it in the conversation using a RAG approach. However, this was not a viable solution because we had already reached the maximum token limit.

How can we solve this kind of issue?
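A sketch of how the "RAG tool" route mentioned in the first reply can sidestep the limit: if the dump is chunked and embedded before it ever enters the conversation, only the top-k retrieved chunks reach the model, so the full 289k tokens never need to fit. The library choice (sentence-transformers), model name, chunk size, and file name are all assumptions for illustration.

```python
# Hedged sketch: chunk a large command dump, embed the chunks, and
# retrieve only the most relevant ones into the conversation.
from pathlib import Path
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def chunk(text: str, lines_per_chunk: int = 40) -> list[str]:
    """Split the dump into fixed-size line chunks (arbitrary choice)."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + lines_per_chunk])
            for i in range(0, len(lines), lines_per_chunk)]

def retrieve(dump_path: str, query: str, k: int = 3) -> str:
    chunks = chunk(Path(dump_path).read_text())
    # Embed chunks and query, then rank chunks by cosine similarity.
    emb = model.encode(chunks, normalize_embeddings=True)
    q = model.encode([query], normalize_embeddings=True)[0]
    top = np.argsort(emb @ q)[::-1][:k]
    return "\n[...]\n".join(chunks[i] for i in top)

# Only this small retrieved slice would be appended to the conversation:
# print(retrieve("ls_output.txt", "where are the test fixtures?"))
```

The point is that RAG only fails here if the whole dump is first pasted into the chat; retrieval over an external store keeps the per-message token cost bounded by `k * lines_per_chunk`.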