Context window limitation #138
-
If you really want to use the output of such a command, you should store it in a file and use grep or some RAG tool to retrieve what you want. You can temporarily fix your conversation by removing the massive line from the conversation log.

Looking at the command, though, you seem to be dumping an entire node_modules into context; you could use something less noisy instead.

I recently changed the truncation behavior you describe: instead of keeping the first and last lines, it now takes the first 2000 + last 8000 tokens: 8a62859

RAG in general is tracked in #59. I was recently working on a local RAG thing in #135, but I'm not sure it'll get merged anytime soon.
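For illustration, here is a minimal sketch of that first-2000 + last-8000 token truncation. This is not the actual `reduce.py` code; the tokenizer choice (tiktoken's `cl100k_base`) and the function name are assumptions made for the example.

```python
# Hedged sketch of head+tail token truncation (not gptme's reduce.py).
import tiktoken

def truncate_middle(text: str, head: int = 2000, tail: int = 8000) -> str:
    """Keep the first `head` and last `tail` tokens, eliding the middle."""
    enc = tiktoken.get_encoding("cl100k_base")  # assumed tokenizer
    tokens = enc.encode(text)
    if len(tokens) <= head + tail:
        return text  # already fits within the budget, nothing to cut
    # Replace the middle with the same `[...]` marker the thread mentions.
    return enc.decode(tokens[:head]) + "\n[...]\n" + enc.decode(tokens[-tail:])
```

Truncating by tokens rather than lines means a single pathologically long line (like a dumped `ls -R`) can no longer blow past the budget on its own.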
-
I understand, but I need to dump the entire code block or output into the context without any truncation. Initially, I thought I could achieve this using RAG. However, I've realized it's not feasible to store it in a vector database and retrieve it later, as we have already reached the model's maximum context window size. Do you know of any good solutions for handling this kind of situation seamlessly?
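To illustrate the "store it to a file and use grep" suggestion from the reply above: the key is that the full dump stays on disk, and only a small matching slice ever re-enters the conversation. The file name and search term below are made up for the example.

```python
# Sketch: run the huge command once, keep its output out of the chat,
# and later pull back only the lines that are actually needed.
import subprocess
from pathlib import Path

# 1. Redirect the command's output to a file instead of into context.
listing = subprocess.run(["ls", "-R"], capture_output=True, text=True).stdout
Path("ls_output.txt").write_text(listing)  # hypothetical file name

# 2. Retrieve just the relevant lines (grep-style filtering).
matches = [line for line in Path("ls_output.txt").read_text().splitlines()
           if "config" in line]  # hypothetical search term
print("\n".join(matches[:50]))   # cap the slice that re-enters context
```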
-
Hi, I tried to run the `/shell ls -R` command to read all directories. Its token usage was 289,628, which exceeds the `claude-3.5-sonnet` model's context window. I reviewed your `reduce.py`, and it truncates message codeblocks to the first and last `X` lines, replacing the rest with `[...]`. I think we should keep all commands and codeblocks during `reduce` processing; my shell command was essential for achieving my goals with gptme.

To address this problem, I attempted to store that codeblock in a vector database and reuse it in the conversation using a RAG approach. However, this was not a viable solution because we had already reached the maximum token limit.

How can we solve this kind of issue?
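A sketch of how the "RAG tool" route mentioned in the first reply can sidestep the limit: if the dump is chunked and embedded before it ever enters the conversation, only the top-k retrieved chunks reach the model, so the full 289k tokens never need to fit. The library choice (sentence-transformers), model name, chunk size, and file name are all assumptions for illustration.

```python
# Hedged sketch: chunk a large command dump, embed the chunks, and
# retrieve only the most relevant ones into the conversation.
from pathlib import Path
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def chunk(text: str, lines_per_chunk: int = 40) -> list[str]:
    """Split the dump into fixed-size line chunks (arbitrary choice)."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + lines_per_chunk])
            for i in range(0, len(lines), lines_per_chunk)]

def retrieve(dump_path: str, query: str, k: int = 3) -> str:
    chunks = chunk(Path(dump_path).read_text())
    # Embed chunks and query, then rank chunks by cosine similarity.
    emb = model.encode(chunks, normalize_embeddings=True)
    q = model.encode([query], normalize_embeddings=True)[0]
    top = np.argsort(emb @ q)[::-1][:k]
    return "\n[...]\n".join(chunks[i] for i in top)

# Only this small retrieved slice would be appended to the conversation:
# print(retrieve("ls_output.txt", "where are the test fixtures?"))
```

The point is that RAG only fails here if the whole dump is first pasted into the chat; retrieval over an external store keeps the per-message token cost bounded by `k * lines_per_chunk`.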