[Bug]: The 'view' tool command doesn't work on /workspace #5497
Comments
Seems like the Not quite sure if we can simply fix it with |
For visibility, below is what the issue looks like from the swe-bench trajectory: ...
{
"id": 3,
"timestamp": "2024-11-21T12:13:39.311499",
"source": "agent",
"message": "Running Python code interactively: print(file_editor(**{'command': 'view', 'path': '/workspace/django__django__3.2'}))",
"action": "run_ipython",
"tool_call_metadata": {
"function_name": "str_replace_editor",
"tool_call_id": "call_29i0HPkZ2cmM3qIbT7MBkR0P",
"model_response": {
"id": "chatcmpl-AW0Ly6chyLtCqh0P6VtQbOHik9E7m",
"created": 1732191218,
"model": "gpt-4o-2024-05-13",
"object": "chat.completion",
"system_fingerprint": "fp_04751d0b65",
"choices": [
{
"finish_reason": "tool_calls",
"index": 0,
"message": {
"content": null,
"role": "assistant",
"tool_calls": [
{
"function": {
"arguments": "{\"command\":\"view\",\"path\":\"/workspace/django__django__3.2\"}",
"name": "str_replace_editor"
},
"id": "call_29i0HPkZ2cmM3qIbT7MBkR0P",
"type": "function"
}
],
"function_call": null
}
}
],
"usage": {
"completion_tokens": 28,
"prompt_tokens": 1563,
"total_tokens": 1591,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"service_tier": null,
"prompt_filter_results": [
{
"prompt_index": 0,
"content_filter_results": {}
}
]
},
"total_calls_in_response": 1
},
"args": {
"code": "print(file_editor(**{'command': 'view', 'path': '/workspace/django__django__3.2'}))",
"thought": "",
"include_extra": false,
"confirmation_state": "confirmed",
"kernel_init_code": ""
},
"timeout": 300
},
{
"id": 4,
"timestamp": "2024-11-21T12:13:39.339866",
"source": "agent",
"message": "Code executed in IPython cell.",
"cause": 3,
"observation": "run_ipython",
"tool_call_metadata": {
"function_name": "str_replace_editor",
"tool_call_id": "call_29i0HPkZ2cmM3qIbT7MBkR0P",
"model_response": {
"id": "chatcmpl-AW0Ly6chyLtCqh0P6VtQbOHik9E7m",
"created": 1732191218,
"model": "gpt-4o-2024-05-13",
"object": "chat.completion",
"system_fingerprint": "fp_04751d0b65",
"choices": [
{
"finish_reason": "tool_calls",
"index": 0,
"message": {
"content": null,
"role": "assistant",
"tool_calls": [
{
"function": {
"arguments": "{\"command\":\"view\",\"path\":\"/workspace/django__django__3.2\"}",
"name": "str_replace_editor"
},
"id": "call_29i0HPkZ2cmM3qIbT7MBkR0P",
"type": "function"
}
],
"function_call": null
}
}
],
"usage": {
"completion_tokens": 28,
"prompt_tokens": 1563,
"total_tokens": 1591,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"service_tier": null,
"prompt_filter_results": [
{
"prompt_index": 0,
"content_filter_results": {}
}
]
},
"total_calls_in_response": 1
},
"content": "Here's the files and directories up to 2 levels deep in /workspace/django__django__3.2, excluding hidden items:\n/workspace/django__django__3.2",
"extras": {
"code": "print(file_editor(**{'command': 'view', 'path': '/workspace/django__django__3.2'}))"
}
}
... @enyst Can you help double check again to see if |
I saw the same you see here, it had 'django' in the path, but I'm double checking. |
This was the test I made: #5506. I wanted to see how it does when it uses this tool more. With the resolver, I have noticed lately that it works very well when it uses this 'view' tool a lot for the initial exploration. From the events:
|
Also the LLM was doing stuff like this 😅
|
Thanks, it seems to fail for both
Yeah, I guess it's probably better than using |
Exactly my thoughts! And once your fix is merged, we can eval the prompt change.
Definitely, and there's more:
|
Ahh, good observation @ryanhoangt -- I ran into the issue as well but wasn't sure of the cause until this point. Maybe we can modify this line so instead of
I think this only happens during eval, right? |
Yes, during eval. Edited to add: good question. I saw it with eval. But it works fine with the resolver. |
Yeah, I think it'd be better to replace the symlink in eval (maybe in a separate PR with some testing). In my PR I also have some issues with it: All-Hands-AI/openhands-aci#5 (comment). Btw @enyst I think we can also update the aci version in your PR with the |
Is there an existing issue for the same bug?
Describe the bug and reproduction steps
I tried to nudge the LLM to use the 'view' command from the file_editor tool to view directory content with depth=2 when it starts exploring the filesystem (the first action of the agent). Claude Sonnet has been trained on those as far as we know, and Anthropic's tool use examples include this.
When I tried to eval it on swe-bench, the LLM ran into trouble: it couldn't read the output when it ran the command on /workspace. So it did things like:
"I see the issue - it's a symlink to /testbed. Let's explore the actual directory"
...
"{\"path\": \"/testbed/django/db/backends/base/schema.py\"
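To illustrate the behavior I'd expect, here is a minimal Python sketch of a depth-2 listing that resolves the symlink before walking, so a path that links to /testbed still shows its contents. This is not the file_editor implementation; `list_two_levels` and its `max_depth` parameter are hypothetical names made up for this example.

```python
import os


def list_two_levels(path: str, max_depth: int = 2) -> list[str]:
    """List non-hidden entries under `path`, up to `max_depth` levels deep.

    Hypothetical helper for illustration only, not the file_editor code.
    """
    # Resolve the symlink first; a walker that does not follow links
    # would otherwise stop at the link itself and list nothing under it.
    root = os.path.realpath(path)
    entries = [path]
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden directories in place so os.walk skips them.
        dirnames[:] = sorted(d for d in dirnames if not d.startswith("."))
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        # Report entries relative to the path the caller passed in.
        shown = path if rel == "." else os.path.join(path, rel)
        visible_files = sorted(f for f in filenames if not f.startswith("."))
        for name in dirnames + visible_files:
            entries.append(os.path.join(shown, name))
        if depth + 1 >= max_depth:
            dirnames[:] = []  # children were listed; do not descend further
    return entries
```

Calling `list_two_levels("/workspace/django__django__3.2")` would then return the directory itself followed by its non-hidden entries up to two levels down, reported under the path that was passed in rather than under the symlink target.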
Cc: @ryanhoangt @xingyaoww
Logs, Errors, Screenshots, and Additional Context
No response