Trajectory replay: Fix a few corner cases #6380

Open: wants to merge 2 commits into base: main
14 changes: 13 additions & 1 deletion openhands/controller/replay.py
@@ -1,5 +1,6 @@
 from openhands.core.logger import openhands_logger as logger
 from openhands.events.action.action import Action
+from openhands.events.action.message import MessageAction
 from openhands.events.event import Event, EventSource


@@ -14,7 +15,18 @@ class ReplayManager:

     def __init__(self, replay_events: list[Event] | None):
         if replay_events:
-            logger.info(f'Replay logs loaded, events length = {len(replay_events)}')
+            logger.info(f'Replay events loaded, events length = {len(replay_events)}')
+            for index in range(len(replay_events) - 1):
+                event = replay_events[index]
+                if isinstance(event, MessageAction) and event.wait_for_response:
+                    # For any message waiting for response that is not the last
+                    # event, we override wait_for_response to False, as a response
+                    # would have been included in the next event, and we don't
Collaborator

> a response would have been included in the next event

The next event is a MessageAction with source='user', which is the response. Is this the event you mean, or do you mean the next source=agent event?

I ask because I'm curious about something. If the replay process ends and we then close the controller, the history will be saved in a new trajectory, and that trajectory should reflect exactly what happened, just like the initial one. So IMHO it should contain... 🤔

  • the agent actions, including this MessageAction
  • the correct user actions, including task, including... the response to this MessageAction? The response is a MessageAction with source='user'

(I mean enough events should be retrieved so that an agent with this history can continue normally, with all information it had in the past. Or do you see a reason why that won't work?)

Collaborator Author

@li-boxuan, Jan 21, 2025

> The next event is a MessageAction with source='user', which is the response. Is this the event you mean

Yes, literally the next event.

> I mean enough events should be retrieved so that an agent with this history can continue normally, with all information it had in the past

I think that's what I've been trying to achieve? Do you see any place that would break this assumption? The response from the user is indeed included in the trajectory. For example, in demo2.json, step 16 contains the user response.

The logic here is to NOT pause the control flow. The controller "replays" the "recorded" user response from the trajectory, rather than a new user response from the actual user.
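
For readers following the thread, here is a minimal sketch of the override described above. It uses stand-in dataclasses rather than the real OpenHands types; only the field name `wait_for_response` and the class name `MessageAction` come from the diff, while `clear_intermediate_waits` and the sample trajectory are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Stand-in for an event; not the real OpenHands class."""
    source: str  # 'agent' or 'user'


@dataclass
class MessageAction(Event):
    """Stand-in for a message action; field names follow the diff."""
    content: str = ''
    wait_for_response: bool = False


def clear_intermediate_waits(events: list[Event]) -> int:
    """Set wait_for_response to False on every message except the last event.

    Returns the number of events that were changed.
    """
    changed = 0
    for event in events[:-1]:  # the final event is deliberately left alone
        if isinstance(event, MessageAction) and event.wait_for_response:
            event.wait_for_response = False
            changed += 1
    return changed


# Hypothetical trajectory: the agent asks, the user answers, the agent asks again.
trajectory = [
    MessageAction(source='agent', content='Which file?', wait_for_response=True),
    MessageAction(source='user', content='main.py'),
    MessageAction(source='agent', content='Anything else?', wait_for_response=True),
]
changed = clear_intermediate_waits(trajectory)
```

Only the first question loses its flag, because its answer is already recorded as the next event. The trailing question keeps `wait_for_response=True`, so control would return to a live user once the recorded events run out.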

Collaborator

I see and I agree, thank you; we're on the same page on the goal. I still have a small hesitation, but I have yet to look at the .json files in detail, so please take it with a grain of salt and feel free to ignore it atm (I'll look closer at it tonight):

  • I don't see clearly how the controller can replay the "recorded" user response, since this code says that all actions with source='user' are not replayable. What am I missing?

  • idk, it also seems to me that we're getting an extra MessageAction that wasn't in history before? The null/null message is new in demo2. Unless I'm hallucinating worse than my Opus. 😅

I wonder if there's an alternative: during replay, interpret wait_for_response as "don't wait, read next message". But it might be more complex.
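
One way the alternative might look, as a rough sketch only: the recorded events are left untouched, and the replay side reads a waiting message as "do not wait, read the next message". The `Message` class and the `recorded_reply` helper are hypothetical stand-ins, not OpenHands APIs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    """Stand-in for a recorded message event; not the real OpenHands class."""
    source: str  # 'agent' or 'user'
    content: str
    wait_for_response: bool = False


def recorded_reply(events: list[Message], index: int) -> Optional[Message]:
    """Return the recorded user reply to events[index], if one exists.

    None means either the message was not waiting, or no reply was recorded
    and the caller would have to wait for a live user.
    """
    message = events[index]
    if not message.wait_for_response:
        return None
    if index + 1 < len(events) and events[index + 1].source == 'user':
        return events[index + 1]
    return None


# Hypothetical history: the first question was answered, the last one was not.
history = [
    Message('agent', 'Which file?', wait_for_response=True),
    Message('user', 'main.py'),
    Message('agent', 'Anything else?', wait_for_response=True),
]
first_reply = recorded_reply(history, 0)
last_reply = recorded_reply(history, 2)
```

In this sketch the first question resolves to the recorded answer, the last one resolves to nothing, and no event is mutated. The cost is that whoever drives the replay needs to know about the lookup.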

Collaborator Author

> I don't see clearly how the controller can replay the "recorded" user response, since this code says that all actions with source='user' are not replayable. What am I missing?

You are right, I was cheating! It's not being "replayed" because there's nothing to replay. It's skipped from the replay manager's perspective.

> it also seems to me that we're getting an extra MessageAction that wasn't in history before? The null/null message is new in demo2

Yeah that might be a side-effect of setting wait_for_response = False. Let me think about your alternative.
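
To illustrate what "skipped" means here, a small sketch under stated assumptions: it assumes, as the thread describes, that user-sourced actions are never replayable and are simply stepped over. The `Action` stand-in and the `next_replayable` helper are hypothetical and do not reproduce the actual ReplayManager code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Action:
    """Stand-in for a recorded action; not the real OpenHands class."""
    source: str  # 'agent' or 'user'
    name: str


def next_replayable(events: list[Action], index: int) -> tuple[Optional[Action], int]:
    """Return the next non-user action at or after index, plus the new cursor.

    User-sourced actions are stepped over rather than replayed.
    """
    while index < len(events) and events[index].source == 'user':
        index += 1
    if index < len(events):
        return events[index], index + 1
    return None, index


# Hypothetical recording: task, question, reply, command.
recorded = [
    Action('user', 'task'),
    Action('agent', 'ask'),
    Action('user', 'reply'),
    Action('agent', 'run'),
]
replayed = []
cursor = 0
while True:
    action, cursor = next_replayable(recorded, cursor)
    if action is None:
        break
    replayed.append(action.name)
```

Only the agent actions end up in `replayed`; the recorded task and reply are never issued by the replay loop itself.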

Collaborator Author

> I wonder if there's an alternative: during replay, interpret wait_for_response as "don't wait, read next message". But it might be more complex.

This sounds like the right way to do stuff, but... it means more coupling between agent controller and replay manager 💭

+                    # want the user to interfere with the replay process
+                    logger.info(
+                        'Replay events contains wait_for_response message action, ignoring wait_for_response'
+                    )
+                    event.wait_for_response = False
         self.replay_events = replay_events
         self.replay_mode = bool(replay_events)
         self.replay_index = 0
4 changes: 4 additions & 0 deletions openhands/core/main.py
@@ -231,6 +231,10 @@ def load_replay_log(trajectory_path: str) -> tuple[list[Event] | None, Action]:
 events = []
 for item in data:
     event = event_from_dict(item)
+    if event.source == EventSource.ENVIRONMENT:
+        # ignore ENVIRONMENT events as they are not issued by
+        # the user or agent, and should not be replayed
+        continue
     # cannot add an event with _id to event stream
     event._id = None  # type: ignore[attr-defined]
     events.append(event)
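
The filtering step in this hunk can be tried in isolation with the sketch below. It assumes each trajectory item is a dict with a `source` key. The `EventSource` stand-in reuses the member name from the diff, but its string values, the `replayable_items` helper and the sample data are hypothetical.

```python
from enum import Enum


class EventSource(str, Enum):
    """Stand-in enum; member names follow the diff, string values are assumed."""
    AGENT = 'agent'
    USER = 'user'
    ENVIRONMENT = 'environment'


def replayable_items(data: list[dict]) -> list[dict]:
    """Drop items whose source is ENVIRONMENT, keeping the original order."""
    return [
        item
        for item in data
        if EventSource(item['source']) is not EventSource.ENVIRONMENT
    ]


# Hypothetical trajectory items; the real JSON carries many more fields.
data = [
    {'source': 'user', 'action': 'message'},
    {'source': 'environment', 'observation': 'agent_state_changed'},
    {'source': 'agent', 'action': 'run'},
]
kept = replayable_items(data)
sources = [item['source'] for item in kept]
```

The environment item is dropped and the user and agent items keep their relative order, which matches the intent stated in the diff comment.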