
[QUESTION] How to design asynchronous human-in-the-loop Crews running on the backend? #2051

olokshyn opened this issue Feb 6, 2025 · 0 comments
Labels
feature-request New feature or request

olokshyn commented Feb 6, 2025

Feature Area

Core functionality

Is your feature request related to an existing bug? Please link it here.

Human in the loop with CrewAI: #258 (closed)
Related forum

Describe the solution you'd like

I'd appreciate your help with designing a production human-in-the-loop system. I don't think it's covered by the existing documentation.

My use case is pretty generic:

  1. A crew is doing mission-critical work, so it must take input from a human while executing a task.
  2. If the human says the result is not good, the crew must redo the current, and potentially previous, tasks, taking the human feedback into account.
  3. This iterative process repeats until the human approves the output of the task.

I know people have suggested implementing an "ask human" tool and a dedicated agent that can use this tool to get the human input.

However, this is not sufficient once we consider how the crew is deployed: the crew runs on a backend in a background task, with the frontend connected through a web socket. (It could also run within an HTTP request-processing context; that makes little difference.)
Once the crew decides to ask for human input, it must:

  1. save the current state/context to persistent storage, like a database, so it can continue in case the crew dies before the user provides input
  2. yield control to the calling process so it can send the user prompt over the web socket
  3. wait for the user to provide input without timing out or dying
  4. restore the crew context if needed
  5. continue the crew based on the human input.
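The five steps above can be sketched in plain asyncio, with an in-memory dict standing in for the database. All names here (`HumanGate`, `ask`, `provide_input`, `restore`) are hypothetical, not CrewAI API:

```python
# A minimal sketch of the persist/yield/wait/restore/continue cycle,
# using only the standard library. The dict is a stand-in for real storage.
import asyncio
import json
from dataclasses import dataclass, field


@dataclass
class HumanGate:
    db: dict = field(default_factory=dict)        # 1. persistent storage stand-in
    _waiters: dict = field(default_factory=dict)  # run_id -> pending Future

    async def ask(self, run_id: str, prompt: str, state: dict) -> str:
        # 1. save state so a restarted worker can resume this run
        self.db[run_id] = json.dumps({"prompt": prompt, "state": state})
        # 2./3. yield control back to the event loop and wait, without a timeout
        fut = asyncio.get_running_loop().create_future()
        self._waiters[run_id] = fut
        return await fut  # 5. the resolved answer lets the crew continue

    def provide_input(self, run_id: str, answer: str) -> None:
        # called from the websocket handler when the user replies
        self._waiters.pop(run_id).set_result(answer)

    def restore(self, run_id: str) -> dict:
        # 4. reload the saved context after a crash
        return json.loads(self.db[run_id])["state"]


async def demo() -> str:
    gate = HumanGate()
    task = asyncio.create_task(gate.ask("run-1", "Approve?", {"step": 2}))
    await asyncio.sleep(0)  # let ask() persist the state and start waiting
    gate.provide_input("run-1", "approved")
    return await task
```

In a real deployment the websocket handler would call `provide_input`, and `db` would be a real table keyed by run ID so a fresh worker can pick up via `restore`.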

So this setup raises some questions:

  1. How do I make sure the "ask human" tool is called every time the task produces a result?
  2. How do I make sure the tasks are re-run if the "ask human" tool requests changes, as many times as needed?
  3. How do I make the "ask human" tool pause the crew and yield control to the calling process?
  4. How do I save and restore the context of a crew?

I see that the human_input=True flag of a Task can address the first two questions, but it looks like it's limited to stdin input.
Alternatively, I could abuse(?) the task guardrail mechanism to ask for user confirmation after the task finishes and return a validation error if the user requests changes. However, I would then need an additional LLM call to interpret what the user said.
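As I understand the guardrail contract, a task guardrail is a callable that receives the task output and returns a (success, data) tuple, and a False result re-runs the task with the data surfaced to the agent. Under that assumption, the extra LLM call might be avoidable by passing the user's feedback through verbatim; `ask_user` below is a hypothetical websocket round-trip:

```python
# A sketch of using a task guardrail as the approval hook.
# `ask_user` is injected so the guardrail stays testable; in production it
# would block on the websocket round-trip to the frontend.
def make_approval_guardrail(ask_user):
    def guardrail(task_output) -> tuple[bool, str]:
        raw = getattr(task_output, "raw", str(task_output))
        feedback = ask_user(raw)
        if feedback.strip().lower() in {"ok", "approved", "lgtm"}:
            return True, raw  # accept the task output as-is
        # False triggers a retry; the feedback text itself becomes the error
        # the agent sees, so no extra LLM call is needed to interpret it
        return False, f"User requested changes: {feedback}"
    return guardrail
```

This would loop the task until approval, at the cost of one guardrail invocation per draft.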

Memory looks promising for saving state, but it won't do the trick.
The agents have two types of memory that save interactions:

  1. Short-term memory saves the agent's outputs in a ChromaDB vector store, which is separate for each agent.
  2. Long-term memory saves an evaluation (0-10) of the agent's output (not the output itself) to a SQLite DB, one DB for all agents.

Neither type of memory seems to save the actual messages that led to the agent's output. That is a bit confusing to me, since the agent may lose useful details provided by the user earlier that are not detected as entities.

In conclusion, neither short-term nor long-term memory can preserve the full context of all agents when the crew is interrupted for human input.
There is a bigger problem with data isolation: memories from different users' interactions will be shared across all users through the common databases used by the default storage classes. I think this can be solved by using a different storage backend for memories.
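One pragmatic sketch of the isolation workaround, assuming the storage location is configurable (recent CrewAI versions read the CREWAI_STORAGE_DIR environment variable; verify against your version before relying on this):

```python
# Give each user a private directory for the ChromaDB/SQLite files, so
# default memory storage never mixes data across users. The env-var name
# is an assumption about CrewAI's configuration surface.
import os
from pathlib import Path


def scoped_storage_dir(base: str, user_id: str) -> str:
    path = Path(base) / f"user_{user_id}"
    path.mkdir(parents=True, exist_ok=True)
    # must be set before the crew (and its memory storages) is constructed
    os.environ["CREWAI_STORAGE_DIR"] = str(path)
    return str(path)
```

This only isolates data on disk; it doesn't solve persistence of the in-flight conversation context.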

Describe alternatives you've considered

Currently, the way I see it can be implemented is:

  1. Let the "ask human" tool simply block the whole crew until the user provides input. Not ideal, since I would then need to run the crew in a separate process so the socket stays usable while the tool blocks. Async execution should help here.
  2. Provide the "ask human" tool with two queues, so it can send the user prompt on one and wait for the user's answer on the other.
  3. In the web socket process, read/write on these queues and hope the crew process doesn't die in the meantime, losing the context.
  4. Probably use a Flow: the first crew talks to the human and produces results, then Python code verifies the human input was taken into account, then another crew acts on the verified results of the first crew.
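Alternatives 2 and 3 can be sketched with a pair of thread-safe queues; the crew runs in a worker thread and the websocket handler drives the other ends. Names are illustrative, not CrewAI API:

```python
# The tool and the websocket handler talk through two queues:
# prompts flow crew -> websocket, answers flow websocket -> crew.
import queue
import threading

prompts: queue.Queue = queue.Queue()  # crew -> websocket
answers: queue.Queue = queue.Queue()  # websocket -> crew


def ask_human(prompt: str) -> str:
    """Called inside the crew (e.g. from a tool): blocks until a reply."""
    prompts.put(prompt)
    return answers.get()  # no timeout: wait as long as the user needs


def crew_worker(results: list) -> None:
    """Stand-in for the crew's background thread."""
    feedback = ask_human("Is this draft OK?")
    results.append(feedback)


def websocket_side(reply: str) -> str:
    """Stand-in for the websocket handler: forward the prompt, relay the reply."""
    prompt = prompts.get()  # would be sent to the frontend over the socket
    answers.put(reply)      # user's reply received from the frontend
    return prompt
```

The weak point, as noted above, is that the queues live in process memory: if the crew process dies while blocked on `answers.get()`, the context is gone unless it was persisted first.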

The quick-and-dirty solution is simply to patch CrewAgentExecutorMixin._ask_human_input on the CrewAgentExecutor class in the agent.py module:

from functools import partial

import crewai
from crewai.agents.crew_agent_executor import CrewAgentExecutor


def replace_class(new_class=None, *, class_module, class_name):
    """Decorator that swaps `class_name` in `class_module` for `new_class`."""
    if new_class is None:
        # called with keyword arguments only: act as a decorator factory
        return partial(replace_class, class_module=class_module, class_name=class_name)
    original_class = class_module.__dict__[class_name]
    # sanity-check before patching: the replacement must subclass the original
    assert original_class in new_class.__bases__
    class_module.__dict__[class_name] = new_class
    new_class.replaces_class = original_class
    return new_class


@replace_class(class_module=crewai.agent, class_name="CrewAgentExecutor")
class CustomCrewAgentExecutor(CrewAgentExecutor):
    is_custom = True

    def _ask_human_input(self, final_answer: str) -> str:
        # send final_answer to the frontend somehow and wait for the user feedback
        user_feedback = ...
        return user_feedback

Obviously, this ignores saving/restoring the state of all agents.

Additional context

No response

Willingness to Contribute

Yes, I'd be happy to submit a pull request
