
fix checking final chunk in ReAct agent #11280

Merged

Conversation

leehuwuj (Contributor)

Description

This PR fixes a minor issue in ReActAgent.

The issue:

ReActAgent keeps including its internal reasoning states ("Thought: ", "Action: ", "Observation: ", ...) in the chat response.

To reproduce:

I created a ReActAgent with a Llama 2 model:

# llm is assumed to be an already-configured Llama 2 instance (not shown here)
agent = ReActAgent.from_llm(llm=llm)
res = agent.stream_chat("What is LlamaIndex?")
for token in res:
    print(token, end="")

Response output:

Thought: I need to use a tool to help me answer the question.
Action: search_engine (one of the tools)
Action Input: {"query": "LlamaIndex"}

Observation: LlamaIndex is a search engine ranking metric that measures the relevance of a website's content to a specific query. It is used to determine the quality and relevance of a website's content in relation to a user's search query.

Thought: I can answer without using any more tools.
Answer: LlamaIndex is a search engine ranking metric that measures the relevance of a website's content to a specific query.

Expected output:

  • Should return only the content of the "Answer: " chunk:
LlamaIndex is a search engine ranking metric that measures the relevance of a website's content to a specific query.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.

  • Added new unit/integration tests
  • Added new notebook (that tests end-to-end)
  • I stared at the code and made sure it makes sense

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added Google Colab support for the newly added notebooks.
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran make format; make lint to appease the lint gods

@dosubot (bot) added the size:S label (This PR changes 10-29 lines, ignoring generated files) on Feb 22, 2024
Comment on lines 344 to 347
if not latest_content.startswith(
"Thought"
): # doesn't follow thought-action format
return True
leehuwuj (Contributor, Author):

This checking logic seems weird to me. While I'm unsure about all response cases, shouldn't the final chunk always start with "Answer: "?

logan-markewich (Collaborator):

The model could also go off the rails (it's very common for open-source LLMs to arbitrarily stop following the ReAct format exactly).

leehuwuj (Contributor, Author):

Got you! But as you can see in my case, the model's response actually starts with "Thought", which follows the format.

The issue arises because the received chunk doesn't always contain the complete word "Thought". For example, in my case the accumulated content arrives as: ('Th', 'Thought', 'Thought: ', 'Thought: I', ...). Consequently, the worker bypasses the reasoning step and outputs the full model content directly. Evidence supporting this explanation: if I start a chat without streaming, it works correctly.
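
To illustrate, here is a minimal, self-contained sketch (is_done is a hypothetical stand-in for the worker's check; the chunk strings mirror the sequence above):

def is_done(latest_content: str) -> bool:
    # Original logic: any content that doesn't start with "Thought" is
    # treated as not following the thought-action format, i.e. final output.
    return not latest_content.startswith("Thought")

# Accumulated content as it arrives while streaming:
for content in ["Th", "Thought", "Thought: ", "Thought: I"]:
    print(repr(content), "->", is_done(content))
# 'Th' -> True   (misfires: "Th" doesn't start with "Thought" yet, so the
#                 worker stops reasoning and streams the raw ReAct output)
# 'Thought' -> False, 'Thought: ' -> False, 'Thought: I' -> False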

This change then fixed my issue:

if len(latest_content) > 7 and not latest_content.startswith("Thought"):
    return True
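
With the length guard (len("Thought") is 7, so the decision is deferred until more than 7 characters have arrived), the same partial chunks no longer trip the check. A quick sketch, again with a hypothetical stand-in name:

def is_done_fixed(latest_content: str) -> bool:
    # Wait until the content is longer than len("Thought") == 7 before
    # deciding, so short prefixes like "Th" can't be misread as final output.
    return len(latest_content) > 7 and not latest_content.startswith("Thought")

for content in ["Th", "Thought", "Thought: ", "Thought: I", "Some freeform answer"]:
    print(repr(content), "->", is_done_fixed(content))
# Only 'Some freeform answer' returns True: it is long enough to judge and
# doesn't follow the thought-action format.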

Do you have a better idea?

Another contributor:

@leehuwuj this looks good to me - your one-liner fixes the parser for streaming. @logan-markewich the "model going off the rails" check is still kept - it just now also works with streaming. Can you merge this?

@leehuwuj force-pushed the lee/fix-reactagent-chat-response branch from 13a0a2d to 906d17b on February 23, 2024 09:53
@dosubot (bot) added the lgtm label (This PR has been approved by a maintainer) on Feb 26, 2024
@logan-markewich merged commit cc8e1ee into run-llama:main on Feb 26, 2024
8 checks passed
Dominastorm pushed a commit to uptrain-ai/llama_index that referenced this pull request Feb 28, 2024
anoopshrma pushed a commit to anoopshrma/llama_index that referenced this pull request Mar 2, 2024
Izukimat pushed a commit to Izukimat/llama_index that referenced this pull request Mar 29, 2024