Implement support to BatchAPIs to gather evidence #687

Open. Wants to merge 32 commits into main.

@maykcaldas (Collaborator) commented Nov 14, 2024

This PR implements support for sending requests to the OpenAI and Anthropic batch APIs. Because gathering evidence and summarizing all candidate papers are parallel by nature, we plan to use the batch API whenever possible.

Use of the batch API is controlled by Settings.use_batch_in_summary. When this setting is False (the default), paperqa workflows are unchanged. Currently, a batch keeps the process busy until the LLM provider finishes it, which can take up to 24 hours. This scaling issue will be addressed in another PR.

Task list

  • Create a class to make batch calls to openai
  • Create a class to make batch calls to anthropic
  • Integrate the openai class to the get_evidence method
  • Integrate the anthropic class to the get_evidence method
  • Update get_summary_llm to decide which provider to use given the llm in the config
  • ❌ Use pytest.mark.vcr in the tests to avoid creating batches for every test
  • Implement mock servers for testing purposes
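
The provider-selection item above can be pictured with a minimal sketch. This is not the PR's get_summary_llm; `BatchProvider`, `pick_batch_provider`, and the model-name prefixes are hypothetical names and heuristics chosen only for illustration.

```python
from enum import Enum


class BatchProvider(str, Enum):
    """Hypothetical identifiers for the two batch backends named in this PR."""

    OPENAI = "openai"
    ANTHROPIC = "anthropic"


def pick_batch_provider(model_name: str) -> BatchProvider:
    """Guess the batch backend from a model name (illustrative heuristic only)."""
    name = model_name.lower()
    # Assumed prefixes; a real implementation may inspect the config differently.
    if name.startswith(("gpt-", "openai/")):
        return BatchProvider.OPENAI
    if name.startswith(("claude", "anthropic/")):
        return BatchProvider.ANTHROPIC
    raise ValueError(f"No batch API support assumed for model {model_name!r}")
```

Raising on an unknown model, rather than silently falling back, makes it obvious when a configured LLM has no batch backend.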

This class is used to submit batch calls to the OpenAI batch API
@maykcaldas maykcaldas self-assigned this Nov 14, 2024
maykcaldas and others added 9 commits November 15, 2024 09:10
also added a dependency group in pyproject.toml to install openai and anthropic only if the user wants to use batches, refactored the logic of summarizing evidence in batch and moved the code to core.py
Also bugfix in tests and created Enums to avoid hardcoding the batch status identifiers
The time limit and the polling time for the batches are now in the Settings
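
As a rough sketch of how a status Enum, a time limit, and a polling interval might fit together: the names below (`BatchStatus`, `wait_for_batch`, `get_status`) and the status values are assumptions for illustration, not the identifiers used in this PR or by either provider.

```python
import time
from collections.abc import Callable
from enum import Enum


class BatchStatus(str, Enum):
    """Hypothetical status identifiers; real providers define their own values."""

    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"


def wait_for_batch(
    get_status: Callable[[], BatchStatus],
    time_limit: float,
    polling_interval: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> BatchStatus:
    """Poll get_status until it leaves IN_PROGRESS or time_limit seconds elapse."""
    deadline = clock() + time_limit
    while True:
        status = get_status()
        if status != BatchStatus.IN_PROGRESS:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Batch still in progress after {time_limit} s")
        # Block between checks; this is the "keeps the process busy" behavior.
        sleep(polling_interval)
```

Passing `sleep` and `clock` as parameters lets tests run the loop without real waiting.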
@maykcaldas maykcaldas marked this pull request as ready for review November 19, 2024 17:55
@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Nov 19, 2024
@dosubot dosubot bot added the enhancement New feature or request label Nov 19, 2024
@maykcaldas maykcaldas requested a review from whitead November 19, 2024 17:56
@maykcaldas maykcaldas changed the title [WIP] Implement support to BatchAPIs to gather evidence Implement support to BatchAPIs to gather evidence Nov 19, 2024

  for _, llm_result in results:
      session.add_tokens(llm_result)

- session.contexts += [r for r, _ in results if r is not None]
+ session.contexts += [r for r, _ in results]
Collaborator

Why did we cut the r is not None filter here? I would think that the results from gather_with_concurrency could still be None on failure, but maybe I'm wrong.

Collaborator Author

This gets the Contexts from gather_with_concurrency or gather_with_batch, and both always return a list of (Context, LLMResult) tuples. Context.text can end up empty, but as far as I can tell r is always an instance of Context.
Also, I didn't see any case of map_fxn_summary returning None while studying the code, and mypy complains that r is not None is always true.

Maybe that's an edge case that I didn't see?

Collaborator

If we correctly type hinted gather_with_concurrency then this would be resolved. @maykcaldas can you adjust it to be this?

```python
from collections.abc import Awaitable, Iterable
from typing import TypeVar

T = TypeVar("T")


async def gather_with_concurrency(n: int, coros: Iterable[Awaitable[T]]) -> list[T]:
    ...
```
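
For illustration, here is one way a function with that signature could be written. This is a sketch built on asyncio.Semaphore, not necessarily how paperqa implements gather_with_concurrency; `run_limited` and `square` are hypothetical helpers. With the generic return type, a type checker infers list[int] for the call below, so a None filter on the results is unnecessary.

```python
import asyncio
from collections.abc import Awaitable, Iterable
from typing import TypeVar

T = TypeVar("T")


async def gather_with_concurrency(n: int, coros: Iterable[Awaitable[T]]) -> list[T]:
    """Await all coros with at most n running at once, preserving input order."""
    semaphore = asyncio.Semaphore(n)

    async def run_limited(coro: Awaitable[T]) -> T:
        # Each awaitable waits for a free slot before it starts running.
        async with semaphore:
            return await coro

    return await asyncio.gather(*(run_limited(c) for c in coros))


async def square(x: int) -> int:
    """Toy coroutine standing in for a per-paper summary call."""
    await asyncio.sleep(0)
    return x * x


results = asyncio.run(gather_with_concurrency(2, [square(i) for i in range(4)]))
```

Because asyncio.gather raises on the first failing awaitable by default, a failure surfaces as an exception instead of a None entry in the result list.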

@maykcaldas maykcaldas requested a review from mskarlin November 19, 2024 19:46
3 participants