Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: ChatGPTGenerator #5692

Closed
wants to merge 34 commits into from
Closed

feat: ChatGPTGenerator #5692

wants to merge 34 commits into from

Conversation

ZanSara
Copy link
Contributor

@ZanSara ZanSara commented Aug 30, 2023

Related Issues

Proposed Changes:

  • Add ChatGPTGenerator according to the LLM Proposal
  • Add unit tests
  • Add end to end tests

How did you test it?

  • Local tests run
  • CI
  • e2e tests

Notes for the reviewer

n/a

Checklist

@ZanSara ZanSara changed the base branch from main to generators-module August 30, 2023 15:48
@github-actions github-actions bot added the type:documentation Improvements on the docs label Aug 30, 2023
@ZanSara ZanSara mentioned this pull request Aug 30, 2023
@ZanSara ZanSara mentioned this pull request Aug 30, 2023
@ZanSara ZanSara marked this pull request as ready for review August 30, 2023 18:38
@ZanSara ZanSara requested review from a team as code owners August 30, 2023 18:38
@ZanSara ZanSara requested review from dfokina and vblagoje and removed request for a team August 30, 2023 18:38
@ZanSara ZanSara marked this pull request as draft August 30, 2023 18:39
@ZanSara ZanSara marked this pull request as ready for review August 31, 2023 11:02
@ZanSara ZanSara added the 2.x Related to Haystack v2.0 label Aug 31, 2023
Copy link
Member

@vblagoje vblagoje left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the first pass I tried to find obvious and easy-to-find errors. Looks solid; left some early feedback.

headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
if openai_organization:
headers["OpenAI-Organization"] = openai_organization
url = f"{api_base_url}/chat/completions"
Copy link
Contributor Author

@ZanSara ZanSara Aug 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vblagoje We're using the chat completion endpoint for ChatGPT: https://platform.openai.com/docs/api-reference/chat/create

@ZanSara ZanSara requested a review from vblagoje August 31, 2023 14:06
Base automatically changed from generators-module to main August 31, 2023 15:33
@coveralls
Copy link
Collaborator

coveralls commented Aug 31, 2023

Pull Request Test Coverage Report for Build 6039895083

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 3 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+0.3%) to 48.903%

Files with Coverage Reduction New Missed Lines %
preview/components/generators/openai/_helpers.py 3 96.59%
Totals Coverage Status
Change from base Build 6039423187: 0.3%
Covered Lines: 11749
Relevant Lines: 24025

💛 - Coveralls

:param top_p: An alternative to sampling with temperature, called nucleus sampling, where the model
considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens
comprising the top 10% probability mass are considered.
:param n: How many completions to generate for each prompt.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we elaborate on what this means?

considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens
comprising the top 10% probability mass are considered.
:param n: How many completions to generate for each prompt.
:param stop: One or more sequences where the API will stop generating further tokens.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this define some specific phrase that tells the API to stop generating tokens?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the contrary: it's a keyword that tells us that we can discard all the LLM output that comes after it. It used to be quite important as ChatGPT would keep generating after the stopword if not instructed correctly. I'm not sure it's still an issue, but the parameter is still present.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the wording to try make it more clear, it was quite ambiguous indeed

:param api_base_url: The OpenAI API Base url, defaults to `https://api.openai.com/v1`.
:param openai_organization: The OpenAI organization ID.

See OpenAI documentation](https://platform.openai.com/docs/api-reference/chat) for more details.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
See OpenAI documentation](https://platform.openai.com/docs/api-reference/chat) for more details.
See OpenAI [documentation](https://platform.openai.com/docs/api-reference/chat) for more details.

@ZanSara
Copy link
Contributor Author

ZanSara commented Sep 4, 2023

This PR has become really big, so I'll split it into two smaller ones. I'll make sure to mark you two as reviewers of those PRs as well.

@ZanSara ZanSara closed this Sep 4, 2023
@masci masci deleted the chatgpt-generator branch December 5, 2023 08:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.x Related to Haystack v2.0 topic:tests type:documentation Improvements on the docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants