Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make @RegisterAiService beans request scoped by default #96

Merged
merged 1 commit into from
Dec 7, 2023
Merged

Conversation

geoand
Copy link
Collaborator

@geoand geoand commented Dec 5, 2023

This is done because otherwise the chat memory
does not get cleared properly.

Fixes: #95

Do we need a docs update to go along with this?

@geoand
Copy link
Collaborator Author

geoand commented Dec 5, 2023

@jmartisk pinecone test still failing after merging your PR :)

@jmartisk
Copy link
Collaborator

jmartisk commented Dec 5, 2023

That must be because https://github.com/quarkiverse/quarkus-langchain4j/actions/runs/7101672400 and https://github.com/quarkiverse/quarkus-langchain4j/actions/runs/7101669612 were triggered at about the same time.
It guards against running the matrix together within a single build, but it doesn't guard about somebody triggering two independent builds at once by PRs coming from the main repo...

@jmartisk
Copy link
Collaborator

jmartisk commented Dec 5, 2023

We'll need to submit PRs from forks unless absolutely necessary :)

@geoand
Copy link
Collaborator Author

geoand commented Dec 5, 2023

Ah gotcha, thanks!

@geoand
Copy link
Collaborator Author

geoand commented Dec 5, 2023

I think I'm gonna take out the pinecone test altogether and only enable it in the nightly build.
It's too big of a toll on test parallelism for the sake of just one test

@cescoffier
Copy link
Collaborator

I’m not sure about that. Typically, rest clients are application-scoped. We should change how we handle memory instead.

@geoand
Copy link
Collaborator Author

geoand commented Dec 5, 2023

Unfortunately it's not easy (or possible at all).

The basic problem is that memory is also used when using AiService.create(), and if you try to change the context or the memory to fit the @RegisterAiService model, you end up breaking the create model.
I tried a few variations of this, and none worked well - the only thing that worked was the request scoped service - whose only real problem really is when you are not on a request (which is not really like, but even if it is, one can use @ActivateRequestScope).

@jmartisk
Copy link
Collaborator

jmartisk commented Dec 5, 2023

I think I'm gonna take out the pinecone test altogether and only enable it in the nightly build. It's too big of a toll on test parallelism for the sake of just one test

Yeah I kinda agree. You can disable it by removing the propagation of secrets into workflows. The test runs when it sees a non-empty value of PINECONE_API_KEY.

@cescoffier
Copy link
Collaborator

Are you sure this will continue to work with the web socket? It will create many instances of the AI service, which is what we want to avoid.

I do not get the issue with the creation part. I would need to have a look. Maybe we need to prioritize the scope thingy.

@geoand
Copy link
Collaborator Author

geoand commented Dec 5, 2023

Are you sure this will continue to work with the web socket?

See the changes to the examples :). Essentially it works if you use context propagation

@geoand
Copy link
Collaborator Author

geoand commented Dec 5, 2023

I do not get the issue with the creation part. I would need to have a look. Maybe we need to prioritize the scope thingy.

What happens is that although the ChatMemoryProvider closes and removes all content, the ChatMemory of AiServicesContext still uses them.
If you try to remove them from there as well, you'll get a bunch of breaking tests that use AiServices.create

@cescoffier
Copy link
Collaborator

What if we use a proxy/facade in-between that uses the ARC API to instantiate the memory in the proper context?

@cescoffier
Copy link
Collaborator

Also, what about having AiServicesContext in a special scope and not the AIService?

@geoand
Copy link
Collaborator Author

geoand commented Dec 5, 2023

I thought of that. The problem is producing it, but I'll look into it

@geoand
Copy link
Collaborator Author

geoand commented Dec 6, 2023

@jmartisk the nightly build with the pinecone test is failing very weirdly: https://github.com/quarkiverse/quarkus-langchain4j/actions/runs/7108550795/job/19352035493

@jmartisk
Copy link
Collaborator

jmartisk commented Dec 6, 2023

Holy * that test is obnoxious. Looking into it...

@jmartisk
Copy link
Collaborator

jmartisk commented Dec 6, 2023

Nice, Pinecone has changed the response format and they now don't include an empty metadata object in the response if a vector has no metadata, causing the NPE. I will send a PR to adjust it on our side....

@geoand
Copy link
Collaborator Author

geoand commented Dec 6, 2023

Thanks!

@geoand
Copy link
Collaborator Author

geoand commented Dec 6, 2023

Also, what about having AiServicesContext in a special scope and not the AIService?

This is what I am trying now, but so far it has subtle issues, not to mention that the code is a lot more complicated than what I have pushed here.
I do believe that what I pushed is a fine solution, there are no real downsides of having the client be request scoped.

@geoand
Copy link
Collaborator Author

geoand commented Dec 6, 2023

Also, the more I think about it, the more it makes sense to me that the context (which is an entity unknown to the user) should have the same scope as the bean itself.

@geoand geoand force-pushed the #95 branch 2 times, most recently from a736eed to cf17b8c Compare December 6, 2023 14:08
@geoand
Copy link
Collaborator Author

geoand commented Dec 6, 2023

I have updated the PR with something slightly better.

As for the websockets, the scope is opened, but it is never closed unfortunately... So for those samples I made the services singleton.

If we agree on this approach, then I'll need to update the docs.

@geoand geoand force-pushed the #95 branch 2 times, most recently from 749fb9a to ee75189 Compare December 7, 2023 07:14
@cescoffier
Copy link
Collaborator

Ah Ah! I knew web sockets would be problematic.

I won’t have time to look before next week (traveling). So let’s go with this approach and extend the doc, for now.

We need to go back to the whiteboard to find a proper way to deal with this.

@geoand
Copy link
Collaborator Author

geoand commented Dec 7, 2023

We need to go back to the whiteboard to find a proper way to deal with this.

I totally agree

@geoand geoand force-pushed the #95 branch 2 times, most recently from ae6d811 to 645cc73 Compare December 7, 2023 08:07
@geoand geoand requested a review from cescoffier December 7, 2023 09:19
Copy link
Collaborator

@cescoffier cescoffier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a typo.

This is done because otherwise the chat memory
does not get cleared properly.

Furthermore, add a way to remove memory entries when the service goes out of scope

Fixes: #95
@geoand geoand merged commit b72b2db into main Dec 7, 2023
2 checks passed
@geoand geoand deleted the #95 branch December 7, 2023 12:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ChatMemory is persisted over request boundaries
3 participants