
Memory leak? #26

Open

iLoveBug opened this issue Aug 11, 2024 · 8 comments

Comments
@iLoveBug

Thanks, guys, for sharing this great repo.

I tried to use llama 3.1 with tools for graphrag on my MacBook Pro M3 Max (128 GB). Although Ollama supports this model, I found the entity extraction results were very strange.

Fortunately, fastmlx works quite well with llama 3.1 for graphrag 0.2.1 (the version I use), except that memory consumption grows over time.

I am not sure whether it's a memory leak or not.

I downloaded a novel from a website and fed it into graphrag; the file size is less than 200 KB.

Hope to get some support here.

@Blaizzy
Collaborator

Blaizzy commented Aug 13, 2024

Hey @iLoveBug

I'm happy to hear that fastmlx works well for graphrag.

Could you give me a reproducible example?

For instance, how you start the server and what the requests look like.
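
Something along these lines would be ideal (a minimal sketch, assuming the default fastmlx setup serving an OpenAI-compatible `/v1/chat/completions` endpoint on port 8000; the model id is a placeholder for whichever MLX-converted Llama 3.1 checkpoint you use):

```python
# Start the server first (assumed default invocation):
#   pip install fastmlx
#   fastmlx  # serves on http://localhost:8000
import requests

response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        # Placeholder model id; substitute the checkpoint you actually load.
        "model": "mlx-community/Meta-Llama-3.1-8B-Instruct-4bit",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 100,
    },
)
print(response.json())
```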

@iLoveBug
Author

iLoveBug commented Aug 16, 2024 via email

@Blaizzy
Collaborator

Blaizzy commented Aug 21, 2024

Can you share an exact example of how to replicate this issue?

Please include as much detail as possible :)

@Blaizzy
Collaborator

Blaizzy commented Aug 21, 2024

The request you send and the response you get.

@iLoveBug
Author

80days-ollama-llama3.1.zip
You can try with this package.

@Blaizzy
Collaborator

Blaizzy commented Aug 24, 2024

Thanks for the example @iLoveBug!

But I'm afraid I don't understand what the error is.

Can you elaborate on what you mean by "except the memory consumption growth as time goes by."?

@iLoveBug
Author

Sorry for the confusion.

My problem is that graphrag consumes a lot of tokens across many requests to the LLM. For example, it first requests entity and relationship extraction from text chunks, and then writes community reports based on the extracted entities and relationships. During this process, I saw memory consumption grow and everything get slower and slower. This is why I suspect there may be a memory leak. In the normal case, the system should release the memory after it finishes responding to a request, right?
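
A rough sketch of how I could measure this (hypothetical monitoring script; assumes psutil is installed and the fastmlx server's PID is known, and reuses the placeholder model id from above):

```python
# Poll the fastmlx server process RSS while replaying requests, to see
# whether memory is released between responses.
import time

import psutil
import requests

SERVER_PID = 12345  # placeholder: PID of the running fastmlx process

proc = psutil.Process(SERVER_PID)
payload = {
    "model": "mlx-community/Meta-Llama-3.1-8B-Instruct-4bit",
    "messages": [{"role": "user", "content": "Extract entities from: ..."}],
    "max_tokens": 200,
}

for i in range(100):
    requests.post("http://localhost:8000/v1/chat/completions", json=payload)
    rss_mb = proc.memory_info().rss / 1024**2
    print(f"request {i}: rss={rss_mb:.1f} MiB")
    time.sleep(0.5)
```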

@ivanfioravanti

I think this was due to a bug in mlx that has now been fixed. Here's the original issue in the Apple MLX repo.
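
If anyone still sees growth after upgrading mlx, it's worth checking whether the memory is MLX's Metal buffer cache rather than a true leak. A sketch, assuming an mlx version that exposes these helpers (the names have moved between releases):

```python
import mlx.core as mx

print(mx.metal.get_active_memory())  # bytes held by live arrays
print(mx.metal.get_cache_memory())   # bytes retained in the buffer cache
mx.metal.clear_cache()               # release cached buffers back to the OS
```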
