Timeout Error while loading /pull llava in the chatbar #33
Comments
It seems that you solved the problems from the other two issues you created. Can you please write the solutions under them and close those issues? This issue seems to be Ollama-related, so looking through their issues might help. Also, how many pages does your PDF file have?
Yes, I have posted the solutions on the previous issues. My PDF is only 1.5 MB. When I try to /pull the llava model, it shows "processing" and then a timeout error. /pull nomic-embed-text loads pretty quickly, but the problem there is that when I upload a PDF file, it gets stuck on "processing pdf..." and does not respond to anything, even when I ask in the chat bar.
I updated the code yesterday to track timings. Can you insert the PDF that is inside the pdf folder of this repository and then paste the command-line output here, so I can see what's happening and how long it takes?
OK, so I updated the code to your latest commit, ran the Streamlit app.py in my browser, and did /pull nomic-embed-text, but I still get the same ConnectionError: timeout. This is what I got in my browser: and this is what I got in my Ubuntu terminal:
In another issue you wrote that you changed the Ollama URL. Did you also change it in the vectordb_handler.py file? As a side note, you don't need to pull the same model over and over; after pulling it once, you have it locally. Also, Ollama did not recognize your GPU. Make sure that you have the latest version of the code, and that WSL is set up if you work on Windows.
Yes, I did change it in the vectordb file as well, but with the latest repo it now works with the Ollama URL too. I don't know why it is not recognizing the GPU; I do have WSL2 (Ubuntu). Even on CPU only, though, loading the PDF now takes around 30 seconds and the response time is 10-15 seconds.
I built a filter that hides models with "embed" in their name, because you can't chat with embedding models, so it makes no sense to have them in the model selection. Also, an embedding model is usually chosen once and then used continuously. You can define which embedding model to use in the config.yaml file.
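The filter described above can be sketched in a few lines. This is only an illustration, not the repository's code: the function name `chat_models` and the sample model names are made up, and matching case-insensitively is an assumption.

```python
def chat_models(model_names):
    """Return only the names that do not contain "embed".

    Hypothetical helper: the substring check mirrors the filter described
    in the comment; lower-casing the name first is an assumption.
    """
    return [name for name in model_names if "embed" not in name.lower()]


# Sample names for illustration only.
available = ["llava:7b", "nomic-embed-text:latest", "llama3.1:8b"]
print(chat_models(available))
```

With this kind of filter, nomic-embed-text stays usable for embedding (via config.yaml) but never shows up in the chat model dropdown.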
Oh okay, that makes sense now. So which model is used to summarize or chat with the PDF after it has been embedded? Also, to "unpull" a model I don't want to use, do I have to delete it from the .ollama directory?
Yes, you would need to delete it from the directory you defined, but it's a bit complicated since the files are not named after their model names. As a side note, the code is not built to directly summarize the PDF. It's a RAG approach: the three text snippets most similar to your query are retrieved from the vector database and given to the LLM as context to answer your question.
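To make the retrieval step concrete, here is a minimal sketch of "fetch the three most similar snippets and pass them as context". It is not the project's implementation: a toy bag-of-words vector stands in for the real embedding model, a plain list stands in for the vector database, and the names `embed`, `retrieve`, and `build_prompt` are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, snippets, k=3):
    """Return the k snippets most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(snippets, key=lambda s: cosine(query_vec, embed(s)), reverse=True)
    return ranked[:k]


def build_prompt(query, snippets, k=3):
    """Put the retrieved snippets in front of the question as context."""
    context = "\n".join(f"- {s}" for s in retrieve(query, snippets, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Because only the top three snippets reach the LLM, a request such as "summarize the whole PDF" sees a small slice of the document, which is why the approach suits targeted questions better than full summaries.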
Okay, so I switched to Windows and tested it myself. Llama3.1 would not load on the GPU for me, and I also got a timeout error. Smaller models did load, but it took a long time: for example, 5 minutes to load llava on the GPU, while it took only 5 seconds on Linux. Without GPU support it worked, though. This is an Ollama-related issue, so you might want to look for answers over there: ollama/ollama#4427
Okay, sure, I'll check out the Ollama thread for a solution to the GPU usage. Also, once you have uploaded the PDF using the nomic-embed-text model, which model do you use to chat with the PDF? And can you suggest a model similar to llava but smaller in size, so that we don't get a timeout error?
You can choose whatever model you want for chatting. Look through the Ollama library and try what fits your system best: https://ollama.com/library
So here is the solution for the HTTP connection error / connection timeout error. I changed the permission settings for the .ollama folder: go into Properties, open the Security tab, and change the setting for the user (your laptop) to Full control, so that the unnecessary blob files in the .ollama/model folder can be removed automatically. After doing this, restart your PC and the HTTP error should be gone. I still get a timeout error whenever I load/pull a bigger model (3 GB to 5 GB), but it works fine with small models.
So I tried to pull this model and was getting a timeout error, but then I plugged in the charger and set the system to performance/turbo mode, and it loaded the model successfully. However, I still cannot load the llava:7b model; it ends with a timeout error every time. I guess it's because of its 4-5 GB size, while the smaller models load perfectly. Edit: should I download the llava model manually and put it in the .ollama/model folder? Will it then be available in the "select a model" dropdown in the Streamlit app?
AUDIO ERROR: I am getting this when I record audio from the Streamlit app: LibsndfileError: Error opening <_io.BytesIO object at 0x7f71cd288f40>: Format not recognised. I suspect the recorded audio is not in WAV or another accepted format, hence this error. Edit: when I upload an audio file (OGG format) and summarize it, it works fine but takes a few minutes to load and process, while recording directly with the record audio button produces the error above.
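One way to check the suspicion above is to look at the first bytes of the recorded buffer before handing it to libsndfile. The sketch below is a diagnostic aid only, not the fix that was later pushed; `sniff_audio_format` is a hypothetical helper, and it recognises just a handful of well-known container signatures.

```python
def sniff_audio_format(data: bytes) -> str:
    """Guess the container format from well-known magic bytes.

    Hypothetical diagnostic helper; anything not listed is "unknown".
    """
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:4] == b"OggS":
        return "ogg"
    if data[:4] == b"fLaC":
        return "flac"
    if data[:4] == b"\x1a\x45\xdf\xa3":
        # EBML header, used by WebM and Matroska files.
        return "webm/matroska"
    if data[:3] == b"ID3":
        # MP3 file that starts with an ID3v2 tag.
        return "mp3"
    return "unknown"
```

Calling it on the raw bytes of the recording (for example `sniff_audio_format(buffer.getvalue())` for a `BytesIO` object) would show whether the recorder produces something other than WAV or OGG. Browser recorders often emit WebM, which libsndfile does not read, so that is one possible cause worth ruling out.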
@Paramjethwa Just pushed a fix for the audio error and the timeout error. Make sure to rebuild the Docker image for the app.
@Leon-Sander So I have tested the commit. At first the audio didn't work, but later, after I started a normal chat with the keyboard and then used record audio, it worked perfectly fine. As for the timeout error on /pull llava, it still exists: I tried a few times and ended up with the same asyncio.exceptions.TimeoutError.
Okay, I think I've got it now: it is an aiohttp error. I pushed a fix on a new branch, timout_error_fix. On line 53 in utils.py you can set how long to wait before a timeout error is thrown; I set it to 30 minutes for now. Can you please test whether this fixes your problem, and adjust it if you think you need more time to download the model? If this does not work, I have another workaround idea for you which will surely work.
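For readers hitting the same error, the kind of change described here might look roughly like the sketch below. It is not the code on the branch: `pull_model`, `pull_model_blocking`, the default `base_url`, and the JSON field names are assumptions (newer Ollama documentation calls the field "model" rather than "name"). The part that matters is passing an explicit `aiohttp.ClientTimeout` instead of relying on aiohttp's default total timeout, which is, as far as I know, 5 minutes and would match the failure after 5 minutes of waiting reported in this issue.

```python
import asyncio

# 30 minutes, the value mentioned in the comment above.
PULL_TIMEOUT_SECONDS = 30 * 60


async def pull_model(model_name, base_url="http://localhost:11434"):
    """Ask an Ollama server to pull a model, allowing a long download.

    Hypothetical sketch: the endpoint and payload are assumptions about
    the Ollama HTTP API, not taken from the repository.
    """
    # Imported here so this module can be loaded without aiohttp installed.
    import aiohttp

    timeout = aiohttp.ClientTimeout(total=PULL_TIMEOUT_SECONDS)
    payload = {"name": model_name, "stream": False}
    async with aiohttp.ClientSession(timeout=timeout) as session:
        async with session.post(f"{base_url}/api/pull", json=payload) as response:
            response.raise_for_status()
            # content_type=None skips aiohttp's strict Content-Type check.
            return await response.json(content_type=None)


def pull_model_blocking(model_name):
    """Synchronous wrapper, for callers that are not already async."""
    return asyncio.run(pull_model(model_name))
```

If a download on a slow connection still exceeds 30 minutes, raising `PULL_TIMEOUT_SECONDS` (or passing `total=None` to disable the limit) is the knob to adjust.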
Hey Leon, it took 15-20 minutes to load the model and it completed successfully, so this works perfectly fine now. The chat app is now running everything successfully. The only remaining issue is that Ollama is not detecting/using my GPU to load models or answer queries. I am trying to find a solution in the Ollama GitHub issue threads and hope I find it soon.
Thanks, happy to hear that. Do you have WSL enabled in Docker?
Yes, WSL is already enabled in Docker Desktop.
Hi Leon, I finally found the solution for the GPU usage. What I did:
Major credit and thanks to you, Leon, for helping me solve every single problem. Super grateful!
This is what I got in the Streamlit app after 5 minutes of waiting after doing /pull llava:
TimeoutError
Traceback:
File "/usr/local/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 88, in exec_func_with_error_handling
    result = func()
File "/usr/local/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 590, in code_to_exec
    exec(code, module.__dict__)
File "/app/app.py", line 171, in <module>
    main()
File "/app/app.py", line 123, in main
    response = command(user_input)
File "/app/utils.py", line 15, in command
    return pull_model_in_background(splitted_input[1])
File "/app/utils.py", line 73, in pull_model_in_background
    return asyncio.run(pull_ollama_model_async(model_name, stream=stream))
File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
File "/app/utils.py", line 39, in pull_ollama_model_async
    async with session.post(url, json=json_data) as response:
File "/usr/local/lib/python3.10/site-packages/aiohttp/client.py", line 1353, in __aenter__
    self._resp = await self._coro
File "/usr/local/lib/python3.10/site-packages/aiohttp/client.py", line 684, in _request
    await resp.start(conn)
File "/usr/local/lib/python3.10/site-packages/aiohttp/client_reqrep.py", line 994, in start
    with self._timer:
File "/usr/local/lib/python3.10/site-packages/aiohttp/helpers.py", line 713, in __exit__
    raise asyncio.TimeoutError from None
/pull nomic-embed-text loads pretty quickly, but the problem there is that when I upload a PDF file, it gets stuck on "processing pdf..." and does not respond to anything, even when I ask in the chat bar.