Gemma 3 fix #243
Conversation
| 
           Very cool! @bradhilton can you take a look at this one?  | 
    
| 
           @Lucas-Fernandes-Martins have you been able to test this? Does it work?  | 
    
| 
           Hi @corbt and @bradhilton, thank you for your message! Unfortunately, I spent today doing additional testing and found something concerning with the solution I proposed. While the enable_prefix_caching issue is solved, another one appears (for some reason I failed to notice this yesterday): This seems closely linked to this open issue in Unsloth. Also, when I try to deactivate vLLM altogether, I get: I apologize for opening the pull request so soon; I got carried away thinking that the initial enable_prefix_caching issue was solved. If you feel it is appropriate, I'll close the pull request, investigate further, and try to solve the problem. I've seen some folks in the community mention that Gemma 3 would be very useful to have in ART, especially due to its multilingual capabilities, so I'll do my best to solve this. Either way, thank you for the help :)  | 
    
| 
           Thank you @Lucas-Fernandes-Martins for your investigation. I am afraid that adding Gemma 3 support will likely be tricky.  | 
    
| 
           @Lucas-Fernandes-Martins it would be great to get Gemma 3 in! Definitely update this PR if you get to a working solution.  | 
    
| 
           Hi, thank you for your patience. After a few days of investigation, it seems that the main issue is that Unsloth's Gemma 3 doesn't support vLLM. However, I heard from the Unsloth community that vLLM support for Gemma 3 will be released soon (maybe even next week). Once this happens, I'll test ART to see if it works and keep you folks in the loop! Thanks again :)  | 
    
| 
           Thank you @Lucas-Fernandes-Martins for investigating!  | 
    
As noted in the main README, Gemma 3 models are not yet supported by ART because Gemma does not accept the enable_prefix_caching parameter.
To solve this issue, I've introduced the following changes in get_model_config.py:
I believe this solves the problem: users can set the use_gemma_config parameter, which prevents enable_prefix_caching from being added to the argument list.
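To illustrate the idea, here is a rough sketch and not the actual diff. The function name `build_engine_args` and the default values are hypothetical placeholders; only `use_gemma_config` and `enable_prefix_caching` come from the description above, and the real logic in get_model_config.py may be structured differently.

```python
from typing import Any


def build_engine_args(use_gemma_config: bool = False) -> dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for the inference engine.

    When use_gemma_config is True, enable_prefix_caching is left out
    entirely, so the engine never receives a parameter that Gemma rejects.
    """
    # Placeholder defaults for illustration only; not ART's real defaults.
    engine_args: dict[str, Any] = {
        "gpu_memory_utilization": 0.8,
        "max_model_len": 4096,
    }

    # Only add prefix caching for models that accept it.
    if not use_gemma_config:
        engine_args["enable_prefix_caching"] = True

    return engine_args


if __name__ == "__main__":
    print(build_engine_args())                       # includes enable_prefix_caching
    print(build_engine_args(use_gemma_config=True))  # omits enable_prefix_caching
```

Omitting the key, rather than setting it to False, matters if the downstream code rejects the parameter by name regardless of its value.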
Let me know if this is not correct or requires changes.
Thank you very much :)