

@Lucas-Fernandes-Martins

As noted in the main README, Gemma 3 models are not yet supported by ART, because Gemma does not accept the enable_prefix_caching parameter.

To solve this issue, I've introduced the following changes in get_model_config.py:

use_gemma_config = config.get("use_gemma_config", False)

if use_gemma_config:
    # Gemma 3 rejects enable_prefix_caching, so it is omitted here.
    init_args = InitArgs(
        model_name=base_model,
        max_seq_length=32768,
        load_in_4bit=True,  # False for LoRA 16bit
        fast_inference=True,  # Enable vLLM fast inference
        # vLLM args
        disable_log_stats=False,
        gpu_memory_utilization=(
            0.79 if enable_sleep_mode else 0.55
        ),  # Reduce if out of memory
        max_lora_rank=8,
        use_async=True,
    )
else:
    init_args = InitArgs(
        model_name=base_model,
        max_seq_length=32768,
        load_in_4bit=True,  # False for LoRA 16bit
        fast_inference=True,  # Enable vLLM fast inference
        # vLLM args
        disable_log_stats=False,
        enable_prefix_caching=True,
        gpu_memory_utilization=(
            0.79 if enable_sleep_mode else 0.55
        ),  # Reduce if out of memory
        max_lora_rank=8,
        use_async=True,
    )

I believe this solves the problem: users can then set use_gemma_config to prevent enable_prefix_caching from being added to the argument list.
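As a side note, the InitArgs call could also be written once by building the shared kwargs and adding enable_prefix_caching conditionally. This is just a sketch of an alternative shape, reusing InitArgs, base_model, enable_sleep_mode, and use_gemma_config from the diff above:

init_kwargs = dict(
    model_name=base_model,
    max_seq_length=32768,
    load_in_4bit=True,  # False for LoRA 16bit
    fast_inference=True,  # Enable vLLM fast inference
    disable_log_stats=False,
    gpu_memory_utilization=0.79 if enable_sleep_mode else 0.55,
    max_lora_rank=8,
    use_async=True,
)
if not use_gemma_config:
    # Gemma 3 rejects this vLLM argument, so only pass it for other models.
    init_kwargs["enable_prefix_caching"] = True
init_args = InitArgs(**init_kwargs)

Either shape should behave the same; the dict version just avoids keeping the two branches in sync.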

Let me know if this is not correct or requires adaptations.

Thank you very much :)

@Lucas-Fernandes-Martins changed the title from "gemma 3 fix" to "Gemma 3 fix" on Jul 15, 2025
@corbt requested a review from bradhilton on Jul 16, 2025
@corbt
Contributor

corbt commented Jul 16, 2025

Very cool! @bradhilton can you take a look at this one?

@bradhilton
Collaborator

@Lucas-Fernandes-Martins have you been able to test this? Does it work?

@Lucas-Fernandes-Martins
Author

Hi @corbt and @bradhilton, thank you for your message!

Unfortunately, I spent today doing additional testing and found something concerning about the solution I proposed.

While the change solves the enable_prefix_caching issue, another one appears (which I somehow failed to notice yesterday):

AttributeError: 'Gemma3ForCausalLM' object has no attribute 'vllm_engine'

This seems closely linked to this open issue in Unsloth.
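From what I can tell, Unsloth only attaches a vllm_engine attribute to models it can serve with fast inference, so the lookup fails for Gemma 3. A purely illustrative sketch of the failure mode (not ART's actual code):

engine = getattr(model, "vllm_engine", None)
if engine is None:
    # Unsloth's Gemma3ForCausalLM has no vllm_engine yet, so any code
    # path that assumes fast_inference support fails at this point.
    raise RuntimeError("Model was loaded without a vLLM engine")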

Also, when I try to disable vLLM altogether, I get:

     63             ctx = zmq.Context(async_ctx)
     64 
---> 65         Which previously had to be::
     66 
     67             ctx = zmq.Context.shadow(async_ctx.underlying)

zmq/backend/cython/context.pyx in zmq.backend.cython.context.Context.__init__()
TypeError: an integer is required
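If I read the traceback right, zmq.Context(x) treats its positional argument as the integer io_threads count, hence "an integer is required" when it receives an async context object; the docstring excerpt in the traceback shows the shadow-context pattern pyzmq expects instead. A minimal sketch, assuming pyzmq with asyncio support:

import zmq
import zmq.asyncio

async_ctx = zmq.asyncio.Context()
# zmq.Context(async_ctx) fails: the positional argument is io_threads (an int).
ctx = zmq.Context.shadow(async_ctx.underlying)  # wrap the existing context instead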

I apologize for opening the pull request so soon; I got carried away once the initial enable_prefix_caching issue was solved. If you feel it is appropriate, I'll close the pull request, do more investigation, and try to solve the problem.

I've seen some folks in the community mention that Gemma 3 would be very useful to have in ART, especially due to its multilingual capabilities, so I'll do my best to solve this.

Either way, thank you for the help :)

@bradhilton
Collaborator

Thank you @Lucas-Fernandes-Martins for your investigation. I am afraid that adding Gemma 3 support will likely be tricky.

@corbt
Contributor

corbt commented Jul 17, 2025

@Lucas-Fernandes-Martins it would be great to get Gemma 3 in! Definitely update this PR if you get to a working solution.

@Lucas-Fernandes-Martins
Author

Hi, thank you for your patience. After a few days of investigation, it seems the main issue is that Unsloth's Gemma 3 doesn't support vLLM. However, I've heard from the Unsloth community that vLLM support for Gemma 3 will be released soon (maybe even next week).

Once this happens, I'll test ART to see if it now works and keep you folks in the loop!

Thanks again :)

@bradhilton
Collaborator

Thank you @Lucas-Fernandes-Martins for investigating!
