34b vs 13b vs 7b: Impact of LLM size vs vision encoder size? #1135
matthiasgeihs asked this question in Q&A (unanswered)
I've experimented a bit with the different model sizes.
It seems that 34b is indeed significantly better at image recognition tasks.
I'd like to understand: how does the size of the language model affect visual recognition performance, given that the vision encoder is always the same size? Which parts of the architecture have the most impact on recognition performance? (See the rough parameter comparison below for scale.)
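For scale, here's a back-of-the-envelope comparison of where the parameters sit. It assumes the usual LLaVA-style setup of a CLIP ViT-L/14 vision encoder at roughly 0.3B parameters paired with each LLM size; the exact encoder and counts are my assumption, not stated above.

```python
# Rough parameter-count comparison for a LLaVA-style model.
# Assumption: a CLIP ViT-L/14 vision encoder (~0.3B params) is shared
# across all variants, while only the LLM changes size.

VISION_ENCODER_PARAMS = 0.3e9  # approximate; CLIP ViT-L/14 is ~304M

llm_sizes = {"7b": 7e9, "13b": 13e9, "34b": 34e9}

for name, llm_params in llm_sizes.items():
    total = llm_params + VISION_ENCODER_PARAMS
    share = VISION_ENCODER_PARAMS / total * 100
    print(f"{name}: vision encoder is {share:.1f}% of {total / 1e9:.1f}B total params")
```

Even in the 7b case the vision encoder is only a few percent of the total parameters, so the differences I'm seeing presumably come from the LLM's capacity to interpret the same visual features rather than from the features themselves.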