metal : utilize max shared memory for mul_mat_id #7935

Merged

ggerganov merged 1 commit into master from gg/metal-mmid-max-rows

Jun 14, 2024

Member

ggerganov commented Jun 14, 2024

Allows larger batch sizes for MoE models (e.g. DeepSeek v2), although running with -ub 256 is still faster.

Review complexity : Low
I have read the contributing guidelines
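
For readers unfamiliar with the flag: -ub (--ubatch-size) is the llama.cpp option for the physical micro-batch size, and a larger logical batch is processed in chunks of at most that many tokens. The following is a minimal, purely conceptual sketch of that chunking in Python. The function name `split_into_ubatches` is hypothetical and this is not code from llama.cpp or from this PR.

```python
from __future__ import annotations


def split_into_ubatches(n_tokens: int, n_ubatch: int) -> list[tuple[int, int]]:
    """Return (start, size) pairs that cover n_tokens in chunks of at most n_ubatch.

    Hypothetical helper for illustration only; it mirrors the idea of a
    micro-batch size, not the actual scheduling logic in llama.cpp.
    """
    if n_tokens < 0:
        raise ValueError("n_tokens must be non-negative")
    if n_ubatch <= 0:
        raise ValueError("n_ubatch must be positive")
    # Walk the token range in steps of n_ubatch; the last chunk may be shorter.
    return [
        (start, min(n_ubatch, n_tokens - start))
        for start in range(0, n_tokens, n_ubatch)
    ]


if __name__ == "__main__":
    # A 600-token prompt with a micro-batch size of 256.
    for start, size in split_into_ubatches(600, 256):
        print(f"tokens [{start}, {start + size}) -> {size} tokens")
```

With a micro-batch size of 256, each kernel invocation in this sketch sees at most 256 rows, which is the configuration the description above reports as faster.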


Commit eaf34ba: metal : utilize max shared memory for mul_mat_id

slaren approved these changes

ggerganov merged commit 66ef1ce into master

68 checks passed

bunnyfu mentioned this pull request

deepseek-code-v2 ollama/ollama#5120

Closed
