@bghira bghira commented Nov 11, 2025

This is a pull of ggml-org/ggml#1384 into the llama.cpp repository for review/sync to ggml, since I'm mostly unfamiliar with the contribution process.

I noted a lack of Metal-accelerated ops in GGML and thought Conv2d would be a simple target for my first contribution.

Performance test results on an M3 Max (the only hardware I have for testing) show a substantial throughput gain from using simdgroup operations:

| Shape | Metal (GFLOPS) | CPU (GFLOPS) |
|---|---|---|
| 19x19, Cin=256, Cout=4096, fp32 | 191.6 | 17.1 |
| 224x224, Cin=3, Cout=8, fp32 | 103.0 | 1.5 |
| 58x58, Cin=32, Cout=64, fp32 | 159.3 | 7.0 |
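
To put the figures above in relative terms, here is a small sketch that computes the Metal-over-CPU speedup for each row. The throughput numbers are copied from the results above. The `conv2d_flops` helper is an assumption: it uses the conventional count of two floating-point operations per multiply-accumulate, and the PR does not state the kernel size or the exact formula its benchmark uses.

```python
# Throughput figures copied from the PR's results (GFLOPS, M3 Max).
results = [
    ("19x19, Cin=256, Cout=4096, fp32", 191.6, 17.1),
    ("224x224, Cin=3, Cout=8, fp32", 103.0, 1.5),
    ("58x58, Cin=32, Cout=64, fp32", 159.3, 7.0),
]


def speedup(metal_gflops, cpu_gflops):
    """Ratio of Metal throughput to CPU throughput."""
    return metal_gflops / cpu_gflops


def conv2d_flops(n, c_in, c_out, kh, kw, oh, ow):
    """Conventional FLOP count for a direct 2D convolution.

    Assumption: one multiply plus one add per (output element, input
    channel, kernel tap). The benchmark in the PR may count differently.
    """
    return 2 * n * c_out * oh * ow * c_in * kh * kw


def gflops(flops, seconds):
    """Convert a FLOP count and a wall-clock time into GFLOPS."""
    return flops / seconds / 1e9


speedups = {shape: speedup(metal, cpu) for shape, metal, cpu in results}

if __name__ == "__main__":
    for shape, s in speedups.items():
        print(f"{shape}: {s:.1f}x")
```

By this measure the gain ranges from roughly 11x on the largest-channel shape to roughly 69x on the 224x224 shape, where the CPU baseline is lowest.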

Copilot-generated summary:

This pull request adds support for 2D convolution (CONV_2D) operations in the Metal backend of GGML, enabling hardware-accelerated execution of this operation on supported Apple devices. The changes include the implementation of the Metal kernel, integration into the operation pipeline, and updates to device capability checks and argument structures.

2D Convolution (CONV_2D) Support:

  • Added a new Metal kernel kernel_conv_2d in ggml-metal.metal for efficient 2D convolution, with template instantiations for both float and half.
  • Introduced the ggml_metal_kargs_conv_2d argument struct in ggml-metal-impl.h to pass necessary parameters to the Metal kernel.
  • Implemented the ggml_metal_op_conv_2d function in ggml-metal-ops.cpp to encode and dispatch the 2D convolution operation.
  • Registered the new operation in the Metal operation pipeline and header files (ggml-metal-ops.cpp, ggml-metal-ops.h).
  • Added the pipeline getter for CONV_2D in ggml-metal-device.cpp and declared it in the header.
  • Updated device capability checks to recognize CONV_2D support in ggml-metal-device.m.
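
For readers unfamiliar with the operation, the following is a minimal reference sketch of what a direct 2D convolution computes. It is not the Metal kernel from this PR and does not show any simdgroup tiling. The function names, the `[Cin][H][W]` and `[Cout][Cin][KH][KW]` layouts, and the single shared stride/padding/dilation value per axis are all assumptions made for illustration. Like most ML frameworks, it computes a cross-correlation (the kernel is not flipped).

```python
def conv2d_out_size(in_size, k, stride=1, pad=0, dilation=1):
    """Standard output-size formula for one spatial axis."""
    return (in_size + 2 * pad - dilation * (k - 1) - 1) // stride + 1


def conv2d_ref(x, w, stride=1, pad=0, dilation=1):
    """Naive direct convolution on nested lists.

    x: input,  indexed x[ci][iy][ix]      (assumed layout)
    w: kernel, indexed w[co][ci][ky][kx]  (assumed layout)
    Returns out[co][oy][ox]. Out-of-bounds taps count as zero padding.
    """
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    c_out, kh, kw = len(w), len(w[0][0]), len(w[0][0][0])
    oh = conv2d_out_size(h, kh, stride, pad, dilation)
    ow = conv2d_out_size(wd, kw, stride, pad, dilation)

    out = [[[0.0] * ow for _ in range(oh)] for _ in range(c_out)]
    for co in range(c_out):
        for oy in range(oh):
            for ox in range(ow):
                acc = 0.0
                for ci in range(c_in):
                    for ky in range(kh):
                        iy = oy * stride - pad + ky * dilation
                        if iy < 0 or iy >= h:
                            continue  # row falls in the zero padding
                        for kx in range(kw):
                            ix = ox * stride - pad + kx * dilation
                            if 0 <= ix < wd:
                                acc += x[ci][iy][ix] * w[co][ci][ky][kx]
                out[co][oy][ox] = acc
    return out


if __name__ == "__main__":
    x = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]   # 1 channel, 3x3 input
    w = [[[[1, 0], [0, 1]]]]                  # 1 output channel, 2x2 kernel
    print(conv2d_ref(x, w))
```

A loop nest like this is useful mainly as a correctness oracle: a GPU implementation can be checked against it on small shapes before measuring throughput on larger ones.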

Other Minor Changes:

  • Updated tensor API enablement logic for device compatibility, removing checks for some device models.
  • Fixed type consistency in argument passing for the concat operation.
  • Minor code cleanup and header includes.

These changes collectively allow GGML to offload 2D convolution operations to the GPU via Metal, improving performance for models that use this operation.

@bghira bghira requested a review from ggerganov as a code owner November 11, 2025 19:03
@github-actions github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and Apple Metal (https://en.wikipedia.org/wiki/Metal_(API)) labels Nov 11, 2025