ARM-Mali-Bifrost.md
Bifrost is the third generation of the Mali GPU architecture, following Utgard and Midgard.

Content:

Bifrost Gen1

Examples

  • Mali-G31

  • Mali-G51

  • Mali-G71

  • Rockchip RK3326 (Mali-G31)

References

1.1. ARM Unveils Next Generation Bifrost GPU Architecture & Mali-G71
1.2. Arm Mali-G71 Performance Counters Reference Guide, [backup]
1.3. Vulkan features for Mali-G71

Notes

  • Implements a single texel-per-clock and single pixel-per-clock shader core.
  • G71: 4-wide SIMD per quad and 3 quads per core, so 12 FMAs can issue at the same time. The 3 quads are managed by the core's quad manager. [4]
  • In both the Mali-G71 and G72, a quad is just that: a 4-wide SIMD unit, with each lane possessing separate FMA and ADD/SF pipes. [4]
  • 128 bits per pixel of tile buffer color storage. [5]
  • AFBC: [5]
    • G71: partial support
    • G31, G51: full support
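
The quad arithmetic in the notes above can be checked with a short calculation. This is only a sketch: the function names are made up for illustration, and counting one FMA as two floating-point operations is a common convention rather than something stated in the references. The core count and clock in the usage note are hypothetical, not the specification of any real chip.

```python
def fma_lanes_per_core(simd_width: int, quads_per_core: int) -> int:
    """FMA lanes that can issue per clock in one shader core."""
    return simd_width * quads_per_core


def peak_fp32_gflops(simd_width: int, quads_per_core: int,
                     cores: int, clock_mhz: float) -> float:
    """Theoretical peak, counting one FMA as 2 floating-point operations (assumption)."""
    lanes = fma_lanes_per_core(simd_width, quads_per_core)
    return lanes * 2 * cores * clock_mhz / 1000.0


# Mali-G71 figures from the notes: 4-wide SIMD per quad, 3 quads per core.
g71_lanes = fma_lanes_per_core(simd_width=4, quads_per_core=3)
print(g71_lanes)  # 12
```

For a hypothetical 8-core configuration at 850 MHz, `peak_fp32_gflops(4, 3, 8, 850)` gives 163.2 GFLOPS. This is a theoretical ceiling, not a measured figure.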

Bifrost Gen2

Examples

  • Mali-G52

  • Mali-G72

  • Rockchip RK3562, RK3566, RK3568 (Mali-G52-2EE)

References

2.1. ARM Announces Mali-G72
2.2. Arm Mali-G72 Performance Counters Reference Guide, [backup]
2.3. Mali-G52
2.4. Vulkan features for Mali-G52

Notes

  • Implements a single texel-per-clock and single pixel-per-clock shader core.
  • 256 bits per pixel of tile buffer color storage. [5]
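
The tile buffer budgets (128 bits per pixel for Gen1, 256 bits per pixel for Gen2) can be turned into a quick check of whether a set of color attachments fits. This is a minimal sketch: the helper name and the `gen1`/`gen2` keys are hypothetical, the per-format sizes are read off the Vulkan format names, and multisampling is ignored. What the hardware does when the budget is exceeded is not modeled here.

```python
# Color tile-buffer budget in bits per pixel, taken from the notes in this document.
TILE_BUFFER_BITS = {"gen1": 128, "gen2": 256}

# Bits per pixel for a few common color formats (derived from the format names).
FORMAT_BITS = {
    "VK_FORMAT_R8G8B8A8_UNORM": 32,
    "VK_FORMAT_A2B10G10R10_UNORM_PACK32": 32,
    "VK_FORMAT_R16G16B16A16_SFLOAT": 64,
    "VK_FORMAT_R32G32B32A32_SFLOAT": 128,
}


def fits_tile_buffer(formats, generation):
    """Return True if the summed attachment size stays within the budget."""
    used = sum(FORMAT_BITS[f] for f in formats)
    return used <= TILE_BUFFER_BITS[generation]


# A G-buffer with two RGBA16F and two RGBA8 targets needs 192 bits per pixel.
gbuffer = [
    "VK_FORMAT_R16G16B16A16_SFLOAT",
    "VK_FORMAT_R16G16B16A16_SFLOAT",
    "VK_FORMAT_R8G8B8A8_UNORM",
    "VK_FORMAT_R8G8B8A8_UNORM",
]
print(fits_tile_buffer(gbuffer, "gen1"))  # False
print(fits_tile_buffer(gbuffer, "gen2"))  # True
```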

Bifrost Gen3

Examples

  • Mali-G76

References

3.1. Arm Mali-G76 Performance Counters Reference Guide, [backup]
3.2. Vulkan features for Mali-G76

Notes

  • Implements a two texel-per-clock and two pixel-per-clock shader core, with an increase in arithmetic performance to match. Not every GPU doubled the available performance, though.
  • G76: 8-wide SIMD per quad, 3 quads per core. [4]
  • AFBC full support [5]
  • 8 threads per warp [3.1]
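
Warp width affects how well a workload fills the SIMD lanes. The sketch below counts warps and idle lanes for a given thread count. The function names are hypothetical, and the 4-thread comparison width is an assumption based on the 4-wide quads of G71 noted for Gen1. Only the 8-thread width comes from [3.1].

```python
import math


def warps_needed(threads: int, warp_width: int) -> int:
    """Number of warps required to cover the given thread count."""
    return math.ceil(threads / warp_width)


def idle_lanes(threads: int, warp_width: int) -> int:
    """Lanes left unused in the final, partially filled warp."""
    return warps_needed(threads, warp_width) * warp_width - threads


# 100 threads on an 8-wide warp (G76) versus an assumed 4-wide warp.
print(warps_needed(100, 8), idle_lanes(100, 8))  # 13 4
print(warps_needed(100, 4), idle_lanes(100, 4))  # 25 0
```

Thread counts that are a multiple of 8 leave no idle lanes at either width, which is one reason to prefer such workgroup sizes.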

Bifrost (all gens)

References

  1. The Bifrost Shader Core, [backup]
  2. The Bifrost GPU architecture
  3. Mesa driver details
  4. Everything we learnt from hacking Arm Mali GPUs
  5. Arm GPU Best Practices Developer Guide, [backup]
  6. Arm GPU Datasheet, [backup]
  7. From Bifrost to Panfrost - deep dive into the first render

Notes

  • Scalar instruction set, unlike the vector (SIMD) instruction set of Midgard.

  • Mali GPUs can contain many identical shader cores. Each shader core supports hundreds of concurrently executing threads. [4]

  • Each shader core contains: [4]

    • One to three arithmetic pipelines or execution engines.
    • One load-store pipeline.
    • One texture pipeline.
  • AFBC (v1.2) with 4x4 block.

  • Transaction Elimination with 16x16 pixel block size.

  • From Bifrost onwards float blending is enabled. [5]

  • AFBC compatible formats: [5]

    • VK_FORMAT_R4G4B4A4_UNORM_PACK16
    • VK_FORMAT_B4G4R4A4_UNORM_PACK16
    • VK_FORMAT_R5G6B5_UNORM_PACK16
    • VK_FORMAT_R5G5B5A1_UNORM_PACK16
    • VK_FORMAT_B5G5R5A1_UNORM_PACK16
    • VK_FORMAT_A1R5G5B5_UNORM_PACK16
    • VK_FORMAT_B8G8R8_UNORM
    • VK_FORMAT_B8G8R8A8_UNORM
    • VK_FORMAT_B8G8R8A8_SRGB
    • VK_FORMAT_A8B8G8R8_UNORM
    • VK_FORMAT_A8B8G8R8_SRGB
    • VK_FORMAT_A8R8G8B8_SRGB
    • VK_FORMAT_B10G10R10A2_UNORM
    • VK_FORMAT_R4G4B4A4_UNORM
    • VK_FORMAT_R5G6B5_UNORM
    • VK_FORMAT_R5G5B5A1_UNORM
    • VK_FORMAT_R8_UNORM
    • VK_FORMAT_R8G8_UNORM
    • VK_FORMAT_R8G8B8_UNORM
    • VK_FORMAT_R8G8B8A8_UNORM
    • VK_FORMAT_R8G8B8A8_SRGB
    • VK_FORMAT_A8R8G8B8_UNORM
    • VK_FORMAT_R10G10B10A2_UNORM
    • VK_FORMAT_D24_UNORM_S8_UINT
    • VK_FORMAT_D16_UNORM
    • VK_FORMAT_D32_SFLOAT
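
The format list above can be kept as a lookup set for quick checks in tooling. This is a sketch only: the set mirrors the list in this document verbatim (including names as given by [5]), `is_afbc_compatible` is a hypothetical helper, and a format appearing here does not by itself guarantee that a driver will apply AFBC to a particular image.

```python
# AFBC-compatible formats, copied from the list in this document.
AFBC_FORMATS = frozenset({
    "VK_FORMAT_R4G4B4A4_UNORM_PACK16",
    "VK_FORMAT_B4G4R4A4_UNORM_PACK16",
    "VK_FORMAT_R5G6B5_UNORM_PACK16",
    "VK_FORMAT_R5G5B5A1_UNORM_PACK16",
    "VK_FORMAT_B5G5R5A1_UNORM_PACK16",
    "VK_FORMAT_A1R5G5B5_UNORM_PACK16",
    "VK_FORMAT_B8G8R8_UNORM",
    "VK_FORMAT_B8G8R8A8_UNORM",
    "VK_FORMAT_B8G8R8A8_SRGB",
    "VK_FORMAT_A8B8G8R8_UNORM",
    "VK_FORMAT_A8B8G8R8_SRGB",
    "VK_FORMAT_A8R8G8B8_SRGB",
    "VK_FORMAT_B10G10R10A2_UNORM",
    "VK_FORMAT_R4G4B4A4_UNORM",
    "VK_FORMAT_R5G6B5_UNORM",
    "VK_FORMAT_R5G5B5A1_UNORM",
    "VK_FORMAT_R8_UNORM",
    "VK_FORMAT_R8G8_UNORM",
    "VK_FORMAT_R8G8B8_UNORM",
    "VK_FORMAT_R8G8B8A8_UNORM",
    "VK_FORMAT_R8G8B8A8_SRGB",
    "VK_FORMAT_A8R8G8B8_UNORM",
    "VK_FORMAT_R10G10B10A2_UNORM",
    "VK_FORMAT_D24_UNORM_S8_UINT",
    "VK_FORMAT_D16_UNORM",
    "VK_FORMAT_D32_SFLOAT",
})


def is_afbc_compatible(format_name: str) -> bool:
    """Check a Vulkan format name against the list in this document."""
    return format_name in AFBC_FORMATS


print(is_afbc_compatible("VK_FORMAT_R8G8B8A8_UNORM"))       # True
print(is_afbc_compatible("VK_FORMAT_R16G16B16A16_SFLOAT"))  # False
```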