- A14
- M1
- Discover Metal enhancements for A14 Bionic
- Mesa driver details
- Dissecting the Apple M1 GPU: 1, 2, 3
- M1 Benchmarks
- M1 reverse engineering
- iGPU Cache Setups Compared, Including M1
- Reverse engineering the Apple G13 GPU architecture
- On M1, FP32 has the same ALU rate as FP16. [4]
- There is a penalty of exactly one cycle for switching between FP32 and FP16 operations. ref
- On A14 and earlier chips, the FP32 ALU rate is half the FP16 rate. On M1, the FP32 ALU rate relative to FP16 increased.
- Ray tracing (software).
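
To make the FP32/FP16 switching penalty noted above concrete, here is a toy cost model in Python. It is only a sketch under a strong assumption that is not from the cited sources: every ALU operation issues in exactly one cycle, and each change of precision between consecutive operations adds the one-cycle penalty. The function name `estimated_cycles` and the string labels are hypothetical, chosen for illustration.

```python
def estimated_cycles(precisions, switch_penalty=1):
    """Estimate issue cycles for a straight-line sequence of ALU ops.

    precisions: iterable of labels such as "fp16" or "fp32", one per op.
    Assumption (not from the source): each op costs one cycle.
    From the notes: each FP16<->FP32 switch costs one extra cycle.
    """
    cycles = 0
    previous = None
    for precision in precisions:
        # Charge the penalty only when the precision actually changes.
        if previous is not None and precision != previous:
            cycles += switch_penalty
        cycles += 1  # assumed one-cycle issue cost per op
        previous = precision
    return cycles


if __name__ == "__main__":
    grouped = ["fp16", "fp16", "fp32", "fp32"]
    interleaved = ["fp16", "fp32", "fp16", "fp32"]
    print(estimated_cycles(grouped))      # 4 ops + 1 switch  = 5
    print(estimated_cycles(interleaved))  # 4 ops + 3 switches = 7
```

Under this model, grouping operations of the same precision reduces the number of switches; whether real shader compilers or the hardware scheduler behave this way is not established by the notes.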
Local memory (L1): [6]
- size: 32KB
- latency: 43ns
- bandwidth: 671 GB/s
L2 Cache: [6]
- size: 1MB
- latency: 76.3ns
- bandwidth: 384 GB/s
System level cache (L3): [6]
- size: 8MB
- latency: 266ns
- bandwidth: 134 GB/s
RAM: [6]
- latency: 311ns
- bandwidth: 50.4 GB/s
CPU to GPU bandwidth: 17 GB/s [6]
GPU to CPU bandwidth: 17.5 GB/s [6]
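
One way to read the latency and bandwidth figures above together is through Little's law (data in flight = latency × throughput): it estimates how many bytes must be outstanding at each level to sustain the quoted bandwidth. The Python sketch below applies it to the numbers from [6]. It assumes that 1 GB/s means 10^9 bytes per second and that the latency and bandwidth figures were measured under comparable conditions, neither of which the notes confirm. The names `LEVELS`, `bytes_in_flight`, and `summarize` are illustrative.

```python
# Figures copied from the cache hierarchy notes above (source [6]).
# name: (latency in ns, bandwidth in GB/s)
LEVELS = {
    "L1 (local memory)": (43.0, 671.0),
    "L2": (76.3, 384.0),
    "SLC (L3)": (266.0, 134.0),
    "RAM": (311.0, 50.4),
}


def bytes_in_flight(latency_ns, bandwidth_gb_s):
    """Little's law: concurrency = latency * throughput.

    ns * GB/s = 1e-9 s * 1e9 B/s = bytes, so the product is already in bytes
    (assuming 1 GB = 1e9 bytes).
    """
    return latency_ns * bandwidth_gb_s


def summarize(levels=LEVELS):
    """Return the estimated bytes in flight for every level."""
    return {name: bytes_in_flight(lat, bw) for name, (lat, bw) in levels.items()}


if __name__ == "__main__":
    for name, outstanding in summarize().items():
        print(f"{name:18s} {outstanding / 1024:6.1f} KiB in flight")
```

For example, the L1 figures give 43 ns × 671 GB/s = 28,853 bytes, so roughly 28 KiB of requests would need to be outstanding to reach the quoted L1 bandwidth under these assumptions.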