Date: 2026-03-13
Toolchain cryptex: v17.5.5179.6 (dated 2026-03-03)
AIR-NT version: 32023.883 (metalfe-32023.883)
The Metal toolchain shipped with Xcode 26.4 beta 3 adds new Apple GPU architecture targets to the AIRNT offline compiler plugins. These targets were previously absent from all AIRNT plugin variants, blocking offline cross-compilation for M5 Pro and A19 Pro GPUs.
This update removes the last architecture gaps in the current Apple GPU lineup.
The main libapplegpu-nt.dylib plugin now reports 19 architectures via -archs:
applegpu_g13d applegpu_g13g applegpu_g13p applegpu_g13s
applegpu_g14d applegpu_g14g applegpu_g14p applegpu_g14s
applegpu_g15d applegpu_g15g applegpu_g15p applegpu_g15s
applegpu_g16g applegpu_g16p applegpu_g16s
applegpu_g17g applegpu_g17p applegpu_g17s ← NEW (g17s only)
applegpu_g18p ← NEW
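A quick way to check a target against this list is to filter the reported architectures. The sketch below assumes the list has already been captured as whitespace-separated text (here it is pasted from the output above, not queried from the plugin). The function names `count_archs` and `has_arch` are hypothetical helpers, not part of the toolchain.

```sh
#!/bin/sh
# Architecture list as reported above (pasted text, not a live plugin query).
ARCHS="
applegpu_g13d applegpu_g13g applegpu_g13p applegpu_g13s
applegpu_g14d applegpu_g14g applegpu_g14p applegpu_g14s
applegpu_g15d applegpu_g15g applegpu_g15p applegpu_g15s
applegpu_g16g applegpu_g16p applegpu_g16s
applegpu_g17g applegpu_g17p applegpu_g17s
applegpu_g18p
"

# count_archs: print the number of entries in the list.
count_archs() {
    printf '%s\n' "$ARCHS" | tr -s ' ' '\n' | grep -c .
}

# has_arch NAME: exit 0 if NAME is an exact entry in the list (hypothetical helper).
has_arch() {
    printf '%s\n' "$ARCHS" | tr -s ' ' '\n' | grep -qxF -- "$1"
}

echo "architectures: $(count_archs)"
for t in applegpu_g17s applegpu_g18p; do
    if has_arch "$t"; then echo "$t: present"; else echo "$t: missing"; fi
done
```

Matching on whole lines with `grep -x` avoids false positives from prefixes such as `applegpu_g17`.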
New target mapping:
| Target | Product | Generation |
|---|---|---|
| applegpu_g17s | M5 Pro / M5 Max | G17 |
| applegpu_g18p | A19 Pro | G18 |
Previously known targets (still present):
| Target | Product | Generation |
|---|---|---|
| applegpu_g17g | M5 | G17 |
| applegpu_g17p | A18 Pro | G17 |
| applegpu_g16g | M4 | G16 |
| applegpu_g16s | M4 Pro / M4 Max | G16 |
| applegpu_g16p | A18 | G16 |
| applegpu_g15g | M3 | G15 |
Before this update, cross-architecture shader compilation was blocked for:
| Target | Before | After |
|---|---|---|
| M5 Pro / M5 Max (g17s) | Not in any AIRNT plugin | Supported (main plugin) |
| A19 Pro (g18p) | Not in any AIRNT plugin | Supported (main plugin) |
This enables:
- Cross-arch ISA diffs covering M3 through A19 Pro (8 targets, 4 GPU generations)
- Instruction-level comparison between M5 (runtime) and M5 Pro (offline) to confirm ISA equivalence
- First look at G18 codegen without needing physical A19 Pro hardware
The main plugin shrank 37% while gaining new targets:
| Plugin | v17.3 (Xcode 26.3) | v17.5 (beta 3) | Delta |
|---|---|---|---|
| libapplegpu-nt.dylib | 168 MB | 106 MB | −37% |
| libapplegpu23-nt.dylib | 131 MB | 129 MB | −1.5% |
| libapplegpu24-nt.dylib | 171 MB | 169 MB | −1.2% |
| libapplegpuG9G12-nt.dylib | 182 MB | 181 MB | −0.5% |
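The Delta column can be reproduced from the two size columns. The following is a minimal sketch using `awk` for the arithmetic; `pct_delta` is a hypothetical helper name, and the inputs are the rounded MB figures from the table, so results are only accurate to about one decimal place.

```sh
#!/bin/sh
# pct_delta OLD NEW: percentage change from OLD to NEW, one decimal place.
pct_delta() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

pct_delta 168 106   # libapplegpu-nt.dylib      -> -36.9
pct_delta 131 129   # libapplegpu23-nt.dylib    -> -1.5
pct_delta 171 169   # libapplegpu24-nt.dylib    -> -1.2
pct_delta 182 181   # libapplegpuG9G12-nt.dylib -> -0.5
```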
-std=metal4.0 is now accepted by the Metal frontend:
xcrun metal -std=metal4.0 -c shader.metal -o shader.air
37 metallib files ship in the applegpu-nt/ directory, including:
- tex_atomic_emu_g17.metallib: texture atomics emulation for G17
- tensor.metallib: tensor operations support library
- vft_rt_gen1_agx3.metallib: AGX3-specific ray-tracing variant
- ei_rt_g16p_*.metallib: ray-tracing for G16 phone (A18)
- runtime.gen15.metallib: G15 runtime support
Ray-tracing coverage now spans G13 through G16P with per-stepping variants (a0/b0/c0).
The new targets live in the main libapplegpu-nt.dylib, not the versioned plugins. They require a macOS 26 AIR triple:
# 1. Compile Metal source to AIR v2.5
xcrun metal -std=macos-metal2.4 -mmacosx-version-min=13.0 \
-c -o shader.air shader.metal
# 2. Disassemble to LLVM IR
air-opt -S shader.air -o shader.ll
# 3. Patch triple to macOS 26
sed -i '' 's/macosx13.0.0/macosx26.0.0/g' shader.ll
# 4. Reassemble
air-opt shader.ll -o shader_patched.air
# 5. Link (use macos 13.0 platform_version to keep AIR v2.5 container)
air-lld -arch air64_v25 \
-platform_version macos 13.0 13.0 \
-o shader.metallib shader_patched.air
# 6. Compile with AIRNT main plugin
applegpu-nt \
-load libMTLPasses.dylib \
-load libapplegpu-nt.dylib \
-force-legacy-2024-arch \
-arch applegpu_g17s \
-N pipeline.mtlp-json \
shader.metallib \
-o output.metallibpackage
The pipeline script (pipeline.mtlp-json) is documented in the metal-pipelines-script(5) man page:
{"pipelines":{"compute_pipelines":[{"compute_function":"kernel_name"}]}}
Three AIRNT methods cover the full target space:
| Method | Targets | Plugin | Triple patch |
|---|---|---|---|
| airnt | M3, M4, A18 (G15/G16) | libapplegpu23-nt.dylib | none needed |
| airnt_v24 | M4Pro, A18Pro (G16s/G17p) | libapplegpu24-nt.dylib | macosx15.0.0 |
| airnt_main | M5Pro, A19Pro (G17s/G18p) | libapplegpu-nt.dylib | macosx26.0.0 |
The airnt_v24 and airnt_main methods require -force-legacy-2024-arch; the airnt method (v23 plugin) works without it.
M5 (g17g) uses gpu_compile at runtime on the local GPU and does not need AIRNT.
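The method table can be expressed as a small lookup, followed by the triple patch it implies. This is a sketch under stated assumptions: `airnt_method` and `patch_triple` are hypothetical helper names, only the targets named in this document are mapped, and the patch assumes the source triple contains `macosx13.0.0` as in the workflow above. It writes to a separate output file rather than using `sed -i`, whose syntax differs between BSD and GNU sed.

```sh
#!/bin/sh
# airnt_method ARCH: print "method plugin triple" for the targets listed above.
# Anything not named in the table is reported as unknown.
airnt_method() {
    case "$1" in
        applegpu_g15g|applegpu_g16g|applegpu_g16p)
            echo "airnt libapplegpu23-nt.dylib none" ;;
        applegpu_g16s|applegpu_g17p)
            echo "airnt_v24 libapplegpu24-nt.dylib macosx15.0.0" ;;
        applegpu_g17s|applegpu_g18p)
            echo "airnt_main libapplegpu-nt.dylib macosx26.0.0" ;;
        applegpu_g17g)
            # M5 compiles at runtime on the local GPU, no AIRNT plugin involved.
            echo "gpu_compile runtime none" ;;
        *)
            echo "unknown"
            return 1 ;;
    esac
}

# patch_triple IN.ll OUT.ll TRIPLE: rewrite macosx13.0.0 to TRIPLE in the
# disassembled IR, or copy the file unchanged when TRIPLE is "none".
patch_triple() {
    if [ "$3" = "none" ]; then
        cp "$1" "$2"
    else
        sed "s/macosx13\.0\.0/$3/g" "$1" > "$2"
    fi
}

airnt_method applegpu_g17s
airnt_method applegpu_g16s
```

A caller would split the printed fields, run `patch_triple shader.ll shader_patched.ll` with the third field, and pass the second field to the compiler's `-load` option.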
Instruction counts for cos_op across all 8 architectures:
M3 40 real (G15) via AIRNT offline (applegpu23-nt)
M4 38 real (G16) via AIRNT offline (applegpu23-nt)
M4Pro 38 real (G16) via AIRNT offline (applegpu24-nt)
A18 38 real (G16) via AIRNT offline (applegpu23-nt)
A18Pro 38 real (G17) via AIRNT offline (applegpu24-nt)
M5 37 real (G17) via gpu_compile (local M5 hardware)
M5Pro 37 real (G17) via AIRNT offline (applegpu-nt main)
A19Pro 37 real (G18) via AIRNT offline (applegpu-nt main)
M5, M5Pro, and A19Pro produce byte-identical ISA for cos_op. This is not a local-GPU artifact: M5 compiles via runtime gpu_compile on physical M5 hardware, while M5Pro and A19Pro compile via the AIRNT offline compiler with no GPU involvement. Three compilations across two independent paths (one runtime, two offline) converge on the same binary, which indicates that the G17/G18 compiler backend is unified.
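A byte-identity claim like this can be checked with `cmp` once the ISA for each target has been written to a file. The sketch below assumes that extraction has already happened (this document does not describe how); `same_bytes` is a hypothetical helper name.

```sh
#!/bin/sh
# same_bytes A B: print "identical" and exit 0 when the two files match
# byte for byte, print "different" and exit 1 otherwise.
# Exits 2 if either file cannot be read.
same_bytes() {
    if [ ! -r "$1" ] || [ ! -r "$2" ]; then
        echo "same_bytes: cannot read input files" >&2
        return 2
    fi
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}
```

With hypothetical file names, `same_bytes cos_op_m5.bin cos_op_m5pro.bin` would compare the runtime and offline results. `cmp -s` stays silent and reports only through its exit status, which keeps the helper's output to a single word.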
Additional kernels:
| Kernel | M3 (G15) | M4 (G16) | M5 (G17) | A19Pro (G18) |
|---|---|---|---|---|
| cos_op | 40 | 38 | 37 | 37 |
| exp_op | 18 | 16 | 14 | 14 |
| oracle_half_sin | 31 | 29 | 28 | 28 |
| int_heavy | 81 | 80 | 89 | 89 |
The int_heavy result is notable: G17/G18 emits more instructions than G15/G16 for integer-heavy workloads, suggesting different scheduling or instruction selection — not a simple "newer = fewer instructions" relationship.
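The per-kernel change between generations can be read off the table mechanically. The following is a minimal sketch with the counts copied from the table above; `gen_deltas` is a hypothetical helper name, and the comparison shown is G15 to G17.

```sh
#!/bin/sh
# Instruction counts copied from the table above.
# Columns: kernel, G15 (M3), G16 (M4), G17 (M5), G18 (A19Pro)
COUNTS="cos_op 40 38 37 37
exp_op 18 16 14 14
oracle_half_sin 31 29 28 28
int_heavy 81 80 89 89"

# gen_deltas: print each kernel with its signed G15 -> G17 instruction-count change.
gen_deltas() {
    printf '%s\n' "$COUNTS" | awk '{ printf "%s %+d\n", $1, $4 - $2 }'
}

gen_deltas
# cos_op -3
# exp_op -4
# oracle_half_sin -3
# int_heavy +8
```

Only int_heavy moves in the positive direction, which matches the observation above.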
Three Metal toolchain cryptex mounts are present on this machine:
| Version | Date | Notes |
|---|---|---|
| v17.3.7003.10 | 2026-02-17 | Xcode 26.3 stable |
| v17.5.5170.4 | 2026-02-16 | Beta 2 |
| v17.5.5179.6 | 2026-03-03 | Beta 3 (current) |