Google Native Client (NaCl) uses Software Fault Isolation (SFI) to sandbox untrusted native code. Each supported architecture has a unique SFI implementation. The closest existing analogy for ppc64le is the MIPS 32-bit port: both use fixed-length instructions, pure software isolation (no hardware segments), and dedicated mask registers.
NaCl currently supports: x86-32 (hardware segments), x86-64 (base register + zero-extension), ARM 32-bit (bic masking), and MIPS 32-bit (mask registers). No PPC support has ever existed.
- No unsafe instructions (syscalls, privileged ops forbidden)
- All indirect control flow sandboxed via mask-and-jump pseudo-instructions
- All memory accesses sandboxed (masked or otherwise constrained)
- Fixed-size instruction bundles; pseudo-instructions cannot straddle boundaries
- Guard regions catch off-by-displacement accesses near sandbox boundaries
- Trampolines provide the only controlled exit from untrusted to trusted code
| Parameter | Value | Rationale |
|---|---|---|
| Bundle size | 16 bytes (4 instructions) | NACL_BLOCK_SHIFT=4. Fixed 4-byte POWER instructions. Matches ARM/MIPS. Longest pseudo-instruction (naclret, 4 insns = 16 bytes) fits exactly. |
| Sandbox size | 1 GiB | NACL_MAX_ADDR_BITS=30. Matches ARM/MIPS model. Simpler than x86-64's 4 GiB + guard region approach. |
| Halt instruction | trap = 0x7FE00008 | Unconditional trap. Raises SIGTRAP, caught by NaCl signal handler. NACL_HALT_LEN=4. |
| NOP instruction | ori 0,0,0 = 0x60000000 | Standard POWER NOP encoding. |
| Control flow mask | 0x0FFFFFF0 | 256 MB code region, 16-byte aligned (clears upper bits + bottom 4 bits). |
| Data flow mask | 0x3FFFFFFF | 1 GiB data region. |
| Reserved registers | r13, r14, r15 | r13 = thread context pointer, r14 = control flow mask, r15 = data flow mask. |
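| Upper guard | 128 KB | Exceeds the largest D-form or DS-form displacement. Both reach ±32 KB: D-form carries a signed 16-bit immediate; DS-form carries a signed 14-bit field shifted left by 2, giving the same range at 4-byte granularity. |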
| Stack alignment | 16 bytes | ELFv2 ABI requirement. NACL_STACK_ALIGN_MASK = ~0xF. |
| Red zone | 288 bytes | ELFv2 ABI. NACL_STACK_RED_ZONE = 288. |
| Data segment start | 0x10000000 (256 MB) | Same as MIPS. Code region: 0–256 MB; data region: 256 MB–1 GiB. |
| Trampoline region | 0x10000–0x1FFFF (64 KB) | Standard NaCl trampoline area. |
| NACL_TRAMPRET_FIX | -8 | TBD based on detailed ABI analysis. |
| NACL_USERRET_FIX | -4 | TBD. |
| NACL_SYSARGS_FIX | 0 | TBD. |
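The two masks in the table can be modelled in plain C to check what they do to an arbitrary address. This is a minimal sketch, not part of any existing NaCl header: the macro names other than NACL_BLOCK_SHIFT and the helper function names are hypothetical, and the values are taken from the table.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the parameter table. */
#define NACL_BLOCK_SHIFT        4            /* 16-byte bundles */
#define NACL_CONTROL_FLOW_MASK  0x0FFFFFF0u  /* hypothetical macro name */
#define NACL_DATA_FLOW_MASK     0x3FFFFFFFu  /* hypothetical macro name */
#define NACL_DATA_SEGMENT_START 0x10000000u  /* hypothetical macro name */

/* The control flow mask must clear every bit below the bundle size. */
static_assert((NACL_CONTROL_FLOW_MASK & ((1u << NACL_BLOCK_SHIFT) - 1u)) == 0,
              "control flow mask must clear the low bundle bits");

/* Model of "and rX, rX, r14": force a branch target into the code region. */
static uint64_t sandbox_code_addr(uint64_t addr) {
    return addr & NACL_CONTROL_FLOW_MASK;
}

/* Model of "and rX, rX, r15": force a data address into the 1 GiB sandbox. */
static uint64_t sandbox_data_addr(uint64_t addr) {
    return addr & NACL_DATA_FLOW_MASK;
}

/* True if an address is bundle-aligned and below the data segment start. */
static int is_valid_branch_target(uint64_t addr) {
    uint64_t bundle_size = UINT64_C(1) << NACL_BLOCK_SHIFT;
    return (addr % bundle_size) == 0 && addr < NACL_DATA_SEGMENT_START;
}
```

Whatever value untrusted code places in a register, the masked result is bundle-aligned and inside the 256 MB code region (for r14) or inside the 1 GiB data region (for r15).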
# nacljmp rX — sandbox indirect branch via CTR
and rX, rX, r14 # Mask to 256 MB code region + 16-byte alignment
mtctr rX # Move to Count Register
bctr # Branch to CTR
# Total: 3 instructions = 12 bytes (fits 16-byte bundle with 1 NOP pad)

# naclret: sandbox return via LR
mflr rX # Move LR to GPR
and rX, rX, r14 # Mask to code region + alignment
mtctr rX # Move to CTR (not LR, to use bctr)
bctr # Branch to CTR
# Total: 4 instructions = 16 bytes (exactly fills a 16-byte bundle)

Note: Returns use CTR instead of LR to avoid an unsandboxed blr. The validator must ensure all returns go through this sequence.
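The validator side of nacljmp can be sketched as a pattern match over one bundle. The C below is an illustration only: the function and macro names are hypothetical, and it checks just the and/mtctr/bctr tail shared by nacljmp and naclret. The encodings used are the standard ones for and (primary opcode 31, extended opcode 28), mtctr (mtspr 9) and bctr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUNDLE_WORDS  4            /* 16-byte bundle of 4-byte instructions */
#define CODE_MASK_REG 14           /* r14 holds the control flow mask */
#define INSN_BCTR     0x4E800420u  /* bctr */

/* Encode "and rX, rX, rMask" (X-form, primary opcode 31, XO 28). */
static uint32_t encode_and(unsigned rx, unsigned rmask) {
    return 0x7C000038u | (rx << 21) | (rx << 16) | (rmask << 11);
}

/* Encode "mtctr rX" (mtspr 9, rX). */
static uint32_t encode_mtctr(unsigned rx) {
    return 0x7C0903A6u | (rx << 21);
}

/*
 * Return 1 if the bctr at bundle[idx] is immediately preceded, inside the
 * same bundle, by "and rX, rX, r14; mtctr rX" using one register rX.
 */
static int bctr_is_sandboxed(const uint32_t bundle[BUNDLE_WORDS], size_t idx) {
    assert(idx < BUNDLE_WORDS);
    if (bundle[idx] != INSN_BCTR || idx < 2)
        return 0;  /* not a bctr, or no room for the mask in this bundle */
    unsigned rx = (bundle[idx - 1] >> 21) & 0x1F;  /* RS field of mtctr */
    return bundle[idx - 1] == encode_mtctr(rx) &&
           bundle[idx - 2] == encode_and(rx, CODE_MASK_REG);
}
```

Because the check never looks outside the bundle, a bctr in slot 0 or slot 1 is always rejected. This is one way to express the rule that pseudo-instructions cannot straddle bundle boundaries.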
# Before any load/store with explicit address register
and rX, rX, r15 # Mask to 1 GiB data region
lwz rY, offset(rX) # Load word (D-form, ±32 KB displacement)

For loads/stores with a small immediate displacement off a known-safe register (SP/r1), the guard region (128 KB) exceeds the maximum displacement, so no masking is needed.
# After arbitrary SP changes
and r1, r1, r15 # Mask SP to data region

Small immediate adjustments to SP (e.g., addi r1, r1, -frame_size) don't need masking if the adjustment is within the guard region size.
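The guard-region argument comes down to a little arithmetic: how far past the end of the sandbox can a masked base plus a 16-bit displacement reach? The sketch below works that out for the upper guard only. The helper names are hypothetical, and the negative-displacement side is not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define DATA_FLOW_MASK INT64_C(0x3FFFFFFF)   /* r15 value from the table */
#define SANDBOX_SIZE   (INT64_C(1) << 30)    /* 1 GiB */
#define GUARD_SIZE     (INT64_C(128) * 1024) /* upper guard from the table */

/*
 * Number of bytes past the end of the sandbox touched by the last byte of
 * "and rX, rX, r15; <load/store> rY, disp(rX)" in the worst case, where the
 * masked base is the highest address the mask can produce.
 */
static int64_t max_overshoot(int16_t disp, int64_t access_size) {
    assert(access_size > 0);
    int64_t highest_base = DATA_FLOW_MASK;
    int64_t last_byte = highest_base + disp + access_size - 1;
    return last_byte - (SANDBOX_SIZE - 1);
}

/* True if the worst-case access still lands inside the upper guard. */
static int upper_guard_covers(int16_t disp, int64_t access_size) {
    return max_overshoot(disp, access_size) <= GUARD_SIZE;
}
```

With the largest positive displacement (32767) and an 8-byte access, the overshoot is 32774 bytes, well inside the 128 KB guard.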
The validator must reject all of the following:
| Category | Instructions |
|---|---|
| System calls | sc, scv (POWER9+) |
| Privileged returns | rfid, rfscv, rfebb |
| MSR access | mtmsr, mtmsrd, mfmsr |
| TLB/SLB management | tlbie, tlbiel, tlbia, slbie, slbia, slbmte, slbmfee, slbmfev |
| Privileged SPR writes | mtspr to SRR0, SRR1, DAR, DSISR, SPRG0-3, and other supervisor SPRs |
| Privileged SPR reads | mfspr from privileged SPRs (info leak prevention) |
| Hypervisor | hrfid, and all hypervisor-privileged instructions |
| Debug/attention | attn |
| Reserved register writes | Any instruction that modifies r13, r14, or r15 (except through approved pseudo-instruction sequences) |
| Prefixed instructions | All POWER10 8-byte prefixed instructions (pld, pstd, paddi, etc.) — 34-bit displacements (±8 GiB) exceed guard regions |
| Unsandboxed branches | bctr and bctrl not preceded by the required masking sequence within the same bundle; blr and blrl in any form, since returns are routed through CTR and no masked LR sequence is defined |
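A few rows of the table can be recognised from the primary and extended opcode fields alone. The sketch below shows that decoding step for a small subset (prefix words, sc/scv, the bclr family, rfid, hrfid, and the MSR instructions). It is written as a denylist purely to illustrate the table and is not a complete validator. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Primary opcode: the top 6 bits of every 32-bit POWER instruction word. */
static unsigned primary_opcode(uint32_t insn) { return insn >> 26; }

/* 10-bit extended opcode used by X-form and XL-form instructions. */
static unsigned extended_opcode(uint32_t insn) { return (insn >> 1) & 0x3FF; }

/* Return 1 for the subset of forbidden encodings this sketch knows about. */
static int is_forbidden(uint32_t insn) {
    switch (primary_opcode(insn)) {
    case 1:   /* POWER10 prefix word (pld, pstd, paddi, ...) */
    case 17:  /* sc, scv */
        return 1;
    case 19:  /* XL-form */
        switch (extended_opcode(insn)) {
        case 16:   /* bclr family: blr, blrl (returns must go through CTR) */
        case 18:   /* rfid */
        case 274:  /* hrfid */
            return 1;
        default:
            return 0;
        }
    case 31:  /* X-form */
        switch (extended_opcode(insn)) {
        case 83:   /* mfmsr */
        case 146:  /* mtmsr */
        case 178:  /* mtmsrd */
            return 1;
        default:
            return 0;
        }
    default:
        return 0;
    }
}
```

bctr and bctrl are deliberately not listed here, because whether they are allowed depends on the preceding instructions in the bundle, not on the encoding alone.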
Each trampoline slot is 32 bytes (NACL_SYSCALL_BLOCK_SHIFT=5), located at 0x10000 + (syscall_num * 32):
# Trampoline slot (8 instructions, 32 bytes)
# Located at 0x10000 + (syscall_num * 32)
li r0, SYSCALL_NUM # Load syscall number into r0
lis r12, HI(NaClSyscallSeg) # Load upper half of target (PATCHED at load time)
ori r12, r12, LO(NaClSyscallSeg) # Load lower half (PATCHED at load time)
mtctr r12
bctr # Jump to syscall handler
nop # Pad to 32 bytes
nop
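nop

The lis/ori pair is patched at load time with the actual address of NaClSyscallSeg. This is the same pattern used by the MIPS port. Note that lis/ori materializes only a sign-extended 32-bit value, so this slot layout assumes NaClSyscallSeg is mapped below 2 GiB; a full 64-bit address would need a longer load sequence.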
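The slot address calculation and the load-time patch of the lis/ori pair could look roughly like this on the loader side. This is a sketch under assumptions: the function names and TRAMPOLINE_BASE are hypothetical, the slot is treated as an array of eight instruction words, and the target is assumed to fit in 31 bits because lis sign-extends.

```c
#include <assert.h>
#include <stdint.h>

#define TRAMPOLINE_BASE          0x10000u  /* start of the trampoline region */
#define NACL_SYSCALL_BLOCK_SHIFT 5         /* 32-byte slots */

/* Address of the trampoline slot for a given syscall number. */
static uint32_t trampoline_slot_addr(uint32_t syscall_num) {
    /* The 64 KB region holds 2048 slots of 32 bytes each. */
    assert(syscall_num < (0x10000u >> NACL_SYSCALL_BLOCK_SHIFT));
    return TRAMPOLINE_BASE + (syscall_num << NACL_SYSCALL_BLOCK_SHIFT);
}

/* Encode "lis r12, hi16" (addis r12, 0, hi16; primary opcode 15). */
static uint32_t encode_lis_r12(uint16_t hi16) { return 0x3D800000u | hi16; }

/* Encode "ori r12, r12, lo16" (primary opcode 24). */
static uint32_t encode_ori_r12(uint16_t lo16) { return 0x618C0000u | lo16; }

/* Patch slot words 1 and 2 (the lis/ori pair) with the handler address. */
static void patch_trampoline(uint32_t slot[8], uint32_t target) {
    /* lis sign-extends, so this sketch only supports targets below 2 GiB. */
    assert(target < 0x80000000u);
    slot[1] = encode_lis_r12((uint16_t)(target >> 16));
    slot[2] = encode_ori_r12((uint16_t)(target & 0xFFFFu));
}
```

Because the low half is merged with ori rather than addi, the high half needs no carry adjustment: the two halves are simply the upper and lower 16 bits of the address.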
#include <stdint.h>

struct NaClThreadContext {
/* Callee-saved GPRs (ELFv2: r14-r31, but r14/r15 are mask regs) */
uint64_t r16, r17, r18, r19, r20, r21, r22, r23;
uint64_t r24, r25, r26, r27, r28, r29, r30, r31;
/* Special registers */
uint64_t r1; /* stack pointer */
uint64_t r2; /* TOC pointer */
uint64_t prog_ctr; /* resume PC */
uint64_t lr; /* link register */
uint64_t cr; /* condition register (callee-saved fields) */
/* NaCl control */
uint64_t sysret;
uint64_t new_prog_ctr;
uint64_t trusted_stack_ptr;
uint32_t tls_idx;
uint64_t tls_value1;
uint64_t tls_value2;
uint64_t guard_token;
uint64_t syscall_routine;
/* FPU state (callee-saved: f14-f31) */
double f14, f15, f16, f17, f18, f19, f20, f21;
double f22, f23, f24, f25, f26, f27, f28, f29, f30, f31;
/* VSX/AltiVec state (callee-saved: v20-v31) */
__uint128_t v20, v21, v22, v23, v24, v25, v26, v27;
__uint128_t v28, v29, v30, v31;
};

r14 and r15 are not saved; they always hold the mask constants and are reloaded by NaClSwitch.
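The assembly refers to context fields through symbolic offsets such as SP_OFFSET and GUARD_TOKEN_OFFSET. One way to keep those in sync with the C layout is to derive them with offsetof. The sketch below does this for the leading fields only. The struct name NaClThreadContextPrefix and the CTX_OFFSET macro are hypothetical, and the numeric values in the comments assume an LP64 target where uint64_t is 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Leading fields of NaClThreadContext, in the same order as the struct. */
struct NaClThreadContextPrefix {
    uint64_t r16, r17, r18, r19, r20, r21, r22, r23;
    uint64_t r24, r25, r26, r27, r28, r29, r30, r31;
    uint64_t r1, r2, prog_ctr, lr, cr;
    uint64_t sysret, new_prog_ctr, trusted_stack_ptr;
    uint32_t tls_idx;  /* followed by 4 bytes of padding on LP64 */
    uint64_t tls_value1, tls_value2, guard_token, syscall_routine;
};

#define CTX_OFFSET(field) offsetof(struct NaClThreadContextPrefix, field)

/* Offsets the assembly listings refer to symbolically. */
enum {
    R16_OFFSET           = CTX_OFFSET(r16),               /* 0 */
    R31_OFFSET           = CTX_OFFSET(r31),               /* 120 */
    SP_OFFSET            = CTX_OFFSET(r1),                /* 128 */
    TOC_OFFSET           = CTX_OFFSET(r2),                /* 136 */
    LR_OFFSET            = CTX_OFFSET(lr),                /* 152 */
    CR_OFFSET            = CTX_OFFSET(cr),                /* 160 */
    NEW_PROG_CTR_OFFSET  = CTX_OFFSET(new_prog_ctr),      /* 176 */
    TRUSTED_STACK_OFFSET = CTX_OFFSET(trusted_stack_ptr), /* 184 */
    GUARD_TOKEN_OFFSET   = CTX_OFFSET(guard_token)        /* 216 */
};

/* ld and std are DS-form, so their displacement must be a multiple of 4. */
static_assert(GUARD_TOKEN_OFFSET % 4 == 0, "ld needs a 4-byte aligned offset");
static_assert(SP_OFFSET % 4 == 0, "std needs a 4-byte aligned offset");
```

The 32-bit tls_idx is followed by padding, so tls_value1 starts 8 bytes after it rather than 4. Hand-maintained offsets are easy to get wrong at exactly this kind of spot.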
NaClSyscallSeg:
# Validate r13 via guard_token
ld r12, GUARD_TOKEN_OFFSET(r13)
cmpdi r12, EXPECTED_GUARD_VALUE
bne abort
# Save untrusted callee-saved GPRs into NaClThreadContext
std r16, R16_OFFSET(r13)
std r17, R17_OFFSET(r13)
# ... r18-r31 ...
std r31, R31_OFFSET(r13)
std r1, SP_OFFSET(r13)
std r2, TOC_OFFSET(r13)
mflr r12
std r12, LR_OFFSET(r13)
mfcr r12
std r12, CR_OFFSET(r13)
# Save callee-saved FPRs
stfd f14, F14_OFFSET(r13)
# ... f15-f31 ...
stfd f31, F31_OFFSET(r13)
# Save callee-saved VRs (v20-v31) if AltiVec/VSX is used
# stvx v20, 0, rSCRATCH (with appropriate addressing)
# ...
# Switch to trusted stack
ld r1, TRUSTED_STACK_OFFSET(r13)
# Load trusted TOC
ld r2, TRUSTED_TOC_ADDR # From a known fixed location
# Call NaClSyscallCSegHook(natp)
mr r3, r13 # First argument = thread context pointer
bl NaClSyscallCSegHook
nop # TOC restore nop (ELFv2 convention)

NaClSwitch:
# r3 = pointer to NaClThreadContext
# Clear caller-saved FPRs to prevent info leaks (f0-f13)
xxlxor vs0, vs0, vs0 # Clear vs0, which overlays f0 (v0 is vs32, cleared separately)
# ... clear f1-f13, v0-v19 ...
# Restore callee-saved GPRs
ld r16, R16_OFFSET(r3)
# ... r17-r31 ...
ld r31, R31_OFFSET(r3)
ld r1, SP_OFFSET(r3)
ld r2, TOC_OFFSET(r3)
ld r12, LR_OFFSET(r3)
mtlr r12
ld r12, CR_OFFSET(r3)
mtcr r12
# Restore callee-saved FPRs
lfd f14, F14_OFFSET(r3)
# ... f15-f31 ...
lfd f31, F31_OFFSET(r3)
# Restore callee-saved VRs (v20-v31)
# lvx v20, 0, rSCRATCH
# ...
# Load mask registers (constants)
lis r14, 0x0FFF
ori r14, r14, 0xFFF0 # r14 = 0x0FFFFFF0 (control flow mask)
lis r15, 0x3FFF
ori r15, r15, 0xFFFF # r15 = 0x3FFFFFFF (data flow mask)
# Set thread context pointer
mr r13, r3
# Jump to untrusted code
ld r12, NEW_PROG_CTR_OFFSET(r3)
mtctr r12
bctr

POWER has two indirect branch mechanisms: bctr (Branch to Count Register) and blr (Branch to Link Register). Both must be sandboxed. The proposed design routes all indirect branches through CTR, including returns (which normally use blr). This means the compiler must emit a 4-instruction return sequence (mflr + and + mtctr + bctr) instead of a single blr. The validator must reject any bare blr instruction.
The ELFv2 ABI uses r2 as the TOC (Table of Contents) pointer for position-independent code. Untrusted NaCl code needs its own TOC for function calls and global variable access. This imposes three requirements:
- r2 must be saved and restored at sandbox transitions (it is included in NaClThreadContext)
- Untrusted code must be allowed to modify r2 (it is not a reserved register)
- The trusted runtime must use a separate TOC, loaded from a known location during NaClSyscallSeg
TOC-relative loads (ld rX, offset(r2)) are safe as long as r2 points within the sandbox and the displacement is within the guard region.
POWER uses lwarx/stwcx. (load-reserved/store-conditional) and ldarx/stdcx. pairs for atomics. For these pairs:
- The memory operands must be sandboxed (masked with r15)
- The pair should ideally stay within the same bundle as its address mask, so that no bundle-aligned branch target can fall between the mask and either access. A context switch between the pair only clears the reservation; the stwcx. then fails and the surrounding loop retries
- The mask must come before the lwarx, not between the pair. An and + lwarx + stwcx. sequence is 3 instructions (12 bytes), which fits in a 16-byte bundle
Proposed pattern:
and rADDR, rADDR, r15 # Mask address
lwarx rDATA, 0, rADDR # Load-reserved
stwcx. rNEW, 0, rADDR # Store-conditional
# 3 instructions = 12 bytes, fits in 16-byte bundle

CAS (compare-and-swap) loops that include a branch will span multiple bundles, which is fine: only the lwarx/stwcx. pair and its address masking need to be in the same bundle.
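A validator could recognise the proposed three-instruction pattern by decoding the X-form register fields. The sketch below is illustrative only, and its function names are hypothetical. It expects the caller to pass three consecutive words from a single bundle. It also rejects the case where lwarx loads into the address register itself, since stwcx. would then use an unmasked address.

```c
#include <assert.h>
#include <stdint.h>

#define DATA_MASK_REG 15  /* r15 holds the data flow mask */

/* Field extractors for X-form instructions (primary opcode 31). */
static unsigned x_f1(uint32_t insn) { return (insn >> 21) & 0x1F; } /* RT/RS */
static unsigned x_f2(uint32_t insn) { return (insn >> 16) & 0x1F; } /* RA */
static unsigned x_f3(uint32_t insn) { return (insn >> 11) & 0x1F; } /* RB */
static unsigned x_xo(uint32_t insn) { return (insn >> 1) & 0x3FF; }

static int is_x31(uint32_t insn, unsigned xo) {
    return (insn >> 26) == 31 && x_xo(insn) == xo;
}

/*
 * Return 1 if words[0..2] are
 *     and    rA, rA, r15
 *     lwarx  rD, 0, rA
 *     stwcx. rS, 0, rA
 * with the same address register rA throughout and rD != rA.
 */
static int is_sandboxed_word_atomic(const uint32_t words[3]) {
    if (!is_x31(words[0], 28) ||   /* and */
        !is_x31(words[1], 20) ||   /* lwarx */
        !is_x31(words[2], 150))    /* stwcx. */
        return 0;
    unsigned ra = x_f2(words[0]);  /* destination register of the and */
    if (x_f1(words[0]) != ra || x_f3(words[0]) != DATA_MASK_REG)
        return 0;
    /* In "lwarx rD, 0, rA" the RA field is 0 and rA sits in the RB field. */
    return x_f2(words[1]) == 0 && x_f3(words[1]) == ra &&
           x_f2(words[2]) == 0 && x_f3(words[2]) == ra &&
           x_f1(words[1]) != ra &&  /* lwarx must not clobber the address */
           (words[2] & 1u) != 0;    /* stwcx. always has the record bit set */
}
```

The same shape would apply to ldarx/stdcx. with their own extended opcodes, which are left out to keep the sketch short.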
POWER10 introduced 8-byte "prefixed" instructions (e.g., pld, pstd, paddi) with 34-bit immediate fields, enabling displacements up to ±8 GiB. These must be forbidden entirely in the initial port:
- They break the fixed 4-byte instruction assumption
- 34-bit displacements far exceed any practical guard region
- The validator would need to handle variable-length instruction decoding
Future work could allow prefixed instructions with validated displacements, but this significantly complicates the validator.
POWER has different signal frame layouts than x86/ARM. The NaCl signal handler that catches sandbox faults (SIGSEGV, SIGTRAP from trap halt instructions) needs to:
- Understand the POWER ucontext_t/mcontext_t structures
- Extract the faulting PC and register state
- Determine if the fault is within the sandbox (expected) or indicates a bug
The POWER ISA is large. The instruction encoding uses a 6-bit primary opcode with extended opcodes in varying bit positions across many instruction formats (I, B, SC, D, DS, DQ, DX, X, XL, XFX, XFL, XS, XO, A, M, MD, MDS, VA, VC, VX, etc.). The recommended approach for the validator is an allowlist (only permit known-safe instructions) rather than a denylist. The validator is estimated at 5,000–15,000 lines of code, making it the largest component of the port.
| Component | Estimated Complexity | Estimated LOC |
|---|---|---|
| Instruction validator | Very High | 5,000–15,000 |
| sel_ldr arch support (trampoline, syscall, context switch, address space) | High | 2,000–4,000 (asm+C) |
| LLVM/Clang NaCl backend (ppc64le-nacl code generation) | Very High | 3,000–10,000 |
| nacl_config.h additions | Low | ~50 |
| Daemon CMake integration | Low | ~30 |
| IRT compilation | Medium | Build system work |
| Testing infrastructure | High | Significant |
Total estimated effort: Several engineer-months minimum.
- NaCl SFI paper: "Native Client: A Sandbox for Portable, Untrusted x86 Native Code" (Yee et al., 2009)
- ARM NaCl SFI: "Adapting Software Fault Isolation to Contemporary CPU Architectures" (Sehr et al., 2010)
- MIPS NaCl: chromium.googlesource.com/native_client/src/native_client, src/trusted/service_runtime/arch/mips/
- POWER ISA: "Power ISA Version 3.1" (IBM, 2020)
- ELFv2 ABI: "64-Bit ELF V2 ABI Specification: Power Architecture" (OpenPOWER Foundation)
- Daemon NaCl integration: github.com/DaemonEngine/Daemon, src/engine/framework/VirtualMachine.cpp
- DaemonEngine NaCl fork: github.com/DaemonEngine/native_client
- Saigo toolchain: github.com/DaemonEngine/DaemonSaigoNativeClientToolkit