H. Overman (ee) Independent Security & Systems Research — Euman Language Ecosystem
"The hardware was working perfectly. That was the problem."
Here is what happened on February 25, 1991, in Dhahran, Saudi Arabia.
A Patriot missile battery had been running continuously for 100 hours. Its internal clock measured time in tenths of a second as an integer. To convert to seconds, the integer was multiplied by 1/10 using a 24-bit fixed-point register. The number 1/10 does not have a terminating binary representation. The register stored a truncated approximation. The error was approximately 0.000000095 seconds per tenth of a second — invisible in any single measurement, invisible in any test, invisible in any review.
Over 100 hours of continuous operation, the accumulated error reached 0.34 seconds. A Scud missile travels at 1,676 meters per second. A 0.34-second timing error displaced the predicted intercept point by 573 meters. The system classified the incoming Scud as a false alarm. No intercept was fired. The Scud struck a barracks. 28 American soldiers were killed.
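The arithmetic above can be reproduced directly. A minimal Python sketch, assuming the register kept 23 fractional bits — the layout that reproduces the cited per-tick error; the exact register format is an assumption here:

```python
from math import floor

# 1/10 chopped into a fixed-point register. 23 fractional bits is an
# assumption chosen because it reproduces the cited ~0.000000095 s error.
FRACTIONAL_BITS = 23
stored_tenth = floor(0.1 * 2**FRACTIONAL_BITS) / 2**FRACTIONAL_BITS
per_tick_error = 0.1 - stored_tenth     # ~9.5e-8 s lost per tenth-second tick

ticks = 100 * 3600 * 10                 # 100 hours of tenth-second ticks
accumulated = per_tick_error * ticks    # ~0.34 s of clock drift
offset = accumulated * 1676             # Scud velocity in m/s: ~570 m displaced
print(f"{per_tick_error:.2e} s/tick -> {accumulated:.2f} s -> {offset:.0f} m")
```

The drift only exists as the product of error and duration: invisible per tick, decisive per hundred hours.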
The hardware was working perfectly. The compiler had no warnings. The tests had passed.
The software did exactly what the notation said. The notation did not say what anyone intended. The gap between what the notation encoded and what the engineers meant was 0.34 seconds, accumulated over 100 hours, never written down, never enforced, never visible until it was irreversible.
This paper is about that gap.
Hardware executes instructions. It does not know what they mean.
This is not a limitation of current hardware. It is the fundamental nature of computation. A processor that executes IMUL r1, r2 does not know whether the result represents a velocity, a time, a financial index, or a missile intercept offset. It multiplies. It stores the result. If the result overflows, it sets a flag — which the program may or may not check. If the result is slightly wrong due to representational error, it stores the slightly wrong value with the same confidence as a correct one.
Hardware enforces mechanics. It does not enforce meaning.
Every failure in the historical record demonstrates this in a different register:
The Patriot battery faithfully accumulated 0.000000095 seconds of error per tick across 3,600,000 ticks. The invariant — that the clock must track real elapsed time — was never stated as a structural constraint. Hardware had no mechanism to enforce it. It computed exactly what it was told.
Ariane 5 faithfully converted a 64-bit floating-point value to a 16-bit signed integer. The value exceeded 32,767, the maximum representable in a 16-bit signed integer. The hardware raised an Operand Error exception, as specified. The autopilot received the exception signal and interpreted it as navigation data, as coded. The rocket destroyed itself at roughly 3,700 meters. Every piece of hardware performed its specified function correctly.
The Pentium FDIV bug produced slightly wrong results for specific input combinations. Not randomly wrong. Wrong in a specific, reproducible pattern traceable to missing entries in a division lookup table — five entries absent from a table of 2,048, affecting a specific subset of dividend/divisor combinations. Hardware was certified IEEE 754 compliant. The certification was for the specification, not for correctness. These are different claims.
The Vancouver Stock Exchange faithfully truncated floating-point results on each of approximately 3,000 daily transactions for 22 months. No transaction was wrong in any detectable way. The accumulated error over 22 months caused the index to read 520 when it should have read 1,099. Hardware cannot accumulate error incorrectly. It accumulates exactly the error the representation carries.
Sleipner A faithfully computed finite element stresses from the mesh the engineers provided. The mesh was too coarse: shear stresses in critical wall sections were underestimated by roughly 47%, and the walls were built accordingly thinner than they needed to be. At 65 meters depth, the platform sank. Hardware had no knowledge of what the numbers meant in concrete and water.
Mars Climate Orbiter faithfully computed thruster corrections from the values it received. One subsystem sent pound-force seconds. The receiving subsystem expected newton-seconds. The conversion factor is 4.45. Nine months of corrections were 4.45 times smaller than actual. Hardware cannot enforce units. Units are meaning. Hardware enforces mechanics.
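A hedged sketch of what an enforced unit boundary could look like. The wrapper types are illustrative, not the mission software, and the 4.45 factor is the one quoted in the text (the precise constant is closer to 4.448):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NewtonSeconds:
    value: float

@dataclass(frozen=True)
class PoundForceSeconds:
    value: float
    def to_newton_seconds(self) -> NewtonSeconds:
        return NewtonSeconds(self.value * 4.45)   # lbf*s -> N*s, explicit

def apply_correction(impulse: NewtonSeconds) -> float:
    # Receiver's contract: newton-seconds only.
    return impulse.value

sent = PoundForceSeconds(10.0)
corrected = apply_correction(sent.to_newton_seconds())  # explicit crossing
# apply_correction(sent)   # wrong unit type: visible at the boundary,
#                          # not nine months downstream
```

The unit lives in the notation, so the boundary cannot silently mix systems of measurement.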
The pattern across every failure is identical: hardware executed the notation faithfully. The notation did not carry the meaning the engineers intended. The gap between notation and meaning is where people die and spacecraft are destroyed and platforms sink.
The common framing is: write for the compiler. The compiler will tell you when the code is wrong.
This is the wrong surface for the problem.
The compiler checks syntax. It checks types. It checks some classes of undefined behavior. These are valuable. They are not sufficient.
The compiler does not know:
- That your clock must track real elapsed time, not just count ticks
- That your velocity value cannot safely narrow from 64 bits to 16 bits at the trajectory your rocket will actually fly
- That your division lookup table has five missing entries
- That your financial index must round rather than truncate
- That your thruster force values are in pound-force seconds when the receiver expects newton-seconds
In every case above, the compiler had no warning to emit. The code was syntactically correct. The types were consistent. The programs compiled clean.
The compiler checks that the notation is internally consistent. It does not check that the notation matches the intent.
That check belongs to the contract.
But hardware adds what the compiler cannot: hardware will execute wrong code across the full operational envelope, including the conditions your tests never covered. The Patriot was not tested for 100 hours of continuous operation at combat tempo. The Ariane 5 SRI was tested on Ariane 4 flight trajectories — which never produced the velocity value that caused the overflow. The Pentium was tested by sampling across a 2^128 input space. Sampling is not exhaustive. Hardware is.
Hardware executes whatever inputs the operational envelope delivers, without regard for test coverage. Tests cover a sample. Only the contract can make a claim that holds across the full operational envelope, because only the contract states what must be true regardless of which inputs arrive.
Throughout this paper, Gate-X refers to an explicit structural error signal — one of four states (Z: unreached, X: failed, 0: valid-deny, 1: valid-allow) — that fires at the point where a structural invariant is violated and propagates causally through the call chain. Unlike a hardware exception that may be misinterpreted, a Gate-X is a named, typed, checkable value. The system cannot proceed past a Gate-X without explicitly handling it.
255 and 0xFF are the same number.
The number does not change. The surface you read it from does.
In decimal, the byte boundary is hidden. In hex, it is right there on the surface. You did not compute your way to it. The structure announced itself from the right projection.
This is the entire discipline in one example. Everything else is a consequence of it.
A surface is a projection of invariant structure onto something you can see and reason about. The multiplication table mod 10 projected as a color grid makes modular arithmetic visible. An exact integer ratio basis — 144,000 = 2^7 × 3^2 × 5^3, a number exactly divisible by every integer from 1 to 6 and from 8 to 10 (not 7 — 7 is not a factor of the basis), and by all common engineering fractions — projected onto exact arithmetic makes representational error structurally impossible. The FRONT/LEAD/REAR contract projected onto a function makes its three standpoints impossible to miss.
The wrong projection hides the invariant. IEEE 754 hides that 0.1 is not exactly representable. A float tick_interval = 0.1 hides that this value will accumulate error over time. A 16-bit signed integer type hides that it cannot represent all values a 64-bit float might produce. These are not hidden by accident. They are hidden by projection — the decimal surface was chosen, and the decimal surface hides the hex.
The right projection makes the invariant impossible to miss. TICK_INTERVAL_NUM = 1, TICK_INTERVAL_DEN = 10 — two integers, exact, no approximation possible. The projection reveals what the decimal hid. The Patriot's failure is visible in the representation before a single line of logic is written.
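The two projections can be run side by side. A minimal sketch, with Python's Fraction standing in for an exact integer ratio implementation:

```python
from fractions import Fraction

TICK_INTERVAL_NUM, TICK_INTERVAL_DEN = 1, 10   # 1/10 s per tick, exact
ticks = 100 * 3600 * 10                        # 100 hours of tenth-second ticks

# Exact projection: two integers, no approximation to accumulate.
elapsed = Fraction(ticks * TICK_INTERVAL_NUM, TICK_INTERVAL_DEN)
assert elapsed == 360_000                      # exactly 100 hours, in seconds

# IEEE 754 projection of the same computation: drift is structural.
drifting = sum(0.1 for _ in range(ticks))
assert drifting != 360_000.0                   # never lands exactly
```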
Three rules for choosing a projection:
1. State the invariant before choosing the surface. The invariant is what does not change under your operations. It precedes the representation. If you cannot state the invariant in one sentence, you do not have one yet. You have a description. Keep dissolving. The invariant of the Patriot's clock: the computed time must equal real elapsed time within the system's tracking tolerance across the full operational deployment window. One sentence. It was never written. Hardware had no surface to project it onto.
2. The surface must make the invariant visible without explanation. A surface that requires explanation is the wrong surface. IEEE 754 requires the explanation: "0.1 is not exactly representable in binary, and errors accumulate over repeated addition." Exact integer ratios require no explanation: these are integers, integers are exact, multiplication of integers is exact, there is no error to accumulate. The right representation makes the invariant impossible to violate silently. The wrong representation executes the violation faithfully.
3. Multiple projections of the same invariant must agree. If two systems project the same invariant onto different surfaces, and their boundary values agree, this is structural proof — not coincidence. If they disagree, one has the wrong invariant or the wrong projection. The agreement is the surface check on the invariant. Two independently developed systems that both converge on 144,000 as the exact-representability basis are not coinciding — they are reading the same structure from different angles.
Seven academic studies establish the engineering consequences of the projection principle, each from a different angle.
Bloch's "How to Design a Good API and Why It Matters" demonstrates that naming decisions compound forward in time. A wrong name at T=0 trains every caller to reason incorrectly from T=1 to T=N. The interest grows with call count and deployment duration.
The Patriot's float tick_adjustment = 0.1 is a name carrying the wrong projection. Every computation downstream inherited the gap between what the notation said and what was meant. By T=100 hours, 3,600,000 multiplications had compounded the error. Hardware does not discount old errors. It accumulates them faithfully.
Bloch did not analyze the Patriot battery — his paper addresses API design, not weapons systems. But the failure mode the battery illustrates is exactly what his finding predicts: a representation chosen before the invariant was named, compounding forward in time at hardware speed.
The principle: a constant without derivation is decimal pretending to be hex. It hides the structure it came from. If you cannot derive the constant from the formula that produced it, the formula or the constant is wrong. Every threshold, every magic number, every coefficient must carry: here is the formula, here is the Python one-liner that produces it, here is the CONFIRMED marker. Hardware will use the constant in every execution for the operational life of the system. The derivation is the only mechanism to catch wrong constants before hardware runs them.
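As a sketch of the convention, using the 144,000 basis from this paper as the constant:

```python
# Formula: basis = 2^7 * 3^2 * 5^3. One-liner derivation, machine-checked.
BASIS = 144_000
assert BASIS == 2**7 * 3**2 * 5**3            # derivation, not assertion
# CONFIRMED: exactly divisible by 1-6 and 8-10; 7 is not a factor.
assert all(BASIS % n == 0 for n in (2, 3, 4, 5, 6, 8, 9, 10))
assert BASIS % 7 != 0
```

A reviewer who doubts the constant reruns the derivation instead of trusting the number.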
The second study's empirical finding: missing preconditions cause more downstream bugs than all other documentation failures combined. A companion finding: a precondition stated only in a comment is skipped by the majority of developers under deadline pressure.
Ariane 5 is the canonical example. The precondition for safe conversion of horizontal velocity to 16-bit integer: the value must not exceed 32,767. This was known. It was enforced in four of seven conversions. In the three unprotected cases, the reasoning was: this value never exceeded 32,767 on Ariane 4, so protection is unnecessary. The unstated assumption — that Ariane 5 dynamics would produce the same value range as Ariane 4 — was the missing precondition. Hardware flew Ariane 5 on Ariane 5's trajectory. The assumption was wrong.
The enforcement class determines reality:
| Class | Mechanism | When it fires |
|---|---|---|
| 1 | static_assert | Before the binary exists |
| 2 | Gate-X check at function entry | Every call, every context |
| 3 | Comment | When the developer reads it |
| 4 | Engineering judgment | When the engineer is present and correct |
Ariane 5 relied on Class 4. Hardware operated in Class 0 — all inputs, all contexts, all operational envelopes. Every precondition belongs in the highest enforcement class it can reach. A precondition that lives only in a comment is not a precondition. It is a note to the reader that hardware will execute through on the first unexpected input.
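A Class 2 gate for the narrowing that Ariane 5 left unprotected might look like the following sketch; the function name and error type are illustrative:

```python
INT16_MIN, INT16_MAX = -32768, 32767

def narrow_to_int16(x: float) -> int:
    # FRONT precondition in Class 2: checked on every call, in every
    # context, not left to engineering judgment about the flight envelope.
    if not (INT16_MIN <= x <= INT16_MAX):
        raise OverflowError(f"{x} exceeds 16-bit signed range")
    return int(x)

narrow_to_int16(1_200.0)     # inside the Ariane 4 envelope: passes
# narrow_to_int16(32_768.0)  # Ariane 5 envelope: the gate fires
#                            # instead of emitting garbage
```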
The third study's taxonomy identifies six knowledge types in documentation. Pattern knowledge — how multiple components compose — is the most underrepresented type. Individual function contracts cannot substitute for it.
Mars Climate Orbiter is the pattern-knowledge failure. Every function worked correctly. The pound-force to newton-second conversion was correct for its assumed units. The trajectory computation was correct for its received values. The problem was at the composition boundary: the interface between two subsystems carried no unit specification. The canonical pattern for data crossing a subsystem boundary — state the units, state the range, state the contract between systems — was not stated.
Nine months of correct individual computations composed into a fatal systematic error because the composition pattern was missing. Hardware composed the subsystems in the order they were called. Hardware had no access to the engineers' intent about what units should cross the boundary.
The principle: every module boundary must carry a composition pattern — the canonical sequence of calls, the data that crosses each boundary, the units of each value. This is not redundant with individual contracts. Individual contracts say what each function does. The composition pattern says how they must be chained. Without the pattern, hardware executes whatever composition the code produces.
The fourth study's finding: debt that is not named and measured will not be paid. Teams that plan to fix it later never do.
Vancouver ran for 22 months with a truncation error per transaction of sub-penny magnitude. Hardware executed 3,000 transactions per day, faithfully, for 22 months. The fix was one line of code. The decision not to fix it was ordinary human deferral of a cost that felt small in isolation.
Hardware does not defer. It executes at hardware speed. A sub-penny truncation per transaction, compounded across roughly 3,000 daily transactions for 22 months, became an index reading of 520 where 1,099 was correct.
The principle: every approximation must be named, given a maximum error bound, and tracked. Not the approximations that feel important — all of them. "Close enough per transaction" is not a verifiable claim. "Maximum error per transaction is 0.001; over 66,000 transactions maximum accumulated error is 66; we accept this" is a verifiable claim. If you cannot write the second form, you have not measured the debt. Hardware will measure it for you, in the field, at scale, in the conditions you did not test.
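The difference between the two claims can be measured. A hedged simulation with illustrative pseudo-random four-decimal updates, not the exchange's real transactions:

```python
import random
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

random.seed(0)
exact = truncated = rounded = Decimal("1000.000")
for _ in range(20_000):
    delta = Decimal(random.randint(-9999, 9999)) / 10_000  # four-decimal move
    exact += delta                                         # full precision
    truncated = (truncated + delta).quantize(Decimal("0.001"), ROUND_DOWN)
    rounded = (rounded + delta).quantize(Decimal("0.001"), ROUND_HALF_EVEN)

drift_truncated = exact - truncated   # one-sided bias: grows linearly
drift_rounded = exact - rounded       # unbiased: stays near zero
```

Truncation loses about half a thousandth per update on average, roughly 9 points over these 20,000 updates; rounding drifts by a small random walk that stays within a fraction of a point. Per update, both losses are invisible.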
The measurement instrument is the provability record: every claim either machine-verifiable (compile, run, grep) or hand-derivable (execute this derivation, compare to the constant). Anything that cannot be verified by either method is unverified debt that hardware will eventually collect.
The fifth study's finding: code decay concentrates in frequently modified files that lack contract updates. The files most likely to carry silent assumptions are the ones most recently changed.
The Patriot battery was identified as having timing drift eleven days before the attack. A patch was developed. The patch arrived the day after. The gap between identification and remediation was eleven days of continued operation with a known decaying time reference.
The original notation failure was the approximation of 1/10. The secondary failure was operational: once the drift was known, the system continued executing the known-incorrect computation for eleven days. Hardware does not pause because engineers are working on a fix. It executes.
The principle: every commit that modifies a function must verify the function's contract is still correct. Not that the code compiles. That the preconditions, the named operation, and the guarantees still accurately describe the post-change function. A contract that does not track with code changes decays. In hardware-adjacent systems, decay executes at hardware speed.
The sixth study's finding: rename is the most dangerous refactoring because it changes the surface without changing the logic. The regression test for a refactoring is the contract, not the output.
The Ariane 5 SRI reuse is a refactoring decision with fatal consequences. The component was taken from Ariane 4 and reused in Ariane 5. Output behavior was preserved — same code, same logic, same output format. The contract was not preserved — same code, operating in a different flight dynamic envelope, producing values outside the original contract's domain.
Before-state: horizontal velocity on Ariane 4 trajectories, bounded by Ariane 4 dynamics, always within 16-bit range. After-state: horizontal velocity on Ariane 5 trajectories, unbounded by Ariane 4 assumptions, exceeding 16-bit range.
The invariant — that horizontal velocity would remain convertible — was never stated. When the deployment context changed, hardware executed in the new context with no awareness of the old assumptions.
The principle: every structural change carries a transformation record. Before-state, after-state, invariant both preserve, evidence the invariant holds in the new context. A commit message that says "refactor" satisfies none of these. A commit that says "confirm_hash seed: fixed constant → per-session random — invariant preserved: hash output is unpredictable to attacker without seed access" satisfies all of them. Hardware will execute the post-change code in all contexts. The transformation record is the proof that all contexts are safe.
The seventh study's finding: Coccinelle's semantic patch model works at 30-million-line scale because every transformation states the before-pattern, the after-pattern, and the invariant both preserve.
The Pentium FDIV bug is a hardware-level failure of this principle. Intel's table generation was a transformation from the SRT division algorithm. Five entries were missing from the division lookup table — the transformation from the SRT algorithm to the table was applied incorrectly to a specific subset of the input space. The transformation was not stated as an equivalence with a verifiable proof. It was implemented, tested by sampling, and shipped.
Testing by sampling of a 2^128 input space is not a proof. Hardware executed the incorrect entries faithfully for every computation that landed in that region, for the operational life of every affected Pentium. The untested cases were the ones that were wrong.
The principle: every transformation is a proof obligation, not an assertion. State the before-pattern, state the after-pattern, state why they are equivalent under the named invariant. Without the proof, the transformation is a guess that hardware will test in the full input space. With the proof, hardware executing faithfully is hardware executing correctly.
From the seven proofs and the hardware record, a coherent standard emerges.
Before writing code, five questions must be answered in order. Hardware can enforce none of them. Only notation can.
L0: What mathematical domain? The domain determines which approximations are acceptable and which are fatal. IEEE 754 is the wrong domain for decimal fractions — it approximates 1/10, and hardware executes that approximation faithfully across 3,600,000 ticks. Exact integer ratios are the right domain — represent 1/10 as the integer pair (1, 10), multiply the tick count by that pair using integer arithmetic, and hardware cannot accumulate representational error because there is none to accumulate. The domain choice is made before the first line of code. Hardware implements the domain you chose.
L1: What is the invariant — one sentence? The invariant is what does not change under the operations you are implementing. If you need two sentences, you have two invariants or a description. For the Patriot's clock: the computed time must equal real elapsed time within tracking tolerance across the full operational deployment window. This was never stated. Hardware cannot enforce an unstated invariant.
L2: Which projection makes the invariant visible?
The same invariant can project onto many surfaces. Choose the one that makes the invariant impossible to miss. float tick_interval = 0.1 hides the approximation. uint64_t tick_num = 1, tick_den = 10 makes it visible — these are integers, they are exact, there is no approximation to hide.
L3: What is the name?
A name applied before L0-L2 is a label. A name applied after L0-L2 are known is notation. Labels go stale because they were assigned before the structure was understood. Notation does not go stale because it was derived from the invariant. thruster_force_ns versus thruster_force — one carries the unit in the name, one does not. Hardware executes both identically. The name is for the reader. The reader is the next engineer who will write the next function that uses this value.
L4: What are the operations? The operations that preserve the invariant. Each one has three standpoints that must be named precisely:
FRONT: What the caller must provide. Not "input data." The specific precondition, in the highest enforcement class it can reach. If it can be a static_assert, it must be. If it requires runtime information, it must be a Gate-X check at function entry. If it lives only in a comment, it is Class 3 enforcement and hardware will execute through it on the first unexpected input.
LEAD: The specific operation that collapses the before-state and the after-state. Not "processes." The cross-multiply that avoids division. The mprotect syscall that makes a memory region read-only. The GF(2) matrix multiply that extracts privacy-amplified bits. This is the Möbius crossing — the single point where orientation inverts, where the speculative becomes authoritative, where Gate-X fires if the structure fails. Not before. Not after. At the crossing.
REAR: What is guaranteed after commit. The sealed PROT_READ slab. The normalized ratio. The permuted bit array. This is what the next function's FRONT will read. If REAR and the next FRONT describe different things, the functions do not compose correctly — and hardware will execute the composition regardless.
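The three standpoints, sketched on a deliberately small function. GateX here is an illustrative stand-in for the paper's structural error signal:

```python
from fractions import Fraction
from math import gcd

class GateX(Exception):
    """Structural failure signal, fired at the crossing."""

def normalize_ratio(num: int, den: int) -> Fraction:
    # FRONT: the caller must provide a nonzero denominator.
    if den == 0:
        raise GateX("FRONT violated: den == 0")
    # LEAD: the single committing operation -- divide out the gcd.
    g = gcd(num, den)
    result = Fraction(num // g, den // g)
    # REAR: guaranteed lowest terms; the next function's FRONT may rely on it.
    assert gcd(result.numerator, result.denominator) == 1
    return result
```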
The contract operator encodes the mapping geometry:
- (AS/.\IS) — bijective. The function is reversible. Round-trip preserves identity.
- (AS/--\WAS) — lossy. Information is intentionally destroyed. Hash functions. Projections. Classifications.
- -- — procedural. Not a data mapping. State construction. Side effects. Tests.
A type signature tells you what types flow through a function. A contract tells you what the function means. Hardware executes the type flow. Only the contract carries the meaning.
L5: Write the code. If L0-L4 are correct, L5 is largely mechanical — the implementation is the translation of L4 operations into syntax. If L5 is hard, that difficulty is a diagnostic signal pointing upward to the level where an assumption is wrong. Do not push through difficulty at L5. Read it. The Ariane 5 engineers' difficulty in protecting all seven conversions within the CPU budget was a signal from L1: the invariant was not achievable in the chosen implementation context. The signal was managed as a resource constraint. Hardware flew through the gap.
A contract is not a pair of endpoints. It is a single closed surface.
Take the precondition state (FRONT) as one open system. Take the postcondition state (REAR) as another open system. Join them at the operation (LEAD) with a half-twist. The result is a Möbius strip — one face, no inside or outside, no front or back. There is only the traversal.
The practical consequence: a contract without all three standpoints named is an open system. The caller knows what to provide (FRONT) and what to expect (REAR), but not what commits the transition (LEAD). That gap — the unnamed pivot — is where Gate-X must fire if the invariant fails. If the pivot is not named, the failure point is unspecified, and error handling cannot fire at the right place. A function with only a FRONT and a REAR is two separate open systems pretending to be connected. Hardware will execute the composition regardless, but without the named LEAD, the interface's orientation is undefined.
Gate-X is causal, not terminal. It fires at the point where the structure fails — not at the point where you notice. The Ariane 5 SRI emitted an exception at the moment of overflow — at the LEAD. The autopilot received diagnostic data at the REAR and acted on it as navigation data. The Gate-X signal fired correctly. The downstream system had no contract specifying that this signal meant abort rather than navigate. The missing contract between the SRI and the autopilot is the missing Möbius — two open systems that hardware connected without the half-twist that would have made the interface's orientation undeniable.
A label is assigned by the programmer. A measurement is read off the structure that already exists.
These look identical from outside. They have opposite failure modes.
A stored classification goes stale silently. A computed measurement is wrong at call time, in front of you.
The categories in the Erdős-Selfridge prime categorization engine are not assigned — they are depths, measured recursively off the structure of primes. If you stored them as fields to be set, they would drift whenever the underlying computation changed. Because they are computed on demand from the invariant, they cannot drift. The structure is the invariant. The computation reads it.
The principle generalizes to any system: if you find yourself storing a classification that could be computed, ask why it is being stored. The answer is almost always performance — but stored classifications have a maintenance cost that computed measurements do not. A stored classification that is not updated when the underlying structure changes is technical debt that hardware will execute faithfully until someone notices it is wrong.
If you can compute it, compute it. Nothing stores what can be computed. Nothing recomputes what is already in scope. The complement: if the structure has already computed what you need, pass the pointer. The second measurement is waste.
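The opposite failure modes are easy to demonstrate. A minimal sketch with an illustrative classification:

```python
from dataclasses import dataclass

@dataclass
class Stored:
    value: int
    sign: str                 # label, assigned once -- can go stale

@dataclass
class Computed:
    value: int
    @property
    def sign(self) -> str:    # measurement, read at call time
        return "neg" if self.value < 0 else "nonneg"

s, c = Stored(5, "nonneg"), Computed(5)
s.value = -3                  # the underlying structure changes
c.value = -3
assert s.sign == "nonneg"     # stale silently: nobody updated the label
assert c.sign == "neg"        # cannot drift: read off the structure
```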
The data layer and the operation layer are separate. Violation of this is the source of most architectural bugs.
A hash table does not contain sorting logic. A tree does not contain traversal strategy. A prime number does not contain its own categorization function. The algebra operates on the structure. The structure does not know about the algebra.
When a function exists in two forms — one stateful, one pure — it should exist only in the pure form. The stateful wrapper is a one-liner at the call site. Putting state into the function that should be stateless is putting the water into the vessel's wall. The vessel becomes unable to hold different water. The function becomes unable to operate on different state.
This is a projection principle. The wrong projection entangles the vessel and the water, making both invisible. The right projection keeps them separate, making both visible. Hardware executing a function that has entangled data and operation is executing a function whose invariant cannot be stated cleanly — because the invariant changes depending on which state was initialized before the call.
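A sketch of the separation, with the tick clock from earlier sections as the illustrative state:

```python
def elapsed_ratio(ticks: int, num: int, den: int) -> tuple[int, int]:
    """Pure: elapsed time as an exact integer ratio. No hidden state."""
    return ticks * num, den

class Clock:
    """The stateful wrapper is a one-liner at the call site."""
    def __init__(self) -> None:
        self.ticks = 0
    def tick(self) -> tuple[int, int]:
        self.ticks += 1
        return elapsed_ratio(self.ticks, 1, 10)
```

elapsed_ratio can be tested and reasoned about against any state; Clock merely holds the water.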
Causal sequence is not a preference. It is a structural requirement that hardware will either enforce or violate, with no middle ground.
MEASURE before CORRECT. SIFT before RECONCILE. RECONCILE before AMPLIFY.
In a quantum key distribution protocol: QBER must be measured on the raw sifted key, before error correction. Measuring after error correction always shows near-zero error — the correction removed the errors, so the measurement is meaningless. The security property — that the QBER threshold determines whether Eve's information is bounded — depends entirely on measuring the right thing at the right time. If phase ordering is wrong, the threshold check compiles, runs, passes, and does nothing. Hardware will execute it exactly as many times as the protocol runs. The security property will appear to hold until it matters that it does not.
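One way to make the ordering structural rather than conventional is to tag the phases in the types. A hedged sketch, not a real QKD implementation; the classes and threshold are illustrative:

```python
class RawKey:
    """Sifted, uncorrected: the only phase on which QBER is meaningful."""
    def __init__(self, bits: list, errors: int):
        self.bits, self.errors = bits, errors

class CorrectedKey:
    """Post-correction: an error measurement here is near zero and meaningless."""
    def __init__(self, bits: list):
        self.bits = bits

def measure_qber(key: RawKey) -> float:
    return key.errors / len(key.bits)   # MEASURE consumes the raw phase only

def correct(key: RawKey) -> CorrectedKey:
    return CorrectedKey(key.bits)       # stand-in for error correction

raw = RawKey(bits=[0, 1] * 500, errors=30)
qber = measure_qber(raw)                # measured before correction: 0.03
corrected = correct(raw)
# measure_qber(corrected)               # wrong phase: a type checker rejects it
```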
Phase ordering enforced in the architecture cannot be violated. Write-once memory sealed between phases makes phase ordering a physical property of the system. A downstream thread that attempts to write to a sealed slab receives SIGSEGV. Hardware enforces it. Not convention. Not documentation. The operating system.
When phase ordering can be enforced in hardware, enforce it in hardware. When it cannot, enforce it in the type system. When it cannot be enforced structurally, enforce it with Gate-X checks at each phase boundary. When none of these are available, at minimum state it as a verified invariant in the code and verify it in the self-test suite.
A phase ordering bug is invisible from the logic surface. It is only visible from the structure surface — the causal graph that must exist before the code. This is why the graph must be drawn before the code is written. The code cannot show you the phase ordering error. The graph can.
You do not need to protect every layer.
You need to identify and seal the single boundary that all data must pass through. Everything above a sealed choke point is protected by construction — not by convention, not by audit. By the seal itself.
The choke point has three states: Buffer → Page → Stream.
Buffer is raw, mutable, unverified. Data exists here before the invariant is confirmed. Mutation is possible. Corruption is possible. This state is dangerous by definition.
Page is sealed — mprotect(PROT_READ) after the phase commits. The colon between Buffer and Page is the seal event. Post-seal write is SIGSEGV. Not convention. The operating system enforces it. Before the seal: speculative. After: authoritative. The invariant is confirmed at the moment of sealing.
Stream is what flows out of the seal — read-only to all downstream consumers. The data is exactly what it was when the seal was applied. No consumer can corrupt it. No race can touch it. The seal is the proof.
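The Buffer to Page transition can be sketched from user space. This assumes Linux (a libc with mprotect, page-aligned anonymous mmap) and is illustrative, not the ecosystem's actual runtime:

```python
import ctypes
import mmap

libc = ctypes.CDLL(None, use_errno=True)    # Linux assumption
PAGE = mmap.PAGESIZE

buf = mmap.mmap(-1, PAGE)                   # Buffer: anonymous, mutable, unverified
buf[:5] = b"hello"                          # phase work happens here

# The seal event: Buffer becomes Page. Writes after this are SIGSEGV.
addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
if libc.mprotect(ctypes.c_void_p(addr), PAGE, mmap.PROT_READ) != 0:
    raise OSError(ctypes.get_errno(), "mprotect failed")

data = bytes(buf[:5])                       # Stream: reads still succeed
# buf[0:1] = b"X"                           # post-seal write: the OS kills it
```

The commented-out write is not caught by convention or review; the operating system refuses it.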
Finding the choke point: ask what is the single boundary that all data in this system must cross exactly once. Seal it. Everything above it is protected.
Why this works: a sealed page cannot be modified by any thread, any signal, any race condition, any use-after-free, any buffer overflow writing into the wrong region. You do not need to audit the code that reads the sealed page. You need to audit the code that writes it — once, at the seal. The seal collapses the audit surface from the entire codebase to the single commit operation.
The relationship to Gate-X: Gate-X fires at the LEAD crossing — the moment Buffer becomes Page. If the invariant fails at the crossing, Gate-X propagates and the seal does not happen. Buffer remains Buffer. Downstream consumers receive Gate-X, not a corrupted Page. The choke point and the Gate-X point are the same boundary.
The choke point principle scales. One sealed door protects one phase. Six sealed doors in sequence protect the system.
1D.Stream : Frame : Buffer : MM : IO : NN
Stream — data in motion, unstructured, untrusted by definition. Nothing enters Frame without passing the Stream door.
Frame — structured unit. The framing seal confirms structure before content is interpreted. Malformed frames do not reach Buffer.
Buffer — mutable working memory, sealed to Page on commit. After sealing, Buffer becomes Page. Nothing written after the seal survives.
MM — memory management. The allocator is the door. Data that was never allocated cannot be reached.
IO — the hardware boundary. Data that crosses IO is explicitly requested. The interface to physical devices and the kernel.
NN — the outermost envelope. The boundary between this system and everything external to it.
The invariant of the door stack: if it is inside, it is meant to be inside. This is not a policy statement. It is a structural proof. A sealed door cannot be crossed by accident. Data inside a sealed layer did not arrive by chance — the architecture placed it there deliberately. Presence inside a sealed door is the proof of authorization. You do not need a separate security audit. The architecture is the audit.
Utilization is the only check: not "is this data safe?" but "is this data in the correct layer?" The check is O(1). The proof is structural.
Most security failures occur because a boundary was never named. Buffer overflows reach memory that was never sealed. Injection attacks reach IO because Frame did not seal. Side-channel attacks read MM because the MM door was open. Seal all six doors in order and these attack surfaces do not exist. A path that cannot cross a sealed door does not exist. You are not hardening the system. You are removing the attack surface.
For adversarial contexts, three inner levels extend the stack:
L+1 — Temporal sealing. Not just correct layer, but correct time. Phase ordering is a door. Data that crosses a boundary out of causal order is structurally rejected — Gate-X at the phase boundary, not at the point of misuse.
L+2 — Identity sealing. Not just correct layer at correct time, but produced by the correct process. Provenance is a gate state. Every structural change states its before-state, after-state, equivalence, and evidence. Unattributed data does not cross the identity door.
L+3 — Invariant sealing. Not just provenance, but proof of invariant preservation across every transformation. At L+3, security is not checked. It is read off the surface. The structure announces it. The same way hex makes the byte boundary visible.
Each level collapses the previous level's audit into a structural property. The scaling principle: audit the doors. Seal the doors. The paths protect themselves.
The provability record is the artifact required to pass the L+3 invariant sealing door. A system that cannot produce a provability record cannot demonstrate an unbroken invariant chain from origin to present. L+3 does not accept assertions. It accepts evidence.
Because hardware operates before, during, and after review, every claim in a codebase must be verifiable without the author's presence.
Every claim is either:
- Machine-verifiable: compile the code, run the test, grep the output
- Hand-derivable: execute this derivation, compare to the constant
Nothing else. A claim that cannot be verified by either method is an assertion. Assertions are where silent debt lives. Hardware will find the cases where the assertion is wrong.
The specific mechanisms:
- Every constant has a derivation: the formula, a runnable snippet, a CONFIRMED marker
- Every numerical table has three spot-checks against the formula
- Every self-test runs before any external interaction and halts on failure
- Every contract has all three standpoints named specifically
- Every structural change has a transformation record
The record is not overhead. It is the only artifact that describes what hardware will execute across the full operational envelope, not just the cases that were tested.
Profoundly correct code does not just work. It makes it structurally impossible for hardware to execute the wrong thing without a detectable signal.
Every precondition in the highest enforcement class it can reach. A violated static_assert never produces a binary — hardware never sees the violation. Hardware that violates a Gate-X check — an explicit four-state error signal (Z: unreached / X: failed / 0: valid-deny / 1: valid-allow; see Part II) that propagates causally through the call chain — returns an error on that call, not a wrong value downstream. Hardware that violates a comment-only precondition executes faithfully through the violation for the operational life of the system.
Every constant with a derivation. Hardware that receives a wrong constant will compute with it faithfully forever. The derivation is the only mechanism to catch wrong constants before hardware runs them. A table without spot-checks is a table that may be wrong in exactly the way the Pentium FDIV table was wrong — in specific regions of the input space that tests did not cover.
Every composition pattern stated. Hardware composes functions in the order they are called, on the inputs they receive, across the full operational envelope. The pattern is the statement of the correct composition. Without it, hardware executes whatever composition the code produces. Mars Climate Orbiter executed 4.45× the correct correction for nine months because the composition pattern — state the units at every subsystem boundary — was not stated.
Every transformation recorded. Hardware executes post-transformation code on all inputs, not just the inputs that motivated the transformation. The Ariane 5 SRI reuse was a transformation without a record: before-context, after-context, the invariant both contexts must preserve, and evidence that the invariant holds in the new context. Without the record, hardware executed in the Ariane 5 context with Ariane 4 assumptions.
Every choke point sealed. Hardware cannot corrupt data above a sealed boundary. The seal is the proof that the invariant held at the moment of commit. mprotect(PROT_READ) is not a performance hint. It is a structural guarantee that the operating system enforces. Every phase boundary in a multi-phase system is a choke point. Every choke point must be sealed before the next phase reads it. The six-door stack — Stream:Frame:Buffer:MM:IO:NN — is the full audit surface. Presence inside a sealed door proves authorization. You do not audit paths. You audit doors.
Every error signal explicit and causal. The four gate states are not interchangeable. 0 (valid-deny) is a business logic rejection — the structure is intact, the invariant holds, the answer is no. An authentication check that correctly rejects an invalid credential returns 0. X (failed) is a structural violation — the invariant is broken, the operation cannot complete, the state is undefined. A null pointer dereference, an integer overflow, a precondition violated — these return X. Conflating them produces systems that treat structural failures as recoverable business conditions and let corrupted state propagate. Name which gate state applies at every LEAD crossing before writing the function body.
NaN propagates silently. Gate-X propagates explicitly. When the Ariane 5's conversion overflowed, the hardware raised an exception. The exception was a correct Gate-X signal. The signal was not handled as an abort because no contract specified that this signal meant abort. The missing contract made the correct Gate-X invisible to the receiving system. Hardware executed faithfully through the gap.
The measure of profoundly correct code is: when hardware encounters an input that no human anticipated, it either executes correctly, or it returns a named, detectable, propagated error signal. It does not execute wrong values silently. It does not accumulate error across 3,600,000 operations. It does not convert overflow into navigation data.
This is achievable. Exact integer arithmetic does not approximate — hardware executing integer multiplication cannot accumulate representational error because there is none. Gate-X at function entry cannot be bypassed — hardware executing the function always executes the check. static_assert cannot be silenced — a binary that fails the assert is never produced, so there is nothing for hardware to execute.
Each of these is a mechanism by which correct notation makes hardware execution faithful to intent rather than merely faithful to mechanics.
Test suites end. Code reviews end. Deployments ship. Hardware keeps executing.
Hardware executes in the field, at operational tempo, under conditions not anticipated in testing, for durations not covered in the design specification, on inputs no engineer considered. Hardware does not stop to ask whether the code is correct for this particular input in this particular context at this particular time. It executes.
The gap between what the notation says and what the engineers meant is the gap where hardware operates. Every undeclared precondition, every unverified constant, every missing composition pattern, every unstated transformation equivalence is a region of that gap. Hardware will find it — not by searching, but by executing. Eventually, in the field, under operational load, the gap will be an input, and hardware will execute through it faithfully.
The engineering discipline described in this paper is not about satisfying reviewers or passing audits. It is about closing the gap so that hardware, executing faithfully through all inputs across all operational conditions, has no gap to find.
When the notation carries meaning — when preconditions are in the highest enforcement class, when constants have derivations, when transformations have proof obligations, when composition patterns are stated, when error signals are explicit and causal — hardware executing faithfully is hardware executing correctly. The mechanics and the meaning are the same thing.
The reader who finishes this paper and writes float tick_interval = 0.1 in a system that will run for 100 hours should feel the weight of what they just did. They handed hardware a gap. Hardware will execute through it exactly as many times as the system runs, as faithfully as the Patriot battery executed through its gap, until the accumulated error reaches 573 meters.
Seal the choke point. The higher layers protect themselves.
The notation is the spec. Hardware reads the spec literally, completely, and without mercy.
Write accordingly.
- Bloch, J. (2006). How to Design a Good API and Why It Matters. OOPSLA 2006 Companion. ACM.
- Uddin, G. & Robillard, M.P. (2015). How API Documentation Fails. IEEE Software, 32(4), 68–75.
- Maalej, W. & Robillard, M.P. (2013). Patterns of Knowledge in API Reference Documentation. IEEE Transactions on Software Engineering, 39(9), 1264–1282.
- Ernst, N.A. et al. (2015). Measure It? Manage It? Ignore It? FSE 2015. ACM.
- Eick, S.G. et al. (2001). Does Code Decay? IEEE Transactions on Software Engineering, 27(1), 1–12.
- Mens, T. & Tourwé, T. (2004). A Survey of Software Refactoring. IEEE Transactions on Software Engineering, 30(2), 126–139.
- Lawall, J. & Muller, G. (2018). Coccinelle: 10 Years of Automated Evolution in the Linux Kernel. USENIX ATC 2018.
- Arnold, D.N. (1992). The Patriot Missile Failure. University of Minnesota.
- Ariane 501 Inquiry Board (1996). Report by the Inquiry Board. ESA/CNES.
- General Accounting Office (1992). Patriot Missile Defense: Software Problem Led to System Failure at Dhahran. GAO/IMTEC-92-26.
- NASA/JPL (1999). Mars Climate Orbiter Failure Board Report.
- Overman, H. (2026). The 144,000 Ratio System: Why IEEE 754 Cannot Be Trusted. Euman Language Ecosystem.
© 2026 H. Overman (ee). MIT License — attribution required.
The author thanks LeeMetaXTRON https://gist.github.com/LeeMetaX for mentorship and technical guidance.