| one question: is a contract revert only detectable from the transaction receipt? or can it be detected from just sendTransaction? | |
| currently, the problem with batch tx submission is that it is a single loop, sequential process with retry. Instead, let's explore | |
| pipelining architecture where the initial submission of the batch doesn't block, and the failures are handled on a separate queue | |
| (will we need a failed tx handling dedicated task generator signer key that is separate from the main submission signer key?) | |
| /clarify-plan | |
| ⏺ Skill(/clarify-plan) | |
| ⎿ Successfully loaded skill | |
| ⏺ Good questions. Let me first answer your initial question, then clarify the design space. | |
| ★ Insight ───────────────────────────────────── | |
| Contract reverts are only detectable from the receipt (or simulation), not from sendTransaction itself. eth_sendTransaction / | |
| eth_sendRawTransaction only validates the transaction envelope (nonce, gas, signature) — it doesn't execute the EVM. The EVM | |
| execution happens during mining, and the result is in the receipt (status: 0x0 = revert). The only way to detect a revert before | |
| mining is eth_call (simulation), which is what simulate_batch_create_and_respond already does post-failure. | |
| ───────────────────────────────────────────────── | |
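A minimal sketch of the detection point, assuming alloy-style bindings (module paths and method names approximate alloy 0.x and may differ by version; did_revert is an illustrative helper, not an existing API):

    use alloy::network::ReceiptResponse;
    use alloy::primitives::B256;
    use alloy::providers::Provider;

    // Returns Some(true) if the TX mined and reverted, Some(false) if it mined
    // successfully, and None while it is still pending. A successful
    // eth_sendRawTransaction tells you none of this; only the receipt (or an
    // eth_call simulation before submission) can.
    async fn did_revert<P: Provider>(provider: &P, tx_hash: B256) -> Option<bool> {
        match provider.get_transaction_receipt(tx_hash).await.ok()? {
            Some(receipt) => Some(!receipt.status()), // status == false => revert (0x0)
            None => None, // not mined yet: undetectable from submission alone
        }
    }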
| Now for the pipelining architecture — let me clarify requirements: | |
| Batch 1: Current bottleneck and goals | |
| The current sequential loop does: submit batch TX → wait for receipt (up to 60s) → parse failures → strip poison → retry. During | |
| that entire wait, no new batches can flush. Is the primary goal to decouple "submit" from "wait for receipt" so the accumulator can | |
| keep flushing new batches while previous ones are in-flight? Or is there a different bottleneck you're seeing? | |
| On the signer key question — the current architecture already routes all on-chain TXs through a single AvsWriter per chain to | |
| avoid nonce races. If we pipeline (multiple batches in-flight simultaneously), the nonce manager needs to handle concurrent pending | |
| TXs. Are you thinking of a dedicated retry signer (separate key, separate nonce stream) specifically for resubmitting failed items, | |
| so the main submission pipeline is never blocked by retry logic? Or more like nonce pipelining (multiple TXs in-flight on the same | |
| key)? | |
| What's the target throughput you're aiming for? The current setup does ~2-4 batch TXs per minute on Sepolia under congestion. Is | |
| the goal 10x (20-40/min), or is it more about reducing the tail latency (the 60s receipt timeout blocking everything)? | |
| ✻ Cogitated for 34s | |
❯ 1. the bottleneck is the wait-for-receipt time plus retries, which add up quickly and make all subsequent batches too old for | |
task creation on chain. But we must ensure the nonces of all transactions stay sequential, which is the challenging part. | |
2. more like nonce pipelining, or whatever the MOST RELIABLE and MOST EFFICIENT approach is 3. the goal throughput is 100 tps | |
ultimately, but if difficult, 10x minimum | |
| ⏺ Good context. The nonce sequencing under concurrent in-flight TXs is the core design challenge. Let me dig deeper. | |
| Batch 2: Nonce management and failure semantics | |
| Current nonce behavior: The gateway uses alloy's CachedNonceManager which queries eth_getTransactionCount("pending"). With | |
| multiple TXs in-flight, the "pending" count auto-increments. But if TX at nonce N fails (revert or gets dropped), nonce N+1 and N+2 | |
| are stuck — they can't mine until N is resolved. How do you want to handle this? Options range from: | |
| - (a) Optimistic pipelining: send N, N+1, N+2 in parallel, accept that a revert at N blocks everything after it (simpler, works if | |
| reverts are rare — which our pre-flight filter should ensure) | |
| - (b) Replacement via gas bumping: if nonce N reverts or stalls, send a replacement TX at nonce N with higher gas to "unstick" the | |
| pipeline | |
| - (c) Cancel-and-resubmit: send a zero-value self-transfer at nonce N to burn the stuck slot, then resubmit N+1, N+2 with fresh | |
| data | |
| Receipt polling vs event-driven: Currently get_receipt() polls with backoff until the TX is mined. With pipelining, we'd have | |
| multiple receipts to track. Are you open to switching to WebSocket-based TX confirmation (subscribe to newPendingTransactions or | |
| newHeads and match TXs), or prefer to keep the polling model but run multiple pollers concurrently? | |
| Failure isolation scope: If a batch TX reverts on-chain (receipt status: 0x0), should the retry of that batch's failed items go | |
| into the next regular batch (mixed with fresh items), or into a dedicated retry queue that processes separately? Mixing means failed | |
| items get another chance quickly but could poison the next batch too. Separating means fresh items are never delayed by retries, but | |
| retry latency increases. | |
❯ 1. what do you recommend as most reliable and efficient (fastest)? 2. would websocket be a more scalable, efficient approach? help me | |
evaluate 3. what would be most reliable and match our architecture/context the best? the dedicated retry queue? | |
| ⏺ ★ Insight ───────────────────────────────────── | |
| The EVM nonce model is fundamentally sequential — there's no way around it. TX at nonce N+1 cannot mine until nonce N has mined | |
| (success or revert). The design question is how to maximize throughput within this constraint while minimizing the blast radius when | |
| a TX stalls or reverts. | |
| ───────────────────────────────────────────────── | |
| Let me evaluate each dimension with concrete trade-offs for our architecture. | |
| Nonce strategy: optimistic pipelining (a) is the right choice | |
| ┌─────────────────────────┬─────────────────────────────────┬──────────────────────────────────┬───────────────────────────────┐ | |
| │ Strategy │ Throughput │ Complexity │ Failure blast radius │ | |
| ├─────────────────────────┼─────────────────────────────────┼──────────────────────────────────┼───────────────────────────────┤ | |
| │ (a) Optimistic pipeline │ High — N TXs in mempool at once │ Low — alloy's pending nonce │ If N reverts, N+1..N+k are │ | |
| │ │ │ handles it │ stuck until N mines │ | |
| ├─────────────────────────┼─────────────────────────────────┼──────────────────────────────────┼───────────────────────────────┤ | |
| │ (b) Gas bumping │ Medium — requires monitoring + │ High — race conditions, EIP-1559 │ Better recovery but complex │ | |
| │ replacement │ replacement logic │ priority fee math │ edge cases │ | |
| ├─────────────────────────┼─────────────────────────────────┼──────────────────────────────────┼───────────────────────────────┤ | |
| │ (c) Cancel-and-resubmit │ Low — burns gas on empty TXs │ Medium — needs explicit nonce │ Clean but wasteful, adds │ | |
| │ │ │ tracking │ latency │ | |
| └─────────────────────────┴─────────────────────────────────┴──────────────────────────────────┴───────────────────────────────┘ | |
| Recommendation: (a) with the pre-flight filter as the safety net. The pre-flight filter we just built removes 95%+ of would-revert | |
| items before they enter the pipeline. With clean batches, reverts become rare — and when they do happen, the pipeline self-drains | |
| (the reverted TX still mines with status: 0x0, which advances the nonce). The key insight: an on-chain revert is NOT a stuck nonce — | |
| the TX still mines, nonce advances, and subsequent TXs proceed. Only dropped/replaced TXs actually stall the nonce. So the real risk | |
| is TX-not-mining (gas too low, mempool eviction), not reverts. | |
| For the rare stuck-TX case, a simple watchdog timer (e.g., 120s with no receipt) can trigger a gas-bumped replacement to unstick the | |
| pipeline. Much simpler than full cancel-and-resubmit. | |
| WebSocket vs polling for receipt tracking | |
| ┌─────────────────────────────────────────────┬────────────────┬────────────────────────┬────────────┬──────────────────────────┐ | |
| │ Approach │ Latency │ Reliability │ Complexity │ Our context │ | |
| ├─────────────────────────────────────────────┼────────────────┼────────────────────────┼────────────┼──────────────────────────┤ | |
| │ │ 1-12s (polling │ High — works through │ │ Current approach, works │ | |
| │ Polling (get_receipt) │ interval) │ proxies, load │ Low │ with any RPC provider │ | |
| │ │ │ balancers │ │ │ | |
| ├─────────────────────────────────────────────┼────────────────┼────────────────────────┼────────────┼──────────────────────────┤ | |
| │ │ │ Medium — WS │ │ QuikNode supports WS, │ | |
| │ WebSocket (newHeads + match) │ Sub-second │ connections drop, │ Medium │ but gateway already │ | |
| │ │ (push) │ reconnect logic needed │ │ juggles WS for chain │ | |
| │ │ │ │ │ watcher │ | |
| ├─────────────────────────────────────────────┼────────────────┼────────────────────────┼────────────┼──────────────────────────┤ | |
| │ WebSocket │ │ Low — many providers │ │ Not reliable across │ | |
| │ (eth_subscribe("newPendingTransactions")) │ Near-instant │ don't support full TX │ High │ providers │ | |
| │ │ │ subscription │ │ │ | |
| └─────────────────────────────────────────────┴────────────────┴────────────────────────┴────────────┴──────────────────────────┘ | |
| Recommendation: Keep polling but run multiple pollers concurrently. With pipelining, we'd have e.g., 5 batch TXs in-flight, each with | |
| its own tokio::spawned receipt poller. This is simple, reliable, and doesn't add a WS dependency to the submission path. The latency | |
| difference (1-12s polling vs sub-second WS) is negligible compared to the 2-12s block time. WebSocket would be over-engineering here | |
| — the bottleneck is block time, not notification latency. | |
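A sketch of that shape with plain tokio: one spawned poller per in-flight TX, exponential backoff capped at 2s (fetch_receipt and notify_callers are illustrative stubs):

    use std::time::Duration;
    use tokio::time::sleep;

    async fn fetch_receipt(_tx_hash: [u8; 32]) -> Option<bool> { None } // stub: wraps get_transaction_receipt
    fn notify_callers(_success: bool) {} // stub: resolves this batch's oneshot senders

    fn spawn_receipt_poller(tx_hash: [u8; 32]) -> tokio::task::JoinHandle<()> {
        tokio::spawn(async move {
            let mut delay = Duration::from_millis(500);
            loop {
                if let Some(success) = fetch_receipt(tx_hash).await {
                    notify_callers(success);
                    return; // poller exits once the receipt arrives
                }
                sleep(delay).await;
                delay = (delay * 2).min(Duration::from_secs(2)); // 500ms -> 1s -> 2s cap
            }
        })
    }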
| Failure isolation: dedicated retry queue is the right fit | |
| ┌──────────────────────────────┬───────────────────┬───────────────────────┬──────────────────────────┬────────────────────────┐ | |
| │ Approach │ Fresh item │ Retry latency │ Isolation │ Our fit │ | |
| │ │ latency │ │ │ │ | |
| ├──────────────────────────────┼───────────────────┼───────────────────────┼──────────────────────────┼────────────────────────┤ | |
| │ Mixed (retry into next │ Degraded under │ Low (next flush) │ None — poison items │ Bad for 100 tps target │ | |
| │ regular batch) │ failures │ │ delay fresh │ │ | |
| ├──────────────────────────────┼───────────────────┼───────────────────────┼──────────────────────────┼────────────────────────┤ | |
| │ Dedicated retry queue │ Unaffected │ Higher (separate │ Full — fresh pipeline │ Matches our │ | |
| │ │ │ flush cycle) │ never blocked │ architecture │ | |
| ├──────────────────────────────┼───────────────────┼───────────────────────┼──────────────────────────┼────────────────────────┤ | |
| │ Drop-and-notify (no retry) │ Unaffected │ N/A — caller gets │ Full │ Too aggressive for │ | |
| │ │ │ error │ │ production │ | |
| └──────────────────────────────┴───────────────────┴───────────────────────┴──────────────────────────┴────────────────────────┘ | |
| Recommendation: Dedicated retry queue, same signer key. Here's why same key works: the retry queue doesn't submit concurrently with | |
| the main queue — it waits for the main pipeline's in-flight TXs to drain (or times out), then takes a turn. Think of it as strict | |
| prioritization: the main queue has priority, the retry queue fills gaps. Since an on-chain revert advances the nonce (it still mines), the retry | |
| queue's items get fresh nonces naturally. No second signer key needed. | |
| Batch 3: Architecture details | |
| Pipeline depth: How many batch TXs should be in-flight simultaneously? Each batch TX costs ~400-800K gas per item, fitting ~30-37 | |
| items in a 30M gas block. With 100 tps target and 2s blocks on Base (12s on Sepolia), we'd need ~200 items per block on Base (or | |
| ~1200 per block on Sepolia). That's 6-32 batch TXs per block. Do you want to cap the in-flight pipeline at a fixed depth (e.g., 5-10 | |
| concurrent TXs), or let it grow to fill the block? | |
| Caller notification timing: Currently the RPC caller blocks until the batch TX receipt arrives (success or failure). With | |
| pipelining, should callers still block until their specific batch's receipt, or should we return early with a "task accepted, pending | |
| on-chain" response and let callers poll for the result? | |
| Per-chain or global pipeline: In multichain mode, each chain has its own BatchTransactionSubmitter. Should each chain's pipeline | |
| be independent (simpler, current model scaled up), or should there be a global coordinator that balances across chains (more complex, | |
| better utilization)? | |
| ✻ Worked for 46s | |
❯ For batch 2: a) with the pre-flight filter as the safety net. The pre-flight filter we just built removes 95%+ of would-revert | |
items before they enter the pipeline. With clean batches, reverts become rare — and when they do happen, the pipeline self-drains | |
(the reverted TX still mines with status: 0x0, which advances the nonce). The key insight: an on-chain revert is NOT a stuck nonce — | |
the TX still mines, nonce advances, and subsequent TXs proceed. Only dropped/replaced TXs actually stall the nonce. So the real risk | |
is TX-not-mining (gas too low, mempool eviction), not reverts. | |
For the rare stuck-TX case, a simple watchdog timer (e.g., 120s with no receipt) can trigger a gas-bumped replacement to unstick the | |
pipeline. Much simpler than full cancel-and-resubmit. --> yes, implement the optimistic pipeline, with a watchdog timer for gas-bumped | |
replacement. But note that Gateway tasks are fundamentally multichain --> would we need multiple pipelines, one per chain? Make sure | |
our pipeline architecture doesn't incur memory overhead or memory leaks for the gateway that lead to reliability or performance | |
degradation. Same for the polling vs websocket discussion - polling is fine, but make sure that even in the multichain context our | |
architecture is performant and memory-leak safe. Lastly, for the dedicated retry queue - make sure the retry queue transactions are | |
ALSO within the task creation block window. Main queue having priority is the right call, but make sure the retry queue doesn't just | |
end up failing all those tasks. Now, for Batch 3: 1 - what do you recommend to optimize for reliability while keeping performance | |
and latency acceptable (within the task creation block window deadline)? 2 - what do you recommend? returning early sounds favorable | |
but I am concerned that the polling-for-receipt approach would cause performance degradation or reliability issues. Help me decide | |
on this. 3 - If not too complex, the global coordinator would be the best, but only if it is not too complex or bug-prone. Help me | |
evaluate this. | |
| ⏺ Good decisions. Let me address each and then ask the final batch. | |
| Multichain pipeline: Yes, each chain already has its own BatchTransactionSubmitter (created per-chain in ChainService). Each gets its | |
| own AvsWriter (own signer provider, own nonce stream). So per-chain pipelines are already the architecture — no shared nonce state | |
| across chains, no cross-chain coordination needed. The pipeline depth, retry queue, and watchdog are all per-chain. Memory-wise, each | |
| pipeline is a bounded channel + a small in-flight tracking map — negligible even with 10 chains. | |
| Retry queue staleness: Correct concern. The retry queue MUST re-run the pre-flight staleness filter before resubmitting. Items that | |
| were fresh when they first failed may be stale by retry time. The filter we built runs at the top of each retry iteration — this same | |
| pattern applies to the dedicated retry queue. Stale retries get failed immediately to callers instead of wasting a nonce slot. | |
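The staleness gate at the top of a retry iteration, as a sketch (field and parameter names are illustrative):

    /// True if the item's task-created block has fallen outside the creation window.
    fn is_stale(task_created_block: u64, head_block: u64, max_block_age: u64) -> bool {
        head_block.saturating_sub(task_created_block) > max_block_age
    }

    // Retry iteration: partition first, fail stale items to their callers
    // immediately, and only spend a nonce slot on the survivors.
    // let (fresh, stale): (Vec<_>, Vec<_>) =
    //     items.into_iter().partition(|i| !is_stale(i.created_block, head, window));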
| Now for Batch 3: | |
| Pipeline depth — recommend fixed cap of 3-5 concurrent TXs | |
| ┌────────────┬─────────────────────┬────────────────────────┬──────────────────────────────────────────────┬───────────────────┐ | |
| │ Depth │ Throughput (Base │ Throughput (Sepolia │ Risk │ Memory │ | |
| │ │ 2s) │ 12s) │ │ │ | |
| ├────────────┼─────────────────────┼────────────────────────┼──────────────────────────────────────────────┼───────────────────┤ | |
| │ 1 │ ~15-18 items/block │ ~30-37 items/block │ None │ Minimal │ | |
| │ (current) │ │ │ │ │ | |
| ├────────────┼─────────────────────┼────────────────────────┼──────────────────────────────────────────────┼───────────────────┤ | |
| │ 3 │ ~45-54 items/block │ ~90-111 items/block │ Low — 3 pending nonces │ 3 tracking │ | |
| │ │ │ │ │ entries │ | |
| ├────────────┼─────────────────────┼────────────────────────┼──────────────────────────────────────────────┼───────────────────┤ | |
| │ 5 │ ~75-90 items/block │ ~150-185 items/block │ Medium — 5 pending nonces │ 5 tracking │ | |
| │ │ │ │ │ entries │ | |
| ├────────────┼─────────────────────┼────────────────────────┼──────────────────────────────────────────────┼───────────────────┤ | |
| │ 10+ │ ~150+ items/block │ ~300+ items/block │ High — deep nonce chain, one stuck TX blocks │ Proportional │ | |
| │ │ │ │ many │ │ | |
| └────────────┴─────────────────────┴────────────────────────┴──────────────────────────────────────────────┴───────────────────┘ | |
| Recommendation: depth=3 for production, configurable. At depth 3 on Base (2s blocks): ~45-54 items per 2s block = 22-27 tps. With 10 | |
| gateway replicas, that's 220-270 tps aggregate — exceeds the 100 tps target. On Sepolia (12s blocks): ~90 items per 12s = ~7.5 tps | |
| per replica, 75 tps aggregate for 10 replicas. | |
| Depth 3 is the sweet spot: if TX at nonce N gets stuck, only N+1 and N+2 are waiting — the watchdog timer fires at 120s and gas-bumps | |
| N to unstick. With depth 10+, a stuck TX blocks 9 subsequent batches and recovery takes longer. | |
| Caller notification — recommend early return with bounded result channel | |
| The concern about polling degradation is valid. Here's the comparison: | |
| ┌──────────────────────────┬───────────────────────────────────────┬──────────────────────┬─────────────────────────────────────┐ | |
| │ Approach │ Caller experience │ Gateway memory │ Failure mode │ | |
| ├──────────────────────────┼───────────────────────────────────────┼──────────────────────┼─────────────────────────────────────┤ | |
| │ Block until receipt │ Simple — caller gets final result │ Low (oneshot per │ 60s timeout = 60s caller latency │ | |
| │ │ │ task) │ │ | |
| ├──────────────────────────┼───────────────────────────────────────┼──────────────────────┼─────────────────────────────────────┤ | |
| │ Early return + poll │ Fast response, caller polls │ Risk: unbounded │ Stale results accumulate if callers │ | |
| │ │ │ result map │ don't poll │ | |
| ├──────────────────────────┼───────────────────────────────────────┼──────────────────────┼─────────────────────────────────────┤ | |
| │ Early return + bounded │ Fast response, result pushed via │ Same as current │ Caller's spawned task awaits, │ | |
| │ callback │ existing oneshot │ │ auto-cleans │ | |
| └──────────────────────────┴───────────────────────────────────────┴──────────────────────┴─────────────────────────────────────┘ | |
| Recommendation: keep the current oneshot pattern. The caller's spawned task in sync.rs already awaits the result_rx oneshot. With | |
| pipelining, the only change is that the oneshot resolves later (when the batch TX mines, not when it's submitted). The caller doesn't | |
| poll — it's already awaiting. No new result map, no memory leak risk. The RPC response to the external client still blocks until the | |
| oneshot resolves, but the internal pipeline is unblocked — the accumulator keeps flushing new batches while previous ones are | |
| in-flight. This gives us the throughput benefit without changing the external API contract. | |
| The key insight: pipelining is internal to the batch submitter, invisible to callers. The accumulator flushes batch N, immediately | |
| starts accumulating batch N+1, flushes it while N is still mining. Each batch's callers independently await their oneshot. No polling | |
| API needed. | |
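A sketch of that oneshot contract (QueuedItem and the error type are illustrative; the point is that pipelining only changes when result_tx fires, not the caller-side code):

    use tokio::sync::{mpsc, oneshot};

    struct QueuedItem {
        payload: Vec<u8>, // stand-in for the real task data
        result_tx: oneshot::Sender<Result<(), String>>,
    }

    // RPC handler side: enqueue, then await this item's own oneshot. The batch
    // pipeline resolves result_tx when the batch TX mines (later than today),
    // but the await here is unchanged.
    async fn handle_rpc(payload: Vec<u8>, queue: mpsc::Sender<QueuedItem>) -> Result<(), String> {
        let (result_tx, result_rx) = oneshot::channel();
        queue.send(QueuedItem { payload, result_tx }).await.map_err(|e| e.to_string())?;
        result_rx.await.map_err(|e| e.to_string())? // blocks this caller only
    }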
| Global coordinator — recommend staying per-chain (independent pipelines) | |
| ┌───────────────────┬────────────────────────────────────────┬─────────────────────┬────────────────────────────────────────────┐ | |
| │ Approach │ Complexity │ Benefit │ Risk │ | |
| ├───────────────────┼────────────────────────────────────────┼─────────────────────┼────────────────────────────────────────────┤ | |
| │ Per-chain │ Low — current model, just add depth │ Simple, isolated │ Uneven load across chains │ | |
| │ independent │ │ failures │ │ | |
| ├───────────────────┼────────────────────────────────────────┼─────────────────────┼────────────────────────────────────────────┤ | |
| │ Global │ High — cross-chain nonce awareness, │ Better utilization │ Single point of failure, complex state │ | |
| │ coordinator │ priority scheduling │ │ machine, harder to debug │ | |
| └───────────────────┴────────────────────────────────────────┴─────────────────────┴────────────────────────────────────────────┘ | |
| Recommendation: per-chain independent pipelines. The global coordinator's only benefit is load balancing across chains, but each | |
| chain has independent block space anyway — you can't use Sepolia's gas budget for a Base TX. The chains are fundamentally independent | |
| resources. A global coordinator adds a shared state machine that's hard to reason about under failure, and the bug surface is much | |
| larger. Per-chain pipelines are isolated: if Base's pipeline stalls, Sepolia's keeps going. This isolation is worth more than | |
| theoretical utilization gains. | |
| Final Batch: Implementation scope | |
| Rollout strategy: This is a significant architectural change to a critical path (on-chain submission). Do you want to implement | |
| this as a new PipelinedBatchSubmitter alongside the existing BatchTransactionSubmitter (feature-flagged, A/B testable), or as an | |
| in-place evolution of the current submitter? Feature-flagging means we can fall back to the current sequential behavior if the | |
| pipeline has issues in production. | |
| Watchdog gas bumping: When a TX is stuck for 120s, the watchdog needs to resubmit with higher gas. Should we use a percentage bump | |
| (e.g., 1.5x the original maxFeePerGas), or query current gas prices and resubmit at market rate? The percentage bump is simpler but | |
| might not be enough if gas spiked. Market rate is more reliable but requires an extra RPC call. | |
| Timeline and scope: Given the complexity (nonce pipelining + watchdog + retry queue + per-chain isolation), do you want to tackle | |
| this as one PR or break it into phases? Phase 1 could be the pipeline depth (concurrent submission without blocking) + watchdog, | |
| Phase 2 could be the dedicated retry queue separation. Or all-in-one? | |
| ✻ Brewed for 53s | |
❯ Batch 3 recommendations make sense. Agreed. Final Batch - 1: in-place evolution 2. why would a tx be stuck for 120s? with | |
pipelining, we don't have inline retry because we have a dedicated retry queue, right? just a percentage bump is fine for now (10 or | |
20 percent or something) 3: as one PR, all in one | |
| ⏺ Good. Let me clarify one thing on question 2 before we proceed to planning: | |
| Why a TX can be stuck for 120s — this isn't about retries. A TX gets "stuck" when it's in the mempool but not mining. Causes: | |
| - Gas price too low during a congestion spike (other TXs outbid yours) | |
| - Mempool eviction (RPC node drops your TX under memory pressure) | |
| - RPC node not propagating the TX to validators | |
| With pipelining at depth 3, if TX at nonce N is stuck in the mempool, nonces N+1 and N+2 are also stuck (EVM enforces sequential | |
| nonce ordering). The watchdog detects "TX N has been pending for 120s with no receipt" and gas-bumps it (resubmit same nonce N with | |
| 20% higher gas). Once N mines, N+1 and N+2 immediately become eligible. | |
| This is independent of the retry queue — the retry queue handles items that reverted on-chain (receipt arrived, status=0x0). The | |
| watchdog handles TXs that never mine (no receipt at all). Different failure modes: | |
| ┌───────────────────────────────┬───────────────────────┬───────────────────────────────────┐ | |
| │ Failure │ Detection │ Handler │ | |
| ├───────────────────────────────┼───────────────────────┼───────────────────────────────────┤ | |
| │ On-chain revert (status: 0x0) │ Receipt arrives │ Dedicated retry queue │ | |
| ├───────────────────────────────┼───────────────────────┼───────────────────────────────────┤ | |
| │ TX stuck in mempool │ No receipt after 120s │ Watchdog gas bump (20%) │ | |
| ├───────────────────────────────┼───────────────────────┼───────────────────────────────────┤ | |
| │ Pre-flight stale items │ Block number check │ Pre-flight filter (already built) │ | |
| └───────────────────────────────┴───────────────────────┴───────────────────────────────────┘ | |
| With that clarified — I have enough context to write the implementation plan. Let me do that now. | |
| ★ Insight ───────────────────────────────────── | |
| Three-layer defense model: (1) Pre-flight filter catches stale items before submission. (2) Optimistic nonce pipeline maximizes | |
| throughput with concurrent in-flight TXs. (3) Watchdog + retry queue handle the two remaining failure modes (stuck mempool, on-chain | |
| revert) without blocking the main pipeline. | |
| ───────────────────────────────────────────────── | |
| Let me write up the full implementation plan for your review before touching code. | |
| ⏺ Entered plan mode | |
| Claude is now exploring and designing an implementation approach. |
Newton Privacy Layer — Current Status (2026-04-10)
Phase Status
┌──────────────────────────────┬──────────────────────────────────────────────────┬──────────────────┐
│ Phase │ Status │ Key PRs │
├──────────────────────────────┼──────────────────────────────────────────────────┼──────────────────┤
│ Phase 1: Foundation │ COMPLETE │ #368 │
├──────────────────────────────┼──────────────────────────────────────────────────┼──────────────────┤
│ Phase 1.5: Gateway-Mediated │ COMPLETE │ #510, #516 │
├──────────────────────────────┼──────────────────────────────────────────────────┼──────────────────┤
│ Phase 1.5B: KMS → HPKE │ COMPLETE, DEPLOYED │ #489, #499 │
├──────────────────────────────┼──────────────────────────────────────────────────┼──────────────────┤
│ Phase 2 DKG core │ MERGED │ #426→#478 │
├──────────────────────────────┼──────────────────────────────────────────────────┼──────────────────┤
│ Phase 2C: PSS epoch rotation │ IN PROGRESS — 3 PRs open, ~1,300 lines, 26 tests │ #523, #525, #526 │
├──────────────────────────────┼──────────────────────────────────────────────────┼──────────────────┤
│ Phase 2D: Privacy slashing │ Not started │ — │
├──────────────────────────────┼──────────────────────────────────────────────────┼──────────────────┤
│ Phase 3: MPC/ZK │ Not started │ — │
├──────────────────────────────┼──────────────────────────────────────────────────┼──────────────────┤
│ Phase 4: Ecosystem │ Not started │ — │
└──────────────────────────────┴──────────────────────────────────────────────────┴──────────────────┘
What's Shipped (on main)
- Three-path privacy model: IdentityRegistry (data.identity.), ConfidentialDataRegistry (data.confidential.), inline ephemeral (data.privacy.inline[0].*)
- HPKE encryption (RFC 9180), FROST DKG (Ristretto255), threshold decryption
- Zeroizing<Vec<u8>> for all decryption output
- Challenger privacy data resolution (PR #516)
- make dkg command=initiate/status/cancel for manual DKG ceremonies
- AWS KMS fully removed from all 5 repos
What's In Review (PRs open, not merged)
- PR #523: PSS refresh + resharing crypto primitives, EpochMetadata/EpochConfig, multi-epoch keystore, epoch metrics
- PR #525: RefreshCoordinator (1-round protocol), operator newt_dkgRefreshRound / newt_dkgRefreshApply handlers, EpochManager background task
- PR #526: Coordinator HTTP wiring (zero TODOs), E2E crypto smoke test, make pss-refresh-e2e
Known Limitations
- PSS epoch rotation not yet on main — the 3 PRs are open and CI-ready (pending IPFS billing fix for unrelated multichain E2E). Once merged, the gateway can run automatic 24h epoch rotation with PSS.
- No EpochRegistry contract — epoch metadata (MPK, operator set hash, timestamps) exists only in-memory and keystores. No on-chain verifiability. (NEWT-639)
- No privacy-specific slashing — operators who submit invalid partial decryptions or miss DKG rounds face no economic penalty. (NEWT-629/640/641, blocked by NEWT-628)
- Gateway reconstructs plaintext in threshold mode — the gateway combines partial DH outputs and sees full plaintext during ephemeral task processing. Identity and confidential data are operator-decrypted locally. (Phase 3: MPC)
- No forward secrecy until epoch rotation merges — key shares from the initial DKG remain valid indefinitely until PSS refresh is operational.
- Operator set changes require manual DKG — resharing code exists in refresh.rs but automatic detection of operator set changes isn't wired yet. Permissioned set means a coordinated manual process via make dkg.
Next Steps
┌──────────┬─────────────────────────────────────────┬──────────────┬──────────────────────────┐
│ Priority │ What │ Ticket │ Status │
├──────────┼─────────────────────────────────────────┼──────────────┼──────────────────────────┤
│ 1 │ Merge PSS PRs (#523→#525→#526) │ NEWT-628 │ Awaiting CI + review │
├──────────┼─────────────────────────────────────────┼──────────────┼──────────────────────────┤
│ 2 │ EpochRegistry Solidity contract │ NEWT-639 │ Backlog │
├──────────┼─────────────────────────────────────────┼──────────────┼──────────────────────────┤
│ 3 │ On-chain aggregation commitments │ NEWT-630 │ Backlog │
├──────────┼─────────────────────────────────────────┼──────────────┼──────────────────────────┤
│ 4 │ Privacy-specific slashing conditions │ NEWT-629 │ Backlog (blocked by 628) │
├──────────┼─────────────────────────────────────────┼──────────────┼──────────────────────────┤
│ 5 │ PrivacySlasher + DleqVerifier contracts │ NEWT-640/641 │ Backlog │
└──────────┴─────────────────────────────────────────┴──────────────┴──────────────────────────┘
Once PRs merge, NEWT-628 moves to Done. The remaining Phase 2 work is all Solidity contracts (NEWT-639/630/640/641/629) — the
Rust-side implementation is complete.
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Ready to code?
Here is Claude's plan:
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
Pipelined Batch Transaction Submitter
Context
Post-PR #515 load testing showed ~72% batch failure rate on Sepolia under sustained load. Root cause: the batch submitter's
sequential flush blocks on get_receipt() for 30-60s per batch TX, preventing new batches from flushing. Items queued during that wait
become stale (TaskCreatedBlockTooOld). PR #520 added a pre-flight staleness filter and time-normalized block windows, which catches
stale items before submission but doesn't address the sequential bottleneck.
The goal: decouple TX submission from receipt polling, allowing multiple batch TXs in-flight simultaneously. Target throughput: 100
tps aggregate (10 gateways, ~10 tps each).
Architecture
Split the monolithic flush_combined_batch_with_retry() into 3 decoupled stages:
Accumulator → Build → Submit (.send only) → [Receipt Tracker per TX] → Notify Callers
↑ ↓ (reverts)
└───────── Retry Queue ←───────────────┘
Key observation: .send().await takes ~200ms (RPC broadcast). get_receipt() takes 30-60s (mining). By decoupling these, the accumulator
can flush a new batch every ~1s while previous batches are still mining.
Pipeline depth: 3 concurrent in-flight TXs (configurable). Controlled by a tokio::Semaphore. At 120 items/batch, that's 360 items
in-flight — up to 3x current throughput per gateway.
Files to Modify
crates/chainio/src/avs/writer.rs (~80 lines added)
Three new methods on AvsWriter:
send_batch_create_and_respond(): Same as existing batch_create_and_respond_to_tasks() lines 711-831, but stops after .send().await
returns Ok(pending). Returns (B256, u64) — TX hash + nonce used. Does NOT call get_receipt().
get_receipt_by_hash(tx_hash): Polls provider.get_transaction_receipt(tx_hash) with exponential backoff (500ms → 2s). No built-in
timeout — caller wraps with tokio::time::timeout.
gas_bump_batch_tx(nonce, tasks, responses, sigs, bump_percent): Resubmits at the same explicit nonce with 20% higher gas. Uses
provider.estimate_eip1559_fees() for current market rate, applies bump. Sets .nonce(original_nonce) explicitly to override
CachedNonceManager.
Critical nonce detail: After .send(), CachedNonceManager has already incremented past nonce N locally. For a gas-bump replacement at
the same nonce N, we MUST set .nonce(N) explicitly on the call builder. The InFlightBatch stores the original nonce.
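A hedged sketch of the bump math plus the explicit-nonce override (the builder calls in the comment approximate alloy's CallBuilder and may differ by version):

    /// Raise both EIP-1559 fee fields by `bump_percent`; most nodes require the
    /// replacement's priority fee to rise (commonly >= ~10%) before accepting it.
    fn bumped_fees(max_fee: u128, priority_fee: u128, bump_percent: u128) -> (u128, u128) {
        let bump = |f: u128| f + f * bump_percent / 100;
        (bump(max_fee), bump(priority_fee))
    }

    // Usage (illustrative, alloy-style):
    // let (max_fee, prio) = bumped_fees(orig_max_fee, orig_prio, 20);
    // let pending = contract.batchCreateAndRespond(tasks, responses, sigs)
    //     .nonce(original_nonce)            // explicit: overrides CachedNonceManager
    //     .max_fee_per_gas(max_fee)
    //     .max_priority_fee_per_gas(prio)
    //     .send()
    //     .await?;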
crates/gateway/src/task/submitter.rs (~30 lines added)
Three new trait methods on TaskSubmitter with default error impls (fail-open for mocks):
async fn send_batch_tx(...) -> Result<(B256, u64), ChainIoError>; // tx_hash + nonce
async fn get_tx_receipt(&self, tx_hash: B256) -> Result<TransactionReceipt, ChainIoError>;
async fn gas_bump_batch_tx(&self, nonce: u64, ..., bump_percent: u32) -> Result<B256, ChainIoError>;
Implement for AvsWriter by delegating to the new methods.
crates/gateway/src/rpc/api/batch_submitter.rs (~400 lines changed/added)
New config fields in BatchSubmitterConfig (sketched after this list):
pipeline_depth: u32 (default: 3, set 0 to use legacy synchronous path)
watchdog_timeout_secs: u64 (default: 120)
gas_bump_percent: u32 (default: 20)
max_retry_queue_size: usize (default: 256)
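A sketch of the config struct with serde defaults (default-fn names are illustrative; values match the list above):

    use serde::Deserialize;

    #[derive(Debug, Clone, Deserialize)]
    pub struct BatchSubmitterConfig {
        #[serde(default = "default_pipeline_depth")]
        pub pipeline_depth: u32, // 0 = legacy synchronous path
        #[serde(default = "default_watchdog_timeout_secs")]
        pub watchdog_timeout_secs: u64,
        #[serde(default = "default_gas_bump_percent")]
        pub gas_bump_percent: u32,
        #[serde(default = "default_max_retry_queue_size")]
        pub max_retry_queue_size: usize,
    }

    fn default_pipeline_depth() -> u32 { 3 }
    fn default_watchdog_timeout_secs() -> u64 { 120 }
    fn default_gas_bump_percent() -> u32 { 20 }
    fn default_max_retry_queue_size() -> usize { 256 }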
New types:
struct InFlightBatch {
    tx_hash: B256,
    nonce: u64, // for gas bump replacement
    items: Vec,
    _permit: OwnedSemaphorePermit, // released on drop
    submitted_at: Instant,
    batch_contract_address: Address,
    tasks: Vec, // kept for gas bump resubmission
    responses: Vec,
    sig_data: Vec,
}
struct RetryBatchMessage {
    items: Vec,
    attempt: u32,
}
New fields on BatchTransactionSubmitter:
pipeline_semaphore: Arc<Semaphore>
retry_tx: mpsc::Sender<RetryBatchMessage>
Changes to new(): Create semaphore, retry channel. Spawn retry_loop background task.
Changes to combined_accumulator_loop(): Replace flush_combined_batch_with_retry().await with submit_batch_pipelined() which returns
after .send() (~200ms). Legacy path preserved behind pipeline_depth == 0 check.
New method: submit_batch_pipelined() (sketched after these steps):
Build items (same as Phase 1 of existing flush)
Pre-flight staleness filter (same as existing)
Acquire pipeline semaphore permit (blocks if pipeline full)
Call task_submitter.send_batch_tx() (~200ms)
Spawn receipt_tracker task with InFlightBatch
Return immediately
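A sketch of the submit path under the stated assumptions (Batch, send_batch_tx, and receipt_tracker are stand-ins for the types and methods described above):

    use std::sync::Arc;
    use tokio::sync::{OwnedSemaphorePermit, Semaphore};

    struct Batch; // stand-in for built items + tasks + responses + sig data
    async fn send_batch_tx(_b: &Batch) -> Result<([u8; 32], u64), String> { Ok(([0; 32], 0)) } // stub
    async fn receipt_tracker(_hash: [u8; 32], _nonce: u64, _b: Batch, _p: OwnedSemaphorePermit) {} // stub

    async fn submit_batch_pipelined(semaphore: Arc<Semaphore>, batch: Batch) -> Result<(), String> {
        // Backpressure: blocks only when pipeline_depth TXs are already in-flight.
        let permit = semaphore.acquire_owned().await.map_err(|e| e.to_string())?;
        let (tx_hash, nonce) = send_batch_tx(&batch).await?; // ~200ms RPC broadcast
        // Hand the permit to the tracker; it releases on drop in every exit path.
        tokio::spawn(receipt_tracker(tx_hash, nonce, batch, permit));
        Ok(()) // the accumulator is free to flush the next batch immediately
    }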
New method: receipt_tracker() (spawned per TX):
loop {
    match timeout(watchdog_timeout, task_submitter.get_tx_receipt(tx_hash)).await {
        Ok(Ok(receipt)) if receipt.status() → notify callers via oneshot, return
        Ok(Ok(receipt)) → simulate_and_classify, send healthy items to retry_tx, return
        Ok(Err(transient)) → sleep 2s, continue
        Err(elapsed) → gas_bump (max 3 attempts), update tx_hash, continue
    }
}
On all exit paths: InFlightBatch drops, releasing semaphore permit. All item result_tx consumed (success or error).
New method: retry_loop() (background task; sketched after these steps):
Reads from retry_rx channel
Checks attempt < max_retries
Runs pre-flight staleness filter (same check as main path)
Acquires pipeline semaphore permit (yields to main queue)
Submits via send_batch_tx(), spawns receipt tracker
On retry queue full (try_send fails): fail items immediately
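A sketch of the retry loop (RetryBatchMessage matches the struct above; Item, fail_items, filter_stale, and resubmit are illustrative stubs):

    use tokio::sync::mpsc;

    struct Item; // stand-in for a queued batch item
    struct RetryBatchMessage { items: Vec<Item>, attempt: u32 }
    fn fail_items(_items: Vec<Item>) {} // stub: resolves each item's oneshot with an error
    fn filter_stale(items: Vec<Item>) -> Vec<Item> { items } // stub: the pre-flight block-window check
    async fn resubmit(_items: Vec<Item>, _attempt: u32) {} // stub: acquires a permit, sends, spawns tracker

    async fn retry_loop(mut retry_rx: mpsc::Receiver<RetryBatchMessage>, max_retries: u32) {
        while let Some(msg) = retry_rx.recv().await {
            if msg.attempt >= max_retries {
                fail_items(msg.items); // give up: callers get an error, not silence
                continue;
            }
            let fresh = filter_stale(msg.items); // stale retries fail fast, saving nonce slots
            if fresh.is_empty() {
                continue;
            }
            resubmit(fresh, msg.attempt + 1).await; // shares the pipeline semaphore with the main queue
        }
    }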
Legacy fallback: pipeline_depth: 0 calls existing flush_combined_batch_with_retry().await unchanged. This is the rollout safety
valve.
crates/gateway/src/handler/mod.rs (no changes)
BatchTransactionSubmitter::new() signature unchanged (already has block_time_ms). New config fields have serde defaults.
crates/gateway/tests/ (test updates + new tests)
Update existing mock TaskSubmitter impls for new trait methods (default impls cover most)
New PipelinedSubmitter mock with configurable receipt delays and results
New tests:
test_pipeline_concurrent_batches — verify 3 batches get concurrent nonces
test_pipeline_semaphore_backpressure — depth=2, 3rd batch blocks
test_watchdog_gas_bump — receipt timeout triggers bump
test_revert_to_retry_queue — reverted items reach retry queue
test_retry_queue_staleness_filter — stale retry items filtered
test_shutdown_cleans_up_in_flight — cancel token fails all callers
test_filter_stale_items — already exists from PR #520
Flows
Happy path: Accumulator flushes batch → .send() gets nonce N (200ms) → receipt tracker spawned → accumulator immediately starts next
batch → nonce N+1 sent → receipts arrive out of order → callers notified.
Revert path: Receipt arrives with status: 0x0 → simulate_and_classify each item → idempotent items get success, poison items get
error, healthy items go to retry queue → retry queue runs pre-flight filter → resubmits with fresh nonce.
Stuck TX path: 120s no receipt → gas bump at nonce N with 20% higher gas → new tx_hash → reset timer → if 3 bumps fail → fail all
callers, release permit. Note: reverted TXs advance the nonce (they mine), so they DON'T trigger the watchdog — only dropped/evicted
TXs do.
Memory Safety
Pipeline semaphore caps in-flight TXs at 3: max 360 items total (~720KB)
Retry queue bounded at 256 items, overflow items failed immediately
OwnedSemaphorePermit in InFlightBatch auto-releases on drop (panic-safe)
All result_tx oneshots consumed in every exit path (success, error, cancel, panic via catch_unwind)
CancellationToken checked in both receipt tracker and retry loop for clean shutdown
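The permit-on-drop guarantee above, in miniature (runnable as-is with tokio):

    use std::sync::Arc;
    use tokio::sync::Semaphore;

    #[tokio::main]
    async fn main() {
        let sem = Arc::new(Semaphore::new(3)); // pipeline_depth = 3
        let permit = sem.clone().acquire_owned().await.unwrap();
        assert_eq!(sem.available_permits(), 2);
        drop(permit); // every exit path (return, error, panic unwind) ends here
        assert_eq!(sem.available_permits(), 3);
    }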
Rollout
Deploy with pipeline_depth: 1 — validates new code path, same throughput as before
Monitor batch_pipeline_depth and batch_receipt_latency_seconds metrics
Increase to pipeline_depth: 3
pipeline_depth: 0 falls back to legacy synchronous path if issues arise
Verification
cargo check -p newton-prover-gateway --tests — compile check
cargo clippy -p newton-prover-gateway -p newton-prover-chainio --tests -- -D warnings — lint
cargo test -p newton-prover-gateway --lib — unit tests including new pipeline tests
make gateway_e2e — E2E with real anvil (validates receipt flow end-to-end)
make e2e scenario=basic — unified E2E covering batch submission path
Load test on stagef with pipeline_depth: 3 to validate throughput improvement
❯ Changes to combined_accumulator_loop(): Replace flush_combined_batch_with_retry().await with submit_batch_pipelined() which
returns after .send() (~200ms). Legacy path preserved behind pipeline_depth == 0 check. --> do we need to keep the legacy
path? what is the benefit?
⏺ Now to answer your question about the legacy path:
Do we need pipeline_depth: 0 as a legacy fallback?
Honestly, no. The pipelined path with pipeline_depth: 1 already gives equivalent behavior to the current sequential path (one TX
in-flight, accumulator unblocked during receipt wait — strictly better than today). There's no scenario where the old
flush_combined_batch_with_retry is preferable.
I'll drop the legacy fallback. The accumulator will always use submit_batch_pipelined(), and pipeline_depth defaults to 3. Setting
pipeline_depth: 1 gives "safe mode" behavior for cautious rollouts.
⏺ Now Task 7: Add the three new trait methods to TaskSubmitter and implement for AvsWriter.