guialbuk/taxonomy-shadow-execution-design.md

Taxonomy Shadow Execution: Design Options for Path Override

Context

We're building shadow execution for taxonomy queries in Storefront. The goal: run both MySQL and Query Engine (QE) paths for sampled requests, compare results at the domain level (e.g., TaxonomyCategoryDrop), and emit observability signals -- without affecting the response.

DualPathHelper.choose_path currently decides the path based on a feature flag:

def choose_path(shop:)
  return PATH_QUERY_ENGINE if shop.features.enabled?(FLAG_HANDLE)
  PATH_DATABASE
end

Shadow execution needs to run the same business logic twice: once with MySQL (normal), once forcing QE. The question is how to tell DualPathHelper to use QE on the shadow run.

What is a Fiber?

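A Fiber is Ruby's lightweight concurrency primitive -- a coroutine that can be paused and resumed. Unlike threads, fibers are cooperatively scheduled: a fiber keeps running until it gives up control, either through an explicit Fiber.yield or, under a fiber scheduler such as Async, when it reaches a blocking operation like I/O.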

In Storefront (Falcon/Async), each request runs in a fiber. Async(transient: true) spawns a new fiber that runs in the background.
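To make the cooperative hand-off concrete, here is a minimal sketch using only core Ruby fibers, with no Falcon or Async involved. The method name `interleave_demo` and the log labels are made up for illustration; in Storefront the scheduler performs the resume calls rather than application code.

```ruby
# Core-Ruby fibers only; interleave_demo is a hypothetical name for this example.
def interleave_demo
  log = []

  worker = Fiber.new do
    log << "worker: step 1"
    Fiber.yield              # explicitly hand control back to the caller
    log << "worker: step 2"
  end

  log << "main: before"
  worker.resume              # runs the worker until its Fiber.yield
  log << "main: between"
  worker.resume              # continues the worker after the yield
  log << "main: after"
  log
end
```

The worker never runs on its own: each of its steps happens only while the main flow has handed control to it.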

Fiber-local storage

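Each fiber has its own storage, accessed via Fiber[] (Ruby 3.2+; keys must be symbols):

Fiber[:my_key] = "value"  # written to THIS fiber's storage only

This is analogous to thread-local storage but scoped to fibers. (Despite its name, Thread.current[] is also fiber-scoped in Ruby; true thread-locals use Thread#thread_variable_get and Thread#thread_variable_set.) A new fiber starts with a copy of the storage of the fiber that created it, so any write made afterwards stays in the fiber that made it. Since Async spawns a new fiber, an override set inside the shadow fiber is invisible to the request fiber.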
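The isolation property can be checked with a small sketch. It assumes Ruby 3.2 or later, where `Fiber[]` is available, and uses plain `Fiber.new` in place of an Async task. The method name `storage_demo` and the key `:demo_key` are hypothetical.

```ruby
# Requires Ruby 3.2+ for Fiber[] and Fiber[]=.
# storage_demo and :demo_key are hypothetical names for this example.
def storage_demo
  Fiber[:demo_key] = "parent"

  inherited = nil
  child = Fiber.new do
    inherited = Fiber[:demo_key]   # the child starts with a copy of its creator's storage
    Fiber[:demo_key] = "child"     # this write stays in the child's storage
    Fiber[:demo_key]
  end

  child_value = child.resume
  { inherited: inherited, child: child_value, parent: Fiber[:demo_key] }
ensure
  Fiber[:demo_key] = nil           # leave the calling fiber's storage clean
end
```

The child sees the value that existed when it was created, but its own write never travels back to the fiber that created it. That one-way behavior is what the override relies on.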

Option B: Fiber-local override

DualPathHelper checks a fiber-local before the flag:

# DualPathHelper
SHADOW_PATH_OVERRIDE = :"__taxonomy_shadow_path_override__"

def choose_path(shop:)
  override = Fiber[SHADOW_PATH_OVERRIDE]
  return override if override

  return PATH_QUERY_ENGINE if shop.features.enabled?(FLAG_HANDLE)
  PATH_DATABASE
end

ShadowExecution sets it inside the shadow fiber:

# ShadowExecution
def execute_shadow(mysql_result:, &block)
  Fiber[DualPathHelper::SHADOW_PATH_OVERRIDE] = DualPathHelper::PATH_QUERY_ENGINE
  qe_result = yield
  comparator.call(mysql_result, qe_result)
ensure
  Fiber[DualPathHelper::SHADOW_PATH_OVERRIDE] = nil
end

How it flows

Diagram: Option B - Fiber-local (FigJam)

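In the request fiber the override is unset (Fiber[SHADOW_PATH_OVERRIDE] returns nil), so DualPathHelper falls through to the feature flag and, with the flag off, picks MySQL. The shadow fiber sets the override to QE in its own storage and gets QE. The request fiber's storage is never written, so the response path is unchanged.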
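The flow can be sketched end to end with stand-ins. Everything here other than the shape of `choose_path` is an assumption: `DualPathHelperSketch`, `StubShop`, `StubFeatures`, the flag handle and the path symbols are hypothetical, and a plain `Fiber.new` replaces the Async task. Ruby 3.2+ is assumed.

```ruby
# Requires Ruby 3.2+. All names below are stand-ins, not the real Storefront classes.
StubFeatures = Struct.new(:flag_on) do
  def enabled?(_handle)
    flag_on
  end
end
StubShop = Struct.new(:features)

module DualPathHelperSketch
  PATH_DATABASE = :database
  PATH_QUERY_ENGINE = :query_engine
  FLAG_HANDLE = "hypothetical_flag"
  SHADOW_PATH_OVERRIDE = :__taxonomy_shadow_path_override__

  def self.choose_path(shop:)
    override = Fiber[SHADOW_PATH_OVERRIDE]
    return override if override

    shop.features.enabled?(FLAG_HANDLE) ? PATH_QUERY_ENGINE : PATH_DATABASE
  end
end

def option_b_flow
  shop = StubShop.new(StubFeatures.new(false))   # flag off: the normal path is the database

  before = DualPathHelperSketch.choose_path(shop: shop)

  shadow = Fiber.new do
    # Set the override inside the shadow fiber only.
    Fiber[DualPathHelperSketch::SHADOW_PATH_OVERRIDE] = DualPathHelperSketch::PATH_QUERY_ENGINE
    DualPathHelperSketch.choose_path(shop: shop)
  ensure
    Fiber[DualPathHelperSketch::SHADOW_PATH_OVERRIDE] = nil
  end
  shadow_path = shadow.resume

  after = DualPathHelperSketch.choose_path(shop: shop)
  [before, shadow_path, after]
end
```

Calling `option_b_flow` returns `[:database, :query_engine, :database]`: the calling fiber resolves to the database path both before and after the shadow runs.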

Call site impact: none

DualPathHelper reads the override implicitly. Call sites don't change:

# TaxonomyCategoryAncestorRepository -- unchanged
DualPathHelper.with_path(operation: "category_ancestors", shop: @context.shop) do |path|
  # ...
end

Option B': Explicit `path_override:` parameter + shared context mutation

DualPathHelper takes an explicit parameter:

# DualPathHelper
def choose_path(shop:, path_override: nil)
  return path_override if path_override
  return PATH_QUERY_ENGINE if shop.features.enabled?(FLAG_HANDLE)
  PATH_DATABASE
end

ShadowExecution mutates the shared context between runs:

# ShadowExecution
def run(operation:, context:, comparator:, &block)
  mysql_result = yield  # first run, context[:taxonomy_path_override] is nil

  context[:taxonomy_path_override] = DualPathHelper::PATH_QUERY_ENGINE

  dispatch_shadow do
    qe_result = yield  # second run, context now has the override
    comparator.call(mysql_result, qe_result)
  ensure
    context[:taxonomy_path_override] = nil
  end

  mysql_result
end

Call sites must thread the override through:

# TaxonomyCategoryAncestorRepository -- must change
DualPathHelper.with_path(
  operation: "category_ancestors",
  shop: @context.shop,
  path_override: @context[:taxonomy_path_override],  # NEW
) do |path|
  # ...
end

The danger of shared context mutation

Context is a mutable hash shared across the request. When ShadowExecution mutates it, the mutation is visible to all code using that context, not just the shadow.

Diagram: Race Condition - Shared Context Mutation (FigJam)

A request triggers two taxonomy queries. The first shadow execution sets context[:taxonomy_path_override] to QE. Before the shadow fiber's ensure block clears it, the request fiber runs the second query -- and DualPathHelper reads the QE override. The "MySQL" result in the actual response is silently served from QE, so shadow execution has changed the very response it was supposed to leave untouched.
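The race can be reproduced deterministically in a few lines. This is a simplified sketch: a plain hash stands in for the request context, a core `Fiber` stands in for the Async task, and `Fiber.yield` marks the point where the shadow would wait on I/O. The names `choose_path_from_context` and `shared_context_race` are hypothetical.

```ruby
# Plain hash and core fibers stand in for the request context and Async tasks.
# choose_path_from_context and shared_context_race are hypothetical names.
def choose_path_from_context(context)
  context[:taxonomy_path_override] || :database
end

def shared_context_race
  context = {}                                       # shared by request and shadow
  first_query = choose_path_from_context(context)

  context[:taxonomy_path_override] = :query_engine   # mutation of the shared hash

  shadow = Fiber.new do
    Fiber.yield                                       # shadow suspends, as it would on I/O
    choose_path_from_context(context)
  ensure
    context[:taxonomy_path_override] = nil           # cleanup runs only when the shadow finishes
  end
  shadow.resume                                       # shadow starts, then suspends

  second_query = choose_path_from_context(context)   # request fiber's second taxonomy query
  shadow.resume                                       # shadow finishes and clears the override
  third_query = choose_path_from_context(context)

  { first: first_query, second: second_query, after_cleanup: third_query }
end
```

The result is `{ first: :database, second: :query_engine, after_cleanup: :database }`. The second query, which belongs to the real response, picked up the shadow's override.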

Why the fiber-local is immune

Diagram: Fiber-local - Safe with Multiple Shadows (FigJam)

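Each shadow fiber sets the override in its own fiber storage. The request fiber never sees it, and neither do other shadow fibers spawned from the request fiber. Only a fiber created inside a shadow while the override is set would inherit a copy, which is the desired behavior for nested shadow work. No mutable state is shared.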
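A sketch of the same interleaving with fiber storage, assuming Ruby 3.2+. Two shadows are suspended mid-flight while the calling fiber keeps resolving its path. `SKETCH_OVERRIDE_KEY`, `current_path` and `interleaved_shadows` are hypothetical names, and core fibers again stand in for Async tasks.

```ruby
# Requires Ruby 3.2+. All names here are hypothetical stand-ins.
SKETCH_OVERRIDE_KEY = :__taxonomy_shadow_path_override__

def current_path
  Fiber[SKETCH_OVERRIDE_KEY] || :database
end

def interleaved_shadows
  observed = { request: [], shadows: [] }

  shadows = 2.times.map do
    Fiber.new do
      Fiber[SKETCH_OVERRIDE_KEY] = :query_engine
      Fiber.yield                        # simulate the shadow waiting on I/O
      observed[:shadows] << current_path
    ensure
      Fiber[SKETCH_OVERRIDE_KEY] = nil   # clears this shadow's storage only
    end
  end

  observed[:request] << current_path     # before any shadow runs
  shadows.each(&:resume)                 # both shadows set their override, then suspend
  observed[:request] << current_path     # request query while both overrides are live
  shadows.first.resume                   # first shadow finishes and cleans up
  observed[:request] << current_path
  shadows.last.resume                    # second shadow is unaffected by the first's cleanup
  observed
end
```

Every entry under `:request` is `:database` and every entry under `:shadows` is `:query_engine`, including the second shadow's read after the first shadow has already cleared its own override.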

Comparison

| | B (fiber-local) | B' (context mutation) |
|---|---|---|
| DualPathHelper change | 3 lines (read fiber-local) | Add `path_override:` param to `choose_path` + `with_path` |
| Call site change | None | Every call site must thread `path_override:` |
| Explicitness | Implicit (fiber-local read in `choose_path`) | Explicit (parameter in method signature) |
| Fiber safety | Isolated per fiber | Shared mutable state races across fibers |
| Multi-shadow safety | Each shadow has own override | Second shadow's override corrupts first |
| Risk to response correctness | None (request fiber untouched) | High (request fiber reads mutated context) |
| Maintenance | One place to understand (DualPathHelper) | Every call site must remember to pass override |

Recommendation

Option B (fiber-local) is the right choice:

- Zero call site changes -- DualPathHelper reads the override implicitly
- Fiber-safe by design -- each shadow fiber has its own isolated override
- No shared mutable state -- eliminates an entire class of concurrency bugs
- Minimal DualPathHelper change -- 3 lines added to choose_path
- No risk to response correctness -- the request fiber's storage is never written, provided the override is only ever set inside the shadow fiber

The "implicitness" of fiber-locals is actually a feature here: the shadow execution mechanism is an infrastructure concern that call sites shouldn't need to know about. It's the same pattern as CanonicalLogger.current (fiber-local logger) that Storefront already uses throughout.
