Created
January 29, 2026 19:36
32teeth created this gist
Jan 29, 2026
# Engineering Complexity Matrix: Easy/Difficult × Small/Large Radius

This framework helps teams classify, discuss, and de‑risk changes based on **logical complexity** (easy vs. difficult) and **blast radius** (small vs. large). Use it in grooming, architecture review, PR preparation, and onboarding.

------

## 1) Quadrant Matrix (Primary Framework)

> Classify the work first; the rest of the guidance follows from the quadrant.

| Quadrant | Effort (Logic) | Radius (Blast Surface) | Typical Examples | Risks | Recommended Strategy |
| --- | --- | --- | --- | --- | --- |
| **Q1: Easy–Small** | Low | Small | Localized UI tweak; copy change; private method refactor | Minimal | Standard PR; unit tests |
| **Q2: Easy–Large** | Low | Large | Date formatting change in shared helper; config key rename; design token tweak | Hidden regressions across many consumers | Introduce **facade/compat layer**; **contract tests** for top N consumers |
| **Q3: Difficult–Small** | High | Small | Complex algorithm refactor within an isolated module; tricky state machine | Local correctness issues | Pair review; property‑based tests; targeted docs |
| **Q4: Difficult–Large** | High | Large | Shared API response shape update; cross‑cutting auth flow; telemetry schema change | System‑wide break risk | **Versioned contracts**; migration plan; isolation boundary |

------

## 2) Key Dimensions Engineering Should Evaluate

| Dimension | Scale | What to Check | Why it Matters | Control |
| --- | --- | --- | --- | --- |
| **Criticality** | Low → High | Does it affect revenue, security, SLAs, or core journeys? | Higher stakes demand higher validation standards | Strengthen tests and reviews; enforce approvals |
| **Coupling (Fan‑out)** | Few → Many | How many modules/services depend on it? | More dependents = wider blast radius | Encapsulate behind a **facade/accessor** |
| **Compatibility Window** | Short → Long | Can old and new coexist? For how long? | Short windows force risky cutovers | **Versioned contract** or shim layer |
| **Observability** | Minimal → Full | Do we have logs/metrics/traces on the boundary? | You cannot fix what you cannot see | Instrument before change; create dashboards |
| **Rollback Cost** | Easy → Hard | How quickly can we revert or disable? | Hard rollbacks increase MTTR | Use a **feature flag** or configuration switch; keep changes isolated |
| **Volatility** | Stable → Volatile | Will this area change again soon? | Volatile areas invite rework | Add indirection; avoid lockstep coupling |
| **Ownership** | Clear → Diffuse | Do we know who owns each consumer? | Diffuse ownership increases coordination overhead | Notify owners; attach a migration guide |

------

## 3) Risk Score (Simple Model)

Use this to tune test depth and coordination overhead.
```
Risk = (Radius 1–5) + (Criticality 1–5) + (Volatility 1–5) − (Observability 1–5)

Guidance:
≤3   → Standard workflow
4–6  → Extra tests + broader peer review
7–9  → Structured rollout (no canary) + contract tests
≥10  → Versioned approach + explicit migration plan + senior review
```

------

## 4) Execution Matrix (What to Do by Quadrant)

| Quadrant | Test Scope | Architecture Controls | Release Strategy | Operational Prep |
| --- | --- | --- | --- | --- |
| **Easy–Small** | Unit tests | None | Standard | No additional ops needed |
| **Easy–Large** | Unit + **Contract tests** (top N consumers); screenshot tests if UI | **Facade/Accessor** + compatibility layer | Feature flag or config switch; document rollback steps | Basic dashboards and alerts for the affected boundary |
| **Difficult–Small** | Unit + property‑based + focused integration | Keep interface stable; verify encapsulation boundary | Standard or staged rollout by environment | Targeted metrics on the module |
| **Difficult–Large** | Unit + **Contract** + Integration + E2E smoke (critical flows only) | **Versioned contract**, isolation boundary, schema registry (if applicable) | Staged release by environment (e.g., dev → test → prod) with a clear revert path | SLO/error‑rate guardrails; runbook for reversion |

------

## 5) Change‑Type Heuristics (Map Your Change Quickly)

| Change Type | Likely Quadrant | Anti‑Pattern to Avoid | Preferred Pattern |
| --- | --- | --- | --- |
| Formatting/util change (date, currency, number) | Easy–Large | Editing call sites directly across the codebase | Central **formatter** module; versioned API if semantics change |
| Common UI control tweak | Easy–Large | Per‑page overrides | **Design system tokens** + component library update |
| Shared config rename | Easy–Large | Direct env reads in each service | **Typed config accessor** with defaults and validation |
| Algorithm upgrade (isolated) | Difficult–Small | Leaking new concerns across interfaces | Keep interface stable; property‑based tests |
| API response shape change | Difficult–Large | Breaking changes without a grace period | **Versioned API/contract** + consumer‑driven contract tests |
| Telemetry schema change | Easy–Large → Difficult–Large | Renaming fields without mapping or doc | **Telemetry facade** with mapping and deprecation window |

------

## 6) PR Checklist (Paste Into Your PR Template)

- **Quadrant classification** included (Easy/Difficult × Small/Large)
- **Risk score** provided and rationale noted
- **Architecture boundary** identified (facade, contract, isolation)
- **Test plan** attached (unit, contract, integration; screenshot if UI)
- **Rollback plan** documented (flag/switch name; exact steps)
- **Observability** in place (dashboards/alerts for the boundary)
- **Coordination**: impacted code owners tagged; migration guide attached if needed

------

## 7) PR Description Template (Minimal)

```
### Classification
Quadrant: <Easy–Small | Easy–Large | Difficult–Small | Difficult–Large>
Risk Score: <value> (R = Radius + Criticality + Volatility − Observability)

### Change Summary
- What: <one line>
- Why: <value/requirement>
- Scope: <repos/modules/pages>

### Controls
- Architecture: <facade/compat layer | versioned contract | isolation boundary>
- Tests: <unit, contract (top N consumers), integration, screenshot (if UI)>
- Release: <flag/switch name> with documented revert steps
- Rollback: <exact command/step to disable or revert>

### Observability
- Dashboards: <links>
- Alerts: <links>
- SLO Guardrails: <brief>

### Coordination
- Owners notified: <teams/users>
- Migration guide: <link>
```

------

## 8) Pragmatic Yes/No (for grooming and PR review)

- Can this change be **routed through a single facade** today?
- Do we have a **feature flag or config switch** and a **tested rollback**?
- Are **top N consumers** covered by **contract tests** in CI?
- Can old and new **coexist** via a **versioned contract**?
- Are **dashboards/alerts** in place **before** the change lands?

------

## 9) Socratic Prompts (to reduce future radius)

- What is the **smallest boundary** behind which 100% of this behavior can live?
- Which **consumer assumptions** are most likely false in production?
- If this will change again within a quarter, what **indirection** prevents rewiring?
- How does this design **lower the radius** of the next similar change?
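
------

As a rough illustration of the risk score model in section 3, the formula and its guidance bands could be computed as follows. This is a minimal sketch, not part of the framework itself: the names `RiskInputs`, `risk_score`, and `guidance` are hypothetical, and it assumes that any score of 3 or below (including the negative scores that arise when observability outweighs the other ratings) maps to the standard workflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskInputs:
    """Ratings for the four inputs of the risk model, each on a 1-5 scale.

    Hypothetical container type; the framework only defines the formula.
    """

    radius: int
    criticality: int
    volatility: int
    observability: int

    def __post_init__(self) -> None:
        # Reject ratings outside the 1-5 scale used by the formula.
        for name in ("radius", "criticality", "volatility", "observability"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")


def risk_score(inputs: RiskInputs) -> int:
    """Risk = Radius + Criticality + Volatility - Observability."""
    return (
        inputs.radius
        + inputs.criticality
        + inputs.volatility
        - inputs.observability
    )


def guidance(score: int) -> str:
    """Map a score to the guidance bands listed in section 3.

    Assumption: scores below the lowest listed band are treated as
    standard workflow, since the formula can produce values down to -2.
    """
    if score <= 3:
        return "Standard workflow"
    if score <= 6:
        return "Extra tests + broader peer review"
    if score <= 9:
        return "Structured rollout (no canary) + contract tests"
    return "Versioned approach + explicit migration plan + senior review"


# Example: a shared-helper change with wide radius and modest observability.
example = RiskInputs(radius=4, criticality=3, volatility=2, observability=2)
score = risk_score(example)
print(score, "->", guidance(score))
# prints: 7 -> Structured rollout (no canary) + contract tests
```

With ratings on a 1–5 scale, the score ranges from −2 to 14, so a check like this in a PR bot or pre-review script should handle the low end explicitly rather than assume a non-negative result.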