@alopezari
Created April 30, 2026 12:57
Pilot 19 — magellan-backups 2026-04-30 — Sonnet manager+planner, Haiku testers, playwright-cli-headless

Coverage Gaps — 2026-04-30T11-27-21_magellan-backups

Generated: 2026-04-30 (meta-review, pre-aggregation)
Run: 2026-04-30T11-27-21_magellan-backups
Charters reviewed: 6 (create-backup-broken-flow, backup-artifact-andlist, restore-destructive-andlist, schedule-feature-cluster, selective-export-cluster, breadth-tour)


Check 1: Hypothesis coverage

schedule-feature-cluster — 4 hypotheses silently skipped

S-H3, S-H4, S-H5, and S-H6 were all marked deprioritized / na due to turn budget exhaustion. The Tester ran out of turns at turn 8 (the hard cap) after probing only S-H1 and S-H2.

  • S-H3 (orphaned cron on disable) — not probed. Questions filed (Q1 in report), but no empirical probe attempted. Cron deregistration bugs are a known miss class.
  • S-H4 (cron cleanup on deactivation) — not probed. Questions filed (Q2 in report), but no empirical probe. backup-artifact-andlist partially covers deactivation (a4 file lifecycle) but does NOT check whether the cron event is removed on deactivation — it only checks whether ZIP files persist.
  • S-H5 (mode-toggle surface tour) — not probed. The charter Note at step-8.10 mandates a mode-affected surface tour literal; it is absent from coverage_notes and hypotheses_status.
  • S-H6 (invalid/empty time input) — explicitly deprioritized; no probe.

Severity: HIGH — S-H3 and S-H4 are the most impactful: orphaned cron after disable/deactivation is a real bug class for this plugin type. S-H5 is a required literal per tester-mindset-forms. S-H6 is lower risk.

breadth-tour — BT-F7b and BT-F2a not executed

  • BT-F7b (uninstall / orphaned data cleanup) — deferred as "destructive, run LAST." Breadth-tour was the last charter run; the deferred probe never executed.
  • BT-F2a (empty-state rendering) — couldn't test because a backup already existed. No workaround attempted (e.g., delete the backup, test empty-state, recreate). Marked na.

Severity: MEDIUM — BT-F7b (orphaned data on uninstall) completes the lifecycle story started by a4 and S-H4; the gap means uninstall cleanup is entirely uncovered. BT-F2a is low risk.

restore-destructive-andlist — actual restore not executed

b1 through b8 were all probed, but the Tester explicitly did NOT execute an actual Restore-from-list operation (F3) to "avoid data loss to the test site." The deviation states: "Did not execute actual restore operations (F3 restore-from-list) to avoid data loss to the test site; instead probed b1–b2 via UI inspection with dialog verification."

This means:

  • b2 (pre-restore snapshot) was probed by monitoring the backup directory from the UI — the Tester confirmed no new ZIP appeared — but the restore was never actually triggered, so the b2 finding is based on passive observation, not a live restore probe.
  • b4/b7 (corrupted ZIP) was probed via Upload & Restore, not the main Restore button.

Severity: MEDIUM — The b2 finding (no pre-restore snapshot) is probably correct, but it was not confirmed by actually executing a restore and observing the before/after state. The confidence declared is 0.95, which may be optimistic for a non-triggered probe.


Check 2: Static-analysis hypothesis coverage

Phase 1.5 skipped — no source_path set in MISSION.md, no static-analysis.md present. Checks 2a and 2b skipped.


Check 3: Recon-flagged surface coverage

Surprise | Description | Charter coverage | Session probe | Status
S1 | Create Full Backup triggers redirect but no backup | create-backup-broken-flow | H1–H5 all probed | Covered (root cause identified)
S2 | "Upload" button label on Export tab, direction ambiguous | selective-export-cluster | E-H4 probed | Covered
S3 | No visible Save button on Schedule tab | schedule-feature-cluster | S-H1 probed, save mechanism found | Covered
S4 | No mb_backups_backups option after backup attempt | create-backup-broken-flow, schedule-feature-cluster | H4 probed; storage is filesystem, not options | Covered
S5 | Backup directory under webroot: web-accessible? | backup-artifact-andlist | a1 probed, confirmed critical bug | Covered

All 5 recon surprises covered. No recon gap found.


Check 4: AND-list aggregate vs per-handler (b6)

restore-destructive-andlist scored b6 (capability check) as pass. The plugin has two write paths for restore:

  • F3: Restore from existing backup list (triggered from UI, uses AJAX)
  • F4: Upload & Restore (file upload + AJAX)

The Tester confirmed the capability check via class-mb-restore.php line 7: current_user_can('manage_options') and tested the editor role at the page level (access denied). However, the b6 probe did NOT separately verify per-handler capability on the Upload & Restore AJAX handler. The restore-destructive-andlist report states b6 was "verified via code" for a single reference, and tested "via UI" (editor cannot access the page). Two distinct write handlers exist; only aggregate page-level access was tested, not a direct POST to each AJAX endpoint with a crafted editor-role request.

Severity: LOW — The page-level gate means the editor cannot reach the UI to find the form nonce, so direct POST exploitation is reduced. However, if the CSRF vulnerability found by breadth-tour (BT-F4a: missing nonce on Upload & Restore) is real, combined with a per-handler missing capability check on the upload AJAX endpoint, this would be a compound security finding. The per-handler POST test was not run.


Check 5: Round-trip / compositional probes

Schedule: save × reload

schedule-feature-cluster probed save×reload explicitly and found the time field format-conversion bug (saved as '12:00 AM', displayed as '00:00'). This probe was executed.
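
The save × reload comparison above can be sketched as a small check that separates a pure format mismatch from a changed value. This is a minimal illustration, not the plugin's or the Tester's code: the helper names `to_24h` and `classify_round_trip` are hypothetical, and the two accepted formats are assumed from the '12:00 AM' and '00:00' values quoted in the finding.

```python
from datetime import datetime


def to_24h(value):
    """Normalize '00:00' or '12:00 AM' style strings to 24-hour HH:MM."""
    # Assumption: only these two formats occur, as in the finding above.
    for fmt in ("%H:%M", "%I:%M %p"):
        try:
            return datetime.strptime(value.strip(), fmt).strftime("%H:%M")
        except ValueError:
            continue
    raise ValueError(f"unrecognized time format: {value!r}")


def classify_round_trip(saved, displayed):
    """Compare the stored string with the string shown after reload."""
    if saved == displayed:
        return "match"
    if to_24h(saved) == to_24h(displayed):
        # Same instant, different representation: a format-conversion bug.
        return "format-mismatch"
    return "value-mismatch"


verdict = classify_round_trip("12:00 AM", "00:00")
print(verdict)
```

Comparing normalized values keeps the probe from reporting data loss when only the representation differs between the save path and the display path.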

Export: artifact round-trip

No export×import round-trip was probed. The Selective Export tab produces .sql files. Whether those SQL exports can be successfully imported back (e.g., via a database restore) was never tested. The Tester confirmed the export produces a file and its contents are correct, but no import/re-import probe was attempted.

Severity: LOW — Export is SQL only; re-import is a separate operation not in the plugin's scope. However, if the export omits tables needed for a functional restore, this would be invisible without a round-trip test.

Backup: backup × restore round-trip

As noted in Check 1 (restore-destructive-andlist deviation), an actual restore was never executed. The backup×restore round-trip — create backup → restore backup → verify site state matches — was never completed end-to-end.

Severity: HIGH — This is the headline composite flow for a backup plugin. The round-trip was never executed in either direction (from list or from upload). All restore findings are based on UI inspection, code review, and the Upload&Restore form with corrupted ZIPs — never on a live restore of a real backup. A "happy path restore works correctly" assertion cannot be made from the existing evidence.


Check 6: Empirical-probe-is-mandatory

Questions filed without empirical probe:

Session | Question | Probe attempt?
schedule-feature-cluster | Q1: Cron lifecycle on disable: is cron orphaned? | None; explicitly deprioritized (S-H3)
schedule-feature-cluster | Q2: Plugin deactivation cleanup: cron removed? | None; explicitly deprioritized (S-H4)
schedule-feature-cluster | Q3: Mode indicator visibility after enabling schedule | None; explicitly deprioritized (S-H5)
restore-destructive-andlist | Q1: Should restore create a pre-backup snapshot? | Probe was passive (monitoring directory), not a triggered restore
restore-destructive-andlist | Q2: Is corrupted ZIP error visible in UI? | Probed (Upload & Restore with truncated ZIP); architecturally attempted
breadth-tour | Q1: Why are there already two backup files on a fresh site? | No empirical probe; filed as question only

Gaps: schedule-feature-cluster Q1–Q3 each correspond to a deprioritized hypothesis with no probe. These are hard violations of the empirical-probe-is-mandatory rule — the questions were not architecturally blocked, they were budget-blocked. The run ended with turns remaining in other sessions; re-dispatch or budget increase for schedule-feature-cluster could have resolved them.

Severity: HIGH for Q1/Q2 (S-H3/S-H4), MEDIUM for Q3 (S-H5).


Check 7: Custom-widget classification

Recon confirms the plugin is admin-only with no frontend surfaces — no shortcodes, blocks, or public-facing UI. No overlay-shaped widget detected. Check 7 not applicable.


Check 8: Must-cover flows

No must-cover flows declared in MISSION.md. Check 8 skipped.


Check 9: Feature anchor completeness

Anchor type | Feature | Probe quota met?
artifact-producing | F1 Create Full Backup (a1–a7) | Yes (all 7 anchors probed in backup-artifact-andlist)
artifact-producing | F5 Selective Export (a1–a7 subset) | Partially: E-H1 through E-H5 probed; scale-sensitive c2 fallback for export handler source pattern not recorded with the required literal
scale-sensitive | F1 Create Full Backup (a7) | Yes (literal recorded in backup-artifact-andlist)
scale-sensitive | F5 Selective Export (scale) | No. Charter note: "scale-sensitive c2 fallback: source pattern noted for selective export handler; literal recorded if pattern found." No literal recorded in selective-export-cluster coverage_notes or hypotheses_status.
destructive-op | F3/F4 Restore (b1–b8) | All 8 anchors listed; actual restore never triggered (see Check 5)
AJAX-exposed | F1 Create Full Backup | Yes (AJAX handler verified working)
AJAX-exposed | F6 Schedule settings | Yes (save via POST confirmed)
AJAX-exposed | F4 Upload & Restore | Partial: nonce gap found by breadth-tour; per-handler capability not directly tested via POST

Gap: selective-export-cluster did not record the mandatory scale-sensitive c2 fallback literal. Charter notes explicitly required: scale-sensitive c2 fallback: source pattern filed as <severity> Problem — <file>:<line> <pattern>. The coverage_notes mention no such literal.

Severity: MEDIUM — a7-equivalent for selective export was not probed at source-code level; if the export handler also uses unbounded SELECT *, it would be a duplicate of the backup P7 finding and would be missed.


Check 10: Coverage-note forcing-function strings

Required literals checked:

Charter | Required literal | Present?
backup-artifact-andlist | default blast radius probed: activation cron registration → Y | Yes (hypotheses_status a5)
backup-artifact-andlist | scale-sensitive c2 a7: unbounded SELECT * | Yes (hypotheses_status a7)
restore-destructive-andlist | default blast radius probed: restore without pre-snapshot → Y/N | Present in coverage_notes
schedule-feature-cluster | mode-affected surface tour: enable schedule → visit [...] → mode indicator found: Y/N | ABSENT (coverage_notes makes no mention)
schedule-feature-cluster | toggle-state-leak probe: <result> | ABSENT (not in coverage_notes)
schedule-feature-cluster | empty-required-fields probe: <result> | ABSENT (not in coverage_notes)
selective-export-cluster | multi-surface a3: selective export artifact — user_pass: Y/N | Partially (evidence says "confirmed-bug" but no explicit literal in coverage_notes)
selective-export-cluster | scale-sensitive c2 fallback: source pattern filed as... | ABSENT

Severity: HIGH for schedule-feature-cluster missing mode-affected surface tour literal (mandatory per step-8.10 and charter Notes). MEDIUM for missing toggle-state-leak probe and empty-required-fields probe literals. MEDIUM for the missing selective-export scale literal (matching the rating given in Check 9 and in G6 of the summary).


Check 11: External-resource-failure probe coverage

Recon confirms no external dependencies detected — plugin is self-contained, stores to local filesystem, no outbound HTTP, no third-party API. No external resource failure probes required.


Check 12: Content-authoring UX probe coverage

Plugin ships no starter content, patterns, or sample data. Check 12 not applicable.


Check 13: Route-content-depth probe coverage

Features marked "pass" in session reports:

Feature / hypothesis | Status | Content-level assertion present?
a2 (filename collision) | pass | Yes: two filenames documented, different timestamps confirmed
b1 (confirmation dialog) | pass | Yes: dialog behavior documented with screenshot
b3 (no dry-run) | pass | Yes: absence confirmed by UI inspection
b5 (no undo path) | pass | Yes: absence confirmed by UI inspection
b6 (capability gate) | pass | Partial: code reference + page-level UI test; no direct handler POST test
b8 (zip-slip) | pass | Yes: marker file absence confirmed at SITE_PATH
E-H3 (content type isolation) | pass | Yes: multiple export files inspected for cross-type leakage
E-H4 (button label) | pass/refuted | Yes: actual button label inspected and documented
BT-F3a (capability check) | pass | Yes: editor role access denied, screenshot
BT-F5a (export page no errors) | pass | Yes: console error count confirmed

No significant content-depth gaps on "pass" routes. b6 is already flagged in Check 4.


Summary of gaps

# | Gap | Severity | Charter | Details
G1 | Backup × restore round-trip never executed | HIGH | restore-destructive-andlist | Actual restore never triggered; all findings based on UI observation only
G2 | S-H3/S-H4: cron orphan on disable/deactivation not probed | HIGH | schedule-feature-cluster | Budget exhausted; two Questions filed but zero empirical probes
G3 | mode-affected surface tour literal absent (step-8.10) | HIGH | schedule-feature-cluster | Mandatory literal not recorded; S-H5 never executed
G4 | BT-F7b uninstall / orphaned data not executed | MEDIUM | breadth-tour | Deferred as "last" but never run even as last charter
G5 | Per-handler capability POST test not executed (b6, two handlers) | LOW | restore-destructive-andlist | Page-level gate confirmed; individual AJAX endpoint POST not tested
G6 | scale-sensitive c2 fallback literal absent for selective export | MEDIUM | selective-export-cluster | Charter mandated literal; not recorded in coverage_notes
G7 | toggle-state-leak and empty-required-fields literals absent | MEDIUM | schedule-feature-cluster | Charter notes mandated both probes; neither executed nor recorded
G8 | Export × import round-trip not probed | LOW | selective-export-cluster | SQL exports verified for content; re-import not in scope but creates blind spot

Headline verdict

3 high-severity gaps, 3 medium-severity gaps, 2 low-severity gaps

The most consequential gap is G1: a backup plugin's primary value proposition (the backup × restore round-trip) was never exercised end-to-end. All restore findings are pre-trigger observations, not post-restore state verifications. Combined with G2 (cron orphan on disable/deactivation untested, a known high-frequency bug class for this plugin type), approximately 3–4 bugs in the cron lifecycle and restore round-trip surface areas are likely undetected.

Escape analysis — magellan-backups 2026-04-30T11-27-21_magellan-backups

Recall against answer key: 8/10 planted issues caught (after the Issue 9 correction documented below)


Per-issue verdicts

# | Issue | Verdict | Matched to / why missed
1 | Progress bar always shows 100% | caught-exact | breadth-tour: "[major] Progress bar shows 100% complete before any backup has been created"
2 | Schedule time format mismatch (24h display / 12h save) | caught-exact | schedule-feature-cluster: "[major] Time field format-conversion display bug: saved time does not re-populate in correct format on reload"
3 | Notification email empty recipient (option-name typo) | missed | No filed Problem or Question covers the magellan_backups_email vs magellan_backup_email option-name mismatch or its consequence (empty recipient). A Question in schedule-feature-cluster probes the cron orphan and cleanup path but not the email key typo.
4 | User export includes hashed passwords | caught-exact | Two independent catches: backup-artifact-andlist (a3-leakage) + selective-export-cluster (Users export with literal hash in evidence)
5 | Uploads directory missing from backup | caught-semantically | backup-artifact-andlist: "[major] 'Full Backup' label is misleading — backup omits wp-content/uploads/" covers the same root cause ($dirs hardcode); framing is user-expectation violation rather than a code-level finding, but root cause and consequence are identical
6 | No pre-restore backup | caught-exact | restore-destructive-andlist: "[critical] Restore operation overwrites live site without pre-restore backup snapshot"
7 | Backups publicly accessible via URL | caught-exact | backup-artifact-andlist: "[critical] Backup ZIP files are publicly accessible without authentication"
8 | Corrupt restore truncates database tables | caught-semantically | restore-destructive-andlist: "[critical] Upload & Restore accepts corrupted/truncated ZIP without error validation". It focuses on the user-facing error-message gap rather than the DROP TABLE race, but the reproducer (truncated ZIP → silent proceed) covers the same destructive path; same root-cause class
9 | Large database causes memory exhaustion | missed (initial verdict; corrected to caught-exact in the note below) | Initially judged as not covered by any filed Problem or Question. On re-reading, backup-artifact-andlist did file the a7 (scale/memory envelope) finding as a critical Problem, with a source citation for the unbounded SELECT * via $wpdb->get_results(). See note below.
10 | Concurrent backups corrupt zip file (HHMM filename collision) | missed | No filed Problem or Question addresses minute-precision filename collision between simultaneous backup requests.

Note on Issue 9 re-read

backup-artifact-andlist filed: "[critical] Backup creation path loads unbounded database tables into memory without pagination (a7 — scale/memory envelope)" with direct source reference to class-mb-backup.php:64, $wpdb->get_results('SELECT * FROM ...') without LIMIT, and the reasoning that studio-scale hides the failure. This is a precise match to Issue 9. Verdict corrected to caught-exact.

Revised recall: 8/10 (Issues 3 and 10 missed).


Revised per-issue verdicts (corrected)

# | Issue | Verdict | Matched to / why missed
1 | Progress bar always 100% | caught-exact | breadth-tour
2 | Schedule time format mismatch | caught-exact | schedule-feature-cluster
3 | Notification email empty recipient | missed | No match across all 20 Problems and 10 Questions
4 | User export includes hashed passwords | caught-exact | backup-artifact-andlist + selective-export-cluster
5 | Uploads directory missing | caught-semantically | backup-artifact-andlist (broader expectation framing)
6 | No pre-restore backup | caught-exact | restore-destructive-andlist
7 | Backups publicly accessible | caught-exact | backup-artifact-andlist
8 | Corrupt restore truncates DB | caught-semantically | restore-destructive-andlist (error-handling framing vs DROP TABLE race framing)
9 | Large DB memory exhaustion | caught-exact | backup-artifact-andlist (a7: scale/memory, critical, source-cited)
10 | Concurrent backups corrupt zip | missed | No match across all sessions

Final recall: 8/10
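
The recall figure can be recomputed directly from the corrected verdict table. The sketch below is only a bookkeeping aid; the `verdicts` mapping is transcribed from the revised table above, and treating both caught-exact and caught-semantically as caught is the convention this report appears to use.

```python
# Corrected per-issue verdicts, transcribed from the revised table.
verdicts = {
    1: "caught-exact",
    2: "caught-exact",
    3: "missed",
    4: "caught-exact",
    5: "caught-semantically",
    6: "caught-exact",
    7: "caught-exact",
    8: "caught-semantically",
    9: "caught-exact",
    10: "missed",
}

# Both caught-exact and caught-semantically count toward recall.
caught = [issue for issue, verdict in verdicts.items() if verdict.startswith("caught")]
missed = [issue for issue, verdict in verdicts.items() if verdict == "missed"]

print(f"recall: {len(caught)}/{len(verdicts)}, missed issues: {missed}")
```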


Miss analysis

Miss 1 — Issue 3: Notification email empty recipient (option-name typo)

  • Root cause class: Coverage gap — option-name round-trip probe absent. The save path and read path for settings values were not independently verified. Testers verified the schedule UI, cron lifecycle, and email field presence, but never probed whether a saved email value actually arrives in a triggered notification.
  • Why it escaped: Every session that touched the Schedule tab either probed the time-format display bug (Issue 2), the cron lifecycle (orphan cron), or the delivery trigger. No charter hypothesis targeted "save a value → trigger the action → verify the value was consumed from the correct option key." The bug is invisible to UI inspection and requires either source reading (magellan_backups_email vs magellan_backup_email) or an end-to-end trigger → observe probe. The recon briefing noted an email field but did not flag the key-name mismatch, so no charter anchored on it.
  • Proposed amendment:
    • File: skills/tester-mindset/SKILL.md

    • Section: Add under "Probe what the feature produces" (after the artifact AND-list section)

    • Rule text:

      Settings option-name round-trip probe

      For any settings field that feeds a downstream action (email notification, API call, redirect target, cron argument), verify the full save → read → consume path, not just the save → display round-trip.

      Why: A common bug class is a mismatch between the option key used at save time and the option key used at consume time (e.g., magellan_backups_email saved, magellan_backup_email read). The field appears populated in the UI because the display reads from the same save key; the action silently reads from a different key and gets an empty/stale value.

      How to apply:

      1. Save a recognizable value in the settings field (e.g., test@example.com).
      2. Trigger the action that consumes the value (send notification, fire the scheduled event, execute the API call).
      3. Verify the action received the correct value — check email log, API request log, cron output, or WP-CLI audit.
      4. Do NOT stop at "field shows the saved value on reload" — that only proves the display path is consistent, not the consume path.

      Coverage note literal: option-round-trip probed: save-path key matches consume-path key? → Y/N

      Generalization check: This rule catches a typo in a WooCommerce store-notice option key that results in the notice never rendering, an SMTP plugin that saves password to smtp_pass but reads smtp_password, or an SEO plugin that saves a sitemap-include flag to a key the sitemap generator never reads.

      Cross-pilot pattern: Novel class — not recorded in prior retros (Pilots 1–18).

    • Citation: Run 2026-04-30T11-27-21_magellan-backups, Issue 3.
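
The proposed option-name round-trip probe can be illustrated with a toy model of the bug class. This is a hedged sketch, not the plugin's code: `OptionStore` and the three accessor functions are hypothetical stand-ins for an options API, and the save/read key assignment follows the example given in the rule text.

```python
SAVE_KEY = "magellan_backups_email"  # key written by the settings form (per the rule text)
READ_KEY = "magellan_backup_email"   # key read by the notification path (per the rule text)


class OptionStore:
    """Hypothetical in-memory stand-in for a key/value options table."""

    def __init__(self):
        self._data = {}

    def update(self, key, value):
        self._data[key] = value

    def get(self, key, default=""):
        return self._data.get(key, default)


def save_settings(store, email):
    store.update(SAVE_KEY, email)


def settings_form_value(store):
    # The display path reads the same key the form saved to.
    return store.get(SAVE_KEY)


def notification_recipient(store):
    # The consume path reads a different key and silently gets the default.
    return store.get(READ_KEY)


def option_round_trip_probe(store, sentinel="test@example.com"):
    """Save a sentinel, then check both the display and the consume path."""
    save_settings(store, sentinel)
    return {
        "display_ok": settings_form_value(store) == sentinel,
        "consume_ok": notification_recipient(store) == sentinel,
    }


result = option_round_trip_probe(OptionStore())
print(result)
```

A save × reload check alone would report success here because `display_ok` is true; only the consume-path assertion exposes the mismatch.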


Miss 2 — Issue 10: Concurrent backups corrupt zip (HHMM filename collision)

  • Root cause class: Coverage gap — concurrency / timing dimension (SFDPOT Operations×Time). No charter probed simultaneous execution of the same write operation with overlapping filesystem output paths.
  • Why it escaped: The sibling-propagation rule fired on the export filename collision in prior pilots (Pilot 10, Pilot 11) but the current run had no explicit "filename uniqueness" hypothesis anchored in any charter. The backup-artifact-andlist charter focused on access control, contents, and memory scale. The schedule-feature-cluster charter focused on the UI form round-trip and cron lifecycle. Neither charter had a hypothesis for "two concurrent requests writing to the same output file." The HHMM precision detail (minute-level, not second-level) is in the source but requires reading class-mb-backup.php line 18 to observe.
  • Proposed amendment:
    • File: skills/tester-mindset/SKILL.md

    • Section: Add as a sub-bullet under "Probe what the feature produces" → artifact AND-list, item a8 (new); OR reinforce the existing sibling-propagation rule

    • Rule text:

      Artifact filename uniqueness probe (a8)

      For any feature that writes a file to disk (backup zip, export SQL, generated report, cache file), verify the filename scheme provides uniqueness under concurrent requests. A common bug class is minute-precision or date-only timestamps that collide when two requests execute within the same resolution window.

      Why: Minute-precision timestamps (e.g., date('Y-m-d-Hi')) silently produce identical filenames for two requests in the same minute. Both writes open the same file handle, and whichever completes second truncates or corrupts the first. The bug is invisible at test scale (sequential requests) but fatal in production (scheduled + manual trigger in the same minute).

      How to apply:

      1. Read the filename generation code (or observe a created file's name structure).
      2. Identify the precision: second-level (H:i:s) is usually safe; minute-level (H:i) or date-only (Y-m-d) is a collision risk.
      3. If the precision is minute or coarser, file as a Problem (major) — the bug class guarantees collision on overlapping triggers.
      4. For empirical confirmation: trigger two requests in the same minute (manual button + wp cron event run), then count files — one file instead of two confirms collision.

      Coverage note literal: artifact-filename-uniqueness probed: timestamp precision → <value>; collision risk → Y/N

      Generalization check: This rule catches a WP Rocket cache file generator that uses date-only filenames and corrupts cached pages on heavy traffic, a WooCommerce invoice exporter whose PDFs collide during a flash sale, or a log-rotation plugin that truncates logs on simultaneous cron + admin flush.

      Cross-pilot pattern: Reinforcement — the filename-collision sibling-propagation pattern appeared in Pilots 1, 10, and 11 on this same plugin (Issue 10 caught via sibling-propagation on export surface in Pilot 11; caught-exact by multiple Testers in Pilot 10 via sibling-propagation). The current run lacked an explicit primary-surface anchor for this class; the sibling rule alone is insufficient when no primary finding is filed first. This amendment adds a primary-surface probe to the artifact AND-list so it fires independently of sibling discovery.

    • Citation: Run 2026-04-30T11-27-21_magellan-backups, Issue 10; reinforces Pilots 1, 10, 11 pattern.
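
The precision check in the proposed a8 probe can be sketched as follows. This is an illustration under stated assumptions: the `backup-` prefix and `.zip` suffix are taken from the filenames quoted elsewhere in this report, the minute-precision pattern mirrors the `date('Y-m-d-Hi')` example in the rule text, and the collision-resistant variant (seconds plus a random token) is one possible mitigation, not the plugin's actual scheme.

```python
import secrets
from datetime import datetime


def minute_precision_name(now):
    # Mirrors the date('Y-m-d-Hi') pattern: seconds are discarded.
    return f"backup-{now.strftime('%Y-%m-%d-%H%M')}.zip"


def collision_resistant_name(now, token=None):
    # Assumed mitigation: second-level precision plus a random suffix.
    token = token or secrets.token_hex(4)
    return f"backup-{now.strftime('%Y-%m-%d-%H%M%S')}-{token}.zip"


# Two triggers in the same minute, e.g. a scheduled run and a manual click.
first = datetime(2026, 4, 30, 11, 43, 5)
second = datetime(2026, 4, 30, 11, 43, 52)

collides = minute_precision_name(first) == minute_precision_name(second)
unique = collision_resistant_name(first) != collision_resistant_name(second)

print(minute_precision_name(first), "collides:", collides, "mitigated:", unique)
```

The random token matters for requests that land within the same second; a timestamp alone only narrows the collision window.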


Summary

  • Final recall: 8/10 planted issues caught
  • 2 misses: Issue 3 (option-name round-trip, novel class) and Issue 10 (artifact filename collision, reinforcement)
  • 2 miss classes identified (1 novel, 1 reinforcement of a prior pattern)
  • 2 amendments proposed:
    1. Option-name round-trip probe (novel class) — skills/tester-mindset/SKILL.md, "Probe what the feature produces" section
    2. Artifact filename uniqueness probe (a8): skills/tester-mindset/SKILL.md, artifact AND-list extension
  • Cross-pilot reinforcements:
    • Issue 10 (filename collision): third consecutive miss of primary-surface probe across magellan-backups runs (Pilots 1, 10, current); prior pilots caught it only via sibling-propagation — a8 makes it a primary anchor
    • Issue 9 (scale memory): caught-exact this run — the c2 Reinforcement 3 (shipped from Pilot 17) fired correctly on backup-artifact-andlist; no regression
    • Issue 3 (option-name typo) has no prior retro entry — genuinely novel coverage gap
  • Bonus findings (not in answer key): 12 additional Problems filed across 7 sessions (CSRF on create/upload forms, cron not deregistered on disable, deactivation cleanup gap, restore state not reverted, plugin delete failure, automatic cron on activation, empty export no feedback, manual export download link UX)

Testing Report — magellan-backups

Run ID: 2026-04-30T11-27-21_magellan-backups
Generated: 2026-04-30T12:00:05.445Z
Plugin version: 1.0.0
Sessions processed: 7
Sessions with errors: 1


Executive summary

Category | Count
Problems | 20
Questions | 10
Improvements | 17
Praises | 13

Problem severity breakdown

Severity | Count
critical | 11
major | 8
minor | 1
trivial | 0

Severity heatmap by area

Area | Critical | Major | Minor | Trivial | Risk score
Backup artifact storage and access control | 1 | 0 | 0 | 0 | 4
Backup artifact contents — sensitive data leakage | 1 | 0 | 0 | 0 | 4
Backup artifact generation — scale-sensitive performance | 1 | 0 | 0 | 0 | 4
Upload & Restore form — CSRF protection | 1 | 0 | 0 | 0 | 4
Create Full Backup button — CSRF protection | 1 | 0 | 0 | 0 | 4
plugin-wide diagnostic | 1 | 0 | 0 | 0 | 4
Restore (b2 pre-operation snapshot missing) | 1 | 0 | 0 | 0 | 4
Restore (b4 & b7 error handling missing) | 1 | 0 | 0 | 0 | 4
Selective Export feature — user data export | 1 | 0 | 0 | 0 | 4
Backup & Restore | 1 | 0 | 0 | 0 | 4
Scheduled Backups | 1 | 0 | 0 | 0 | 4
Backup artifact completeness — user-expectation violation | 0 | 1 | 0 | 0 | 3
Backup artifact lifecycle management | 0 | 1 | 0 | 0 | 3
Backup scheduling — default behavior without explicit user action | 0 | 1 | 0 | 0 | 3
Backup & Restore tab — UI state | 0 | 1 | 0 | 0 | 3
Schedule tab — form accessibility | 0 | 1 | 0 | 0 | 3
Schedule settings form | 0 | 1 | 0 | 0 | 3
Selective Export feature | 0 | 1 | 0 | 0 | 3
Plugin uninstall | 0 | 1 | 0 | 0 | 3
Selective Export form validation | 0 | 0 | 1 | 0 | 2
Backup & Restore tab | 0 | 0 | 0 | 0 | 0
Progress bar state | 0 | 0 | 0 | 0 | 0
Schedule tab | 0 | 0 | 0 | 0 | 0
Create Full Backup button | 0 | 0 | 0 | 0 | 0
Existing Backups table | 0 | 0 | 0 | 0 | 0
Access Control | 0 | 0 | 0 | 0 | 0
Plugin Lifecycle | 0 | 0 | 0 | 0 | 0
Selective Export Tab | 0 | 0 | 0 | 0 | 0
empty-state UX | 0 | 0 | 0 | 0 | 0
user-feedback-timing | 0 | 0 | 0 | 0 | 0
empty-state copy | 0 | 0 | 0 | 0 | 0
backup-creation-core-function | 0 | 0 | 0 | 0 | 0
security-nonce-handling | 0 | 0 | 0 | 0 | 0
backup-list-ui | 0 | 0 | 0 | 0 | 0
Restore safety practices | 0 | 0 | 0 | 0 | 0
Restore upload error messaging | 0 | 0 | 0 | 0 | 0
Restore user experience | 0 | 0 | 0 | 0 | 0
Restore safety | 0 | 0 | 0 | 0 | 0
Restore error UX | 0 | 0 | 0 | 0 | 0
Restore security (b8) | 0 | 0 | 0 | 0 | 0
Restore access control (b6) | 0 | 0 | 0 | 0 | 0
Restore robustness (b4) | 0 | 0 | 0 | 0 | 0
Schedule settings / cron management | 0 | 0 | 0 | 0 | 0
Plugin lifecycle / cron cleanup | 0 | 0 | 0 | 0 | 0
Schedule settings / UI feedback | 0 | 0 | 0 | 0 | 0
Schedule settings form UX | 0 | 0 | 0 | 0 | 0
Schedule settings form / data handling | 0 | 0 | 0 | 0 | 0
Schedule settings / cron integration | 0 | 0 | 0 | 0 | 0
Schedule settings / data structure | 0 | 0 | 0 | 0 | 0
Selective Export UX | 0 | 0 | 0 | 0 | 0
Selective Export security | 0 | 0 | 0 | 0 | 0
Plugin deletion and uninstall | 0 | 0 | 0 | 0 | 0
Scheduled Backups UI | 0 | 0 | 0 | 0 | 0
Restore functionality | 0 | 0 | 0 | 0 | 0
Cron management | 0 | 0 | 0 | 0 | 0

Risk score = 4·critical + 3·major + 2·minor + 1·trivial
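
As a worked example of the formula above, the following sketch computes the score for a single heatmap row and for the run's overall severity breakdown. The function name `risk_score` is hypothetical; the weights come from the stated formula and the counts from the severity breakdown table. The run-wide total is derived here for illustration and is not a figure stated in the report.

```python
# Weights taken from the stated formula: 4·critical + 3·major + 2·minor + 1·trivial.
WEIGHTS = {"critical": 4, "major": 3, "minor": 2, "trivial": 1}


def risk_score(critical=0, major=0, minor=0, trivial=0):
    counts = {"critical": critical, "major": major, "minor": minor, "trivial": trivial}
    return sum(WEIGHTS[severity] * count for severity, count in counts.items())


# One heatmap row with a single critical Problem.
single_critical_area = risk_score(critical=1)

# The whole run, using the severity breakdown (11 critical, 8 major, 1 minor).
run_total = risk_score(critical=11, major=8, minor=1)

print(single_critical_area, run_total)
```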

Top problems

1. [CRITICAL] Backup ZIP files are publicly accessible without authentication (a1 — location/access control)

  • Area: Backup artifact storage and access control
  • Persona affected: admin
  • Confidence: 1
  • Session: backup-artifact-andlist

Steps to reproduce:

  1. Activate magellan-backups plugin
  2. Trigger a backup creation (via wp cron event run mb_scheduled_backup or UI button)
  3. Verify ZIP file exists in wp-content/magellan-backups/
  4. As unauthenticated user, curl -o /dev/null -w '%{http_code}' http://site/wp-content/magellan-backups/backup-*.zip

Expected: HTTP 403 Forbidden or 404 Not Found (access denied without authentication)

Actual: HTTP 200 OK — ZIP file downloads successfully without authentication

Evidence: [console](sessions/backup-artifact-andlist/HTTP response code for direct ZIP access)

Notes: Critical security finding: any unauthenticated visitor can download full site backups containing database dumps with sensitive data. Storage directory lacks .htaccess protection or index.php guard. User expectations: backup files should require authentication to download.

2. [CRITICAL] Backup ZIP contains unredacted WordPress user password hashes (a3-leakage)

  • Area: Backup artifact contents — sensitive data leakage
  • Persona affected: admin
  • Confidence: 1
  • Session: backup-artifact-andlist

Steps to reproduce:

  1. Create a full backup (via cron or UI)
  2. Download backup ZIP
  3. Extract: unzip -p backup.zip database.sql
  4. Inspect: grep -A 5 'INSERT INTO wp_users' database.sql

Expected: Password column either absent or redacted (hashes replaced with placeholder or empty string)

Actual: INSERT INTO wp_users VALUES(...) includes unredacted user_pass column with bcrypt hashes ($wp$2y$10$...)

Evidence: [console](sessions/backup-artifact-andlist/INSERT INTO wp_users VALUES('1','admin','$wp$2y$10$/xA8azjPYfcE04MRuO8DCutXRCv5odUMPgcsaUxMSJ8MUcLwRKQim',...)

Notes: High-value target: backup ZIPs downloaded by unauthenticated users (a1 finding) now contain password hashes ready for offline cracking. Combined with a1, this is a complete account takeover vector.

3. [CRITICAL] Backup creation path loads unbounded database tables into memory without pagination (a7 — scale/memory envelope)

  • Area: Backup artifact generation — scale-sensitive performance
  • Persona affected: admin
  • Confidence: 1
  • Session: backup-artifact-andlist

Steps to reproduce:

  1. Review source: includes/class-mb-backup.php:64
  2. Observe: $wpdb->get_results('SELECT * FROM {$table}') without LIMIT
  3. On production site with thousands of posts or large wp_postmeta rows, trigger backup creation

Expected: Backup code chunks database reads (e.g., 1000-row pagination loop) to avoid exhausting PHP memory_limit

Actual: Unbounded SELECT * loads entire table into PHP memory array; on production sites with 50k+ rows in wp_posts or wp_postmeta, triggers fatal PHP error or extended timeout

Evidence: console

Notes: Source pattern guarantees failure on production data volumes. Studio/SQLite test site only exercised small fixture, producing 13 MB backup without memory pressure. This problem is invisible at test scale but fatal at production scale. Applies to every table: wp_posts, wp_postmeta, wp_comments, wp_commentmeta, wp_options.
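The chunked read described under Expected can be illustrated with keyset pagination. This is a sketch in Python against SQLite rather than the plugin's PHP and `$wpdb`; the function name `iter_rows`, the `id` key column, and the 1000-row chunk size are assumptions for illustration (WordPress tables use keys such as ID or meta_id).

```python
import sqlite3


def iter_rows(conn, table, key="id", chunk_size=1000):
    """Yield rows in key order, fetching at most chunk_size rows per query.

    table and key are interpolated into the SQL, so they must come from
    trusted code, never from user input.
    """
    last_key = None
    while True:
        if last_key is None:
            cur = conn.execute(
                f"SELECT * FROM {table} ORDER BY {key} LIMIT ?", (chunk_size,))
        else:
            cur = conn.execute(
                f"SELECT * FROM {table} WHERE {key} > ? ORDER BY {key} LIMIT ?",
                (last_key, chunk_size))
        key_index = [col[0] for col in cur.description].index(key)
        rows = cur.fetchall()
        if not rows:
            return
        yield from rows
        # Resume after the last key seen instead of using a growing OFFSET.
        last_key = rows[-1][key_index]
```

Only one chunk is held in memory at a time, so peak memory depends on the chunk size rather than the table size.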

4. [CRITICAL] Selective Export 'Users' option exports password hashes (user_pass column) from wp_users table

  • Area: Selective Export feature — user data export
  • Persona affected: admin
  • Confidence: 1
  • Session: selective-export-cluster

Steps to reproduce:

  1. Navigate to /wp-admin/admin.php?page=mb-backups&tab=export
  2. Check ONLY 'Users' checkbox
  3. Click 'Export Selected'
  4. Download the export file
  5. Open the .sql file and search for 'wp_users' table

Expected: Exported wp_users table should NOT include password hashes; user data should be redacted or the export should exclude the user_pass column entirely

Actual: Exported .sql includes full wp_users table with password hashes. Example from export-2026-04-30-1143.sql: INSERT INTO wp_users VALUES('1','admin','$wp$2y$10$HfYmHvYrbzknRnjMXH33U.0QbPFYgTQWQfcbYBBMQVr8adbIG4pSK','admin','admin@localhost.com',...)

Evidence: [console](sessions/selective-export-cluster/Raw SQL excerpt from export-2026-04-30-1143.sql, lines 5-6; password hash matches bcrypt format $2y$10$...)

Notes: This is a high-severity data leak. Admins who select 'Users' to export site users for backup or migration will unintentionally include password hashes. Anyone who gains access to the export file could attempt offline password cracking. The source code (class-mb-export.php, lines 28-29) exports the entire wp_users and wp_usermeta tables without filtering.
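One way to meet the expectation above is to build each INSERT from an explicit column list that omits user_pass. The sketch below is illustrative Python, not the plugin's PHP; `build_insert` and `sql_literal` are hypothetical helpers, and the quote-doubling escape is a simplification that a real exporter would replace with the database driver's own escaping.

```python
def sql_literal(value):
    """Render a value as a SQL literal (simplified escaping, for illustration)."""
    if value is None:
        return "NULL"
    return "'" + str(value).replace("'", "''") + "'"


def build_insert(table, row, exclude=("user_pass",)):
    """Build an INSERT statement for row, leaving out the excluded columns."""
    columns = [name for name in row if name not in exclude]
    column_list = ", ".join(columns)
    values = ", ".join(sql_literal(row[name]) for name in columns)
    return f"INSERT INTO {table} ({column_list}) VALUES ({values});"
```

Because the column list is explicit, the statement stays valid on import even though one column is missing, provided the target column has a default or is filled in afterwards.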

5. [CRITICAL] Upload & Restore form lacks nonce field — CSRF vulnerability

  • Area: Upload & Restore form — CSRF protection
  • Persona affected: admin
  • Confidence: 0.95
  • Session: breadth-tour

Steps to reproduce:

    1. Navigate to wp-admin/admin.php?page=mb-backups (default tab)
    2. Inspect page source or DOM for hidden _wpnonce input
    3. Evaluate DOM: document.querySelector('input[name="_wpnonce"]')
    4. Result: null (no nonce found)

Expected: Form should contain a hidden _wpnonce input field to protect against CSRF attacks

Actual: No _wpnonce input field found in form; Upload & Restore form submission is not protected by WordPress nonce verification

Evidence: [console](sessions/breadth-tour/DOM evaluation)

Notes: Critical security issue per OWASP CSRF standards. WordPress nonce is standard protection and must be present on all admin forms. Also affects Create Full Backup action.
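The DOM probe in the steps above can also be run offline against saved page source. This is a minimal sketch using Python's standard HTML parser; `has_wpnonce` and `NonceFinder` are hypothetical names. It looks only for an input named _wpnonce, the default field name in WordPress's wp_nonce_field(). A form that uses a custom field name, or an AJAX action that receives its nonce from JavaScript, would not be detected, so a negative result is a prompt for source review rather than proof of a CSRF hole.

```python
from html.parser import HTMLParser


class NonceFinder(HTMLParser):
    """Records whether any <input> with name="_wpnonce" was seen."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag == "input" and dict(attrs).get("name") == "_wpnonce":
            self.found = True


def has_wpnonce(html):
    """Return True if the HTML contains an input named _wpnonce."""
    finder = NonceFinder()
    finder.feed(html)
    return finder.found
```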

6. [CRITICAL] Create Full Backup form/AJAX request lacks nonce field — CSRF vulnerability

  • Area: Create Full Backup button — CSRF protection
  • Persona affected: admin
  • Confidence: 0.95
  • Session: breadth-tour

Steps to reproduce:

    1. Navigate to wp-admin/admin.php?page=mb-backups
    2. Inspect page source near Create Full Backup button
    3. Search for _wpnonce field in page HTML
    4. Result: No _wpnonce field found

Expected: Create Full Backup button should be part of a form with _wpnonce field or AJAX request should include nonce parameter

Actual: No nonce protection found; AJAX request can be triggered from external sites via CSRF attack

Evidence: [console](sessions/breadth-tour/Page source inspection)

Notes: Same class of vulnerability as the Upload & Restore form. Both Create Full Backup and Upload & Restore lack CSRF protection. Related to recon S1 ('button fires with no result'): the silent failure may occur because the handler rejects requests that carry no nonce.

7. [CRITICAL] Recon false positive: 'Create Full Backup broken' — actually works correctly, recon misidentified storage mechanism

  • Area: plugin-wide diagnostic
  • Persona affected: admin
  • Confidence: 0.95
  • Session: create-backup-broken-flow

Steps to reproduce:

    1. Navigate to /wp-admin/admin.php?page=mb-backups&tab=backup
    2. Observe 'No backups found' in Existing Backups section
    3. Click 'Create Full Backup' button
    4. Wait for page to reload
    5. Observe new backup file in Existing Backups table

Expected: Backup is created and displayed in the Existing Backups list with file size and date

Actual: Backup IS created and IS displayed correctly. The feature works as designed.

Evidence:

Notes: This is a RECON CALIBRATION issue, not a plugin bug. The Create Full Backup feature functions perfectly. The misclassification in recon was due to checking the wrong storage location (options instead of filesystem). Recommend: update recon checklist to account for filesystem-based backup storage mechanisms, not just option-based. All dependent charters (backup-artifact-andlist, restore-destructive-andlist) can proceed with normal testing — the primary write path is NOT broken.

8. [CRITICAL] Restore operation overwrites live site without pre-restore backup snapshot

  • Area: Restore (b2 pre-operation snapshot missing)
  • Persona affected: admin
  • Confidence: 0.95
  • Session: restore-destructive-andlist

Steps to reproduce:

    1. Navigate to wp-admin/admin.php?page=mb-backups&tab=backup
    2. With at least one backup ZIP available, click 'Restore' button
    3. Confirm the browser dialog
    4. Monitor /wp-content/magellan-backups/ directory for new backup files during restore

Expected: Before overwriting the live site, the plugin should automatically create a snapshot backup of the current state, so users can recover if needed.

Actual: No automatic backup is created before restore. The live site is overwritten directly from the selected backup without any pre-operation snapshot.

Evidence: [console](sessions/restore-destructive-andlist/Monitored wp-content/magellan-backups/)

Notes: This is a critical data-loss vulnerability. If a user clicks Restore with the wrong backup selected, or if the restore operation partially fails, they have no automatic recovery path (no pre-snapshot).
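The expected behaviour, snapshot first and overwrite second, can be sketched as a wrapper around the restore step. This Python sketch covers files only and is not the plugin's code; `snapshot_then_restore`, its parameters, and the pre-restore file naming are all hypothetical. A real implementation would also need to dump the database, and the snapshot directory must sit outside the directory being archived.

```python
import os
import time
import zipfile


def snapshot_then_restore(site_dir, snapshot_dir, restore):
    """Archive site_dir into snapshot_dir, then call restore(); return the ZIP path.

    snapshot_dir must not be inside site_dir, or the archive would try to
    include itself. Files only: a database dump is out of scope here.
    """
    os.makedirs(snapshot_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    path = os.path.join(snapshot_dir, f"pre-restore-{stamp}.zip")
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(site_dir):
            for name in files:
                full = os.path.join(root, name)
                zf.write(full, os.path.relpath(full, site_dir))
    # The destructive step runs only after the snapshot is fully written.
    restore()
    return path
```

If writing the snapshot raises an exception, the restore callback is never invoked, which is the ordering the finding asks for.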

9. [CRITICAL] Restore does not revert mutations to site state

  • Area: Backup & Restore
  • Persona affected: admin
  • Confidence: 0.95
  • Session: supplement-restore-cron

Steps to reproduce:

  1. Create a full backup via UI (Create Full Backup button on Backups & Restore tab)
  2. Add a new post via WP-CLI: studio wp post create --post_title='post-after-backup' --post_status=publish
  3. Verify post exists: studio wp post list --s='post-after-backup'
  4. Restore from the backup via UI (click Restore button)
  5. Check if post still exists: studio wp post list --s='post-after-backup'

Expected: After restore, the post created after the backup should no longer exist; restore should have reverted site to backup state

Actual: Post with title 'post-after-backup-1777550266135' still appears in post list after restore completes, indicating site state was not reverted

Evidence: console

Notes: This is the critical G1 gap from prior charters. The restore operation completes without error but does not actually revert database state. This may indicate that restore operates only on the filesystem and skips the database.

10. [CRITICAL] Cron event not deregistered when schedule is disabled via UI

  • Area: Scheduled Backups
  • Persona affected: admin
  • Confidence: 0.95
  • Session: supplement-restore-cron

Steps to reproduce:

  1. Navigate to Schedule tab: /wp-admin/admin.php?page=mb-backups&tab=schedule
  2. Check 'Enable scheduled backups', select frequency (e.g., daily), click Save
  3. Verify cron registered: studio wp cron event list | grep mb_scheduled_backup (should show event)
  4. Uncheck 'Enable scheduled backups', click Save
  5. Verify cron event: studio wp cron event list | grep mb_scheduled_backup (should show NO event)

Expected: After unchecking the Enable checkbox and clicking Save, the mb_scheduled_backup cron event should be deregistered and not appear in cron list

Actual: The mb_scheduled_backup cron event remains in the WordPress cron table with recurrence '1 day' and next run scheduled for 2026-05-01, indicating the save action did not execute wp_unschedule_hook()

Evidence: [console](sessions/supplement-restore-cron/console-logs.txt Flow 5-6)

Notes: This is the critical G2 cron-orphan gap. Save does not trigger cron deregistration. This leaves a dangling cron event that will fire daily even though the feature is disabled in the UI.
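The grep-based verification in steps 3 and 5 could be turned into an assertion. The sketch below assumes the cron list is captured as JSON (for example via `wp cron event list --format=json`) and that each event object carries a "hook" field; both the output shape and the helper name `hook_is_scheduled` are assumptions to confirm against the installed WP-CLI version.

```python
import json


def hook_is_scheduled(cron_json, hook):
    """Return True if any event in the JSON cron list uses the given hook."""
    events = json.loads(cron_json)
    return any(event.get("hook") == hook for event in events)
```

With this helper, the probe would expect True after enabling the schedule and False after disabling it. The finding above corresponds to the second check still returning True.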

Needs human review (confidence < 0.7)

None.

Questions raised

  • [Backup & Restore tab] Why are there already two backup files in wp-content/magellan-backups/ (backup-2026-04-30-1140.zip and backup-2026-04-30-1141.zip) on a freshly provisioned site with no user-initiated backup action?
    • Why it matters: Understanding whether backups are created automatically during plugin activation or provisioning will help clarify if the Create Full Backup button is truly non-functional or if backups are being created silently
  • [Progress bar state] Should the progress bar reset to 0% after a successful backup completes, and show 0% on subsequent page loads?
    • Why it matters: Currently shows 100% on initial load which may indicate the progress bar is cosmetic/hardcoded rather than reflecting actual backup state
  • [empty-state UX] Is 'No backups found.' the intended empty-state message, or should it be more actionable (e.g., 'No backups yet. Click Create Full Backup to begin.')?
    • Why it matters: User discoverability — a more actionable message would better guide admins toward the primary call-to-action
  • [Restore safety practices] Should restore operations create an automatic pre-backup snapshot before overwriting the live site?
    • Why it matters: Best practice for destructive operations (database migrations, major updates) is to create an automatic 'before' snapshot so users can roll back if needed. The absence of this safety net is unusual for a backup plugin and creates a single-point-of-failure scenario where a user mistake wipes out the entire site.
  • [Restore upload error messaging] When a corrupted ZIP is uploaded to Upload & Restore, is the validation error visible to the user in the form?
    • Why it matters: Server-side code has error handling for ZipArchive::open() failures, but if this error is not visible in the UI, users won't know their upload failed. This creates confusion about whether a restore actually occurred.
  • [Schedule settings / cron management] Cron event lifecycle on disable: does disabling the schedule via the Enable checkbox remove the mb_scheduled_backup cron event, or is it left orphaned?
    • Why it matters: Orphaned cron events consume server resources and may run stale backups long after the admin disabled the schedule. This is a silent failure — the admin won't know a backup is still running until they check WP-CLI.
  • [Plugin lifecycle / cron cleanup] Plugin deactivation cleanup: are backup cron events removed when the plugin is deactivated?
    • Why it matters: Orphaned cron events after plugin deactivation can cause fatal errors on the next scheduled execution if the plugin code is no longer available. This is a common source of 'fatal error: call to undefined function' complaints after plugin uninstall.
  • [Schedule settings / UI feedback] Mode indicator visibility: when scheduled backups are enabled, is there a visible UI indicator (badge, banner, label, status text) on the Schedule tab or Backup & Restore tab?
    • Why it matters: Per tester-mindset-forms, mode toggles MUST have a visible indicator on all surfaces where the admin might act. Without an indicator, the admin could ship with scheduled backups disabled without realizing it, or ship in the wrong mode (e.g., test vs. production).
  • [Backup & Restore] Does the restore operation back up and restore the WordPress database? Is a database dump being created and restored as part of the full backup?
    • Why it matters: The restore probe (H1) shows that post-backup mutations are not reverted, suggesting restore may only operate on the filesystem (wp-content, wp-config) and not the database. This is a critical design gap if full backup claims to include the entire site.
  • [Plugin deletion and uninstall] Why does the WP-CLI plugin delete command fail with 'could not be deleted'? Is this a Studio file-locking issue, a WP-CLI interaction, or a plugin-level problem?
    • Why it matters: Successful plugin deletion is a basic WordPress operation. Failure to delete blocks access to the uninstall hook, preventing cleanup of backup directory and other plugin-specific resources.

Suggested improvements

  • [Schedule tab] Wrap form controls in label elements or add aria-label/aria-describedby attributes to all inputs and selects (effort: low) (impact: medium)
    • Rationale: Improved accessibility for screen reader users; compliance with WCAG 2.1 Level A standards
  • [Create Full Backup button] Display a progress indicator that starts at 0% and increments as the backup completes, replacing the hardcoded 100% display (effort: medium) (impact: high)
    • Rationale: Provides visual feedback to users that their backup action is in progress; manages expectations about completion state
  • [Existing Backups table] Add an explicit empty-state message (e.g., 'No backups yet. Click "Create Full Backup" to get started.') when the backups list is empty (effort: low) (impact: low)
    • Rationale: Improves user experience by guiding users on what to do when no backups exist
  • [user-feedback-timing] Show inline success feedback before page reload (effort: low) (impact: medium)
    • Rationale: Current flow: button click → disable button → AJAX request → page reload → new backup visible. A 3-5 second delay between click and reload leaves user without intermediate feedback. Adding a success notice before reload would clarify that the action succeeded immediately.
  • [empty-state copy] Make empty-state message more actionable (effort: low) (impact: low)
    • Rationale: First-time users seeing 'No backups found.' may not immediately understand that they should click the button above. An action-oriented message like 'No backups created yet. Click Create Full Backup above to start.' would improve discoverability and guide users toward the primary call-to-action.
  • [Restore user experience] Add a 'Preview Changes' or dry-run mode before restore (effort: medium) (impact: high)
    • Rationale: Before executing a restore, allow users to preview what will change (e.g., 'This restore will update X posts, Y options, Z files'). This gives users a chance to cancel if they selected the wrong backup.
  • [Restore safety] Implement automatic pre-restore snapshot (effort: medium) (impact: high)
    • Rationale: Before starting a restore operation, automatically create a backup of the current site state. This provides a rollback path if the user wants to undo the restore, and protects against accidental data loss.
  • [Restore error UX] Ensure ZIP validation errors are clearly displayed to the user (effort: low) (impact: medium)
    • Rationale: ZIP validation errors (corrupted file, invalid structure, etc.) should be clearly displayed via a visible error message (toast, alert, or error box), not just logged server-side.
  • [Restore user experience] Add progress feedback for restore operations (effort: medium) (impact: medium)
    • Rationale: Show a progress indicator or spinner during restore so users know the operation is in progress. Include steps: validating ZIP, restoring database, restoring files, etc.
  • [Schedule settings form UX] Add an inline success message or visual feedback after saving the schedule (effort: low) (impact: medium)
    • Rationale: Current form provides no visual confirmation that the save succeeded. A brief success notice (e.g., 'Schedule updated') would improve user confidence in a settings-heavy interface.
  • [Schedule settings form / data handling] Normalize time format: use consistent 12-hour or 24-hour format across save and display paths (effort: low) (impact: high)
    • Rationale: Save uses 12-hour format ('12:00 AM'), display uses 24-hour format ('00:00'). This mismatch causes the format-conversion bug documented in P1. Standardizing on one format throughout the pipeline would eliminate the root cause.
  • [Selective Export UX] Instead of showing a link, use JavaScript to trigger automatic download (window.location or fetch + blob download) so users get a download dialog without additional clicking (effort: low) (impact: medium)
    • Rationale: Users expect 'Export Selected' to deliver a complete action; showing a link requires additional interaction
  • [Selective Export security] When exporting 'Users', either (a) exclude user_pass column from INSERT statements, (b) redact password hashes, or (c) warn the user that hashes will be included and require confirmation (effort: low) (impact: high)
    • Rationale: Password hashes in exports pose a security risk if the export file is accessed by unauthorized users or stored insecurely
  • [Selective Export form validation] If user clicks 'Export Selected' with no checkboxes, show a user-visible error message (not just JavaScript alert or silent JSON error) (effort: low) (impact: low)
    • Rationale: Users expect immediate visual feedback from form submission; silent errors create confusion
  • [Scheduled Backups UI] Display a visual indicator or badge on the Backups & Restore tab when scheduled backups are enabled (e.g., 'Scheduled Backups: Enabled' or an icon/badge next to the tab title)
    • Rationale: Users navigating between tabs need clear feedback on whether scheduled backups are active. Currently no mode indicator is visible on the Backups tab, making it easy to lose context about the schedule state.
  • [Restore functionality] Add database restoration to the full backup/restore cycle. Currently, restore appears to only restore filesystem contents. For a true 'full backup', database changes should also be reverted.
    • Rationale: A full backup that does not restore database state is incomplete and creates a false sense of security. Users expect restore to revert all site changes, not just files.
  • [Cron management] Implement a proper cron deregistration handler triggered by the 'Enable scheduled backups' checkbox uncheck and save action. Ensure wp_unschedule_hook() is called when the schedule is disabled.
    • Rationale: Orphaned cron events consume server resources and fire unexpectedly. Disabling a schedule should cleanly deregister the cron event.
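The time-format suggestion in the list above (save path in 12-hour form, display path in 24-hour form) comes down to normalizing once at the boundary. The following is a minimal sketch in Python rather than the plugin's PHP; `to_24h` is a hypothetical helper, and the choice of 24-hour HH:MM as the canonical form is an assumption.

```python
from datetime import datetime


def to_24h(value):
    """Normalize '12:00 AM' or '00:00' style input to 24-hour 'HH:MM'."""
    value = value.strip()
    for fmt in ("%I:%M %p", "%H:%M"):
        try:
            return datetime.strptime(value, fmt).strftime("%H:%M")
        except ValueError:
            continue
    raise ValueError(f"unrecognized time: {value!r}")
```

Storing only the normalized value means the save and display paths read the same representation. Rejecting unparseable input with an error also speaks to the untested invalid-time hypothesis, though how the plugin should surface that error is left open here.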

What works well (praises)

  • [Access Control] Plugin correctly implements capability checks at the admin page level; Editor role receives proper 'not allowed to access this page' error and menu item is hidden from non-admin users
    • Why: Proper access control prevents unauthorized users from accessing sensitive backup functionality and matches WordPress security best practices
  • [Plugin Lifecycle] Plugin correctly creates wp-content/magellan-backups/ directory on activation and registers mb_scheduled_backup cron event for scheduled backups
    • Why: Proper initialization ensures backup infrastructure is in place and scheduled backups can run automatically as intended
  • [Selective Export Tab] Export tab loads without JS console errors; form structure is clean with checkboxes for Posts, Pages, Users, Options and proper Export Selected button
    • Why: Error-free page load and clear form structure provides good user experience for selective export functionality
  • [backup-creation-core-function] Backup creation works reliably and atomically
    • Why: The Create Full Backup button successfully creates a valid ZIP archive containing database SQL and wp-content files without errors or partial states. File permissions are set correctly (rw-rw-rw-), and the backup is immediately accessible for download/restore.
  • [security-nonce-handling] Nonce verification is properly implemented
    • Why: AJAX handler verifies nonce via check_ajax_referer() before processing backup, preventing CSRF attacks. Nonce is correctly localized to JavaScript via wp_localize_script().
  • [backup-list-ui] Clear table display of backup metadata
    • Why: Existing Backups table clearly shows file name, size, creation date, and action buttons (Download, Restore, Delete). Backups are sorted by date descending, making it easy to find recent backups.
  • [Restore security (b8)] ZIP path traversal vulnerability is properly blocked
    • Why: The restore handler validates that extracted files must be within wp-content/ directory. A test ZIP with ../../marker-zip-slip.txt entry correctly did NOT create the marker file outside the backup directory. This prevents arbitrary file overwrite attacks.
  • [Restore access control (b6)] Capability check is in place for restore operations
    • Why: The AJAX restore handler checks current_user_can('manage_options') before allowing restore. Editors and lower roles cannot access the backups admin page. This prevents unauthorized users from restoring arbitrary backups.
  • [Restore robustness (b4)] Basic ZipArchive error handling is present
    • Why: The restore handler checks if ZipArchive::open() succeeds before proceeding. If a ZIP file cannot be opened, an error is generated (though visibility in UI could be improved).
  • [Schedule settings form] Save mechanism is clear and functional: 'Save Schedule' button is visible and persists all form values correctly to the WordPress options table
    • Why: All three schedule settings (mb_schedule_enabled, mb_schedule_frequency, mb_schedule_time) persist correctly to the database despite the display-roundtrip bug, indicating a working save handler.
  • [Schedule settings / cron integration] Cron event registration works correctly: enabling the schedule properly registers a WordPress cron event with correct recurrence (1 day)
    • Why: The mb_scheduled_backup hook is registered with correct recurrence after enable+save. This is a required behavior for the feature to function, and it works as intended.
  • [Schedule settings / data structure] Option name parity confirmed: the save path and stored keys use consistent naming (mb_schedule_*) that is discoverable via wp option list
    • Why: Unlike some plugins that hide settings in serialized arrays, this plugin uses individual options (mb_schedule_enabled, mb_schedule_frequency, mb_schedule_time) that are easy to inspect and verify via CLI.
  • [Selective Export feature] Content type isolation works correctly
    • Why: Export files respect the selected checkboxes. Selecting only 'Users' creates an export with only wp_users and wp_usermeta tables, not other content. No cross-type leakage observed.
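The ZIP path traversal check praised in the list above (b8) follows a common pattern: resolve each entry against the destination and reject anything that lands outside it. The sketch below shows that pattern in Python for illustration; it is not the plugin's PHP handler, and `unsafe_entries` is a hypothetical name.

```python
import os
import zipfile


def unsafe_entries(zip_file, dest_dir):
    """Return the archive entry names that would resolve outside dest_dir."""
    dest = os.path.realpath(dest_dir)
    bad = []
    with zipfile.ZipFile(zip_file) as zf:
        for name in zf.namelist():
            target = os.path.realpath(os.path.join(dest, name))
            # An entry is safe only if dest is an ancestor of its resolved path.
            if os.path.commonpath([dest, target]) != dest:
                bad.append(name)
    return bad
```

A restore routine would refuse the archive, or skip the flagged entries, whenever this list is non-empty. The ../../marker-zip-slip.txt entry used in the probe is the kind of name it would flag.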

Coverage gaps

| Session | Status | Turns | Flows | Notes |
| --- | --- | --- | --- | --- |
| backup-artifact-andlist | complete | 10/12 | 9/11 | All seven a1–a7 artifact anchors probed and verdicted. a5 (default blast radius cron registration) confirmed Y. a1 (web-accessibility) confirmed critical finding. a2 (filename collision) tested: minute precision sufficient for current frequency. a3 (contents) split into leakage (password hashes confirmed present in SQL) and omissions (uploads/ confirmed missing). a4 (lifecycle) post-deactivation cleanup confirmed absent. a6 (completeness) confirmed misleading: missing uploads and wp-config. a7 (scale/memory) confirmed via source pattern: unbounded SELECT * on all tables without pagination. progress-indicator and empty-state flows deprioritized as low-value given CLI-completeness. Scale-sensitive c2 fallback rule applied: source-pattern Problem filed at critical/major severity despite small local artifact size. |
| breadth-tour | complete | 29/30 | 8/9 | Breadth tour covered all three main tabs (Backup & Restore, Selective Export, Schedule) with focus on UI structure, form accessibility, and access control. Found critical nonce vulnerability in forms. Confirmed directory creation and cron scheduling working. Progress bar hardcoding not confirmed as separate issue from empty-state rendering. Existing Backups showed a real backup (likely created during provisioning), so empty-state message could not be tested. |
| create-backup-broken-flow | complete | 8/8 | 4/4 | All five hypotheses (H1–H5) investigated via browser empirical probe (Create Full Backup button click) and source code inspection. Directory state verified via CLI before/after. Nonce verification confirmed working via successful AJAX response. Empty-state tested on initial page load (showed 'No backups found'). Progress indicator exists (100% hardcoded in HTML, per charter notes). User feedback exists post-click (page reloads, new backup appears in table). Root cause of recon 'broken write path' finding identified: recon was checking for mb_backups_backups option which does not exist; plugin stores backup metadata as ZIP files on filesystem, not in WordPress options. |
| schedule-feature-cluster | complete | 8/8 | 2/5 | Critical bug discovered in turn 8: time field format-conversion display bug (stored as '12:00 AM', displayed as '00:00'). Budget constraint prevented testing cron deregistration (S-H3), plugin deactivation cleanup (S-H4), and mode indicator visibility (S-H5). Save mechanism and cron registration both confirmed working. Option name parity confirmed via CLI inspection. |

Environment warnings

These are signals observed during the run that point at test-environment quirks (Studio + SQLite shim, WP-CLI Phar, WC stack interactions), NOT plugin defects. Apply extra scrutiny to findings in affected areas — some Problems may be false positives caused by the environment, and some real bugs may be masked.

| Session | Warning |
| --- | --- |
| supplement-restore-cron | Plugin deletion via WP-CLI failed during uninstall test (Flow 9) with warning 'The magellan-backups plugin could not be deleted'. May be related to Studio file-locking behavior or WP-CLI environment; would warrant retesting in production WP-CLI against real MySQL. |

Invalid / failed session reports

recon

  • No report.json produced

Token usage & cost

Computed from Claude Code transcripts at ~/.claude/projects/<proj-hash>/. Rates from config/pricing.json. Window: 2026-04-30T11:27:21Z to 2026-04-30T12:00:04Z (with ±10min buffer for dispatch drift).

Estimated total cost for this run: $14.24

| Category | Cost | % of total |
| --- | --- | --- |
| Fresh input | $0.04 | 0.3% |
| Output | $2.02 | 14.2% |
| Cache-create (5m) | $2.77 | 19.5% |
| Cache-create (1h) | $2.39 | 16.8% |
| Cache-read | $7.02 | 49.3% |
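The category figures can be cross-checked against the stated totals with a few lines of arithmetic. This sketch uses only numbers printed in this report (the category costs above, plus the Manager and Subagents totals given below); the variable names are illustrative.

```python
# Category costs in dollars, copied from the table above.
costs = {
    "Fresh input": 0.04,
    "Output": 2.02,
    "Cache-create (5m)": 2.77,
    "Cache-create (1h)": 2.39,
    "Cache-read": 7.02,
}

total = round(sum(costs.values()), 2)
shares = {name: round(100 * cost / total, 1) for name, cost in costs.items()}

# Manager and Subagents totals as reported in their own sections.
manager_plus_subagents = round(6.36 + 7.88, 2)

print(total, shares, manager_plus_subagents)
```

The category sum and the Manager plus Subagents sum both come to the stated $14.24, and the recomputed shares match the table. The rounded percentages add up to 100.1%, which is ordinary rounding drift.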

Manager (main conversation)

Total: $6.36

| Model | Messages | Input | Output | Cache-5m | Cache-1h | Cache-read | Cost |
| --- | --- | --- | --- | --- | --- | --- | --- |
| claude-sonnet-4-6 | 116 | 12,593 | 62,531 | 0 | 398,325 | 9,990,067 | $6.36 |

Subagents (10 invocations)

Total: $7.88

| Model | Messages | Input | Output | Cache-5m | Cache-1h | Cache-read | Cost |
| --- | --- | --- | --- | --- | --- | --- | --- |
| claude-sonnet-4-6 | 43 | 57 | 27,638 | 425,037 | 0 | 1,365,851 | $2.42 |
| claude-haiku-4-5-20251001 | 568 | 1,505 | 133,277 | 942,389 | 0 | 36,151,687 | $5.46 |

Per-subagent breakdown (10 sessions)

| Agent ID | Type | Models | Cost |
| --- | --- | --- | --- |
| a4a347ab89b59f531 | general-purpose | claude-sonnet-4-6 | $0.84 |
| a4b7e8b9ec83134e4 | tester | claude-haiku-4-5-20251001 | $0.81 |
| a504dfb6ce4179c76 | tester | claude-haiku-4-5-20251001 | $0.60 |
| a552802b81e6921b7 | tester | claude-haiku-4-5-20251001 | $0.81 |
| a6a82d54a79caf3a8 | tester | claude-haiku-4-5-20251001 | $0.65 |
| a6aa47bcc8b2c2205 | tester | claude-haiku-4-5-20251001 | $0.40 |
| a8fb0bbf269c4cf5b | tester | claude-haiku-4-5-20251001 | $0.77 |
| abf70a11d96e15a26 | tester | claude-haiku-4-5-20251001 | $0.46 |
| aee67d50d732ca3de | planner | claude-sonnet-4-6 | $1.58 |
| af0c978a3ca3618cd | tester | claude-haiku-4-5-20251001 | $0.96 |

Recommended next steps

  1. Triage Backup artifact storage and access control first — highest risk score (4)
  2. Address 11 critical problem(s) before release
  3. Follow up on 4 session(s) with incomplete coverage
  4. Investigate 1 session(s) that failed to produce valid reports