Alex López alopezari

@alopezari
alopezari / magellan-whats-new-april28.md
Last active May 6, 2026 16:17
Magellan — What's new since April 28th (internal announcement)

Magellan — What's new (April 28 → May 6)

Magellan simulates a team of experienced testers working autonomously on your WordPress plugin or theme. It was designed by seasoned QA engineers to translate real exploratory testing techniques — SBTM, PQIP, heuristics like SFDPOT and FEW HICCUPPS — into a team of AI agents that run without human supervision. You point it at a plugin, trigger the run, and wait for the report.

Unlike test automation, Magellan is non-deterministic: agents put themselves in the shoes of real users and explore the system the way a human tester would — following their curiosity, chasing anomalies, adapting to what they find. No test scripts to write, no selectors to maintain. It's not another automation project. It's like having a team of experienced testers working for you.


Real test runs

alopezari / final-report.md
Last active May 6, 2026 14:58
Desktop Mode v0.7.1 — Full test report with verification (Magellan)

Testing Report — WP desktop-mode - Grey box testing

Run ID: 2026-05-06T13-23-40_desktop-mode
Generated: 2026-05-06T14:28:49.519Z
Plugin version: 0.7.1
Sessions processed: 12
Sessions with errors: 2


alopezari / console-lifecycle-cluster.txt
Created May 6, 2026 14:50
Desktop Mode v0.7.1 — Console logs: lifecycle & scale sessions (Magellan)
Evidence:
- Flow 1: Verified test data present before deactivation (user meta and options)
- Flow 2: Confirmed desktop_mode_presence_daily_prune cron was present before deactivation
- Flow 3: After deactivation, cron event desktop_mode_presence_daily_prune remained scheduled (H8 confirmed)
- Flow 4: After plugin deletion, orphaned user meta (desktop_mode_mode, desktop_mode_os_settings) and options (_desktop_mode_presence) persisted (H7 confirmed)
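The four flows above amount to comparing the resources the plugin owns against what is still present after each lifecycle step. A minimal sketch of that comparison follows, assuming a hypothetical `SiteState` snapshot type and `leftovers` helper (neither is part of Magellan or the plugin); the resource names are the ones cited in the evidence, and `wp_version_check` stands in for an unrelated core cron event.

```python
from dataclasses import dataclass, field


@dataclass
class SiteState:
    """Hypothetical snapshot of the site state a lifecycle probe inspects."""
    cron_events: set[str] = field(default_factory=set)
    user_meta_keys: set[str] = field(default_factory=set)
    options: set[str] = field(default_factory=set)


def leftovers(owned: SiteState, after: SiteState) -> dict[str, list[str]]:
    """Return the plugin-owned resources that are still present in `after`."""
    return {
        "cron_events": sorted(owned.cron_events & after.cron_events),
        "user_meta_keys": sorted(owned.user_meta_keys & after.user_meta_keys),
        "options": sorted(owned.options & after.options),
    }


# Resources the evidence attributes to the plugin.
owned = SiteState(
    cron_events={"desktop_mode_presence_daily_prune"},
    user_meta_keys={"desktop_mode_mode", "desktop_mode_os_settings"},
    options={"_desktop_mode_presence"},
)

# Flow 3: the cron event is still scheduled after deactivation.
after_deactivation = SiteState(
    cron_events={"wp_version_check", "desktop_mode_presence_daily_prune"},
)

# Flow 4: user meta and the presence option persist after deletion.
# Cron state after deletion is not listed in the evidence, so it is left out.
after_deletion = SiteState(
    user_meta_keys={"desktop_mode_mode", "desktop_mode_os_settings"},
    options={"_desktop_mode_presence"},
)

orphaned_cron = leftovers(owned, after_deactivation)["cron_events"]
orphaned_data = leftovers(owned, after_deletion)
```

Any non-empty list in the result is a candidate finding: `orphaned_cron` corresponds to H8 and the meta and option entries in `orphaned_data` correspond to H7.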
alopezari / coverage.md
Created May 6, 2026 14:50
Desktop Mode v0.7.1 — Feature coverage matrix (Magellan)

Feature coverage matrix — 2026-05-06T13-23-40_desktop-mode

Coverage mode: full-surface

No FOCUS_OVERRIDE set; MISSION.md declares full-surface coverage with every major subsystem getting ≥1 probe.

| Feature | Surface | Traits | Depth | Covered by |
| --- | --- | --- | --- | --- |
| F1 Admin-bar toggle | Admin bar node desktop-mode-toggle (all admin pages) | AJAX-exposed, settings-form, output-rendering, user-facing-text, mode-toggle | depth | portal-session-cluster, usability-first-use, breadth-tour-admin |
| F2 AJAX save-desktop-mode | wp-admin/admin-ajax.php?action=save-desktop-mode (POST) | AJAX-exposed, DB-writing, settings-form | depth | portal-session-cluster, breadth-tour-admin |
alopezari / coverage-gaps.md
Created May 6, 2026 14:50
Desktop Mode v0.7.1 — Coverage gaps & meta-review (Magellan)

Coverage gaps — desktop-mode 2026-05-06T13-23-40_desktop-mode

Summary

  • 0 hypotheses silently skipped (automated check passed)
  • 0 surfaces from recon/static-analysis unaddressed
  • 0 AND-list items scored on aggregate when per-path was needed
  • 0 round-trip probes missing on critical pairs
  • 0 Questions filed only from source inspection (no empirical attempt)
  • 1 forcing-function string missing (MEDIUM — probe ran, string absent)
alopezari / cross-charter-intel.md
Created May 6, 2026 14:50
Desktop Mode v0.7.1 — Cross-charter bug pattern digest (Magellan)

Cross-charter intelligence — desktop-mode 2026-05-06T13-23-40_desktop-mode

Bug pattern digest

  • Lifecycle cleanup absent across all resource types: No uninstall.php and no register_deactivation_hook for cron/data cleanup. Plugin creates user meta (desktop_mode_mode, desktop_mode_os_settings), a presence option (_desktop_mode_presence), and a daily cron event (desktop_mode_presence_daily_prune) on activation — none are cleaned up on deactivation or deletion. Confirmed in: lifecycle-cluster (H7, H8). Confidence: high.

  • Recycle Bin REST layer has compounding defects — not just one bug: (1) List endpoint returns empty items despite items in DB (filtering bug — root cause unknown but likely query/capability guard issue); (2) per_page has no upper-bound cap, accepting arbitrary values including 99999 (OOM vector); (3) Empty operation hardcoded at 200 items per batch with no pagination loop (silent truncation). Each defect is independent; fixing one doesn't fix the others. Confirme
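Defects (2) and (3) in the Recycle Bin item can be illustrated with a small model. This is a sketch only, not the plugin's code: `clamp_per_page`, `empty_once`, `empty_all`, and the cap of 100 are assumptions made for illustration, while the batch size of 200 and the value 99999 come from the digest.

```python
MAX_PER_PAGE = 100  # assumed cap for illustration; the digest reports no cap at all
BATCH_SIZE = 200    # batch size cited in the digest


def clamp_per_page(requested: int, cap: int = MAX_PER_PAGE) -> int:
    """Bound a client-supplied page size to the range [1, cap]."""
    return max(1, min(requested, cap))


def empty_once(items: list[int], batch_size: int = BATCH_SIZE) -> int:
    """Delete a single batch, as the digest describes; return the count deleted."""
    batch = items[:batch_size]
    del items[:batch_size]
    return len(batch)


def empty_all(items: list[int], batch_size: int = BATCH_SIZE) -> int:
    """Repeat batches until nothing is left; return the total deleted.

    Assumes batch_size is positive, otherwise the loop would never finish.
    """
    total = 0
    while items:
        total += empty_once(items, batch_size)
    return total


# A single batch on a 450-item bin leaves 250 items behind without any error.
bin_a = list(range(450))
deleted_single = empty_once(bin_a)
remaining_single = len(bin_a)

# Looping over batches removes everything.
bin_b = list(range(450))
deleted_looped = empty_all(bin_b)

# An oversized request is reduced to the cap instead of being passed through.
capped = clamp_per_page(99999)
```

The two fixes are independent in the model as well: clamping the page size does nothing for the truncated empty operation, and the batch loop does nothing for the unbounded list query.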

alopezari / pilot21-magellan-backups.md
Last active May 4, 2026 10:55
Magellan Pilot 21 — magellan-backups v1.0.0 — 10/10 recall, $8.25, Sonnet+Haiku+Actionbook

Magellan Pilot 21 — magellan-backups v1.0.0

Run ID: 2026-05-04T10-04-00_magellan-backups
Date: 2026-05-04
Result: 🏆 10 / 10 recall — first perfect recall on this plugin


Configuration

alopezari / gist:1fd2ce6fd7168ee3daf56cb4a8647751
Last active April 30, 2026 16:46
Magellan Pilot — magellan-backups v1.0.0 (Sonnet manager+planner, Haiku testers, playwright-cli-headless) 2026-04-30

Magellan Pilot Report — magellan-backups v1.0.0

Run ID: 2026-04-30T16-07-43_magellan-backups
Date: 2026-04-30
Duration: 31m 26s (16:07:43Z → 16:39:09Z)
Model stack: Manager Sonnet 4.6 · Planner Sonnet 4.6 · Testers Haiku 4.5 · Meta-reviewer Sonnet 4.6
Browser driver: playwright-cli-headless
Plugin: Magellan Backups v1.0.0 (local, blind greybox — ISSUES.md stripped)
Phase 1.5 static analysis: skipped (no source_path)

alopezari / coverage-gaps.md
Created April 30, 2026 12:57
Pilot 19 — magellan-backups 2026-04-30 — Sonnet manager+planner, Haiku testers, playwright-cli-headless

Coverage Gaps — 2026-04-30T11-27-21_magellan-backups

Generated: 2026-04-30 (meta-review, pre-aggregation)
Run: 2026-04-30T11-27-21_magellan-backups
Charters reviewed: 6 (create-backup-broken-flow, backup-artifact-andlist, restore-destructive-andlist, schedule-feature-cluster, selective-export-cluster, breadth-tour)


Check 1: Hypothesis coverage

alopezari / magellan-token-optimizations.md
Created April 30, 2026 11:07
Magellan — Token Efficiency Optimizations

Magellan — Token Efficiency Optimizations

Summary of all token-cost optimizations shipped, with the evidence that motivated each and the measured or projected impact. Intended for task reporting (RSM / Linear).


Baseline

Pilot 11 (first instrumented run): ~$102.90 total. Manager ran on Opus for the entire session, including mechanical phases (file IO, jq calls, idle wave-wait). Token capture was manual and inaccurate — Tester model was assumed to be Sonnet but transcript analysis later showed ~50% of Tester calls went to Opus.