This workflow represents a high-throughput, agentic development environment where manual boilerplate is traded for architectural oversight. It balances high-context reasoning with cost-effective execution, utilizing a heterogeneous model stack, voice-driven orchestration, and real-time collaborative IDE integration.
The system uses "Model Arbitrage" to optimize for the "Mana" (weekly token limit) vs. Intelligence trade-off.
- Orchestrator (Kimi K2.5): Acts as the primary "thinker" and project manager. It takes high-level architectural specs (~800 words) and either decomposes them directly into granular, parallelizable execution plans, or first drafts a large single-source-of-truth feature specification that the user iterates on with it, and then decomposes that. It has the most total and active parameters of the three models and was specifically trained, with a novel reinforcement learning technique, to manage agent swarms efficiently. It is also the only one of the three with vision, which makes it well suited to reading high-level diagrams or screenshots of the user interface it needs to fix.
- Worker Bee (MiniMax M2.5): Runs 5-6 parallel sub-agents to handle specific components, boilerplate, and low-to-mid complexity implementation tasks. It is significantly more token-efficient than the orchestrator, about 3x faster in tokens per second (TPS), and possibly even better at following precise, specific instructions, although it does not do well without them.
- The Sysadmin (GLM 4.7): A dedicated, long-running session focused exclusively on environment maintenance, sysadmin work, and pipeline integrity (often running in 6-hour intensive shifts). It sits between MiniMax M2.5 and Kimi K2.5 in intelligence and independence, so it only needs to be checked on four or five times a day for status updates and redirection.
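The division of labor above can be written down as a small routing table. This is only a sketch: `TaskKind`, `ModelTier`, `STACK`, and `route` are hypothetical names, not part of any tool mentioned here, and the `max_parallel` values for the orchestrator and sysadmin are assumptions (only the worker's 5-6 sub-agents comes from the description).

```python
from dataclasses import dataclass
from enum import Enum


class TaskKind(Enum):
    PLANNING = "planning"              # specs, decomposition, integration
    IMPLEMENTATION = "implementation"  # components, boilerplate
    SYSADMIN = "sysadmin"              # environment and pipeline upkeep


@dataclass(frozen=True)
class ModelTier:
    name: str
    role: str
    has_vision: bool
    max_parallel: int  # sessions of this tier running at once (partly assumed)


# Hypothetical routing table mirroring the three roles described above.
STACK = {
    TaskKind.PLANNING: ModelTier("Kimi K2.5", "orchestrator", True, 1),
    TaskKind.IMPLEMENTATION: ModelTier("MiniMax M2.5", "worker", False, 6),
    TaskKind.SYSADMIN: ModelTier("GLM 4.7", "sysadmin", False, 1),
}


def route(kind: TaskKind, needs_vision: bool = False) -> ModelTier:
    """Pick the tier for a task; anything needing vision goes to the
    only vision-capable model, regardless of task kind."""
    if needs_vision:
        return next(tier for tier in STACK.values() if tier.has_vision)
    return STACK[kind]


if __name__ == "__main__":
    print(route(TaskKind.IMPLEMENTATION).name)
    print(route(TaskKind.IMPLEMENTATION, needs_vision=True).name)
```

The point of keeping the table explicit is that the expensive model is only spent where its extra capability (planning, vision) is actually needed.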
To eliminate the friction of typing long-form architectural prompts, the workflow integrates a specialized local voice-to-text pipeline:
- App: Epicenter Whispering.
- Trigger: `Cmd+Shift+;` (toggle start/stop).
- Processing Engine: NVIDIA Parakeet. A state-of-the-art ASR model that handles technical terminology and long-form dictation with minimal error.
- Result: Transcribed text is instantly pasted into the active text box (Zed IDE or web UI), allowing for "thought-speed" prompt engineering.
The development cycle avoids "hallucination drift" by enforcing a strict gatekeeping process before a single line of code is written:
- Voice-to-Spec: The architect voice-dictates the high-level goals, the architecture, and the system's state machine, data flows, and control flow to the orchestrator.
- Tactical Decomposition: Even with "auto-grant permissions" enabled, the orchestrator must produce a detailed execution plan first. This phase converts high-level strategy into specific tactical steps.
- Human Refinement: The architect and orchestrator iterate on the tactical plan (usually 1-2 rounds) before the final "go-ahead" is given.
- Parallel Execution: MiniMax sub-agents build independent components in parallel and related components in series according to the verified tactical plan.
- Harsh BDD Gatekeeping:
  - All sub-agents are required to write both BDD tests *and* implementation. These tests must be harsh, specific, outcome-based, and implementation-agnostic, verifying that each feature works exactly as intended across several scenarios, including its different aspects and edge cases. Tests must be based on ground truth and check the whole output; they must not be partial or tautological.
  - The architect spends most of their time reviewing the test code line by line and directing the orchestrator to improve it.
  - BDD tests usually reveal weak points, bugs, or poorly implemented, under-implemented, or unimplemented features in the sub-agents' implementation. More sub-agents are then launched to fix these, with the test output as guidance. They are not allowed to modify the tests unless the bug is in the tests themselves.
  - Once all the tests have been verified as sound and all pass, the architect can review the implementation code with more confidence, reserving manual inspection and tweaking for the core algorithms, data structures, and tricky bits. Skimming the rest and asking for more idiomatic, concise, or simpler code is normal as well.
- The orchestrator then integrates all of the components written by the worker bee sub-agents, smoothing over any cracks at the API integration points. It makes deep refactors itself, or spawns a sub-agent when the work is a lot of small changes, to make sure everything is consistent. Generating a comprehensive single-source-of-truth specification before the sub-agents split out, as I usually do for large features, and including context from it in the sub-agent spawn prompts usually helps with this as well.
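The "independent components in parallel, related components in series" rule from the Parallel Execution step amounts to walking a dependency graph. A minimal sketch, assuming the tactical plan can be expressed as a mapping from each component to the components it depends on; `run_plan`, `fake_sub_agent`, `PLAN`, and the component names are all hypothetical, and the real sub-agent launch is stubbed out with a short sleep:

```python
import asyncio
from graphlib import TopologicalSorter

# Hypothetical plan: component -> components that must be finished first.
PLAN = {
    "schema": set(),
    "api": {"schema"},
    "ui": {"schema"},
    "integration": {"api", "ui"},
}


async def fake_sub_agent(component: str) -> None:
    """Stand-in for launching a worker sub-agent on one component."""
    await asyncio.sleep(0.01)


async def run_plan(plan, run_task, max_parallel: int = 6) -> list[str]:
    """Run every component once its dependencies are done, with at most
    max_parallel components in flight. Returns the completion order."""
    sorter = TopologicalSorter(plan)
    sorter.prepare()  # raises graphlib.CycleError if the plan has a cycle
    limit = asyncio.Semaphore(max_parallel)
    finished: list[str] = []
    pending: set[asyncio.Task] = set()

    async def worker(name: str) -> str:
        async with limit:
            await run_task(name)
        return name

    while sorter.is_active():
        # Start everything whose dependencies are already satisfied.
        for name in sorter.get_ready():
            pending.add(asyncio.create_task(worker(name)))
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        for task in done:
            name = task.result()
            finished.append(name)
            sorter.done(name)  # unblocks components that depend on it
    return finished


if __name__ == "__main__":
    print(asyncio.run(run_plan(PLAN, fake_sub_agent)))
```

Here `api` and `ui` run concurrently once `schema` is finished, and `integration` waits for both, which matches the split between parallel and serial work in the verified plan.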
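To make "outcome-based, not partial or tautological" concrete, here is a small contrast between a weak test and a harsh one. It is an illustration only: `slugify` is a hypothetical feature invented for the example, and plain `assert` statements stand in for whatever BDD framework the project actually uses.

```python
import re


def slugify(title: str) -> str:
    """Hypothetical feature under test: turn a title into a URL slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def test_weak() -> None:
    """The kind of test to reject: it passes for almost any implementation."""
    title = "Hello, World!"
    assert slugify(title) == slugify(title)  # tautological
    assert "hello" in slugify(title)         # partial: ignores the rest


# Given a title, when it is slugified, then the WHOLE output must equal
# a hand-written ground-truth value, across normal and edge cases.
SCENARIOS = [
    ("Hello, World!", "hello-world"),
    ("  leading and trailing  ", "leading-and-trailing"),
    ("already-a-slug", "already-a-slug"),
    ("Multiple   spaces & symbols!!", "multiple-spaces-symbols"),
    ("", ""),
    ("---", ""),
]


def test_harsh() -> None:
    for title, expected in SCENARIOS:
        actual = slugify(title)
        assert actual == expected, f"{title!r}: got {actual!r}, want {expected!r}"


if __name__ == "__main__":
    test_weak()
    test_harsh()
    print("all scenarios pass")
```

The harsh version never looks at how `slugify` is implemented, so a sub-agent can rewrite the implementation freely while the expected outputs stay fixed.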
The choice of Zed as the primary interface is critical for "Google Docs-style" collaboration between human and agent, meaning that despite all this work being done by the agents, the architect never has to lose track of, and is not encouraged to give up control of, what’s going on:
- Live-Disk Edits: Agent suggestions are written directly to the file on disk even before the diff is "accepted." This allows the local LSP and test runners to provide real-time feedback, letting the agent self-correct its own suggestions before reporting back.
- Granular Diff Editing: The architect can selectively accept or reject specific chunks of a diff, and even manually edit the agent's proposed diff directly in the buffer before committing it.
- Follow Mode: Much like observing a dwarf worker in Dwarf Fortress, the architect can use "Follow Mode" to track exactly what an agent is looking at and editing in real-time across the codebase.
- CRDT-Based Editing: The user and any agents can edit the same file simultaneously without clobbering each other because Zed bases its edit tool and its internal representation of files and diffs on CRDTs. This also means that even when multiple agents have to touch the same file, it is unlikely to cause a problem in the vast majority of cases, and any conflict should be easy to resolve.
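The intuition behind CRDT-based editing can be shown with a toy example. This is not Zed's implementation, just a minimal insert-only sketch of one CRDT style: every character gets a globally unique, ordered position key, so merging two replicas is a plain set union and both sides end up with the same text. `Char` and `Replica` are hypothetical names; deletions (which need tombstones) and inserting between two characters that share a position key are deliberately left out.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True, order=True)
class Char:
    pos: Fraction  # dense position key, strictly between its neighbours
    site: str      # which replica inserted it; breaks ties between equal keys
    value: str


class Replica:
    """One editor's copy of the document (the human's or an agent's)."""

    def __init__(self, site: str) -> None:
        self.site = site
        self.chars: set[Char] = set()

    def text(self) -> str:
        return "".join(c.value for c in sorted(self.chars))

    def insert(self, index: int, value: str) -> None:
        """Insert one character at a visible index by picking a key
        halfway between the neighbouring characters' keys."""
        ordered = sorted(self.chars)
        lo = ordered[index - 1].pos if index > 0 else Fraction(0)
        hi = ordered[index].pos if index < len(ordered) else Fraction(1)
        self.chars.add(Char((lo + hi) / 2, self.site, value))

    def merge(self, other: "Replica") -> None:
        """Set union: commutative, associative, and idempotent, so the
        order in which edits arrive does not matter."""
        self.chars |= other.chars


if __name__ == "__main__":
    human, agent = Replica("human"), Replica("agent")
    human.insert(0, "a")
    human.insert(1, "b")
    agent.merge(human)     # both replicas now hold "ab"

    human.insert(1, "X")   # concurrent edit: human sees "aXb"
    agent.insert(2, "Y")   # concurrent edit: agent sees "abY"

    human.merge(agent)
    agent.merge(human)
    print(human.text(), agent.text())  # both print "aXbY"
```

Because neither edit refers to a line or column number that the other edit could shift, there is nothing to conflict on; that is the property that lets a human and several agents work in the same buffer.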