
Multi-Agent Code Review for Claude Code

A setup that fans out code review to multiple AI coding agents in parallel (via counselors), then synthesizes and acts on the results — all from within Claude Code.

What's in here

| File | Type | What it does |
| ---- | ---- | ------------ |
| counselors-skill.md | Skill (~/.claude/skills/counselors/SKILL.md) | General-purpose multi-agent dispatch — send any prompt to multiple agents and get a synthesized response |
| deep-review-command.md | Command (~/.claude/commands/deep-review.md) | Opinionated code review workflow that uses counselors to review your current branch, then auto-fixes issues |

How it works

/counselors (the skill)

A general-purpose tool for getting multiple AI perspectives on any question:

  1. Gathers context — finds relevant files, git diffs, and related code
  2. Selects agents — defaults to claude-opus, gemini-3-pro-preview, codex-5.3-high, amp-smart (customizable)
  3. Assembles a structured prompt — writes it to ./agents/counselors/[timestamp]-[slug]/prompt.md
  4. Dispatches in parallel — sends to all agents simultaneously via counselors run
  5. Synthesizes — reads all responses and presents consensus, disagreements, risks, and recommendations
  6. Offers action — asks what you'd like to address from the findings

Usage:

/counselors review the auth flow for security issues
/counselors is this database migration safe to run?
/counselors should we use SSR or CSR for this page?

/deep-review (the command)

A full code review pipeline built on top of counselors:

  1. Diffs your branch against main and reads all changed files
  2. Builds a review prompt with an 8-category checklist (correctness, error handling, security, performance, architecture, state/data integrity, code quality, project guidelines)
  3. Dispatches to the smart counselors group — no agent selection step, just runs
  4. Synthesizes findings into Critical / High / Medium / Low tables with file:line references
  5. Auto-fixes Critical and High issues, most Medium issues, skips Low
  6. Walks through remaining items one-by-one for your decision (fix / defer / skip)
  7. Self-learns — appends patterns to a learnings file so recurring issues get flagged earlier in future reviews

The severity escalation rules:

  • If 2+ agents flag the same issue → bump severity up one level
  • Security findings from any agent → always Critical
  • Consensus findings → listed first within each severity

Prerequisites

# Install counselors CLI
npm install -g counselors

# Initialize and add your agents
counselors init
counselors add  # follow prompts to add claude, codex, gemini, etc.

# Create a "smart" group for deep-review (or use your default agents)
counselors group create smart
counselors group add smart claude-opus gemini-3-pro-preview codex-5.3-high amp-smart
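
To sanity-check the setup, the same CLI commands the skill leans on later can be run directly:

# confirm agents are registered and the CLI is healthy
counselors ls       # list configured agents
counselors doctor   # diagnose configuration problems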

Installation

The counselors skill

mkdir -p ~/.claude/skills/counselors
cp counselors-skill.md ~/.claude/skills/counselors/SKILL.md

The deep-review command

mkdir -p ~/.claude/commands
cp deep-review-command.md ~/.claude/commands/deep-review.md

Then use them from any project:

/counselors <your question>
/deep-review
/deep-review focus on the API layer

Output

Both tools write their prompts and agent responses to ./agents/counselors/ in your project directory. Each run gets a timestamped directory:

./agents/counselors/
  1770676882-auth-flow-review/
    prompt.md           # the prompt sent to all agents
    claude-opus.md      # claude's response
    gemini-3-pro.md     # gemini's response
    codex-5.3-high.md   # codex's response
    amp-smart.md        # amp's response

Add agents/ to your .gitignore — these are working files, not meant for version control.
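
For example, from the project root:

# keep counselors working files out of version control
echo "agents/" >> .gitignore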

Self-Learning (deep-review only)

After each review cycle, /deep-review appends to ~/.claude/deep-review-learnings.md:

  • Which agents responded/failed
  • Finding counts by severity
  • Consensus vs single-agent findings
  • False positives (items you skipped as non-issues)
  • Recurring patterns across reviews

When a pattern appears in 3+ reviews, it gets promoted into the review prompt template so future reviews check for it explicitly.

Customization

  • Change default agents: Edit the Phase 2: Agent Selection section in the skill
  • Add review categories: Edit the Phase 2: Build the Review Prompt section in deep-review
  • Change severity rules: Edit the Phase 4: Read & Synthesize Results section in deep-review
  • Skip auto-fix: Remove or modify Phase 5 in deep-review if you prefer review-only

counselors-skill.md

---
name: counselors
description: Fan out a prompt to multiple AI coding agents in parallel and synthesize their responses.
---

Counselors — Multi-Agent Review Skill

Fan out a prompt to multiple AI coding agents in parallel and synthesize their responses.

Arguments: $ARGUMENTS

If no arguments are provided, ask the user what they want reviewed.


Phase 1: Context Gathering

Parse $ARGUMENTS to understand what the user wants reviewed. Then auto-gather relevant context:

  1. Files mentioned in the prompt: Use Glob/Grep to find files referenced by name, class, function, or keyword
  2. Recent changes: Run git diff HEAD and git diff --staged to capture recent work
  3. Related code: Search for key terms from the prompt and read the most relevant files (up to 5 files, ~50KB total cap)

Be selective — don't dump the entire codebase. Pick the most relevant code sections.
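
For reference, the diff-gathering part of this phase comes down to two git commands (the file discovery itself goes through Glob/Grep rather than the shell):

git diff HEAD        # unstaged changes
git diff --staged    # staged changes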


Phase 2: Agent Selection

Default agents: claude-opus, gemini-3-pro-preview, codex-5.3-high, amp-smart

  1. Use defaults unless the user overrides. If $ARGUMENTS does not contain agent-selection instructions (e.g. "use all agents", "add codex", "only gemini"), skip directly to the confirmation step with the defaults.

  2. If the user requests different agents (in $ARGUMENTS or via follow-up), discover available agents by running via Bash:

    counselors ls

    Print the full output, then ask the user to pick using AskUserQuestion.

    If 4 or fewer agents: Use AskUserQuestion with multiSelect: true, one option per agent.

    If more than 4 agents: AskUserQuestion only supports 4 options. Use these fixed options:

    • Option 1: "All [N] agents" — sends to every configured agent
    • Options 2-4: the first three individual agents by ID
    • The user can always select "Other" to type a comma-separated list of agent IDs from the printed list above

    Do NOT combine agents into preset groups (e.g. "claude + codex + gemini"). Each option must be a single agent or "All".

  3. MANDATORY: Confirm the selection before continuing. Echo back the exact list you will dispatch to:

    Dispatching to: claude-opus, gemini-3-pro-preview, amp-smart

    Then ask the user to confirm (e.g. "Look good?") before proceeding to Phase 3. This prevents silent tool omissions. If the user corrects the list, update your selection accordingly.


Phase 3: Prompt Assembly

  1. Generate a slug from the topic (lowercase, hyphens, max 40 chars)

    • "review the auth flow" → auth-flow-review
    • "is this migration safe" → migration-safety-review
  2. Create the output directory via Bash. The directory name MUST always be prefixed with a second-precision UNIX timestamp so runs are lexically sortable and never collide:

    ./agents/counselors/TIMESTAMP-[slug]
    

    For example: ./agents/counselors/1770676882-auth-flow-review

    Mac tip: generate the timestamp with date +%s (seconds since epoch). Millisecond precision is not available from date on macOS without GNU coreutils, so stick with portable second-precision timestamps. A minimal bash sketch of this step appears after the prompt template below.

  3. Write the prompt file using the Write tool to ./agents/counselors/TIMESTAMP-[slug]/prompt.md:

# Review Request

## Question
[User's original prompt/question from $ARGUMENTS]

## Context

### Files Referenced
[Contents of the most relevant files found in Phase 1]

### Recent Changes
[git diff output, if any]

### Related Code
[Related files discovered via search]

## Instructions
You are providing an independent review. Be critical and thorough.
- Analyze the question in the context provided
- Identify risks, tradeoffs, and blind spots
- Suggest alternatives if you see better approaches
- Be direct and opinionated — don't hedge
- Structure your response with clear headings
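
A minimal bash sketch of steps 1 and 2 above (the slug is only an example; prompt.md is then written into the directory with the Write tool):

# create the timestamped output directory for this run
slug="auth-flow-review"
dir="./agents/counselors/$(date +%s)-${slug}"
mkdir -p "$dir"
echo "$dir"   # e.g. ./agents/counselors/1770676882-auth-flow-review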

Phase 4: Dispatch

Run counselors via Bash with the prompt file, passing the user's selected agents:

counselors run -f ./agents/counselors/[TIMESTAMP]-[slug]/prompt.md --tools [comma-separated-selections] --json

Example: --tools claude,codex,gemini

Use timeout: 600000 (10 minutes). Counselors dispatches to the selected agents in parallel and writes results to the output directory shown in the JSON output.

Important: Use -f (file mode) so the prompt is sent as-is without wrapping. Use --json to get structured output for parsing.
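
Putting it together, a dispatch for the selection confirmed in Phase 2 might look like this (the directory and tool names are illustrative):

counselors run \
  -f ./agents/counselors/1770676882-auth-flow-review/prompt.md \
  --tools claude,codex,gemini \
  --json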


Phase 5: Read Results

  1. Parse the JSON output from stdout — it contains the run manifest with status, duration, word count, and output file paths for each agent
  2. Read each agent's response from the outputFile path in the manifest
  3. Check stderrFile paths for any agent that failed or returned empty output
  4. Skip empty or error-only reports — note which agents failed
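
If you want to pull those paths out from the shell instead, something like the following works, reusing $dir from the Phase 3 sketch; the manifest shape here is an assumption based only on the fields described above (status, outputFile, stderrFile), not documented counselors output:

# ASSUMED manifest shape: an array of per-agent objects; adjust the jq paths
# to whatever `counselors run --json` actually prints
counselors run -f "$dir/prompt.md" --tools claude,codex,gemini --json > run.json
jq -r '.[] | "\(.status)\t\(.outputFile)"' run.json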

Phase 6: Synthesize and Present

Combine all agent responses into a synthesis:

## Counselors Review

**Agents consulted:** [list of agents that responded]

**Consensus:** [What most agents agree on — key takeaways]

**Disagreements:** [Where they differ, and reasoning behind each position]

**Key Risks:** [Risks or concerns flagged by any agent]

**Blind Spots:** [Things none of the agents addressed that seem important]

**Recommendation:** [Your synthesized recommendation based on all inputs]

---
Reports saved to: [output directory from manifest]

Present this synthesis to the user. Be concise — the individual reports are saved for deep reading.


Phase 7: Action (Optional)

After presenting the synthesis, ask the user what they'd like to address. Offer the top 2-3 actionable items from the synthesis as options. If the user wants to act on findings, plan the implementation before making changes.


Error Handling

  • counselors not installed: Tell the user to install it (npm install -g counselors)
  • No tools configured: Tell the user to run counselors init or counselors add
  • Agent fails: Note it in the synthesis and continue with other agents' results
  • All agents fail: Report errors from stderr files and suggest checking counselors doctor

deep-review-command.md

---
description: Multi-agent code review with automatic fixes. Counselors smart group reviews branch changes, then fixes are applied.
alwaysApply: false
---

Deep Review

Multi-agent review of current branch changes using counselors, followed by automatic fixes.

Arguments: $ARGUMENTS

Phase 1: Gather Context

  1. Run git diff --name-status main...HEAD to identify changed files. If empty, try git diff --name-status HEAD~5...HEAD.
  2. Read all changed/added files (up to ~50KB total).
  3. Run git diff main...HEAD (or HEAD~5...HEAD) for the full diff.
  4. Check for project-level guidelines:
    • .claude/rules/*.md in the current project
    • CLAUDE.md in the project root
    • For monorepos, also check parent .claude/rules/
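
In shell terms, the diff gathering in steps 1 and 3 is:

git diff --name-status main...HEAD    # which files changed on this branch
git diff main...HEAD                  # full diff for the review prompt
# fallback when the branch has no changes against main
git diff --name-status HEAD~5...HEAD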

Phase 2: Build the Review Prompt

Generate a slug from $ARGUMENTS or default to deep-review (lowercase, hyphens, max 40 chars).

Create directory: ./agents/counselors/$(date +%s)-[slug]/

Write prompt.md to that directory with this structure:

# Code Review

## Changes

[Full git diff output]

## Files

[Contents of all changed files]

## Project Guidelines

[Any .claude/rules/ or CLAUDE.md content found]

## Review Instructions

You are performing a deep code review. Be critical, direct, and thorough.

For each changed file, analyze:

### 1. Correctness & Logic

- Business logic errors, off-by-one, wrong conditions
- Missing edge cases, null/undefined handling
- Race conditions, async issues, floating promises

### 2. Error Handling

- Empty catch blocks or swallowed errors
- Lost exception chains (re-throw without cause)
- Overly broad catches at non-boundary locations
- Missing error handling on async operations

### 3. Security

- SQL injection, XSS, CSRF, auth bypass
- Secrets exposure, insecure defaults
- Input validation gaps at system boundaries

### 4. Performance

- N+1 queries, missing indexes, unnecessary re-renders
- Inefficient algorithms, unbounded operations
- Missing pagination or limits on data fetching

### 5. Architecture & Boundaries

- Layer violations (UI accessing DB directly, domain depending on infrastructure)
- Business logic in controllers/handlers
- ORM/DB queries outside data layer
- Scattered environment variable access

### 6. State & Data Integrity

- Impossible states (bags of optionals that should be discriminated unions)
- Boolean explosion (multiple booleans where an enum fits)
- Duplicated or derived state stored instead of computed
- Magic strings where enums/constants belong

### 7. Code Quality

- Vague naming (data, info, handler, manager without qualifier)
- Dead code, unreachable branches, commented-out code
- Pass-through wrappers adding no value
- Over-engineering or premature abstraction

### 8. Compliance with Project Guidelines

- Check against any project guidelines provided above
- Flag deviations from established patterns

## Output Format

Structure your response as:

### Critical (must fix before merge)

| Finding | File:Line | Fix |
| ------- | --------- | --- |

### High (fix soon)

| Finding | File:Line | Fix |
| ------- | --------- | --- |

### Medium (improve)

| Finding | File:Line | Fix |
| ------- | --------- | --- |

### Low (suggestions)

| Finding | File:Line | Fix |
| ------- | --------- | --- |

Be specific with file:line references and concrete fix descriptions.
Do NOT flag things that are fine. Only report actual issues.

Phase 3: Dispatch to Counselors

Run counselors with the smart group (no user confirmation needed):

counselors run -f ./agents/counselors/[dir]/prompt.md --group smart --json

Use timeout: 600000 (10 minutes).

Phase 4: Read & Synthesize Results

  1. Parse JSON output from stdout for the run manifest.
  2. Read each agent's response from the outputFile paths.
  3. Check stderrFile for any failures.
  4. Synthesize into a unified report:
## Deep Review Results

**Agents:** [list of agents that responded]

### Critical (consensus or any-agent-flagged)

| Finding | File:Line | Fix | Flagged By |
| ------- | --------- | --- | ---------- |

### High

| Finding | File:Line | Fix | Flagged By |
| ------- | --------- | --- | ---------- |

### Medium

| Finding | File:Line | Fix | Flagged By |
| ------- | --------- | --- | ---------- |

### Low

| Finding | File:Line | Fix | Flagged By |
| ------- | --------- | --- | ---------- |

### Disagreements

[Where agents differed, with reasoning]

### Blind Spots

[Anything none of the agents caught that seems important]

Prioritization rules:

  • If 2+ agents flag the same issue: bump severity up one level
  • Security findings from any agent: always Critical
  • Consensus findings: higher confidence, list first within severity

Phase 5: Apply Fixes

After presenting the synthesis, proceed to fix issues automatically:

  1. Critical and High: Fix all of these. Read each affected file, apply the fix.
  2. Medium: Fix these too, unless the fix requires an architectural decision (flag those for the user).
  3. Low: Skip unless trivially fixable alongside other changes in the same file.

For each fix:

  • Read the file first
  • Make the minimal change needed
  • Do not refactor surrounding code or add unrelated improvements

After all fixes are applied, run the project's test suite if one exists (npm test, composer test, pytest, etc.). Report any failures.
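
A rough sketch of that test step; the detection heuristics below are assumptions, and the real run should use whatever test command the project documents:

# illustrative only: prefer the project's documented test command
if [ -f package.json ]; then
  npm test
elif [ -f composer.json ]; then
  composer test
elif [ -f pyproject.toml ] || [ -f pytest.ini ]; then
  pytest
else
  echo "no test suite detected"
fi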

Phase 6: Summary

Present a final summary that accounts for ALL findings from the synthesis — every single item must appear in exactly one of the three tables below:

## Deep Review Summary

### Fixed

| Finding       | File:Line | What Changed                           |
| ------------- | --------- | -------------------------------------- |
| [description] | file:line | [brief description of the fix applied] |

### Not Fixed (needs your decision)

| Finding       | File:Line | Why                                                               |
| ------------- | --------- | ----------------------------------------------------------------- |
| [description] | file:line | [reason: architectural decision needed / ambiguous intent / etc.] |

### Not Fixed (low priority, skipped)

| Finding       | File:Line | Why                                    |
| ------------- | --------- | -------------------------------------- |
| [description] | file:line | [reason: Low severity, cosmetic, etc.] |

### Test Results

[Pass/fail summary, or "no test suite detected"]

Every finding from the synthesis MUST appear in exactly one table. Do not silently drop findings.

If $ARGUMENTS contains additional focus areas (e.g., "focus on auth", "check the API layer"), weight those areas more heavily in both the review prompt and fix prioritization.

Phase 7: Walk Through Remaining Items

After presenting the summary, immediately walk through every item in the "Not Fixed (needs your decision)" and "Not Fixed (low priority, skipped)" tables one at a time:

For each item:

  1. Present the finding with context: what the problem is, how likely it is to cause issues, and what the fix would look like
  2. Give your recommendation (fix now / defer / skip / needs backend change / etc.)
  3. Wait for the user's response before moving to the next item
  4. If the user says fix it, apply the fix immediately, then move to the next item
  5. If the user says skip/defer, acknowledge and move to the next item

Start with "needs your decision" items (higher priority), then "low priority" items.

After all items are resolved, run /push-review-fixes to commit, push, and request Greptile re-review.

Phase 8: Self-Learning

After the full review cycle completes (all items resolved), update the deep review learnings file:

  1. Read ~/.claude/deep-review-learnings.md (create if it doesn't exist)
  2. Append a new entry with:
    • Date and project name
    • Which agents responded and which failed
    • Categories of findings (how many critical/high/medium/low)
    • Which findings were consensus (2+ agents) vs. single-agent
    • False positives (items the user skipped as non-issues)
    • Patterns: recurring finding types across reviews (e.g., "type guards on API boundaries", "credentials in state", "deprecated platform APIs")
  3. Check for patterns that appeared in 3+ reviews — these should be added to the review prompt template in Phase 2 as explicit check items
  4. If a pattern is project-specific (same project, same type of finding), suggest adding it to the project's .claude/rules/ or CLAUDE.md

The learnings file structure:

# Deep Review Learnings

## Recurring Patterns

- [pattern]: seen N times, last seen [date] in [project]

## Review Log

### [date] - [project] - [branch]

- Agents: [list]
- Findings: X critical, Y high, Z medium, W low
- Consensus findings: [list]
- False positives: [list]
- User decisions: [list of deferred/skipped items and why]