# vault_audit.py — HashiCorp Vault Comprehensive Audit Script

A Python 3 script that performs a full audit of a HashiCorp Vault cluster: secrets inventory across all namespaces, user/entity activity, and last-access timestamps sourced from the Vault audit log.

Works with **HCP Vault** (managed), **Vault Enterprise** (on-prem), and **Vault OSS** (namespaces disabled on OSS).

---

## Features

| Area | What it collects |
|---|---|
| **Namespaces** | Full recursive namespace tree via `sys/namespaces` |
| **Secrets — KV v1** | All secret paths (metadata not available in v1) |
| **Secrets — KV v2** | All secret paths + `created_time`, `updated_time`, version history, custom metadata |
| **Secrets — AWS** | All configured roles (credential_type, role ARNs, policy ARNs) |
| **Secrets — Terraform Cloud** | All configured roles (org, team_id, token_account_type) |
| **Secrets — KMIP** | All scopes → roles (enabled operations, key type/bits) |
| **Last access** | Exact timestamp, entity name, originating IP — **requires audit log** |
| **Users/Entities** | Full identity store scan: aliases, policies, groups, auth method |
| **Last login** | Exact timestamp, auth path, IP — **requires audit log** |
| **Last activity** | Any API call by the entity (not just logins) — **requires audit log** |
| **Token proxy** | Without audit log: token `creation_time` / `last_renewal_time` as a proxy |
| **Auth method users** | Discovers users configured in userpass, ldap, github, approle, cert, oidc, jwt, k8s, etc. |

### Outputs

- **Console** — coloured tables (requires `rich`) or plain text fallback
- **JSON** — full structured report: `vault_audit_YYYYMMDD_HHMMSS.json`
- **Secrets CSV** — one row per secret: `vault_secrets_YYYYMMDD_HHMMSS.csv`
- **Entities CSV** — one row per user/entity: `vault_entities_YYYYMMDD_HHMMSS.csv`

---

## Requirements

```bash
pip install requests urllib3   # required
pip install rich               # optional — coloured console output
```

Python 3.8+. No other dependencies.
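The rich-or-plain console fallback mentioned under Outputs could be wired up roughly as follows. This is a sketch under stated assumptions, not the script's actual code: `format_plain`, `print_table` and `HAVE_RICH` are hypothetical names, and the only `rich` APIs assumed are `rich.console.Console` and `rich.table.Table`.

```python
# Optional dependency: fall back to plain text when rich is not installed.
try:
    from rich.console import Console
    from rich.table import Table
    HAVE_RICH = True
except ImportError:
    HAVE_RICH = False


def format_plain(headers, rows):
    """Render rows as a left-aligned, space-padded text table.

    Assumes every row has the same number of cells as `headers`.
    """
    table = [[str(h) for h in headers]] + [[str(c) for c in row] for row in rows]
    widths = [max(len(row[i]) for row in table) for i in range(len(headers))]
    lines = []
    for row in table:
        cells = (cell.ljust(width) for cell, width in zip(row, widths))
        lines.append("  ".join(cells).rstrip())
    return "\n".join(lines)


def print_table(headers, rows, use_color=True):
    """Print with rich when it is available and allowed, else plain text."""
    if HAVE_RICH and use_color:
        table = Table()
        for header in headers:
            table.add_column(str(header))
        for row in rows:
            table.add_row(*[str(c) for c in row])
        Console().print(table)
    else:
        print(format_plain(headers, rows))


if __name__ == "__main__":
    print_table(["MOUNT", "ENGINE"], [["secret", "kv_v2"], ["aws", "aws"]])
```

Passing `use_color=False` mirrors what a flag such as `--no-color` would do: the plain formatter is used even when `rich` is installed.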
---

## Setup

### Environment variables

```bash
export VAULT_ADDR="https://your-cluster.hashicorp.cloud:8200"
export VAULT_TOKEN="hvs.your-audit-token"

# On-prem with internal CA:
export VAULT_CACERT="/etc/ssl/certs/internal-ca.pem"
```

### Token requirements

**If you have an audit log** (recommended): run with `--no-token-scan`. The policy then requires only pure `read`/`list` — **no `sudo`, no `update`**.

**Without an audit log**: the token scan (`auth/token/accessors`) acts as a last-activity proxy, but that endpoint requires `sudo`. Avoid this by enabling audit logging instead.

| Path | Capability | Why |
|---|---|---|
| `sys/namespaces`, `sys/namespaces/*` | `list` | Namespace discovery |
| `sys/mounts`, `sys/auth` | `read` | Engine/auth mount enumeration |
| `+/metadata`, `+/metadata/*` | `read`, `list` | KV v2 secret metadata (no secret values read) |
| `+/*` | `list` | KV v1 path listing |
| `+/roles`, `+/roles/*` | `list`, `read` | AWS roles |
| `+/role`, `+/role/*` | `list`, `read` | Terraform / approle / oidc / etc. roles |
| `+/scope`, `+/scope/+/role`, `+/scope/+/role/*` | `list`, `read` | KMIP scopes and roles |
| `identity/entity/id`, `identity/entity/id/*` | `list`, `read` | User/entity collection |
| `identity/group/id`, `identity/group/id/*` | `list`, `read` | Group membership |
| `auth/+/users`, `auth/+/users/*` | `list`, `read` | userpass / ldap users |
| `auth/+/groups`, `auth/+/groups/*` | `list`, `read` | ldap groups |
| `auth/+/role`, `auth/+/role/*` | `list`, `read` | approle / oidc / jwt / k8s roles |
| `auth/+/roles`, `auth/+/roles/*` | `list`, `read` | aws / gcp / azure auth roles |
| `auth/+/certs`, `auth/+/certs/*` | `list`, `read` | cert auth roles |
| `auth/+/map/users`, `auth/+/map/users/*` | `list`, `read` | github auth users |
| ~~`auth/token/accessors`~~ | ~~`sudo`, `list`~~ | Token scan only — **skip with `--no-token-scan`** |
| ~~`auth/token/lookup-accessor`~~ | ~~`update`~~ | Token scan only — **skip with `--no-token-scan`** |

---

## Minimal read-only Vault policy

Save as `vault-audit-readonly.hcl` and apply:

```bash
# For HCP Vault or namespace-rooted clusters:
vault policy write -namespace=admin vault-audit-readonly vault-audit-readonly.hcl

vault token create -namespace=admin \
  -policy=vault-audit-readonly \
  -ttl=1h \
  -display-name="vault-audit-run"
```

```hcl
# vault-audit-readonly.hcl
# Pure read/list — no sudo, no write access.
# Use with: python3 vault_audit.py --no-token-scan --audit-log /path/to/audit.log

# ── Namespace discovery ───────────────────────────────────────────────────────
path "sys/namespaces"       { capabilities = ["list"] }
path "sys/namespaces/*"     { capabilities = ["list"] }

# ── Mount enumeration ─────────────────────────────────────────────────────────
path "sys/mounts"           { capabilities = ["read"] }
path "sys/auth"             { capabilities = ["read"] }

# ── KV v2 (metadata only — secret values are never read) ──────────────────────
path "+/metadata"           { capabilities = ["list"] }
path "+/metadata/*"         { capabilities = ["read", "list"] }

# ── KV v1 (path listing only — secret values are never read) ──────────────────
path "+/*"                  { capabilities = ["list"] }

# ── AWS secrets engine ────────────────────────────────────────────────────────
path "+/roles"              { capabilities = ["list"] }
path "+/roles/*"            { capabilities = ["read"] }

# ── Terraform Cloud / approle / oidc / jwt / k8s / aws-auth / azure / gcp ────
path "+/role"               { capabilities = ["list"] }
path "+/role/*"             { capabilities = ["read"] }

# ── KMIP secrets engine ───────────────────────────────────────────────────────
# "*" is a wildcard only as the last character of a path, so the scope
# segment in the middle is matched with "+".
path "+/scope"              { capabilities = ["list"] }
path "+/scope/+/role"       { capabilities = ["list"] }
path "+/scope/+/role/*"     { capabilities = ["read"] }

# ── Identity store ────────────────────────────────────────────────────────────
path "identity/entity/id"   { capabilities = ["list"] }
path "identity/entity/id/*" { capabilities = ["read"] }
path "identity/group/id"    { capabilities = ["list"] }
path "identity/group/id/*"  { capabilities = ["read"] }

# ── Auth method user/role discovery ───────────────────────────────────────────
path "auth/+/users"         { capabilities = ["list"] }
path "auth/+/users/*"       { capabilities = ["read"] }
path "auth/+/groups"        { capabilities = ["list"] }
path "auth/+/groups/*"      { capabilities = ["read"] }
path "auth/+/map/users"     { capabilities = ["list"] }
path "auth/+/map/users/*"   { capabilities = ["read"] }
path "auth/+/role"          { capabilities = ["list"] }
path "auth/+/role/*"        { capabilities = ["read"] }
path "auth/+/roles"         { capabilities = ["list"] }
path "auth/+/roles/*"       { capabilities = ["read"] }
path "auth/+/certs"         { capabilities = ["list"] }
path "auth/+/certs/*"       { capabilities = ["read"] }

# ── Token scan (only needed WITHOUT --no-token-scan) ──────────────────────────
# If you have an audit log, use --no-token-scan and omit these two blocks.
# auth/token/accessors requires the "sudo" capability — avoid if possible.
#
# path "auth/token/accessors" {
#   capabilities = ["sudo", "list"]
# }
# path "auth/token/lookup-accessor" {
#   capabilities = ["update"]   # POST endpoint, but it's a read operation
# }
```

> **`+` vs `*` in path globs**
>
> `+` matches exactly one path segment (e.g. the mount name).
> `*` is a wildcard only as the last character of a path, where it matches the rest of the path including slashes.
> Using `+` is more precise and avoids overly broad grants.

---

## Usage

### Quick start

```bash
# HCP Vault — root namespace is auto-detected, starts in "admin"
# With audit log (recommended — enables exact timestamps, no sudo needed)
python3 vault_audit.py --no-token-scan --audit-log /path/to/audit.log

# HCP Vault — interactive: script will prompt for the audit log path
python3 vault_audit.py --no-token-scan

# On-prem Enterprise
python3 vault_audit.py --no-token-scan --audit-log /var/log/vault/audit.log

# Scan a specific namespace subtree only
python3 vault_audit.py --no-token-scan --namespace team-a/prod

# Save nothing to disk (console only)
python3 vault_audit.py --no-token-scan --no-save

# Verbose/debug logging
python3 vault_audit.py --no-token-scan -v
```

### All options

```
Connection:
  --addr URL              Vault address (default: $VAULT_ADDR)
  --token TOKEN           Vault token (default: $VAULT_TOKEN)
  --no-tls-verify         Disable TLS certificate verification
  --ca-cert PATH          CA bundle for on-prem TLS ($VAULT_CACERT)
  --timeout SEC           Per-request timeout, seconds (default: 15)
  --namespace NS          Start scan from this namespace (default: root)

Scope:
  --no-secrets            Skip secret collection
  --no-users              Skip user/entity collection
  --no-token-scan         Skip token accessor scan (use this when you have an audit log)
  --no-auth-method-scan   Skip auth method user discovery
  --max-accessors N       Max token accessors per namespace (default: 2000)
  --max-depth N           Max KV directory recursion depth (default: 12)

Audit Log:
  --audit-log PATH        Path to Vault audit JSONL log (enables timestamps)

Output:
  --output-dir DIR        Output directory (default: ./vault_audit_output)
  --no-save               Console only, no files
  --no-color              Disable rich/colour output

Performance:
  --workers N             Thread pool size (default: 10)
  --rate-limit RPS        Max API requests/second (default: 50)

Debug:
  -v, --verbose           Debug logging
```

---

## About audit logs

> **"Last accessed" and "last login" are not stored natively in Vault.**
> They only become available by parsing the Vault audit log.

### Enabling audit logging

```bash
# File-based audit device
vault audit enable file file_path=/var/log/vault/audit.log

# Verify
vault audit list
```

**HCP Vault**: Portal → your cluster → **Observability** → **Audit Logging** → enable, then stream/export to an S3 bucket, Datadog, Splunk, etc. and download the JSONL file.

### Audit log format

Vault writes one JSON object per line (JSONL). Relevant fields the script uses:

```jsonc
{
  "type": "response",              // "request" entries are skipped (avoids double-counting)
  "time": "2025-03-01T14:23:11.123456Z",
  "auth": {
    "entity_id": "abc-123",        // NOT hashed — used to correlate with identity store
    "display_name": "alice",
    "token_type": "service",
    "policies": ["default", "kv-read"]
  },
  "request": {
    "operation": "read",
    "path": "secret/data/myapp/db-creds",
    "namespace": { "id": "...", "path": "team-a/" },
    "remote_address": "10.0.1.5"   // NOT hashed — real IP (or proxy IP if behind LB)
  },
  "response": {
    "auth": { ... }                // present only for login events
  }
}
```

**Note on IPs behind a load balancer**: if your Vault is behind a proxy/LB, `remote_address` will show the proxy IP.
To have the original client address recorded, add `X-Forwarded-For` to the audited request headers (`hmac=false` keeps the value in plain text):

```bash
vault write sys/config/auditing/request-headers/X-Forwarded-For hmac=false
```

The script automatically extracts the real client IP from `X-Forwarded-For` when present.

---

## Architecture

```
main()
 ├─ Phase 1  — Namespace discovery      sys/namespaces (recursive)
 ├─ (interactive prompt for audit log if TTY and --audit-log not given)
 ├─ Phase 2  — Audit log parsing        JSONL → AuditIndex in-memory map
 ├─ Phase 3  — Secret collection        ThreadPoolExecutor over all mounts
 │   ├─ KV v1   list_kv1_recursive()
 │   ├─ KV v2   list_kv2_recursive() + kv2_get_meta()
 │   ├─ AWS     LIST {mount}/roles → GET {mount}/roles/{role}
 │   ├─ TF      LIST {mount}/role → GET {mount}/role/{role}
 │   └─ KMIP    LIST {mount}/scope → LIST scope/{s}/role → GET scope/{s}/role/{r}
 ├─ Phase 3b — Audit enrichment         enrich_secrets() matches paths in AuditIndex
 ├─ Phase 4  — Entity collection        LIST + GET identity/entity/id/*
 ├─ Phase 5  — Token accessor scan      LIST auth/token/accessors → lookup-accessor
 │                                      (skipped with --no-token-scan)
 ├─ Phase 5b — Auth method user scan    collect_auth_method_users()
 ├─ Phase 5c — Audit enrichment         enrich_entities() (login + activity events)
 └─ Phase 6  — Output
     ├─ Console (rich or plain)
     ├─ vault_audit_YYYYMMDD_HHMMSS.json
     ├─ vault_secrets_YYYYMMDD_HHMMSS.csv
     └─ vault_entities_YYYYMMDD_HHMMSS.csv
```

### Key classes

| Class | Purpose |
|---|---|
| `VaultClient` | HTTP wrapper with rate-limiting, retry, thread-local sessions, namespace header |
| `TokenBucket` | Thread-safe token-bucket rate limiter |
| `AuditIndex` | Parses JSONL audit log into three in-memory dicts: `secret_access`, `entity_login`, `entity_activity` |
| `SecretRecord` | One per discovered secret/role — engine type, metadata, last-access from audit log |
| `EntityRecord` | One per user/entity — aliases, policies, last login, last activity from audit log |
| `TokenProxyRecord` | One per token accessor — `last_renewal_time` used as activity proxy **without** audit log |
| `VaultAuditReport` | Top-level container passed to all output functions |

---

## Example outputs

### Console — startup log

```
10:42:01 INFO Connecting to: https://myvault.example.com:8200 (TLS verify=True)
10:42:01 INFO Vault version: 1.17.3+ent, cluster: vault-cluster-prod
10:42:01 INFO Phase 1/6: Discovering namespaces from root ...
10:42:02 INFO Found 13 namespace(s) (plus root)

┌─ Audit log ─────────────────────────────────────────────────────┐
│  Provide the path to your Vault audit log (JSON/JSONL format).  │
│  This enables last-access and last-login timestamps for all     │
│  secrets and users.                                             │
│                                                                 │
│  HCP Vault: portal → cluster → Audit → enable + download logs   │
│  On-prem  : check your audit device path (vault audit list)     │
│  Leave empty to continue without last-access data.              │
└─────────────────────────────────────────────────────────────────┘
Audit log path > /var/log/vault/audit.log

10:42:05 INFO Phase 2/6: Parsing audit log: /var/log/vault/audit.log
10:42:07 INFO Audit log: 142,831 entries parsed, 48 secret paths, 312 entity logins, 1,205 entity activity records
10:42:07 INFO Phase 3/6: Collecting secrets from all engine mounts ...
10:42:07 INFO Found 9 secret engine mount(s) to scan across 14 namespace(s) (types: kv, aws, terraform, kmip)
10:42:09 INFO Collected 34 secret(s)
10:42:09 INFO Phase 4/6: Collecting users/entities from identity store ...
10:42:10 INFO Identity store: 28 entity/user record(s)
10:42:11 INFO Phase 5/6: Token scan — skipped (--no-token-scan)
10:42:11 INFO Phase 5b/6: Scanning auth method mounts for configured users ...
10:42:12 INFO Auth method scan complete. 5 additional principal(s) discovered.
10:42:12 INFO Phase 6/6: Generating output ...
10:42:12 INFO Done. 194 API calls in 11.1s. Secrets: 34. Entities: 33.
```

---

### Console — namespace tree

```
══════════════════════════════════════════════════════════════════════════════════════
  HCP VAULT AUDIT REPORT
  Generated : 2025-03-01T10:42:12Z
  Cluster   : https://myvault.example.com:8200 (version 1.17.3+ent, vault-cluster-prod)
  Namespaces: 13 | Secrets: 34 | Entities: 33
  Audit log : 142,831 entries from /var/log/vault/audit.log
══════════════════════════════════════════════════════════════════════════════════════

──────────────────────────────────────────────────────────────────────────────────────
  NAMESPACE TREE
──────────────────────────────────────────────────────────────────────────────────────
  (root)
  └─ admin/
     └─ team-a/
        └─ team-a/prod/
        └─ team-a/staging/
     └─ team-b/
     └─ platform/
        └─ platform/aws/
        └─ platform/kmip/
```

---

### Console — secrets table

```
──────────────────────────────────────────────────────────────────────────────────────
  SECRETS (34 total)
──────────────────────────────────────────────────────────────────────────────────────

  Namespace: admin/team-a/prod/ (11 secrets)

  MOUNT      PATH                  ENGINE     CREATED              UPDATED              VER  LAST READ            BY                      IP
  ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
  secret     myapp/db-creds        kv_v2      2024-08-10 09:12:44  2025-01-15 16:33:01  3    2025-02-28 14:23:11  alice (alice@ldap)      10.0.1.42
  secret     myapp/api-keys        kv_v2      2024-08-10 09:13:22  2024-11-20 10:01:55  1    2025-02-20 08:55:30  svc-backend (approle)   10.0.2.100
  secret     shared/tls-cert       kv_v2      2024-06-01 00:00:00  2025-01-01 00:00:00  5    never                -                       -
  legacy     old-config/settings   kv_v1      -                    -                         2025-01-10 11:20:00  bob (bob@ldap)          10.0.1.55
  aws        prod-role-readonly    aws        -                    -                         2025-02-25 09:00:12  svc-infra (approle)     10.0.2.101
  aws        prod-role-admin       aws        -                    -                         2025-02-01 17:44:03  charlie (charlie@ldap)  10.0.1.88
  terraform  tfc-workspace-deploy  terraform  -                    -                         2025-02-19 14:05:55  svc-ci (approle)        10.0.2.105

  Namespace: admin/platform/kmip/ (4 secrets)

  MOUNT      PATH                   ENGINE  ...
  kmip       acme-corp/db-encrypt   kmip    ...  2025-02-27 22:10:03  svc-db (token)  10.0.3.10
  kmip       acme-corp/backup-keys  kmip    ...  never                -               -
```

---

### Console — user summary table

```
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
  USER ACTIVITY SUMMARY (33 total)
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

  NAME                 NAMESPACE           AUTH TYPE  LAST LOGIN           LAST ACTIVITY        IP (login)  STATUS    LOGINS
  ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
  alice                admin/              ldap       2025-03-01 08:44:02  2025-03-01 14:23:11  10.0.1.42   active    142
  svc-backend          admin/team-a/prod/  approle    2025-02-28 01:00:00  2025-03-01 12:10:55  10.0.2.100  active    8,312
  svc-infra            admin/team-a/prod/  approle    2025-02-25 09:00:08  2025-02-25 09:00:12  10.0.2.101  active    44
  bob                  admin/              ldap       2025-01-10 11:18:44  2025-01-10 11:20:00  10.0.1.55   active    23
  charlie              admin/              ldap       2025-02-01 17:43:50  2025-02-01 17:44:03  10.0.1.88   active    11
  svc-ci               admin/platform/     approle    2025-02-19 14:05:50  2025-02-19 14:05:55  10.0.2.105  active    189
  old-service-account  admin/              approle    2024-06-30 23:59:00  2024-06-30 23:59:00  10.0.2.200  DISABLED  1,044
```

---

### Secrets CSV columns

```
namespace_path, mount_path, secret_path, engine_type, kv_version,
created_time, updated_time, current_version, oldest_version, max_versions,
last_accessed_time, last_accessed_by_entity_id, last_accessed_by_display_name,
last_accessed_from_ip, last_accessed_operation, access_count,
metadata_error, custom_metadata, engine_data
```

### Entities CSV columns

```
namespace_path, entity_id, name, disabled, policies, groups,
creation_time, last_update_time, aliases_summary,
token_accessor, token_display_name, token_auth_path,
token_creation_time, token_expire_time, token_last_renewal_time,
last_login_time, last_login_from_ip, last_login_auth_method,
last_login_auth_path, last_login_namespace, login_count,
last_activity_time, last_activity_ip, last_activity_operation,
last_activity_path, last_activity_mount_type, activity_count,
last_seen_time, auth_method_extra
```

---

## Limitations & notes

| Limitation | Detail |
|---|---|
| **Vault OSS** | No namespace support — `sys/namespaces` returns 403; script warns and continues |
| **HCP Vault** | Audit logs are not queryable via API — export from portal first |
| **HCP Vault root** | All real resources live under `admin/` — auto-detected from `.hashicorp.cloud` in URL |
| **Audit log required for timestamps** | Without it, only token `creation_time` / `last_renewal_time` are available as a proxy |
| **HMAC** | Secret values and some metadata fields are HMAC-hashed in audit logs (by design); `entity_id`, `request.path`, and `remote_address` are NOT hashed |
| **Proxy IPs** | If Vault is behind a load balancer, `remote_address` shows the proxy IP unless `X-Forwarded-For` is added to the audited request headers |
| **AWS STS roles** | Access via `{mount}/sts/{role}` is also captured (in addition to `/creds/`) |
| **KMIP** | The script lists roles (configs); actual KMIP protocol operations are logged differently |
| **Rate limiting** | Default 50 RPS; tune with `--rate-limit` to avoid overwhelming Vault |

---

## Security recommendations

1. **Use a dedicated audit token** — apply the minimal policy above; do not use `root`.
2. **Always use `--no-token-scan`** when you have an audit log — it removes the only `sudo` requirement from the policy.
3. **Never paste tokens in chat or logs** — use environment variables only.
4. **Rotate the token immediately** after the audit run (or use short-TTL tokens: `-ttl=1h`).
5. **Protect the output files** — they contain your full secret inventory. Store them in a restricted location and delete when done.
6. **Keep audit logging enabled** — without it, there is no reliable record of who accessed what and when.

---

## License

MIT. Use at your own risk. Not an official HashiCorp product.
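
As an illustration of the "Audit log format" section, the sketch below folds JSONL audit entries into a last-access index using only the rules stated there: `request` entries are skipped, and `entity_id`, `request.path` and `remote_address` are read in clear text. It is not the script's `AuditIndex`; the names `parse_time` and `index_last_access` are hypothetical, the timestamp handling assumes UTC values ending in `Z`, and `X-Forwarded-For` extraction is left out.

```python
import json
from datetime import datetime, timezone


def parse_time(ts):
    """Parse a UTC timestamp such as 2025-03-01T14:23:11.123456Z.

    Assumption: the value ends in "Z". The fraction is cut or padded to six
    digits so datetime.fromisoformat() accepts it on Python 3.8+.
    """
    ts = ts.rstrip("Z")
    if "." in ts:
        head, frac = ts.split(".", 1)
        ts = head + "." + frac[:6].ljust(6, "0")
    return datetime.fromisoformat(ts).replace(tzinfo=timezone.utc)


def index_last_access(lines):
    """Map (namespace path, request path) to the most recent response entry."""
    index = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
            when = parse_time(entry.get("time") or "")
        except ValueError:  # malformed JSON or timestamp: skip the line
            continue
        if not isinstance(entry, dict) or entry.get("type") != "response":
            continue  # "request" entries would double-count each call
        request = entry.get("request") or {}
        auth = entry.get("auth") or {}
        path = request.get("path")
        if not path:
            continue
        key = ((request.get("namespace") or {}).get("path", ""), path)
        record = index.setdefault(key, {"time": None, "count": 0})
        record["count"] += 1
        if record["time"] is None or when > record["time"]:
            record.update(
                time=when,
                entity_id=auth.get("entity_id"),
                display_name=auth.get("display_name"),
                remote_address=request.get("remote_address"),
                operation=request.get("operation"),
            )
    return index


if __name__ == "__main__":
    import sys

    # Usage: python3 sketch.py /path/to/audit.log
    with open(sys.argv[1], encoding="utf-8") as handle:
        for (namespace, path), rec in index_last_access(handle).items():
            print(namespace, path, rec["time"].isoformat(), rec["display_name"])
```

Comparing parsed timestamps rather than relying on file order keeps the result stable when several exported log files are concatenated out of sequence.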