@ruvnet · Last active April 26, 2026
ruLake — A Memory Lake for Agentic AI (MCP + RBAC + JWT + 109 tests, free + open source)

Give your AI agents fast, trustworthy memory — without standing up a vector database.

ruLake is the layer between your agents and the data they remember. Plug in the storage you already have (S3, BigQuery, Snowflake, Parquet, files), expose it through one MCP tool, and every agent on every host gets the same low-latency, content-addressed view of memory.

Created by rUv. Part of the RuVector ecosystem alongside ruvector-rabitq (1-bit compression kernel) and RVF (durable segment format). Repo: ruvnet/RuLake.


What it is, in one paragraph

Agentic systems are built on contrastive AI — embeddings that put similar things close together and different things far apart. Every "what does the agent remember about X?" query is, underneath, a contrast: rank the corpus by distance to X. ruLake is the place where those contrasts run. It keeps a compressed copy of your vectors in RAM, serves hits at ≈1.02× raw library speed (essentially free abstraction), and refreshes cold entries from whatever cloud or file store actually owns the bytes. Each cached entry is anchored by a cryptographic witness, so an answer is verifiable across processes, hosts, and time.
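
The witness idea is plain content-addressing. A minimal stdlib sketch, assuming a 16-byte SHAKE-256 digest (the digest length and the `witness()` helper are illustrative choices, not ruLake's documented API):

```python
import hashlib

def witness(bundle: bytes, digest_bytes: int = 16) -> str:
    """Content-address a bundle with SHAKE-256 (digest length is an assumption)."""
    return hashlib.shake_256(bundle).hexdigest(digest_bytes)

# Same bytes on any host, in any process -> same witness -> same answer.
a = witness(b"vector bundle v1")
b = witness(b"vector bundle v1")
c = witness(b"vector bundle v2")
print(a == b)   # identical content, identical witness
print(a != c)   # any byte change flips the witness
```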


Why agents in particular

  • One MCP tool, one decision layer. rulake-mcp speaks the Model Context Protocol. Claude Desktop, Cursor, Cline, Continue, agentic-flow — they all get a single rulake_query tool that takes intent (search / verify / explain / refresh), risk, freshness budget, and policy, and returns the answer plus a decision trace (chosen_action, reason_code, backends_used, refusals). The agent says what it wants; ruLake decides where to look, how strict to be, whether to refuse.
  • Trust by witness, not vibes. Every result carries the SHAKE-256 hex of the underlying bundle. Two agents, two hosts, same data → same witness → same answer, byte-exact.
  • Honest refusals beat confident lies. A stale cache plus a missing remote witness yields WITNESS_MISMATCH_REFUSED with empty data, and the agent retries with a narrower query. Better than serving a stale answer with a high score.
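
The rulake_query contract above can be sketched as plain data. The field names follow the prose (intent, risk, freshness budget, policy; chosen_action, reason_code, backends_used, refusals), but the exact key spellings and reason-code values are assumptions, not ruLake's wire format:

```python
# Illustrative request/response for rulake_query. Key spellings and the
# reason code are assumptions mirroring the prose, not the real schema.
request = {
    "intent": "search",              # search | verify | explain | refresh
    "risk": "low",
    "freshness_budget_ms": 5000,
    "policy": "default",
    "query": "what do we remember about deploys?",
}

response = {
    "data": ["hit-1", "hit-2"],      # ranked hits (empty on a refusal)
    "decision": {                    # the decision trace
        "chosen_action": "serve_from_cache",
        "reason_code": "CACHE_HIT_FRESH",
        "backends_used": ["lake-prod"],
        "refusals": [],
    },
}

# An honest refusal is detectable from the trace alone:
refused = response["decision"]["reason_code"].endswith("_REFUSED") and not response["data"]
print(refused)  # -> False for this fresh cache hit
```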

Performance, cost, footprint

|             | What it delivers |
|-------------|------------------|
| Latency     | 1.02× raw RaBitQ, ≈1 ms per cache hit at n=100k, D=128 (measured) |
| Throughput  | 957 QPS single-thread, 2,854 QPS concurrent (Arc-drop-lock + AVX-512 VPOPCNTDQ) |
| Compression | 1-bit RaBitQ — 32× smaller than f32 vectors at D=128 |
| Cost        | $0. MIT/Apache-2.0, no service to host, no per-query fee |
| Surfaces    | Rust crate · Python wheel · Node.js · rulake-mcp binary · Docker image |

Edge / browser: WASM SDKs are on the roadmap (v0.5) — @ruvector/rulake-wasm for Cloudflare Workers, Deno, Bun, and browsers. Today the binary is small enough to run on a Raspberry Pi or an EC2 t4g.nano.


Install (five surfaces, three install commands)

# Rust — the core crate
cargo add ruvector-rulake

# Python — PyO3 + ABI3 wheels (NumPy zero-copy)
pip install ruvector-rulake

# Node.js / TypeScript — napi-rs prebuilt binaries (Float32Array zero-copy)
npm install @ruvector/rulake

# MCP server binary — agent-callable governed memory
cargo install --path mcp-server   # or grab a release binary

Wire to Claude Desktop or any MCP client over stdio:

{
  "mcpServers": {
    "rulake": {
      "command": "rulake-mcp",
      "args": ["stdio", "--config", "/etc/rulake/mcp.toml"]
    }
  }
}

Wire a remote agent over Streamable HTTP:

{
  "mcpServers": {
    "rulake": {
      "transport": "streamable-http",
      "url": "https://rulake.example.com/mcp",
      "headers": { "Authorization": "Bearer <token>" }
    }
  }
}
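
Over that transport, a rulake_query invocation travels as an MCP tools/call request (a JSON-RPC 2.0 envelope, per the MCP spec). A sketch of the payload; the "arguments" keys are assumptions based on the tool description above, not ruLake's documented input schema:

```python
import json

# MCP tools/call envelope (JSON-RPC 2.0). The "arguments" keys are
# assumptions, not ruLake's documented schema.
payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "rulake_query",
        "arguments": {"intent": "search", "query": "recent deploy notes"},
    },
}

headers = {
    "Content-Type": "application/json",
    "Accept": "application/json, text/event-stream",  # Streamable HTTP may answer with either
    "Authorization": "Bearer <token>",                # same token as the client config
}

body = json.dumps(payload)
print(json.loads(body)["method"])  # -> tools/call
```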

Auth + RBAC (production-shaped)

ruLake's MCP server enforces capability + per-collection RBAC at every tool call. Configure once in mcp.toml:

[[backends]]
type = "fs"
id   = "lake-prod"
root = "/srv/rulake"

[[allow]]
backend    = "lake-prod"
collection = "memories.public.*"     # anchored regex
caps       = ["read"]

[[allow]]
backend    = "lake-prod"
collection = "memories.policies"
caps       = ["read", "publish"]

Then start the server:

# Read-only agent (default), bearer auth on loopback:
rulake-mcp http --bind 127.0.0.1:7440 --auth bearer --bearer-token-file /etc/rulake/token

# Production: JWT via OAuth scope-to-capability mapping
#   mcp:rulake:read     → Read
#   mcp:rulake:publish  → Read + Publish
#   mcp:rulake:admin    → Read + Publish + Admin
rulake-mcp http --auth jwt --capabilities read,publish \
                --audit-file /var/log/rulake-mcp/audit.jsonl
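
Taken together, the scope mapping and the allow-list amount to two checks per call: does the JWT grant the capability, and does an allow rule cover this backend + collection + capability? A sketch under stated assumptions (re.fullmatch as the "anchored regex" semantics, dots escaped in the patterns; the real matching rules live in the server):

```python
import re

# Assumed scope -> capability mapping, mirroring the comment block above.
SCOPE_CAPS = {
    "mcp:rulake:read":    {"read"},
    "mcp:rulake:publish": {"read", "publish"},
    "mcp:rulake:admin":   {"read", "publish", "admin"},
}

# Assumed allow-list shape, mirroring the [[allow]] entries above.
ALLOW = [
    ("lake-prod", r"memories\.public\..*", {"read"}),
    ("lake-prod", r"memories\.policies",   {"read", "publish"}),
]

def allowed(backend: str, collection: str, cap: str, scopes: list[str]) -> bool:
    granted = set().union(*(SCOPE_CAPS.get(s, set()) for s in scopes))
    if cap not in granted:
        return False                      # JWT scopes don't grant the capability
    return any(
        be == backend and re.fullmatch(pat, collection) and cap in caps
        for be, pat, caps in ALLOW        # fullmatch = anchored regex
    )

print(allowed("lake-prod", "memories.public.notes", "read", ["mcp:rulake:read"]))  # -> True
print(allowed("lake-prod", "memories.policies", "publish", ["mcp:rulake:read"]))   # -> False
```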

Every tool call writes one JSONL audit line with policy_decision (capability_required, capability_granted) and decision (chosen_action, reason_code, backends_used, refusals). This is the "every audit line explains itself" gate from ADR-004 §7.
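
A hypothetical audit line with those fields, parsed back out. The exact layout is an assumption; ADR-004 defines the real schema:

```python
import json

# Hypothetical audit line; field names come from the prose above,
# the nesting and values are illustrative.
line = json.dumps({
    "policy_decision": {"capability_required": "read", "capability_granted": True},
    "decision": {"chosen_action": "serve_from_cache",
                 "reason_code": "CACHE_HIT_FRESH",
                 "backends_used": ["lake-prod"],
                 "refusals": []},
})

rec = json.loads(line)
# "Every audit line explains itself": the required capability and the
# chosen action are answerable from the one line alone.
print(rec["policy_decision"]["capability_required"], rec["decision"]["chosen_action"])
```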


Architecture (the short version)

              ┌──────────────────── ruLake ─────────────────────┐
 agent ──MCP──▶                                                 │
              │   ┌─── planner ───┐    ┌─── audit (JSONL) ───┐  │
              │   │ intent →      │    │ policy_decision +   │  │
              │   │   plan →      │    │ decision per call   │  │
              │   │   refusals    │    └─────────────────────┘  │
              │   └───────┬───────┘                             │
              │           ▼                                     │
              │   ┌─── VectorCache (Arc'd) ─────────────────┐   │
              │   │   witness → RaBitQ index (1-bit)        │   │
              │   │   pointers: (be, collection) → witness  │   │
              │   └──────────────────▲──────────────────────┘   │
              │                      │ prime (on miss)          │
              │   ┌────── BackendAdapter ─────────────────┐     │
              │   │  GCS Parquet · S3 · BigQuery · Delta  │     │
              │   │  Iceberg · RVF · FsBackend · Local    │     │
              │   └───────────────────────────────────────┘     │
              └─────────────────────────────────────────────────┘
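
The miss path in the diagram can be sketched in a few lines: look up the cached index by witness, prime from the backend on a miss, and refuse rather than cache bytes that do not hash to the expected witness. The class, method names, and 16-byte digest here are illustrative, not ruLake's API:

```python
import hashlib

def shake(data: bytes) -> str:
    return hashlib.shake_256(data).hexdigest(16)   # digest length is an assumption

class Lake:
    def __init__(self, backend):
        self.backend = backend            # (backend_id, collection) -> bytes
        self.cache = {}                   # witness -> bundle bytes
        self.pointers = {}                # (backend_id, collection) -> witness

    def query(self, be: str, collection: str, expected_witness: str):
        hit = self.cache.get(expected_witness)
        if hit is not None:
            return hit, "CACHE_HIT"
        bundle = self.backend[(be, collection)]       # prime on miss
        if shake(bundle) != expected_witness:
            return None, "WITNESS_MISMATCH_REFUSED"   # honest refusal, no poisoned cache
        self.cache[expected_witness] = bundle
        self.pointers[(be, collection)] = expected_witness
        return bundle, "PRIMED"

backend = {("lake-prod", "memories.public.notes"): b"bundle-bytes"}
lake = Lake(backend)
w = shake(b"bundle-bytes")
print(lake.query("lake-prod", "memories.public.notes", w)[1])               # -> PRIMED
print(lake.query("lake-prod", "memories.public.notes", w)[1])               # -> CACHE_HIT
print(lake.query("lake-prod", "memories.public.notes", shake(b"other"))[1]) # -> WITNESS_MISMATCH_REFUSED
```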

Six sibling crates today

| Crate | What it is | Tests |
|-------|------------|-------|
| ruvector-rulake (root) | Core cache + router + RaBitQ kernel | 43 |
| python/ | Python SDK (PyO3, ABI3 wheels) | 14 |
| node/ | Node.js / TypeScript SDK (napi-rs) | 10 |
| mcp-server/ | rulake-mcp — MCP server with RBAC + JWT + audit | 38 |
| gcs-backend/ | Parquet-on-GCS BackendAdapter | 4 |

109 tests across the repo. Every crate builds warning-free in --release.


Compared with everything else

Feature checklist against Pinecone / Weaviate, BigQuery Vector Search, and direct RaBitQ. ruLake claims all of the following; the alternatives each cover some of them only partially or not at all:

  • Owns your data
  • 1.02× direct-library speed
  • Witness-anchored (cross-process)
  • MCP-native for agents
  • Per-collection RBAC
  • OAuth-style JWT auth
  • Per-call audit trace
  • $0 to run

Design docs

  • ADR-155 — cache-first, federated refill, M2–M5 backend roadmap
  • ADR-156 — agent-memory substrate framing
  • ADR-001 — submodule + sibling-crate layout
  • ADR-002 — Python SDK (PyO3, ABI3 wheels)
  • ADR-003 — Node.js SDK (napi-rs)
  • ADR-004 — MCP server (1340 lines, the decision-layer + RBAC + auth spec)

Status (2026-04-26)

  • ✅ M1 + M1.5 — core cache + bundle protocol + 3 consistency modes + persist end-to-end
  • ✅ Audience shells — Python SDK, Node.js SDK, MCP server (4 intents, 4 capability tiers, 7 tools)
  • ✅ First cloud backend — GCS Parquet (with cheap current_bundle() for the resource path)
  • ✅ Auth + RBAC — bearer (dev-only), JWT with OAuth-style scope mapping, per-collection allow-lists, JSONL audit, replay-protection module
  • 🚧 v0.4 tail — wire JWT/replay into HTTP middleware, layered rate limiting on the MCP path, tools/list filter by capability
  • 🗺 Roadmap — M2 ParquetBackend on S3, M3 BigQueryBackend with push-down, M4 governance (RBAC/PII/lineage at the cache layer, OpenLineage emission), M5 Delta/Iceberg, WASM SDKs for browser/Cloudflare/Deno, GPU rabitq kernel, Java SDK

Open source. ❤️ Free forever. MIT OR Apache-2.0.
