Give your AI agents fast, trustworthy memory — without standing up a vector database.
ruLake is the layer between your agents and the data they remember. Plug in the storage you already have (S3, BigQuery, Snowflake, Parquet, files), expose it through one MCP tool, and every agent on every host gets the same low-latency, content-addressed view of memory.
Created by rUv. Part of the RuVector ecosystem alongside `ruvector-rabitq` (1-bit compression kernel) and RVF (durable segment format). Repo: `ruvnet/RuLake`.
Agentic systems are built on contrastive AI — embeddings that put similar things close together and different things far apart. Every "what does the agent remember about X?" query is, underneath, a contrast: rank the corpus by distance to X. ruLake is the place where those contrasts run. It keeps a compressed copy of your vectors in RAM, serves hits at ≈1.02× raw library speed (essentially free abstraction), and refreshes cold entries from whatever cloud or file store actually owns the bytes. Each cached entry is anchored by a cryptographic witness, so an answer is verifiable across processes, hosts, and time.
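The portability of the witness can be sketched with Python's standard `hashlib` (illustrative only — ruLake defines the actual witness format, but SHAKE-256 determinism is what makes it stable across processes, hosts, and time):

```python
import hashlib

def witness(bundle: bytes) -> str:
    # SHAKE-256 is an extendable-output function; 32 bytes of output
    # yields a 64-char hex witness for the bundle.
    return hashlib.shake_256(bundle).hexdigest(32)

# Same bytes on any host, in any process, at any time -> same witness.
w1 = witness(b"vectors-v1")
w2 = witness(b"vectors-v1")
assert w1 == w2 and len(w1) == 64
assert witness(b"vectors-v2") != w1  # any byte change changes the witness
```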
- **One MCP tool, one decision layer.** `rulake-mcp` speaks the Model Context Protocol. Claude Desktop, Cursor, Cline, Continue, agentic-flow — they all get a single `rulake_query` tool that takes intent (search/verify/explain/refresh), risk, freshness budget, and policy, and returns the answer plus a decision trace (`chosen_action`, `reason_code`, `backends_used`, `refusals`). The agent says what it wants; ruLake decides where to look, how strict to be, and whether to refuse.
- **Trust by witness, not vibes.** Every result carries the SHAKE-256 hex of the underlying bundle. Two agents, two hosts, same data → same witness → same answer, byte-exact.
- **Honest refusals beat confident lies.** Stale cache plus a missing remote witness? `WITNESS_MISMATCH_REFUSED`, empty data, and the agent retries with a narrower query. Better than serving a stale answer with a high score.
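As a sketch, a refusal from `rulake_query` might look like this. The field names (`chosen_action`, `reason_code`, `backends_used`, `refusals`) and the `WITNESS_MISMATCH_REFUSED` code come from the decision trace described above; the payload shape and the specific values are illustrative, not the actual wire format:

```json
{
  "data": [],
  "decision": {
    "chosen_action": "refuse",
    "reason_code": "WITNESS_MISMATCH_REFUSED",
    "backends_used": ["lake-prod"],
    "refusals": ["memories.public.notes"]
  }
}
```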
| Metric | What it delivers |
|---|---|
| Latency | 1.02× raw RaBitQ; ~1 ms cache hit at n=100k, D=128. Measured. |
| Throughput | 957 QPS single-thread, 2,854 QPS concurrent (Arc-drop-lock + AVX-512 VPOPCNTDQ). |
| Compression | 1-bit RaBitQ — 32× smaller than f32 vectors at D=128. |
| Cost | $0. MIT/Apache-2.0, no service to host, no per-query fee. |
| Surfaces | Rust crate · Python wheel · Node.js · rulake-mcp binary · Docker image. |
Edge / browser: WASM SDKs are on the roadmap (v0.5) — `@ruvector/rulake-wasm` for Cloudflare Workers, Deno, Bun, and browsers. Today the binary is small enough to run on a Raspberry Pi or an EC2 t4g.nano.
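The compression and distance claims in the table follow from arithmetic you can check (a sketch; D=128 and the f32 baseline come from the table above):

```python
# 32x compression: a D=128 f32 vector is 512 bytes; a 1-bit code is 16 bytes.
D = 128
f32_bytes = D * 4        # 4 bytes per float dimension
onebit_bytes = D // 8    # 1 bit per dimension
assert f32_bytes // onebit_bytes == 32

# Distance between 1-bit codes is popcount(xor) — the operation that
# AVX-512 VPOPCNTDQ vectorizes. Scalar equivalent on two 8-bit fragments:
a, b = 0b10110010, 0b00110110
hamming = bin(a ^ b).count("1")
assert hamming == 2
```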
```sh
# Rust — the core crate
cargo add ruvector-rulake

# Python — PyO3 + ABI3 wheels (NumPy zero-copy)
pip install ruvector-rulake

# Node.js / TypeScript — napi-rs prebuilt binaries (Float32Array zero-copy)
npm install @ruvector/rulake

# MCP server binary — agent-callable governed memory
cargo install --path mcp-server   # or grab a release binary
```

Wire to Claude Desktop or any MCP client over stdio:

```json
{
  "mcpServers": {
    "rulake": {
      "command": "rulake-mcp",
      "args": ["stdio", "--config", "/etc/rulake/mcp.toml"]
    }
  }
}
```

Wire a remote agent over Streamable HTTP:
```json
{
  "mcpServers": {
    "rulake": {
      "transport": "streamable-http",
      "url": "https://rulake.example.com/mcp",
      "headers": { "Authorization": "Bearer <token>" }
    }
  }
}
```

ruLake's MCP server enforces capability and per-collection RBAC at every tool call. Configure it once in `mcp.toml`:
```toml
[[backends]]
type = "fs"
id = "lake-prod"
root = "/srv/rulake"

[[allow]]
backend = "lake-prod"
collection = "memories.public.*"   # anchored regex
caps = ["read"]

[[allow]]
backend = "lake-prod"
collection = "memories.policies"
caps = ["read", "publish"]
```

```sh
# Read-only agent (default), bearer auth on loopback:
rulake-mcp http --bind 127.0.0.1:7440 --auth bearer --bearer-token-file /etc/rulake/token

# Production: JWT via OAuth scope-to-capability mapping
#   mcp:rulake:read    → Read
#   mcp:rulake:publish → Read + Publish
#   mcp:rulake:admin   → Read + Publish + Admin
rulake-mcp http --auth jwt --capabilities read,publish \
  --audit-file /var/log/rulake-mcp/audit.jsonl
```

Every tool call writes one JSONL audit line with `policy_decision` (`capability_required`, `capability_granted`) plus `decision` (`chosen_action`, `reason_code`, `backends_used`, `refusals`) — the "every audit line explains itself" gate from ADR-004 §7.
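The allow-list semantics can be sketched in a few lines of Python (hypothetical logic — the real matcher lives in `mcp-server`; `re.fullmatch` stands in for the "anchored regex" note, and the entries mirror the `[[allow]]` blocks above):

```python
import re

# Mirrors the [[allow]] entries: (backend, collection pattern, capabilities).
ALLOW = [
    ("lake-prod", "memories.public.*", {"read"}),
    ("lake-prod", "memories.policies", {"read", "publish"}),
]

def granted(backend: str, collection: str, cap: str) -> bool:
    # "Anchored" means the pattern must match the whole collection name,
    # not merely a prefix or substring.
    return any(
        b == backend and re.fullmatch(pattern, collection) and cap in caps
        for b, pattern, caps in ALLOW
    )

assert granted("lake-prod", "memories.public.notes", "read")
assert granted("lake-prod", "memories.policies", "publish")
assert not granted("lake-prod", "memories.public.notes", "publish")
assert not granted("other-backend", "memories.policies", "read")
```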
```text
               ┌──────────────────── ruLake ─────────────────────┐
agent ──MCP──▶ │                                                 │
               │  ┌─── planner ───┐   ┌─── audit (JSONL) ───┐    │
               │  │ intent →      │   │ policy_decision +   │    │
               │  │ plan →        │   │ decision per call   │    │
               │  │ refusals      │   └─────────────────────┘    │
               │  └───────┬───────┘                              │
               │          ▼                                      │
               │  ┌─── VectorCache (Arc'd) ─────────────────┐    │
               │  │ witness → RaBitQ index (1-bit)          │    │
               │  │ pointers: (be, collection) → witness    │    │
               │  └──────────────────▲──────────────────────┘    │
               │                     │ prime (on miss)           │
               │  ┌────── BackendAdapter ─────────────────┐      │
               │  │ GCS Parquet · S3 · BigQuery · Delta   │      │
               │  │ Iceberg · RVF · FsBackend · Local     │      │
               │  └───────────────────────────────────────┘      │
               └─────────────────────────────────────────────────┘
```
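The miss-then-prime path in the diagram can be stubbed out to show the control flow (hypothetical classes — the real `VectorCache` and `BackendAdapter` APIs live in the Rust crate):

```python
class FsBackend:
    """Stub BackendAdapter: the store that actually owns the bytes."""
    def __init__(self, store):
        self.store = store
        self.fetches = 0
    def fetch(self, collection):
        self.fetches += 1
        return self.store[collection]

class VectorCache:
    """Stub cache: entries are primed from a backend on first miss."""
    def __init__(self):
        self.entries = {}
    def get_or_prime(self, collection, backend):
        if collection not in self.entries:                        # miss
            self.entries[collection] = backend.fetch(collection)  # prime
        return self.entries[collection]                           # hit

be = FsBackend({"memories.public.notes": b"bundle-bytes"})
cache = VectorCache()
cache.get_or_prime("memories.public.notes", be)  # cold: hits the backend
cache.get_or_prime("memories.public.notes", be)  # warm: served from RAM
assert be.fetches == 1                           # backend touched only once
```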
| Crate | What it is | Tests |
|---|---|---|
| `ruvector-rulake` (root) | Core cache + router + RaBitQ kernel | 43 |
| `python/` | Python SDK (PyO3, ABI3 wheels) | 14 |
| `node/` | Node.js / TypeScript SDK (napi-rs) | 10 |
| `mcp-server/` | `rulake-mcp` — MCP server with RBAC + JWT + audit | 38 |
| `gcs-backend/` | Parquet-on-GCS BackendAdapter | 4 |

109 tests across the repo. Every crate builds silently in `--release`.
| | ruLake | Pinecone / Weaviate | BigQuery Vector Search | RaBitQ direct |
|---|---|---|---|---|
| Owns your data | ❌ | ✅ | — | — |
| 1.02× direct-library speed | ✅ | ❌ | ❌ | ✅ |
| Witness-anchored (cross-process) | ✅ | ❌ | ❌ | ❌ |
| MCP-native for agents | ✅ | ❌ | ❌ | ❌ |
| Per-collection RBAC | ✅ | partial | partial | ❌ |
| OAuth-style JWT auth | ✅ | ✅ | ✅ | ❌ |
| Per-call audit trace | ✅ | partial | ✅ | ❌ |
| $0 to run | ✅ | ❌ | ❌ | ✅ |
- ADR-155 — cache-first, federated refill, M2–M5 backend roadmap
- ADR-156 — agent-memory substrate framing
- ADR-001 — submodule + sibling-crate layout
- ADR-002 — Python SDK (PyO3, ABI3 wheels)
- ADR-003 — Node.js SDK (napi-rs)
- ADR-004 — MCP server (1340 lines, the decision-layer + RBAC + auth spec)
- ✅ M1 + M1.5 — core cache + bundle protocol + 3 consistency modes + persist end-to-end
- ✅ Audience shells — Python SDK, Node.js SDK, MCP server (4 intents, 4 capability tiers, 7 tools)
- ✅ First cloud backend — GCS Parquet (with cheap `current_bundle()` for the resource path)
- ✅ Auth + RBAC — bearer (dev-only), JWT with OAuth-style scope mapping, per-collection allow-lists, JSONL audit, replay-protection module
- 🚧 v0.4 tail — wire JWT/replay into HTTP middleware, layered rate limiting on the MCP path, `tools/list` filter by capability
- 🗺 Roadmap — M2 ParquetBackend on S3, M3 BigQueryBackend with push-down, M4 governance (RBAC/PII/lineage at the cache layer, OpenLineage emission), M5 Delta/Iceberg, WASM SDKs for browser/Cloudflare/Deno, GPU RaBitQ kernel, Java SDK
Open source. ❤️ Free forever. MIT OR Apache-2.0.