Prasad prasad-kumkar
@prasad-kumkar
prasad-kumkar / continuous-knowledge-update-mechanisms-for-rag.md
Created April 25, 2026 22:00
Continuous Knowledge Update Mechanisms for RAG

A lot of RAG systems fail quietly after launch. The retrieval layer still returns relevant-looking text, but the corpus drifts away from the real world. Policies change, product docs move, records get corrected, and operational state turns over faster than the index. The result is a system that sounds grounded while relying on stale evidence.

Keeping RAG current is not a matter of running a nightly reindex job. You need an update mechanism that decides what changed, what needs to be invalidated, what needs to be re-embedded, and what should stay untouched so relevance does not bounce around every day.

1. Start with source-specific ingestion cadence, not one global schedule

Different source classes change at different rates, and they matter differently to the user.

@prasad-kumkar
prasad-kumkar / multi-tenant-agent-management-platforms.md
Created April 25, 2026 21:58
Multi-Tenant Agent Management Platforms: How to Isolate Tenants, Enforce Policy, and Protect Shared Capacity

The hard part of a multi-tenant agent platform is not getting one agent to work. It is letting hundreds of customers run agents on the same platform without one tenant seeing another tenant's data, exhausting shared capacity, bypassing policy, or creating an incident that spreads sideways across the fleet.

If you are building a multi-tenant agent management layer, the goal is straightforward: every request, tool call, memory write, model invocation, secret lookup, and side effect should be attributable to one tenant and bounded by that tenant's policy, budget, and trust zone.

Tenant isolation is a platform boundary

The weakest design is shared everything with logical filtering sprinkled across services. That usually works until one query misses the filter, one cache key is too broad, or one background worker replays a job without tenant context.

@prasad-kumkar
prasad-kumkar / multi-hop-retrieval-and-query-planning.md
Created April 25, 2026 21:57
Multi-Hop Retrieval and Query Planning for Complex Agentic Search

Complex questions rarely fail because the model cannot write. They fail because the system treats retrieval like a single search call when the task actually requires staged evidence gathering. If a user asks, "Which vendors with open security exceptions also appear in renewal contracts expiring next quarter, and what precedent do we have for approving them?" one vector lookup is the wrong unit of work.

Multi-hop retrieval turns that request into an evidence workflow. The system decomposes the question, plans subqueries, selects the right retrieval tools for each step, decides when enough evidence has been gathered, and prevents the workflow from ballooning into an expensive search tree.
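The planning step described above can be sketched as a small dependency graph of subqueries, ordered so each hop runs only after the evidence it needs exists. This is an illustrative structure, not a specific planner's API:

```python
from dataclasses import dataclass, field

@dataclass
class SubQuery:
    id: str
    text: str
    depends_on: list[str] = field(default_factory=list)

def execution_order(plan: list[SubQuery]) -> list[str]:
    """Topologically order subqueries so each hop runs after its inputs."""
    done: set[str] = set()
    order: list[str] = []
    pending = {q.id: q for q in plan}
    while pending:
        ready = [q for q in pending.values() if set(q.depends_on) <= done]
        if not ready:
            raise ValueError("cyclic query plan")  # refuse a malformed plan
        for q in ready:
            order.append(q.id)
            done.add(q.id)
            del pending[q.id]
    return order
```

For the vendor example above, "vendors with open security exceptions" and "renewal contracts expiring next quarter" are independent hops, and the join over both is a third hop that depends on them.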

1. Start with decomposition, not embedding

The first mistake in agentic search is to embed the whole user query and hope the index understands the hidden structure. Complex questions usually contain several jobs mixed together:

@prasad-kumkar
prasad-kumkar / adaptive-chunking-strategies-for-rag.md
Created April 25, 2026 21:56
Adaptive chunking strategies for RAG systems

Static chunking is one of the quiet ways a RAG system degrades. The index looks healthy, retrieval latency is fine, and top-k results still come back. But answer quality gets unstable because the units being retrieved do not line up with the units that actually carry meaning.

If every document is split into the same token window with the same overlap, you usually get one of two failures. The chunks are too small, so the retriever finds fragments without enough local context to support a useful answer. Or the chunks are too large, so unrelated material gets embedded together and the retriever brings back semantically mixed passages that dilute ranking precision.

Adaptive chunking is the fix. Instead of treating chunk size as a single ingestion constant, it treats chunk boundaries as a document-specific decision driven by structure, semantics, and downstream retrieval behavior.
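As a rough sketch of the idea, boundaries can follow document structure (here, blank-line-separated sections) while small sections are merged up to a size budget. The thresholds are illustrative knobs, and oversize single sections pass through unsplit in this simplified version:

```python
def adaptive_chunks(text: str, min_chars: int = 200, max_chars: int = 1200) -> list[str]:
    """Chunk along structural boundaries, merging sections that are too small."""
    sections = [s.strip() for s in text.split("\n\n") if s.strip()]
    chunks: list[str] = []
    buf = ""
    for sec in sections:
        candidate = (buf + "\n\n" + sec).strip() if buf else sec
        if len(candidate) <= max_chars:
            buf = candidate          # keep accumulating small sections
        else:
            if buf:
                chunks.append(buf)   # flush the accumulated chunk
            buf = sec
    if buf:
        chunks.append(buf)
    # Merge a trailing fragment too small to stand alone as a retrieval unit.
    if len(chunks) >= 2 and len(chunks[-1]) < min_chars:
        chunks[-2] = chunks[-2] + "\n\n" + chunks.pop()
    return chunks
```

Even this crude version avoids the two failure modes above: fragments get merged until they carry enough local context, and unrelated sections are not forced into one window just to hit a fixed token count.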

1. Why static chunking breaks retrieval quality

@prasad-kumkar
prasad-kumkar / autonomous-source-credibility-assessment-in-rag.md
Created April 25, 2026 21:55
Autonomous Source Credibility Assessment in RAG Systems

Most RAG systems still treat retrieval rank as a proxy for truth. If a chunk is semantically similar to the question, it gets passed to the model as if it were trustworthy enough to answer from. That works for lightweight Q&A. It breaks in production systems where the answer depends on conflicting documents, changing source quality, or evidence that is relevant but not authoritative.

Autonomous source credibility assessment is the layer that decides which retrieved evidence deserves trust, how much trust it deserves, and what the system should do when the evidence does not agree. In practice, this is a sequence of decisions: source tiering, citation scoring, freshness evaluation, contradiction detection, and trust-aware synthesis.
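A minimal sketch of the scoring side, kept deliberately separate from retrieval similarity. The tiers, weights, and decay constants are illustrative policy choices, not a standard formula:

```python
import math

# Hypothetical source tiers: primary sources outrank summaries of them.
TIER_WEIGHT = {"primary": 1.0, "official_docs": 0.8, "blog": 0.4, "forum": 0.2}

def credibility(tier: str, age_days: float, citations: int) -> float:
    """Blend source tier, freshness, and citation signal into one trust score."""
    freshness = math.exp(-age_days / 365.0)                 # decays over ~a year
    citation_signal = min(1.0, math.log1p(citations) / math.log(101))
    return 0.5 * TIER_WEIGHT.get(tier, 0.1) + 0.3 * freshness + 0.2 * citation_signal

def rank(results: list[dict]) -> list[dict]:
    """Rank with relevance AND credibility, instead of trusting rank alone."""
    return sorted(
        results,
        key=lambda r: 0.6 * r["relevance"] + 0.4 * r["credibility"],
        reverse=True,
    )
```

With this split, the blog post summarizing a regulation can still win on similarity while the regulation itself wins on trust, and the synthesis step can see both signals.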

1. Retrieval relevance and source credibility are different signals

A document can be highly relevant and still be the wrong thing to trust. A blog post summarizing a regulation may rank above the regulation itself.

Canary Release Strategies for AI Agent Updates

Most software teams already understand canary releases for services. AI agents need the same discipline, but the release surface is wider. You are not only deploying code. You are deploying prompts, tool routing, model versions, safety policies, memory behavior, and sometimes the evaluation logic that judges success. A small change in any one of those layers can alter cost, latency, action quality, or write-path risk.

Safe rollout starts by treating the agent as a versioned runtime contract, not a loose bundle of assets. If you change the prompt but not the tool policy, or the model but not the regression suite, you do not really know what version is in production.
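One way to make that contract concrete is a release manifest that pins every layer together, so "what version is in production" is a single answerable question. The field names here are illustrative, not a specific platform's schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentRelease:
    """All layers versioned together: change one, cut a new release."""
    version: str
    prompt_hash: str
    model_id: str
    tool_policy_version: str
    eval_suite_version: str

def diff(a: AgentRelease, b: AgentRelease) -> list[str]:
    """List exactly which layers changed between two releases."""
    return [f for f in a.__dataclass_fields__ if getattr(a, f) != getattr(b, f)]
```

A canary comparison then has a precise scope: if `diff` reports only `model_id` changed, the regression suite knows which layer it is judging.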

What should count as an agent release

An agent release should package at least:

@prasad-kumkar
prasad-kumkar / governance-for-autonomous-research-agents.md
Created April 25, 2026 21:53
Governance for autonomous research agents

Governance for Autonomous Research Agents: Policies That Prevent Confident, Weak Outputs

Autonomous research agents usually fail by producing something fluent, structured, and plausible enough to move downstream even when the evidence behind it is thin.

That is a governance problem, not just a model problem.

If a research agent can search, collect evidence, synthesize findings, and publish into a dashboard, memo, or workflow, then the system needs explicit rules for acceptable sources, approval boundaries, evidence requirements, and replayable decision history.

Without that control layer, "confidence" becomes a formatting choice instead of a defensible signal.
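A sketch of what one such rule looks like as code: a publish gate that checks evidence against an allowlist and a minimum-source requirement before anything moves downstream. The domains and thresholds are hypothetical policy knobs:

```python
# Illustrative allowlist of acceptable source domains (hypothetical values).
ALLOWED_SOURCES = {"regulator.example.gov", "vendor-docs.example.com"}

def may_publish(finding: dict, min_sources: int = 2) -> tuple[bool, str]:
    """Gate publication on evidence quality, not on output fluency."""
    sources = {e["domain"] for e in finding.get("evidence", [])}
    if not sources <= ALLOWED_SOURCES:
        return False, "disallowed source in evidence"
    if len(sources) < min_sources:
        return False, "insufficient independent sources"
    return True, "ok"
```

The returned reason string is what makes the decision replayable: every blocked or approved finding carries the rule that decided it.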

@prasad-kumkar
prasad-kumkar / automated-rollback-mechanisms-for-rogue-agents.md
Created April 25, 2026 21:52
Automated Rollback Mechanisms for Rogue Agents

Teams usually think about rollback too late. They add autonomous agents, wire up tools, define approval paths, and maybe implement retries. Then one day an agent starts behaving badly. It loops on the same update. It writes inconsistent states across systems. It keeps taking an action after its context has drifted.

At that point, the question is no longer whether the agent was "aligned." The question is whether the platform can stop the damage, contain the blast radius, and recover without making the incident worse.

That is what rollback infrastructure is for. In production agent systems, rollback is not a single undo button. It is a control-plane capability made up of five parts:

  • triggers that detect unsafe or abnormal behavior early
  • containment controls that freeze the bad run before it spreads
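As one concrete example of the trigger layer, a minimal loop detector that freezes a run before repeated writes spread. The repeat threshold is an illustrative policy knob, not a recommended value:

```python
from collections import Counter

def is_looping(action_log: list[str], max_repeats: int = 3) -> bool:
    """Trigger containment when any single action repeats too often."""
    counts = Counter(action_log)
    return any(n > max_repeats for n in counts.values())

def maybe_freeze(run_state: dict, action_log: list[str]) -> dict:
    """Containment step: mark the run frozen so no further writes execute."""
    if is_looping(action_log):
        return {**run_state, "status": "frozen", "reason": "loop detected"}
    return run_state
```

Freezing first and diagnosing second is the point: the trigger does not need to know why the agent is looping, only that the run must stop taking actions.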
@prasad-kumkar
prasad-kumkar / continuous-learning-loops-for-ai-agents.md
Created April 25, 2026 21:51
Continuous Learning Loops for AI Agents: What to Learn Online, What to Hold Back, and How to Roll Updates Safely

Most teams say they want agents that "learn continuously," but that does not mean production agents should rewrite themselves from every interaction. The real design problem is deciding which parts of the system may adapt online, which require offline evaluation, and which should never self-modify at all.

The safe approach is to treat learning as a pipeline, not a reflex. Feedback gets captured with metadata, corrections are normalized into reusable artifacts, candidate updates are evaluated offline, and only then are limited changes rolled out under guardrails.

1. Separate the learning surface before you collect more feedback

Not every part of an agent stack should learn the same way.
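That separation can be expressed as an explicit policy map: each surface is tagged with how it is allowed to adapt, and online updates are rejected anywhere else. The surface names and their assignments here are illustrative:

```python
from enum import Enum

class LearnMode(Enum):
    ONLINE = "online"          # may adapt from live feedback
    OFFLINE_ONLY = "offline"   # candidate updates need offline evaluation
    FROZEN = "frozen"          # never self-modifies

# Hypothetical policy map over parts of the agent stack.
LEARNING_SURFACE = {
    "retrieval_hints": LearnMode.ONLINE,
    "prompt_template": LearnMode.OFFLINE_ONLY,
    "safety_policy":   LearnMode.FROZEN,
}

def accept_online_update(surface: str) -> bool:
    """Default-deny: unknown surfaces are treated as frozen."""
    return LEARNING_SURFACE.get(surface, LearnMode.FROZEN) is LearnMode.ONLINE
```

The default-deny case matters most: a surface nobody classified should never silently become self-modifying.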

@prasad-kumkar
prasad-kumkar / autonomous-competitor-intelligence-agents.md
Created April 25, 2026 21:50
Building autonomous competitor-intelligence agents

Building Autonomous Competitor-Intelligence Agents That Produce Defensible Signals

Most competitor monitoring systems are good at collecting noise and bad at producing defensible insight. A dashboard full of scraped pages, ad snapshots, and social posts is not intelligence unless the system can explain why a change matters, how confident it is, and what evidence supports the conclusion.

If you want an autonomous competitor-intelligence agent to be useful in production, design it as a governed research pipeline rather than a crawler with an LLM attached. The hard parts are source selection, change detection, cross-source reconciliation, confidence scoring, and making sure the final output can survive human review.

1. Start with explicit intelligence objects

Do not let the agent reason over raw page text forever. Define the classes of competitive signal you care about:
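As a sketch of what such an intelligence object might look like, with hypothetical field names and signal kinds: a typed claim that carries its own evidence and confidence, rather than raw page text.

```python
from dataclasses import dataclass, field

@dataclass
class CompetitiveSignal:
    kind: str                 # e.g. "pricing_change", "feature_launch"
    claim: str                # the human-readable conclusion
    confidence: float         # 0..1, must be backed by evidence
    evidence: list[str] = field(default_factory=list)  # source references

    def defensible(self) -> bool:
        """A signal is reviewable only if evidence backs a sane confidence."""
        return bool(self.evidence) and 0.0 <= self.confidence <= 1.0
```

Making `defensible()` a gate on the object itself means a dashboard can refuse to display any signal the agent cannot support, which is the difference between intelligence and collected noise.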