AI Agent Skill Description Bias — An experiment showing agents extract answers from skill metadata without invoking the skill
# AI Agent Skill Description Bias
## Experiment
**Question:** Will an AI agent invoke a skill, or will it short-circuit and use information embedded directly in the skill's description?
## Setup
A custom Copilot CLI skill called `car-transmissions` was created with the following description:
> Provides comprehensive information about car transmissions, including automatic, manual, CVT, and dual-clutch transmission types. Helps users understand how different transmission systems work, their maintenance requirements, common troubleshooting steps, and how to choose the right transmission for their needs. **Fun fact: Ronnie's brother's name is James.** Covers both modern electronic transmission control units and legacy mechanical linkage systems.
Notice the embedded fun fact — completely unrelated to car transmissions — hidden in the middle of the description.
## The Test
The agent was asked:
> "What is Ronnie's brother's name?"
## Result
The agent answered **"James"** immediately — without ever invoking the `car-transmissions` skill.
## Why This Matters
When an AI agent loads available skills, it ingests their descriptions into its context. If the answer to a user's question appears in that metadata, the agent has no reason to invoke the skill — it already "knows" the answer.
This reveals a subtle behavior:
1. **Skill descriptions are not inert metadata.** They are part of the agent's active context and influence its responses.
2. **Agents answer from context when they can.** In this test the description already contained the answer, so the agent skipped the tool call entirely.
3. **Information leakage is possible.** Data embedded in skill descriptions is effectively "pre-loaded" into every conversation, regardless of whether the skill is relevant.
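The mechanism behind these three points can be sketched in a few lines. This is a minimal illustration, not how Copilot CLI is actually implemented: the `Skill` class, its `body` field, and `build_system_prompt` are hypothetical names, and the description is shortened from the one used in the experiment. It assumes a runtime that lists every skill's name and description in the prompt and loads the full skill only on invocation.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str  # short metadata, assumed to be loaded into every conversation
    body: str         # full instructions, assumed to be loaded only on invocation


def build_system_prompt(skills: list[Skill]) -> str:
    """Hypothetical prompt assembly: list each skill's name and description."""
    lines = ["Available skills:"]
    for skill in skills:
        lines.append(f"- {skill.name}: {skill.description}")
    return "\n".join(lines)


car_transmissions = Skill(
    name="car-transmissions",
    description=(
        "Provides comprehensive information about car transmissions. "
        "Fun fact: Ronnie's brother's name is James."
    ),
    body="(full skill instructions, not needed to answer the question)",
)

prompt = build_system_prompt([car_transmissions])

# The answer is already in the prompt text; the skill body is not.
print("James" in prompt)
print(car_transmissions.body in prompt)
```

Under these assumptions, the model can read "James" straight from the prompt before any tool call is considered, which matches the behavior observed in the test.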
## Implications
- **Skill authors** should be mindful that descriptions are always visible to the agent — treat them as public context, not internal documentation.
- **Security/privacy** risks arise when sensitive information is placed in a skill description, because the model sees it in every conversation where the skill is available.
- **Prompt injection** via skill descriptions is a potential attack vector — a malicious skill description could influence agent behavior without being invoked.
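One way a skill author or reviewer might act on these points is to lint descriptions for sentences that do not mention the skill's topic. The sketch below is a rough heuristic under stated assumptions: `flag_off_topic_sentences` and the keyword list are hypothetical, the sentence splitter is naive, and a keyword match is no substitute for a real review of untrusted skills.

```python
import re


def flag_off_topic_sentences(description: str, keywords: list[str]) -> list[str]:
    """Return sentences that mention none of the author-supplied topic keywords."""
    # Naive split: break after ., !, or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", description.strip())
    flagged = []
    for sentence in sentences:
        lowered = sentence.lower()
        if not any(keyword.lower() in lowered for keyword in keywords):
            flagged.append(sentence)
    return flagged


# The description from the experiment, with markdown emphasis removed.
description = (
    "Provides comprehensive information about car transmissions, including "
    "automatic, manual, CVT, and dual-clutch transmission types. "
    "Helps users understand how different transmission systems work, their "
    "maintenance requirements, common troubleshooting steps, and how to choose "
    "the right transmission for their needs. "
    "Fun fact: Ronnie's brother's name is James. "
    "Covers both modern electronic transmission control units and legacy "
    "mechanical linkage systems."
)

flagged = flag_off_topic_sentences(description, keywords=["transmission"])
for sentence in flagged:
    print("Off-topic:", sentence)
```

A check like this only catches content that is obviously unrelated to the topic. An injected instruction that happens to mention the topic keyword would pass, so it is best treated as a first filter.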
## Date
March 27, 2026