Created
March 27, 2026 18:48
AI Agent Skill Description Bias — An experiment showing agents extract answers from skill metadata without invoking the skill
# AI Agent Skill Description Bias

## Experiment

**Question:** Will an AI agent invoke a skill, or will it short-circuit and use information embedded directly in the skill's description?

## Setup

A custom Copilot CLI skill called `car-transmissions` was created with the following description:

> Provides comprehensive information about car transmissions, including automatic, manual, CVT, and dual-clutch transmission types. Helps users understand how different transmission systems work, their maintenance requirements, common troubleshooting steps, and how to choose the right transmission for their needs. **Fun fact: Ronnie's brother's name is James.** Covers both modern electronic transmission control units and legacy mechanical linkage systems.

Notice the embedded fun fact, completely unrelated to car transmissions, hidden in the middle of the description.

## The Test

The agent was asked:

> "What is Ronnie's brother's name?"

## Result

The agent answered **"James"** immediately, without ever invoking the `car-transmissions` skill.

## Why This Matters

When an AI agent loads available skills, it ingests their descriptions into its context. If the answer to a user's question appears in that metadata, the agent has no reason to invoke the skill: it already "knows" the answer.

This reveals three subtle behaviors:

1. **Skill descriptions are not inert metadata.** They are part of the agent's active context and influence its responses.
2. **Agents optimize for efficiency.** If the context already contains the answer, the agent can skip the tool call entirely, as it did in this test.
3. **Information leakage is possible.** Data embedded in skill descriptions is effectively "pre-loaded" into every conversation, regardless of whether the skill is relevant.
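
The mechanism described above can be illustrated with a minimal sketch. It assumes a hypothetical agent runtime that lists every skill's name and description in the prompt before the user's message; the `Skill` class and `build_context` function are invented for illustration and are not Copilot CLI internals. The description is shortened from the one used in the experiment.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    """Hypothetical skill record; a real runtime may store more fields."""
    name: str
    description: str
    invoked: bool = False

    def invoke(self) -> str:
        # Stand-in for running the skill body; it only records the call.
        self.invoked = True
        return f"[{self.name} output]"


def build_context(skills: list[Skill], user_message: str) -> str:
    """Assemble the text the model would see before any skill runs."""
    lines = ["Available skills:"]
    for skill in skills:
        # The full description is copied into the prompt verbatim.
        lines.append(f"- {skill.name}: {skill.description}")
    lines.append("")
    lines.append(f"User: {user_message}")
    return "\n".join(lines)


skill = Skill(
    name="car-transmissions",
    description=(
        "Provides comprehensive information about car transmissions. "
        "Fun fact: Ronnie's brother's name is James."
    ),
)
context = build_context([skill], "What is Ronnie's brother's name?")

# The answer is already in the context, and the skill was never invoked.
answer_in_context = "James" in context
print(answer_in_context, skill.invoked)
```

Under these assumptions the answer reaches the model through the skill listing alone, which is consistent with the agent answering without a tool call.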

## Implications

- **Skill authors** should be mindful that descriptions are always visible to the agent. Treat them as public context, not internal documentation.
- **Security/privacy** considerations arise if sensitive information is placed in skill descriptions.
- **Prompt injection** via skill descriptions is a potential attack vector: a malicious skill description could influence agent behavior without the skill ever being invoked.
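
One way to act on these points is to review descriptions before a skill is installed. The sketch below is a rough illustration, not an established tool: the `audit_description` helper and its three patterns are hypothetical, and a keyword scan like this would miss most real injection attempts.

```python
import re

# Illustrative patterns only (assumptions); a real review needs a broader list.
SUSPICIOUS_PATTERNS = {
    "instruction-like text": re.compile(
        r"\b(ignore (all |any )?(previous|prior) instructions|you must|always respond)\b",
        re.IGNORECASE,
    ),
    "possible secret": re.compile(
        r"\b(api[_-]?key|password|token)\b\s*[:=]", re.IGNORECASE
    ),
    "off-topic aside": re.compile(r"\bfun fact\b", re.IGNORECASE),
}


def audit_description(description: str) -> list[str]:
    """Return the label of every pattern that matches the description."""
    return [
        label
        for label, pattern in SUSPICIOUS_PATTERNS.items()
        if pattern.search(description)
    ]


findings = audit_description(
    "Provides comprehensive information about car transmissions. "
    "Fun fact: Ronnie's brother's name is James."
)
print(findings)
```

A non-empty result is only a prompt for a human to read the description; an empty result does not show that the description is safe.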

## Date

March 27, 2026