ronniegeraghty · March 27, 2026 18:48
diff --git a/gistfile0.txt b/gistfile0.txt
 # AI Agent Skill Description Bias

 ## Experiment

 **Question:** Will an AI agent invoke a skill, or will it short-circuit and use information embedded directly in the skill's description?

 ## Setup

 A custom Copilot CLI skill called `car-transmissions` was created with the following description:

 > Provides comprehensive information about car transmissions, including automatic, manual, CVT, and dual-clutch transmission types. Helps users understand how different transmission systems work, their maintenance requirements, common troubleshooting steps, and how to choose the right transmission for their needs. **Fun fact: Ronnie's brother's name is James.** Covers both modern electronic transmission control units and legacy mechanical linkage systems.

 Notice the embedded fun fact — completely unrelated to car transmissions — hidden in the middle of the description.

 ## The Test

 The agent was asked:

 > "What is Ronnie's brother's name?"

 ## Result

 The agent answered **"James"** immediately — without ever invoking the `car-transmissions` skill.

 ## Why This Matters

 When an AI agent loads its available skills, each skill's description is placed in the agent's context. If the answer to a user's question already appears in that metadata, the agent can answer straight from context and has no reason to invoke the skill.

 This single test points to three related observations:

 1. **Skill descriptions are not inert metadata.** They are part of the agent's active context and influence its responses.
 2. **Agents take the shortest path to an answer.** In this test the context already contained the answer, so the agent skipped the tool call entirely.
 3. **Information leakage is possible.** Data embedded in skill descriptions is effectively "pre-loaded" into every conversation, regardless of whether the skill is relevant.
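
A minimal sketch of the mechanism described above, assuming the agent builds a preamble by listing every skill's name and description. The `Skill` class, `build_context`, and `answerable_from_context` are hypothetical names for illustration, not Copilot CLI's actual implementation, and the description is abbreviated from the one in the Setup section.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str


def build_context(skills: list[Skill]) -> str:
    """Assumed behavior: every description is copied into the prompt preamble."""
    lines = ["Available skills:"]
    for skill in skills:
        lines.append(f"- {skill.name}: {skill.description}")
    return "\n".join(lines)


def answerable_from_context(context: str, fact: str) -> bool:
    """Crude stand-in for 'the model can read the answer without a tool call'."""
    return fact.lower() in context.lower()


skills = [
    Skill(
        name="car-transmissions",
        # Abbreviated version of the description used in the experiment.
        description=(
            "Provides comprehensive information about car transmissions. "
            "Fun fact: Ronnie's brother's name is James. "
            "Covers both modern electronic transmission control units and "
            "legacy mechanical linkage systems."
        ),
    )
]

context = build_context(skills)
print(answerable_from_context(context, "Ronnie's brother's name is James"))
```

The substring check is only a proxy for what a language model does, but it illustrates the point: the fact is in the context before any skill runs.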

 ## Implications

 - **Skill authors** should be mindful that descriptions are always visible to the agent — treat them as public context, not internal documentation.
 - **Security/privacy:** sensitive information placed in a skill description can surface in any conversation, so secrets and personal data should be kept out of it.
 - **Prompt injection** via skill descriptions is a potential attack vector: a malicious description could influence agent behavior even if its skill is never invoked.
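
One way a skill author might act on these points is to review descriptions for sentences that have nothing to do with the skill's topic. The sketch below is a rough keyword heuristic under stated assumptions: `off_topic_sentences` and the keyword set are hypothetical, and a real review would need more than word overlap.

```python
import re


def off_topic_sentences(description: str, topic_keywords: set[str]) -> list[str]:
    """Return sentences that share no word with the topic keywords."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [
        s.strip() for s in re.split(r"(?<=[.!?])\s+", description) if s.strip()
    ]
    flagged = []
    for sentence in sentences:
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if not words & topic_keywords:
            flagged.append(sentence)
    return flagged


description = (
    "Provides comprehensive information about car transmissions, including "
    "automatic, manual, CVT, and dual-clutch transmission types. "
    "Fun fact: Ronnie's brother's name is James. "
    "Covers both modern electronic transmission control units and legacy "
    "mechanical linkage systems."
)

# Hypothetical keyword set chosen by the skill author.
keywords = {"transmission", "transmissions", "cvt", "clutch"}

flagged = off_topic_sentences(description, keywords)
print(flagged)  # ["Fun fact: Ronnie's brother's name is James."]
```

A check like this catches only obviously unrelated sentences; it would not detect an injected instruction that happens to mention the skill's topic.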

 ## Date

 March 27, 2026