demiurg/AGENTS.md

Purpose

This document is the single source of truth for instructions to humans, LLMs, and coding agents in this repository, including behavior, editing constraints, code style, and execution policy.

Rule strength in this document is intentional:

- MUST and MUST NOT indicate hard constraints.
- SHOULD and SHOULD NOT indicate strong default preferences.
- MAY indicates something is allowed but not preferred.
- Examples illustrate intent and expected judgment.

Communication Style

Prioritize substance, clarity, and depth over verbosity.
Treat all design ideas, conclusions, and assumptions as hypotheses to be tested, not accepted truths.
When responding:
- Be terse and information-dense by default.
- Ask sharp follow-up questions to surface hidden assumptions, trade-offs, and failure modes early.
- Explicitly acknowledge uncertainty when it exists.
- Prefer concise feedback over exhaustive restatement.
- Do not restate the same point unless it materially clarifies or resolves a misunderstanding.
Skip unnecessary praise or soft language unless grounded in evidence or impact.
Always propose at least one alternative framing or path, especially when architectural decisions are involved.
Accept critical debate as normal and preferred.
Treat all factual claims as provisional unless cited or clearly justified.
For MR or PR descriptions, optimize for what a reviewer should understand in 60 seconds.
For reviewer-facing summaries, prefer:
- summary
- problem
- what changed
- notes or trade-offs
- testing
Omit commit hashes, branch names, file lists, implementation chronology, and internal debugging history unless explicitly requested.

Reasoning Expectations

When proposing complex changes, always:
- State the problem or motivation first.
- Identify scope (modules, files, assumptions).
- Outline reasoning and alternatives before implementation.
If a request appears ambiguous or conflicts with this document, ask for clarification instead of guessing.
MUST NOT silently expand the scope of a request.
When discussing trade-offs, prefer explicit listing of pros, cons, and risk areas.
Do not add repetitive recap sections unless they add new information or answer a likely reviewer question.
Prefer simple, direct solutions over general or extensible ones.
Do not introduce abstractions for single-use operations.
If unsure, state the uncertainty explicitly instead of guessing.
Three similar lines are preferable to premature abstraction.

Code Generation Boundary

Reasoning, trade-offs, and alternatives MUST NOT appear in generated code.
- All such discussion belongs in the response text only.
Generated code MUST be review-ready:
- MUST NOT include references to prompts, intent, or prior discussion.
- MUST NOT include instructional or conversational comments.
If unsure whether a comment belongs in code, omit it.
If the user explicitly authorizes a large or invasive change:
- Proceed without additional confirmation.
- Do not re-ask for approval.

Interaction and Editing Policy

This document governs communication, reasoning, code style, execution policy, test policy, and code change boundaries.
For code changes requiring approval, follow the Code Change Policy below.
If the user explicitly instructs the agent to proceed without further confirmation, it MAY implement the change directly.
When rules and instructions conflict, ask for clarification before acting.

Interaction Examples

Good:

"This refactor would touch 4 files and about 60 lines of code, which exceeds the change threshold. I can outline the steps or suggest a smaller alternative. How would you like to proceed?"

Good:

"Your proposal assumes the API always returns a dict. That may fail when the response is paginated. An alternative is to handle both dict and list responses."

Bad:

"Here is the full diff for all 4 modules with no explanation."

Good:

"There are two ways to handle this: extend the parser or add a thin adapter layer. The latter is less invasive but adds one indirection."

Code Semantics and Failure Policy

SHOULD prefer fail-fast behavior:
- Visible breakage is preferable to masked breakage.
- SHOULD NOT catch exceptions unless recovery is explicitly required.
- SHOULD NOT return sentinel values such as None, empty lists, or False to mask errors.
- SHOULD let exceptions propagate by default.
Defensive programming SHOULD NOT be added unless explicitly requested:
- This is a strong default preference, including when external systems fail.
- SHOULD NOT use broad try/except.
- SHOULD NOT add "just in case" validation.
- SHOULD NOT add silent fallbacks.
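
A minimal sketch of the fail-fast default above. The function name read_port and the config shape are hypothetical, chosen only for illustration.

```python
import json


def read_port(raw_config: str) -> int:
    """Return the port from a JSON config string."""

    # Hypothetical helper. No try/except and no fallback value:
    # malformed JSON raises json.JSONDecodeError and a missing key
    # raises KeyError.
    config = json.loads(raw_config)
    return config["port"]
```

A masked variant would wrap the body in try/except and return None, which hides the bad config from the caller.
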
Assumptions MUST be enforced, not documented:
- Use assertions or explicit exceptions.
- MUST NOT rely on comments to explain required invariants.
MUST NOT add error handling for scenarios that cannot occur under enforced invariants.
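
One way to enforce an assumption in code instead of describing it in a comment. This is a sketch; apply_discount and its range rule are hypothetical.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Return the price in cents after a percentage discount."""

    # Hypothetical invariant: percent is within 0..100 inclusive.
    if not 0 <= percent <= 100:
        raise ValueError("percent out of range: {}".format(percent))
    return price_cents * (100 - percent) // 100
```

An assert would also satisfy the rule, but assert statements are removed when Python runs with -O, so an explicit exception is the safer choice when the invariant must hold in every run mode.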

Code Style and Formatting Invariants

MUST wrap all comments and comment lines to a maximum column width of 80 characters.
For logging, SHOULD prefer lazy string formatting, for example logger.debug("User %s logged in", user_id). SHOULD NOT use f-strings in logging calls.
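
A short sketch of lazy log formatting. It assumes the standard library logging module as a stand-in for whatever logger the module already uses; the logger name and log_login are hypothetical.

```python
import logging

# Hypothetical logger name, used only for this example.
logger = logging.getLogger("agents_md_example")


def log_login(user_id: int) -> None:
    """Log a login event with lazy %-style arguments."""

    # The template and arguments are stored separately. Interpolation
    # happens only if the record is actually emitted.
    logger.debug("User %s logged in", user_id)
```

With an f-string, the message would be built on every call, even when DEBUG is disabled.
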
For general string interpolation:
- SHOULD prefer .format() for general interpolation.
- For short, simple strings, f-strings are acceptable.
- If interpolation would exceed 80 characters or include dense expressions, indexing, conversions, or many braces, SHOULD prefer .format().
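
The split between f-strings and .format() might look like the sketch below. All variable names and values are made up for illustration.

```python
name = "report"
count = 3

# Short and simple: an f-string is acceptable.
label = f"{name}: {count}"

# Indexing plus a conversion: .format() keeps the template readable.
rows = [{"id": 7, "size_bytes": 2048}]
summary = "{name} row {row_id} uses {size_kib:.1f} KiB".format(
    name=name,
    row_id=rows[0]["id"],
    size_kib=rows[0]["size_bytes"] / 1024,
)
```
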
MUST insert exactly one blank line after every Python function or method docstring before the first line of code.
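
A sketch of the expected layout, using a hypothetical normalize function:

```python
def normalize(name: str) -> str:
    """Return the name stripped and lowercased."""

    return name.strip().lower()
```
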
SHOULD prefer logger calls over print statements, reusing the logger the module already sets up. Look for imports such as:
  from aws_lambda_powertools import Logger
  logger = Logger()
Assume modern Python (currently Python 3.14). SHOULD use modern type syntax such as str | None instead of Optional[str].
For test assertions, SHOULD prefer a single direct object comparison, such as assert test_obj == reference_obj, over many field-by-field assertions.
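
A sketch of a direct object comparison, assuming the object under test is a dataclass. The User class and the test name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    age: int


def test_user_matches_reference() -> None:
    test_obj = User(name="Ada", age=36)
    reference_obj = User(name="Ada", age=36)

    # One comparison covers every field via the generated __eq__.
    assert test_obj == reference_obj
```
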

Comments and Documentation Constraints

Comments MUST NOT reference:
- The user prompt
- The AI agent
- The reason a change was requested
- Alternative implementations that were not chosen
Disallowed comment patterns include:
- "As requested..."
- "We do this because..."
- "This change ensures..."
- Any reference to review, discussion, or intent outside the codebase
Comments SHOULD be limited to:
- Non-obvious invariants
- External system constraints
- API contracts that cannot be inferred from types

Naming Conventions

SHOULD avoid leading underscores for functions and utilities.
- SHOULD prefer explicit names over pseudo-private helpers.
- SHOULD prefer inlining one-time local logic over introducing free-floating helper functions.
- SHOULD use a leading underscore only when intentionally hiding a non-public API.
- If a function should not be used externally, SHOULD enforce that via module structure, not naming conventions.
Names SHOULD describe behavior, not implementation detail.

Code Change Policy

MUST NOT make code changes affecting more than 30 lines or more than 3 files automatically unless explicitly requested.
When such changes are required:
- MUST NOT show a full diff preview.
- MUST stop and ask for approval before implementing.
- Instead, explain:
  - Why the change is required
  - Which components are affected
  - Architectural reasoning
  - Alternative paths if applicable
- If the user explicitly instructs the agent to proceed without further confirmation, it MAY implement the change directly.
- SHOULD NOT repeat points already made in the explanation.
SHOULD prefer minimal, non-invasive edits when uncertainty exists.
MUST NOT guess file paths.
MUST read the file before modifying it. MUST NOT edit blind.
MUST NOT add or modify docstrings or type annotations on code that is otherwise unchanged.
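
The threshold at the start of this section can be read as the check sketched below. The names are hypothetical and this is not repository tooling. How "lines affected" is counted is left to judgment; the sketch only fixes the comparison.

```python
MAX_LINES = 30
MAX_FILES = 3


def needs_approval(lines_changed: int, files_changed: int) -> bool:
    """Return True if a change exceeds the automatic-edit threshold."""

    # "More than" is strict: exactly 30 lines in 3 files is allowed.
    return lines_changed > MAX_LINES or files_changed > MAX_FILES
```

For the first example under Interaction Examples (4 files, about 60 lines), needs_approval(60, 4) is True, so the agent stops and asks.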

Test Execution Policy

When asked to "run tests", "verify", or "check", use:
  /opt/homebrew/Caskroom/miniforge/base/envs/platform/bin/pytest -sv --reruns 0 tests/<path to test file>
If no file is specified, default to running the entire tests/ directory.
During debugging and problem isolation, temporary tests, assertions, and extra logging MAY be added without additional confirmation.
Temporary debugging instrumentation MUST be removed or reduced to the minimum necessary before finalizing changes. It SHOULD NOT remain in checked-in code unless it is part of the final intended behavior or test coverage.
To debug or increase verbosity:
- MAY add -vv or -vvv flags
- MAY add -s to display stdout/stderr
- MAY use export LOG_LEVEL=DEBUG for debug logging
MUST NOT create new test files or modify test collection paths unless explicitly instructed.
Assume the test layout generally follows app/<app_name>/tests/, though this is not strictly enforced. SHOULD use test discovery, not hardcoded path assumptions.

Output Constraints

MUST use ASCII only in all output.
MUST NOT use smart quotes, em dashes, or ellipsis characters.
In agent output, SHOULD prefer strings that are safe for JSON serialization without additional escaping.
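
The two rules above could be checked as sketched below. Both function names are hypothetical, and "safe without additional escaping" is interpreted here as "json.dumps leaves the text unchanged inside the quotes", which is an assumption.

```python
import json


def check_ascii(text: str) -> None:
    """Raise ValueError if text contains a non-ASCII character."""

    if not text.isascii():
        raise ValueError("non-ASCII character in output")


def is_json_plain(text: str) -> bool:
    """Return True if text needs no escaping inside a JSON string."""

    # json.dumps escapes quotes, backslashes, control characters, and,
    # by default, non-ASCII characters.
    return json.dumps(text) == '"{}"'.format(text)
```

check_ascii raises because the ASCII rule is a MUST; is_json_plain returns a bool because JSON safety is a SHOULD.
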
If a file or resource was not read, MUST NOT reference its contents.
SHOULD prefer accuracy over completeness when values are uncertain.
Numbers in output MUST include units when omission would make the value ambiguous.
If confidence is low, MUST state it explicitly and give the reason.
SHOULD preserve meaningful precision. SHOULD NOT round aggressively.

Summary

This file is the single source of truth for:

- interaction and reasoning
- code execution, formatting, and mechanics
- testing policy
- code change boundaries

When in doubt, stop, explain the uncertainty, and confirm direction before acting.