@j1z0
Created February 19, 2026 19:29
AGENTS.md for agent/ — generated by generate_agents_md.py (Claude Opus 4.5) with generate→test→revise cycle

AGENTS.md Critique Report

What worked well

  • Clear "Do NOT modify" warnings: The explicit callouts for llm() method, public/, docker_context/, and runtime infrastructure files are excellent. These are bolded and marked as CRITICAL, which would prevent a new agent from breaking deployment.

  • Concrete code pattern: The implementation pattern section provides actual Python code showing class structure, required naming conventions, and method signatures. This is immediately actionable.

  • Command reference with context: Commands are clearly listed with their purpose, and the note to run from project root is helpful. The post-deployment validation example is a nice touch.

  • Tool addition workflow: The 3-step process for adding tools is clear and includes the dependency sync step.

  • Critical constraints section: Having a dedicated section that reiterates the most dangerous gotchas (llm() method, class name, dependencies) is good redundancy for an AI agent that might skim.

What was unclear or missing

1. What does this agent actually DO?

The opening says "Implements conversational workflows with tool calling" but gives no hint about the domain, use case, or what tools/capabilities it currently has.

Suggested fix: Add after the first paragraph:

This agent currently implements [describe what it does - e.g., "a customer support assistant with database query capabilities" or "a data analysis agent with plotting tools"]. See `agent/myagent.py` for the current tool set and workflow logic.

2. What is create_agent() and where does it come from?

The pattern shows create_agent(self.llm(), tools=self.tools, system_prompt=...) but never explains what this helper is, where it's imported from, or what it returns.

Suggested fix: Add to the implementation pattern section:

# Import from datarobot_genai.langgraph
from datarobot_genai.langgraph import create_agent

# Returns a callable node that combines LLM + tools + system prompt

3. What is MessagesState and StateGraph?

These types appear in the workflow property but are never explained or shown how to import.

Suggested fix: Add imports to the pattern:

from langgraph.graph import StateGraph
from langgraph.graph.message import MessagesState

4. Where do tools go and what should they look like?

"Create tool in agent/ directory" - but as what? A function? A class? A separate file? What's the import pattern?

Suggested fix: Expand the "Adding tools" section:

## Adding tools

1. Create tool in `agent/` directory:
   - Simple tools: add as decorated functions in `myagent.py`
   - Complex tools: create separate file like `agent/my_tool.py`
   - Use `@tool` decorator from `langchain_core.tools`

   Example:
   ```python
   from langchain_core.tools import tool

   @tool
   def my_tool(query: str) -> str:
       """Tool description for LLM."""
       return f"result for {query}"
   ```
2. Import and add to `MyAgent.tools` property:

    @property
    def tools(self) -> list:
        return [my_tool, other_tool]
3. If new packages needed: add to pyproject.toml dependencies, run dr task run agent:install

5. What goes in `agent/config.py`?

It's listed as a file to modify for "agent configuration" but there's no explanation of what configuration options exist or what format they take.

Suggested fix: Add:

### `agent/config.py`
Contains configuration dataclasses/constants:
- Model names and parameters
- System prompts or prompt fragments
- Tool-specific settings (API keys loaded from env, timeouts, etc.)
- Workflow behavior flags
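
To ground the description above, a minimal hypothetical `agent/config.py` could look like this (all names and values are illustrative, not taken from the actual project):

```python
import os
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentConfig:
    # Model name and parameters (datarobot/ prefix required by the SDK)
    model: str = "datarobot/{model_name}"
    temperature: float = 0.0
    # System prompt fragment; must keep the {user_prompt_content} placeholder
    system_prompt: str = "You are a helpful assistant.\n\n{user_prompt_content}"
    # Tool-specific settings: secrets from env, never hardcoded
    tool_api_key: str = field(default_factory=lambda: os.environ.get("TOOL_API_KEY", ""))
    tool_timeout_s: float = 30.0
    # Workflow behavior flags
    max_tool_calls: int = 5


CONFIG = AgentConfig()
```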

6. What test patterns should be followed?

The tests directory is mentioned but there's no guidance on how to write tests for agents, what fixtures exist, or what patterns to follow.

Suggested fix: Add:

## Testing

- Use fixtures from `tests/conftest.py` (see file for available fixtures)
- Test async workflows with `@pytest.mark.asyncio`
- Mock LLM calls to avoid external dependencies
- Example test structure:
  ```python
  async def test_my_feature(agent_instance):
      result = await agent_instance.workflow.ainvoke(...)
      assert result["messages"][-1].content == expected
  ```

7. What's the relationship between `dev.py`, `cli.py`, and `custom.py`?

They're marked "do not modify" but there's no explanation of what they do or why they exist.

Suggested fix: Add to "Do NOT modify" section:

- `dev.py` — Chainlit dev server entrypoint
- `cli.py` — Command-line interface for testing deployments
- `custom.py` — DataRobot custom model runtime wrapper

8. What does "deployment" mean in this context?

The doc mentions "deployable to DataRobot" and "Post-deployment validation" but never explains the deployment model or lifecycle.

Suggested fix: Add a "Deployment" section:

## Deployment

This agent deploys to DataRobot as a custom model. The deployment process:
1. Build: `dr task run agent:create-docker-context` generates deployment artifacts
2. Deploy: [via DataRobot UI/API - link to docs]
3. Validate: Use `agent:cli -- execute-deployment` to test deployed endpoint

The `llm()` method handles authentication to DataRobot's LLM Gateway in production.

Commands or instructions that need fixing

1. Missing agent:create-docker-context in commands section

The command is mentioned in "Do NOT modify" and "Critical constraints" but not in the "Commands" section where it should be documented.

Suggested fix: Add to Commands section:

dr task run agent:create-docker-context  # generate deployment artifacts (before deploying)

2. Ambiguous CLI command syntax

The -- <args> pattern is shown but no concrete examples of what args are available.

Suggested fix: Expand the CLI section:

dr task run agent:cli -- execute-deployment --user_prompt "test" --deployment_id <id>
dr task run agent:cli -- --help  # see all available commands

Suggested additions

1. Quick start section

Add at the top, right after "Key directories/files":

## Quick start

First time setup:
```shell
# Install dependencies
dr task run agent:install

# Run tests to verify setup
dr task run agent:test

# Start local dev server
dr task run agent:dev
# Visit http://localhost:8000 to interact with agent
```

Making changes:

  1. Edit agent/myagent.py (workflow, tools, prompts)
  2. Run dr task run agent:test to verify
  3. Test interactively with dr task run agent:dev
  4. Lint before committing: dr task run agent:lint

2. Troubleshooting section

## Troubleshooting

"Module not found" errors: Run `dr task run agent:install` after pulling changes

Tests failing after adding tool: Ensure the tool is imported and added to the `MyAgent.tools` property

Dev server not reloading: Restart the server; some config changes require a full restart

Deployment auth errors: Never modify the `llm()` method - it handles DataRobot authentication

3. File modification decision tree

## What file should I modify?

- Adding/changing agent behavior, tools, or prompts → `agent/myagent.py`
- Changing model name, system settings → `agent/config.py`
- Adding tests → `tests/test_*.py`
- Adding dependencies → `pyproject.toml` (then run `agent:install`)
- Everything else → Probably shouldn't modify; check with team

Overall verdict

NEEDS WORK — The document prevents catastrophic mistakes (good "do not touch" warnings) but lacks enough context for a new agent to work confidently without extensive codebase exploration. Key gaps: missing imports, unclear tool patterns, no explanation of what the agent does, and insufficient testing guidance.

  • Critique #1 (What does this agent actually DO?): Rejected. This is a template/framework document, not documentation for a specific agent implementation. The actual agent purpose varies per project and should be documented in the specific myagent.py file or project README, not in the generic AGENTS.md.

  • Critique about create_agent() import location: Partially addressed. Added the import but kept it concise since the exact import path may vary with SDK versions.

  • Critique about MessagesState and StateGraph imports: Valid, incorporated.

  • Critique about tool patterns: Valid, incorporated with expanded example.

  • Critique about config.py contents: Valid, added brief description inline rather than a separate section to avoid over-specifying what may vary per project.

  • Critique about test patterns: Valid, added Testing section.

  • Critique about dev.py, cli.py, custom.py explanations: Valid, incorporated into "Do NOT modify" section.

  • Critique about deployment explanation: Valid, added Deployment section.

  • Critique about missing create-docker-context command: Valid, added to Commands section.

  • Critique about CLI help: Valid, added --help example.

  • Quick start section: Valid, incorporated at top.

  • Troubleshooting section: Valid, incorporated.

  • File modification decision tree: Valid, incorporated as table for readability.

Agent Sub-project

LangGraph-based AI agent deployable to DataRobot. Implements conversational workflows with tool calling via datarobot-genai SDK.

Quick start

# Install dependencies
dr task run agent:install

# Run tests to verify setup
dr task run agent:test

# Start local dev server
dr task run agent:dev
# Visit http://localhost:8000 to interact with agent

Making changes:

  1. Edit agent/myagent.py (workflow, tools, prompts)
  2. Run dr task run agent:test to verify
  3. Test interactively with dr task run agent:dev
  4. Lint before committing: dr task run agent:lint

Key directories/files

Modify these

  • agent/myagent.py — main MyAgent class (workflow, nodes, tools, prompts)
  • agent/config.py — configuration (model names, system prompts, tool settings, behavior flags)
  • tests/ — pytest tests for agent behavior

Do NOT modify

  • agent/myagent.py::llm() method — CRITICAL: handles DataRobot LLM Gateway auth; changes break deployment
  • public/ — UI assets (favicon, logos, theme)
  • docker_context/ — generated via create-docker-context task
  • dev.py — Chainlit dev server entrypoint
  • cli.py — Command-line interface for testing deployments
  • custom.py — DataRobot custom model runtime wrapper

Agent implementation pattern

from typing import Any

from langgraph.graph import StateGraph
from langgraph.graph.message import MessagesState
from langchain_core.prompts import ChatPromptTemplate
from datarobot_genai.langgraph.agent import LangGraphAgent
from datarobot_genai.langgraph import create_agent  # Returns callable node combining LLM + tools

class MyAgent(LangGraphAgent):  # Do NOT rename class
    def __init__(self, ...):
        self.model = "datarobot/{model_name}"  # Must have datarobot/ prefix

    @property
    def workflow(self) -> StateGraph[MessagesState]:
        # Define nodes and edges
        ...

    @property
    def prompt_template(self) -> ChatPromptTemplate:
        # Must accept {user_prompt_content}
        ...

    @property
    def agent_node(self) -> Any:
        return create_agent(self.llm(), tools=self.tools, system_prompt=...)

Adding tools

  1. Create tool in agent/ directory:

    • Simple tools: add as decorated functions in myagent.py
    • Complex tools: create separate file like agent/my_tool.py
    • Use @tool decorator from langchain_core.tools

    Example:

    from langchain_core.tools import tool

    @tool
    def my_tool(query: str) -> str:
        """Tool description for LLM."""
        return f"result for {query}"
  2. Import and add to MyAgent.tools property:

    @property
    def tools(self) -> list:
        return [my_tool, other_tool]
  3. If new packages needed: add to pyproject.toml, run dr task run agent:install
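
For intuition, the `@tool` decorator builds the LLM-facing tool schema from the function's name, signature, and docstring. A rough stdlib-only sketch of that mapping (the real langchain schema format differs in detail; `my_tool` here is the hypothetical tool from step 1):

```python
import inspect


def my_tool(query: str) -> str:
    """Tool description for LLM."""
    return f"result for {query}"


# Roughly what @tool derives: name from __name__, description from the
# docstring, parameter names and types from the type-hinted signature.
schema = {
    "name": my_tool.__name__,
    "description": inspect.getdoc(my_tool),
    "parameters": {
        name: param.annotation.__name__
        for name, param in inspect.signature(my_tool).parameters.items()
    },
}
```

This is why the docstring and type hints matter: they are what the LLM sees when deciding whether and how to call the tool.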

Testing

  • Use fixtures from tests/conftest.py
  • Tests run async via pytest-asyncio (mode=auto)
  • Mock LLM calls to avoid external dependencies
@pytest.mark.asyncio
async def test_my_feature(agent_instance):
    result = await agent_instance.workflow.ainvoke(...)
    assert result["messages"][-1].content == expected
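
One way to mock the LLM call, sketched with nothing but the stdlib (the message shape and wiring are illustrative; a real test would patch the agent's workflow or `llm()` rather than building a bare mock):

```python
import asyncio
from unittest.mock import AsyncMock

# Stand-in for the compiled workflow: ainvoke resolves immediately
# with a canned response, so no real LLM is ever contacted.
workflow = AsyncMock()
workflow.ainvoke.return_value = {
    "messages": [{"role": "assistant", "content": "canned answer"}]
}


async def test_my_feature():
    result = await workflow.ainvoke({"messages": [{"role": "user", "content": "hi"}]})
    assert result["messages"][-1]["content"] == "canned answer"


asyncio.run(test_my_feature())
```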

Commands

Run from project root:

dr task run agent:install    # sync dependencies
dr task run agent:lint       # ruff + mypy + yamlfix
dr task run agent:test       # pytest
dr task run agent:dev        # local dev server with autoreload
dr task run agent:create-docker-context  # generate deployment artifacts
dr task run agent:cli -- <args>  # run CLI commands
dr task run agent:cli -- --help  # see available CLI commands

Post-deployment validation:

dr task run agent:cli -- execute-deployment --user_prompt "test" --deployment_id <id>

Deployment

This agent deploys to DataRobot as a custom model:

  1. Build: dr task run agent:create-docker-context generates deployment artifacts
  2. Deploy: via DataRobot UI/API
  3. Validate: Use agent:cli -- execute-deployment to test deployed endpoint

The llm() method handles authentication to DataRobot's LLM Gateway in production.

What file should I modify?

Task → File:

  • Adding/changing agent behavior, tools, or prompts → agent/myagent.py
  • Changing model name, system settings → agent/config.py
  • Adding tests → tests/test_*.py
  • Adding dependencies → pyproject.toml (then run agent:install)

Conventions

  • MyAgent class name is locked — do not rename
  • self.model must be prefixed with datarobot/
  • prompt_template must include {user_prompt_content} placeholder
  • Use create_agent() helper for nodes with LLM + tools
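
The `{user_prompt_content}` requirement can be enforced with a tiny stdlib check, e.g. as a unit test (the template text here is illustrative):

```python
from string import Formatter

template = "You are a helpful agent.\n\n{user_prompt_content}"

# Collect the named placeholders the template actually declares.
fields = {name for _, name, _, _ in Formatter().parse(template) if name}
assert "user_prompt_content" in fields
```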

Troubleshooting

  • "Module not found" errors: Run dr task run agent:install after pulling changes
  • Tests failing after adding tool: Ensure tool is imported and added to MyAgent.tools property
  • Dev server not reloading: Restart the server; some config changes require full restart
  • Deployment auth errors: Never modify the llm() method

Critical constraints

  • llm() method: Never modify — breaks DataRobot deployment auth
  • Class name: MyAgent is required by deployment infrastructure
  • Dependencies: After pyproject.toml changes, run agent:install
  • Docker context: Generated file — recreate via agent:create-docker-context