Build AI-Powered QA Workflows Without Writing Code
The Testing Academy — AI Tester Batch 1X
- What Is LangFlow?
- Why QA/SDET Teams Should Care
- LangFlow Architecture — Mapped to QA Concepts
- Installation & Setup
- Understanding the LangFlow Canvas
- Flow 1 — Bug Report Classifier & Prioritizer
- Flow 2 — Test Case Generator from User Stories
- Flow 3 — API Test Validator with Live Endpoint Testing
- Flow 4 — Flaky Test Analyzer with RAG
- Flow 5 — Multi-Agent QA Pipeline (Jira → Test Cases → Code → Review)
- Connecting LangFlow to Your QA Stack
- LangFlow API — Triggering Flows from CI/CD
- Best Practices & Gotchas
- Cheat Sheet
## 1. What Is LangFlow?

LangFlow is a visual drag-and-drop builder for creating LLM-powered applications and AI agent workflows — no Python required, but fully extensible when you need it.
Think of LangFlow as the Postman for AI agents. Just like Postman lets you visually build, test, and chain API requests without writing curl commands, LangFlow lets you visually build, test, and chain AI workflows without writing Python scripts.
Traditional Agent Code LangFlow
───────────────────── ────────
50+ lines of Python → Drag 5 components onto canvas
pip install 6 packages → Already built in
Debug with print statements → See data flow in real-time
Share via Git repo → Export as JSON, import anywhere
| Feature | LangFlow | n8n | Flowise | Dify |
|---|---|---|---|---|
| Primary focus | LLM apps & agents | General automation | LLM chatbots | LLM apps |
| Visual builder | ✅ Full canvas | ✅ Workflow view | ✅ Canvas | ✅ Canvas |
| Custom Python | ✅ Custom Components | ✅ Function nodes | ❌ Limited | ❌ Limited |
| Agent support | ✅ Full (tools, memory) | ✅ AI Agent node | ✅ Basic | ✅ Basic |
| RAG support | ✅ Built-in | ❌ Needs plugins | ✅ Built-in | ✅ Built-in |
| Multi-agent | ✅ Sequential + parallel | ❌ Single agent | ❌ No | ❌ No |
| API endpoint | ✅ Auto-generated | ✅ Webhooks | ✅ API | ✅ API |
| Open source | ✅ Fully open | ✅ (with limits) | ✅ Fully open | ✅ Partial |
| Extensibility | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐ |
| QA Fit | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐ |
Bottom line: n8n is better for "connect 10 SaaS tools together." LangFlow is better for "build an intelligent AI agent that reasons about QA problems."
## 2. Why QA/SDET Teams Should Care

Most QA teams hear about AI agents and think:
"That sounds amazing, but I'd need to learn CrewAI, LangChain, LangGraph, set up environments, write hundreds of lines of Python, debug LLM outputs..."
LangFlow removes that barrier entirely. You build the same powerful agents you'd build in code — but visually.
| QA Workflow | Traditional Approach | LangFlow Approach |
|---|---|---|
| Bug triage bot | 80+ lines of Python + API setup | 5 components, 10 minutes |
| Test case generator | CrewAI crew with 3 agents | Visual flow with 3 agent nodes |
| Flaky test analyzer | Custom RAG pipeline + vector DB | Drag-drop RAG components |
| API contract tester | pytest + requests + validation | Agent + HTTP tool + prompt |
| Release readiness checker | Custom dashboard + scripts | Agent + data sources + output |
When NOT to use LangFlow:

- High-throughput production pipelines — if you're processing 10,000+ test results per minute, use code
- Complex state machines — LangGraph gives you finer control for intricate branching
- Custom ML model integration — If you need custom model fine-tuning, code is better
- When your team already knows Python well — Code gives you more control
Where LangFlow shines:

- Rapid prototyping of agent ideas ("Will this workflow even work?")
- Teams with mixed skill levels (manual testers + automation engineers)
- Demonstrating AI capabilities to stakeholders (visual is persuasive)
- Building internal tools that non-developers need to modify
- Connecting LLM reasoning to existing tools (Jira, Slack, APIs)
## 3. LangFlow Architecture — Mapped to QA Concepts

This is the key to understanding LangFlow as a QA/SDET professional. Every LangFlow component has a QA equivalent you already know.
| LangFlow Component | QA Equivalent | What It Does |
|---|---|---|
| Flow | Test Suite | A complete workflow from input to output |
| Component/Node | Test Step | A single operation in the workflow |
| Edge (Connection) | Data pipe between steps | Passes output of one component to input of another |
| Chat Input | Test data input | Entry point — what the user/system provides |
| Chat Output | Test report / assertion result | Final output the user sees |
| Prompt | Test template / POM | Reusable template that shapes agent behavior |
| LLM Model | Test execution engine | The brain that processes and generates responses |
| Agent | QA Engineer (automated) | An LLM with tools and reasoning capability |
| Tool | Utility function / helper | Specific capability the agent can invoke |
| Memory | Test context / session state | Information persisted across interactions |
| Vector Store | Test knowledge base | Searchable database of past bugs, docs, test data |
| Retriever | Search query on the knowledge base | Fetches relevant documents from the vector store |
| Text Splitter | Data chunker / parser | Breaks large documents into processable pieces |
| Embeddings | Feature extractor | Converts text to numerical vectors for search |
| API Request | HTTP client (like requests/axios) | Calls external APIs |
| Custom Component | Custom fixture / helper class | Your own Python logic wrapped as a LangFlow node |
| Playground | Test runner / debug console | Interactive panel to test your flow in real-time |
| API Endpoint | Webhook / CI trigger | Auto-generated URL to invoke your flow externally |
┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Chat Input │────▶│ Prompt │────▶│ LLM Model │────▶│ Chat Output │
│ (User types │ │ (System msg + │ │ (Groq/OpenAI │ │ (Response │
│ a question) │ │ user input) │ │ processes) │ │ displayed) │
└──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘
QA Analogy: This is identical to a test pipeline:
- Chat Input = Test data (your input)
- Prompt = Test script template (how to process the input)
- LLM Model = Test execution engine (runs the logic)
- Chat Output = Test result (what you get back)
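For contrast, here is roughly what those four nodes replace in code — a minimal sketch that calls Groq's OpenAI-compatible chat endpoint directly with `requests` (endpoint and model name are the publicly documented ones; verify against current Groq docs before relying on them):

```python
# Minimal code equivalent of Chat Input → Prompt → LLM Model → Chat Output.
# Assumes GROQ_API_KEY is set in the environment.
import os
import requests

def run_pipeline(user_input: str) -> str:
    resp = requests.post(
        "https://api.groq.com/openai/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
        json={
            "model": "llama-3.3-70b-versatile",
            "messages": [
                {"role": "system", "content": "You are a QA assistant."},  # the Prompt node
                {"role": "user", "content": user_input},                   # the Chat Input node
            ],
            "temperature": 0.1,
        },
        timeout=30,
    )
    resp.raise_for_status()
    # The returned text is what the Chat Output node would display
    return resp.json()["choices"][0]["message"]["content"]
```

In LangFlow, all of this is four nodes and three edges.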
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Chat Input │────▶│ Agent │────▶│ Chat Output │
│ │ │ ┌───────┐ │ │ │
│ │ │ │ Tool 1│ │ │ │
│ │ │ │ Tool 2│ │ │ │
│ │ │ │ Tool 3│ │ │ │
│ │ │ └───────┘ │ │ │
└──────────────┘ └──────────────┘ └──────────────┘
QA Analogy: This is like a QA engineer (Agent) who has access to Postman (Tool 1), the test database (Tool 2), and Jira (Tool 3). They decide which tool to use based on the task.
## 4. Installation & Setup

# Create a dedicated directory
mkdir langflow-qa && cd langflow-qa
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install LangFlow
pip install langflow
# Launch LangFlow
langflow run
# Output:
# ╭───────────────────────────────────────────────╮
# │ Welcome to ⛓ LangFlow │
# │ Access: http://127.0.0.1:7860 │
# ╰───────────────────────────────────────────────╯

Open your browser at http://127.0.0.1:7860.
docker run -d --name langflow \
-p 7860:7860 \
-v langflow_data:/app/langflow \
langflowai/langflow:latest
# Access at http://localhost:7860

# docker-compose.yml
version: '3.8'
services:
langflow:
image: langflowai/langflow:latest
ports:
- "7860:7860"
volumes:
- langflow_data:/app/langflow
environment:
- LANGFLOW_DATABASE_URL=sqlite:///./langflow.db
- LANGFLOW_AUTO_LOGIN=true
- GROQ_API_KEY=${GROQ_API_KEY}
- OPENAI_API_KEY=${OPENAI_API_KEY}
restart: unless-stopped
volumes:
  langflow_data:

docker compose up -d

Once LangFlow is running:
- Open the UI → http://localhost:7860
- Create an account (or auto-login if configured)
- Configure API keys:
- Click the ⚙️ Settings icon (bottom-left)
- Go to Global Variables
- Add your keys:
  - Name: `GROQ_API_KEY` → Value: your-groq-api-key
  - Name: `OPENAI_API_KEY` → Value: your-openai-api-key (optional)
- Verify → Create a simple flow to test connectivity
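Before building a flow, you can also verify the server from a script (the `/health` route is what recent LangFlow builds expose — confirm the path for your version):

```python
# Quick reachability check against a local LangFlow instance
import requests

resp = requests.get("http://127.0.0.1:7860/health", timeout=5)
print(resp.status_code, resp.text)  # expect 200 and an "ok"-style body
```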
This is your "Hello World" to verify everything works:
- Click "+ New Flow" → Select "Blank Flow"
- From the left sidebar, drag these onto the canvas:
- Inputs → Chat Input
- Models → Groq (or OpenAI)
- Outputs → Chat Output
- Connect them:
  - Chat Input `Message` → Groq `Input`
  - Groq `Text` → Chat Output `Text`
- Configure the Groq node:
  - Model: `llama-3.3-70b-versatile`
  - API Key: Select your `GROQ_API_KEY` global variable
  - Temperature: `0.1`
- Click the Playground button (bottom-right 💬 icon)
- Type: "Hello, I'm a QA engineer. What can you help me with?"
- If you get a response — your setup is working!
## 5. Understanding the LangFlow Canvas

┌─────────────────────────────────────────────────────────────────┐
│ [My Flows] [Store] [+ New Flow] [Settings]│
├──────────┬──────────────────────────────────────────────────────┤
│ │ │
│ Component│ CANVAS AREA │
│ Sidebar │ │
│ │ ┌─────┐ ┌─────┐ ┌─────┐ │
│ - Inputs │ │Node │───────▶│Node │───────▶│Node │ │
│ - Models │ │ A │ │ B │ │ C │ │
│ - Prompts│ └─────┘ └─────┘ └─────┘ │
│ - Agents │ │
│ - Tools │ │
│ - Outputs│ [▶ Run] │
│ - RAG │ [💬 Chat] │
│ - ... │ │
├──────────┴──────────────────────────────────────────────────────┤
│ Status Bar / Logs │
└─────────────────────────────────────────────────────────────────┘
| Category | What's Inside | QA Use Case |
|---|---|---|
| Inputs | Chat Input, Text Input, File Loader | Feed bug reports, specs, test data |
| Outputs | Chat Output, Text Output | Display results, reports |
| Prompts | Prompt Template | Define agent behavior / system prompts |
| Models | Groq, OpenAI, Ollama, HuggingFace | The LLM brain |
| Agents | Tool Calling Agent, CrewAI Agent | Autonomous QA agents |
| Tools | Python REPL, API Request, Search, Calculator | Agent capabilities |
| Memories | Chat Memory, Summary Memory | Context across conversations |
| Vector Stores | Chroma, FAISS, Pinecone | Store bug history, test docs |
| Embeddings | OpenAI, HuggingFace, Ollama | Convert text for vector search |
| Text Splitters | Recursive, Character-based | Break docs into chunks |
| Retrievers | Vector Store Retriever | Search through stored documents |
| Utilities | Conditional Router, Text Manipulation | Flow control and data ops |
| Custom | Your own Python components | Custom QA-specific logic |
| Action | How To |
|---|---|
| Add a component | Drag from sidebar → drop on canvas |
| Connect components | Click output handle → drag to input handle |
| Configure a component | Click the component → edit fields in the panel |
| Delete a connection | Click the edge → press Delete |
| Run the flow | Click ▶ Run button or open Playground |
| Test interactively | Click 💬 Playground (bottom-right) |
| Export flow | Click ⋮ menu → Export → saves as .json |
| Import flow | Click ⋮ menu → Import → load .json |
Every component has colored handles:
┌─────────────────────┐
Input ○─│ Component Name │─○ Output
(left) │ │ (right)
│ [Configuration] │
└─────────────────────┘
- Left handles (inputs): Data flows IN
- Right handles (outputs): Data flows OUT
- Handle colors match types: Text → Text, Message → Message, etc.
- Hover over a handle to see its name and expected data type
## 6. Flow 1 — Bug Report Classifier & Prioritizer

An intelligent bug triage system that takes raw bug reports and outputs structured classifications with severity, category, team assignment, and action recommendations.
QA Analogy: This replaces the manual bug triage meeting where someone reads each bug and decides "Is this P0 or P3? Who owns this?"
┌────────────┐ ┌───────────────┐ ┌───────────────┐ ┌─────────────┐
│ Chat Input │────▶│ Prompt │────▶│ Groq LLM │────▶│ Chat Output │
│ (Bug report│ │ (System: │ │ (llama-3.3- │ │ (Structured │
│ raw text) │ │ classifier │ │ 70b) │ │ triage) │
│ │ │ template) │ │ │ │ │
└────────────┘ └───────────────┘ └───────────────┘ └─────────────┘
- Click "+ New Flow" → "Blank Flow"
- Name it: "QA Bug Classifier"
Drag these 4 components onto the canvas:
- Inputs → Chat Input
- Prompts → Prompt
- Models → Groq
- Outputs → Chat Output
Click on the Prompt node and set the template:
Template:
You are a Senior QA Lead with 15 years of experience triaging bugs across web, mobile, and API platforms.
CLASSIFICATION RULES:
- CRITICAL: System down, data loss, security breach, payment failures — affects all users
- HIGH: Major feature broken, no workaround, affects >30% of users
- MEDIUM: Feature partially broken, workaround exists, affects <30% of users
- LOW: Cosmetic, typo, minor UX issue, affects <5% of users
CATEGORIES:
- UI/UX: Visual, layout, styling, responsiveness, accessibility
- FUNCTIONAL: Business logic, workflow, calculations, validations
- PERFORMANCE: Slow load, timeout, memory leak, high CPU
- SECURITY: Authentication, authorization, data exposure, injection
- DATA: Data loss, corruption, incorrect values, sync issues
- INTEGRATION: API failures, third-party service issues, webhook problems
- INFRASTRUCTURE: Server errors, deployment issues, environment problems
Analyze the following bug report and provide a STRUCTURED response:
BUG REPORT:
{bug_report}
RESPOND IN THIS EXACT FORMAT:
## 🐛 Bug Triage Report
**Severity:** [CRITICAL / HIGH / MEDIUM / LOW]
**Category:** [Category from list above]
**Confidence:** [HIGH / MEDIUM / LOW] (how confident you are in this classification)
### Impact Analysis
- **Users Affected:** [Estimated scope]
- **Business Impact:** [Revenue / Reputation / Compliance / None]
- **Workaround Available:** [Yes (describe) / No]
### Team Assignment
- **Primary Owner:** [Frontend / Backend / DevOps / QA / Security / Data]
- **Secondary:** [If cross-team]
### Recommended Actions
1. [Action 1]
2. [Action 2]
3. [Action 3]
### Reproduction Confidence
- **Steps Clear:** [Yes / No — what's missing]
- **Environment Specified:** [Yes / No]
- **Frequency:** [Always / Intermittent / Once]
### Similar Known Issues
- [Reference any common patterns you recognize]
Important: In the Prompt node, the `{bug_report}` variable is detected automatically. Connect the Chat Input's Message output to the `bug_report` input handle on the Prompt node.
Configure the Groq node:
- Model: `llama-3.3-70b-versatile`
- API Key: Select `GROQ_API_KEY` from Global Variables
- Temperature: `0.1` (low for consistent classification)
- Max Tokens: `1500`
Chat Input [Message] ──→ Prompt [bug_report]
Prompt [Prompt Message] ──→ Groq [Input]
Groq [Text] ──→ Chat Output [Text]
Open the Playground and paste this test bug:
Title: Checkout page shows 500 error when applying promo code "SUMMER25"
Steps to Reproduce:
1. Add any item to cart
2. Go to checkout
3. Enter promo code "SUMMER25" in the discount field
4. Click "Apply"
5. Page shows "500 Internal Server Error"
Environment: Production, Chrome 120, Windows 11
Frequency: 100% reproducible
User Reports: 47 support tickets in the last 2 hours
Note: The promo code was part of today's marketing campaign sent to 50,000 users via email
The agent should classify this as CRITICAL because it's blocking revenue and affecting an active marketing campaign, assign it to Backend team, and recommend immediate investigation.
Want to route critical bugs to a different output? Add a Conditional Router node:
Groq Output ──→ Conditional Router
├── If contains "CRITICAL" → Urgent Output (could trigger Slack)
└── Otherwise → Standard Output
- Drag Utilities → Conditional Router onto the canvas
- Set the condition:
- Input: Text from Groq
- Condition: `text contains "CRITICAL"` or `text contains "HIGH"`
- True Output: → Connect to an additional output (or webhook)
- False Output: → Connect to standard Chat Output
## 7. Flow 2 — Test Case Generator from User Stories

A flow that takes a user story (from Jira, a document, or typed in) and generates comprehensive test cases in structured format — covering happy path, negative, edge, and boundary scenarios.
QA Analogy: This automates what a QA engineer does during sprint planning — reading stories and writing test cases. The agent IS the test case writer.
┌────────────┐ ┌────────────────┐ ┌───────────────┐ ┌──────────────┐
│ Chat Input │────▶│ Prompt: │────▶│ Agent │────▶│ Chat Output │
│ (User story│ │ Test Engineer │ │ (Tool-calling)│ │ (Test cases) │
│ text) │ │ System Prompt │ │ │ │ │
└────────────┘ └────────────────┘ │ Tools: │ └──────────────┘
│ ├─ Calculator │
│ └─ URL Fetcher│
└───────────────┘
"+ New Flow" → "Blank Flow" → Name: "QA Test Case Generator"
- Inputs → Chat Input
- Prompts → Prompt
- Agents → Tool Calling Agent
- Models → Groq
- Tools → Calculator (for boundary value calculations)
- Outputs → Chat Output
Template:
You are a Senior Test Engineer at a top tech company with expertise in writing exhaustive test cases. You follow the ISTQB methodology and think like both a user and an attacker.
YOUR PROCESS:
1. First, extract all EXPLICIT requirements from the user story
2. Then, identify IMPLICIT requirements (security, performance, accessibility, error handling)
3. For each requirement, generate test cases across these categories:
- ✅ HAPPY PATH: Normal expected usage
- ❌ NEGATIVE: Invalid inputs, unauthorized access, error conditions
- 🔲 EDGE CASE: Boundary values, empty states, maximum limits
- ⚡ PERFORMANCE: Load, concurrency, timeout scenarios
- 🔒 SECURITY: Injection, authentication bypass, data exposure
USER STORY:
{user_story}
OUTPUT FORMAT FOR EACH TEST CASE:
### TC-[NNN]: [Descriptive Title]
- **Category:** [Happy Path / Negative / Edge Case / Performance / Security]
- **Priority:** [P0-Critical / P1-High / P2-Medium / P3-Low]
- **Preconditions:** [What must be true before this test runs]
- **Test Data:** [Specific values to use]
- **Steps:**
1. [Step 1]
2. [Step 2]
3. [Step 3]
- **Expected Result:** [What should happen]
- **Automation Candidate:** [Yes/No — explain why]
---
After all test cases, provide:
## 📊 Coverage Summary
| Category | Count | Coverage Notes |
|---|---|---|
| Happy Path | X | ... |
| Negative | X | ... |
| Edge Case | X | ... |
| Performance | X | ... |
| Security | X | ... |
| **Total** | **X** | |
## ⚠️ Assumptions & Clarifications Needed
- [List anything ambiguous in the story that could change test design]
## 🧪 Test Data Requirements
- [What test data needs to be set up]
Generate AT LEAST 15 test cases covering all categories.
- Click on the Tool Calling Agent node
- Connect:
- LLM: Connect the Groq model to the Agent's LLM input
- Tools: Connect Calculator tool to Agent's Tools input
- System Prompt: Connect Prompt output to Agent's System Prompt
- Agent settings:
  - Handle Parsing Errors: `True`
  - Verbose: `True` (for debugging)
Configure the Groq node:
- Model: `llama-3.3-70b-versatile`
- Temperature: `0.2` (slightly creative but structured)
- Max Tokens: `4000` (test cases need room)
Chat Input [Message] ──→ Prompt [user_story]
Prompt [Prompt Message] ──→ Agent [System Prompt]
Groq [Model] ──→ Agent [LLM]
Calculator [Tool] ──→ Agent [Tools]
Agent [Response] ──→ Chat Output [Text]
Playground Input:
User Story: As a customer, I want to add items to my shopping cart so that I can purchase multiple products in a single transaction.
Acceptance Criteria:
- Users can add products from the product listing page or product detail page
- Cart shows item name, quantity, price, and subtotal
- Users can update quantity (1-99) or remove items
- Cart persists across browser sessions (logged-in users)
- Guest users lose cart after 24 hours of inactivity
- Out-of-stock items cannot be added
- Maximum 50 unique items per cart
- Free shipping threshold: orders over $75
- Promo codes can be applied (one per order)
The agent generates 15+ test cases like:
- TC-001: Add single item from product listing page (Happy Path, P0)
- TC-002: Add same item twice — quantity should increment (Happy Path, P1)
- TC-003: Add item with quantity 0 (Negative, P2)
- TC-004: Add item with quantity 100 — exceeds max 99 (Edge Case, P1)
- TC-005: Add 51st unique item — should show error (Edge Case, P1)
- TC-006: Add out-of-stock item (Negative, P0)
- TC-007: Cart persistence after browser close — logged-in (Happy Path, P0)
- TC-008: Guest cart expiry after exactly 24 hours (Edge Case, P1)
- TC-009: Apply two promo codes simultaneously (Negative, P1)
- TC-010: SQL injection in promo code field (Security, P0)
- ...and more
If you want the agent to fetch live requirements from a URL (e.g., a Confluence page or Google Doc):
- Drag Tools → URL Fetcher onto canvas
- Connect it to the Agent's Tools input (alongside Calculator)
- Now the agent can fetch requirements from a URL when you say: "Generate test cases from this spec: https://docs.google.com/document/d/xyz"
## 8. Flow 3 — API Test Validator with Live Endpoint Testing

An agent that can actually call live API endpoints, validate the responses against expected schemas, and generate a test report — all without writing code.
QA Analogy: This is like having a QA engineer with Postman open, who reads your API spec, creates test requests, fires them, and writes up the results — automatically.
┌────────────┐ ┌────────────────┐ ┌───────────────┐ ┌──────────────┐
│ Chat Input │────▶│ Prompt: │────▶│ Agent │────▶│ Chat Output │
│ (API spec │ │ API Tester │ │ (Tool-calling)│ │ (Test report │
│ or URL) │ │ System Prompt │ │ │ │ with pass/ │
└────────────┘ └────────────────┘ │ Tools: │ │ fail) │
│ ├─ API Request│ └──────────────┘
│ ├─ Python REPL│
│ └─ Calculator │
└───────────────┘
"+ New Flow" → Name: "QA API Tester"
- Inputs → Chat Input
- Prompts → Prompt
- Agents → Tool Calling Agent
- Models → Groq
- Tools → API Request
- Tools → Python REPL (for response validation)
- Tools → Calculator (for response time checks)
- Outputs → Chat Output
Template:
You are an API Test Automation Engineer. Your job is to test REST API endpoints for correctness, performance, and reliability.
YOUR TESTING METHODOLOGY:
1. Understand the endpoint specification
2. Send requests using the API Request tool
3. Validate responses using Python REPL tool
4. Report results in a structured format
VALIDATION CHECKS (apply to EVERY request):
- ✅ Status code matches expected
- ✅ Response time < threshold (default: 2000ms)
- ✅ Response body contains required fields
- ✅ Data types are correct
- ✅ Error responses have proper format
- ✅ Headers include expected values (Content-Type, CORS, etc.)
TEST CATEGORIES TO COVER:
1. **Happy Path:** Valid request with correct parameters
2. **Missing Required Fields:** Omit each required field one at a time
3. **Invalid Data Types:** Send string where number expected, etc.
4. **Boundary Values:** Empty strings, max length, 0, negative numbers
5. **Authentication:** Test without token, expired token, invalid token
6. **Rate Limiting:** Note if headers indicate rate limits
API TO TEST:
{api_spec}
OUTPUT FORMAT:
## 🧪 API Test Report
### Endpoint: [METHOD] [URL]
| # | Test Scenario | Request | Expected | Actual | Status | Response Time |
|---|---|---|---|---|---|---|
| 1 | [Scenario] | [Brief request] | [Expected code] | [Actual code] | ✅/❌ | [ms] |
### Detailed Results
[For each test, show the full request and response]
### Summary
- **Total Tests:** X
- **Passed:** X ✅
- **Failed:** X ❌
- **Pass Rate:** X%
- **Avg Response Time:** Xms
### Issues Found
[List any failures with analysis]
### Recommendations
[What should be fixed]
The API Request component in LangFlow allows agents to make HTTP calls:
- Allowed methods: GET, POST, PUT, DELETE, PATCH
- Headers: Can set custom headers (auth tokens, content-type)
- Body: JSON payloads for POST/PUT
- Timeout: Configurable
The agent will use this tool automatically when it needs to call an endpoint.
This gives the agent the ability to write and run Python code for validation:
# The agent might write code like this internally:
import json
response = json.loads(api_response)
assert "id" in response, "Missing 'id' field"
assert isinstance(response["id"], int), "ID should be integer"
assert response["status"] == "active", f"Expected 'active', got '{response['status']}'"

Chat Input [Message] ──→ Prompt [api_spec]
Prompt [Prompt Message] ──→ Agent [System Prompt]
Groq [Model] ──→ Agent [LLM]
API Request [Tool] ──→ Agent [Tools]
Python REPL [Tool] ──→ Agent [Tools]
Calculator [Tool] ──→ Agent [Tools]
Agent [Response] ──→ Chat Output [Text]
Playground Input — Testing a Public API:
Test the JSONPlaceholder API for the Users endpoint:
Base URL: https://jsonplaceholder.typicode.com
Endpoint: /users
Expected behavior:
- GET /users → Returns list of 10 users (200 OK)
- GET /users/1 → Returns single user with fields: id, name, username, email, phone, website
- GET /users/999 → Should return 404 (non-existent user)
- POST /users → Should accept a new user with name and email (201 Created)
- GET /users?username=Bret → Should filter and return 1 user
Response time threshold: 1000ms
Test all scenarios above and report results.
The agent will:
- Call `GET https://jsonplaceholder.typicode.com/users` and validate the response
- Call `GET /users/1` and check all required fields
- Call `GET /users/999` and verify error handling
- Call `POST /users` with a test payload
- Call `GET /users?username=Bret` and verify filtering
- Use Python REPL to validate response structures
- Generate a comprehensive test report
For testing authenticated APIs, configure the API Request tool with headers:
Headers:
Authorization: Bearer {your_token}
Content-Type: application/json
X-Request-ID: langflow-qa-test-{{timestamp}}
Or create a Custom Component that handles token refresh.
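If you go the Custom Component route, a sketch might look like the following. The class scaffold (`Component`, `inputs`, `outputs`) follows LangFlow's published custom-component pattern, but the token endpoint, grant type, and field names are placeholders you'd adapt to your identity provider:

```python
# Hypothetical LangFlow Custom Component that fetches a fresh bearer token
# so downstream API Request tools never use a stale one.
import requests
from langflow.custom import Component
from langflow.io import MessageTextInput, SecretStrInput, Output
from langflow.schema import Data


class TokenRefresher(Component):
    display_name = "OAuth Token Refresher"
    description = "Returns a fresh Authorization header for API testing."

    inputs = [
        MessageTextInput(name="token_url", display_name="Token URL"),
        SecretStrInput(name="client_id", display_name="Client ID"),
        SecretStrInput(name="client_secret", display_name="Client Secret"),
    ]
    outputs = [Output(display_name="Auth Header", name="auth_header", method="build_header")]

    def build_header(self) -> Data:
        # Client-credentials grant — swap for whatever your provider uses
        resp = requests.post(
            self.token_url,
            data={
                "grant_type": "client_credentials",
                "client_id": self.client_id,
                "client_secret": self.client_secret,
            },
            timeout=10,
        )
        resp.raise_for_status()
        token = resp.json()["access_token"]
        return Data(data={"Authorization": f"Bearer {token}"})
```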
## 9. Flow 4 — Flaky Test Analyzer with RAG

A RAG-powered (Retrieval Augmented Generation) system that ingests your test history, failure logs, and past analyses — then intelligently diagnoses flaky tests based on patterns it has seen before.
QA Analogy: Imagine a senior QA engineer who has reviewed every single test failure in your project's history and can instantly pattern-match when they see a new failure. That's what RAG does — it gives the LLM a "memory" of your project's past.
┌────────────┐ ┌────────────────┐ ┌───────────────┐
│ File Loader │────▶│ Text Splitter │────▶│ Embeddings │
│ (Test logs, │ │ (Chunks docs │ │ (Converts to │
│ reports) │ │ into pieces) │ │ vectors) │
└────────────┘ └────────────────┘ └───────┬───────┘
│
▼
┌────────────┐ ┌────────────────┐ ┌───────────────┐
│ Chat Input │────▶│ Prompt │────▶│ Vector Store │
│ (New flaky │ │ (includes │ │ (Chroma/FAISS)│
│ test info)│ │ retrieved │ │ Stores & finds│
└────────────┘ │ context) │ │ similar cases │
└───────┬────────┘ └───────────────┘
│
▼
┌───────────────┐ ┌──────────────┐
│ Groq LLM │────▶│ Chat Output │
│ (Analyzes with│ │ (Diagnosis + │
│ full context)│ │ fix plan) │
└───────────────┘ └──────────────┘
Without RAG:
You: "test_login_redirect fails intermittently"
LLM: "Could be timing issues, network problems, or test data..." (generic advice)
With RAG (your test history loaded):
You: "test_login_redirect fails intermittently"
LLM: "Based on similar failures in your project: This test has failed 23 times in the last 30 days, with 87% of failures occurring between 2-3 AM UTC, correlated with the nightly DB refresh job. The fix for test_checkout_redirect, which had the same pattern, was to add a wait-for-DB-ready fixture. Recommended: Apply the same pattern."
Create sample test history files. In production, these would come from your CI/CD system.
File: test_failures.txt (create this locally and upload to LangFlow)
TEST FAILURE REPORT - test_login_redirect
Date: 2025-01-15
Status: FAIL (3 of 10 runs)
Error: TimeoutError: page.wait_for_selector('#dashboard-header', timeout=5000)
Stack: tests/auth/test_login.py:42
Environment: CI Runner #3, Chrome 120
Analysis: Failure occurs during high-load periods. The dashboard page takes
longer to render when the DB connection pool is saturated. Adding explicit
wait for network idle resolved similar issues in test_profile_load.
Resolution: Increased timeout to 15000ms and added page.wait_for_load_state('networkidle')
---
TEST FAILURE REPORT - test_cart_update_quantity
Date: 2025-01-18
Status: FAIL (5 of 20 runs)
Error: AssertionError: expected quantity 3, got 2
Stack: tests/cart/test_cart_operations.py:78
Environment: CI Runner #1-4 (all runners)
Analysis: Race condition when multiple async state updates fire simultaneously.
The quantity update triggers a re-render before the state is committed.
Resolution: Wrapped assertion in retry logic with 3 attempts and 500ms delay.
Also added data-testid="quantity-{item_id}" for more stable selectors.
---
TEST FAILURE REPORT - test_search_results_pagination
Date: 2025-01-20
Status: FAIL (2 of 15 runs)
Error: NoSuchElementError: Unable to locate element: .pagination-next
Stack: tests/search/test_search.py:112
Environment: CI Runner #2, Firefox 121
Analysis: Pagination component uses lazy loading. On slower CI runners,
the element hasn't rendered yet when the test tries to interact.
Only fails on Firefox due to different rendering pipeline.
Resolution: Added browser-specific wait strategy. For Firefox, wait for
MutationObserver to detect DOM changes before asserting pagination exists.
---
TEST FAILURE REPORT - test_payment_webhook_processing
Date: 2025-01-22
Status: FAIL (8 of 8 runs on staging)
Error: AssertionError: Payment status expected 'completed', got 'pending'
Stack: tests/payments/test_webhooks.py:95
Environment: Staging, all browsers
Analysis: NOT FLAKY - this is a real bug. The webhook endpoint was
modified in PR #4521 and now processes events asynchronously, but
the test expects synchronous processing. The staging Stripe webhook
secret was also rotated and not updated in env vars.
Resolution: Updated webhook secret. Modified test to poll for status
change with 30s timeout instead of immediate assertion.
---
TEST FAILURE REPORT - test_user_avatar_upload
Date: 2025-01-25
Status: FAIL (1 of 50 runs)
Error: FileNotFoundError: /tmp/test_avatar_12345.png
Stack: tests/profile/test_avatar.py:33
Environment: CI Runner #5 only
Analysis: The temp file cleanup cron job on Runner #5 runs every 5 minutes.
If the test happens to execute during cleanup, the file is deleted before
upload completes. Other runners have a 15-minute cleanup interval.
Resolution: Use unique temp directory per test run with atexit cleanup.
Configured Runner #5 to match other runners' cleanup schedule.
Add these components to the canvas:
Ingestion Pipeline (left side):
1. Data → File (to load test_failures.txt)
2. Processing → Recursive Character Text Splitter
3. Embeddings → OpenAI Embeddings (or Ollama Embeddings for free local option)
4. Vector Stores → Chroma DB

Query Pipeline (right side):

5. Inputs → Chat Input
6. Prompts → Prompt
7. Models → Groq
8. Outputs → Chat Output
9. Retrievers → Vector Store Retriever (connected to Chroma)
File Component:
- Upload `test_failures.txt` (or point to your file path)
Recursive Character Text Splitter:
- Chunk Size: `1000`
- Chunk Overlap: `200`
- Separator: `---` (splits on our report delimiter)
Embeddings:
- If using OpenAI: Select `text-embedding-3-small`, add API key
- If using Ollama (free, local): Model `nomic-embed-text`
Chroma DB:
- Collection Name: `qa_test_failures`
- Persist Directory: `/tmp/chroma_qa` (or any writable path)
Vector Store Retriever:
- Connect to Chroma DB
- Number of Results: `3` (retrieve top 3 similar past failures)
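To sanity-check what the ingestion side is doing, here is a rough plain-Python equivalent using the `chromadb` client directly (default embedding function; the collection name and persist path match the config above):

```python
# Ingest the '---'-separated failure reports into Chroma, then query —
# roughly what the File → Splitter → Embeddings → Chroma flow does.
import chromadb

client = chromadb.PersistentClient(path="/tmp/chroma_qa")
collection = client.get_or_create_collection("qa_test_failures")

with open("test_failures.txt") as fh:
    chunks = [c.strip() for c in fh.read().split("---") if c.strip()]

collection.add(documents=chunks, ids=[f"failure-{i}" for i in range(len(chunks))])

# Retrieve the 3 most similar past failures for a new symptom
results = collection.query(
    query_texts=["test fails intermittently on one CI runner, timeout on selector"],
    n_results=3,
)
print(results["documents"][0])
```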
Prompt Template:
You are a Flaky Test Diagnosis Specialist. You analyze test failures by comparing them against historical patterns from your team's test suite.
HISTORICAL CONTEXT (similar past failures from your project):
{retrieved_context}
NEW FAILURE TO ANALYZE:
{user_question}
PROVIDE YOUR DIAGNOSIS:
## 🔍 Flaky Test Diagnosis
### Classification
- **Truly Flaky?** [Yes / No — explain]
- **Flakiness Type:** [Timing / Race Condition / Resource / Environment / Data / Not Flaky — Real Bug]
- **Confidence:** [High / Medium / Low]
### Pattern Match
- **Similar Past Failures:** [Reference specific past cases from context]
- **Common Root Cause:** [What pattern connects these failures]
### Root Cause Analysis
1. **Primary Cause:** [What's most likely causing this]
2. **Contributing Factors:** [What makes it intermittent]
3. **Environment Factor:** [Is it runner/browser/time-specific?]
### Recommended Fix
[Specific code/config change based on what worked for similar past issues]
### Prevention Strategy
- **Short-term:** [Quick fix to stop the bleeding]
- **Long-term:** [Architectural change to prevent recurrence]
- **Monitoring:** [What to watch for to catch this class of issue]
### Risk Assessment
- **Ignore Risk:** [What happens if we skip this fix]
- **Fix Effort:** [Low / Medium / High — estimated hours]
- **Priority:** [Fix now / Next sprint / Backlog]
Ingestion (one-time, or refresh periodically):
File [Data] ──→ Text Splitter [Input]
Text Splitter [Chunks] ──→ Chroma DB [Documents]
Embeddings [Embedding] ──→ Chroma DB [Embedding]
Query (every time you ask):
Chat Input [Message] ──→ Prompt [user_question]
Chat Input [Message] ──→ Vector Store Retriever [Query]
Vector Store Retriever [Results] ──→ Prompt [retrieved_context]
Prompt [Prompt Message] ──→ Groq [Input]
Groq [Text] ──→ Chat Output [Text]
Playground Input:
Our test test_order_confirmation_email is failing intermittently.
It fails about 3 out of 10 runs on CI Runner #3.
Error: AssertionError: expected email count 1, got 0
Stack: tests/orders/test_notifications.py:67
The test places an order and then immediately checks the email inbox.
It's been getting worse over the past week, especially during peak hours.
The agent finds similar patterns from the loaded history (timing issues, async processing, CI runner-specific problems) and provides a targeted diagnosis referencing your actual past fixes — not generic advice.
In production, you'd load:
- CI/CD test reports (JSON from pytest, JUnit XML)
- Bug postmortems (Confluence pages, Google Docs)
- Slack thread resolutions (exported conversations)
- Git commit messages (fix descriptions)
The more history you feed the RAG, the smarter it gets.
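As a starting point, a small script can convert CI output into the report format used above — a sketch assuming standard JUnit XML and the file names shown earlier:

```python
# Convert a JUnit XML report into '---'-separated failure reports for
# ingestion. The layout mirrors test_failures.txt above; the 'Analysis'
# line is left for a human (or another flow) to fill in later.
import xml.etree.ElementTree as ET
from datetime import date

def junit_to_failure_reports(xml_path: str) -> str:
    root = ET.parse(xml_path).getroot()
    reports = []
    for case in root.iter("testcase"):
        failure = case.find("failure")
        if failure is None:
            failure = case.find("error")
        if failure is None:
            continue  # skip passing tests
        reports.append(
            f"TEST FAILURE REPORT - {case.get('name')}\n"
            f"Date: {date.today().isoformat()}\n"
            f"Error: {failure.get('message', 'unknown')}\n"
            f"Stack: {case.get('classname')}\n"
            f"Analysis: (pending triage)"
        )
    return "\n---\n".join(reports)

# Append today's failures to the RAG source file
with open("test_failures.txt", "a") as fh:
    fh.write("\n---\n" + junit_to_failure_reports("junit-report.xml"))
```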
## 10. Flow 5 — Multi-Agent QA Pipeline (Jira → Test Cases → Code → Review)

A full multi-agent pipeline where four AI agents collaborate sequentially — mimicking your real QA team workflow from requirement to reviewed test code.
QA Analogy: This is your entire sprint QA process automated:
- BA/Analyst reads the Jira ticket
- Test Engineer writes test cases
- Automation Engineer converts to pytest code
- QA Lead reviews everything
┌────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌─────────────┐
│ Chat Input │────▶│ Agent 1: │────▶│ Agent 2: │────▶│ Agent 3: │────▶│ Agent 4: │────▶│ Chat Output │
│ (Jira │ │ Requirement │ │ Test Case │ │ Automation │ │ QA Lead │ │ (Complete │
│ ticket or │ │ Analyst │ │ Writer │ │ Engineer │ │ Reviewer │ │ package) │
│ story) │ │ │ │ │ │ │ │ │ │ │
└────────────┘ └──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘ └─────────────┘
"+ New Flow" → Name: "QA Multi-Agent Pipeline"
#### Step 2: Build Agent 1 — Requirement Analyst

Components needed:
- Prompts → Prompt (named "Analyst Prompt")
- Agents → Tool Calling Agent (named "Requirement Analyst")
- Models → Groq (shared or separate instance)
Analyst Prompt Template:
You are a Senior Requirements Analyst with deep QA expertise. Your job is to extract EVERY testable requirement from a user story or feature spec.
INPUT (User Story / Feature Spec):
{input_requirement}
YOUR OUTPUT MUST INCLUDE:
## 📋 Requirements Extraction
### Functional Requirements
[Number each one. Be specific and atomic — one requirement per line]
- FR-01: [requirement]
- FR-02: [requirement]
### Non-Functional Requirements
- NFR-01: [Performance requirement if applicable]
- NFR-02: [Security requirement if applicable]
- NFR-03: [Accessibility requirement if applicable]
### Business Rules
- BR-01: [Business logic that must be enforced]
### Assumptions
- [Things that are unclear or assumed]
### Edge Cases Identified
- [Scenarios not explicitly mentioned but important]
### Test Data Requirements
- [What data is needed to test these requirements]
### Acceptance Criteria (Given-When-Then)
For each functional requirement, write acceptance criteria:
**FR-01:**
- Given [precondition]
- When [action]
- Then [expected result]
Be EXHAUSTIVE. Miss nothing. If something is ambiguous, flag it AND provide a reasonable interpretation.
#### Step 3: Build Agent 2 — Test Case Writer

Test Writer Prompt Template:
You are a Senior Test Engineer. You receive analyzed requirements from the Requirements Analyst and write comprehensive test cases.
ANALYZED REQUIREMENTS:
{analyst_output}
WRITE TEST CASES IN THIS FORMAT:
## 🧪 Test Suite: [Feature Name]
### TC-001: [Title]
- **Type:** [Functional / Negative / Edge / Boundary / Performance / Security]
- **Priority:** [P0 / P1 / P2 / P3]
- **Linked Requirement:** [FR-XX / NFR-XX]
- **Preconditions:**
- [Precondition 1]
- **Test Data:**
- [Specific values]
- **Steps:**
1. [Step 1]
2. [Step 2]
3. [Step 3]
- **Expected Result:** [Specific, measurable outcome]
- **Automation Feasible:** [Yes / No — reason]
COVERAGE RULES:
- Every FR must have at least 2 test cases (happy + negative)
- Every NFR must have at least 1 test case
- Include boundary values where numbers are involved
- Include at least 2 security test cases
- Mark which test cases are automation candidates
Generate at least 20 test cases.
#### Step 4: Build Agent 3 — Automation Engineer

Automation Engineer Prompt Template:
You are a Test Automation Engineer specializing in Python pytest with Playwright.
WRITTEN TEST CASES:
{test_writer_output}
Convert the test cases marked as "Automation Feasible: Yes" into executable pytest code.
CODING STANDARDS:
- Use pytest with Playwright (pytest-playwright)
- Use Page Object Model pattern
- Use pytest fixtures for setup/teardown
- Use @pytest.mark.parametrize for data-driven tests
- Add pytest markers: @pytest.mark.smoke, @pytest.mark.regression, @pytest.mark.security
- Use descriptive test names: test_<feature>_<scenario>_<expected>
- Add docstrings referencing the TC-ID
- Use assertions with custom messages
- Handle cleanup in fixtures with yield
OUTPUT FORMAT:
```python
# File: tests/test_<feature>.py
import pytest
from playwright.sync_api import Page, expect
# --- Page Objects ---
class [Feature]Page:
def __init__(self, page: Page):
self.page = page
# locators
# methods
# --- Fixtures ---
@pytest.fixture
def feature_page(page: Page):
# setup
yield [Feature]Page(page)
# teardown
# --- Tests ---
class TestFeatureName:
@pytest.mark.smoke
def test_happy_path(self, feature_page):
"""TC-001: [Title]"""
# steps
# assertion
@pytest.mark.regression
@pytest.mark.parametrize("input,expected", [
("valid", "success"),
("", "error"),
("x" * 256, "error"),
])
def test_input_validation(self, feature_page, input, expected):
"""TC-005: Input boundary testing"""
# steps
# assertion
```

Also generate:
- conftest.py with shared fixtures
- pytest.ini with marker definitions
- requirements.txt
#### Step 5: Build Agent 4 — QA Lead Reviewer
**QA Lead Prompt Template:**
You are a QA Lead performing a final review of the entire test package produced by your team.
ORIGINAL REQUIREMENT: {original_input}
REQUIREMENTS ANALYSIS: {analyst_output}
TEST CASES: {test_writer_output}
AUTOMATED TEST CODE: {automation_output}
PERFORM A THOROUGH REVIEW:

### 1. Coverage Matrix
| Requirement | Test Cases | Automated? | Coverage |
|---|---|---|---|
| FR-01 | TC-001, TC-002 | Yes | ✅ Full |
| FR-02 | TC-003 | No | ⚠️ Partial |

### 2. Test Case Quality
- Completeness: [Are all scenarios covered?]
- Clarity: [Are steps clear and reproducible?]
- Data Coverage: [Are boundary values and edge cases included?]
- Independence: [Can tests run in any order?]

### 3. Code Quality
- POM Pattern: [Properly implemented? Y/N]
- Fixture Usage: [Proper setup/teardown? Y/N]
- Assertions: [Meaningful messages? Y/N]
- Naming: [Follows convention? Y/N]
- DRY: [No code duplication? Y/N]

### 4. Gaps Identified
- [List specific scenarios NOT covered]
- [Any security scenarios missed?]
- [Any performance scenarios missed?]

### 5. Recommendations
- [Improvement 1]
- [Improvement 2]
- [Improvement 3]

### 6. Final Verdict
Status: [✅ APPROVED / ⚠️ APPROVED WITH COMMENTS / ❌ NEEDS REWORK]
#### Step 6: Chain the Agents
This is where LangFlow shines. You chain agents by connecting the output of one to the input of the next:
Chat Input [Message] ──→ Analyst Prompt [input_requirement]
Analyst Prompt ──→ Agent 1 (Analyst) ──→ [output]
Agent 1 Output ──→ Writer Prompt [analyst_output]
Writer Prompt ──→ Agent 2 (Writer) ──→ [output]
Agent 2 Output ──→ Automation Prompt [test_writer_output]
Automation Prompt ──→ Agent 3 (Coder) ──→ [output]
Agent 3 Output ──→ Review Prompt [automation_output]
(+ pass through analyst & writer outputs to the review prompt)
Review Prompt ──→ Agent 4 (Reviewer) ──→ Chat Output
**LangFlow Chaining Method:**
In LangFlow, you have three approaches for chaining:
**Approach A — Sequential Text Passing:**
1. Agent 1 output → Text node (stores result) → Feed into Agent 2's prompt
2. Agent 2 output → Text node → Feed into Agent 3's prompt
3. Continue the chain...
**Approach B — Using Memory:**
1. Add a **Chat Memory** component
2. Connect all agents to the same memory
3. Each agent "reads" what the previous agents produced from memory
**Approach C — Using Prompt Templating:**
1. Each subsequent Prompt has a variable for the previous agent's output
2. The variable is populated by the previous agent's response
3. This creates a clean data pipeline
#### Step 7: Test the Full Pipeline
**Playground Input:**
JIRA TICKET: SHOP-1234
Title: Implement Wishlist Feature
As a logged-in user, I want to save products to a wishlist so that I can purchase them later.
Acceptance Criteria:
- User can add products from listing or detail pages
- Wishlist has max 100 items
- User can remove items from wishlist
- Wishlist is private (not visible to other users)
- "Add to Wishlist" button shows filled heart when item is already saved
- If item goes out of stock, show visual indicator on wishlist
- User can move items from wishlist to cart
- Wishlist persists across sessions
- Show "Wishlist is empty" state with product recommendations
#### What You Should See
Four sequential outputs showing the full pipeline:
1. **Analyst:** 10+ functional requirements, 3+ NFRs, edge cases, acceptance criteria
2. **Writer:** 20+ test cases with full detail, coverage matrix
3. **Coder:** Complete pytest + Playwright code with POM, fixtures, parametrize
4. **Reviewer:** Coverage matrix, quality checks, gaps identified, final verdict
### Production Enhancement: Adding Jira Integration
To make this flow pull directly from Jira:
1. Add **Tools** → **API Request** to Agent 1
2. Configure with your Jira API:
   URL: https://your-instance.atlassian.net/rest/api/3/issue/{ticket_id}
   Headers:
     Authorization: Basic {base64_encoded_credentials}
3. Now you can say: *"Analyze SHOP-1234"* and the agent fetches the ticket directly
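Under the hood, that tool call is equivalent to something like this (instance URL and credentials are placeholders; the issue route is Jira Cloud's documented REST v3 API):

```python
# Fetch a Jira ticket the same way the agent's API Request tool would
import base64
import requests

def fetch_jira_ticket(ticket_id: str, email: str, api_token: str) -> dict:
    auth = base64.b64encode(f"{email}:{api_token}".encode()).decode()
    resp = requests.get(
        f"https://your-instance.atlassian.net/rest/api/3/issue/{ticket_id}",
        headers={"Authorization": f"Basic {auth}", "Accept": "application/json"},
        timeout=15,
    )
    resp.raise_for_status()
    fields = resp.json()["fields"]
    # 'description' comes back as Atlassian Document Format (a nested dict)
    return {"summary": fields["summary"], "description": fields.get("description")}
```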
---
## 11. Connecting LangFlow to Your QA Stack
### Integration Matrix
| Tool | Connection Method | Use Case |
|---|---|---|
| **Jira** | API Request tool | Read tickets, create bugs, update status |
| **Slack** | Webhook / API Request | Send notifications, receive commands |
| **GitHub** | API Request tool | Read PRs, comment on PRs, check CI status |
| **Jenkins** | API Request tool | Trigger builds, fetch test reports |
| **TestRail** | API Request tool | Push test cases, read results |
| **Confluence** | API Request tool / URL Fetcher | Read specs, update test docs |
| **Postman** | Export collection → File input | Import API specs for testing |
| **Selenium Grid** | Custom Component | Trigger browser tests |
| **Allure** | File Loader (JSON reports) | Analyze test trends |
| **Grafana** | API Request tool | Fetch performance metrics |
### Example: Jira Connection via API Request Tool
Configure the API Request tool with these settings:
Method: GET
URL: https://your-company.atlassian.net/rest/api/3/issue/{issue_key}
Headers:
  Authorization: Basic <base64(email:api_token)>
  Accept: application/json
  Content-Type: application/json
The agent can then be prompted:
"Fetch Jira ticket PROJ-456 and generate test cases for it"
### Example: Slack Notification Output
Add a second output branch using API Request:
Method: POST
URL: https://hooks.slack.com/services/YOUR/WEBHOOK/URL
Headers:
  Content-Type: application/json
Body:
{
  "channel": "#qa-notifications",
  "text": "🧪 Test cases generated for PROJ-456\n\nTotal: 18 cases\nP0: 3 | P1: 7 | P2: 5 | P3: 3\n\nReview: [link]"
}
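The same notification from Python, if you'd rather trigger it from a Custom Component or a CI script (webhook URL is a placeholder; note that newer Slack webhooks are bound to one channel and ignore the "channel" field):

```python
# Post a QA summary to Slack via an incoming webhook
import requests

requests.post(
    "https://hooks.slack.com/services/YOUR/WEBHOOK/URL",
    json={"text": "🧪 Test cases generated for PROJ-456\nTotal: 18 | P0: 3 | P1: 7 | P2: 5 | P3: 3"},
    timeout=10,
).raise_for_status()
```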
---
## 12. LangFlow API — Triggering Flows from CI/CD
### Every Flow Gets a Free API
When you build a flow in LangFlow, it automatically gets an API endpoint. This is the bridge between your visual flows and your CI/CD pipeline.
### Finding Your Flow's API
1. Open your flow
2. Click the **"API"** button (usually top-right, looks like `< / >`)
3. You'll see:
- **cURL command** — ready to copy
- **Python code** — SDK example
- **JavaScript code** — for Node.js
### API Call Structure
```bash
# cURL
curl -X POST "http://localhost:7860/api/v1/run/{flow_id}" \
-H "Content-Type: application/json" \
-d '{
"input_value": "Your bug report or user story here",
"output_type": "chat",
"input_type": "chat"
}'
# Python
import requests
LANGFLOW_URL = "http://localhost:7860"
FLOW_ID = "your-flow-id-here"
def run_qa_flow(input_text: str) -> str:
response = requests.post(
f"{LANGFLOW_URL}/api/v1/run/{FLOW_ID}",
json={
"input_value": input_text,
"output_type": "chat",
"input_type": "chat"
}
)
result = response.json()
return result["outputs"][0]["outputs"][0]["results"]["message"]["text"]
# Usage
bug_report = "Payment page crashes on iOS Safari..."
classification = run_qa_flow(bug_report)
print(classification)

Example — GitHub Actions workflow that calls the flow on every PR:

# .github/workflows/qa-agent-review.yml
name: AI QA Review
on:
pull_request:
types: [opened, synchronize]
jobs:
ai-test-generation:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Get Changed Files
id: changed
run: |
FILES=$(gh pr diff ${{ github.event.pull_request.number }} --name-only | head -20 | tr '\n' ' ')
echo "files=$FILES" >> $GITHUB_OUTPUT
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Run LangFlow QA Agent
id: qa_agent
run: |
RESPONSE=$(curl -s -X POST "${{ secrets.LANGFLOW_URL }}/api/v1/run/${{ secrets.FLOW_ID }}" \
-H "Content-Type: application/json" \
-d "{
\"input_value\": \"Review these changed files and suggest test cases: ${{ steps.changed.outputs.files }}\",
\"output_type\": \"chat\",
\"input_type\": \"chat\"
}")
# Extract the message text
MESSAGE=$(echo $RESPONSE | python3 -c "import sys,json; print(json.load(sys.stdin)['outputs'][0]['outputs'][0]['results']['message']['text'])")
# Save for next step
echo "$MESSAGE" > qa_report.md
- name: Comment on PR
uses: actions/github-script@v7
with:
script: |
const fs = require('fs');
const report = fs.readFileSync('qa_report.md', 'utf8');
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: `## 🤖 AI QA Agent Review\n\n${report}`
});

Example — Jenkins pipeline stage calling the same flow:

pipeline {
agent any
stages {
stage('AI Test Analysis') {
steps {
script {
def response = httpRequest(
url: "${LANGFLOW_URL}/api/v1/run/${FLOW_ID}",
httpMode: 'POST',
contentType: 'APPLICATION_JSON',
requestBody: """
{
"input_value": "Analyze test results from build ${BUILD_NUMBER}: ${TEST_REPORT_SUMMARY}",
"output_type": "chat",
"input_type": "chat"
}
"""
)
def analysis = readJSON text: response.content
echo "AI Analysis: ${analysis.outputs[0].outputs[0].results.message.text}"
}
}
}
}
}

Example — pytest hook that sends every failure to your LangFlow bug classifier:

# conftest.py — Auto-analyze failures with LangFlow
import pytest
import requests
import json
LANGFLOW_URL = "http://localhost:7860"
FLOW_ID = "your-bug-classifier-flow-id"
def pytest_runtest_makereport(item, call):
"""Hook that runs after each test — sends failures to LangFlow for analysis."""
if call.when == "call" and call.excinfo is not None:
error_info = {
"test_name": item.name,
"test_file": str(item.fspath),
"error_type": call.excinfo.typename,
"error_message": str(call.excinfo.value),
"traceback": str(call.excinfo.getrepr()),
}
try:
response = requests.post(
f"{LANGFLOW_URL}/api/v1/run/{FLOW_ID}",
json={
"input_value": f"Analyze this test failure:\n{json.dumps(error_info, indent=2)}",
"output_type": "chat",
"input_type": "chat"
},
timeout=30
)
if response.ok:
result = response.json()
analysis = result["outputs"][0]["outputs"][0]["results"]["message"]["text"]
# Attach analysis to the test report
item.user_properties.append(("ai_analysis", analysis))
except Exception as e:
pass  # Don't let AI analysis failure break the test run

## 13. Best Practices & Gotchas

| Practice | Why |
|---|---|
| Start with simple flows, add complexity | Debug one component at a time before chaining |
| Use the Playground religiously | Test every change immediately — the feedback loop is instant |
| Set Temperature low (0.1-0.3) for QA tasks | You want consistent, deterministic output for test cases |
| Save prompt templates as separate components | Reusable across flows — like shared fixtures |
| Export flows as JSON backups | Version control your flows — they're just JSON files |
| Use Global Variables for API keys | Never hardcode keys in prompts or tool configs |
| Name your components clearly | "Requirement Analyst Agent" not "Agent 1" |
| Add description to every component | Future-you will thank present-you |
| Test with edge cases | Feed your flow ambiguous, incomplete, and adversarial inputs |
| Monitor token usage | Groq is free but has rate limits — track your consumption |
| Anti-Pattern | Better Alternative |
|---|---|
| Don't chain 10+ agents in one flow | Break into sub-flows, use API calls between them |
| Don't put entire documents in prompts | Use RAG (Flow 4) to retrieve relevant sections |
| Don't expect 100% accuracy from agent output | Always add human review for critical decisions |
| Don't use high temperature for structured output | Keep temperature ≤ 0.3 for consistent formatting |
| Don't skip the Playground | Never deploy a flow you haven't tested interactively |
| Don't use LangFlow for real-time (<100ms) needs | It's not designed for ultra-low-latency responses |
| Don't share flows with embedded API keys | Use Global Variables, export WITHOUT secrets |
| Problem | Cause | Fix |
|---|---|---|
| Agent returns "I don't have access to tools" | Tools not connected to Agent's Tools input | Check edge connections — Tools → Agent |
| "API key not found" error | Key not in Global Variables or typo | Settings → Global Variables → verify name matches exactly |
| Agent gives generic answers | Prompt is too vague | Be specific: include format, examples, constraints |
| Flow runs but output is empty | Output not connected or wrong output type | Verify Chat Output is connected and type matches |
| RAG returns irrelevant results | Bad chunking or wrong embedding model | Adjust chunk size, overlap; try different embeddings |
| Agent loops forever | No termination condition or max iterations | Set max iterations on Agent node (default: 10) |
| Slow response times | Model too large or too many chained agents | Use smaller model for simple tasks, parallelize where possible |
| JSON parsing errors in output | LLM doesn't follow exact format | Add "Respond ONLY with JSON, no preamble" to prompt |
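For that last gotcha, it also helps to parse defensively on the consumer side — a small sketch (the regex fallback is a heuristic for prose-wrapped or fenced JSON, not a guarantee):

```python
# Tolerant JSON extraction from an LLM reply
import json
import re

def extract_json(llm_text: str) -> dict:
    try:
        return json.loads(llm_text)  # ideal case: the reply is pure JSON
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", llm_text, re.DOTALL)  # first {...} span
        if match:
            return json.loads(match.group(0))
        raise ValueError("No JSON object found in LLM output")
```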
## 14. Cheat Sheet

| What You Need | Component to Use | Category |
|---|---|---|
| User types something | Chat Input | Inputs |
| Load a file | File | Data |
| System prompt for agent behavior | Prompt | Prompts |
| LLM brain (free) | Groq (llama-3.3-70b-versatile) | Models |
| LLM brain (paid, powerful) | OpenAI (gpt-4o) | Models |
| LLM brain (local, private) | Ollama | Models |
| Agent that can use tools | Tool Calling Agent | Agents |
| Call an API | API Request | Tools |
| Run Python code | Python REPL | Tools |
| Search the web | Search API | Tools |
| Store documents for RAG | Chroma DB / FAISS | Vector Stores |
| Convert text to vectors | OpenAI / Ollama Embeddings | Embeddings |
| Break docs into chunks | Recursive Character Text Splitter | Processing |
| Search stored documents | Vector Store Retriever | Retrievers |
| Show response to user | Chat Output | Outputs |
| Remember conversation | Chat Memory | Memories |
| Route conditionally | Conditional Router | Utilities |
| Custom Python logic | Custom Component | Custom |
| # | Flow | Components Used | Difficulty | Time to Build |
|---|---|---|---|---|
| 1 | Bug Classifier | Input → Prompt → LLM → Output | ⭐ Beginner | 10 min |
| 2 | Test Case Generator | Input → Prompt → Agent + Tools → Output | ⭐⭐ Beginner+ | 20 min |
| 3 | API Test Validator | Input → Agent + API Tool + Python REPL → Output | ⭐⭐⭐ Intermediate | 30 min |
| 4 | Flaky Test Analyzer (RAG) | File → Splitter → Embeddings → VectorDB → Retriever → LLM → Output | ⭐⭐⭐⭐ Intermediate+ | 45 min |
| 5 | Multi-Agent Pipeline | Input → Agent 1 → Agent 2 → Agent 3 → Agent 4 → Output | ⭐⭐⭐⭐⭐ Advanced | 60 min |
| Shortcut | Action |
|---|---|
| Ctrl/Cmd + S | Save flow |
| Ctrl/Cmd + Z | Undo |
| Ctrl/Cmd + D | Duplicate component |
| Delete / Backspace | Delete selected |
| Space + drag | Pan canvas |
| Scroll | Zoom in/out |
| Ctrl/Cmd + E | Export flow |
| LangFlow World | QA World |
|---|---|
| Flow | Test Suite |
| Component | Test Step |
| Edge/Connection | Data pipeline between steps |
| Prompt | Test script template |
| Agent | Automated QA engineer |
| Tool | Helper utility (Postman, DB client, etc.) |
| Playground | Test runner / debug console |
| API Endpoint | CI/CD webhook trigger |
| Global Variable | Environment variable / secret |
| Vector Store | Test knowledge base / historical data |
| RAG | Intelligent search through past failures |
| Custom Component | Custom pytest fixture or helper |
| Export JSON | Version-controlled test artifacts |
| Memory | Test session context |
- Build Flow 1 first — takes 10 minutes, proves the concept
- Graduate to Flow 2 — add tools and see agent reasoning
- Try Flow 3 with a real API your team owns
- Set up Flow 4 with your actual CI test history — this is where the magic happens
- Build Flow 5 when you're ready for multi-agent orchestration
- Connect to CI/CD (Section 12) to make it production-ready
- Create Custom Components for your team's specific tools
- Share flows with your team — export as JSON, import on their machines
Built for The Testing Academy — AI Tester Batch 1X
Author: The Testing Academy | Contact: thetestingacademy@gmail.com