Last active November 26, 2025 22:00
Revisions
szymdzum revised this gist
Nov 17, 2025. 1 changed file with 0 additions and 1 deletion.

@@ -12,7 +12,6 @@ Use this skill when investigating GitLab CI/CD pipeline issues.
- User reports pipeline failures (e.g., "Pipeline #2961721 failed")
- Questions about job failures or CI/CD errors
- Investigating UI test failures
- Analyzing job logs or error messages

## Quick Start: Finding Pipelines

szymdzum renamed this gist
Nov 17, 2025. 1 changed file with 0 additions and 0 deletions.

File renamed without changes.

szymdzum revised this gist
Oct 26, 2025. 1 changed file with 2 additions and 2 deletions.

@@ -180,8 +180,8 @@ Load these files as needed for detailed information:
## Project Context

- **Project ID**: 2558
- **GitLab Instance**:
- **Repository**:
- **Typical Pipeline**: 80+ jobs across 12 stages
- **Common Child Pipelines**: UI Tests, Deploy

szymdzum revised this gist
Oct 26, 2025. 1 changed file with 0 additions and 1 deletion.

@@ -1 +0,0 @@

szymdzum created this gist
Oct 26, 2025.

@@ -0,0 +1,194 @@
---
name: Pipeline Investigation
description: Debug GitLab CI/CD pipeline failures using glab CLI. Investigate failed jobs, analyze error logs, trace child pipelines, and compare Node version differences. Use for pipeline failures, job errors, build issues, or when the user mentions GitLab pipelines, CI/CD problems, specific pipeline IDs, failed builds, or job logs.
---

# Investigating GitLab Pipelines

Use this skill when investigating GitLab CI/CD pipeline issues.

## When to Use

- User reports pipeline failures (e.g., "Pipeline #2961721 failed")
- Questions about job failures or CI/CD errors
- Investigating UI test failures
- Checking Node 16 vs Node 20 pipeline differences
- Analyzing job logs or error messages

## Quick Start: Finding Pipelines

### If user asks about "latest failed pipeline" for a branch:

```bash
# Get current branch
BRANCH=$(git branch --show-current)

# Find latest failed pipeline for this branch
glab api "projects/2558/pipelines?ref=$BRANCH&status=failed&per_page=3" | jq '.[] | {id, status, created_at}'

# Or for specific branch
glab api "projects/2558/pipelines?ref=feat/node20-migration&status=failed&per_page=3" | jq '.[0]'

# For merge request pipelines, use the MR ref format:
glab api "projects/2558/pipelines?ref=refs/merge-requests/<MR_ID>/head&per_page=3" | jq '.[] | {id, status, created_at}'

# Find latest pipelines (any status) for current branch
glab api "projects/2558/pipelines?ref=$BRANCH&per_page=5" | jq '.[] | {id, status, created_at}'
```

### If user provides pipeline ID directly:

Start with step 1 below.

## Core Workflow

### 1. Get Pipeline Overview

```bash
# Quick check if pipeline exists and get basic status
glab api "projects/2558/pipelines/<PIPELINE_ID>" | jq -r '.status // "Pipeline not found"'

# Get full pipeline status and metadata
glab api "projects/2558/pipelines/<PIPELINE_ID>" | jq '{status, ref, created_at, duration, web_url}'

# Verify pipeline has jobs (old pipelines may be cleaned up)
glab api "projects/2558/pipelines/<PIPELINE_ID>/jobs" --paginate | jq '. | length'
# If returns 0, pipeline data is unavailable - try a more recent one
```

### 2. List Failed Jobs

**ALWAYS use --paginate** when getting jobs (pipelines have 80+ jobs):

```bash
# Get ALL failed jobs
glab api "projects/2558/pipelines/<PIPELINE_ID>/jobs" --paginate | jq -r '.[] | select(.status == "failed") | "\(.name) - Job \(.id)"'
```

### 3. Get Job Logs

```bash
# Get last 100 lines of job log (capture stderr with 2>&1)
glab ci trace <job-id> 2>&1 | tail -100

# Search for errors
glab ci trace <job-id> 2>&1 | grep -E "error|Error|failed|FAIL"
```

### 4. Check for Child Pipelines

Jobs like UI Tests and Deploy trigger child pipelines. **Always check bridges**:

```bash
# Find child pipelines
glab api "projects/2558/pipelines/<PIPELINE_ID>/bridges" | jq '.[] | {name, status, child: .downstream_pipeline.id}'

# If child pipeline exists, get its jobs
glab api "projects/2558/pipelines/<CHILD_PIPELINE_ID>/jobs" --paginate | jq -r '.[] | "\(.name) | \(.status) | Job \(.id)"'
```

## Common Patterns

### Pattern: Child Pipeline Failures

```bash
# Step 1: Find failed child pipeline
CHILD_ID=$(glab api "projects/2558/pipelines/<PIPELINE_ID>/bridges" | jq -r '.[] | select(.status == "failed") | .downstream_pipeline.id')

# Step 2: Get failed jobs from child pipeline
glab api "projects/2558/pipelines/$CHILD_ID/jobs" --paginate | jq -r '.[] | select(.status == "failed") | "\(.name) - Job \(.id)"'

# Step 3: Get one job's log (they're usually identical)
glab ci trace <job-id> 2>&1 | tail -100
```

### Pattern: Multiple Failed Jobs

When many jobs fail (e.g., all Image builds), check ONE representative job first - they often have identical errors.

```bash
# Get first failed job
FIRST_FAILED=$(glab api "projects/2558/pipelines/<PIPELINE_ID>/jobs" --paginate | jq -r '.[] | select(.status == "failed") | .id' | head -1)

# Check its log
glab ci trace "$FIRST_FAILED" 2>&1 | tail -100

# If needed, check if error is identical across all failed jobs
glab api "projects/2558/pipelines/<PIPELINE_ID>/jobs" --paginate | \
  jq -r '.[] | select(.status == "failed") | .id' | head -3 | while read -r job_id; do
    echo "=== Job $job_id ==="
    glab ci trace "$job_id" 2>&1 | grep -E "ERROR|Error:|error:" | head -5
  done
```

## Critical Best Practices

1. **Always use --paginate** for job queries (pipelines have 80+ jobs)
2. **Always capture stderr** with `2>&1` when getting logs
3. **Always check for child pipelines** via bridges API
4. **Limit log output** to avoid overwhelming context (use `tail -100` or `head -50`)
5. **Use project ID 2558** explicitly (never rely on context)

## Common Pitfalls to Avoid

- ❌ Forgetting `--paginate` (only gets first 20 jobs)
- ❌ Not checking child pipelines (missing UI Test/Deploy jobs)
- ❌ Confusing Pipeline IDs (~2M) with Job IDs (~20M+)
- ❌ Missing stderr output (forgetting `2>&1`)
- ❌ Dumping entire logs (use tail/head/grep)
- ❌ Investigating old pipelines with no jobs (check job count first)

## Common Error Patterns

When analyzing logs, look for these signatures:

**Missing Docker Image:**
```
manifest for <image> not found: manifest unknown
```
→ Base runner image not available in ECR (common during Node version transitions)

**BundleMon Credentials:**
```
bad project credentials {"message":"forbidden"}
```
→ BundleMon service access issue (doesn't fail the build, but shows in logs)

**Build Timeout:**
```
ERROR: Job failed: execution took longer than <time>
```
→ Checkout server builds can take 44+ minutes (known issue)

**Test Failures:**
```
FAIL <test-name>
Expected: <value>
Received: <value>
```
→ Unit test assertion failure (check test logs for specifics)

## Reference Files

Load these files as needed for detailed information:

- **`cli-reference.md`** - Complete glab command syntax, API patterns, jq examples, and advanced queries
- **`pipeline-stages.md`** - Stage dependencies, timing, critical paths, and optimization strategies
- **`job-catalog.md`** - Full job descriptions, configurations, durations, and dependencies (all 80+ jobs)

## Project Context

- **Project ID**: 2558
- **GitLab Instance**: gitlab.kfplc.com
- **Repository**: next-gen/kf-ng-web
- **Typical Pipeline**: 80+ jobs across 12 stages
- **Common Child Pipelines**: UI Tests, Deploy

### Common Job Names in This Project:

- **Install And Build**: Install, WebRunner, Wiremock, Reportportal_Setup
- **Static Analysis**: ESlint, Typescript, Format, Stylelint
- **Test**: UnitTests:Main, UnitTests:App, UnitTests:Checkout, UnitTests:Utils, UnitTests:Miscellaneous, Sonar
- **Image**: Create:Image:{banner}:kits:bbm:app, Create:Image:{banner}:kits:checkout:server, Create:Image:{banner}:kits:pim
- **Banners**: bquk, bqie, tpuk, cafr, capl

@@ -0,0 +1 @@
Shows agent how to use glab, agent says ok and does `glab --help` already knows more than you.
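
The signatures listed under "Common Error Patterns" in the skill file can also be checked mechanically. The following is a minimal sketch, not part of the gist: `match_signature` and `classify_log` are hypothetical helper names, the labels are invented for illustration, and the patterns are copied from the four signatures in the skill file. It assumes the log text arrives on stdin without ANSI color codes or timestamp prefixes, which real job logs may contain.

```shell
# Hypothetical helper (an assumption, not from the gist):
# print LABEL when the extended regex PATTERN matches somewhere in TEXT.
# $1 = label, $2 = pattern, $3 = log text
match_signature() {
  if printf '%s\n' "$3" | grep -E -- "$2" >/dev/null; then
    printf '%s\n' "$1"
  fi
}

# Hypothetical helper: read a job log on stdin and print one label per
# known error signature found, or "unknown" when none of them match.
classify_log() {
  log=$(cat)
  result=$(
    match_signature missing-docker-image 'manifest for .* not found: manifest unknown' "$log"
    match_signature bundlemon-credentials 'bad project credentials' "$log"
    match_signature build-timeout 'ERROR: Job failed: execution took longer than' "$log"
    match_signature test-failure '^FAIL ' "$log"
  )
  if [ -n "$result" ]; then
    printf '%s\n' "$result"
  else
    printf 'unknown\n'
  fi
}
```

Under those assumptions, it could be chained onto the log command from step 3, for example `glab ci trace <job-id> 2>&1 | tail -100 | classify_log`. A `bundlemon-credentials` label on its own would not explain a failed job, since the skill file notes that this error does not fail the build.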