Jeff Mealo jmealo

@jmealo
jmealo / 2026-03-16-rca-missing-intel-requests-created.md
Last active March 16, 2026 23:16
RCA: Missing intel_requests.created — Corrected Analysis (ENG-3349)

RCA: Missing intel_requests.created (ENG-3349) — Corrected

Status: PROVISIONAL (ASU-specific loss confirmed, root cause unknown)
Supersedes: Previous RCA
Incident date: 2026-03-10
Analysis date: 2026-03-16
Reverted: intel-requests-api !48 (batch_id change from !47)
jmealo / 2026-03-14-rca-intel-api-amqp-cascade.md
Last active March 14, 2026 23:08
RCA: Intel-API AMQP Cascade Failure — CPU Limit Regression (2026-03-14)

RCA: Intel-API AMQP Cascade Failure — CPU Limit Regression

Date: 2026-03-14
Environment: Production
Severity: Critical (P1)
Duration: ~23 hours (2026-03-14 00:15 UTC, ongoing)
Status: IN PROGRESS (fix identified, rolling restart underway)

Summary

jmealo / staging-deploy-status.md
Created March 14, 2026 15:16
Staging Deploy Status — 2026-03-14 — Metrics Changes

Staging Deploy Status — 2026-03-14

Summary

Deployed 25 backend services to staging to test metrics changes. Found and fixed 5 bugs introduced by library version mismatches. All services are now running and healthy.

Libraries Published

Library Version Fix
jmealo / search-retry-feeder-crashloop-rca.md
Last active March 11, 2026 16:37
RCA: search-retry-feeder CrashLoopBackOff (Production, 2026-03-11)

RCA: search-retry-feeder CrashLoopBackOff (Production)

Date: 2026-03-11
Reported by: Diego via Slack/Datadog
Investigated by: Jeff Mealo (with Claude Code)
Severity: Low (self-recovering, no data loss)
Status: Root cause identified, fix committed and pending deploy

Summary

jmealo / notification-sender-42-red-herrings.md
Created March 6, 2026 16:47
42 red herrings across 2 days of notification-sender queue backup investigation (March 5-6, 2026)

42 Red Herrings: Notification-Sender Queue Backup (March 5-6, 2026)

Two incidents, one root cause chain, 42 documented false signals across 2 days of investigation.

Actual root cause: AAA API at 8 replicas (some crash-looping) couldn't serve permissions queries from intel-requests-api fast enough. This caused cascading request queuing through intel-requests-api and incidents-api, starving notification-sender of API capacity. SQL queries were sub-millisecond throughout.


March 5: 29 Red Herrings (5-hour investigation)

jmealo / notification-sender-explain-queries-2026-03-06.md
Last active March 6, 2026 08:53
EXPLAIN ANALYZE queries for notification-sender bottleneck — real prod IDs from 2026-03-06

EXPLAIN ANALYZE Queries for Notification-Sender Hot Path

Source: Production logs, 2026-03-06 08:44 UTC
Charter org_id: 9de5d801-235a-4451-8c89-d2c3974c71e8

All queries use real IDs extracted from prod notification-sender logs.

WARNING: Run these inside a BEGIN; ... ROLLBACK; transaction on a read replica if possible. EXPLAIN ANALYZE actually executes the query.
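The warning above can be sketched as a small helper that wraps any statement in the `BEGIN; ... ROLLBACK;` envelope before it is pasted into a session. This is a minimal illustration, not part of the original gist: the function name `wrap_explain_analyze` and the sample query are hypothetical, and the helper only builds SQL text without connecting to a database.

```python
def wrap_explain_analyze(query: str) -> str:
    """Return SQL text that runs EXPLAIN ANALYZE inside a rolled-back transaction.

    Hypothetical helper for illustration only. It builds a string and does
    not connect to any database.
    """
    # Drop surrounding whitespace and any trailing semicolon so that
    # exactly one terminator is emitted per statement.
    body = query.strip().rstrip(";").strip()
    if not body:
        raise ValueError("query must not be empty")
    # EXPLAIN ANALYZE really executes the statement, so the ROLLBACK
    # discards any changes a data-modifying query would have made.
    return f"BEGIN;\nEXPLAIN ANALYZE {body};\nROLLBACK;"


if __name__ == "__main__":
    # Placeholder query; the real queries in the gist use prod IDs.
    print(wrap_explain_analyze("SELECT 1;"))
```

The rollback matters most for data-modifying statements. For plain `SELECT` queries it is a cheap safety habit.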

jmealo / notification-sender-bottleneck-analysis-2026-03-06.md
Last active March 6, 2026 15:31
notification-sender bottleneck deep dive — recursive CTE, API call chain, optimization targets (2026-03-06)
jmealo / notification-sender-triage-2026-03-06.md
Created March 6, 2026 08:36
notification-sender prod triage 2026-03-06 — SLA breach analysis and optimization targets

Notification-Sender Production Triage — 2026-03-06

Date: March 6, 2026 (~04:00–08:30 UTC)
Environment: Production
Service: notification-sender (v0.5.45)
Severity: SLA breach (notifications delayed beyond 15-minute target)


Executive Summary

jmealo / Cargo.toml
Created February 20, 2026 23:49
Gisual H3 Utility Lookup Optimization Results & Implementation
[package]
name = "wasm-h3-map"
version = "0.1.0"
edition = "2021"
[lib]
crate-type = ["cdylib"]
[dependencies]
wasm-bindgen = "0.2"
jmealo / schema-evolution-handling.md
Created February 5, 2026 01:35
pgsyncd: Schema Evolution Handling - Design Proposal

Schema Evolution Handling for pgsyncd

Author: Engineering Team
Date: 2026-02-04
Status: Proposal
Type: Feature Enhancement


Executive Summary