@praveen-krishn
Last active March 10, 2026 18:02

Engineering Operating Model

System Lifecycle

Design → Build → Verify → Release → Observe → Improve


Trunk-Based Development

Developer
   │
   │  (Pre-Check)
   ▼
Pull Request
   │
   │  CI + Reviews
   ▼
Merge to MAIN (Trunk)
   │
   │  (Operational Excellence)
   ▼
Continuous Deployment Pipeline
   │
   ├── DEV
   ├── STAGE
   └── PROD
   │
   │  (Post-Check)
   ▼
Monitoring + Reliability

Build → Unit tests → Integration tests → Package artifact → Deploy DEV → Smoke tests → Deploy STAGE → Acceptance tests → Manual approval (optional) → Deploy PROD
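
The stage sequence above can be read as a fail-fast chain: each stage runs only if the previous one passed. The following is a minimal sketch of that idea, not a real CI configuration. The `run_pipeline` function and the lambda stages are hypothetical stand-ins for actual build, test, and deploy jobs.

```python
from typing import Callable, List, Optional, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[List[str], Optional[str]]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that passed and the name of the
    failing stage (None if every stage passed).
    """
    passed: List[str] = []
    for name, step in stages:
        if not step():
            return passed, name
        passed.append(name)
    return passed, None


# Hypothetical stages; each lambda stands in for a real job.
stages: List[Stage] = [
    ("Build", lambda: True),
    ("Unit tests", lambda: True),
    ("Integration tests", lambda: True),
    ("Package artifact", lambda: True),
    ("Deploy DEV", lambda: True),
    ("Smoke tests", lambda: False),  # simulated failure
    ("Deploy STAGE", lambda: True),
    ("Acceptance tests", lambda: True),
    ("Deploy PROD", lambda: True),
]

passed, failed_stage = run_pipeline(stages)
```

Because the simulated smoke tests fail, nothing after Deploy DEV runs, which is the property that keeps a bad build out of STAGE and PROD.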


Mental Model

DX → enabled by Platform
Delivery → enabled by DevOps
Reliability → ensured by SRE
Direction → ensured by Governance

  • Platform Engineering - PRODUCTIVITY - focuses on developer productivity through an internal developer platform.
  • DevOps - DELIVERY - focuses on delivery automation through CI/CD.
  • SRE - RELIABILITY - focuses on production reliability through monitoring, SLOs, and incident management.

Core Pillars

| Pillar | Focus | Key Practices |
|---|---|---|
| Delivery Process | Execution cadence | Standups, sprint planning, reviews, retrospectives |
| Operational Excellence | Reliability and production stability | Pre-checks, post-checks, SRE practices |
| Developer Experience (DX) | Developer productivity | Local development setup automation; Containerized development environments; Fast CI feedback; Preview environments; Developer documentation; Internal libraries |
| Platform Engineering | Provides internal developer platform | CI/CD infrastructure; Service templates; Infrastructure as Code; Internal developer tools; Self-service environments |
| Engineering Governance | Technical standards and direction | Architecture reviews; Design RFC process; Coding standards; Security standards; PR approval policies; Technology roadmap |

Delivery Cadence

| Frequency | Activities |
|---|---|
| Daily | Standups, progress updates, blocker identification |
| Weekly | Engineering sync, product sync, architecture sync, sprint planning |
| Bi-Weekly | Sprint review, sprint retrospective |
| Monthly | Security updates, language/runtime upgrades, tooling updates |
| Quarterly | Tech debt sprints, innovation / tech jams |

Operational Excellence

Pre-Checks (Release Validation) (Before Merge / Deploy)

  • Code reviews
  • Linting and static analysis
  • Automated tests
  • Security scans
  • CI pipeline validation

Release Management

Controls how changes reach production safely.

Release Strategy

  • Feature flags
  • Canary deployments
  • Blue/green deployments
  • Gradual rollouts
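
Feature flags and gradual rollouts are often combined: a flag is enabled for a fixed percentage of users, and that percentage is raised over time. Below is a minimal sketch of one common way to do this, deterministic hash bucketing. The function name `in_rollout` and the flag and user identifiers are hypothetical; real flag services add targeting rules, kill switches, and audit trails.

```python
import hashlib


def in_rollout(flag: str, user_id: str, percent: int) -> bool:
    """Return True if this user falls inside the rollout percentage.

    The user is hashed into a stable bucket from 0 to 99, so the same
    user always gets the same answer for a given flag, and raising
    `percent` only ever adds users (it never removes anyone already in).
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 100
    return bucket < percent


# Hypothetical usage: roll a new checkout flow out to 10% of users.
if in_rollout("new-checkout", "user-42", 10):
    pass  # serve the new code path
else:
    pass  # serve the existing code path
```

Including the flag name in the hash keeps buckets independent across flags, so the same users are not always the first to receive every new feature.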

Post-Checks (Release Validation) (After Deployment)

  • Staging verification
  • Deployment health checks
  • Smoke tests

Rollback Strategy

  • Automated rollback
  • Version rollback
  • Database migration rollback plan
  • Rollback validation
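
The rollback items above can be tied to the post-deployment health checks. The following is a minimal sketch under stated assumptions: `deploy` and `health_check` are hypothetical callables supplied by the caller, and database migration rollback is out of scope. It deploys a version, checks health a few times, reverts to the previous version if no check passes, and then validates the rollback.

```python
from typing import Callable, List


def deploy_with_rollback(
    new_version: str,
    previous_version: str,
    deploy: Callable[[str], None],
    health_check: Callable[[], bool],
    attempts: int = 3,
) -> str:
    """Deploy new_version; revert to previous_version if it never turns healthy.

    Returns the version that is active when the function finishes.
    """
    deploy(new_version)
    # Give the new version up to `attempts` chances to report healthy.
    if any(health_check() for _ in range(attempts)):
        return new_version

    # Automated rollback to the last known-good version.
    deploy(previous_version)

    # Rollback validation: the old version must be healthy again.
    if not health_check():
        raise RuntimeError("rollback validation failed; manual intervention needed")
    return previous_version


# Usage with in-memory fakes (assumed for illustration only).
deployed: List[str] = []


def fake_deploy(version: str) -> None:
    deployed.append(version)


def fake_health_check() -> bool:
    # Pretend that "v2" is a broken release.
    return deployed[-1] != "v2"


active = deploy_with_rollback("v2", "v1", fake_deploy, fake_health_check)
```

A real implementation would usually wait between health checks and emit an alert on rollback; both are omitted here to keep the sketch short.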

Reliability (SRE)

  • SLO / SLI definitions
  • Incident response
  • Blameless postmortems
  • Capacity planning
  • Observability: metrics, logging, tracing
  • Golden signals monitoring

| Signal | Examples | What It Protects | Engineering Role | Ops Role | Leadership View |
|---|---|---|---|---|---|
| Latency - How long a request takes. | Percentiles matter (P95, P99) | User experience | Optimize code | Monitor | SLA risk |
| Traffic - You can't interpret latency or errors without traffic context. | 1. Requests per second, 2. Concurrent users, 3. Transactions per minute | Demand insight | Capacity design | Scaling | Growth indicator |
| Errors - This is user pain. | 5xx, Timeouts, Failed DB calls, Business logic failures | Reliability | Bug quality | Alerts | Brand risk |
| Saturation - This predicts failure before it happens. | CPU %, Memory %, DB connection pool usage, System limits | | Architecture | Infra | Capacity planning |
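
To make the signals concrete, here is a minimal sketch that computes three of them from raw samples. It uses the nearest-rank percentile method on in-memory lists. That is an assumption for illustration: production monitoring systems typically compute percentiles from histograms or streaming estimates. The function names are hypothetical.

```python
import math
from typing import List


def percentile(values: List[float], p: float) -> float:
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def error_rate(status_codes: List[int]) -> float:
    """Fraction of requests that returned a 5xx status."""
    if not status_codes:
        raise ValueError("status_codes must not be empty")
    return sum(1 for code in status_codes if code >= 500) / len(status_codes)


def saturation(used: float, limit: float) -> float:
    """Fraction of a bounded resource in use, e.g. DB connections in a pool."""
    return used / limit


# Illustrative samples, not real measurements.
latencies_ms = [float(n) for n in range(1, 101)]
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
errors = error_rate([200, 200, 500, 503])
pool_usage = saturation(used=45, limit=50)
```

Traffic is the denominator for the other signals: an error rate or P99 computed over a handful of requests says much less than the same number over thousands.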

Metrics

  • DORA metrics
  • Jira delivery metrics
  • Incident metrics
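
The four DORA metrics (deployment frequency, lead time for changes, change failure rate, and time to restore service) can be derived from deployment records. Below is a minimal sketch assuming a hypothetical `Deployment` record with commit, deploy, and restore timestamps. Real data would come from the CI/CD and incident tooling, and exact definitions vary between teams.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Dict, List, Optional


@dataclass
class Deployment:
    committed_at: datetime                   # when the change was committed
    deployed_at: datetime                    # when it reached production
    failed: bool = False                     # did it cause a production failure?
    restored_at: Optional[datetime] = None   # when service was restored


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def dora_summary(deploys: List[Deployment], window_days: int) -> Dict[str, Optional[float]]:
    """Summarize the four DORA metrics over a reporting window."""
    if not deploys or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    failures = [d for d in deploys if d.failed]
    restores = [hours(d.restored_at - d.deployed_at) for d in failures if d.restored_at]
    return {
        "deployments_per_day": len(deploys) / window_days,
        "median_lead_time_hours": median(hours(d.deployed_at - d.committed_at) for d in deploys),
        "change_failure_rate": len(failures) / len(deploys),
        "median_time_to_restore_hours": median(restores) if restores else None,
    }


# Illustrative records, not real data.
start = datetime(2026, 1, 5, 9, 0)
deploys = [
    Deployment(start, start + timedelta(hours=1)),
    Deployment(start, start + timedelta(hours=2)),
    Deployment(start, start + timedelta(hours=3), failed=True,
               restored_at=start + timedelta(hours=3, minutes=30)),
    Deployment(start, start + timedelta(hours=4)),
]
summary = dora_summary(deploys, window_days=2)
```

Medians are used instead of means so that a single slow change or long outage does not dominate the summary.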

Engineering Process

| Engineering Stage | Purpose | Key Activities | Interview Signals |
|---|---|---|---|
| Idea / Discovery | Understand problem and business need | Requirement clarification; Stakeholder discussion; Feasibility analysis | Strong collaboration between Product, Engineering, Design |
| Architecture & Design | Validate technical approach before coding | Architecture design; API design; Design review; Scalability and dependency analysis | Design-first culture, avoiding premature coding |
| Sprint Planning | Convert requirements into deliverable work | Backlog grooming; Task breakdown; Effort estimation; Define acceptance criteria | Well-groomed backlog, predictable planning |
| Sprint Execution / Development | Build features incrementally | Implementation; Daily standups; Cross-team collaboration; Mid-sprint design alignment | Agile discipline and progress visibility |
| Code Review Process | Maintain code quality and standards | Pull request creation; Peer code review; Automated checks (lint/tests); Approval before merge | Strong governance and accountability |
| Testing & Quality Assurance | Ensure reliability before release | Unit tests; Integration tests; Regression testing; QE validation | Testing pyramid and automation maturity |
| CI/CD Pipeline | Automate build and deployment | Code build; Automated tests in CI; Artifact creation; Staging deployment | Continuous integration and release discipline |
| Release Management | Safely push changes to production | Release validation; Feature flags; Controlled rollout; Rollback capability | Mature deployment practices |
| Monitoring & Observability | Detect issues in production | Application logs; Metrics collection; Alerting; Dashboards | Operational maturity |
| Incident Management | Respond and recover from failures | On-call response; Incident mitigation; Root cause analysis; Postmortem documentation | Reliability ownership culture |
| Retrospective & Improvement | Improve team efficiency | Sprint retrospective; Process improvement actions; Technical debt tracking; Engineering metrics review | Continuous improvement mindset |