@dims
dims / kube-openapi-pr590-risk-analysis.md
Last active April 27, 2026 13:13
kube-openapi PR #590 risk analysis: go-openapi/swag v0.23.0→v0.25.4 behavioral deep-dive

kube-openapi PR #590 — Deep-Dive Risk Analysis

Upgrading go-openapi/swag v0.23.0 → v0.25.4

Prepared: 2026-04-27
PR: kubernetes/kube-openapi#590
Reviewer question: "go-openapi has some reputation of changing semantics without notification by accident. As we use it in our CRD validation there is risk that we break our API (we have forked the go-openapi validator nowadays, so risk is lower than in the past, but worth a check anyway)."


Executive Summary

@dims
dims / k8s-unwanted-deps-2026-04.md
Last active April 26, 2026 12:44
Kubernetes unwanted vendor dependencies status — April 2026

Kubernetes Unwanted Dependencies: Status Report

Date: April 2026
Branch: master (commit 5a555755ba2)
Scope: hack/unwanted-dependencies.json — modules listed in spec.unwantedModules that are still present in vendor/


Background

Kubernetes maintains a blocklist of dependencies that should not appear in the vendor tree, defined in hack/unwanted-dependencies.json. The file has two sections:
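The report's scope (modules listed under `spec.unwantedModules` that are still present in `vendor/`) amounts to a set intersection. Below is a minimal Python sketch of that check, not the project's actual tooling. It assumes `spec.unwantedModules` is a JSON object keyed by module path, and that vendored modules can be read from the `# <module> <version>` header lines of `vendor/modules.txt`. The function name and the sample data are hypothetical.

```python
import json


def vendored_unwanted(unwanted_json: str, modules_txt: str) -> list[str]:
    """Return unwanted module paths that still appear in vendor/modules.txt."""
    spec = json.loads(unwanted_json)["spec"]
    # Assumption: unwantedModules maps module path -> reason string.
    unwanted = set(spec["unwantedModules"])

    vendored = set()
    for line in modules_txt.splitlines():
        # Module headers look like "# <module> <version>".
        # Annotation lines such as "## explicit" start with "##" and are skipped.
        if line.startswith("# "):
            parts = line.split()
            if len(parts) >= 2:
                vendored.add(parts[1])

    return sorted(unwanted & vendored)


# Hypothetical sample inputs, for illustration only.
sample_json = json.dumps({
    "spec": {
        "unwantedModules": {
            "example.com/old/lib": "deprecated",
            "example.com/gone/lib": "archived",
        }
    }
})
sample_modules = (
    "# example.com/old/lib v1.0.0\n"
    "## explicit\n"
    "example.com/old/lib\n"
    "# example.com/fine/lib v2.1.0\n"
    "## explicit\n"
    "example.com/fine/lib\n"
)

print(vendored_unwanted(sample_json, sample_modules))
```

In a real checkout the two strings would be read from `hack/unwanted-dependencies.json` and `vendor/modules.txt` respectively.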

@dims
dims / k8s-thermal-masking-full-analysis.md
Last active April 25, 2026 12:13
Kubernetes thermal masking regression analysis and runc shared-tmpfs fix

Kubernetes Thermal Masking Regression: Full Technical Analysis

Issues: k/k#138512, k/k#138388
Root PR: k/k#131018 (merged 2025-07-15, backported 2025-09-03)
Affects: Kubernetes 1.31–1.34, Intel CPUs, high core counts
Date written: 2026-04-24
Updated: 2026-04-25 with runc implementation branch and validation results

Public disclosure note: this analysis is based on public Kubernetes, runc, containerd, and runtime ecosystem issue/PR discussion. The referenced GHSA was still inaccessible when this note was written, so no non-public advisory text is quoted here.

@dims
dims / 2026-04-23-dep-security-analysis-v2.md
Last active April 24, 2026 00:51
Kubernetes dependency security analysis 2026-04-23 (43 packages)

Kubernetes Dependency Security Analysis

Date: 2026-04-23
Packages analyzed: 43
Method: GitHub diff inspection, Go Vulnerability Database, CVE/GHSA search, K8s source grep for reachability


Executive Summary

Of 43 packages with version gaps, 2 require prompt action (live CVE or directly reachable hardening fix), 3 are medium priority (correctness/transitive security value), and the remainder are routine hygiene with no meaningful security delta. Two packages had known CVEs that are already patched in the currently pinned version.

@dims
dims / 2026-04-23-constants-module-impact.md
Created April 23, 2026 21:31
What k8s.io/constants enables — prioritized impact analysis (PR #135896)

What k8s.io/constants enables — prioritized impact analysis

PR: kubernetes/kubernetes#135896
Branch: add-constants-module at /Users/dsrinivas/go/src/k8s.io/kubernetes-pr135896
Cross-checked: all factual claims below verified against the actual branch.


The structural shift in one sentence

@dims
dims / 2026-04-23-k8s-staging-deps-radial.svg
Created April 23, 2026 17:30
k8s.io staging module dependency graph (radial, api at center)
@dims
dims / dra-driver-nvidia-gpu-ci-coverage.md
Created April 21, 2026 17:04
CI Coverage Map — sigs.k8s.io/dra-driver-nvidia-gpu (Lambda/GCP-nvkind/mock-nvml providers, BATS suites, TestGrid tabs, GPU_TYPE= resolution, gap analysis)

CI Coverage Map — sigs.k8s.io/dra-driver-nvidia-gpu

As of 2026-04-21. Sources: .github/workflows/, kubernetes/test-infra (config/jobs/kubernetes-sigs/dra-driver-nvidia-gpu/, config/testgrids/nvidia/nvidia.yaml), testgrid.k8s.io/nvidia-gpu, hack/ci/{gcp-nvkind,lambda,mock-nvml}, tests/bats/, test/e2e/.

TL;DR

  • 3 execution surfaces: GitHub Actions (lint/unit/mock-e2e only), Prow on Lambda Cloud (real GPUs, BATS), Prow on GCP-nvkind (T4 GCE, Ginkgo).
  • 7 Prow jobs on this repo: 3 e2e presubmits + 3 e2e periodics + 1 image-push postsubmit.
  • Only Lambda/arm64 (GH200) gives real arm64 GPU coverage. GCP-nvkind is amd64/T4 only.
  • Nothing is truly a required check. GitHub branch protection on main and release-25.8 lists EasyCLA as the only required status. No rulesets configured. Every CI signal above — GH Actions lint/unit/mock-e2e and all 4 Prow e2e presubmits (optional: true) — posts status but cannot block merge. Merge gating is effectively: EasyCLA + tide/OW
@dims
dims / mock-nvml-bats-test-analysis.md
Last active April 16, 2026 15:09
Mock NVML GB200 Emulation: Deep-dive, BATS Test Analysis, and Test Results

Mock NVML BATS Test Compatibility Analysis

Date: 2026-04-15
Environment: CPU-only Kind cluster, 8x mock GB200 NVL, driver 570.170.01
Branch: worktree-mock-nvml-gb200-ci-v2

Environment Constraints

  • nvidia-smi works, shows 8x NVIDIA GB200 NVL with correct attributes
  • NVML queries work (name, UUID, memory, architecture, compute capability)
@dims
dims / 2026-04-12-lambda-gpu-test-roadmap-v2.md
Created April 12, 2026 20:41
Lambda Cloud GPU Test Coverage: What's Next (v2 roadmap)

Lambda Cloud GPU Test Coverage: What's Next

Date: 2026-04-12
Scope: Forward-looking roadmap for expanding DRA GPU driver test coverage on Lambda Cloud. Covers only what remains to be done, not what's already landed or in flight.

Prerequisite: PRs #1025, #1027, #1028 should be merged first. After they land, Lambda CI runs 25 tests across 6 test files covering basic GPU allocation, CUDA workloads, Dynamic MIG, TimeSlicing, MPS, DRAExtendedResource, Prometheus metrics, CEL selectors, claim lifecycle, and robustness.


1. Zero-Code Wins: Add Existing Tests to Lambda CI

@dims
dims / 2026-04-11-lambda-gpu-test-coverage-roadmap.md
Last active April 12, 2026 19:41
Lambda Cloud GPU Test Coverage Roadmap for dra-driver-nvidia-gpu - comprehensive analysis of testable features, QA plan comparison, and implementation phases

Lambda Cloud GPU Test Coverage Roadmap for dra-driver-nvidia-gpu

Date: 2026-04-11
Scope: Comprehensive analysis of what features of the DRA driver can be tested on Lambda Cloud, what we already cover, what's feasible to add, and what's out of reach.


PR Tracking

| PR | Repo | Status | Description |