Created
May 12, 2026 00:56
Payload Analysis: 4.14.0-0.nightly-2026-05-10-173114 - Multus CNI JoinHostPort bug (CVE-2025-47912)
| <!DOCTYPE html> | |
| <html lang="en"> | |
| <head> | |
| <meta charset="UTF-8"> | |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"> | |
| <title>Payload Analysis: 4.14.0-0.nightly-2026-05-10-173114</title> | |
| <style> | |
| :root { | |
| --red: #dc3545; | |
| --green: #28a745; | |
| --yellow: #ffc107; | |
| --orange: #fd7e14; | |
| --blue: #007bff; | |
| --gray: #6c757d; | |
| --light-gray: #f8f9fa; | |
| --dark: #212529; | |
| --border: #dee2e6; | |
| } | |
| * { box-sizing: border-box; margin: 0; padding: 0; } | |
| body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif; line-height: 1.6; color: var(--dark); max-width: 1200px; margin: 0 auto; padding: 20px; background: #fff; } | |
| h1 { font-size: 1.8rem; margin-bottom: 8px; } | |
| h2 { font-size: 1.4rem; margin: 24px 0 12px; border-bottom: 2px solid var(--border); padding-bottom: 6px; } | |
| h3 { font-size: 1.1rem; margin: 16px 0 8px; } | |
| a { color: var(--blue); text-decoration: none; } | |
| a:hover { text-decoration: underline; } | |
| .badge { display: inline-block; padding: 2px 10px; border-radius: 12px; font-size: 0.85rem; font-weight: 600; color: #fff; } | |
| .badge-red { background: var(--red); } | |
| .badge-green { background: var(--green); } | |
| .badge-yellow { background: var(--yellow); color: var(--dark); } | |
| .badge-orange { background: var(--orange); } | |
| .badge-gray { background: var(--gray); } | |
| .executive-summary { background: #fef3f3; border-left: 4px solid var(--red); padding: 16px 20px; margin: 16px 0; border-radius: 0 8px 8px 0; } | |
| .executive-summary h2 { border: none; margin-top: 0; } | |
| .summary-grid { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 12px; margin: 16px 0; } | |
| .summary-card { background: var(--light-gray); border-radius: 8px; padding: 16px; text-align: center; } | |
| .summary-card .label { font-size: 0.8rem; color: var(--gray); text-transform: uppercase; letter-spacing: 0.5px; } | |
| .summary-card .value { font-size: 1.8rem; font-weight: 700; margin-top: 4px; } | |
| .summary-card .value.red { color: var(--red); } | |
| .summary-card .value.green { color: var(--green); } | |
| .summary-card .value.orange { color: var(--orange); } | |
| table { width: 100%; border-collapse: collapse; margin: 12px 0; } | |
| th, td { padding: 10px 14px; text-align: left; border-bottom: 1px solid var(--border); } | |
| th { background: var(--light-gray); font-weight: 600; font-size: 0.85rem; text-transform: uppercase; letter-spacing: 0.3px; } | |
| tr:hover { background: #f1f3f5; } | |
| .status-pass { color: var(--green); font-weight: 600; } | |
| .status-fail { color: var(--red); font-weight: 600; } | |
| .status-pending { color: var(--yellow); font-weight: 600; } | |
| details { margin: 12px 0; border: 1px solid var(--border); border-radius: 8px; overflow: hidden; } | |
| summary { padding: 12px 16px; background: var(--light-gray); cursor: pointer; font-weight: 600; } | |
| summary:hover { background: #e9ecef; } | |
| .detail-content { padding: 16px; } | |
| .error-box { background: #fff5f5; border: 1px solid #fecaca; border-radius: 6px; padding: 12px 16px; margin: 8px 0; font-family: 'SFMono-Regular', Consolas, 'Liberation Mono', Menlo, monospace; font-size: 0.85rem; white-space: pre-wrap; word-break: break-all; } | |
| .timeline { margin: 12px 0; padding-left: 20px; border-left: 3px solid var(--border); } | |
| .timeline-item { margin: 8px 0; padding-left: 12px; position: relative; } | |
| .timeline-item::before { content: ''; width: 10px; height: 10px; border-radius: 50%; background: var(--gray); position: absolute; left: -27px; top: 6px; } | |
| .timeline-item.error::before { background: var(--red); } | |
| .timeline-item .time { font-family: monospace; color: var(--gray); font-size: 0.85rem; } | |
| .streak-bar { display: flex; gap: 4px; margin: 8px 0; } | |
| .streak-block { width: 28px; height: 28px; border-radius: 4px; display: flex; align-items: center; justify-content: center; font-size: 0.65rem; color: #fff; font-weight: 700; } | |
| .streak-block.fail { background: var(--red); } | |
| .streak-block.pass { background: var(--green); } | |
| .streak-block.force { background: var(--orange); } | |
| .streak-block.pending { background: var(--yellow); color: var(--dark); } | |
| .streak-block.target { outline: 3px solid var(--dark); outline-offset: 1px; } | |
| .no-revert-box { background: #f0fdf4; border: 1px solid #bbf7d0; border-radius: 8px; padding: 16px 20px; margin: 16px 0; } | |
| .alert-list { list-style: none; padding: 0; } | |
| .alert-list li { padding: 6px 0; border-bottom: 1px solid #f0f0f0; } | |
| .alert-list li:last-child { border-bottom: none; } | |
| .alert-name { font-weight: 600; color: var(--red); } | |
| .alert-duration { color: var(--gray); font-size: 0.85rem; } | |
| .meta-info { color: var(--gray); font-size: 0.9rem; margin-bottom: 20px; } | |
| .footer { margin-top: 40px; padding-top: 16px; border-top: 1px solid var(--border); color: var(--gray); font-size: 0.8rem; } | |
| </style> | |
| </head> | |
| <body> | |
| <h1>Payload Analysis: 4.14.0-0.nightly-2026-05-10-173114</h1> | |
| <p class="meta-info"> | |
| <span class="badge badge-red">Rejected</span>  | |
| Architecture: <strong>amd64</strong> · | |
| Stream: <strong>nightly</strong> · | |
| Version: <strong>4.14</strong> · | |
| Generated: 2026-05-12 · | |
| <a href="https://amd64.ocp.releases.ci.openshift.org/releasestream/4.14.0-0.nightly/release/4.14.0-0.nightly-2026-05-10-173114" target="_blank">Release Controller</a> | |
| </p> | |
| <div class="executive-summary"> | |
| <h2>Executive Summary</h2> | |
| <p> | |
| This payload was <strong>rejected</strong> because a single blocking job failed persistently: | |
| <strong>gcp-ovn-rt-upgrade-4.14-minor</strong> (4.13→4.14 GCP OVN RT kernel upgrade). | |
| The job failed all 4 attempts (the initial run plus 3 retries). The other 9 blocking jobs succeeded. | |
| </p> | |
| <p style="margin-top:8px;"> | |
| <strong>Root cause:</strong> A latent bug in the <strong>Multus CNI thin entrypoint</strong> | |
| (<code>cmd/thin_entrypoint/main.go</code>) unconditionally wraps the API server hostname in square brackets | |
| (<code>fmt.Sprintf("%s://[%s]:%s", ...)</code>), which is only valid for IPv6 addresses. | |
| This was exposed when the build toolchain picked up <strong>Go 1.24.8+</strong>, which tightened | |
| <code>url.Parse()</code> to reject brackets around non-IPv6 addresses as a security hardening for | |
| <strong>CVE-2025-47912</strong>. The result: Multus cannot initialize its Kubernetes client, DNS pods | |
| cannot be recycled, the DNS operator degrades, and the upgrade times out. | |
| </p> | |
| <p style="margin-top:8px;"> | |
| <strong>This is a long-running permafail (~27 days).</strong> The job has been failing since | |
| <a href="https://amd64.ocp.releases.ci.openshift.org/releasestream/4.14.0-0.nightly/release/4.14.0-0.nightly-2026-04-16-012435" target="_blank">4.14.0-0.nightly-2026-04-16-012435</a> | |
| (April 16). The last accepted payload (<a href="https://amd64.ocp.releases.ci.openshift.org/releasestream/4.14.0-0.nightly/release/4.14.0-0.nightly-2026-05-07-213817" target="_blank">4.14.0-0.nightly-2026-05-07-213817</a>) | |
| was <strong>force-accepted</strong> despite this same failure. It has been <strong>~99 hours</strong> since the last accepted payload. | |
| </p> | |
| <p style="margin-top:8px;"> | |
| <strong>Fixes are merged on <code>release-4.14</code> and present in this payload, but the job still fails:</strong> | |
| <a href="https://github.com/openshift/multus-cni/pull/287" target="_blank">openshift/multus-cni#287</a> (OCPBUGS-85253, merged May 7) and | |
| <a href="https://github.com/openshift/cluster-network-operator/pull/2996" target="_blank">openshift/cluster-network-operator#2996</a> (OCPBUGS-84184, merged May 6) | |
| replace the bracket-wrapping with <code>net.JoinHostPort()</code>. | |
| The job installs a stable 4.13 cluster first, and <code>release-4.13</code> has not received the fix, so the 4.13 source cluster can hit the bug before the upgrade delivers the fixed 4.14 binary. | |
| </p> | |
| </div> | |
| <div class="summary-grid"> | |
| <div class="summary-card"> | |
| <div class="label">Blocking Jobs</div> | |
| <div class="value">10</div> | |
| </div> | |
| <div class="summary-card"> | |
| <div class="label">Passed</div> | |
| <div class="value green">9</div> | |
| </div> | |
| <div class="summary-card"> | |
| <div class="label">Failed</div> | |
| <div class="value red">1</div> | |
| </div> | |
| <div class="summary-card"> | |
| <div class="label">Hours Since Accepted</div> | |
| <div class="value orange">~99h</div> | |
| </div> | |
| <div class="summary-card"> | |
| <div class="label">Failure Streak</div> | |
| <div class="value red">5+ payloads</div> | |
| </div> | |
| <div class="summary-card"> | |
| <div class="label">New PRs</div> | |
| <div class="value">0</div> | |
| </div> | |
| </div> | |
| <h2>Blocking Job Results</h2> | |
| <table> | |
| <thead> | |
| <tr><th>Job</th><th>Status</th><th>Retries</th><th>Link</th></tr> | |
| </thead> | |
| <tbody> | |
| <tr> | |
| <td>aws-ovn-serial</td> | |
| <td class="status-pass">Succeeded</td> | |
| <td>0</td> | |
| <td><a href="https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-nightly-4.14-e2e-aws-ovn-serial/2053529046221852672" target="_blank">Prow</a></td> | |
| </tr> | |
| <tr> | |
| <td>aws-ovn-upgrade-micro</td> | |
| <td class="status-pass">Succeeded</td> | |
| <td>1</td> | |
| <td><a href="https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-ci-4.14-e2e-aws-ovn-upgrade/2053545492083642368" target="_blank">Prow</a></td> | |
| </tr> | |
| <tr> | |
| <td>aws-sdn-upgrade-4.14-micro</td> | |
| <td class="status-pass">Succeeded</td> | |
| <td>0</td> | |
| <td><a href="https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-nightly-4.14-e2e-aws-sdn-upgrade/2053529046167326720" target="_blank">Prow</a></td> | |
| </tr> | |
| <tr> | |
| <td>azure-ovn-upgrade-4.14-micro</td> | |
| <td class="status-pass">Succeeded</td> | |
| <td>0</td> | |
| <td><a href="https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-ci-4.14-e2e-azure-ovn-upgrade/2053529046901329920" target="_blank">Prow</a></td> | |
| </tr> | |
| <tr> | |
| <td>driver-toolkit</td> | |
| <td class="status-pass">Succeeded</td> | |
| <td>0</td> | |
| <td><a href="https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-nightly-4.14-e2e-aws-driver-toolkit/2053529051938689024" target="_blank">Prow</a></td> | |
| </tr> | |
| <tr> | |
| <td>fips-scan</td> | |
| <td class="status-pass">Succeeded</td> | |
| <td>0</td> | |
| <td><a href="https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-nightly-4.14-fips-payload-scan/2053529046402207744" target="_blank">Prow</a></td> | |
| </tr> | |
| <tr style="background: #fff5f5;"> | |
| <td><strong>gcp-ovn-rt-upgrade-4.14-minor</strong></td> | |
| <td class="status-fail">Failed</td> | |
| <td>3</td> | |
| <td><a href="https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-rt-upgrade/2053715790477135872" target="_blank">Prow (final)</a></td> | |
| </tr> | |
| <tr> | |
| <td>hypershift-ovn-conformance</td> | |
| <td class="status-pass">Succeeded</td> | |
| <td>0</td> | |
| <td><a href="https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-hypershift-release-4.14-periodics-e2e-aws-ovn-conformance/2053529046251212800" target="_blank">Prow</a></td> | |
| </tr> | |
| <tr> | |
| <td>metal-ipi-ovn-ipv6</td> | |
| <td class="status-pass">Succeeded</td> | |
| <td>0</td> | |
| <td><a href="https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-nightly-4.14-e2e-metal-ipi-ovn-ipv6/2053529046305738752" target="_blank">Prow</a></td> | |
| </tr> | |
| <tr> | |
| <td>metal-ipi-sdn-bm</td> | |
| <td class="status-pass">Succeeded</td> | |
| <td>0</td> | |
| <td><a href="https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-nightly-4.14-e2e-metal-ipi-sdn-bm/2053529046347681792" target="_blank">Prow</a></td> | |
| </tr> | |
| </tbody> | |
| </table> | |
| <h2>Failure History: gcp-ovn-rt-upgrade-4.14-minor</h2> | |
| <p>This job has failed in every payload since 4.14.0-0.nightly-2026-04-16-012435; it last passed in 4.14.0-0.nightly-2026-04-14-134916. The streak visualization shows the job's status across recent payloads (newest on the right):</p> | |
| <div style="display: flex; align-items: center; gap: 8px; margin: 12px 0; flex-wrap: wrap;"> | |
| <div style="display: flex; align-items: center; gap: 4px;"> | |
| <div class="streak-block pass" title="4.14.0-0.nightly-2026-04-14-134916 (Accepted, gcp PASSED)" style="font-size:0.5rem;">4/14</div> | |
| <div class="streak-block force" title="4.14.0-0.nightly-2026-04-16-012435 (Accepted/Force, gcp FAILED)" style="font-size:0.5rem;">4/16</div> | |
| <div class="streak-block force" title="4.14.0-0.nightly-2026-05-07-213817 (Accepted/Force, gcp FAILED)" style="font-size:0.5rem;">5/7</div> | |
| <div class="streak-block fail" title="4.14.0-0.nightly-2026-05-08-111625 (Rejected)" style="font-size:0.5rem;">5/8</div> | |
| <div class="streak-block fail" title="4.14.0-0.nightly-2026-05-09-034016 (Rejected)" style="font-size:0.5rem;">5/9a</div> | |
| <div class="streak-block fail" title="4.14.0-0.nightly-2026-05-09-211506 (Rejected)" style="font-size:0.5rem;">5/9b</div> | |
| <div class="streak-block fail target" title="4.14.0-0.nightly-2026-05-10-173114 (Rejected) - TARGET" style="font-size:0.5rem;">5/10</div> | |
| <div class="streak-block fail" title="4.14.0-0.nightly-2026-05-11-095134 (Rejected)" style="font-size:0.5rem;">5/11</div> | |
| </div> | |
| <div style="font-size: 0.75rem; color: var(--gray);"> | |
| <span class="streak-block pass" style="width:14px; height:14px; display:inline-flex; font-size:0;"> </span> Pass | |
| <span class="streak-block force" style="width:14px; height:14px; display:inline-flex; font-size:0; margin-left:6px;"> </span> Force-accepted (gcp failed) | |
| <span class="streak-block fail" style="width:14px; height:14px; display:inline-flex; font-size:0; margin-left:6px;"> </span> Rejected | |
| <span style="border: 2px solid var(--dark); width:14px; height:14px; display:inline-flex; border-radius:4px; margin-left:6px;"> </span> Target | |
| </div> | |
| </div> | |
| <table> | |
| <thead> | |
| <tr><th>Payload</th><th>Phase</th><th>gcp-ovn-rt-upgrade Status</th><th>New PRs</th></tr> | |
| </thead> | |
| <tbody> | |
| <tr> | |
| <td><a href="https://amd64.ocp.releases.ci.openshift.org/releasestream/4.14.0-0.nightly/release/4.14.0-0.nightly-2026-04-14-134916" target="_blank">...2026-04-14-134916</a></td> | |
| <td><span class="badge badge-green">Accepted</span></td> | |
| <td class="status-pass">Succeeded (1 retry)</td> | |
| <td>—</td> | |
| </tr> | |
| <tr style="background: #fff8e1;"> | |
| <td><a href="https://amd64.ocp.releases.ci.openshift.org/releasestream/4.14.0-0.nightly/release/4.14.0-0.nightly-2026-04-16-012435" target="_blank">...2026-04-16-012435</a></td> | |
| <td><span class="badge badge-orange">Force-Accepted</span></td> | |
| <td class="status-fail">Failed (3 retries) ← ONSET</td> | |
| <td>0</td> | |
| </tr> | |
| <tr style="background: #fff8e1;"> | |
| <td><a href="https://amd64.ocp.releases.ci.openshift.org/releasestream/4.14.0-0.nightly/release/4.14.0-0.nightly-2026-05-07-213817" target="_blank">...2026-05-07-213817</a></td> | |
| <td><span class="badge badge-orange">Force-Accepted</span></td> | |
| <td class="status-fail">Failed (3 retries)</td> | |
| <td>0</td> | |
| </tr> | |
| <tr> | |
| <td><a href="https://amd64.ocp.releases.ci.openshift.org/releasestream/4.14.0-0.nightly/release/4.14.0-0.nightly-2026-05-08-111625" target="_blank">...2026-05-08-111625</a></td> | |
| <td><span class="badge badge-red">Rejected</span></td> | |
| <td class="status-fail">Failed (3 retries)</td> | |
| <td>0</td> | |
| </tr> | |
| <tr> | |
| <td><a href="https://amd64.ocp.releases.ci.openshift.org/releasestream/4.14.0-0.nightly/release/4.14.0-0.nightly-2026-05-09-034016" target="_blank">...2026-05-09-034016</a></td> | |
| <td><span class="badge badge-red">Rejected</span></td> | |
| <td class="status-fail">Failed (3 retries)</td> | |
| <td>0</td> | |
| </tr> | |
| <tr> | |
| <td><a href="https://amd64.ocp.releases.ci.openshift.org/releasestream/4.14.0-0.nightly/release/4.14.0-0.nightly-2026-05-09-211506" target="_blank">...2026-05-09-211506</a></td> | |
| <td><span class="badge badge-red">Rejected</span></td> | |
| <td class="status-fail">Failed (3 retries)</td> | |
| <td>0</td> | |
| </tr> | |
| <tr style="background: #fef3f3;"> | |
| <td><a href="https://amd64.ocp.releases.ci.openshift.org/releasestream/4.14.0-0.nightly/release/4.14.0-0.nightly-2026-05-10-173114" target="_blank"><strong>...2026-05-10-173114</strong></a></td> | |
| <td><span class="badge badge-red">Rejected</span></td> | |
| <td class="status-fail"><strong>Failed (3 retries) ← TARGET</strong></td> | |
| <td>0</td> | |
| </tr> | |
| <tr> | |
| <td><a href="https://amd64.ocp.releases.ci.openshift.org/releasestream/4.14.0-0.nightly/release/4.14.0-0.nightly-2026-05-11-095134" target="_blank">...2026-05-11-095134</a></td> | |
| <td><span class="badge badge-red">Rejected</span></td> | |
| <td class="status-fail">Pending/Failed (2 retries)</td> | |
| <td>0</td> | |
| </tr> | |
| </tbody> | |
| </table> | |
| <h2>Failed Job Analysis: gcp-ovn-rt-upgrade-4.14-minor</h2> | |
| <details open> | |
| <summary style="background: #fef3f3;"> | |
| <span class="status-fail">FAILED</span> — gcp-ovn-rt-upgrade-4.14-minor | |
| <span style="float:right; font-weight: normal; font-size: 0.85rem;"> | |
| periodic-ci-openshift-release-main-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-rt-upgrade | |
| </span> | |
| </summary> | |
| <div class="detail-content"> | |
| <h3>Classification</h3> | |
| <p> | |
| <span class="badge badge-red">Product Bug</span> | |
| <span class="badge badge-gray" style="margin-left: 4px;">Networking / Multus CNI</span> | |
| <span class="badge badge-gray" style="margin-left: 4px;">Permafail ~27 days</span> | |
| </p> | |
| <h3>Failure Type</h3> | |
| <p><strong>Upgrade test failure</strong> — The 4.13 cluster installs successfully and the RT kernel tuned profile is applied, but the <strong>4.13→4.14 upgrade fails</strong> due to the DNS operator becoming permanently degraded.</p> | |
| <h3>Root Cause</h3> | |
| <p>During the 4.13→4.14 upgrade, worker nodes are rebooted by the MachineConfigDaemon to apply the new 4.14 OS. After reboot, the <strong>Multus CNI plugin fails to initialize its Kubernetes client</strong> because the API server URL in its kubeconfig is malformed with brackets around the hostname.</p> | |
| <div class="error-box">Multus: error getting k8s client: host must be a URL or a host:port pair: | |
| "https://[api-int.ci-op-k3f4fgi6-ad64e.XXXXXXXXXXXXXXXXXXXXXX]:6443"</div> | |
| <h3>Buggy Code</h3> | |
| <p>In <a href="https://github.com/openshift/multus-cni/blob/release-4.14/cmd/thin_entrypoint/main.go" target="_blank"><code>openshift/multus-cni → cmd/thin_entrypoint/main.go:202</code></a>, the kubeconfig URL was constructed as:</p> | |
| <div class="error-box" style="background: #f8f8f8; border-color: #ddd;">fmt.Sprintf("%s://[%s]:%s", kubeProtocol, kubeHost, kubePort)</div> | |
| <p>This <strong>unconditionally wraps</strong> <code>KUBERNETES_SERVICE_HOST</code> in square brackets <code>[...]</code>, which is only valid for IPv6 addresses. For DNS hostnames (like <code>api-int.ci-op-...</code>), the brackets produce an invalid URL.</p> | |
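| <p>The difference between the two constructions can be seen in a minimal Go sketch. The helper names (<code>buggyServerURL</code>, <code>fixedServerURL</code>) and the example hosts are hypothetical and are not taken from the multus-cni source or the fix PRs; only the <code>fmt.Sprintf</code> pattern quoted above and the standard-library <code>net.JoinHostPort</code> are real.</p> | |

```go
package main

import (
	"fmt"
	"net"
)

// buggyServerURL mirrors the pattern quoted above: the host is always
// wrapped in brackets, whether or not it is an IPv6 literal.
func buggyServerURL(protocol, host, port string) string {
	return fmt.Sprintf("%s://[%s]:%s", protocol, host, port)
}

// fixedServerURL is a hypothetical helper, not the code from the fix PRs.
// net.JoinHostPort adds brackets only when the host contains a colon,
// as IPv6 literals do.
func fixedServerURL(protocol, host, port string) string {
	return fmt.Sprintf("%s://%s", protocol, net.JoinHostPort(host, port))
}

func main() {
	// Example hosts are made up for illustration.
	for _, host := range []string{"api-int.example.com", "10.0.0.1", "fd00::1"} {
		fmt.Printf("%-22s buggy=%s fixed=%s\n", host,
			buggyServerURL("https", host, "6443"),
			fixedServerURL("https", host, "6443"))
	}
}
```

| <p>For a DNS name or an IPv4 address the two helpers disagree (only the buggy one adds brackets); for an IPv6 literal they produce the same string.</p> | |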
| <h3>Trigger: Go 1.24.8+ / CVE-2025-47912</h3> | |
| <p>This was a <strong>latent bug</strong> that existed for years but was harmless because older Go versions accepted brackets around any host. When the 4.14 build toolchain picked up <strong>Go 1.24.8+</strong> (around mid-April 2026), the stricter <code>url.Parse()</code> behavior from the <strong>CVE-2025-47912</strong> security fix started rejecting brackets around non-IPv6 addresses, exposing the bug.</p> | |
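| <p>One way to check how a given toolchain treats these URLs is a small probe like the sketch below. The function name <code>parseErr</code> and the example hosts are assumptions made for illustration. The result for the bracketed hostname depends on the Go version used to build the program, so no expected output is shown for it.</p> | |

```go
package main

import (
	"fmt"
	"net/url"
)

// parseErr returns the error from url.Parse for a candidate server URL,
// or nil if the URL is accepted.
func parseErr(raw string) error {
	_, err := url.Parse(raw)
	return err
}

func main() {
	candidates := []string{
		"https://api-int.example.com:6443",   // plain hostname
		"https://[fd00::1]:6443",             // bracketed IPv6 literal
		"https://[api-int.example.com]:6443", // bracketed hostname: result depends on the Go version
	}
	for _, raw := range candidates {
		fmt.Printf("%s -> err=%v\n", raw, parseErr(raw))
	}
}
```

| <p>Running the probe with the same Go version as the affected builder image would show whether that toolchain rejects the bracketed hostname at parse time.</p> | |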
| <h3>Fix (Already Merged)</h3> | |
| <table> | |
| <thead><tr><th>PR</th><th>Repo</th><th>Jira</th><th>Merged</th><th>Change</th></tr></thead> | |
| <tbody> | |
| <tr> | |
| <td><a href="https://github.com/openshift/multus-cni/pull/287" target="_blank">#287</a></td> | |
| <td>openshift/multus-cni</td> | |
| <td><a href="https://issues.redhat.com/browse/OCPBUGS-85253" target="_blank">OCPBUGS-85253</a></td> | |
| <td>May 7</td> | |
| <td>Replace <code>fmt.Sprintf("[%s]")</code> with <code>net.JoinHostPort()</code> in thin entrypoint kubeconfig generation</td> | |
| </tr> | |
| <tr> | |
| <td><a href="https://github.com/openshift/cluster-network-operator/pull/2996" target="_blank">#2996</a></td> | |
| <td>openshift/cluster-network-operator</td> | |
| <td><a href="https://issues.redhat.com/browse/OCPBUGS-84184" target="_blank">OCPBUGS-84184</a></td> | |
| <td>May 6</td> | |
| <td>Use <code>net.JoinHostPort()</code> in the multus admission controller and the cloud network config controller</td> | |
| </tr> | |
| </tbody> | |
| </table> | |
| <p style="margin-top: 8px;">The fix uses <code>net.JoinHostPort()</code>, which adds brackets only for IPv6 literals. <strong>Both fix commits are present in this payload</strong> (see Revert Recommendations below), but the job still fails because the 4.13 source cluster used by this upgrade test does not have the fix.</p> | |
| <h3>Failure Cascade</h3> | |
| <ol style="margin: 8px 0 8px 20px;"> | |
| <li>Multus CNI cannot initialize → cannot create/destroy pod sandboxes on affected worker nodes</li> | |
| <li>Pods on affected workers cannot be killed or recreated → <code>FailedKillPod</code> errors</li> | |
| <li><code>dns-default</code> DaemonSet rollout stuck (readiness probes fail with <code>connection refused</code> on port 8181)</li> | |
| <li>DNS operator reports <code>DNSDegraded</code>, <code>ClusterOperatorDegraded</code> alert fires</li> | |
| <li>Cluster version operator waits 40+ minutes for DNS → upgrade times out</li> | |
| </ol> | |
| <h3>Timeline (Most Recent Attempt)</h3> | |
| <div class="timeline"> | |
| <div class="timeline-item"> | |
| <span class="time">06:50 UTC</span> — RT kernel tuned profile applied, nodes rebooted/updated (pre-upgrade) | |
| </div> | |
| <div class="timeline-item"> | |
| <span class="time">06:55 UTC</span> — 4.13→4.14 upgrade initiated | |
| </div> | |
| <div class="timeline-item error"> | |
| <span class="time">07:27 UTC</span> — First Multus <code>FailedKillPod</code> errors on worker nodes | |
| </div> | |
| <div class="timeline-item error"> | |
| <span class="time">07:44 UTC</span> — Upgrade reaches 82% (712/861), stuck waiting on DNS operator (degraded) | |
| </div> | |
| <div class="timeline-item error"> | |
| <span class="time">08:06 UTC</span> — Cluster version reports: "Failing: Cluster operator dns is degraded" | |
| </div> | |
| <div class="timeline-item error"> | |
| <span class="time">09:25 UTC</span> — Upgrade times out after exceeding 40 minutes waiting on DNS | |
| </div> | |
| </div> | |
| <h3>Fired Alerts</h3> | |
| <ul class="alert-list"> | |
| <li><span class="alert-name">ClusterOperatorDegraded</span> (dns) <span class="alert-duration">— ~73 minutes</span></li> | |
| <li><span class="alert-name">KubeDaemonSetRolloutStuck</span> (dns-default) <span class="alert-duration">— ~68 minutes</span></li> | |
| <li><span class="alert-name">KubeDaemonSetRolloutStuck</span> (network-check-target) <span class="alert-duration">— ~85 minutes</span></li> | |
| <li><span class="alert-name">KubeDeploymentReplicasMismatch</span> (network-check-source) <span class="alert-duration">— ~100 minutes</span></li> | |
| <li><span class="alert-name">OVNKubernetesNorthdInactive</span> <span class="alert-duration">— ~5 minutes</span></li> | |
| </ul> | |
| <h3>All Attempts</h3> | |
| <table> | |
| <thead> | |
| <tr><th>#</th><th>Prow Job ID</th><th>Result</th></tr> | |
| </thead> | |
| <tbody> | |
| <tr> | |
| <td>1</td> | |
| <td><a href="https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-rt-upgrade/2053529046070857728" target="_blank">2053529046070857728</a></td> | |
| <td class="status-fail">Failed — Same root cause (Multus / DNS degraded)</td> | |
| </tr> | |
| <tr> | |
| <td>2</td> | |
| <td><a href="https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-rt-upgrade/2053589708469964800" target="_blank">2053589708469964800</a></td> | |
| <td class="status-fail">Failed — Same root cause</td> | |
| </tr> | |
| <tr> | |
| <td>3</td> | |
| <td><a href="https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-rt-upgrade/2053654288311259136" target="_blank">2053654288311259136</a></td> | |
| <td class="status-fail">Failed — Same root cause</td> | |
| </tr> | |
| <tr> | |
| <td>4 (final)</td> | |
| <td><a href="https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-rt-upgrade/2053715790477135872" target="_blank">2053715790477135872</a></td> | |
| <td class="status-fail">Failed — Same root cause</td> | |
| </tr> | |
| </tbody> | |
| </table> | |
| <h3>Consistency</h3> | |
| <p>All 4 attempts show the identical failure pattern: DNS operator degraded due to Multus CNI client initialization failure with bracket-malformed API server URL. This is a deterministic, 100% reproducible bug — not a flake.</p> | |
| </div> | |
| </details> | |
| <h2>Revert Recommendations</h2> | |
| <div class="no-revert-box" style="background: #fff7ed; border-color: #fdba74;"> | |
| <h3 style="color: var(--orange);">Fix Merged on 4.14 but Incomplete — 4.13 Also Needs Fix</h3> | |
| <p> | |
| The fixes have been merged to <code>release-4.14</code> and are <strong>confirmed present in this payload</strong> | |
| (verified via <code>oc adm release info --output json</code>): | |
| </p> | |
| <table> | |
| <thead><tr><th>Image</th><th>Commit in Payload</th><th>Fix PR</th></tr></thead> | |
| <tbody> | |
| <tr> | |
| <td><code>multus-cni</code></td> | |
| <td><code>26b410137fa0</code> (May 7)</td> | |
| <td><a href="https://github.com/openshift/multus-cni/pull/287" target="_blank">#287</a></td> | |
| </tr> | |
| <tr> | |
| <td><code>cluster-network-operator</code></td> | |
| <td><code>af41a362017b</code> (May 6)</td> | |
| <td><a href="https://github.com/openshift/cluster-network-operator/pull/2996" target="_blank">#2996</a></td> | |
| </tr> | |
| </tbody> | |
| </table> | |
| <p style="margin-top: 8px;"> | |
| <strong>However, the job still fails despite the fixes being present.</strong> This is because | |
| <code>gcp-ovn-rt-upgrade-4.14-minor</code> is a <strong>4.13→4.14 upgrade test</strong>. The test installs | |
| a cluster from <strong>stable 4.13</strong> first, then upgrades to the 4.14 nightly. During the initial 4.13 phase, | |
| the <strong>4.13 multus binary</strong> runs and generates the kubeconfig with the bracket-malformed URL. | |
| If the stable 4.13 images have also been rebuilt with Go 1.24.8+ (which includes the CVE-2025-47912 fix), | |
| they hit the same latent bug <em>before</em> the 4.14 upgrade even delivers the fixed binary. | |
| </p> | |
| <p style="margin-top: 8px;"> | |
| The <code>release-4.13</code> branch of <code>multus-cni</code> has <strong>not</strong> received this fix — | |
| its last commit is from April 2024. The same bracket-wrapping code exists there unchanged. | |
| </p> | |
| <h3 style="margin-top: 16px;">Jira Tracking</h3> | |
| <table> | |
| <thead><tr><th>Bug</th><th>Summary</th><th>Status</th></tr></thead> | |
| <tbody> | |
| <tr> | |
| <td><a href="https://issues.redhat.com/browse/OCPBUGS-72411" target="_blank">OCPBUGS-72411</a></td> | |
| <td>CNO fails to start with "host must be a URL or a host:port pair" (parent bug)</td> | |
| <td>Fix available</td> | |
| </tr> | |
| <tr> | |
| <td><a href="https://issues.redhat.com/browse/OCPBUGS-84184" target="_blank">OCPBUGS-84184</a></td> | |
| <td>CNO 4.14 clone — use <code>net.JoinHostPort</code> for URL construction</td> | |
| <td>Fix merged (May 6)</td> | |
| </tr> | |
| <tr> | |
| <td><a href="https://issues.redhat.com/browse/OCPBUGS-85253" target="_blank">OCPBUGS-85253</a></td> | |
| <td>Multus 4.14 — Fix server URL in generated kubeconfig</td> | |
| <td>Fix merged (May 7)</td> | |
| </tr> | |
| </tbody> | |
| </table> | |
| <h3 style="margin-top: 16px;">Background</h3> | |
| <p> | |
| The buggy <code>fmt.Sprintf("%s://[%s]:%s", ...)</code> pattern has existed in multus-cni since March 2023 | |
| (commit <code>dcf92c8e</code>). It was harmless because older Go versions tolerated brackets around non-IPv6 hosts. | |
| When ART rebuilt builder images with a Go version containing | |
| the <a href="https://github.com/golang/go/issues/75678" target="_blank">CVE-2025-47912</a> fix (around April 14, 2026), | |
| <code>url.Parse()</code> started strictly rejecting the bracketed hostnames, exposing the latent bug. | |
| The fix was backported to <code>release-4.14</code> but <strong>not to <code>release-4.13</code></strong>, which is | |
| the source version for this upgrade test. | |
| </p> | |
| <p style="margin-top: 8px;"> | |
| <strong>Recommended actions:</strong> | |
| </p> | |
| <ul style="margin: 8px 0 0 20px;"> | |
| <li><strong>Cherry-pick the fixes to <code>release-4.13</code></strong> for both <code>multus-cni</code> and <code>cluster-network-operator</code> so the 4.13 source cluster doesn't hit the bug before the upgrade delivers the 4.14 fix</li> | |
| <li>Consider force-accepting payloads in the interim, since all other 9 blocking jobs pass</li> | |
| <li>Verify the fix also reaches any other z-stream branches where ART has updated the Go toolchain</li> | |
| </ul> | |
| </div> | |
| <h2>Payload Composition</h2> | |
| <p>No new pull requests were included in this payload compared to its predecessor. The payload diff also reports zero new PRs for every payload in the rejection streak (May 8–11, 2026) and for the force-accepted payloads going back to April 16, 2026.</p> | |
| <h2>Related Changes on release-4.14</h2> | |
| <p>While the payload diff API shows zero PRs (the changes affect build toolchain, not payload image content), the following commits landed on <code>release-4.14</code> branches in the relevant timeframe:</p> | |
| <table> | |
| <thead><tr><th>Date</th><th>Repo</th><th>Change</th><th>Relevance</th></tr></thead> | |
| <tbody> | |
| <tr> | |
| <td>Apr 15</td> | |
| <td><a href="https://github.com/openshift/ovn-kubernetes/pull/3073" target="_blank">openshift/ovn-kubernetes#3073</a></td> | |
| <td>Unpin OVN, consume latest from FDP</td> | |
| <td class="status-pass">Not the cause (coincidental timing)</td> | |
| </tr> | |
| <tr> | |
| <td>~Apr 14</td> | |
| <td>(ART builder image)</td> | |
| <td>Go toolchain updated to include CVE-2025-47912 fix</td> | |
| <td class="status-fail">Trigger — stricter <code>url.Parse()</code> exposed latent Multus bug</td> | |
| </tr> | |
| <tr> | |
| <td>May 6</td> | |
| <td><a href="https://github.com/openshift/cluster-network-operator/pull/2996" target="_blank">openshift/cluster-network-operator#2996</a></td> | |
| <td>Use <code>net.JoinHostPort</code> for URL construction</td> | |
| <td><span class="badge badge-green">Fix</span></td> | |
| </tr> | |
| <tr> | |
| <td>May 7</td> | |
| <td><a href="https://github.com/openshift/multus-cni/pull/287" target="_blank">openshift/multus-cni#287</a></td> | |
| <td>Fix URL in generated kubeconfig</td> | |
| <td><span class="badge badge-green">Fix</span></td> | |
| </tr> | |
| </tbody> | |
| </table> | |
| <div class="footer"> | |
| <p>Generated by <code>/ci:analyze-payload</code> · Claude Code · 2026-05-12</p> | |
| <p>Target: <code>4.14.0-0.nightly-2026-05-10-173114</code> · Lookback: 10 payloads · Analysis based on Prow job artifacts and release controller data</p> | |
| </div> | |
| </body> | |
| </html> |