Doug Smith dougbtv

@dougbtv
dougbtv / deepseek-v4-pro-guide.md
Created April 28, 2026 22:01
DeepSeek V4 Pro: Build, Run, and Smoke Test Guide (nm-vllm-ent v0.20.1rc0, 8x B200)

DeepSeek V4 Pro: Build, Run, and Smoke Test Guide

Overview

This document covers how to build, deploy, and test the deepseek-ai/DeepSeek-V4-Pro model (1.6T total params, 49B active, FP4+FP8 mixed precision) using nm-vllm-ent (based on upstream vLLM v0.20.1rc0) on NVIDIA B200 GPUs.

Build Information

Field Value
dougbtv / laguna-howto.md
Last active April 28, 2026 12:30
Poolside Laguna: build, run, and smoke test howto

Poolside Laguna Howto

Build Info

| Field | Value |
|-------|-------|
| Model | Poolside Laguna (model card TBD), poolside.ai |
| Image | quay.io/vllm/rhaiis-early-access:poolside-laguna |
dougbtv / ipv6-upgrade-path.md
Created April 22, 2026 11:45
Home network IPv6 upgrade path — GMAVT/AS12282, ASUS ZenWifi AX, Fedora desktop

Home Network IPv6 Upgrade Path

Current Setup (as of 2026-04-22)

  • Desktop: Fedora 42 (Linux 6.17.11), eno1 on 192.168.50.198/24
  • Router: ASUS ZenWifi AX (192.168.50.1)
  • ISP: GMAVT / Green Mountain Access (AS12282), PPPoE connection
  • Public IP: 69.54.3.214 (pppoe-3.214.gmavt.net)
  • Location: Starksboro, VT
dougbtv / gemma4-31b-usage.md
Created April 17, 2026 23:25
Gemma 4 31B: Build, Run, and Smoke Test Guide (nm-vllm-ent v0.19.1)

Gemma 4 31B: Build, Run, and Smoke Test Guide

Overview

This document covers how to build, deploy, and test the google/gemma-4-31B-it model using nm-vllm-ent (based on upstream vLLM v0.19.1) on NVIDIA A100 GPUs.

Build Information

Field Value
dougbtv / MISTRAL_4_SMALL_HOWTO.md
Created April 13, 2026 23:52
Mistral-Small-4-119B: build, run, and smoke test howto
dougbtv / OMNI_WITH_LTX2.md
Last active March 18, 2026 18:32
Running LTX-2 Video Generation with vLLM-Omni and ComfyUI - Conference Guide

Running LTX-2 Video Generation with vLLM-Omni and ComfyUI

This guide shows you how to run LTX-2 video generation (text-to-video and image-to-video) using vLLM-Omni as the inference backend and ComfyUI as the frontend.

Background

LTX-2 is a powerful video generation model from Lightricks that supports both text-to-video (T2V) and image-to-video (I2V) generation with audio synthesis.

Resources:

dougbtv / README.md
Last active February 5, 2026 16:31
voxtral realtime in vLLM cheat sheet

RHAIIS Preview: Voxtral Realtime

This guide covers running and trying out the Red Hat AI Inference Server, powered by vLLM, to serve the Mistral Voxtral-Mini-4B-Realtime-2602 model.

You can find the Voxtral Mini model card at https://huggingface.co/mistralai/Voxtral-Mini-4B-Realtime-2602

From the model card:

Voxtral Mini 4B Realtime 2602 is a multilingual, realtime speech-transcription model and among the first open-source solutions to achieve accuracy comparable to offline systems with a delay of <500ms. It supports 13 languages and outperforms existing open-source baselines across a range of tasks, making it ideal for applications like voice assistants and live subtitling.

dougbtv / README.md
Last active December 15, 2025 16:04
RHAIIS Preview: NVIDIA Nemotron v3 (Nano 30B-A3B) on Red Hat AI Inference Server

This is a technical quick-start gist for the latest Red Hat AI Inference Server (RHAIIS) preview image, featuring NVIDIA Nemotron v3 Nano 30B-A3B models on vLLM.

Preview image tag (this release):

  • registry.redhat.io/rhaiis-preview/vllm-cuda-rhel9:nvidia-nemotron-v3

Upstream model family (Hugging Face):

  • nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16
  • nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
dougbtv / README.md
Last active July 25, 2024 19:48
net-attach-def spec: multiple interfaces returned in CNI results

Using this dummy CNI script...

Pay attention to the cniresult() routine, which returns... two interfaces.

#!/usr/bin/env bash

# DEBUG=true
# LOGFILE=/tmp/seamless.log
dougbtv / README.md
Last active February 14, 2024 18:02
Whereabouts reconciler cron schedule change + file deletion in OCP 4.12.z

Enable the reconciler...

Run oc edit networks.operator.openshift.io cluster and add an additionalNetworks section like this:

  additionalNetworks:
  - name: whereabouts-shim
    namespace: openshift-multus
    rawCNIConfig: |-
      {