async0x42 / kimi-vl-a3b-instruct-video-inference-script-cli.py
Created April 11, 2025 14:01 — forked from eek/kimi-vl-a3b-instruct-video-inference-script-cli.py
This script extracts frames from videos and generates descriptions using the Kimi-VL-A3B model. It takes the following arguments:
  video_path (required): Path to the input video file
  --max_frames (default=1): Maximum number of frames to extract
  --save_dir (default="./test-frames"): Directory to save extracted frames
  --prompt (default="Describe th…
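As a rough illustration of the command-line interface described above, the following is a minimal sketch using `argparse`. It covers only the three arguments whose names and defaults are fully visible in the description. The `--prompt` default is truncated there, so it is left out rather than guessed. The function name `build_parser` and the help strings are assumptions made for this sketch, not taken from the script itself.

```python
import argparse


def build_parser():
    """Build a parser mirroring the arguments listed in the description.

    build_parser is a hypothetical helper name; the original script may
    construct its parser differently.
    """
    parser = argparse.ArgumentParser(
        description="Extract frames from a video and describe them."
    )
    # Required positional argument: path to the input video file.
    parser.add_argument("video_path", help="Path to the input video file")
    # Optional arguments with the defaults stated in the description.
    parser.add_argument("--max_frames", type=int, default=1,
                        help="Maximum number of frames to extract")
    parser.add_argument("--save_dir", default="./test-frames",
                        help="Directory to save extracted frames")
    # --prompt is omitted on purpose: its default is cut off in the description.
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["clip.mp4", "--max_frames", "4"])
print(args.video_path, args.max_frames, args.save_dir)
```

Passing an explicit list to `parse_args` keeps the sketch runnable in isolation; a real script would call `parse_args()` with no arguments to read `sys.argv`.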
import cv2
import argparse
import torch
import os
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor
# Function to extract frames from video, save them, and return paths
def extract_frames(video_path, save_dir, target_fps=1, max_frames=1):
    """Extract up to max_frames frames from a video file at target_fps, save them, and return their paths."""