William Caban Babilonia williamcaban

Evaluation flow for Telco Use Cases

Phase 1: Define Evaluation Scope

  1. Map use cases to evaluation dimensions — categorize by: network operations (fault diagnosis, config generation), customer-facing (intent classification, summarization), regulatory (data retention, privacy), and safety-critical (outage triage, escalation routing)
  2. Identify target model(s) — baseline (e.g., Llama-3-8B), candidate, and a reference frontier model for calibration
  3. Set acceptance thresholds per use case — e.g., config generation must be ≥95% syntactically valid; hallucination rate on network terminology must be <2%
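
The acceptance thresholds in step 3 can be written down as data and checked mechanically. The following is a minimal sketch, not a prescribed implementation: the metric names (`syntactic_validity`, `hallucination_rate`) and the `meets_thresholds` helper are hypothetical, and only the two numeric limits come from the examples above.

```python
import operator

# Hypothetical threshold table: metric name -> (comparison, limit).
# Only the two limits (>= 0.95 and < 0.02) come from the examples above;
# the metric names are illustrative.
THRESHOLDS = {
    "syntactic_validity": (operator.ge, 0.95),  # config generation: >= 95% valid
    "hallucination_rate": (operator.lt, 0.02),  # network terminology: < 2%
}


def meets_thresholds(measured, thresholds=THRESHOLDS):
    """Return metric name -> True/False; a metric that was not measured fails."""
    results = {}
    for name, (compare, limit) in thresholds.items():
        value = measured.get(name)
        results[name] = value is not None and compare(value, limit)
    return results


report = meets_thresholds({"syntactic_validity": 0.97, "hallucination_rate": 0.03})
print(report)
```

Treating a missing metric as a failure is a design choice in this sketch: a use case should not pass acceptance just because one of its measurements was never collected.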

| Concept | The Question It Answers | Returns | Example |
| --- | --- | --- | --- |
| Scorer | HOW do I compute this specific value? (deterministic) | A number or pass/fail | Exact match, F1, BERTScore, ROUGE |
| Grader | HOW do I assess this against a rubric? (qualitative + explanatory) | Score + explanation | LLM-as-a-Judge with faithfulness rubric, model_graded_qa |
| Evaluation Task | WHAT am I measuring, end-to-end? | Structured result from one or more scorers/graders | Hallucination detection protocol |
| Evaluation Suite / Collection | WHICH tasks apply to my use case? | Aggregate quality signal | RAG faithfulness suite for healthcare Q&A |
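
To make the scorer and task rows concrete, here is a small sketch with two deterministic scorers and a task that bundles them into a structured result. The `EvaluationTask` class, the function names, and the sample strings are all hypothetical and do not mirror any particular evaluation framework. A grader is left out because it would wrap an LLM call and return a score plus an explanation.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


def exact_match(prediction: str, reference: str) -> float:
    """Scorer: deterministic, returns 1.0 on a case-insensitive match, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def token_f1(prediction: str, reference: str) -> float:
    """Scorer: F1 over whitespace-separated tokens shared by both strings."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


@dataclass
class EvaluationTask:
    """Task: runs one or more scorers and returns a structured result."""
    name: str
    scorers: List[Callable[[str, str], float]]

    def run(self, prediction: str, reference: str) -> dict:
        return {s.__name__: s(prediction, reference) for s in self.scorers}


task = EvaluationTask("answer_quality", [exact_match, token_f1])
result = task.run("link down on port 3", "link down on port 7")
print(result)
```

A suite, in this framing, would simply be a named collection of such tasks selected for one use case.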
williamcaban / minio.yaml
Last active November 8, 2025 01:06
Deploying Minio Community Edition container in OpenShift
# kubectl create secret generic minio-secret \
# --from-literal=rootUser=your-username \
# --from-literal=rootPassword=your-secure-password \
# -o yaml --dry-run=client > minio-secret.yaml
#
# kubectl apply -f minio-secret.yaml -f minio.yaml
---
apiVersion: v1
kind: Secret
metadata:

DOCUMENT 1: CITY DEMOGRAPHIC REPORT

Millbrook Municipal Planning Department
Publication Date: March 15, 2024
Report ID: MPD-2024-003

Executive Summary

Millbrook, incorporated in 1892, is a diverse mid-sized city of 47,832 residents spanning 23.4 square miles. The municipality encompasses four distinct neighborhoods: Historic Downtown, Riverside District, University Heights, and New Millbrook. Population density averages approximately 2,044 residents per square mile, with significant demographic variation across districts.

Demographic Breakdown by Neighborhood

Historic Downtown (Population: 12,847)

williamcaban / README.md
Created August 13, 2025 02:42
Dummy implementation of OpenAI Responses API endpoints on Kubernetes Gateway API

Kubernetes Gateway API implementation that covers the OpenAI Responses API endpoints based on the current documentation.

Core Response Operations:

  • POST /v1/responses - Create new responses with input, model, and tools
  • GET /v1/responses/{id} - Retrieve response by ID
  • GET /v1/responses/{id} (with Accept: text/event-stream) - Stream responses in real-time
  • POST /v1/responses/{id} - Continue/extend existing responses
  • DELETE /v1/responses/{id} - Delete responses
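
The endpoint list above amounts to a small routing table keyed on HTTP method, path, and (for streaming) the `Accept` header. The sketch below illustrates that dispatch logic in plain Python. It is not the gist's Gateway API manifest, and the operation names and the `resolve` helper are hypothetical.

```python
import re

# Route table mirroring the endpoint list above; operation names are illustrative.
RESPONSE_ID = r"^/v1/responses/(?P<id>[^/]+)$"
ROUTES = [
    ("POST", re.compile(r"^/v1/responses$"), "create"),
    ("GET", re.compile(RESPONSE_ID), "retrieve"),
    ("POST", re.compile(RESPONSE_ID), "continue"),
    ("DELETE", re.compile(RESPONSE_ID), "delete"),
]


def resolve(method, path, accept=""):
    """Return (operation, path_params), or (None, {}) when nothing matches."""
    for verb, pattern, operation in ROUTES:
        match = pattern.match(path)
        if verb == method.upper() and match:
            # Same method and path as "retrieve"; only the Accept header differs.
            if operation == "retrieve" and "text/event-stream" in accept:
                operation = "stream"
            return operation, match.groupdict()
    return None, {}


print(resolve("POST", "/v1/responses"))
print(resolve("GET", "/v1/responses/resp_123", accept="text/event-stream"))
```

Because retrieval and streaming share a method and path, any router for these endpoints has to match on the header as well as the path.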
williamcaban / RAGAS Evals with KFP Setup Guide.md
Last active July 26, 2025 17:07
RAGAS evaluation using Kubeflow Pipelines

RAGAS Evaluation with Kubeflow Pipelines - Setup Guide

About

ragas_pipeline.py is a Kubeflow Pipelines (KFP) definition that shows how to run RAGAS evaluations with KFP.

It is an example pipeline with the characteristics expected in production environments, such as proper resource management, monitoring capabilities, and comprehensive documentation. Adjust the components to fit your specific RAG evaluation needs and infrastructure setup.

Key Components:

williamcaban / convert_1password_to_apply.py
Created April 13, 2025 23:54
Script to convert a 1Password 8 export to a CSV compatible with Apple Passwords
import csv
import os
def convert_1password_to_apple(input_file, output_file):
    """
    Convert a 1Password CSV export file to a format compatible with Apple Passwords.
    Apple Passwords import format requires the following columns:
    - Title
    - URL

RHEL AI 1.1 as Inference Endpoint

Step 1. Update the host_port key in the serve section of config.yaml to listen on all interfaces.

...
serve:
  backend: vllm
  chat_template: auto
  host_port: 0.0.0.0:8000
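
Note that `0.0.0.0` is a bind address, not an address a client can connect to. The sketch below shows how a remote client might derive its base URL from the `host_port` value. The helper names and the `rhelai.example.com` hostname are placeholders, and the `/v1` prefix assumes the vLLM backend is serving its OpenAI-compatible API.

```python
def parse_host_port(host_port):
    """Split a 'host:port' string such as '0.0.0.0:8000' into (host, port)."""
    host, _, port = host_port.rpartition(":")
    return host, int(port)


def client_base_url(host_port, server_address):
    """Build the URL a remote client would use to reach the endpoint.

    server_address is the server's reachable hostname or IP (a placeholder
    here); the '/v1' prefix is an assumption about the served API.
    """
    _, port = parse_host_port(host_port)
    return f"http://{server_address}:{port}/v1"


host, port = parse_host_port("0.0.0.0:8000")
print(host, port)
print(client_base_url("0.0.0.0:8000", "rhelai.example.com"))
```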
from datasets import load_dataset
# Combine 'Question' and 'Answer' into a single 'text' field
def combine_qa(local_dataset):
    local_dataset['text'] = f"User: {local_dataset['Question']}\nAssistant: {local_dataset['Answer']}"
    return local_dataset
####################################################################################
# main
####################################################################################
williamcaban / dataset_to_ilab.py
Last active July 7, 2024 18:53
Convert a Q&A custom dataset to InstructLab format
#
import sys, json
from pathlib import Path
from datetime import datetime
import pandas as pd
TSTAMP = datetime.now().replace(microsecond=0).isoformat().replace(":", '_')
ILABGEN = "granite-7b-lab-7b-Q4_K_M"+f"_{TSTAMP}"
DEBUG = True