Anil Keshwani anilkeshwani

Tinkering ⚗️ ML x Audio

anilkeshwani / diff_jsons.sh

Created December 23, 2025 12:58

Bash: Diff JSONs after normalization with jq

	#!/usr/bin/env bash
	set -euo pipefail

	files=(
	    "./path/to/yourfile.json"
	    ".another/path/to/another/file.json"
	    "./toplevelfile.json"
	)

	for f in "${files[@]}"; do
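
The preview above is cut off inside the loop, so the gist's actual jq invocation is not shown here. As a rough Python sketch of the same idea (normalize, then diff), assuming that "normalization" means sorted keys and fixed indentation, similar in spirit to `jq -S .`; the helper names `normalize` and `diff_json` are hypothetical and not taken from the gist:

```python
import difflib
import json


def normalize(text: str) -> list[str]:
    """Parse JSON and re-serialize it with sorted keys and fixed indentation."""
    return json.dumps(json.loads(text), sort_keys=True, indent=2).splitlines()


def diff_json(a_text: str, b_text: str, a_name: str = "a", b_name: str = "b") -> list[str]:
    """Return unified-diff lines between two JSON documents after normalization."""
    return list(
        difflib.unified_diff(
            normalize(a_text),
            normalize(b_text),
            fromfile=a_name,
            tofile=b_name,
            lineterm="",
        )
    )


# Hypothetical inputs: same keys in a different order, one changed value.
left = '{"b": 1, "a": [1, 2]}'
right = '{"a": [1, 2], "b": 2}'
print("\n".join(diff_json(left, right)))
```

Because both sides are re-serialized the same way, differences in key order or whitespace do not show up in the diff; only real value changes do.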

anilkeshwani / random_choice_via_coin_flip.py

Created July 11, 2025 12:26

	import random
	import matplotlib.pyplot as plt

	def choose_from(size: int) -> int:
	    start, end = 0, size
	    dst = size
	    while dst > 1:
	        if random.getrandbits(1):
	            start += dst // 2 # choose rhs
	        else:
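
The preview stops inside the `else` branch, so the rest of the gist's bisection is not reproduced here. As a related sketch (a different, standard technique, not a completion of the gist's function), rejection sampling draws a uniform index using only fair coin flips: build a number from enough random bits and retry whenever it falls outside the range. The function name `uniform_choice_from_bits` is hypothetical:

```python
import random
from collections import Counter


def uniform_choice_from_bits(size: int) -> int:
    """Pick an index in range(size) uniformly, using only single coin flips."""
    if size < 1:
        raise ValueError("size must be at least 1")
    n_bits = (size - 1).bit_length()  # bits needed to represent size - 1
    while True:
        value = 0
        for _ in range(n_bits):
            value = (value << 1) | random.getrandbits(1)  # one coin flip
        if value < size:  # otherwise reject and flip again
            return value


# Empirical check: counts for each index should be roughly equal.
random.seed(0)
counts = Counter(uniform_choice_from_bits(5) for _ in range(10_000))
print(sorted(counts.items()))
```

Rejected draws cost extra flips, but every accepted value is equally likely, including when `size` is not a power of two.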

anilkeshwani / extended_euclidean_algorithm.py

Created July 8, 2025 16:03

Euclidean Algorithm to find the GCD and Extended Euclidean Algorithm to find the GCD and Bézout coefficients

	def euclidean_algorithm(a: int, b: int, verbose: bool = False) -> int:
	    if a < b:
	        a, b = b, a
	    if not b:  # gcd(a, 0) == a; also avoids a % 0 below
	        return a
	    while True:
	        a, b = b, a % b
	        if not b:
	            return a
	        if verbose:
	            print(f"{(a, b) = }")
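
The description also mentions the extended variant, which the preview does not show. A minimal sketch of the textbook iterative form, assuming non-negative integer inputs; the name `extended_euclidean` and its return order are illustrative and may differ from the gist:

```python
def extended_euclidean(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, x, y) such that g = gcd(a, b) and a * x + b * y == g."""
    old_r, r = a, b  # remainders
    old_x, x = 1, 0  # coefficients of a
    old_y, y = 0, 1  # coefficients of b
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


g, x, y = extended_euclidean(240, 46)
print(g, x, y)  # 2 -9 47, since 240 * (-9) + 46 * 47 == 2
```

The invariant `a * old_x + b * old_y == old_r` holds on every iteration, which is why the final coefficients satisfy Bézout's identity.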

anilkeshwani / jq_examples.md

Created June 4, 2025 14:54 — forked from deepns/jq_examples.md

Some examples of filtering JSON data using jq

Some examples of using jq

Simple filter
Access objects
Access lists/arrays
Combine filters with pipe
Raw output
Transform
Feed into multiple filters

anilkeshwani / tmux_slurm_launcher_example.sh

Created May 23, 2025 13:35

Example Bash script to launch Slurm jobs in new tmux windows (detached from session) with Python

	#!/usr/bin/env bash

	for i in $(seq -f "%03g" 86 108); do
	    file="/mnt/scratch-artemis/anilkeshwani/mls/shards/mls-transcripts_uroman_${i}.jsonl"
	    echo "Launching ${file}"
	    tmux new-window -d -t mls -n "align_and_hubert_mls_$i" -- bash -c "
	    srun --partition a6000 --time=04:00:00 --qos=gpu-long --gres=gpu:1 \
	        /mnt/scratch-artemis/anilkeshwani/miniconda3/envs/sardalign/bin/python /mnt/scratch-artemis/anilkeshwani/speech-text-alignment/scripts/align_and_hubert_encode.py \
	        --jsonl \"${file}\" \
	        --ids-not-paths \

anilkeshwani / compare_defaultdict_regular_dict_benchmark_tokenization.py

Created March 18, 2025 09:20

Benchmark performance of defaultdict vs regular dict with explicit key check for tokenization use case

	import random
	import string
	import timeit
	from collections import defaultdict


	# Generate a random list of words (simulating a corpus)
	random.seed(42)
	words = ["".join(random.choices(string.ascii_lowercase, k=5)) for _ in range(100000)]
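
The preview ends before the benchmark itself. A minimal sketch of what such a comparison could look like, assuming the "tokenization use case" is counting token frequencies; the function names, corpus size and repeat count are illustrative and not taken from the gist:

```python
import random
import string
import timeit
from collections import defaultdict

# Smaller synthetic corpus than in the gist, with short words so keys repeat.
random.seed(42)
words = ["".join(random.choices(string.ascii_lowercase, k=3)) for _ in range(20_000)]


def count_with_defaultdict(tokens):
    """Count tokens, letting defaultdict create missing entries."""
    counts = defaultdict(int)
    for token in tokens:
        counts[token] += 1
    return counts


def count_with_dict(tokens):
    """Count tokens with a regular dict and an explicit membership check."""
    counts = {}
    for token in tokens:
        if token in counts:
            counts[token] += 1
        else:
            counts[token] = 1
    return counts


t_defaultdict = timeit.timeit(lambda: count_with_defaultdict(words), number=20)
t_dict = timeit.timeit(lambda: count_with_dict(words), number=20)
print(f"defaultdict: {t_defaultdict:.3f}s, dict with key check: {t_dict:.3f}s")
```

Timings depend on the Python version and on how often keys repeat, so the numbers are only meaningful relative to each other on the same machine.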

anilkeshwani / get_pytorch_environment_info.py

Created March 11, 2025 13:06

Query PyTorch, CUDA and NVIDIA environment information

	import platform
	import subprocess
	import sys

	import numpy as np
	import torch


	# PyTorch info
	print(f"PyTorch version: {torch.__version__}")

anilkeshwani / show_hf_llama_3_2_3B_ckpt_structure.py

Last active November 26, 2024 16:03

Snippet showing the Llama 3.2 3B checkpoint structure, as an example of how models are split by tensor into multiple files when checkpoints are saved to Hugging Face repos (the aim is to avoid exceeding 5 GB per file, even though this is not a hard limit)

	#!/usr/bin/env python

	"""
	See: https://huggingface.co/docs/safetensors/en/index
	"""

	from pathlib import Path
	from pprint import pp
	from time import perf_counter

anilkeshwani / quartz_github_workflows_deploy.yml

Created October 25, 2024 12:43

quartz/.github/workflows/deploy.yml - 2024-10-25 - Source: https://quartz.jzhao.xyz/hosting

	name: Deploy Quartz site to GitHub Pages

	on:
	  push:
	    branches:
	      - v4

	permissions:
	  contents: read
	  pages: write

anilkeshwani / salted.py

Last active October 19, 2024 21:26

Decoding output from hexdump (hexadecimal integers) and converting to binary

	#!/usr/bin/env python


	hexstr = "53 61 6c 74 65 64 5f 5f" # ef bf bd ef bf bd ef bf
	hexstr = hexstr.replace(" ", "")
	print(len(str(hexstr)))
	print(bytes.fromhex(hexstr).decode("utf-8"))

	decimal = int(hexstr, 16)
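
The preview stops before the conversion to binary mentioned in the description. One possible way to finish the idea, shown as a sketch; the variable names `raw`, `bits` and `binary` are illustrative and not taken from the gist:

```python
hexstr = "53 61 6c 74 65 64 5f 5f".replace(" ", "")

raw = bytes.fromhex(hexstr)  # 8 bytes
text = raw.decode("utf-8")   # "Salted__"

# Per-byte view: each byte as 8 binary digits.
bits = " ".join(f"{byte:08b}" for byte in raw)

# Whole-number view: zero-pad to 4 bits per hex digit so leading zeros survive.
decimal = int(hexstr, 16)
binary = format(decimal, f"0{4 * len(hexstr)}b")

print(text)
print(bits)
print(binary)
```

Padding to four bits per hex digit matters here because the first byte, `0x53`, starts with a zero bit that a plain `bin(decimal)` would drop.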
