LLMs may spontaneously switch to Chinese mid-reasoning regardless of prompt language — observed in both OpenAI's o1 and Chinese models (DeepSeek, Qwen, GLM)
The papers listed below propose three possible explanations: competition between internal circuits, strategic advantages of reasoning in certain languages acquired during training, and the influence of the training-data distribution.
Mechanistic interpretability research suggests that multilingual LLMs possess two distinct internal subsystems that govern generation: