Michael Chirico MichaelChirico

@MichaelChirico
MichaelChirico / validate_wilcox_rank.R
Created May 10, 2026 20:53
Validate check of digits.rank
# This script creates two dummy packages to prove that the binary search evaluation
# correctly identifies packages that fail when digits.rank is set too low.
# -- Create dummy3 --
# This package expects wilcox.test(c(1.11, 2.22), c(1.14, 2.25))$statistic to be exactly 1.
# This assertion fails at digits.rank=1 and 2, but passes at digits.rank >= 3.
dir.create("dummy3/R", recursive = TRUE, showWarnings = FALSE)
dir.create("dummy3/tests/testthat", recursive = TRUE, showWarnings = FALSE)
writeLines(c(
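The effect that the dummy3 comments describe can be reproduced directly in base R. This is a minimal sketch, assuming R >= 4.0.0 (where `wilcox.test()` gained the `digits.rank` argument); the helper name `w_stat` is hypothetical and is not part of the gist.

```r
# Same inputs as in the dummy3 comment above.
x <- c(1.11, 2.22)
y <- c(1.14, 2.25)

# Hypothetical helper: return the bare W statistic for a given digits.rank.
# suppressWarnings() hides the "cannot compute exact p-value with ties" warning
# that appears once rounding makes the inputs tie.
w_stat <- function(digits) {
  unname(suppressWarnings(wilcox.test(x, y, digits.rank = digits))$statistic)
}

w_coarse <- w_stat(1)  # inputs round to 1, 2, 1, 2, so the ranks tie and W is 2
w_fine <- w_stat(3)    # all four values stay distinct, so W is 1
```

With one significant digit the two samples become indistinguishable pairs, which is why an assertion that W is exactly 1 fails at low `digits.rank`.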
@MichaelChirico
MichaelChirico / cran_wilcox_rank.R
Last active May 10, 2026 20:50
Search CRAN packages for sensitivity to wilcox.test(digits.rank=X)
# Assume these are installed in a state to pass R CMD check, i.e.,
# with enough Suggests to pass with _R_CHECK_FORCE_SUGGESTS_=false
packages <- c("cardx", "caTools", "clintools", "cogmapr", "CRMetrics", "EasyDescribe",
"effectsize", "EnvStats", "eyetrackingR", "ggpubr", "ggpval", "ggsignif",
"gtsummary", "iCellR", "iDOS", "jsmodule", "LGDtoolkit", "microeco",
"mnda", "mt", "PairedData", "pairwiseCI", "papeR", "pctax", "pcutils",
"PLEXI", "plotbb", "plotthis", "PopComm", "qPCRtools", "RadOnc",
"rattle", "rbiom", "Rcmdr", "RcmdrPlugin.MPAStats", "rcompanion",
"rempsyc", "ReporterScore", "SCpubr", "sigminer", "tinyarray",
"TOSTER", "UCSCXenaShiny", "voiceR", "volcano3D")
@MichaelChirico
MichaelChirico / units_udunits_no_exceptions_crash.sh
Created December 16, 2025 20:19
units crashes if udunits compiled with -fno-exceptions
#!/bin/bash
set -e # Exit immediately if a command exits with a non-zero status.
# --- 1. Environment Setup ---
echo ">>> Installing System Dependencies..."
sudo apt-get update -qq
# Dependencies:
# - flex/bison: for udunits parser compilation
# - texinfo: for udunits documentation
# - libexpat1-dev: for udunits XML parsing
@MichaelChirico
MichaelChirico / methods_load_unload.R
Last active November 16, 2025 02:48
Check load/unload loop for methods downstreams
cran_repo = "https://cloud.r-project.org"
bioc_repo = BiocManager::repositories()["BioCsoft"]
cran_db <- data.frame(available.packages(repos = cran_repo))
bioc_db <- data.frame(available.packages(repos = bioc_repo))
methods_importers <- function(db, skip) db |>
subset(
grepl("(^|[^\\w.])methods($|[^\\w.])", Imports, perl=TRUE),
"Package",
@MichaelChirico
MichaelChirico / methods_load_unload.R
Created November 7, 2025 15:38
{methods} load+unload loop test
db <- data.frame(available.packages())
methods_imports <- db |>
subset(grepl("(^|[^\\w.])methods($|[^\\w.])", Imports, perl=TRUE), "Package", drop=TRUE)
test_lib = '/media/WesternDigital3839/tmpCRAN'
# very slow, and requires some iteration to get SystemRequirements
install.packages(methods_imports, lib=test_lib)
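The regular expression used to filter `Imports` can be checked on a few sample strings. This is a small sketch in base R; the sample `Imports` values are made up for illustration (including the hypothetical package name `methods.extra`) and do not come from a real package database.

```r
# Same pattern as in the subset() call above: "methods" must not be preceded
# or followed by a word character or a dot.
pattern <- "(^|[^\\w.])methods($|[^\\w.])"

# Illustrative Imports fields (assumed, not real metadata).
imports <- c(
  "methods",
  "utils, methods (>= 3.5.0)",
  "stats,\nmethods",
  "R.methodsS3",
  "methods.extra"
)

hits <- grepl(pattern, imports, perl = TRUE)
# hits is TRUE TRUE TRUE FALSE FALSE
```

Excluding `.` as well as `\w` on both sides is what keeps dotted package names such as `R.methodsS3` from being counted as importers of {methods}.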
@MichaelChirico
MichaelChirico / nfl_game_durations.R
Created October 28, 2025 16:18
Get NFL game durations
library(rvest)
library(xml2)
PFR_URL = 'https://www.pro-football-reference.com'
read_with_backoff = function(url, sleep = 0.1) {
tryCatch(read_html(url), error = function(.) {
Sys.sleep(sleep)
sleep = 2 * sleep
message(sprintf("Failed, retrying in %.2fs", sleep))
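The preview above is cut off, so the rest of `read_with_backoff` is not visible. As a generic illustration of retrying with a doubling delay, here is a minimal sketch in base R; `with_backoff`, `max_tries`, and `flaky` are hypothetical names introduced for this example and are not taken from the gist.

```r
# Hypothetical helper: call f() up to max_tries times, doubling the wait
# after each failed attempt and re-raising the last error if all attempts fail.
with_backoff <- function(f, sleep = 0.1, max_tries = 5L) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    if (attempt == max_tries) stop(result)
    message(sprintf("Attempt %d failed, retrying in %.2fs", attempt, sleep))
    Sys.sleep(sleep)
    sleep <- 2 * sleep
  }
}

# Usage: a stand-in for a flaky request that fails twice and then succeeds.
calls <- 0L
flaky <- function() {
  calls <<- calls + 1L
  if (calls < 3L) stop("transient failure")
  "ok"
}
out <- with_backoff(flaky, sleep = 0.01)
```

Capping the number of attempts is a design choice of this sketch; it keeps a permanently failing URL from retrying forever.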
@MichaelChirico
MichaelChirico / svn-range-finder.sh
Last active September 30, 2025 20:36
Find range of SVN commits where a file matches a pattern
#!/bin/bash
# GENERATED BY GEMINI
# Configuration Variables
# The path to the file in the SVN repository
FILE_PATH="src/library/parallel/NAMESPACE"
# The pattern you are searching for (case-sensitive)
SEARCH_PATTERN="splitList"
# --- Script Logic ---
@MichaelChirico
MichaelChirico / github_pr_file_metadata.R
Last active July 2, 2025 16:44
Scrape GitHub PRs in a repo to see which files they touch
# INITIALLY GENERATED BY GEMINI
# PR File Scraper
#
# Description:
# This script inspects a local Git repository, identifies its GitHub remote,
# and fetches all open pull requests. It then compiles a data.frame
# where each row represents a file modified in a specific pull request,
# along with metadata about that PR.
#
@MichaelChirico
MichaelChirico / rcv_bootstrap.R
Created June 6, 2025 14:19
Code for running bootstraps of ranked-choice voting elections
library(data.table)
library(ggplot2)
# Downloaded from https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/AMK8PJ
dir="/media/michael/ab3f2700-872c-4b29-95f2-9a700166bc52/dataverse_files"
run_rcv_election = function(ballots) {
# TODO(michaelchirico): there should be a way to safely avoid a full copy, is it worth it?
ballots = copy(ballots[, .(rank, candidate)])
# _don't_ rely on ballots$voterid from input -- in bootstrap, under resampling, we will have
@MichaelChirico
MichaelChirico / rbind.fill_vs_bind_rows.R
Last active June 3, 2025 23:21
Comparing plyr::rbind.fill and dplyr::bind_rows()
# quick look sheet for comparing plyr::rbind.fill --> dplyr::bind_rows()
# NB: I am only interested in migrating rbind.fill-->bind_rows(), so
# features of bind_rows() absent from rbind.fill(), e.g. .id=, are not examined.
rbind.fill = plyr::rbind.fill
bind_rows = dplyr::bind_rows
DF1 = data.frame(a = 1, b = 2)
DF2 = data.frame(a = 1, b = 2)
all.equal(rbind.fill(DF1, DF2), bind_rows(DF1, DF2))