@marsmensch
Created September 3, 2025 07:03
Compare contributions by Bitcoin Core https://github.com/bitcoin/bitcoin contributors with Knots https://github.com/bitcoinknots/bitcoin contributors

Bitcoin Knots vs Core — Exclusive Contributors by LOC (Bash)

This script compares Bitcoin Knots and Bitcoin Core by actual lines of code (LOC) changed (adds + deletes), then lists exclusive contributors on each side. It also prints context (top directories, files, and commits) for Knots-only contributors.

Exclusivity (strict-zero):

  • Knots-only: > THRESHOLD LOC in Knots and exactly 0 LOC in Core
  • Core-only : > THRESHOLD LOC in Core and exactly 0 LOC in Knots
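The exclusivity rule above can be sketched as a small filter over two per-author tables. This is an illustrative sketch, not the script's own code: the function name `knots_only`, the temporary file names, and the example.org authors are made up, and each input is assumed to be an `author<TAB>loc` file listing every author with any LOC in that repository.

```shell
#!/usr/bin/env bash
# knots_only <knots.tsv> <core.tsv> <threshold>
# Prints authors with more than <threshold> LOC in Knots and no entry at all in Core.
knots_only() {
  awk -F'\t' -v T="$3" '
    FILENAME == ARGV[1] { core[$1] = 1; next }   # first file: every Core author
    $2 > T && !($1 in core) { print $1 }         # second file: Knots authors
  ' "$2" "$1" | LC_ALL=C sort
}

# Hypothetical data: Bob is at the threshold, Carol has 1 LOC in Core.
tmp="$(mktemp -d)"
printf 'Alice <alice@example.org>\t40\nBob <bob@example.org>\t1\nCarol <carol@example.org>\t7\n' > "$tmp/knots.tsv"
printf 'Carol <carol@example.org>\t1\n' > "$tmp/core.tsv"
knots_only "$tmp/knots.tsv" "$tmp/core.tsv" 1   # prints: Alice <alice@example.org>
rm -rf "$tmp"
```

Core-only is the same call with the two files swapped.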

Counting rules

  • git log --numstat, adds + deletes
  • Merges excluded
  • Binary-only rows (shown as - - path) are ignored
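The counting rules above can be tried on a hand-written numstat sample. The function name `sum_loc` and the sample paths are made up for illustration; the numeric check mirrors the one the script applies, but this is a simplified sketch that sums everything instead of grouping by author.

```shell
#!/usr/bin/env bash
# Sum adds + deletes from `git log --numstat`-style rows on stdin.
# Rows are "<added>\t<deleted>\t<path>"; binary files show "-\t-\t<path>" and are skipped.
sum_loc() {
  awk -F'\t' '
    NF == 3 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { total += $1 + $2 }
    END { print total + 0 }
  '
}

# Hypothetical sample: two text files and one binary file.
printf '10\t2\tsrc/init.cpp\n-\t-\tshare/pixmaps/icon.png\n3\t0\tdoc/zmq.md\n' | sum_loc   # prints 15
```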

Identity

  • Author identity uses mailmapped fields (%aN and %aE), shown as: Name <email> if email exists, otherwise Name.

Requirements

  • bash, git, awk, sort, join, comm, grep, cut, head, wc, xargs, mktemp
  • macOS or Linux. On Windows, run in Git Bash (or WSL).

Quick Start

chmod +x coreknots.sh
./coreknots.sh

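On the first run the script clones both repositories into the working directory (bitcoin-loc-compare by default) and reuses them on later runs:

./bitcoin-loc-compare/knots  (bitcoinknots/bitcoin)
./bitcoin-loc-compare/core   (bitcoin/bitcoin)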
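The script reads its settings from environment variables (WORKDIR, THRESHOLD, RANGE, SINCE, TOP_DIRS, TOP_FILES, TOP_COMMITS), so a run can be scoped without editing the file. The values below are illustrative only, and the commented-out range assumes a v28.0 tag exists in both repositories.

```shell
# Count only commits since 2023 and require more than 10 changed lines.
# The guard keeps this snippet harmless when coreknots.sh is not in the current directory.
if [ -x ./coreknots.sh ]; then
  SINCE="2023-01-01" THRESHOLD=10 ./coreknots.sh
fi

# Alternative scope by revision range, with more commits shown per author:
# RANGE="v28.0..HEAD" TOP_COMMITS=5 ./coreknots.sh
```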

Notes & Caveats

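Results depend on mailmapped identities (%aN|%aE) and the history scope (HEAD by default, or your RANGE/SINCE). One person can appear under several keys when their name or email differs between commits; the sample run, for instance, lists both Eunovo and Novo with the same email address.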

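If you need different identity semantics, change %aN|%aE to %an|%ae (raw, no mailmap) in both format strings: LOG_ARGS_BASE (used by calc_loc) and LOG_FMT (inside author_context_knots_raw). Keep the two in sync, otherwise the context lookup will not match the author keys.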
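Because the author key combines name and email, it can help to check a result file for emails that occur under more than one name. A minimal sketch for an `author<TAB>loc` table such as core_authors_loc.tsv follows; the function name `dup_emails` is made up here, and the sample rows are copied from the sample run's Core-only list.

```shell
#!/usr/bin/env bash
# dup_emails [file.tsv]: list emails that appear under more than one author name.
# Reads "Name <email><TAB>loc" rows from the given file, or from stdin if none is given.
dup_emails() {
  awk -F'\t' '
    match($1, /<[^<>]*>$/) {
      email = substr($1, RSTART + 1, RLENGTH - 2)   # strip the angle brackets
      if (!((email, $1) in seen)) { seen[email, $1] = 1; count[email]++ }
    }
    END { for (e in count) if (count[e] > 1) print e }
  ' "$@" | LC_ALL=C sort
}

printf 'Eunovo <eunovo9@gmail.com>\t138\nNovo <eunovo9@gmail.com>\t24\nKay <kehiiiiya@gmail.com>\t8\n' \
  | dup_emails   # prints: eunovo9@gmail.com
```

Any email it reports is a candidate for a .mailmap entry or for manual merging before drawing conclusions.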

#!/usr/bin/env bash
# LOC-based author exclusivity between Bitcoin Knots and Bitcoin Core
# + per-author context for Knots-only contributors (top dirs/files/commits by LOC).
#
# Counting rules:
# - Merges excluded
# - LOC = adds + deletes (from `git log --numstat`)
# - Binary-only rows ("-\t-\tfile") ignored
# Identity:
# - Author key uses mailmapped fields: %aN (name), %aE (email)
# - Displayed as: "Name <email>" if email exists, else "Name"
#
# Exclusivity (strict-zero):
# - Knots-only: >THRESHOLD LOC in Knots AND exactly 0 LOC in Core
# - Core-only: >THRESHOLD LOC in Core AND exactly 0 LOC in Knots
#
# Scope:
# - Default is current branch (HEAD) history on each repo.
# - You can constrain via RANGE (e.g., "v28.0..HEAD") or SINCE (e.g., "2023-01-01").
#
# Tested on a recent macOS and Ubuntu 22.04.
set -euo pipefail
# ---- Config (override via env) ----
WORKDIR="${WORKDIR:-bitcoin-loc-compare}"
KNOTS_REPO="${KNOTS_REPO:-https://github.com/bitcoinknots/bitcoin.git}"
CORE_REPO="${CORE_REPO:-https://github.com/bitcoin/bitcoin.git}"
KNOTS_DIR="$WORKDIR/knots"
CORE_DIR="$WORKDIR/core"
# "more than a single line" => > 1 (adds+deletes)
THRESHOLD="${THRESHOLD:-1}"
# Optional scope limiters for BOTH repos (leave empty for full HEAD history):
# RANGE="v28.0..HEAD" or SINCE="2023-01-01"
RANGE="${RANGE:-}"
SINCE="${SINCE:-}"
# Context sizes
TOP_DIRS="${TOP_DIRS:-5}"
TOP_FILES="${TOP_FILES:-5}"
TOP_COMMITS="${TOP_COMMITS:-3}"
# Quiet & deterministic git output
export GIT_TERMINAL_PROMPT=0
export GIT_PAGER=cat
GIT_CFG=(-c core.quotepath=off -c i18n.logOutputEncoding=UTF-8)
# ---- Dependency checks ----
need() { command -v "$1" >/dev/null 2>&1 || { echo "Error: '$1' not found on PATH." >&2; exit 127; }; }
for tool in git awk sort join comm grep cut head wc xargs mktemp; do need "$tool"; done
# ---- workspace ----
mkdir -p "$WORKDIR"
# ---- Prep (quiet; FULL clones to avoid lazy blob fetch spam) ----
if [ ! -d "$KNOTS_DIR/.git" ]; then
echo "Cloning Knots (quiet)…"
git "${GIT_CFG[@]}" clone -q --no-progress "$KNOTS_REPO" "$KNOTS_DIR" >/dev/null 2>&1
else
echo "Using existing Knots repo at $KNOTS_DIR"
fi
if [ ! -d "$CORE_DIR/.git" ]; then
echo "Cloning Core (quiet)…"
git "${GIT_CFG[@]}" clone -q --no-progress "$CORE_REPO" "$CORE_DIR" >/dev/null 2>&1
else
echo "Using existing Core repo at $CORE_DIR"
fi
echo "Fetching latest (quiet)…"
git -C "$KNOTS_DIR" "${GIT_CFG[@]}" fetch -q --no-progress --all --tags --prune >/dev/null 2>&1
git -C "$CORE_DIR" "${GIT_CFG[@]}" fetch -q --no-progress --all --tags --prune >/dev/null 2>&1
# ---- Build git log arg sets ----
LOG_ARGS_BASE=(--no-merges --numstat --format=--%aN\|%aE --date=short)
if [ -n "$SINCE" ]; then LOG_ARGS_BASE=(--since="$SINCE" "${LOG_ARGS_BASE[@]}"); fi
# Range vs path selection (HEAD by default)
if [ -n "$RANGE" ]; then
LOG_ARGS_REPO_RANGE=("${LOG_ARGS_BASE[@]}" $RANGE -- .)
fi
LOG_ARGS_REPO_PATH=("${LOG_ARGS_BASE[@]}" -- .)
# ---- calc_loc: build "author\tLOC" list for a repo (mailmap-applied) ----
calc_loc() {
repo_dir="$1"; out_file="$2"
if [ -n "$RANGE" ]; then
git -C "$repo_dir" "${GIT_CFG[@]}" log "${LOG_ARGS_REPO_RANGE[@]}" 2>/dev/null
else
git -C "$repo_dir" "${GIT_CFG[@]}" log "${LOG_ARGS_REPO_PATH[@]}" 2>/dev/null
fi | awk -v THRESHOLD="$THRESHOLD" -v OFS='\t' '
BEGIN { FS="\t" }
# Header lines look like: --Name|Email
/^--/ {
line=$0; sub(/^--/,"",line)
n = split(line, p, /\|/)
author=(n>=1?p[1]:""); email=(n>=2?p[2]:"")
key = (email != "" ? author " <" email ">" : author)
next
}
# numstat rows: "<added>\t<deleted>\t<path>"
NF==3 {
a=$1; d=$2
if (a ~ /^[0-9]+$/ && d ~ /^[0-9]+$/) loc[key]+=a+d
next
}
END {
for (k in loc) if (loc[k] > THRESHOLD) print k, loc[k]
}
' | LC_ALL=C sort -u > "$out_file"
}
# ---- author_context_knots_raw: raw context lines from Knots repo ----
# Emits unsorted lines tagged as:
# DIR<TAB><dir><TAB><loc>
# FILE<TAB><path><TAB><loc>
# COMM<TAB><date><TAB><sha><TAB><loc><TAB><subject>
author_context_knots_raw() {
author_key="$1" # "Name <email>" or "Name"
repo_dir="$KNOTS_DIR"
# Apply the same SINCE/RANGE scope as calc_loc so the context matches the totals
LOG_FMT=(--no-merges --numstat --format=@@@%aN\|%aE\|%H\|%ad\|%s --date=short)
if [ -n "$SINCE" ]; then LOG_FMT=(--since="$SINCE" "${LOG_FMT[@]}"); fi
if [ -n "$RANGE" ]; then LOG_FMT+=($RANGE); fi
LOG_FMT+=(-- .)
git -C "$repo_dir" "${GIT_CFG[@]}" log "${LOG_FMT[@]}" 2>/dev/null \
| awk -v KEY="$author_key" -v OFS='\t' '
BEGIN { FS="\t"; in_author=0 }
/^@@@/ {
sub(/^@@@/,"")
# header: name|email|sha|date|subject
n=split($0, h, /\|/)
name=(n>=1?h[1]:""); email=(n>=2?h[2]:"")
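sha=(n>=3?h[3]:""); date=(n>=4?h[4]:"")
# Subject is everything after the 4th "|" (it may itself contain "|")
subj=(n>=5?h[5]:""); for (i=6; i<=n; i++) subj = subj "|" h[i]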
key = (email != "" ? name " <" email ">" : name)
in_author = (key==KEY)
if (in_author) { curr_sha=sha; curr_date=date; curr_subj=subj; commit_loc[curr_sha]=0 }
next
}
in_author && NF==3 {
a=$1; d=$2; path=$3
if (a ~ /^[0-9]+$/ && d ~ /^[0-9]+$/) {
loc = a+d
dir=path; sub(/\/.*/,"",dir)
dir_loc[dir]+=loc
file_loc[path]+=loc
commit_loc[curr_sha]+=loc
commit_date[curr_sha]=curr_date
commit_subj[curr_sha]=curr_subj
}
next
}
END {
for (k in dir_loc) print "DIR", k, dir_loc[k]
for (k in file_loc) print "FILE", k, file_loc[k]
for (k in commit_loc) print "COMM", commit_date[k], k, commit_loc[k], commit_subj[k]
}
'
}
# ---- Compute LOC per author for each repo ----
KNOTS_TSV="$WORKDIR/knots_authors_loc.tsv"
CORE_TSV="$WORKDIR/core_authors_loc.tsv"
echo "Computing LOC by author (Knots)…"
calc_loc "$KNOTS_DIR" "$KNOTS_TSV"
echo "Computing LOC by author (Core)…"
calc_loc "$CORE_DIR" "$CORE_TSV"
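# ---- Build name-only sets (sorted) ----
# Candidates: authors above THRESHOLD (from the TSVs computed above).
cut -f1 "$KNOTS_TSV" | LC_ALL=C sort -u > "$WORKDIR/knots.names"
cut -f1 "$CORE_TSV" | LC_ALL=C sort -u > "$WORKDIR/core.names"
# Reference sets: every author with any LOC (second pass with THRESHOLD=0),
# so that "0 LOC in the other repo" is enforced strictly.
THRESHOLD=0 calc_loc "$KNOTS_DIR" "$WORKDIR/knots_all.tsv"
THRESHOLD=0 calc_loc "$CORE_DIR" "$WORKDIR/core_all.tsv"
cut -f1 "$WORKDIR/knots_all.tsv" | LC_ALL=C sort -u > "$WORKDIR/knots_all.names"
cut -f1 "$WORKDIR/core_all.tsv" | LC_ALL=C sort -u > "$WORKDIR/core_all.names"
# ---- Set differences (strict-zero exclusivity) ----
LC_ALL=C comm -23 "$WORKDIR/knots.names" "$WORKDIR/core_all.names" > "$WORKDIR/knots_only.names"
LC_ALL=C comm -13 "$WORKDIR/knots_all.names" "$WORKDIR/core.names" > "$WORKDIR/core_only.names"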
# ---- Attach LOC counts (join) ----
LC_ALL=C sort -u "$KNOTS_TSV" -o "$KNOTS_TSV"
LC_ALL=C sort -u "$CORE_TSV" -o "$CORE_TSV"
LC_ALL=C sort -u "$WORKDIR/knots_only.names" -o "$WORKDIR/knots_only.names"
LC_ALL=C sort -u "$WORKDIR/core_only.names" -o "$WORKDIR/core_only.names"
join -t $'\t' -1 1 -2 1 "$WORKDIR/knots_only.names" "$KNOTS_TSV" > "$WORKDIR/knots_only_with_loc.tsv" || true
join -t $'\t' -1 1 -2 1 "$WORKDIR/core_only.names" "$CORE_TSV" > "$WORKDIR/core_only_with_loc.tsv" || true
# ---- Output: summaries ----
echo
echo "Developers with >$THRESHOLD LOC in Knots and 0 LOC in Core:"
if [ -s "$WORKDIR/knots_only_with_loc.tsv" ]; then
awk -F'\t' '{printf "%s (LOC: %s)\n",$1,$2}' "$WORKDIR/knots_only_with_loc.tsv" | LC_ALL=C sort
echo "(total: $(wc -l < "$WORKDIR/knots_only_with_loc.tsv" | xargs))"
else
echo "(none)"
fi
echo
echo "Developers with >$THRESHOLD LOC in Core and 0 LOC in Knots:"
if [ -s "$WORKDIR/core_only_with_loc.tsv" ]; then
awk -F'\t' '{printf "%s (LOC: %s)\n",$1,$2}' "$WORKDIR/core_only_with_loc.tsv" | LC_ALL=C sort
echo "(total: $(wc -l < "$WORKDIR/core_only_with_loc.tsv" | xargs))"
else
echo "(none)"
fi
# ---- Detailed context for Knots-only contributors ----
cleanup_tmp() { [ -n "${_CTX_TMP:-}" ] && rm -f "$_CTX_TMP" || true; }
trap cleanup_tmp EXIT
if [ -s "$WORKDIR/knots_only_with_loc.tsv" ]; then
echo
echo "===== Knots-only contributor context ====="
while IFS=$'\t' read -r author_key loc; do
echo
echo "$author_key — total LOC in Knots: $loc"
_CTX_TMP="$(mktemp -t knotsctx.XXXXXX)"
author_context_knots_raw "$author_key" > "$_CTX_TMP"
# Top directories
echo " Top directories:"
if grep -q "^DIR"$'\t' "$_CTX_TMP"; then
grep "^DIR"$'\t' "$_CTX_TMP" \
| LC_ALL=C sort -t $'\t' -k3,3nr | head -n "$TOP_DIRS" \
| awk -F'\t' '{printf " - %-20s (LOC: %s)\n",$2,$3}'
else
echo " (none)"
fi
# Top files
echo " Top files:"
if grep -q "^FILE"$'\t' "$_CTX_TMP"; then
grep "^FILE"$'\t' "$_CTX_TMP" \
| LC_ALL=C sort -t $'\t' -k3,3nr | head -n "$TOP_FILES" \
| awk -F'\t' '{printf " - %s (LOC: %s)\n",$2,$3}'
else
echo " (none)"
fi
# Top commits (by LOC)
echo " Top commits:"
if grep -q "^COMM"$'\t' "$_CTX_TMP"; then
# COMM <date> <sha> <loc> <subject>
grep "^COMM"$'\t' "$_CTX_TMP" \
| LC_ALL=C sort -t $'\t' -k4,4nr | head -n "$TOP_COMMITS" \
| awk -F'\t' '{printf " - %s %s (LOC: %s) — %s\n",$2,substr($3,1,12),$4,$5}'
else
echo " (none)"
fi
rm -f "$_CTX_TMP"; _CTX_TMP=""
done < "$WORKDIR/knots_only_with_loc.tsv"
fi
echo
echo "Files written:"
echo " $KNOTS_TSV # Knots authors with LOC (> $THRESHOLD)"
echo " $CORE_TSV # Core authors with LOC (> $THRESHOLD)"
echo " $WORKDIR/knots_only_with_loc.tsv # Knots-only authors + LOC"
echo " $WORKDIR/core_only_with_loc.tsv # Core-only authors + LOC"
./coreknots.sh
Using existing Knots repo at bitcoin-loc-compare/knots
Using existing Core repo at bitcoin-loc-compare/core
Fetching latest (quiet)…
Computing LOC by author (Knots)…
Computing LOC by author (Core)…
Developers with >1 LOC in Knots and 0 LOC in Core:
CharlesCNorton <135471798+CharlesCNorton@users.noreply.github.com> (LOC: 2)
David Benjamin <davidben@davidben.net> (LOC: 2)
Doron Somech <somdoron@gmail.com> (LOC: 194)
Kratos <52678073+Xnork@users.noreply.github.com> (LOC: 5)
R E Broadley <github.com@esuza.com> (LOC: 216)
Steven Hay <steven.hay@protonmail.ch> (LOC: 361)
Thomas Kerin <me@thomaskerin.io> (LOC: 112)
Vojtěch Strnad <43024885+vostrnad@users.noreply.github.com> (LOC: 14)
(total: 8)
Developers with >1 LOC in Core and 0 LOC in Knots:
Boris Nagaev <bnagaev@gmail.com> (LOC: 4)
Bue-von-hon <dkssudvn2@gmail.com> (LOC: 73)
Chandra Pratap <chandrapratap3519@gmail.com> (LOC: 26)
Daniel Pfeifer <daniel@pfeifer-mail.de> (LOC: 405)
Eugene Siegel <elzeigel@gmail.com> (LOC: 733)
Eunovo <eunovo9@gmail.com> (LOC: 138)
Gutflo <107882881+klein818@users.noreply.github.com> (LOC: 2)
John Bampton <jbampton@gmail.com> (LOC: 2)
Kay <kehiiiiya@gmail.com> (LOC: 8)
Nicola Leonardo Susca <nicolaleonardo.susca@gmail.com> (LOC: 88)
Novo <eunovo9@gmail.com> (LOC: 24)
Pol Espinasa <pol.espinasa@uab.cat> (LOC: 188)
Prabhat Verma <prabhatverma329@gmail.com> (LOC: 106)
RiceChuan <lc582041246@gmail.com> (LOC: 2)
Saikiran <saikirannadipilli@gmail.com> (LOC: 74)
Salvatore Ingala <6681844+bigspider@users.noreply.github.com> (LOC: 40)
Shunsuke Shimizu <grafi@grafi.jp> (LOC: 10)
VolodymyrBg <aqdrgg19@gmail.com> (LOC: 11)
(total: 18)
===== Knots-only contributor context =====
CharlesCNorton <135471798+CharlesCNorton@users.noreply.github.com> — total LOC in Knots: 2
Top directories:
- README.md (LOC: 2)
Top files:
- README.md (LOC: 2)
Top commits:
- 2024-06-17 256f12b68e12 (LOC: 2) — fix: typo in development process documentation
David Benjamin <davidben@davidben.net> — total LOC in Knots: 2
Top directories:
- src (LOC: 2)
Top files:
- src/leveldb/util/hash.cc (LOC: 2)
Top commits:
- 2025-01-02 19e8086b9fe5 (LOC: 2) — Fix invalid pointer arithmetic in Hash (#1222)
Doron Somech <somdoron@gmail.com> — total LOC in Knots: 194
Top directories:
- src (LOC: 108)
- test (LOC: 75)
- doc (LOC: 11)
Top files:
- test/functional/interface_zmq.py (LOC: 75)
- src/zmq/zmqpublishnotifier.cpp (LOC: 37)
- src/zmq/zmqnotificationinterface.cpp (LOC: 27)
- src/zmq/zmqpublishnotifier.h (LOC: 12)
- doc/zmq.md (LOC: 11)
Top commits:
- 2017-06-08 7b41419e8def (LOC: 194) — ZMQ: add publishers of wallet tx
Kratos <52678073+Xnork@users.noreply.github.com> — total LOC in Knots: 5
Top directories:
- src (LOC: 5)
Top files:
- src/consensus/merkle.cpp (LOC: 5)
Top commits:
- 2023-09-08 42b25bbd9397 (LOC: 5) — fix: unnecessary continuation after finding mutation
R E Broadley <github.com@esuza.com> — total LOC in Knots: 216
Top directories:
- src (LOC: 216)
Top files:
- src/qt/trafficgraphwidget.cpp (LOC: 158)
- src/qt/guiutil.cpp (LOC: 34)
- src/qt/trafficgraphwidget.h (LOC: 11)
- src/util/time.cpp (LOC: 11)
- src/qt/guiutil.h (LOC: 1)
Top commits:
- 2021-12-01 fbeb4698203a (LOC: 122) — Show ToolTip on Network Traffic graph
- 2021-11-15 ad431ff5d18d (LOC: 42) — Enable non-linear network traffic graph
- 2021-12-07 e6ccf929a4e6 (LOC: 35) — Add formatBytesps function
Steven Hay <steven.hay@protonmail.ch> — total LOC in Knots: 361
Top directories:
- src (LOC: 361)
Top files:
- src/qt/res/src/bitcoin.svg (LOC: 361)
Top commits:
- 2016-02-26 d727d000d1eb (LOC: 361) — Replace bitcoin.svg with Knots version
Thomas Kerin <me@thomaskerin.io> — total LOC in Knots: 112
Top directories:
- test (LOC: 69)
- src (LOC: 43)
Top files:
- test/functional/rpc_sort_multisig.py (LOC: 41)
- test/functional/wallet_labels.py (LOC: 22)
- src/script/solver.cpp (LOC: 19)
- src/rpc/output_script.cpp (LOC: 8)
- src/wallet/rpc/addresses.cpp (LOC: 6)
Top commits:
- 2016-11-08 c22a69eed1bb (LOC: 112) — RPC: addmultisigaddress / createmultisig: parameterize _createmultisig_redeemScript to allow sorting of public keys (BIP67)
Vojtěch Strnad <43024885+vostrnad@users.noreply.github.com> — total LOC in Knots: 14
Top directories:
- src (LOC: 14)
Top files:
- src/test/fuzz/transaction.cpp (LOC: 4)
- src/policy/policy.cpp (LOC: 3)
- src/init.cpp (LOC: 2)
- src/node/mempool_args.cpp (LOC: 2)
- src/policy/policy.h (LOC: 2)
Top commits:
- 2024-01-25 df9da3a9a473 (LOC: 14) — Add a `-permitbarepubkey` option
Files written:
bitcoin-loc-compare/knots_authors_loc.tsv # Knots authors with LOC (> 1)
bitcoin-loc-compare/core_authors_loc.tsv # Core authors with LOC (> 1)
bitcoin-loc-compare/knots_only_with_loc.tsv # Knots-only authors + LOC
bitcoin-loc-compare/core_only_with_loc.tsv # Core-only authors + LOC
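The files listed above are plain `author<TAB>loc` tables, so they can be re-ranked with standard tools. A small sketch, assuming the default output path; the helper name `top_by_loc` is hypothetical and not part of the script.

```shell
#!/usr/bin/env bash
# top_by_loc <file.tsv> [n]: print the n authors with the most LOC (default 5).
# Ties are broken by author name so the order is stable.
top_by_loc() {
  LC_ALL=C sort -t $'\t' -k2,2nr -k1,1 "$1" | head -n "${2:-5}"
}

f="bitcoin-loc-compare/knots_only_with_loc.tsv"
if [ -f "$f" ]; then top_by_loc "$f" 3; fi
```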