komamitsu · May 5, 2026 05:40
diff --git a/gistfile2.txt b/gistfile2.txt
 # Wormhole4j 0.3: Thread-Safe Concurrent Access

 ## 1. Introduction
 - Recap of Wormhole4j: fast ordered map, O(log L) lookups, EuroSys '19 paper
 - The gap: single-threaded only until now
 - What this post covers: design, correctness verification, benchmarks

 ## 2. Why Concurrent Ordered Indexes Are Hard
 - Maintaining order + range scans rules out simple approaches
 - The specific problem: splits/merges modify both the LeafList
  and MetaTrieHT simultaneously — you can't just add a lock without
  serializing reads
 - Three categories of operations with different needs (from the paper §2.5):
  reads, local writes, structural writes — motivates why one strategy doesn't fit all

 ## 3. The Design

 ### Lock-Free Reads: Dual Tables + QSBR
 - Two copies of MetaTrieHT; writers modify the inactive one, then atomically flip
 - QSBR is how the writer knows it's safe to touch the old table afterwards:
  threads signal quiescent states instead of reference counting — much cheaper
 - This is why registerThread() / unregisterThread() are required

 ### Fine-Grained Locking on Leaf Nodes
 - Per-leaf locks keep contention localised to the nodes actually being modified
 - Version numbers detect stale reads and trigger lightweight retries
  rather than blocking

 ## 4. Testing Correctness with Lincheck
 - Concurrent bugs are invisible to unit tests and stress tests
 - What linearizability means and why it matters
 - How Lincheck works: specify correct sequential behaviour (TreeMap),
  then verify every concurrent execution matches it
 - Small key range + tiny leaf node size = constant splits/merges =
  maximum bug exposure
 - It found many real bugs — this was the most valuable part of the process

 ## 5. Benchmark Results
 - Setup: Java 21, 1+1 through 8+8 threads, vs ConcurrentSkipListMap
 - Put + Get: consistently faster, especially String keys
 - Put + Scan: strong PUT advantage; honest note on numeric key SCAN trade-off
 - Scalability: linear to ~4+4 threads, then plateau — advantage maintained throughout
 - Charts

 ## 6. Trade-offs and Lessons Learned
 - Explicit thread registration: a deliberate choice, not an oversight
 - Numeric key SCAN under write pressure: the one area where
  ConcurrentSkipListMap has an edge
 - Lincheck should be in your toolkit for any concurrent data structure

 ## 7. Closing
 - Links to GitHub, paper, previous posts
 - Next: persistence support
	# Wormhole4j 0.3: Thread-Safe Concurrent Access

	## 1. Introduction
	- Recap of Wormhole4j: fast ordered map, O(log L) lookups, EuroSys '19 paper
	- The gap: single-threaded only until now
	- What this post covers: design, correctness verification, benchmarks

	## 2. Why Concurrent Ordered Indexes Are Hard
	- Maintaining order + range scans rules out simple approaches
	- The specific problem: splits/merges modify both the LeafList
	and MetaTrieHT simultaneously — you can't just add a lock without
	serializing reads
	- Three categories of operations with different needs (from the paper §2.5):
	reads, local writes, structural writes — motivates why one strategy doesn't fit all

	## 3. The Design

	### Lock-Free Reads: Dual Tables + QSBR
	- Two copies of MetaTrieHT; writers modify the inactive one, then atomically flip
	- QSBR is how the writer knows it's safe to touch the old table afterwards:
	threads signal quiescent states instead of reference counting — much cheaper
	- This is why registerThread() / unregisterThread() are required

	### Fine-Grained Locking on Leaf Nodes
	- Per-leaf locks keep contention localised to the nodes actually being modified
	- Version numbers detect stale reads and trigger lightweight retries
	rather than blocking

	## 4. Testing Correctness with Lincheck
	- Concurrent bugs are invisible to unit tests and stress tests
	- What linearizability means and why it matters
	- How Lincheck works: specify correct sequential behaviour (TreeMap),
	then verify every concurrent execution matches it
	- Small key range + tiny leaf node size = constant splits/merges =
	maximum bug exposure
	- It found many real bugs — this was the most valuable part of the process

	## 5. Benchmark Results
	- Setup: Java 21, 1+1 through 8+8 threads, vs ConcurrentSkipListMap
	- Put + Get: consistently faster, especially String keys
	- Put + Scan: strong PUT advantage; honest note on numeric key SCAN trade-off
	- Scalability: linear to ~4+4 threads, then plateau — advantage maintained throughout
	- Charts

	## 6. Trade-offs and Lessons Learned
	- Explicit thread registration: a deliberate choice, not an oversight
	- Numeric key SCAN under write pressure: the one area where
	ConcurrentSkipListMap has an edge
	- Lincheck should be in your toolkit for any concurrent data structure

	## 7. Closing
	- Links to GitHub, paper, previous posts
	- Next: persistence support
No results found