Skip to content

Instantly share code, notes, and snippets.

@andhapp
Forked from jstorimer/0-README.md
Created December 29, 2013 18:24
Show Gist options
  • Select an option

  • Save andhapp/8173183 to your computer and use it in GitHub Desktop.

Select an option

Save andhapp/8173183 to your computer and use it in GitHub Desktop.
This is a proof-of-concept of a couple of concurrent data structures
written in Ruby.
The implementations are heavily commented for those interested. There
are benchmarks (with results) included below. The results are interesting,
but, as always, take with a grain of salt.
## Data structures
AtomicLinkedQueue is a lock-free queue, built on atomic CAS operations.
It doesn't use any mutexes.
TwoLockLinkedQueue is a lock-based queue, but with separate locks for the
head + tail. This means there can be lots of contention when pushing to
the queue, but little when it comes to popping off the queue.
Both of these implementations are unbounded queues, with no blocking
operations.
See the individual files below for more about their implementation.
## Background
For those unfamiliar with the atomic compare-and-swap operation, here's
a brief outline of how it operates.
Create an Atomic instance that holds a value. (This comes from the [atomic
rubygem](https://github.com/headius/atomic-ruby).
item = Atomic.new('value')
The workhorse method is #compare_and_set(old, new). You pass it the value
that you *think* it currently holds, and the value you want to set it to.
If it does hold the expected value, it's set to your new value. Otherwise,
it returns false. In this implementation, when that happens, the operation
is re-started, over and over, until it can succeed.
This compare-and-set operation is hardware-supported and works across
Ruby implementations thanks to the 'atomic' gem. The hardware support
provides the atomic guarantee. Without this guarantee, it would be possible
to read a value, then have another thread update the value before you can,
then your thread would overwrite that value. This stuff is complicated.
## Due credit
I can't take credit for this implementation. This is an implementation
of the pseudo-code from "Simple, Fast, and Practical Non-Blocking and
Blocking Concurrent Queue Algorithms"[1], with some hints from Java's
implementation of java.util.concurrent.ConcurrentLinkedQueue[2].
1. http://www.cs.rochester.edu/research/synchronization/pseudocode/queues.html
2. http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/concurrent/ConcurrentLinkedQueue.java
## Benchmark results
These are the results of running the `benchmark_concurrent_reads+writes.rb` file against Rubinius 2.0.0-rc1 and JRuby 1.7.3. Any results from MRI will be superficial because Of the global lock, so I omitted it completely.
All of the benchmarks were run against Queue (a fully synchronized queue from Ruby's std lib), TwoLockLinkedQueue (my implementation below using two different locks), and AtomicLinkedQueue (the lock-free implementation).
### Rubinius results
$ rbx benchmark_concurrent_reads+writes.rb
user system total real
Queue - more writers than readers 4.963446 4.536519 9.499965 ( 6.903411)
Queue - more readers than writers 5.774266 5.047607 10.821873 ( 7.791965)
Queue - equal writers and readers 5.398079 6.132016 11.530095 ( 8.328646)
user system total real
TwoLockLinkedQueue - more writers than readers 5.585377 7.506778 13.092155 ( 6.561008)
TwoLockLinkedQueue - more readers than writers 5.312031 8.011661 13.323692 ( 6.503712)
TwoLockLinkedQueue - equal writers and readers 6.560957 10.342055 16.903012 ( 7.134648)
user system total real
AtomicLinkedQueue - more writers than readers 6.380076 0.114960 6.495036 ( 2.058872)
AtomicLinkedQueue - more readers than writers 2.369595 0.065632 2.435227 ( 0.706512)
AtomicLinkedQueue - equal writers and readers 3.712438 0.123236 3.835674 ( 1.225167)
$ rbx benchmark_separate_reads+writes.rb
Queue - fill, then empty - 10k 0.191729 0.127249 0.318978 ( 0.220572)
Queue - fill, then empty - 100k 1.466753 1.268979 2.735732 ( 1.848341)
Queue - fill, then empty - 1mm 11.818961 12.738940 24.557901 ( 18.820045)
user system total real
TwoLockLinkedQueue - fill, then empty - 10k 0.153434 0.159268 0.312702 ( 0.210763)
TwoLockLinkedQueue - fill, then empty - 100k 1.335991 1.537175 2.873166 ( 2.124443)
TwoLockLinkedQueue - fill, then empty - 1mm 11.675556 16.168307 27.843863 ( 21.965406)
user system total real
AtomicLinkedQueue - fill, then empty - 10k 0.230940 0.007993 0.238933 ( 0.037483)
AtomicLinkedQueue - fill, then empty - 100k 1.692784 0.021976 1.714760 ( 0.364502)
AtomicLinkedQueue - fill, then empty - 1mm 11.247611 0.196520 11.444131 ( 5.000861)
### JRuby results
$ jruby benchmark_concurrent_reads+writes.rb
user system total real
Queue - more writers than readers 0.436000 0.000000 0.436000 ( 0.436000)
Queue - more readers than writers 8.962000 0.000000 8.962000 ( 8.962000)
Queue - equal writers and readers 1.216000 0.000000 1.216000 ( 1.216000)
user system total real
TwoLockLinkedQueue - more writers than readers 1.690000 0.000000 1.690000 ( 1.690000)
TwoLockLinkedQueue - more readers than writers 1.436000 0.000000 1.436000 ( 1.436000)
TwoLockLinkedQueue - equal writers and readers 1.800000 0.000000 1.800000 ( 1.800000)
user system total real
AtomicLinkedQueue - more writers than readers 2.736000 0.000000 2.736000 ( 2.736000)
AtomicLinkedQueue - more readers than writers 1.426000 0.000000 1.426000 ( 1.426000)
AtomicLinkedQueue - equal writers and readers 1.039000 0.000000 1.039000 ( 1.039000)
$ jruby benchmark_separate_reads+writes.rb
user system total real
Queue - fill, then empty - 10k 0.090000 0.000000 0.090000 ( 0.090000)
Queue - fill, then empty - 100k 0.161000 0.000000 0.161000 ( 0.161000)
Queue - fill, then empty - 1mm 1.111000 0.000000 1.111000 ( 1.111000)
user system total real
TwoLockLinkedQueue - fill, then empty - 10k 0.182000 0.000000 0.182000 ( 0.182000)
TwoLockLinkedQueue - fill, then empty - 100k 0.377000 0.000000 0.377000 ( 0.377000)
TwoLockLinkedQueue - fill, then empty - 1mm 4.789000 0.000000 4.789000 ( 4.789000)
user system total real
AtomicLinkedQueue - fill, then empty - 10k 0.785000 0.000000 0.785000 ( 0.785000)
AtomicLinkedQueue - fill, then empty - 100k 1.554000 0.000000 1.554000 ( 1.554000)
AtomicLinkedQueue - fill, then empty - 1mm 5.555000 0.000000 5.555000 ( 5.554000)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment