Skip to content

Instantly share code, notes, and snippets.

@andhapp
Forked from jstorimer/0-README.md
Created December 29, 2013 18:24
Show Gist options
  • Select an option

  • Save andhapp/8173183 to your computer and use it in GitHub Desktop.

Select an option

Save andhapp/8173183 to your computer and use it in GitHub Desktop.

Revisions

  1. @jstorimer jstorimer revised this gist Apr 4, 2013. 3 changed files with 73 additions and 69 deletions.
    86 changes: 43 additions & 43 deletions 0-README.md
    Original file line number Diff line number Diff line change
    @@ -66,59 +66,59 @@ The `benchmark_separate_reads+writes.rb` benchmark first focuses on filling up t

    $ rbx benchmark_concurrent_reads+writes.rb
    user system total real
    Queue - more writers than readers 4.963446 4.536519 9.499965 ( 6.903411)
    Queue - more readers than writers 5.774266 5.047607 10.821873 ( 7.791965)
    Queue - equal writers and readers 5.398079 6.132016 11.530095 ( 8.328646)
    user system total real
    TwoLockLinkedQueue - more writers than readers 5.585377 7.506778 13.092155 ( 6.561008)
    TwoLockLinkedQueue - more readers than writers 5.312031 8.011661 13.323692 ( 6.503712)
    TwoLockLinkedQueue - equal writers and readers 6.560957 10.342055 16.903012 ( 7.134648)
    user system total real
    AtomicLinkedQueue - more writers than readers 6.380076 0.114960 6.495036 ( 2.058872)
    AtomicLinkedQueue - more readers than writers 2.369595 0.065632 2.435227 ( 0.706512)
    AtomicLinkedQueue - equal writers and readers 3.712438 0.123236 3.835674 ( 1.225167)
    Queue - more writers than readers 4.178510 4.892144 9.070654 ( 6.677201)
    Queue - more readers than writers 5.427958 4.914869 10.342827 ( 7.545760)
    Queue - equal writers and readers 5.313148 6.802285 12.115433 ( 8.720809)
    user system total real
    TwoLockLinkedQueue - more writers than readers 5.151256 7.610410 12.761666 ( 6.458769)
    TwoLockLinkedQueue - more readers than writers 5.395152 8.326897 13.722049 ( 6.568123)
    TwoLockLinkedQueue - equal writers and readers 6.641767 10.623600 17.265367 ( 7.297145)
    user system total real
    AtomicLinkedQueue - more writers than readers 2.897964 0.061956 2.959920 ( 0.717638)
    AtomicLinkedQueue - more readers than writers 2.814892 0.050590 2.865482 ( 0.596547)
    AtomicLinkedQueue - equal writers and readers 4.175097 0.086688 4.261785 ( 0.891113)
    user system total real

    $ rbx benchmark_separate_reads+writes.rb

    Queue - fill, then empty - 10k 0.191729 0.127249 0.318978 ( 0.220572)
    Queue - fill, then empty - 100k 1.466753 1.268979 2.735732 ( 1.848341)
    Queue - fill, then empty - 1mm 11.818961 12.738940 24.557901 ( 18.820045)
    user system total real
    TwoLockLinkedQueue - fill, then empty - 10k 0.153434 0.159268 0.312702 ( 0.210763)
    TwoLockLinkedQueue - fill, then empty - 100k 1.335991 1.537175 2.873166 ( 2.124443)
    TwoLockLinkedQueue - fill, then empty - 1mm 11.675556 16.168307 27.843863 ( 21.965406)
    Queue - fill, then empty - 10k 0.113160 0.120949 0.234109 ( 0.159911)
    Queue - fill, then empty - 100k 1.035808 1.174422 2.210230 ( 1.596514)
    Queue - fill, then empty - 1mm 11.258097 12.224407 23.482504 ( 17.325185)
    user system total real
    TwoLockLinkedQueue - fill, then empty - 10k 0.139143 0.172324 0.311467 ( 0.214725)
    TwoLockLinkedQueue - fill, then empty - 100k 1.312984 1.671349 2.984333 ( 2.233421)
    TwoLockLinkedQueue - fill, then empty - 1mm 12.175179 16.069279 28.244458 ( 22.541654)
    user system total real
    AtomicLinkedQueue - fill, then empty - 10k 0.230940 0.007993 0.238933 ( 0.037483)
    AtomicLinkedQueue - fill, then empty - 100k 1.692784 0.021976 1.714760 ( 0.364502)
    AtomicLinkedQueue - fill, then empty - 1mm 11.247611 0.196520 11.444131 ( 5.000861)
    AtomicLinkedQueue - fill, then empty - 10k 0.071836 0.003300 0.075136 ( 0.009811)
    AtomicLinkedQueue - fill, then empty - 100k 0.645546 0.011743 0.657289 ( 0.147805)
    AtomicLinkedQueue - fill, then empty - 1mm 7.075495 0.108397 7.183892 ( 1.663006)

    ### JRuby results

    $ jruby benchmark_concurrent_reads+writes.rb
    user system total real
    Queue - more writers than readers 0.436000 0.000000 0.436000 ( 0.436000)
    Queue - more readers than writers 8.962000 0.000000 8.962000 ( 8.962000)
    Queue - equal writers and readers 1.216000 0.000000 1.216000 ( 1.216000)
    Queue - more writers than readers 0.224000 0.000000 0.224000 ( 0.224000)
    Queue - more readers than writers 8.529000 0.000000 8.529000 ( 8.529000)
    Queue - equal writers and readers 0.263000 0.000000 0.263000 ( 0.262000)
    user system total real
    TwoLockLinkedQueue - more writers than readers 1.690000 0.000000 1.690000 ( 1.690000)
    TwoLockLinkedQueue - more readers than writers 1.436000 0.000000 1.436000 ( 1.436000)
    TwoLockLinkedQueue - equal writers and readers 1.800000 0.000000 1.800000 ( 1.800000)
    TwoLockLinkedQueue - more writers than readers 1.492000 0.000000 1.492000 ( 1.492000)
    TwoLockLinkedQueue - more readers than writers 1.788000 0.000000 1.788000 ( 1.788000)
    TwoLockLinkedQueue - equal writers and readers 2.205000 0.000000 2.205000 ( 2.205000)
    user system total real
    AtomicLinkedQueue - more writers than readers 2.736000 0.000000 2.736000 ( 2.736000)
    AtomicLinkedQueue - more readers than writers 1.426000 0.000000 1.426000 ( 1.426000)
    AtomicLinkedQueue - equal writers and readers 1.039000 0.000000 1.039000 ( 1.039000)
    AtomicLinkedQueue - more writers than readers 1.086000 0.000000 1.086000 ( 1.086000)
    AtomicLinkedQueue - more readers than writers 0.571000 0.000000 0.571000 ( 0.572000)
    AtomicLinkedQueue - equal writers and readers 1.049000 0.000000 1.049000 ( 1.049000)

    $ jruby benchmark_separate_reads+writes.rb
    user system total real
    Queue - fill, then empty - 10k 0.090000 0.000000 0.090000 ( 0.090000)
    Queue - fill, then empty - 100k 0.161000 0.000000 0.161000 ( 0.161000)
    Queue - fill, then empty - 1mm 1.111000 0.000000 1.111000 ( 1.111000)
    user system total real
    TwoLockLinkedQueue - fill, then empty - 10k 0.182000 0.000000 0.182000 ( 0.182000)
    TwoLockLinkedQueue - fill, then empty - 100k 0.377000 0.000000 0.377000 ( 0.377000)
    TwoLockLinkedQueue - fill, then empty - 1mm 4.789000 0.000000 4.789000 ( 4.789000)
    user system total real
    AtomicLinkedQueue - fill, then empty - 10k 0.785000 0.000000 0.785000 ( 0.785000)
    AtomicLinkedQueue - fill, then empty - 100k 1.554000 0.000000 1.554000 ( 1.554000)
    AtomicLinkedQueue - fill, then empty - 1mm 5.555000 0.000000 5.555000 ( 5.554000)

    Queue - fill, then empty - 10k 0.014000 0.000000 0.014000 ( 0.014000)
    Queue - fill, then empty - 100k 0.065000 0.000000 0.065000 ( 0.065000)
    Queue - fill, then empty - 1mm 0.744000 0.000000 0.744000 ( 0.744000)
    user system total real
    TwoLockLinkedQueue - fill, then empty - 10k 0.032000 0.000000 0.032000 ( 0.032000)
    TwoLockLinkedQueue - fill, then empty - 100k 0.337000 0.000000 0.337000 ( 0.337000)
    TwoLockLinkedQueue - fill, then empty - 1mm 4.640000 0.000000 4.640000 ( 4.640000)
    user system total real
    AtomicLinkedQueue - fill, then empty - 10k 0.016000 0.000000 0.016000 ( 0.016000)
    AtomicLinkedQueue - fill, then empty - 100k 0.162000 0.000000 0.162000 ( 0.162000)
    AtomicLinkedQueue - fill, then empty - 1mm 2.706000 0.000000 2.706000 ( 2.706000)
    24 changes: 13 additions & 11 deletions benchmark_concurrent_reads+writes.rb
    Original file line number Diff line number Diff line change
    @@ -68,19 +68,21 @@ def exercise(tg)
    $go = false
    end

    queue_klasses.each do |klass|
    Benchmark.bm(50) do |bm|
    queue = klass.new
    tg = setup(queue, thread_count, (thread_count * 0.6).to_i, iterations)
    bm.report("#{klass} - more writers than readers") { exercise(tg) }
    3.times do
    queue_klasses.each do |klass|
    Benchmark.bm(50) do |bm|
    queue = klass.new
    tg = setup(queue, thread_count, (thread_count * 0.6).to_i, iterations)
    bm.report("#{klass} - more writers than readers") { exercise(tg) }

    queue = klass.new
    tg = setup(queue, (thread_count * 0.6).to_i, thread_count, iterations)
    bm.report("#{klass} - more readers than writers") { exercise(tg) }
    queue = klass.new
    tg = setup(queue, (thread_count * 0.6).to_i, thread_count, iterations)
    bm.report("#{klass} - more readers than writers") { exercise(tg) }

    queue = klass.new
    tg = setup(queue, thread_count, thread_count, iterations)
    bm.report("#{klass} - equal writers and readers") { exercise(tg) }
    queue = klass.new
    tg = setup(queue, thread_count, thread_count, iterations)
    bm.report("#{klass} - equal writers and readers") { exercise(tg) }
    end
    end
    end

    32 changes: 17 additions & 15 deletions benchmark_separate_reads+writes.rb
    Original file line number Diff line number Diff line change
    @@ -64,22 +64,24 @@ def exercise(tg)
    $go = false
    end

    queue_klasses.each do |klass|
    Benchmark.bm(45) do |bm|
    queue = klass.new
    tg = setup(queue, thread_count, 10_000)
    bm.report("#{klass} - fill, then empty - 10k") { exercise(tg) }
    raise 'hell' unless queue.size.zero?

    queue = klass.new
    tg = setup(queue, thread_count, 100_000)
    bm.report("#{klass} - fill, then empty - 100k") { exercise(tg) }
    raise 'hell' unless queue.size.zero?
    3.times do
    queue_klasses.each do |klass|
    Benchmark.bm(45) do |bm|
    queue = klass.new
    tg = setup(queue, thread_count, 10_000)
    bm.report("#{klass} - fill, then empty - 10k") { exercise(tg) }
    raise 'hell' unless queue.size.zero?

    queue = klass.new
    tg = setup(queue, thread_count, 100_000)
    bm.report("#{klass} - fill, then empty - 100k") { exercise(tg) }
    raise 'hell' unless queue.size.zero?

    queue = klass.new
    tg = setup(queue, thread_count, 1_000_000)
    bm.report("#{klass} - fill, then empty - 1mm") { exercise(tg) }
    raise 'hell' unless queue.size.zero?
    queue = klass.new
    tg = setup(queue, thread_count, 1_000_000)
    bm.report("#{klass} - fill, then empty - 1mm") { exercise(tg) }
    raise 'hell' unless queue.size.zero?
    end
    end
    end

  2. @jstorimer jstorimer revised this gist Apr 4, 2013. 1 changed file with 4 additions and 11 deletions.
    15 changes: 4 additions & 11 deletions atomic_linked_queue.rb
    Original file line number Diff line number Diff line change
    @@ -6,8 +6,10 @@

    class AtomicLinkedQueue
    class Node
    attr_accessor :item

    def initialize(item, successor)
    @item = Atomic.new(item)
    @item = item
    @successor = Atomic.new(successor)
    end

    @@ -18,15 +20,6 @@ def successor
    def update_successor(old, new)
    @successor.compare_and_set(old, new)
    end

    def item
    @item.value
    end

    def update_item(thing)
    # this feels dangerous...
    @item.update { thing }
    end
    end

    def initialize
    @@ -97,7 +90,7 @@ def pop
    item = current_head_node.item

    if item != nil
    current_head_node.update_item(nil)
    current_head_node.item = nil
    end

    # return it, success!
  3. @jstorimer jstorimer revised this gist Apr 4, 2013. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion atomic_linked_queue.rb
    Original file line number Diff line number Diff line change
    @@ -50,7 +50,7 @@ def push(item)
    # if that tail was really the last node
    if current_tail_successor.nil?
    # if we can update the previous successor of tail to point to this new node
    if current_tail_node.update_successor(current_tail_successor, new_node)
    if current_tail_node.update_successor(nil, new_node)
    # then update tail to point to this node as well
    @tail.compare_and_set(current_tail_node, new_node)
    # and return
  4. @jstorimer jstorimer revised this gist Apr 4, 2013. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion 0-README.md
    Original file line number Diff line number Diff line change
    @@ -60,7 +60,7 @@ All of the benchmarks were run against `Queue` (a fully synchronized queue from

    The `benchmark_concurrent_reads+writes.rb` benchmark allocates some threads to write to the queue, and others to read. So there's always contention for both pushing and popping.

    The `benchmark_concurrent_reads+writes.rb` benchmark first focuses on filling up the queue to capacity, then emptying it completely. This focuses all of the contention on writing, then all of it on reading.
    The `benchmark_separate_reads+writes.rb` benchmark first focuses on filling up the queue to capacity, then emptying it completely. This focuses all of the contention on writing, then all of it on reading.

    ### Rubinius results

  5. @jstorimer jstorimer revised this gist Apr 3, 2013. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions atomic_linked_queue.rb
    Original file line number Diff line number Diff line change
    @@ -24,6 +24,7 @@ def item
    end

    def update_item(thing)
    # this feels dangerous...
    @item.update { thing }
    end
    end
  6. @jstorimer jstorimer revised this gist Apr 3, 2013. 1 changed file with 3 additions and 2 deletions.
    5 changes: 3 additions & 2 deletions atomic_linked_queue.rb
    Original file line number Diff line number Diff line change
    @@ -97,9 +97,10 @@ def pop

    if item != nil
    current_head_node.update_item(nil)
    # return it, success!
    return item
    end

    # return it, success!
    return item

    # else
    # try again
  7. @jstorimer jstorimer revised this gist Apr 3, 2013. 4 changed files with 355 additions and 0 deletions.
    125 changes: 125 additions & 0 deletions atomic_linked_queue.rb
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,125 @@
    # This is a proof-of-concept of a concurrent, lock-free FIFO data
    # structure written in Ruby. It leverages atomic updates, rather than
    # lock-based synchronization.

    require 'atomic'

    class AtomicLinkedQueue
    class Node
    def initialize(item, successor)
    @item = Atomic.new(item)
    @successor = Atomic.new(successor)
    end

    def successor
    @successor.value
    end

    def update_successor(old, new)
    @successor.compare_and_set(old, new)
    end

    def item
    @item.value
    end

    def update_item(thing)
    @item.update { thing }
    end
    end

    def initialize
    dummy_node = Node.new(:dummy, nil)

    @head = Atomic.new(dummy_node)
    @tail = Atomic.new(dummy_node)
    end

    def push(item)
    # allocate a new node with the item embedded
    new_node = Node.new(item, nil)

    # keep trying until the operation succeeds
    loop do
    current_tail_node = @tail.value
    current_tail_successor = current_tail_node.successor

    # if our stored tail is still the current tail
    if current_tail_node == @tail.value
    # if that tail was really the last node
    if current_tail_successor.nil?
    # if we can update the previous successor of tail to point to this new node
    if current_tail_node.update_successor(current_tail_successor, new_node)
    # then update tail to point to this node as well
    @tail.compare_and_set(current_tail_node, new_node)
    # and return
    return true
    # else, start the loop over
    end
    else
    # in this case, the tail ref we had wasn't the real tail
    # so we try to set its successor as the real tail, then start the loop again
    @tail.compare_and_set(current_tail_node, current_tail_successor)
    end
    end
    end
    end

    def pop
    # retry until some value can be returned
    loop do
    # the value in @head is just a dummy node that always sits in that position,
    # the real 'head' is in its successor
    current_dummy_node = @head.value
    current_tail_node = @tail.value

    current_head_node = current_dummy_node.successor

    # if our local head is still consistent with the head node, continue
    # otherwise, start over
    if current_dummy_node == @head.value
    # if either the queue is empty, or falling behind
    if current_dummy_node == current_tail_node
    # if there's nothing after the 'dummy' head node
    if current_head_node.nil?
    # just return nil
    return nil
    else
    # here the head element succeeding head is not nil, but the head and tail are equal
    # so tail is falling behind, update it, then start over
    @tail.compare_and_set(current_tail_node, current_head_node)
    end
    # the queue isn't empty
    # if we can set the dummy head to the 'real' head, we're free to return the value in that real head, success
    elsif @head.compare_and_set(current_dummy_node, current_head_node)
    # grab the item from the popped node
    item = current_head_node.item

    if item != nil
    current_head_node.update_item(nil)
    # return it, success!
    return item
    end

    # else
    # try again
    end
    end
    end
    end

    def size
    successor = @head.value.successor
    count = 0

    loop do
    break if successor.nil?

    current_node = successor
    successor = current_node.successor
    count += 1
    end

    count
    end
    end
    86 changes: 86 additions & 0 deletions benchmark_concurrent_reads+writes.rb
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,86 @@
    require 'benchmark'
    require 'thread'
    require_relative 'atomic_linked_queue'
    require_relative 'two_lock_linked_queue'

    thread_count = 50
    iterations = 10_000

    queue_klasses = [Queue, TwoLockLinkedQueue, AtomicLinkedQueue]
    Thread.abort_on_exception = true

    # this one tells all the threads when to start
    $go = false

    def setup(queue, writer_thread_count, reader_thread_count, iterations)
    tg = ThreadGroup.new

    # spawn writer threads
    writer_thread_count.times do
    t = Thread.new do
    # wait until the bm starts to do the work. This should
    # minimize variance.
    nil until $go
    iterations.times do
    queue.push('item')
    end
    end

    tg.add(t)
    end

    # spawn reader threads
    if queue.class == Queue
    # the Queue class gets special behaviour because its #pop
    # method is blocking by default.
    reader_thread_count.times do
    t = Thread.new do
    nil until $go
    iterations.times do
    begin
    queue.pop(:nonblocking)
    rescue
    end
    end
    end

    tg.add(t)
    end
    else
    reader_thread_count.times do
    t = Thread.new do
    nil until $go
    iterations.times do
    queue.pop
    end
    end

    tg.add(t)
    end
    end

    tg
    end

    def exercise(tg)
    $go = true
    tg.list.each(&:join)
    $go = false
    end

    queue_klasses.each do |klass|
    Benchmark.bm(50) do |bm|
    queue = klass.new
    tg = setup(queue, thread_count, (thread_count * 0.6).to_i, iterations)
    bm.report("#{klass} - more writers than readers") { exercise(tg) }

    queue = klass.new
    tg = setup(queue, (thread_count * 0.6).to_i, thread_count, iterations)
    bm.report("#{klass} - more readers than writers") { exercise(tg) }

    queue = klass.new
    tg = setup(queue, thread_count, thread_count, iterations)
    bm.report("#{klass} - equal writers and readers") { exercise(tg) }
    end
    end

    85 changes: 85 additions & 0 deletions benchmark_separate_reads+writes.rb
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,85 @@
    require 'benchmark'
    require 'thread'
    require_relative 'atomic_linked_queue'
    require_relative 'two_lock_linked_queue'

    thread_count = 50

    queue_klasses = [Queue, TwoLockLinkedQueue, AtomicLinkedQueue]
    Thread.abort_on_exception = true

    # this one tells all the threads when to start
    $go = false

    def setup(queue, thread_count, queue_length)
    tg = ThreadGroup.new

    if queue.class == Queue
    thread_count.times do
    t = Thread.new do
    # wait until the bm starts to do the work. This should
    # minimize variance.
    nil until $go
    (queue_length / thread_count).to_i.times do
    queue.push('item')
    end

    loop do
    begin
    result = queue.pop(:nonblock)
    rescue => e
    break
    end
    end
    end

    tg.add(t)
    end

    else
    thread_count.times do
    t = Thread.new do
    nil until $go
    (queue_length / thread_count).to_i.times do
    queue.push('item')
    end

    result = true
    until result.nil?
    result = queue.pop
    end
    end

    tg.add(t)
    end
    end

    GC.start
    tg
    end

    def exercise(tg)
    $go = true
    tg.list.each(&:join)
    $go = false
    end

    queue_klasses.each do |klass|
    Benchmark.bm(45) do |bm|
    queue = klass.new
    tg = setup(queue, thread_count, 10_000)
    bm.report("#{klass} - fill, then empty - 10k") { exercise(tg) }
    raise 'hell' unless queue.size.zero?

    queue = klass.new
    tg = setup(queue, thread_count, 100_000)
    bm.report("#{klass} - fill, then empty - 100k") { exercise(tg) }
    raise 'hell' unless queue.size.zero?

    queue = klass.new
    tg = setup(queue, thread_count, 1_000_000)
    bm.report("#{klass} - fill, then empty - 1mm") { exercise(tg) }
    raise 'hell' unless queue.size.zero?
    end
    end

    59 changes: 59 additions & 0 deletions two_lock_linked_queue.rb
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,59 @@
    require 'thread'

    class TwoLockLinkedQueue
    # This Node doesn't need atomic updates, it assumes
    # that you're modifying it while holding a lock
    Node = Struct.new(:item, :successor)

    def initialize
    dummy_node = Node.new(:dummy, nil)

    @head_node = dummy_node
    @tail_node = dummy_node

    @head_lock = Mutex.new
    @tail_lock = Mutex.new
    end

    def push(item)
    # allocate a new node with the item embedded
    new_node = Node.new(item, nil)

    @tail_lock.synchronize do
    # update the successor of the current tail to point to the new node
    @tail_node.successor = new_node
    @tail_node = new_node
    end
    end

    def pop
    @head_lock.synchronize do
    dummy_node = @head_node
    head = @head_node.successor

    if head.nil? # then queue was empty
    return nil
    else
    # the current head becomes the new 'dummy' head
    @head_node = head
    # return its value
    return head.item
    end
    end
    end

    def size
    successor = @head_node.successor
    count = 0

    loop do
    break if successor.nil?

    current_node = successor
    successor = current_node.successor
    count += 1
    end

    count
    end
    end
  8. @jstorimer jstorimer revised this gist Apr 3, 2013. 1 changed file with 3 additions and 3 deletions.
    6 changes: 3 additions & 3 deletions 0-README.md
    Original file line number Diff line number Diff line change
    @@ -7,10 +7,10 @@ but, as always, take with a grain of salt.

    ## Data structures

    AtomicLinkedQueue is a lock-free queue, built on atomic CAS operations.
    `AtomicLinkedQueue` is a lock-free queue, built on atomic CAS operations.
    It doesn't use any mutexes.

    TwoLockLinkedQueue is a lock-based queue, but with separate locks for the
    `TwoLockLinkedQueue` is a lock-based queue, but with separate locks for the
    head + tail. This means there can be lots of contention when pushing to
    the queue, but little when it comes to popping off the queue.

    @@ -56,7 +56,7 @@ implementation of java.util.concurrent.ConcurrentLinkedQueue[2].

    These are the results of running the included benchmarks against Rubinius 2.0.0-rc1 and JRuby 1.7.3. Any results from MRI will be superficial due to the global lock, so I omitted it completely.

    All of the benchmarks were run against Queue (a fully synchronized queue from Ruby's std lib), TwoLockLinkedQueue (implementation below using two different locks), and AtomicLinkedQueue (the lock-free implementation).
    All of the benchmarks were run against `Queue` (a fully synchronized queue from Ruby's std lib), `TwoLockLinkedQueue` (implementation below using two different locks), and `AtomicLinkedQueue` (the lock-free implementation).

    The `benchmark_concurrent_reads+writes.rb` benchmark allocates some threads to write to the queue, and others to read. So there's always contention for both pushing and popping.

  9. @jstorimer jstorimer revised this gist Apr 3, 2013. 1 changed file with 6 additions and 2 deletions.
    8 changes: 6 additions & 2 deletions 0-README.md
    Original file line number Diff line number Diff line change
    @@ -54,9 +54,13 @@ implementation of java.util.concurrent.ConcurrentLinkedQueue[2].

    ## Benchmark results

    These are the results of running the `benchmark_concurrent_reads+writes.rb` file against Rubinius 2.0.0-rc1 and JRuby 1.7.3. Any results from MRI will be superficial because Of the global lock, so I omitted it completely.
    These are the results of running the included benchmarks against Rubinius 2.0.0-rc1 and JRuby 1.7.3. Any results from MRI will be superficial due to the global lock, so I omitted it completely.

    All of the benchmarks were run against Queue (a fully synchronized queue from Ruby's std lib), TwoLockLinkedQueue (my implementation below using two different locks), and AtomicLinkedQueue (the lock-free implementation).
    All of the benchmarks were run against Queue (a fully synchronized queue from Ruby's std lib), TwoLockLinkedQueue (implementation below using two different locks), and AtomicLinkedQueue (the lock-free implementation).

    The `benchmark_concurrent_reads+writes.rb` benchmark allocates some threads to write to the queue, and others to read. So there's always contention for both pushing and popping.

    The `benchmark_concurrent_reads+writes.rb` benchmark first focuses on filling up the queue to capacity, then emptying it completely. This focuses all of the contention on writing, then all of it on reading.

    ### Rubinius results

  10. @jstorimer jstorimer revised this gist Apr 3, 2013. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion 0-README.md
    Original file line number Diff line number Diff line change
    @@ -34,7 +34,7 @@ that you *think* it currently holds, and the value you want to set it to.

    If it does hold the expected value, it's set to your new value. Otherwise,
    it returns false. In this implementation, when that happens, the operation
    is re-started, over and over, until it can succeed.
    is re-started, over and over, until it succeeds.

    This compare-and-set operation is hardware-supported and works across
    Ruby implementations thanks to the 'atomic' gem. The hardware support
  11. @jstorimer jstorimer revised this gist Apr 3, 2013. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion 0-README.md
    Original file line number Diff line number Diff line change
    @@ -25,7 +25,7 @@ For those unfamiliar with the atomic compare-and-swap operation, here's
    a brief outline of how it operates.

    Create an Atomic instance that holds a value. (This comes from the [atomic
    rubygem](https://github.com/headius/atomic-ruby).
    rubygem](https://github.com/headius/atomic-ruby)).

    item = Atomic.new('value')

  12. @jstorimer jstorimer renamed this gist Apr 3, 2013. 1 changed file with 0 additions and 0 deletions.
    File renamed without changes.
  13. @jstorimer jstorimer created this gist Apr 3, 2013.
    120 changes: 120 additions & 0 deletions gistfile1
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,120 @@
    This is a proof-of-concept of a couple of concurrent data structures
    written in Ruby.

    The implementations are heavily commented for those interested. There
    are benchmarks (with results) included below. The results are interesting,
    but, as always, take with a grain of salt.

    ## Data structures

    AtomicLinkedQueue is a lock-free queue, built on atomic CAS operations.
    It doesn't use any mutexes.

    TwoLockLinkedQueue is a lock-based queue, but with separate locks for the
    head + tail. This means there can be lots of contention when pushing to
    the queue, but little when it comes to popping off the queue.

    Both of these implementations are unbounded queues, with no blocking
    operations.

    See the individual files below for more about their implementation.

    ## Background

    For those unfamiliar with the atomic compare-and-swap operation, here's
    a brief outline of how it operates.

    Create an Atomic instance that holds a value. (This comes from the [atomic
    rubygem](https://github.com/headius/atomic-ruby).

    item = Atomic.new('value')

    The workhorse method is #compare_and_set(old, new). You pass it the value
    that you *think* it currently holds, and the value you want to set it to.

    If it does hold the expected value, it's set to your new value. Otherwise,
    it returns false. In this implementation, when that happens, the operation
    is re-started, over and over, until it can succeed.

    This compare-and-set operation is hardware-supported and works across
    Ruby implementations thanks to the 'atomic' gem. The hardware support
    provides the atomic guarantee. Without this guarantee, it would be possible
    to read a value, then have another thread update the value before you can,
    then your thread would overwrite that value. This stuff is complicated.

    ## Due credit

    I can't take credit for this implementation. This is an implementation
    of the pseudo-code from "Simple, Fast, and Practical Non-Blocking and
    Blocking Concurrent Queue Algorithms"[1], with some hints from Java's
    implementation of java.util.concurrent.ConcurrentLinkedQueue[2].

    1. http://www.cs.rochester.edu/research/synchronization/pseudocode/queues.html
    2. http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/concurrent/ConcurrentLinkedQueue.java

    ## Benchmark results

    These are the results of running the `benchmark_concurrent_reads+writes.rb` file against Rubinius 2.0.0-rc1 and JRuby 1.7.3. Any results from MRI will be superficial because Of the global lock, so I omitted it completely.

    All of the benchmarks were run against Queue (a fully synchronized queue from Ruby's std lib), TwoLockLinkedQueue (my implementation below using two different locks), and AtomicLinkedQueue (the lock-free implementation).

    ### Rubinius results

    $ rbx benchmark_concurrent_reads+writes.rb
    user system total real
    Queue - more writers than readers 4.963446 4.536519 9.499965 ( 6.903411)
    Queue - more readers than writers 5.774266 5.047607 10.821873 ( 7.791965)
    Queue - equal writers and readers 5.398079 6.132016 11.530095 ( 8.328646)
    user system total real
    TwoLockLinkedQueue - more writers than readers 5.585377 7.506778 13.092155 ( 6.561008)
    TwoLockLinkedQueue - more readers than writers 5.312031 8.011661 13.323692 ( 6.503712)
    TwoLockLinkedQueue - equal writers and readers 6.560957 10.342055 16.903012 ( 7.134648)
    user system total real
    AtomicLinkedQueue - more writers than readers 6.380076 0.114960 6.495036 ( 2.058872)
    AtomicLinkedQueue - more readers than writers 2.369595 0.065632 2.435227 ( 0.706512)
    AtomicLinkedQueue - equal writers and readers 3.712438 0.123236 3.835674 ( 1.225167)

    $ rbx benchmark_separate_reads+writes.rb

    Queue - fill, then empty - 10k 0.191729 0.127249 0.318978 ( 0.220572)
    Queue - fill, then empty - 100k 1.466753 1.268979 2.735732 ( 1.848341)
    Queue - fill, then empty - 1mm 11.818961 12.738940 24.557901 ( 18.820045)
    user system total real
    TwoLockLinkedQueue - fill, then empty - 10k 0.153434 0.159268 0.312702 ( 0.210763)
    TwoLockLinkedQueue - fill, then empty - 100k 1.335991 1.537175 2.873166 ( 2.124443)
    TwoLockLinkedQueue - fill, then empty - 1mm 11.675556 16.168307 27.843863 ( 21.965406)
    user system total real
    AtomicLinkedQueue - fill, then empty - 10k 0.230940 0.007993 0.238933 ( 0.037483)
    AtomicLinkedQueue - fill, then empty - 100k 1.692784 0.021976 1.714760 ( 0.364502)
    AtomicLinkedQueue - fill, then empty - 1mm 11.247611 0.196520 11.444131 ( 5.000861)

    ### JRuby results

    $ jruby benchmark_concurrent_reads+writes.rb
    user system total real
    Queue - more writers than readers 0.436000 0.000000 0.436000 ( 0.436000)
    Queue - more readers than writers 8.962000 0.000000 8.962000 ( 8.962000)
    Queue - equal writers and readers 1.216000 0.000000 1.216000 ( 1.216000)
    user system total real
    TwoLockLinkedQueue - more writers than readers 1.690000 0.000000 1.690000 ( 1.690000)
    TwoLockLinkedQueue - more readers than writers 1.436000 0.000000 1.436000 ( 1.436000)
    TwoLockLinkedQueue - equal writers and readers 1.800000 0.000000 1.800000 ( 1.800000)
    user system total real
    AtomicLinkedQueue - more writers than readers 2.736000 0.000000 2.736000 ( 2.736000)
    AtomicLinkedQueue - more readers than writers 1.426000 0.000000 1.426000 ( 1.426000)
    AtomicLinkedQueue - equal writers and readers 1.039000 0.000000 1.039000 ( 1.039000)

    $ jruby benchmark_separate_reads+writes.rb
    user system total real
    Queue - fill, then empty - 10k 0.090000 0.000000 0.090000 ( 0.090000)
    Queue - fill, then empty - 100k 0.161000 0.000000 0.161000 ( 0.161000)
    Queue - fill, then empty - 1mm 1.111000 0.000000 1.111000 ( 1.111000)
    user system total real
    TwoLockLinkedQueue - fill, then empty - 10k 0.182000 0.000000 0.182000 ( 0.182000)
    TwoLockLinkedQueue - fill, then empty - 100k 0.377000 0.000000 0.377000 ( 0.377000)
    TwoLockLinkedQueue - fill, then empty - 1mm 4.789000 0.000000 4.789000 ( 4.789000)
    user system total real
    AtomicLinkedQueue - fill, then empty - 10k 0.785000 0.000000 0.785000 ( 0.785000)
    AtomicLinkedQueue - fill, then empty - 100k 1.554000 0.000000 1.554000 ( 1.554000)
    AtomicLinkedQueue - fill, then empty - 1mm 5.555000 0.000000 5.555000 ( 5.554000)