@folkertdev
Last active October 23, 2024 07:54
zlib-rs labeled match benchmarks

build the toolchain

A proof-of-concept implementation can be found at https://github.com/trifectatechfoundation/rust/tree/labeled-match. Build it with ./x build, then register the resulting compiler as a rustup toolchain named stage1. After that, cargo +stage1 build should use a compiler with labeled match available.

run the benchmark

git clone https://github.com/trifectatechfoundation/zlib-rs.git
cd zlib-rs
git checkout len-as-match
sh replicate-labeled-match-benchmarks.sh

This runs 4 benchmarks:

  • baseline: the current zlib-rs main branch approach using tail calls
  • loop-plus-match: standard approach using a loop and match; suffers from branch misprediction
  • labeled-match-len: the len function and friends now use labeled match
  • labeled-match-fast: the len function and friends, and the inflate_fast_help function, now use labeled match
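To make the loop-plus-match variant concrete, here is a minimal sketch in stable Rust. The State enum and the run function are hypothetical and are not taken from zlib-rs; they only illustrate the shape of the control flow. Every state transition goes back through the single match at the top of the loop, which is the dispatch point the list above describes as prone to branch misprediction. The idea behind labeled match, as I understand the proof of concept, is that an arm can continue directly to the arm for the next state instead of returning to that shared dispatch.

```rust
// Hypothetical states, loosely modeled on an inflate-style decoder.
#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    Len,
    Dist,
    Done,
}

/// Runs a toy state machine in the loop-plus-match style and returns
/// how many times control passed through the shared `match` dispatch.
fn run(mut remaining: u32) -> u32 {
    let mut state = State::Len;
    let mut dispatches = 0;
    loop {
        dispatches += 1;
        // Every transition, whatever its source state, lands on this one `match`.
        match state {
            State::Len => {
                // Pick the next state, then go around the loop to dispatch on it.
                state = if remaining == 0 { State::Done } else { State::Dist };
            }
            State::Dist => {
                remaining -= 1;
                state = State::Len;
            }
            State::Done => return dispatches,
        }
    }
}
```

Calling run(3) passes through the dispatch 8 times: each Len/Dist pair costs two dispatches, plus the final Len and Done.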

results

Mostly what we see is that labeled match gives significant speedups for small chunk sizes. For larger chunk sizes, the results are less clear. I believe the real result there is net-zero, but we clearly need to perform some further tuning.

Note in particular how loop-plus-match works well for small and large chunk sizes, but terribly for medium ones: with the chunk parameter set to 7, its wall time is 39.3% worse than baseline.
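When reading the tables that follow, it helps to know how the delta column relates to the means. The sketch below assumes the delta is the relative change of a benchmark's mean versus the mean of Benchmark 1 in the same group. That assumption is consistent with the numbers shown, but it is inferred from the output, not confirmed by the benchmarking tool. The function name relative_delta_percent is made up for this example.

```rust
/// Relative change of `candidate` versus `baseline`, in percent.
/// Negative values mean the candidate measured lower than the baseline.
fn relative_delta_percent(baseline: f64, candidate: f64) -> f64 {
    (candidate - baseline) / baseline * 100.0
}
```

For the wall_time rows with chunk parameter 7, relative_delta_percent(46.5, 64.8) comes out at roughly 39.35, in line with the reported +39.3% for loop-plus-match. The small difference is expected because the displayed means are already rounded.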

Benchmark 1 (69 runs): /tmp/uncompress-baseline rs-chunked 4 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          72.5ms ± 2.46ms    71.0ms … 91.0ms          7 (10%)        0%
  peak_rss           24.1MB ± 77.8KB    23.9MB … 24.1MB          0 ( 0%)        0%
  cpu_cycles          294M  ± 9.90M      291M  …  371M           7 (10%)        0%
  instructions        914M  ±  274       914M  …  914M           0 ( 0%)        0%
  cache_references   3.04M  ±  519K     2.69M  … 6.09M           4 ( 6%)        0%
  cache_misses        156K  ± 39.1K      126K  …  463K           1 ( 1%)        0%
  branch_misses      4.09M  ± 10.8K     4.08M  … 4.17M           5 ( 7%)        0%
Benchmark 2 (71 runs): /tmp/loop-plus-match rs-chunked 4 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          70.9ms ±  817us    69.9ms … 76.2ms          2 ( 3%)        ⚡-  2.1% ±  0.8%
  peak_rss           24.1MB ± 58.4KB    24.0MB … 24.1MB          0 ( 0%)          +  0.1% ±  0.1%
  cpu_cycles          287M  ± 2.33M      285M  …  305M           3 ( 4%)        ⚡-  2.5% ±  0.8%
  instructions        792M  ±  300       792M  …  792M           1 ( 1%)        ⚡- 13.4% ±  0.0%
  cache_references   2.98M  ± 89.8K     2.78M  … 3.33M           1 ( 1%)          -  1.9% ±  4.0%
  cache_misses        154K  ± 15.5K      115K  …  207K           1 ( 1%)          -  1.6% ±  6.3%
  branch_misses      4.10M  ± 3.85K     4.09M  … 4.11M           1 ( 1%)          +  0.2% ±  0.1%
Benchmark 3 (79 runs): /tmp/labeled-match-len rs-chunked 4 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          63.9ms ± 1.66ms    62.7ms … 74.9ms          5 ( 6%)        ⚡- 11.8% ±  0.9%
  peak_rss           24.1MB ± 64.6KB    23.9MB … 24.1MB         18 (23%)          +  0.1% ±  0.1%
  cpu_cycles          254M  ± 6.97M      252M  …  307M           9 (11%)        ⚡- 13.6% ±  0.9%
  instructions        710M  ±  259       710M  …  710M           0 ( 0%)        ⚡- 22.3% ±  0.0%
  cache_references   3.05M  ±  736K     2.79M  … 9.38M           5 ( 6%)          +  0.4% ±  6.8%
  cache_misses        134K  ± 14.6K      111K  …  176K           1 ( 1%)        ⚡- 14.0% ±  5.9%
  branch_misses      4.08M  ± 5.27K     4.08M  … 4.11M           3 ( 4%)          -  0.1% ±  0.1%
Benchmark 4 (81 runs): /tmp/labeled-match-fast rs-chunked 4 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          62.1ms ±  554us    61.4ms … 64.4ms          3 ( 4%)        ⚡- 14.3% ±  0.8%
  peak_rss           24.1MB ± 63.0KB    23.9MB … 24.1MB          0 ( 0%)          +  0.0% ±  0.1%
  cpu_cycles          246M  ± 1.75M      245M  …  257M           6 ( 7%)        ⚡- 16.3% ±  0.7%
  instructions        689M  ±  258       689M  …  689M           0 ( 0%)        ⚡- 24.6% ±  0.0%
  cache_references   3.02M  ±  336K     2.77M  … 5.78M           4 ( 5%)          -  0.6% ±  4.5%
  cache_misses        128K  ± 16.9K      101K  …  186K           5 ( 6%)        ⚡- 17.7% ±  6.0%
  branch_misses      4.08M  ± 5.14K     4.08M  … 4.10M           2 ( 2%)          -  0.1% ±  0.1%
Benchmark 1 (108 runs): /tmp/uncompress-baseline rs-chunked 7 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          46.5ms ±  649us    45.3ms … 50.1ms          2 ( 2%)        0%
  peak_rss           24.1MB ± 57.0KB    24.0MB … 24.1MB         27 (25%)        0%
  cpu_cycles          174M  ± 1.71M      173M  …  187M           9 ( 8%)        0%
  instructions        516M  ±  277       516M  …  516M           1 ( 1%)        0%
  cache_references   3.14M  ±  181K     2.93M  … 4.33M           3 ( 3%)        0%
  cache_misses       45.8K  ± 17.2K     29.9K  … 93.9K           7 ( 6%)        0%
  branch_misses      2.00M  ± 2.42K     2.00M  … 2.02M           2 ( 2%)        0%
Benchmark 2 (78 runs): /tmp/loop-plus-match rs-chunked 7 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          64.8ms ±  705us    63.8ms … 69.2ms          1 ( 1%)        💩+ 39.3% ±  0.4%
  peak_rss           24.1MB ± 76.1KB    23.9MB … 24.1MB          0 ( 0%)          -  0.1% ±  0.1%
  cpu_cycles          257M  ± 2.22M      256M  …  273M           9 (12%)        💩+ 47.5% ±  0.3%
  instructions        720M  ±  385       720M  …  720M           1 ( 1%)        💩+ 39.7% ±  0.0%
  cache_references   3.18M  ± 92.4K     2.99M  … 3.55M           2 ( 3%)          +  1.5% ±  1.4%
  cache_misses       37.4K  ± 17.6K     26.5K  …  174K           7 ( 9%)        ⚡- 18.4% ± 11.1%
  branch_misses      2.00M  ± 2.47K     2.00M  … 2.01M           3 ( 4%)          -  0.2% ±  0.0%
Benchmark 3 (104 runs): /tmp/labeled-match-len rs-chunked 7 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          48.0ms ±  500us    47.2ms … 50.3ms          1 ( 1%)        💩+  3.2% ±  0.3%
  peak_rss           24.1MB ± 59.7KB    24.0MB … 24.1MB          0 ( 0%)          -  0.0% ±  0.1%
  cpu_cycles          181M  ± 1.18M      180M  …  187M           6 ( 6%)        💩+  3.8% ±  0.2%
  instructions        586M  ±  354       586M  …  586M           4 ( 4%)        💩+ 13.6% ±  0.0%
  cache_references   3.25M  ±  202K     3.02M  … 4.80M           6 ( 6%)        💩+  3.5% ±  1.6%
  cache_misses       48.6K  ± 8.60K     31.5K  … 83.5K           5 ( 5%)          +  6.0% ±  8.0%
  branch_misses      2.00M  ± 1.77K     2.00M  … 2.01M           5 ( 5%)          -  0.2% ±  0.0%
Benchmark 4 (110 runs): /tmp/labeled-match-fast rs-chunked 7 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          45.5ms ±  927us    44.5ms … 51.4ms          4 ( 4%)        ⚡-  2.2% ±  0.5%
  peak_rss           24.1MB ± 73.1KB    23.9MB … 24.1MB          0 ( 0%)          -  0.0% ±  0.1%
  cpu_cycles          170M  ± 2.85M      168M  …  190M          12 (11%)        ⚡-  2.7% ±  0.4%
  instructions        515M  ±  249       515M  …  515M           0 ( 0%)          -  0.1% ±  0.0%
  cache_references   3.27M  ±  107K     3.07M  … 3.94M           2 ( 2%)        💩+  4.2% ±  1.3%
  cache_misses        109K  ± 5.80K     98.7K  …  139K           4 ( 4%)        💩+139.0% ±  7.4%
  branch_misses      2.00M  ± 4.10K     1.99M  … 2.03M           4 ( 4%)          -  0.2% ±  0.0%
Benchmark 1 (181 runs): /tmp/uncompress-baseline rs-chunked 16 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          27.6ms ±  433us    26.8ms … 31.4ms          6 ( 3%)        0%
  peak_rss           24.1MB ± 58.9KB    23.9MB … 24.1MB         45 (25%)        0%
  cpu_cycles         90.0M  ±  919K     89.5M  … 99.7M          15 ( 8%)        0%
  instructions        239M  ±  322       239M  …  239M           3 ( 2%)        0%
  cache_references   2.28M  ± 53.6K     2.19M  … 2.70M           4 ( 2%)        0%
  cache_misses       43.5K  ± 2.44K     40.3K  … 64.5K           5 ( 3%)        0%
  branch_misses      1.05M  ± 1.70K     1.05M  … 1.06M           4 ( 2%)        0%
Benchmark 2 (186 runs): /tmp/loop-plus-match rs-chunked 16 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          27.0ms ±  304us    26.3ms … 28.6ms          3 ( 2%)        ⚡-  2.5% ±  0.3%
  peak_rss           24.1MB ± 64.1KB    23.9MB … 24.1MB          0 ( 0%)          -  0.0% ±  0.1%
  cpu_cycles         87.3M  ±  654K     86.8M  … 91.7M          19 (10%)        ⚡-  3.0% ±  0.2%
  instructions        248M  ±  266       248M  …  248M           2 ( 1%)        💩+  3.8% ±  0.0%
  cache_references   2.25M  ± 55.5K     2.17M  … 2.75M           5 ( 3%)          -  1.2% ±  0.5%
  cache_misses       45.4K  ± 2.26K     40.6K  … 52.2K           1 ( 1%)        💩+  4.2% ±  1.1%
  branch_misses      1.05M  ± 1.67K     1.04M  … 1.05M           4 ( 2%)          -  0.5% ±  0.0%
Benchmark 3 (184 runs): /tmp/labeled-match-len rs-chunked 16 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          27.2ms ±  817us    26.3ms … 33.7ms         16 ( 9%)        ⚡-  1.7% ±  0.5%
  peak_rss           24.1MB ± 55.9KB    23.9MB … 24.1MB         39 (21%)          +  0.0% ±  0.0%
  cpu_cycles         87.4M  ± 1.95M     86.4M  …  103M          23 (13%)        ⚡-  2.9% ±  0.3%
  instructions        248M  ±  278       248M  …  248M           2 ( 1%)        💩+  3.6% ±  0.0%
  cache_references   2.28M  ±  112K     2.18M  … 2.93M          16 ( 9%)          +  0.0% ±  0.8%
  cache_misses       47.9K  ± 11.1K     40.5K  …  168K          15 ( 8%)        💩+ 10.0% ±  3.8%
  branch_misses      1.05M  ± 2.32K     1.04M  … 1.06M           4 ( 2%)          -  0.4% ±  0.0%
Benchmark 4 (182 runs): /tmp/labeled-match-fast rs-chunked 16 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          27.5ms ±  252us    26.9ms … 28.5ms          2 ( 1%)          -  0.7% ±  0.3%
  peak_rss           24.1MB ± 64.1KB    23.9MB … 24.1MB         44 (24%)          -  0.0% ±  0.1%
  cpu_cycles         89.5M  ±  477K     89.1M  … 93.9M          18 (10%)          -  0.6% ±  0.2%
  instructions        253M  ±  306       253M  …  253M           1 ( 1%)        💩+  5.7% ±  0.0%
  cache_references   2.27M  ± 67.3K     2.19M  … 3.00M           3 ( 2%)          -  0.5% ±  0.5%
  cache_misses       66.5K  ± 2.10K     59.8K  … 73.6K           2 ( 1%)        💩+ 52.7% ±  1.1%
  branch_misses      1.05M  ± 1.22K     1.05M  … 1.06M           1 ( 1%)          -  0.2% ±  0.0%

bjorn3 commented Oct 23, 2024

On an AMD Ryzen 7 3700X 8-Core Processor:

Benchmark 1 (61 runs): /tmp/uncompress-baseline rs-chunked 4 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          82.0ms ± 1.94ms    79.6ms … 88.8ms          2 ( 3%)        0%
  peak_rss           23.2MB ± 1.22MB    20.5MB … 25.7MB          0 ( 0%)        0%
  cpu_cycles          336M  ± 7.88M      330M  …  365M           2 ( 3%)        0%
  instructions        916M  ±  245       916M  …  916M           1 ( 2%)        0%
  cache_references   3.02M  ±  333K     2.71M  … 4.11M           3 ( 5%)        0%
  cache_misses       45.8K  ± 7.34K     39.6K  … 94.0K           1 ( 2%)        0%
  branch_misses      4.01M  ± 22.2K     3.98M  … 4.06M           0 ( 0%)        0%
Benchmark 2 (61 runs): /tmp/loop-plus-match rs-chunked 4 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          82.7ms ±  642us    81.5ms … 85.6ms          3 ( 5%)          +  0.8% ±  0.6%
  peak_rss           23.4MB ± 1.30MB    20.2MB … 25.7MB          0 ( 0%)          +  0.5% ±  1.9%
  cpu_cycles          340M  ± 2.40M      338M  …  351M           2 ( 3%)          +  1.1% ±  0.6%
  instructions        794M  ±  276       794M  …  794M           0 ( 0%)        ⚡- 13.4% ±  0.0%
  cache_references   2.95M  ±  189K     2.78M  … 3.93M           6 (10%)          -  2.5% ±  3.2%
  cache_misses       41.8K  ± 4.61K     36.9K  … 72.3K           3 ( 5%)        ⚡-  8.8% ±  4.8%
  branch_misses      4.07M  ± 18.3K     4.02M  … 4.11M           5 ( 8%)        💩+  1.5% ±  0.2%
Benchmark 3 (70 runs): /tmp/labeled-match-len rs-chunked 4 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          72.0ms ±  800us    70.9ms … 75.1ms          6 ( 9%)        ⚡- 12.3% ±  0.6%
  peak_rss           23.4MB ± 1.48MB    20.6MB … 25.7MB          0 ( 0%)          +  0.7% ±  2.0%
  cpu_cycles          292M  ± 2.71M      290M  …  303M           6 ( 9%)        ⚡- 13.1% ±  0.6%
  instructions        685M  ±  286       685M  …  685M           0 ( 0%)        ⚡- 25.3% ±  0.0%
  cache_references   2.89M  ±  139K     2.63M  … 3.43M          10 (14%)        ⚡-  4.4% ±  2.8%
  cache_misses       46.2K  ± 5.53K     40.2K  … 78.5K           3 ( 4%)          +  0.9% ±  4.8%
  branch_misses      3.99M  ± 37.6K     3.92M  … 4.05M           0 ( 0%)          -  0.4% ±  0.3%
Benchmark 4 (70 runs): /tmp/labeled-match-fast rs-chunked 4 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          72.1ms ± 2.08ms    70.8ms … 86.8ms          4 ( 6%)        ⚡- 12.1% ±  0.8%
  peak_rss           23.3MB ± 1.28MB    20.7MB … 25.7MB          0 ( 0%)          +  0.1% ±  1.8%
  cpu_cycles          292M  ± 8.11M      289M  …  351M           7 (10%)        ⚡- 13.2% ±  0.8%
  instructions        685M  ±  291       685M  …  685M           0 ( 0%)        ⚡- 25.2% ±  0.0%
  cache_references   2.94M  ±  374K     2.72M  … 5.02M           5 ( 7%)          -  2.7% ±  4.0%
  cache_misses       45.9K  ± 5.04K     38.7K  … 64.7K           3 ( 4%)          +  0.1% ±  4.7%
  branch_misses      3.95M  ± 28.3K     3.92M  … 4.04M           0 ( 0%)        ⚡-  1.5% ±  0.2%
Benchmark 1 (105 runs): /tmp/uncompress-baseline rs-chunked 7 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          47.7ms ± 1.05ms    46.3ms … 51.8ms          7 ( 7%)        0%
  peak_rss           23.2MB ± 1.26MB    20.6MB … 25.7MB          0 ( 0%)        0%
  cpu_cycles          188M  ± 3.25M      186M  …  200M          12 (11%)        0%
  instructions        516M  ±  248       516M  …  516M           1 ( 1%)        0%
  cache_references   3.63M  ±  507K     3.26M  … 7.95M          13 (12%)        0%
  cache_misses       48.0K  ± 13.7K     36.9K  …  126K          10 (10%)        0%
  branch_misses      1.92M  ± 7.37K     1.91M  … 1.94M           0 ( 0%)        0%
Benchmark 2 (65 runs): /tmp/loop-plus-match rs-chunked 7 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          77.5ms ± 2.17ms    75.6ms … 89.1ms          7 (11%)        💩+ 62.4% ±  1.0%
  peak_rss           23.3MB ± 1.24MB    20.6MB … 26.0MB          0 ( 0%)          +  0.3% ±  1.7%
  cpu_cycles          312M  ± 6.06M      308M  …  338M           6 ( 9%)        💩+ 66.0% ±  0.7%
  instructions        721M  ±  304       721M  …  721M           0 ( 0%)        💩+ 39.6% ±  0.0%
  cache_references   3.94M  ±  683K     3.39M  … 7.85M           6 ( 9%)        💩+  8.5% ±  4.9%
  cache_misses       52.3K  ± 26.5K     37.3K  …  226K           5 ( 8%)          +  8.8% ± 12.6%
  branch_misses      1.93M  ± 7.55K     1.92M  … 1.96M           1 ( 2%)          +  0.5% ±  0.1%
Benchmark 3 (105 runs): /tmp/labeled-match-len rs-chunked 7 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          47.9ms ± 2.74ms    45.6ms … 57.4ms         18 (17%)          +  0.2% ±  1.2%
  peak_rss           23.3MB ± 1.16MB    20.6MB … 26.0MB          0 ( 0%)          +  0.3% ±  1.4%
  cpu_cycles          185M  ± 3.36M      183M  …  208M          17 (16%)        ⚡-  1.7% ±  0.5%
  instructions        498M  ±  373       498M  …  498M           0 ( 0%)        ⚡-  3.5% ±  0.0%
  cache_references   3.80M  ± 1.76M     3.28M  … 21.5M           9 ( 9%)          +  4.6% ±  9.7%
  cache_misses       43.8K  ± 5.01K     38.6K  … 65.7K           8 ( 8%)        ⚡-  8.7% ±  5.8%
  branch_misses      1.92M  ± 7.79K     1.91M  … 1.94M           0 ( 0%)          -  0.2% ±  0.1%
Benchmark 4 (106 runs): /tmp/labeled-match-fast rs-chunked 7 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          47.5ms ± 2.92ms    45.5ms … 62.6ms         17 (16%)          -  0.6% ±  1.2%
  peak_rss           23.1MB ± 1.36MB    20.5MB … 26.0MB          0 ( 0%)          -  0.4% ±  1.5%
  cpu_cycles          184M  ± 6.05M      181M  …  220M          17 (16%)        ⚡-  1.8% ±  0.7%
  instructions        500M  ±  330       500M  …  500M           2 ( 2%)        ⚡-  3.2% ±  0.0%
  cache_references   3.78M  ±  608K     3.30M  … 6.82M          14 (13%)          +  4.2% ±  4.2%
  cache_misses       45.8K  ± 15.3K     36.3K  …  142K           9 ( 8%)          -  4.7% ±  8.2%
  branch_misses      1.92M  ± 8.42K     1.91M  … 1.94M           0 ( 0%)          -  0.3% ±  0.1%
Benchmark 1 (197 runs): /tmp/uncompress-baseline rs-chunked 16 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          25.4ms ± 1.36ms    24.1ms … 35.7ms         22 (11%)        0%
  peak_rss           23.3MB ± 1.18MB    20.2MB … 25.8MB          0 ( 0%)        0%
  cpu_cycles         93.2M  ± 4.00M     91.1M  …  114M          32 (16%)        0%
  instructions        239M  ±  295       239M  …  239M           0 ( 0%)        0%
  cache_references   2.69M  ±  381K     2.37M  … 4.96M          26 (13%)        0%
  cache_misses       59.9K  ± 6.53K     54.5K  …  109K           9 ( 5%)        0%
  branch_misses      1.04M  ± 5.14K     1.03M  … 1.05M           0 ( 0%)        0%
Benchmark 2 (201 runs): /tmp/loop-plus-match rs-chunked 16 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          24.9ms ±  523us    24.1ms … 27.7ms          8 ( 4%)        ⚡-  2.1% ±  0.8%
  peak_rss           23.2MB ± 1.30MB    20.2MB … 25.9MB          0 ( 0%)          -  0.8% ±  1.0%
  cpu_cycles         91.9M  ± 1.54M     91.0M  …  102M          22 (11%)          -  1.4% ±  0.6%
  instructions        249M  ±  274       249M  …  249M           0 ( 0%)        💩+  3.8% ±  0.0%
  cache_references   2.58M  ±  189K     2.42M  … 3.83M          13 ( 6%)        ⚡-  4.0% ±  2.2%
  cache_misses       59.2K  ± 7.22K     53.6K  …  107K          11 ( 5%)          -  1.1% ±  2.3%
  branch_misses      1.04M  ± 4.63K     1.03M  … 1.05M           0 ( 0%)          -  0.3% ±  0.1%
Benchmark 3 (200 runs): /tmp/labeled-match-len rs-chunked 16 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          24.8ms ±  329us    24.1ms … 27.2ms          1 ( 1%)        ⚡-  2.4% ±  0.8%
  peak_rss           23.4MB ± 1.32MB    20.2MB … 26.0MB          0 ( 0%)          +  0.1% ±  1.1%
  cpu_cycles         92.0M  ±  566K     91.5M  … 99.6M           4 ( 2%)          -  1.4% ±  0.6%
  instructions        247M  ±  265       247M  …  247M           0 ( 0%)        💩+  3.1% ±  0.0%
  cache_references   2.54M  ± 64.5K     2.40M  … 3.18M           3 ( 2%)        ⚡-  5.6% ±  2.0%
  cache_misses       59.4K  ± 2.22K     54.9K  … 68.4K           4 ( 2%)          -  0.9% ±  1.6%
  branch_misses      1.04M  ± 3.79K     1.03M  … 1.04M           0 ( 0%)          -  0.3% ±  0.1%
Benchmark 4 (200 runs): /tmp/labeled-match-fast rs-chunked 16 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          25.0ms ±  488us    24.2ms … 28.4ms          6 ( 3%)        ⚡-  1.8% ±  0.8%
  peak_rss           23.2MB ± 1.15MB    20.7MB … 25.8MB          0 ( 0%)          -  0.7% ±  1.0%
  cpu_cycles         92.3M  ± 1.38M     91.6M  …  104M          19 (10%)          -  1.0% ±  0.6%
  instructions        247M  ±  260       247M  …  247M           1 ( 1%)        💩+  3.1% ±  0.0%
  cache_references   2.56M  ±  109K     2.37M  … 3.30M          11 ( 6%)        ⚡-  4.6% ±  2.0%
  cache_misses       58.4K  ± 1.83K     53.8K  … 68.0K           4 ( 2%)          -  2.4% ±  1.6%
  branch_misses      1.04M  ± 2.83K     1.03M  … 1.05M           0 ( 0%)          -  0.2% ±  0.1%
