Skip to content

Instantly share code, notes, and snippets.

@folkertdev
Last active October 23, 2024 07:54
Show Gist options
  • Select an option

  • Save folkertdev/977183fb706b7693863bd7f358578292 to your computer and use it in GitHub Desktop.

Select an option

Save folkertdev/977183fb706b7693863bd7f358578292 to your computer and use it in GitHub Desktop.
zlib-rs labeled match benchmarks
# zlib-rs labeled match benchmarks
## build the toolchain
A proof of concept implementation can be found at https://github.com/trifectatechfoundation/rust/tree/labeled-match. Build it with `./x build`, and then set up the [toolchain](https://rustc-dev-guide.rust-lang.org/building/how-to-build-and-run.html#creating-a-rustup-toolchain). Now `cargo +stage1 build` should use a compiler with `labeled-match` available.
## run the benchmark
```
git clone https://github.com/trifectatechfoundation/zlib-rs.git
git checkout len-as-match
sh replicate-labeled-match-benchmarks.sh
```
this runs 4 benchmarks
- baseline: the current zlib-rs main branch approach using tail calls
- loop-plus-match: standard approach using a loop and match; suffers from branch misprediction
- labeled-match-len: the `len` function and friends now use labeled match
- labeled-match-fast: the `len` and friends, and `inflate_fast_help` functions now use labeld match
## results
Mostly what we see is that labeled match gives significant speedups for small chunk sizes. For larger chunk sizes, the results are less clear. I believe really the result is net-zero, but we need clearly need to perform some further tuning.
Note in particular how `loop-plus-match` works well for small _and_ big inputs, but terribly for medium inputs.
```
Benchmark 1 (69 runs): /tmp/uncompress-baseline rs-chunked 4 silesia-small.tar.gz
measurement mean ± σ min … max outliers delta
wall_time 72.5ms ± 2.46ms 71.0ms … 91.0ms 7 (10%) 0%
peak_rss 24.1MB ± 77.8KB 23.9MB … 24.1MB 0 ( 0%) 0%
cpu_cycles 294M ± 9.90M 291M … 371M 7 (10%) 0%
instructions 914M ± 274 914M … 914M 0 ( 0%) 0%
cache_references 3.04M ± 519K 2.69M … 6.09M 4 ( 6%) 0%
cache_misses 156K ± 39.1K 126K … 463K 1 ( 1%) 0%
branch_misses 4.09M ± 10.8K 4.08M … 4.17M 5 ( 7%) 0%
Benchmark 2 (71 runs): /tmp/loop-plus-match rs-chunked 4 silesia-small.tar.gz
measurement mean ± σ min … max outliers delta
wall_time 70.9ms ± 817us 69.9ms … 76.2ms 2 ( 3%) ⚡- 2.1% ± 0.8%
peak_rss 24.1MB ± 58.4KB 24.0MB … 24.1MB 0 ( 0%) + 0.1% ± 0.1%
cpu_cycles 287M ± 2.33M 285M … 305M 3 ( 4%) ⚡- 2.5% ± 0.8%
instructions 792M ± 300 792M … 792M 1 ( 1%) ⚡- 13.4% ± 0.0%
cache_references 2.98M ± 89.8K 2.78M … 3.33M 1 ( 1%) - 1.9% ± 4.0%
cache_misses 154K ± 15.5K 115K … 207K 1 ( 1%) - 1.6% ± 6.3%
branch_misses 4.10M ± 3.85K 4.09M … 4.11M 1 ( 1%) + 0.2% ± 0.1%
Benchmark 3 (79 runs): /tmp/labeled-match-len rs-chunked 4 silesia-small.tar.gz
measurement mean ± σ min … max outliers delta
wall_time 63.9ms ± 1.66ms 62.7ms … 74.9ms 5 ( 6%) ⚡- 11.8% ± 0.9%
peak_rss 24.1MB ± 64.6KB 23.9MB … 24.1MB 18 (23%) + 0.1% ± 0.1%
cpu_cycles 254M ± 6.97M 252M … 307M 9 (11%) ⚡- 13.6% ± 0.9%
instructions 710M ± 259 710M … 710M 0 ( 0%) ⚡- 22.3% ± 0.0%
cache_references 3.05M ± 736K 2.79M … 9.38M 5 ( 6%) + 0.4% ± 6.8%
cache_misses 134K ± 14.6K 111K … 176K 1 ( 1%) ⚡- 14.0% ± 5.9%
branch_misses 4.08M ± 5.27K 4.08M … 4.11M 3 ( 4%) - 0.1% ± 0.1%
Benchmark 4 (81 runs): /tmp/labeled-match-fast rs-chunked 4 silesia-small.tar.gz
measurement mean ± σ min … max outliers delta
wall_time 62.1ms ± 554us 61.4ms … 64.4ms 3 ( 4%) ⚡- 14.3% ± 0.8%
peak_rss 24.1MB ± 63.0KB 23.9MB … 24.1MB 0 ( 0%) + 0.0% ± 0.1%
cpu_cycles 246M ± 1.75M 245M … 257M 6 ( 7%) ⚡- 16.3% ± 0.7%
instructions 689M ± 258 689M … 689M 0 ( 0%) ⚡- 24.6% ± 0.0%
cache_references 3.02M ± 336K 2.77M … 5.78M 4 ( 5%) - 0.6% ± 4.5%
cache_misses 128K ± 16.9K 101K … 186K 5 ( 6%) ⚡- 17.7% ± 6.0%
branch_misses 4.08M ± 5.14K 4.08M … 4.10M 2 ( 2%) - 0.1% ± 0.1%
Benchmark 1 (108 runs): /tmp/uncompress-baseline rs-chunked 7 silesia-small.tar.gz
measurement mean ± σ min … max outliers delta
wall_time 46.5ms ± 649us 45.3ms … 50.1ms 2 ( 2%) 0%
peak_rss 24.1MB ± 57.0KB 24.0MB … 24.1MB 27 (25%) 0%
cpu_cycles 174M ± 1.71M 173M … 187M 9 ( 8%) 0%
instructions 516M ± 277 516M … 516M 1 ( 1%) 0%
cache_references 3.14M ± 181K 2.93M … 4.33M 3 ( 3%) 0%
cache_misses 45.8K ± 17.2K 29.9K … 93.9K 7 ( 6%) 0%
branch_misses 2.00M ± 2.42K 2.00M … 2.02M 2 ( 2%) 0%
Benchmark 2 (78 runs): /tmp/loop-plus-match rs-chunked 7 silesia-small.tar.gz
measurement mean ± σ min … max outliers delta
wall_time 64.8ms ± 705us 63.8ms … 69.2ms 1 ( 1%) 💩+ 39.3% ± 0.4%
peak_rss 24.1MB ± 76.1KB 23.9MB … 24.1MB 0 ( 0%) - 0.1% ± 0.1%
cpu_cycles 257M ± 2.22M 256M … 273M 9 (12%) 💩+ 47.5% ± 0.3%
instructions 720M ± 385 720M … 720M 1 ( 1%) 💩+ 39.7% ± 0.0%
cache_references 3.18M ± 92.4K 2.99M … 3.55M 2 ( 3%) + 1.5% ± 1.4%
cache_misses 37.4K ± 17.6K 26.5K … 174K 7 ( 9%) ⚡- 18.4% ± 11.1%
branch_misses 2.00M ± 2.47K 2.00M … 2.01M 3 ( 4%) - 0.2% ± 0.0%
Benchmark 3 (104 runs): /tmp/labeled-match-len rs-chunked 7 silesia-small.tar.gz
measurement mean ± σ min … max outliers delta
wall_time 48.0ms ± 500us 47.2ms … 50.3ms 1 ( 1%) 💩+ 3.2% ± 0.3%
peak_rss 24.1MB ± 59.7KB 24.0MB … 24.1MB 0 ( 0%) - 0.0% ± 0.1%
cpu_cycles 181M ± 1.18M 180M … 187M 6 ( 6%) 💩+ 3.8% ± 0.2%
instructions 586M ± 354 586M … 586M 4 ( 4%) 💩+ 13.6% ± 0.0%
cache_references 3.25M ± 202K 3.02M … 4.80M 6 ( 6%) 💩+ 3.5% ± 1.6%
cache_misses 48.6K ± 8.60K 31.5K … 83.5K 5 ( 5%) + 6.0% ± 8.0%
branch_misses 2.00M ± 1.77K 2.00M … 2.01M 5 ( 5%) - 0.2% ± 0.0%
Benchmark 4 (110 runs): /tmp/labeled-match-fast rs-chunked 7 silesia-small.tar.gz
measurement mean ± σ min … max outliers delta
wall_time 45.5ms ± 927us 44.5ms … 51.4ms 4 ( 4%) ⚡- 2.2% ± 0.5%
peak_rss 24.1MB ± 73.1KB 23.9MB … 24.1MB 0 ( 0%) - 0.0% ± 0.1%
cpu_cycles 170M ± 2.85M 168M … 190M 12 (11%) ⚡- 2.7% ± 0.4%
instructions 515M ± 249 515M … 515M 0 ( 0%) - 0.1% ± 0.0%
cache_references 3.27M ± 107K 3.07M … 3.94M 2 ( 2%) 💩+ 4.2% ± 1.3%
cache_misses 109K ± 5.80K 98.7K … 139K 4 ( 4%) 💩+139.0% ± 7.4%
branch_misses 2.00M ± 4.10K 1.99M … 2.03M 4 ( 4%) - 0.2% ± 0.0%
Benchmark 1 (181 runs): /tmp/uncompress-baseline rs-chunked 16 silesia-small.tar.gz
measurement mean ± σ min … max outliers delta
wall_time 27.6ms ± 433us 26.8ms … 31.4ms 6 ( 3%) 0%
peak_rss 24.1MB ± 58.9KB 23.9MB … 24.1MB 45 (25%) 0%
cpu_cycles 90.0M ± 919K 89.5M … 99.7M 15 ( 8%) 0%
instructions 239M ± 322 239M … 239M 3 ( 2%) 0%
cache_references 2.28M ± 53.6K 2.19M … 2.70M 4 ( 2%) 0%
cache_misses 43.5K ± 2.44K 40.3K … 64.5K 5 ( 3%) 0%
branch_misses 1.05M ± 1.70K 1.05M … 1.06M 4 ( 2%) 0%
Benchmark 2 (186 runs): /tmp/loop-plus-match rs-chunked 16 silesia-small.tar.gz
measurement mean ± σ min … max outliers delta
wall_time 27.0ms ± 304us 26.3ms … 28.6ms 3 ( 2%) ⚡- 2.5% ± 0.3%
peak_rss 24.1MB ± 64.1KB 23.9MB … 24.1MB 0 ( 0%) - 0.0% ± 0.1%
cpu_cycles 87.3M ± 654K 86.8M … 91.7M 19 (10%) ⚡- 3.0% ± 0.2%
instructions 248M ± 266 248M … 248M 2 ( 1%) 💩+ 3.8% ± 0.0%
cache_references 2.25M ± 55.5K 2.17M … 2.75M 5 ( 3%) - 1.2% ± 0.5%
cache_misses 45.4K ± 2.26K 40.6K … 52.2K 1 ( 1%) 💩+ 4.2% ± 1.1%
branch_misses 1.05M ± 1.67K 1.04M … 1.05M 4 ( 2%) - 0.5% ± 0.0%
Benchmark 3 (184 runs): /tmp/labeled-match-len rs-chunked 16 silesia-small.tar.gz
measurement mean ± σ min … max outliers delta
wall_time 27.2ms ± 817us 26.3ms … 33.7ms 16 ( 9%) ⚡- 1.7% ± 0.5%
peak_rss 24.1MB ± 55.9KB 23.9MB … 24.1MB 39 (21%) + 0.0% ± 0.0%
cpu_cycles 87.4M ± 1.95M 86.4M … 103M 23 (13%) ⚡- 2.9% ± 0.3%
instructions 248M ± 278 248M … 248M 2 ( 1%) 💩+ 3.6% ± 0.0%
cache_references 2.28M ± 112K 2.18M … 2.93M 16 ( 9%) + 0.0% ± 0.8%
cache_misses 47.9K ± 11.1K 40.5K … 168K 15 ( 8%) 💩+ 10.0% ± 3.8%
branch_misses 1.05M ± 2.32K 1.04M … 1.06M 4 ( 2%) - 0.4% ± 0.0%
Benchmark 4 (182 runs): /tmp/labeled-match-fast rs-chunked 16 silesia-small.tar.gz
measurement mean ± σ min … max outliers delta
wall_time 27.5ms ± 252us 26.9ms … 28.5ms 2 ( 1%) - 0.7% ± 0.3%
peak_rss 24.1MB ± 64.1KB 23.9MB … 24.1MB 44 (24%) - 0.0% ± 0.1%
cpu_cycles 89.5M ± 477K 89.1M … 93.9M 18 (10%) - 0.6% ± 0.2%
instructions 253M ± 306 253M … 253M 1 ( 1%) 💩+ 5.7% ± 0.0%
cache_references 2.27M ± 67.3K 2.19M … 3.00M 3 ( 2%) - 0.5% ± 0.5%
cache_misses 66.5K ± 2.10K 59.8K … 73.6K 2 ( 1%) 💩+ 52.7% ± 1.1%
branch_misses 1.05M ± 1.22K 1.05M … 1.06M 1 ( 1%) - 0.2% ± 0.0%
```
@bjorn3
Copy link

bjorn3 commented Oct 23, 2024

On a AMD Ryzen 7 3700X 8-Core Processor:

Benchmark 1 (61 runs): /tmp/uncompress-baseline rs-chunked 4 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          82.0ms ± 1.94ms    79.6ms … 88.8ms          2 ( 3%)        0%
  peak_rss           23.2MB ± 1.22MB    20.5MB … 25.7MB          0 ( 0%)        0%
  cpu_cycles          336M  ± 7.88M      330M  …  365M           2 ( 3%)        0%
  instructions        916M  ±  245       916M  …  916M           1 ( 2%)        0%
  cache_references   3.02M  ±  333K     2.71M  … 4.11M           3 ( 5%)        0%
  cache_misses       45.8K  ± 7.34K     39.6K  … 94.0K           1 ( 2%)        0%
  branch_misses      4.01M  ± 22.2K     3.98M  … 4.06M           0 ( 0%)        0%
Benchmark 2 (61 runs): /tmp/loop-plus-match rs-chunked 4 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          82.7ms ±  642us    81.5ms … 85.6ms          3 ( 5%)          +  0.8% ±  0.6%
  peak_rss           23.4MB ± 1.30MB    20.2MB … 25.7MB          0 ( 0%)          +  0.5% ±  1.9%
  cpu_cycles          340M  ± 2.40M      338M  …  351M           2 ( 3%)          +  1.1% ±  0.6%
  instructions        794M  ±  276       794M  …  794M           0 ( 0%)        ⚡- 13.4% ±  0.0%
  cache_references   2.95M  ±  189K     2.78M  … 3.93M           6 (10%)          -  2.5% ±  3.2%
  cache_misses       41.8K  ± 4.61K     36.9K  … 72.3K           3 ( 5%)        ⚡-  8.8% ±  4.8%
  branch_misses      4.07M  ± 18.3K     4.02M  … 4.11M           5 ( 8%)        💩+  1.5% ±  0.2%
Benchmark 3 (70 runs): /tmp/labeled-match-len rs-chunked 4 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          72.0ms ±  800us    70.9ms … 75.1ms          6 ( 9%)        ⚡- 12.3% ±  0.6%
  peak_rss           23.4MB ± 1.48MB    20.6MB … 25.7MB          0 ( 0%)          +  0.7% ±  2.0%
  cpu_cycles          292M  ± 2.71M      290M  …  303M           6 ( 9%)        ⚡- 13.1% ±  0.6%
  instructions        685M  ±  286       685M  …  685M           0 ( 0%)        ⚡- 25.3% ±  0.0%
  cache_references   2.89M  ±  139K     2.63M  … 3.43M          10 (14%)        ⚡-  4.4% ±  2.8%
  cache_misses       46.2K  ± 5.53K     40.2K  … 78.5K           3 ( 4%)          +  0.9% ±  4.8%
  branch_misses      3.99M  ± 37.6K     3.92M  … 4.05M           0 ( 0%)          -  0.4% ±  0.3%
Benchmark 4 (70 runs): /tmp/labeled-match-fast rs-chunked 4 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          72.1ms ± 2.08ms    70.8ms … 86.8ms          4 ( 6%)        ⚡- 12.1% ±  0.8%
  peak_rss           23.3MB ± 1.28MB    20.7MB … 25.7MB          0 ( 0%)          +  0.1% ±  1.8%
  cpu_cycles          292M  ± 8.11M      289M  …  351M           7 (10%)        ⚡- 13.2% ±  0.8%
  instructions        685M  ±  291       685M  …  685M           0 ( 0%)        ⚡- 25.2% ±  0.0%
  cache_references   2.94M  ±  374K     2.72M  … 5.02M           5 ( 7%)          -  2.7% ±  4.0%
  cache_misses       45.9K  ± 5.04K     38.7K  … 64.7K           3 ( 4%)          +  0.1% ±  4.7%
  branch_misses      3.95M  ± 28.3K     3.92M  … 4.04M           0 ( 0%)        ⚡-  1.5% ±  0.2%
Benchmark 1 (105 runs): /tmp/uncompress-baseline rs-chunked 7 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          47.7ms ± 1.05ms    46.3ms … 51.8ms          7 ( 7%)        0%
  peak_rss           23.2MB ± 1.26MB    20.6MB … 25.7MB          0 ( 0%)        0%
  cpu_cycles          188M  ± 3.25M      186M  …  200M          12 (11%)        0%
  instructions        516M  ±  248       516M  …  516M           1 ( 1%)        0%
  cache_references   3.63M  ±  507K     3.26M  … 7.95M          13 (12%)        0%
  cache_misses       48.0K  ± 13.7K     36.9K  …  126K          10 (10%)        0%
  branch_misses      1.92M  ± 7.37K     1.91M  … 1.94M           0 ( 0%)        0%
Benchmark 2 (65 runs): /tmp/loop-plus-match rs-chunked 7 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          77.5ms ± 2.17ms    75.6ms … 89.1ms          7 (11%)        💩+ 62.4% ±  1.0%
  peak_rss           23.3MB ± 1.24MB    20.6MB … 26.0MB          0 ( 0%)          +  0.3% ±  1.7%
  cpu_cycles          312M  ± 6.06M      308M  …  338M           6 ( 9%)        💩+ 66.0% ±  0.7%
  instructions        721M  ±  304       721M  …  721M           0 ( 0%)        💩+ 39.6% ±  0.0%
  cache_references   3.94M  ±  683K     3.39M  … 7.85M           6 ( 9%)        💩+  8.5% ±  4.9%
  cache_misses       52.3K  ± 26.5K     37.3K  …  226K           5 ( 8%)          +  8.8% ± 12.6%
  branch_misses      1.93M  ± 7.55K     1.92M  … 1.96M           1 ( 2%)          +  0.5% ±  0.1%
Benchmark 3 (105 runs): /tmp/labeled-match-len rs-chunked 7 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          47.9ms ± 2.74ms    45.6ms … 57.4ms         18 (17%)          +  0.2% ±  1.2%
  peak_rss           23.3MB ± 1.16MB    20.6MB … 26.0MB          0 ( 0%)          +  0.3% ±  1.4%
  cpu_cycles          185M  ± 3.36M      183M  …  208M          17 (16%)        ⚡-  1.7% ±  0.5%
  instructions        498M  ±  373       498M  …  498M           0 ( 0%)        ⚡-  3.5% ±  0.0%
  cache_references   3.80M  ± 1.76M     3.28M  … 21.5M           9 ( 9%)          +  4.6% ±  9.7%
  cache_misses       43.8K  ± 5.01K     38.6K  … 65.7K           8 ( 8%)        ⚡-  8.7% ±  5.8%
  branch_misses      1.92M  ± 7.79K     1.91M  … 1.94M           0 ( 0%)          -  0.2% ±  0.1%
Benchmark 4 (106 runs): /tmp/labeled-match-fast rs-chunked 7 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          47.5ms ± 2.92ms    45.5ms … 62.6ms         17 (16%)          -  0.6% ±  1.2%
  peak_rss           23.1MB ± 1.36MB    20.5MB … 26.0MB          0 ( 0%)          -  0.4% ±  1.5%
  cpu_cycles          184M  ± 6.05M      181M  …  220M          17 (16%)        ⚡-  1.8% ±  0.7%
  instructions        500M  ±  330       500M  …  500M           2 ( 2%)        ⚡-  3.2% ±  0.0%
  cache_references   3.78M  ±  608K     3.30M  … 6.82M          14 (13%)          +  4.2% ±  4.2%
  cache_misses       45.8K  ± 15.3K     36.3K  …  142K           9 ( 8%)          -  4.7% ±  8.2%
  branch_misses      1.92M  ± 8.42K     1.91M  … 1.94M           0 ( 0%)          -  0.3% ±  0.1%
Benchmark 1 (197 runs): /tmp/uncompress-baseline rs-chunked 16 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          25.4ms ± 1.36ms    24.1ms … 35.7ms         22 (11%)        0%
  peak_rss           23.3MB ± 1.18MB    20.2MB … 25.8MB          0 ( 0%)        0%
  cpu_cycles         93.2M  ± 4.00M     91.1M  …  114M          32 (16%)        0%
  instructions        239M  ±  295       239M  …  239M           0 ( 0%)        0%
  cache_references   2.69M  ±  381K     2.37M  … 4.96M          26 (13%)        0%
  cache_misses       59.9K  ± 6.53K     54.5K  …  109K           9 ( 5%)        0%
  branch_misses      1.04M  ± 5.14K     1.03M  … 1.05M           0 ( 0%)        0%
Benchmark 2 (201 runs): /tmp/loop-plus-match rs-chunked 16 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          24.9ms ±  523us    24.1ms … 27.7ms          8 ( 4%)        ⚡-  2.1% ±  0.8%
  peak_rss           23.2MB ± 1.30MB    20.2MB … 25.9MB          0 ( 0%)          -  0.8% ±  1.0%
  cpu_cycles         91.9M  ± 1.54M     91.0M  …  102M          22 (11%)          -  1.4% ±  0.6%
  instructions        249M  ±  274       249M  …  249M           0 ( 0%)        💩+  3.8% ±  0.0%
  cache_references   2.58M  ±  189K     2.42M  … 3.83M          13 ( 6%)        ⚡-  4.0% ±  2.2%
  cache_misses       59.2K  ± 7.22K     53.6K  …  107K          11 ( 5%)          -  1.1% ±  2.3%
  branch_misses      1.04M  ± 4.63K     1.03M  … 1.05M           0 ( 0%)          -  0.3% ±  0.1%
Benchmark 3 (200 runs): /tmp/labeled-match-len rs-chunked 16 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          24.8ms ±  329us    24.1ms … 27.2ms          1 ( 1%)        ⚡-  2.4% ±  0.8%
  peak_rss           23.4MB ± 1.32MB    20.2MB … 26.0MB          0 ( 0%)          +  0.1% ±  1.1%
  cpu_cycles         92.0M  ±  566K     91.5M  … 99.6M           4 ( 2%)          -  1.4% ±  0.6%
  instructions        247M  ±  265       247M  …  247M           0 ( 0%)        💩+  3.1% ±  0.0%
  cache_references   2.54M  ± 64.5K     2.40M  … 3.18M           3 ( 2%)        ⚡-  5.6% ±  2.0%
  cache_misses       59.4K  ± 2.22K     54.9K  … 68.4K           4 ( 2%)          -  0.9% ±  1.6%
  branch_misses      1.04M  ± 3.79K     1.03M  … 1.04M           0 ( 0%)          -  0.3% ±  0.1%
Benchmark 4 (200 runs): /tmp/labeled-match-fast rs-chunked 16 silesia-small.tar.gz
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          25.0ms ±  488us    24.2ms … 28.4ms          6 ( 3%)        ⚡-  1.8% ±  0.8%
  peak_rss           23.2MB ± 1.15MB    20.7MB … 25.8MB          0 ( 0%)          -  0.7% ±  1.0%
  cpu_cycles         92.3M  ± 1.38M     91.6M  …  104M          19 (10%)          -  1.0% ±  0.6%
  instructions        247M  ±  260       247M  …  247M           1 ( 1%)        💩+  3.1% ±  0.0%
  cache_references   2.56M  ±  109K     2.37M  … 3.30M          11 ( 6%)        ⚡-  4.6% ±  2.0%
  cache_misses       58.4K  ± 1.83K     53.8K  … 68.0K           4 ( 2%)          -  2.4% ±  1.6%
  branch_misses      1.04M  ± 2.83K     1.03M  … 1.05M           0 ( 0%)          -  0.2% ±  0.1%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment