
@kyle0r
Last active April 6, 2025 19:35

Revisions

  1. kyle0r renamed this gist May 26, 2024. 1 changed file with 3 additions and 3 deletions.
    6 changes: 3 additions & 3 deletions -iostats-for-zpool.md → iostats-for-zpool.md
    Original file line number Diff line number Diff line change
    @@ -3,7 +3,7 @@ Current version: 2024.21.1

    Run `iostats-for-zpool --help` for usage.

    `iostats-for-zpool` will attempt to show you a side-by-side iostat and zpool iostat with pool devices grouped together.
    `iostats-for-zpool` will attempt to show you a side-by-side `iostat` and `zpool iostat` with pool devices grouped together.

    Intro video:

    @@ -53,9 +53,9 @@ Place or link the script to a path in `$PATH`
    mkdir -p ~/bin ; mv -i ~/iostats-for-zpool ~/bin ; export PATH="$PATH":~/bin
    ```

    Note: that some distributions with automatically add `~/bin` to `$PATH` if `~/bin` exists. Test this, YMMV!
    Note: that some distros will automatically add `~/bin` to `$PATH` if `~/bin` exists. Test this, YMMV!
    For example modern Debian handles this in `~/.profile`, so creating `~/bin` and starting a new shell or
    re-sourcing `. ~/.profile` will automatically prepend `~bin` to your `$PATH`.
    re-sourcing `. ~/.profile` will automatically prepend `~/bin` to your `$PATH`.

    **Alternative:** Assuming `$PATH` contains `/usr/local/bin` and you have `sudo` rights:
    ```
  2. kyle0r revised this gist May 26, 2024. 1 changed file with 7 additions and 1 deletion.
    8 changes: 7 additions & 1 deletion iostats-for-zpool.sh
    @@ -310,6 +310,13 @@ See known issue on handling column values and widths and table formatting.
    See known issue about time/tick drift between the three fifos. Try testing
    GNU parallel to see if this mitigates the drift.
    --
    Could add a warning when running as root, as root should not be required, to
    make users cognizant of the user/privileges the script is running with. An
    option, e.g. --disable-run-as-root-warning, would disable the warning.
    USAGE
    )
    @@ -466,7 +473,6 @@ fifo_iostat_device_extended="$(mktemp "/tmp/${me}-iostat-device-extended.XXXXXXX
    fifo_zpool_iostat="$(mktemp "/tmp/${me}-zpool-iostat.XXXXXXX.fifo")"
    for fifo in "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat"; do
    if rm "$fifo"; then
    sync
    mkfifo "$fifo" || die "problem initialising fifo: $fifo"
    else
    die "problem setting up fifo: $fifo"
  3. kyle0r revised this gist May 26, 2024. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions -iostats-for-zpool.md
    @@ -14,7 +14,7 @@ https://gist.github.com/assets/517822/a707e622-7fb3-4930-bdb0-689f783946de
    You can get **_the latest_** version of `iostats-for-zpool` from this gist with the following link:
    https://gist.githubusercontent.com/kyle0r/0b3c7894b19f9aedf1633ba75197a28e/raw/iostats-for-zpool.sh

    Here are some common ways to download the file via shell
    Here are some common ways to download the script via shell

    ## with `curl`

    @@ -57,7 +57,7 @@ Note: that some distributions with automatically add `~/bin` to `$PATH` if `~/bi
    For example modern Debian handles this in `~/.profile`, so creating `~/bin` and starting a new shell or
    re-sourcing `. ~/.profile` will automatically prepend `~bin` to your `$PATH`.

    **Alternative:** Assuming `$PATH` contains `/usr/local/bin` and you have sudo rights:
    **Alternative:** Assuming `$PATH` contains `/usr/local/bin` and you have `sudo` rights:
    ```
    # if you prefer a symlink
    sudo ln -s ~/iostats-for-zpool /usr/local/bin/iostats-for-zpool
  4. kyle0r revised this gist May 26, 2024. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion -iostats-for-zpool.md
    @@ -5,7 +5,9 @@ Run `iostats-for-zpool --help` for usage.

    `iostats-for-zpool` will attempt to show you a side-by-side iostat and zpool iostat with pool devices grouped together.

    Intro video: TODO
    Intro video:

    https://gist.github.com/assets/517822/a707e622-7fb3-4930-bdb0-689f783946de

    # INSTALL

  5. kyle0r revised this gist May 25, 2024. 1 changed file with 84 additions and 1 deletion.
    85 changes: 84 additions & 1 deletion -iostats-for-zpool.md
    @@ -1 +1,84 @@
    .
    First version: 2024.20.1
    Current version: 2024.21.1

    Run `iostats-for-zpool --help` for usage.

    `iostats-for-zpool` will attempt to show you a side-by-side iostat and zpool iostat with pool devices grouped together.

    Intro video: TODO

    # INSTALL

    You can get **_the latest_** version of `iostats-for-zpool` from this gist with the following link:
    https://gist.githubusercontent.com/kyle0r/0b3c7894b19f9aedf1633ba75197a28e/raw/iostats-for-zpool.sh

    Here are some common ways to download the file via shell

    ## with `curl`

    ```
    curl -sSL 'https://gist.githubusercontent.com/kyle0r/0b3c7894b19f9aedf1633ba75197a28e/raw/iostats-for-zpool.sh' > ~/iostats-for-zpool
    ```

    ## with `wget`

    ```
    wget -qO- 'https://gist.githubusercontent.com/kyle0r/0b3c7894b19f9aedf1633ba75197a28e/raw/iostats-for-zpool.sh' > ~/iostats-for-zpool
    ```

    ## with `apt-helper`

    If curl and wget are not available:

    ```
    /usr/lib/apt/apt-helper download-file 'https://gist.githubusercontent.com/kyle0r/0b3c7894b19f9aedf1633ba75197a28e/raw/iostats-for-zpool.sh' ~/iostats-for-zpool
    ```

    ## `chmod`

    ```
    chmod a+rx ~/iostats-for-zpool
    ```

    You can now run the script via `~/iostats-for-zpool`.

    ## OPTIONAL

    Place or link the script to a path in `$PATH`

    `~/bin` example that does not require sudo rights:
    ```
    mkdir -p ~/bin ; mv -i ~/iostats-for-zpool ~/bin ; export PATH="$PATH":~/bin
    ```

    Note: that some distributions with automatically add `~/bin` to `$PATH` if `~/bin` exists. Test this, YMMV!
    For example modern Debian handles this in `~/.profile`, so creating `~/bin` and starting a new shell or
    re-sourcing `. ~/.profile` will automatically prepend `~bin` to your `$PATH`.

    **Alternative:** Assuming `$PATH` contains `/usr/local/bin` and you have sudo rights:
    ```
    # if you prefer a symlink
    sudo ln -s ~/iostats-for-zpool /usr/local/bin/iostats-for-zpool
    # if you prefer moving the script to /usr/local/bin
    sudo mv -i ~/iostats-for-zpool /usr/local/bin/
    ```

    # USAGE EXAMPLES

    ```
    iostats-for-zpool /dev/sda /dev/sdf 10
    Show zpool iostats for rpool (default) pool, and grouped iostats for sda+sdf, 10 sec interval
    iostats-for-zpool -p tank /dev/sda /dev/sdf 1 30
    Show zpool iostats for tank pool, and grouped iostats for sda+sdf, 1 sec interval, 30 times (30 secs)
    iostats-for-zpool -p storage /dev/sdc /dev/sdm /dev/sdv 300 12
    Show zpool iostats for storage pool, and grouped iostats for sdc+sdm+sdv, 5 min interval, 12 times (1 hour)
    ```

    # TODO

    TODO: add a changelog

    Additional TODOs and IDEAS are captured in the `--help` text.
  6. kyle0r revised this gist May 25, 2024. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion iostats-for-zpool.sh
    @@ -2,7 +2,7 @@

    # This script was created on Debian and should be POSIX compliant
    # shell check OK, some exceptions created
    version='2024.20.1'
    version='2024.21.1'
    # version convention: DIN ISO 8601 date +%G\.%V\.1 (YEAR.WEEK.RELEASE)
    #+ where 1 is incremented per release within the given week
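
The version convention in the comment above can be reproduced on the command line. This is a minimal sketch, assuming a `date` that supports `%G` (ISO week-based year) and `%V` (ISO week number), as GNU coreutils `date` does. The `release` variable is hypothetical and only illustrates the trailing RELEASE counter.

```shell
#!/bin/sh
# Sketch: build a YEAR.WEEK.RELEASE version string.
# "release" is a hypothetical variable, bumped by hand for each
# release within the same ISO week.
release=1
version="$(date +%G.%V).${release}"
printf '%s\n' "$version"
```

A release cut in ISO week 21 of 2024 would therefore print `2024.21.1`, which matches the version bump in this revision.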

  7. kyle0r revised this gist May 25, 2024. 1 changed file with 32 additions and 9 deletions.
    41 changes: 32 additions & 9 deletions iostats-for-zpool.sh
    @@ -370,10 +370,27 @@ shift $((OPTIND-1)) # remove parsed options and args from $@ list
    [ -x "$(command -v which)" ] || die "program 'which' dependency not found. aborting."
    kill_path=$(which kill)

    for dep in "$kill_path" zpool readlink iostat ts awk stdbuf paste mkfifo fuser; do
    for dep in "$kill_path" zpool readlink iostat ts awk stdbuf paste mkfifo fuser mktemp; do
    [ -x "$(command -v "$dep")" ] || die "program '$dep' dependency not found. aborting."
    done

    # the script depends on gawk
    awkname="$(readlink -f "$(which awk)")"
    awkname="${awkname##*/}"
    if [ "gawk" = "$awkname" ]; then
    : # we assume gawk is available, nothing to do
    elif [ "mawk" = "$awkname" ]; then
    die "mawk has been detected, this script requires gawk. exiting."
    else
    if { awk -Wversion || awk --version; } 2>/dev/null | grep -iq mawk; then
    die "mawk has been detected, this script requires gawk. exiting."
    elif { awk -Wversion || awk --version; } 2>/dev/null | grep -iq gnu; then
    : # we assume gawk is available, nothing to do
    else
    die "this script requires gawk but it could not be detected. exiting."
    fi
    fi

    # validate the specified zpool exists
    zpool status "$pool" 1>/dev/null 2>&1 || die "$pool does not seem to exist? aborting."

    @@ -417,8 +434,8 @@ terminate() {
    fuser -k -TERM "$fifo" 1>/dev/null 2>&1
    done

    echo "fifo's processes signaled to terminate. waiting for processes to die..." 1>&2
    wait # wait for background processes to die
    echo "fifo's processes signaled to terminate. waiting 3 seconds for processes to die..." 1>&2
    sleep 3

    # cleanup fifos
    for fifo in "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat"; do
    @@ -442,12 +459,18 @@ trap "terminate || true" TERM INT EXIT
    ######################################################################
    # START main script

    # create fifo's
    fifo_iostat_device="/tmp/${me}-iostat-device.fifo"
    fifo_iostat_device_extended="/tmp/${me}-iostat-device-extended.fifo"
    fifo_zpool_iostat="/tmp/${me}-zpool-iostat.fifo"
    # create fifo's: first mktemp to safely establish a unique path, then rm and mkfifo, mitigates attack vectors
    # safety context: https://stackoverflow.com/a/11636850
    fifo_iostat_device="$(mktemp "/tmp/${me}-iostat-device.XXXXXXX.fifo")"
    fifo_iostat_device_extended="$(mktemp "/tmp/${me}-iostat-device-extended.XXXXXXX.fifo")"
    fifo_zpool_iostat="$(mktemp "/tmp/${me}-zpool-iostat.XXXXXXX.fifo")"
    for fifo in "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat"; do
    mkfifo "$fifo"
    if rm "$fifo"; then
    sync
    mkfifo "$fifo" || die "problem initialising fifo: $fifo"
    else
    die "problem setting up fifo: $fifo"
    fi
    done

    : <<INFO
    @@ -681,4 +704,4 @@ stdbuf -oL paste "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpo
    ' | stdbuf -oL ts "$ts_format"


    # END of script
    # END of script
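
The fifo setup in this revision reserves a unique path with `mktemp`, removes the placeholder file, and then calls `mkfifo` on the same path. An alternative, which is not what the script does, is to create a private directory with `mktemp -d` and create the fifos inside it. A minimal sketch follows, assuming a `mktemp` that supports `-d` with a template (GNU and BSD versions do). The `me` value and the `fifo_dir` variable are hypothetical stand-ins.

```shell
#!/bin/sh
# Sketch of an alternative fifo setup (not the script's own method):
# create a private directory, then create the fifos inside it.
me="fifo-demo"   # hypothetical stand-in for the script's $me
fifo_dir="$(mktemp -d "/tmp/${me}.XXXXXXX")" || { echo "mktemp failed" >&2; exit 2; }

# remove the directory and its fifos when the shell exits
trap 'rm -rf "$fifo_dir"' EXIT

for name in iostat-device iostat-device-extended zpool-iostat; do
  mkfifo "$fifo_dir/$name.fifo" || { echo "problem initialising fifo: $name" >&2; exit 2; }
done

ls -l "$fifo_dir"
```

Because the directory is created by `mktemp -d` for the current user only, there is no window between removing a placeholder and creating the fifo.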
  8. kyle0r revised this gist May 16, 2024. 1 changed file with 93 additions and 15 deletions.
    108 changes: 93 additions & 15 deletions iostats-for-zpool.sh
    @@ -41,7 +41,7 @@ stdin_is_terminal=no
    # if stdin is a terminal i.e. an interactive session.
    if [ -t 0 ]; then
    stdin_is_terminal=yes
    [ -x "$(command -v stty)" ] || { die 'program 'stty' dependency not found. aborting.'; }
    [ -x "$(command -v stty)" ] || { die "program 'stty' dependency not found. aborting."; }
    original_tty_state=$(stty -g)

    # file descriptor for direct writes to tty, e.g. when stdout/err are already being redirected.
    @@ -74,7 +74,7 @@ $(print_version)
    EXAMPLES
    $me /dev/sda /dev/sdf 10
    Show zpool iostats for rpool pool, and grouped iostats for sda+sdf, 10 sec interval
    Show zpool iostats for rpool (default) pool, and grouped iostats for sda+sdf, 10 sec interval
    $me -p tank /dev/sda /dev/sdf 1 30
    Show zpool iostats for tank pool, and grouped iostats for sda+sdf, 1 sec interval, 30 times (30 secs)
    @@ -101,6 +101,8 @@ You can monitor how much physical MiB is read or written in a given interval.
    For IOPS, you can compare how much logical IO produced how much physical IO.
    The same is true for IO bandwidth.
    You can monitor the average physical read and write block size.
    The data is taken from 3 places, iostat -d, iostat -dx and zpool iostat and
    combined into a single view.
    @@ -126,6 +128,69 @@ OPTIONS
    -p the zpool to monitor
    OUTPUT COLUMNS
    iostat - Physical device IO
    Grouping: $me groups iostat values together for the provided devices.
    tps - cite: man iostat
    Indicate the number of grouped transfers per second that were issued to the
    specified device(s). A transfer is an I/O request to a device. Multiple
    logical requests can be combined into a single I/O request to a device. A
    transfer is of indeterminate size.
    r io/s - renamed iostat r/s - cite: man iostat
    Read IOPS. The number (after merges) of grouped read requests
    completed per second for the specified device(s).
    rMB/s - cite: man iostat
    The number of grouped sectors (kilobytes, megabytes) read from the specified
    device(s) per second.
    r-sz - renamed iostat rareq-sz - cite: man iostat
    The average grouped size (in kilobytes) of the read requests that were issued
    to the specified device(s).
    MB_read - cite: man iostat
    The total number of grouped blocks (megabytes) read from the specified
    device(s) since the last interval.
    w io/s - renamed iostat w/s - cite: man iostat
    Write IOPS. The number (after merges) of grouped write requests
    completed per second for the specified device(s).
    wMB/s - cite: man iostat
    The number of grouped sectors (kilobytes, megabytes) written to the specified
    device(s) per second.
    w-sz - renamed iostat wareq-sz - cite: man iostat
    The average grouped size (in kilobytes) of the write requests that were issued
    to the specified device(s).
    MB_wrtn - cite: man iostat
    The total number of grouped blocks (megabytes) written to the specified
    device(s) since the last interval.
    --
    zpool iostat - logical ZFS IO
    pool
    r io/s
    Read IOPS.
    r bw/s
    Read bandwidth per second.
    w io/s
    Write IOPS.
    w bw/s
    Write bandwidth per second.
    THINGS TO KEEP IN MIND
    If the scenario arises that output lines appear to be out of sync, it is
    @@ -271,7 +336,6 @@ if [ ! -x "$(command -v "$PAGER")" ]; then # less not found
    fi



    ######################################################################
    # START getopts related code
    # Should be POSIX compatible
    @@ -295,8 +359,9 @@ while getopts hi:p:-: OPT; do # allow -a, -b with arg, -c, and -- "with arg"
    done
    shift $((OPTIND-1)) # remove parsed options and args from $@ list
    # END getopts
    ######################################################################


    ######################################################################
    # START validation of options, arguments and dependencies

    [ -z "$*" ] && die "missing one or more devices to monitor. aborting."
    @@ -334,8 +399,9 @@ for device in "$@"; do
    done

    # END validation
    ######################################################################


    ######################################################################
    # START trap
    # trap function - the script attempts a clean shutdown and finally terminates the process group
    terminate() {
    @@ -369,8 +435,10 @@ terminate() {
    # Define the trap for various signals
    # Modified signal names for POSIX compliance
    trap "terminate || true" TERM INT EXIT

    # END trap


    ######################################################################
    # START main script

    @@ -442,10 +510,10 @@ the start of the script, the output *should* be *relatively* in-sync.
    I say relative because the maximum human-friendly time resolution that iostat
    and zpool iostat can display is 1 second aka 1000ms. The running kernel HZ
    constant determines the number of ticks aka jiffies per second, on the script
    constant determines the number of ticks aka jiffies per second. On the script
    development system HZ=250. This means 0.004 seconds aka 4ms per kernel jiffy.
    So a given background process will start blocking at a given jiffy within a
    given second.
    So a given background process will start up, and then start blocking at a given
    jiffy within a given second.
    Note: There is also the USER_HZ constant, which AFAIK affects the granularity
    of userland programs reporting date and time. Not to be confused with the
    @@ -466,7 +534,7 @@ However, these topics are somewhat outside the scope of a simple POSIX shell
    script.
    Based on observations during development, paste will not perform any
    synchronisation on or between the 3 fifos it is reading, BUT it should start
    synchronisation on or between the 3 fifos it is reading. paste *should* start
    reading all fifos at *nearly* the same time. I'd expect the delta to be a few
    jiffies. We could probably observe this by watching strace with ns/ms
    granularity on the timestamps.
    @@ -477,16 +545,19 @@ STOP signal after they are started. Then, just before the script starts reading
    the fifos (and outputting to the console), send the processes the CONT signal.
    This should not be necessary, as this is essentially what the fifo provides us
    with, the background processes should be blocked as soon as they open their
    respective fifo.
    respective fifo, so the block should occur prior to the first fifo write.
    I'm not sure exactly what side effect this blocking has on the accuracy of the
    values in the first few intervals of iostat and zpool iostat, perhaps the
    blocking happens early enough that it doesn't mess with the internal timers and
    such of the waiting programs. Just something to keep in mind.
    INFO
    # give the background procs a chance to start doing their work
    #sleep 0.5
    Reference:
    Kernel HZ aka CONFIG_HZ constant can be calculated using awk per:
    https://stackoverflow.com/a/17371631/490487
    USER_HZ aka sysconf(_SC_CLK_TCK) can be retrieved via: getconf CLK_TCK
    INFO

    : <<TODO
    NOTES ON TABLE AND COLUMN LAYOUT
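
The tick arithmetic in the INFO notes (HZ=250 gives 4 ms per jiffy) is simply 1000 divided by the tick rate. A small sketch follows. `ms_per_tick` is a hypothetical helper, and the value printed for `getconf CLK_TCK` depends on the system it runs on.

```shell
#!/bin/sh
# Sketch: milliseconds per tick for a given tick rate.
# ms_per_tick is a hypothetical helper for illustration only.
ms_per_tick() { awk -v hz="$1" 'BEGIN { printf "%g\n", 1000 / hz }'; }

ms_per_tick 250                    # kernel HZ quoted for the development system
ms_per_tick "$(getconf CLK_TCK)"   # userland tick rate reported by getconf
```

With a 1 second minimum reporting interval, a few ticks of start-up skew between the three background processes is small, but it accumulates into the drift described under KNOWN ISSUES.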
    @@ -549,11 +620,11 @@ TODO
    # shellcheck disable=SC2016
    stdbuf -oL paste "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat" | \
    stdbuf -oL awk 'BEGIN {OFS="\t"} {sep ("" == $0)?"":"|"} {print $2,$10,$11,$15,$6,$16,$17,$21,$7," |",$32,$35,$37,$36,$38}' | \
    stdbuf -oL awk -v stdin_is_terminal="$stdin_is_terminal" '
    stdbuf -oL awk -v stdin_is_terminal="$stdin_is_terminal" -v interval="$interval" '
    BEGIN {
    term_size()
    OFS="\t"
    print "START"
    print "START - collecting data for first interval: "interval" seconds."
    }
    function is_term() {
    @@ -594,13 +665,20 @@ stdbuf -oL paste "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpo
    }
    print ""; output++
    }
    # first record? print header and skip the record
    (1 == NR) {
    header=$0
    print_header()
    next
    }
    # non-header records/lines
    {print;output++}
    # repeating header logic based on terminal rows/lines
    (is_term() && output >= term_lines && 0 == (output % term_lines)) {print ""; print_header();output++;output++}
    ' | stdbuf -oL ts "$ts_format"


    # END of script
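
The header handling added to the final awk stage treats the first record as the header and reprints it periodically. The same pattern can be shown in isolation. This sketch repeats the header after a fixed number of data lines rather than after a screenful of terminal rows, and `with_repeating_header` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: treat the first input line as a header and repeat it after
# every $1 data lines. with_repeating_header is a hypothetical helper.
with_repeating_header() {
  awk -v page="$1" '
    NR == 1 { header = $0; print header; next }   # remember and print the header
    { print; output++ }                           # pass data lines through
    output % page == 0 { print header }           # repeat after every full page
  '
}

printf '%s\n' 'dev tps' 'sda 10' 'sda 12' 'sda 9' 'sda 11' | with_repeating_header 2
```

In the script the page size comes from the terminal height and the repeat only happens when stdin is a terminal, so piped or logged output keeps a single header.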
  9. kyle0r revised this gist May 16, 2024. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion iostats-for-zpool.sh
    @@ -58,7 +58,7 @@ usage() {
    NAME
    $me - side-by-side iostat and zpool iostat with pool devices group together
    $me - side-by-side iostat and zpool iostat with pool devices grouped together
    SYNOPSIS
  10. kyle0r revised this gist May 16, 2024. 1 changed file with 0 additions and 15 deletions.
    15 changes: 0 additions & 15 deletions iostats-for-zpool.sh
    @@ -374,21 +374,6 @@ trap "terminate || true" TERM INT EXIT
    ######################################################################
    # START main script

    # read the rows and cols size of the current shell window
    # should be POSIX compliant
    # https://stackoverflow.com/a/39937626/490487
    # shellcheck disable=SC2034,SC2162
    #read rows cols << EOF
    #$(stty size)
    #EOF

    : <<TODO
    update rows/cols based on shell window resize. SIGWINCH?
    I was thinking that the script could auto-adjust to the number of rows, once per header output?
    This would require writing a simple awk function to check if stty sizes has changed since the previous check
    The logic could perhaps be moved inside awk, this would avoid having to pass the variable to awk.
    TODO

    # create fifo's
    fifo_iostat_device="/tmp/${me}-iostat-device.fifo"
    fifo_iostat_device_extended="/tmp/${me}-iostat-device-extended.fifo"
  11. kyle0r revised this gist May 16, 2024. 1 changed file with 621 additions and 1 deletion.
    622 changes: 621 additions & 1 deletion iostats-for-zpool.sh
    @@ -1 +1,621 @@
    .
    #!/bin/sh

    # This script was created on Debian and should be POSIX compliant
    # shell check OK, some exceptions created
    version='2024.20.1'
    # version convention: DIN ISO 8601 date +%G\.%V\.1 (YEAR.WEEK.RELEASE)
    #+ where 1 is incremented per release within the given week

    me="${0##*/}" # basename
    me="${me%.sh}" # strip .sh suffix
    print_version() { printf "%s version: %s\n" "$me" "$version"; }
    ts_format='[%Y-%m-%dT%H:%M:%S]'

    ######################################################################
    # START defaults
    interval=1
    count=
    pool=rpool


    ######################################################################
    # START functions
    # https://stackoverflow.com/a/61835747/490487
    is_num() { case ${1#[-+]} in '' | . | *[!0-9.]* | *.*.* ) return 1;; esac ;}

    # handy POSIX pause function https://unix.stackexchange.com/a/293941/19406
    pause() { printf "%s" "Press ENTER to continue... or CTRL+C to abort."; read -r _; }

    # complain to STDERR and exit with error
    die() { echo "$*" >&2; exit 2; }
    # handle options that require an arg
    needs_arg() { [ -z "$OPTARG" ] && die "No arg for --$OPT option"; }

    # func to check if stdin is a terminal
    is_term() { [ 'yes' = "$stdin_is_terminal" ] && return 0 || return 1; }


    ######################################################################
    # START tty handling
    stdin_is_terminal=no
    # if stdin is a terminal i.e. an interactive session.
    if [ -t 0 ]; then
    stdin_is_terminal=yes
    [ -x "$(command -v stty)" ] || { die 'program 'stty' dependency not found. aborting.'; }
    original_tty_state=$(stty -g)

    # file descriptor for direct writes to tty, e.g. when stdout/err are already being redirected.
    exec 3<> /dev/tty
    else
    exec 3<> /dev/null
    fi


    ######################################################################
    # START usage related code
    usage() {
    usage=$(cat <<USAGE
    NAME
    $me - side-by-side iostat and zpool iostat with pool devices group together
    SYNOPSIS
    $me [ -p zpool ] device ... [ interval [ count ] ]
    VERSION
    $(print_version)
    EXAMPLES
    $me /dev/sda /dev/sdf 10
    Show zpool iostats for rpool pool, and grouped iostats for sda+sdf, 10 sec interval
    $me -p tank /dev/sda /dev/sdf 1 30
    Show zpool iostats for tank pool, and grouped iostats for sda+sdf, 1 sec interval, 30 times (30 secs)
    $me -p storage /dev/sdc /dev/sdm /dev/sdv 300 12
    Show zpool iostats for storage pool, and grouped iostats for sdc+sdm+sdv, 5 min interval, 12 times (1 hour)
    DESCRIPTION
    This script was created to unify the output of iostat and zpool iostat. The
    details can be viewed interactively OR logged OR passed to another script for
    parsing.
    The primary goal is to give a sysop a quick overview of the physical device
    iostats and the logical ZFS zpool iostats, and to be able to compare them side
    by side.
    That is, for a given IO workload on a zpool, what does the IO look like when
    the IO requests hit the physical pool devices.
    You can monitor how much physical MiB is read or written in a given interval.
    For IOPS, you can compare how much logical IO produced how much physical IO.
    The same is true for IO bandwidth.
    The data is taken from 3 places, iostat -d, iostat -dx and zpool iostat and
    combined into a single view.
    The -p option is optional and tells zpool iostat which pool to monitor.
    The -p option defaults to: rpool
    At least one device argument is required, this tells iostat which devices to
    monitor and group together.
    The interval and count arguments are optional and default to:
    interval=1 count=null
    Mimicking iostat, the interval parameter specifies the time in seconds between
    each report. The count parameter can be specified in conjunction with the
    interval parameter. If the count parameter is specified, the value of count
    determines the number of reports generated at interval seconds apart.
    OPTIONS
    -h shows script usage information.
    -p the zpool to monitor
    THINGS TO KEEP IN MIND
    If the scenario arises that output lines appear to be out of sync, it is
    important to remember that this does not automatically mean that there is an
    output time drift or output synchronicity issue. It is natural for logical and
    physical IO to sometimes be aligned within the same second, but more probable
    that there is some natural delay between logical and physical IO. The longer
    you observe at short intervals, the more likely it is that this "it looks out
    of sync" phenomenon will occur.
    Remember, there is a lot going on in an IO subsystem in 1 second. You have the
    kernel, the kernel device IO scheduler, the ZFS code paths and the IO
    scheduler. These factors naturally cause things to happen at different times,
    and observing perfectly synchronised IO within a 1 second time window is
    relatively unlikely.
    Just keep in mind that observing IO at short intervals has the advantage of
    seeing "what is happening in real time", but also has the disadvantage that 1
    second is a short observation window for the IO to remain aligned.
    KNOWN ISSUES
    It seems that even if the output timestamps of the three fifos stay in sync
    (accuracy ~1000ms), the internal clocks/ticks of iostat and zpool iostat are
    slightly different, causing drift. Or maybe it is the observer effect, where
    observing 3 different processes in parallel, which were not started at exactly
    the same nanosecond, or did not start to output at exactly the same nanosecond,
    will inherently lead to time drift and/or delay in the observations, and then
    there is the factor of the clock/tick of the processes themselves. Since the
    shortest interval between lines of output is 1 second, it does not take many ms
    of drift for one line to fall out of sync with another (fall into another
    second/line).
    This issue manifests itself as values being out of sync/sequence between
    iostat and zpool iostat. It is also possible that the time taken by this script
    to process and output the data may cause or contribute to the drift. This issue
    becomes more apparent at shorter, more granular intervals.
    My assessment is that this is not a major problem because typically shorter
    intervals are observed by a human interactively for a short period of time, so
    the problem does not have time to occur in this scenario. It is only after a
    few minutes with a 1 second interval setting that the issue starts to occur.
    Being aware of this issue, the sysop can simply restart the script.
    The longer the interval, the more irrelevant this drift problem becomes. This
    makes the script suitable for analysis over a longer period of time.
    I'm not sure how to solve this issue. Suggestions are welcome. Perhaps this is
    a non-issue and just the reality of observation at short intervals? It may not
    be worth the effort to fix it, or it may not be fixable. I think it is unlikely
    to discover a primary key (other than time) to link and synchronise iostat and
    zpool iostat together.
    One idea that came to mind was to try using GNU parallel to run the 3
    sub-processes, as this supports line-buffered grouped/synchronised sub-process
    output. I suspect the same drift will happen. Would be worth testing. If this
    solves the issue, it would create a dependency on GNU parallel. This could be
    mitigated by offering an option to choose between the existing logic and GNU
    parallel.
    See the code notes section: NOTES ON PROCESS & OUTPUT SYNCHRONICITY
    --
    There are some challenges with the alignment of columns when a value's
    character width overflows certain sizes. I've written a detailed TODO inline.
    PORTABILITY / COMPATIBILITY
    The script should be POSIX compliant.
    The script does have a number of dependencies which may be missing on some
    compact distros like alpine, or distros that use busybox.
    ENVIRONMENT
    Nothing specific
    AUTHORS
    2024 https://github.com/kyle0r
    BUGS
    Post on the GitHub gist:
    https://gist.github.com/kyle0r/0b3c7894b19f9aedf1633ba75197a28e
    CONTRIBUTING
    Feel free to make requests or fork the gist (see BUGS).
    LICENSE
    MIT
    DISCLAIMER
    This script is provided "AS IS" and any express or implied warranties,
    including, but not limited to, the implied warranties of
    merchantability and fitness for a particular purpose are disclaimed.
    See the distributed LICENSE file for complete details.
    TODO / IDEAS
    See known issue on handling column values and widths and table formatting.
    --
    See known issue about time/tick drift between the three fifos. Try testing
    GNU parallel to see if this mitigates the drift.
    USAGE
    )
    printf "%s" "$usage" | "$PAGER"; exit 1
    }

    usage_with_prompt() {
    printf "\n%s\n" "PAGER ($PAGER) will now run to display help/usage info."
    pause
    usage
    }

    # use less as PAGER fallback if no env pager is defined
    # https://stackoverflow.com/a/28085062/490487
    : "${PAGER:=less}"

    # check for PAGER dependencies
    if [ ! -x "$(command -v "$PAGER")" ]; then # $PAGER not found
    missing_pager="$PAGER" # remember the name for the error message
    PAGER="cat"
    if [ ! -x "$(command -v "$PAGER")" ]; then # cat not found
    echo "$missing_pager and cat not found in PATH. aborting." 1>&2
    exit 1
    fi
    fi



    ######################################################################
    # START getopts related code
    # Should be POSIX compatible
    # Thank you: https://stackoverflow.com/users/519360/adam-katz
    # https://stackoverflow.com/a/28466267/490487

    while getopts hi:p:-: OPT; do # allow -a, -b with arg, -c, and -- "with arg"
    # support long options: https://stackoverflow.com/a/28466267/519360
    if [ "$OPT" = "-" ]; then # long option: reformulate OPT and OPTARG
    OPT="${OPTARG%%=*}" # extract long option name
    OPTARG="${OPTARG#"$OPT"}" # extract long option argument (may be empty)
    OPTARG="${OPTARG#=}" # if long option argument, remove assigning `=`
    fi
    case "$OPT" in
    h | help ) usage ;;
    i | interval ) interval="${OPTARG:-$interval}" ;; # optional argument
    p | pool ) pool="${OPTARG:-$pool}" ;; # optional argument
    \? ) usage_with_prompt ;; # error reported via getopts
    * ) echo "Illegal option --$OPT" 1>&2 ; usage_with_prompt ;; # bad long option
    esac
    done
    shift $((OPTIND-1)) # remove parsed options and args from $@ list
    # END getopts
    ######################################################################

    # START validation of options, arguments and dependencies

    [ -z "$*" ] && die "missing one or more devices to monitor. aborting."

    # we require which to find the kill program not the builtin
    [ -x "$(command -v which)" ] || die "program 'which' dependency not found. aborting."
    kill_path=$(which kill)

    for dep in "$kill_path" zpool readlink iostat ts awk stdbuf paste mkfifo fuser; do
    [ -x "$(command -v "$dep")" ] || die "program '$dep' dependency not found. aborting."
    done

    # validate the specified zpool exists
    zpool status "$pool" 1>/dev/null 2>&1 || die "$pool does not seem to exist? aborting."

    # reverse the args: makes for easier validation logic
    # https://unix.stackexchange.com/a/560698/19406
    arg=''; for a in "$@"; do
    # shellcheck disable=SC2086
    set -- "$a" ${arg-"$@"} # note the $@ is quoted and expansion is not
    unset arg
    done

    # parse args to determine if count and interval are present
    if is_num "$1" && is_num "$2"; then
    count="$1"; shift
    interval="$1"; shift
    elif is_num "$1"; then
    interval="$1"; shift
    fi

    # are the remaining arguments block devices?
    for device in "$@"; do
    [ -b "$(readlink -f -- "$device")" ] || die "$device is not a valid block device. aborting."
    done

    # END validation
    ######################################################################

    # START trap
    # trap function - the script attempts a clean shutdown and finally terminates the process group
    terminate() {
    trap '' TERM INT EXIT # ignore further signals
    printf "\n%s\n" "$me has been signaled to clean up and terminate..." 1>&2

    # reset terminal after CTRL+C INT signal.
    #+ https://stackoverflow.com/a/31810254
    is_term && stty "$original_tty_state"

    for fifo in "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat"; do
    # send TERM signal to any processes using the fifos
    fuser -k -TERM "$fifo" 1>/dev/null 2>&1
    done

    echo "fifo processes signaled to terminate. waiting for processes to die..." 1>&2
    wait # wait for background processes to die

    # cleanup fifos
    for fifo in "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat"; do
    rm "$fifo"
    done

    echo "fifos cleaned up. final shutdown..." 1>&2
    # kill the process group -$$
    # Try to kill any lingering procs spawned by this script
    # https://stackoverflow.com/a/2173421/490487
    "$kill_path" -- -$$;
    }

    # Define the trap for various signals
    # Modified signal names for POSIX compliance
    trap "terminate || true" TERM INT EXIT
    # END trap

    ######################################################################
    # START main script

    # read the rows and cols size of the current shell window
    # should be POSIX compliant
    # https://stackoverflow.com/a/39937626/490487
    # shellcheck disable=SC2034,SC2162
    #read rows cols << EOF
    #$(stty size)
    #EOF

    : <<TODO
    update rows/cols based on shell window resize. SIGWINCH?
    I was thinking that the script could auto-adjust to the number of rows, once per header output?
    This would require writing a simple awk function to check if stty sizes has changed since the previous check
    The logic could perhaps be moved inside awk, this would avoid having to pass the variable to awk.
    TODO

    # create fifos
    fifo_iostat_device="/tmp/${me}-iostat-device.fifo"
    fifo_iostat_device_extended="/tmp/${me}-iostat-device-extended.fifo"
    fifo_zpool_iostat="/tmp/${me}-zpool-iostat.fifo"
    for fifo in "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat"; do
    mkfifo "$fifo"
    done

    : <<INFO
    ATTENTION // ACHTUNG
    Note how the invocations of "iostat" and "zpool iostat" leave the outer
    $interval and $count expansions unquoted.
    If these variables were simply quoted, an empty value, especially the
    optional $count, would be passed to the program as an empty string
    argument, which breaks its parsing of arguments and options.
    This script is trying to be POSIX compliant, so using an array to store
    the commands options and arguments is not possible.
    E.g.: https://unix.stackexchange.com/a/388477/19406
    ^^ this answer is "nice" but is not POSIX compliant/compatible.
    However, the :+ expansion expands to nothing if the variable is unset or
    empty. Note the inner variable is quoted, the outer expansion is not.
    E.g.: https://stackoverflow.com/a/20306982/490487
    INFO

    ######################################################################
    # start a background proc for iostat device info, writing to fifo
    # shellcheck disable=SC2086
    iostat -m -y -H -g ALL -d "$@" ${interval:+"$interval"} ${count:+"$count"} | stdbuf -oL awk '
    (/^Device/ && 4==NR) {print;next}
    /ALL/ {print}
    ' > "$fifo_iostat_device" &

    ######################################################################
    # start background proc for iostat extended device info, writing to fifo
    # shellcheck disable=SC2086
    iostat -m -y -H -g ALL -dx "$@" ${interval:+"$interval"} ${count:+"$count"} | stdbuf -oL awk '
    (/^Device/ && 4==NR) {print;next}
    /ALL/ {print}
    ' > "$fifo_iostat_device_extended" &

    ######################################################################
    # start background proc for zpool iostat, writing to fifo
    # shellcheck disable=SC2016,SC2086
    stdbuf -oL zpool iostat -ny "$pool" ${interval:+"$interval"} ${count:+"$count"} | \
    stdbuf -oL awk -v pool="$pool" '
    BEGIN {pool_re="^"pool}
    1 == NR {next}
    2 == NR {print;next}
    ($0 ~ pool_re) {print}
    ' > "$fifo_zpool_iostat" &

    : <<INFO
    NOTES ON PROCESS & OUTPUT SYNCHRONICITY
    When the first background process starts, it will open the specified fifo, at
    which point the process should be blocked from writing (the nature of a fifo)
    until the reading program (paste in our case) starts reading from the other end
    of the named pipe. The same applies to the subsequent background processes.
    This means that each background process should be blocked from writing as it
    waits for a reader on the other end of its respective pipe. This means that at
    the start of the script, the output *should* be *relatively* in-sync.
    I say relative because the maximum human-friendly time resolution that iostat
    and zpool iostat can display is 1 second aka 1000ms. The running kernel HZ
    constant determines the number of ticks aka jiffies per second, on the script
    development system HZ=250. This means 0.004 seconds aka 4ms per kernel jiffy.
    So a given background process will start blocking at a given jiffy within a
    given second.
    Note: There is also the USER_HZ constant, which AFAIK affects the granularity
    of userland programs reporting date and time. Not to be confused with the
    kernel HZ constant.
    This means that each background process starts at a slightly different time
    (a different jiffy of a given second). It also means that, for each writer
    process, the moment it starts and the moment it blocks on opening its fifo
    are very likely to be slightly different points in time.
    These small granular aspects of process time, system calls and interrupts will
    affect the synchronicity of the 3 separate processes output by this script.
    It may be possible to reduce the jiffy deltas and achieve more accurate output
    synchronicity by using a thread pool or similar utility such as GNU parallel.
    However, these topics are somewhat outside the scope of a simple POSIX shell
    script.
    Based on observations during development, paste will not perform any
    synchronisation on or between the 3 fifos it is reading, BUT it should start
    reading all fifos at *nearly* the same time. I'd expect the delta to be a few
    jiffies. We could probably observe this by watching strace with ns/ms
    granularity on the timestamps.
    I was wondering if a rudimentary improvement to the overall output
    synchronisation of the script might be to send the background processes the
    STOP signal after they are started. Then, just before the script starts reading
    the fifos (and outputting to the console), send the processes the CONT signal.
    This should not be necessary, as this is essentially what the fifo provides us
    with, the background processes should be blocked as soon as they open their
    respective fifo.
    I'm not sure exactly what side effect this blocking has on the accuracy of the
    values in the first few intervals of iostat and zpool iostat, perhaps the
    blocking happens early enough that it doesn't mess with the internal timers and
    such of the waiting programs. Just something to keep in mind.
    INFO

    # give the background procs a chance to start doing their work
    #sleep 0.5

    : <<TODO
    NOTES ON TABLE AND COLUMN LAYOUT
    IDEA: It would make sense to study the code of zpool iostat and iostat to see
    how they are dealing with these topics.
    TODO: table/column output improvements to address cosmetic issues.
    I'm fairly certain this is a cosmetic issue which should not affect machine
    parsability.
    In the current version, awk uses tabs to separate columns (OFS).
    This gives the impression of table output.
    There is some specific handling of the iostat and zpool iostat | separator.
    There are no logic checks around col value widths, nor truncation, so cosmetic
    issues can arise for wider values.
    Remember each line is being processed and output individually.
    There is no concept of an overall table to constrain the content.
    Naturally at the start of the program, we don't know the max char width of a
    future column.
    However, a col spec could be introduced...
    Once a line is output, it's static in the shell scrollback buffer. Not something
    we can change in the future.
    If a col value char width is greater than 7(?) then the line formatting will be
    pushed right for that line.
    This causes cosmetic misalignment with the header line, and other lines.
    This misalignment will be amplified if multiple col values go over the
    mentioned char width.
    As mentioned, this creates a cosmetic issue BUT should not impact machine
    parsing, as col separator remains the same.
    We don't want to truncate column values but maybe we want to do some rounding
    of decimals?
    E.g. iostat uses two decimal place precision, and zpool iostat uses whole
    numbers for IOPS
    So, this decimal precision could be standardised?
    "column -t" cannot handle this because it relies on seeing the entire content
    first.
    Further development of the cosmetic logic is required.
    Observation: zpool IOPS use whole numbers, iostat uses fractional. This could
    be standardised.
    Perhaps a decimal rounding standard + truncation is the way? Truncation could
    be clearly marked?
    Perhaps a decimal rounding standard + individual column width spec is required
    - I think this is how iostat works
    zpool iostat -nyp <pool> 1 looks like it uses fixed width columns.
    Another factor is standardising the units per column.
    This would require a spec per column.
    This would require using zpool iostat -nyp because the default changes the
    units dynamically.
    And perhaps POSIXLY_CORRECT=1 for iostat to get blocks, 512 byte blocks vs
    KiB's and MiB's*
    * This could be impacted by 4kn drives which don't support 512 byte blocks.
    ^^ How does iostat handle this today?
    TODO

    ######################################################################
    # use paste to merge the fifos and awk to format the output
    # shellcheck disable=SC2016
    stdbuf -oL paste "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat" | \
    stdbuf -oL awk 'BEGIN {OFS="\t"} {sep = ("" == $0)?"":"|"} {print $2,$10,$11,$15,$6,$16,$17,$21,$7," |",$32,$35,$37,$36,$38}' | \
    stdbuf -oL awk -v stdin_is_terminal="$stdin_is_terminal" '
    BEGIN {
    term_size()
    OFS="\t"
    print "START"
    }
    function is_term() {
    if ("yes" == stdin_is_terminal) return 1
    else return 0
    }
    # set term_lines to the height of the terminal
    function term_size() {
    if (is_term()) {
    cmd = "stty size <&3"; cmd | getline $0; close(cmd); term_lines = $1--
    }
    }
    function print_header() {
    # update term_lines in case of terminal resize
    term_size()
    $0 = header
    $2 = "r io/s"
    $4 = "r-sz"
    $6 = "w io/s"
    $8 = "w-sz"
    $10 = " |"
    $12 = "r io/s"
    $13 = "r bw/s"
    $14 = "w io/s"
    $15 = "w bw/s"
    # for(i = 1; i <= NF; i++) { $i = $i" " }
    # looks like a solution is needed to make all cols fixed width, spaced without tabs
    # printf can probably do this
    # for example, determine width of largest column for a row, right pad whitespace all cols to this width?
    # update a var for the widest column seen, use this for all future rows
    print; output++
    for(i = 1; i <= NF; i++) {
    if ($i !~ /\|/) { while(x++ < length($i)) printf "%s", "-"; x=0; printf "%s", OFS }
    else { printf " %s%s", "|", OFS }
    }
    print ""; output++
    }
    (1 == NR) {
    header=$0
    print_header()
    next
    }
    {print;output++}
    (is_term() && output >= term_lines && 0 == (output % term_lines)) {print ""; print_header();output++;output++}
    ' | stdbuf -oL ts "$ts_format"

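The INFO note in the script about leaving the outer `${interval:+"$interval"}` and `${count:+"$count"}` expansions unquoted can be seen in isolation with a small sketch. This is only an illustration, not part of the script: `count_args` is a hypothetical helper that reports how many arguments it received, and the variable values are made up.

```shell
#!/bin/sh
# Hypothetical helper (not in the script): prints the number of arguments it received.
count_args() { echo "$#"; }

interval=2
count=''   # the optional count, deliberately left empty

# Plain quoting passes the empty count as an extra empty-string argument.
quoted=$(count_args "$interval" "$count")

# The :+ form expands to nothing when the variable is unset or empty,
# so the empty count vanishes from the argument list.
# shellcheck disable=SC2086
guarded=$(count_args ${interval:+"$interval"} ${count:+"$count"})

echo "quoted=$quoted guarded=$guarded"
```

With plain quoting the callee sees two arguments, the second being an empty string; with the `:+` form it sees only the interval, which is presumably what `iostat` and `zpool iostat` need when no count is given.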

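The TODO notes on table and column layout mention that `printf` could probably produce fixed width columns without tabs. A minimal sketch of that idea follows, assuming a single hypothetical width of 10 characters for every column; the `format_row` function and the sample rows are invented for illustration and are not taken from the script.

```shell
#!/bin/sh
# Hypothetical col spec: right-align every field in a 10 character column.
format_row() {
    awk -v width=10 '{
        line = ""
        # build the row field by field, padding each field to the column width
        for (i = 1; i <= NF; i++) line = line sprintf("%" width "s", $i)
        print line
    }'
}

short=$(printf '%s\n' 'sda 1.50 200' | format_row)
wide=$(printf '%s\n' 'nvme0n1 12345.67 3' | format_row)

printf '%s\n%s\n' "$short" "$wide"
```

Both rows come out at the same total length, so they stay aligned with each other and with a header formatted the same way. A value longer than the chosen width would still push its row to the right, which is the open question about truncation and rounding raised in the TODO.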
  12. kyle0r created this gist May 16, 2024.
    1 change: 1 addition & 0 deletions -iostats-for-zpool.md
    @@ -0,0 +1 @@
    .
    21 changes: 21 additions & 0 deletions LICENSE
    @@ -0,0 +1,21 @@
    Released under MIT License

    Copyright (c) 2024 kyle0r

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to deal
    in the Software without restriction, including without limitation the rights
    to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
    copies of the Software, and to permit persons to whom the Software is
    furnished to do so, subject to the following conditions:

    The above copyright notice and this permission notice shall be included in all
    copies or substantial portions of the Software.

    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
    OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
    SOFTWARE.
    1 change: 1 addition & 0 deletions iostats-for-zpool.sh
    @@ -0,0 +1 @@
    .