Last active
April 6, 2025 19:35
Revisions
kyle0r renamed this gist
May 26, 2024 . 1 changed file with 3 additions and 3 deletions.
@@ -3,7 +3,7 @@ Current version: 2024.21.1

    Run `iostats-for-zpool --help` for usage.

    `iostats-for-zpool` will attempt to show you a side-by-side `iostat` and `zpool iostat` with pool devices grouped together.

    Intro video:

@@ -53,9 +53,9 @@ Place or link the script to a path in `$PATH`

    mkdir -p ~/bin ; mv -i ~/iostats-for-zpool ~/bin ; export PATH="$PATH":~/bin
    ```

    Note: that some distros will automatically add `~/bin` to `$PATH` if `~/bin` exists. Test this, YMMV!
    For example modern Debian handles this in `~/.profile`, so creating `~/bin` and starting a new shell or re-sourcing `. ~/.profile` will automatically prepend `~/bin` to your `$PATH`.

    **Alternative:** Assuming `$PATH` contains `/usr/local/bin` and you have `sudo` rights:

    ```
kyle0r revised this gist
May 26, 2024 . 1 changed file with 7 additions and 1 deletion.
@@ -310,6 +310,13 @@ See known issue on handling column values and widths and table formatting.

    See known issue about time/tick drift between the three fifos.
    Try testing GNU parallel to see if this mitigates the drift.

    --

    Could add a warning and option for running as root. root should not be
    required. e.g. --disable-run-as-root-warning which would disable the
    warning and make users cognitive of the user/privileges the script is
    running with.

    USAGE
    )

@@ -466,7 +473,6 @@ fifo_iostat_device_extended="$(mktemp "/tmp/${me}-iostat-device-extended.XXXXXXX

    fifo_zpool_iostat="$(mktemp "/tmp/${me}-zpool-iostat.XXXXXXX.fifo")"
    for fifo in "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat"; do
      if rm "$fifo"; then
        mkfifo "$fifo" || die "problem initialising fifo: $fifo"
      else
        die "problem setting up fifo: $fifo"
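The hunk above reserves a unique name with `mktemp`, removes the placeholder file, and then calls `mkfifo` on the same path. A commonly used alternative, shown here only as a sketch and not as what the script does, is to create a private directory with `mktemp -d` and place the FIFOs inside it, which avoids the short window between `rm` and `mkfifo`. The helper names `make_fifo_dir` and `make_fifo` and the `fifo-demo` prefix are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: an alternative to the mktemp + rm + mkfifo sequence.
# make_fifo_dir and make_fifo are hypothetical helper names.

# Create a private temporary directory and print its path.
make_fifo_dir() {
  mktemp -d "${TMPDIR:-/tmp}/fifo-demo.XXXXXX"
}

# Create a named pipe called $2 inside directory $1 and print its path.
make_fifo() {
  fifo_path="$1/$2"
  mkfifo -m 600 "$fifo_path" || return 1
  printf '%s\n' "$fifo_path"
}

demo_dir="$(make_fifo_dir)" || exit 1
demo_fifo="$(make_fifo "$demo_dir" zpool-iostat.fifo)" || exit 1
```

When the pipes are no longer needed, removing the whole directory (`rm -rf "$demo_dir"`) cleans up every FIFO in one step.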
kyle0r revised this gist
May 26, 2024 . 1 changed file with 2 additions and 2 deletions.
@@ -14,7 +14,7 @@ https://gist.github.com/assets/517822/a707e622-7fb3-4930-bdb0-689f783946de

    You can get **_the latest_** version of `iostats-for-zpool` from this gist with the following link:

    https://gist.githubusercontent.com/kyle0r/0b3c7894b19f9aedf1633ba75197a28e/raw/iostats-for-zpool.sh

    Here are some common ways to download the script via shell

    ## with `curl`

@@ -57,7 +57,7 @@ Note: that some distributions with automatically add `~/bin` to `$PATH` if `~/bi

    For example modern Debian handles this in `~/.profile`, so creating `~/bin` and starting a new shell or re-sourcing `. ~/.profile` will automatically prepend `~bin` to your `$PATH`.

    **Alternative:** Assuming `$PATH` contains `/usr/local/bin` and you have `sudo` rights:

    ```
    # if you prefer a symlink
    sudo ln -s ~/iostats-for-zpool /usr/local/bin/iostats-for-zpool
kyle0r revised this gist
May 26, 2024 . 1 changed file with 3 additions and 1 deletion.
@@ -5,7 +5,9 @@ Run `iostats-for-zpool --help` for usage.

    `iostats-for-zpool` will attempt to show you a side-by-side iostat and zpool iostat with pool devices grouped together.

    Intro video:

    https://gist.github.com/assets/517822/a707e622-7fb3-4930-bdb0-689f783946de

    # INSTALL
kyle0r revised this gist
May 25, 2024 . 1 changed file with 84 additions and 1 deletion.
@@ -1 +1,84 @@

    First version: 2024.20.1
    Current version: 2024.21.1

    Run `iostats-for-zpool --help` for usage.

    `iostats-for-zpool` will attempt to show you a side-by-side iostat and zpool iostat with pool devices grouped together.

    Intro video: TODO

    # INSTALL

    You can get **_the latest_** version of `iostats-for-zpool` from this gist with the following link:

    https://gist.githubusercontent.com/kyle0r/0b3c7894b19f9aedf1633ba75197a28e/raw/iostats-for-zpool.sh

    Here are some common ways to download the file via shell

    ## with `curl`

    ```
    curl -sSL 'https://gist.githubusercontent.com/kyle0r/0b3c7894b19f9aedf1633ba75197a28e/raw/iostats-for-zpool.sh' > ~/iostats-for-zpool
    ```

    ## with `wget`

    ```
    wget -qO- 'https://gist.githubusercontent.com/kyle0r/0b3c7894b19f9aedf1633ba75197a28e/raw/iostats-for-zpool.sh' > ~/iostats-for-zpool
    ```

    ## with `apt-helper`

    If curl and wget are not available:

    ```
    /usr/lib/apt/apt-helper download-file 'https://gist.githubusercontent.com/kyle0r/0b3c7894b19f9aedf1633ba75197a28e/raw/iostats-for-zpool.sh' ~/iostats-for-zpool
    ```

    ## `chmod`

    ```
    chmod a+rx ~/iostats-for-zpool
    ```

    You can now run the script via `~/iostats-for-zpool`.

    ## OPTIONAL

    Place or link the script to a path in `$PATH`

    `~/bin` example that does not require sudo rights:

    ```
    mkdir -p ~/bin ; mv -i ~/iostats-for-zpool ~/bin ; export PATH="$PATH":~/bin
    ```

    Note: that some distributions with automatically add `~/bin` to `$PATH` if `~/bin` exists. Test this, YMMV!
    For example modern Debian handles this in `~/.profile`, so creating `~/bin` and starting a new shell or re-sourcing `. ~/.profile` will automatically prepend `~bin` to your `$PATH`.

    **Alternative:** Assuming `$PATH` contains `/usr/local/bin` and you have sudo rights:

    ```
    # if you prefer a symlink
    sudo ln -s ~/iostats-for-zpool /usr/local/bin/iostats-for-zpool

    # if you prefer moving the script to /usr/local/bin
    sudo mv -i ~/iostats-for-zpool /usr/local/bin/
    ```

    # USAGE EXAMPLES

    ```
    iostats-for-zpool /dev/sda /dev/sdf 10
    Show zpool iostats for rpool (default) pool, and groupped iostats for sda+sdf, 10 sec interval

    iostats-for-zpool -p tank /dev/sda /dev/sdf 1 30
    Show zpool iostats for tank pool, and groupped iostats for sda+sdf, 1 sec interval, 30 times (30 secs)

    iostats-for-zpool -p storage /dev/sdc /dev/sdm /dev/sdv 300 12
    Show zpool iostats for storage pool, and groupped iostats for sdc+sdm,sdv, 5min interval, 12 times (1 hour)
    ```

    # TODO

    TODO: add a changelog

    Additional TODO's and IDEA's are captured in `--help` text.
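The README's note that some distributions add `~/bin` to `$PATH` automatically ("Test this, YMMV!") can be checked directly. The following is a minimal sketch of one way to test whether a directory is already on the search path; `path_contains` is a hypothetical helper name and is not part of the gist.

```shell
#!/bin/sh
# Sketch only: path_contains is a hypothetical helper, not part of the gist.

# Return 0 if directory $1 is an entry in the colon-separated list $2
# (defaults to the current $PATH), 1 otherwise.
path_contains() {
  case ":${2-$PATH}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if path_contains "$HOME/bin"; then
  echo "$HOME/bin is already in PATH"
else
  echo "$HOME/bin is not in PATH; add it or start a new login shell"
fi
```

Wrapping the list in colons on both sides lets a single pattern match the first, last, and middle entries without also matching partial names such as `/usr` inside `/usr/bin`.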
kyle0r revised this gist
May 25, 2024 . 1 changed file with 1 addition and 1 deletion.
@@ -2,7 +2,7 @@

    # This script was created on Debian and should be POSIX compliant
    # shell check OK, some exceptions created

    version='2024.21.1'
    # version convention: DIN ISO 8601 date +%G\.%V\.1 (YEAR.WEEK.RELEASE)
    #+ where 1 is incremented per release within the given week
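The version comment above describes a YEAR.WEEK.RELEASE convention built from `date +%G.%V`. As a rough illustration of how such a string could be generated, here is a small sketch; `make_version` is a hypothetical helper and does not appear in the script.

```shell
#!/bin/sh
# Sketch only: make_version is a hypothetical helper.

# Print YEAR.WEEK.RELEASE using the ISO 8601 week-based year (%G) and
# week number (%V); $1 is the release counter within the week (default 1).
make_version() {
  printf '%s.%s\n' "$(date +%G.%V)" "${1:-1}"
}

make_version      # first release of the current ISO week
make_version 2    # second release in the same week
```

Using `%G` rather than `%Y` matters around New Year, when the ISO week-based year can differ from the calendar year.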
kyle0r revised this gist
May 25, 2024 . 1 changed file with 32 additions and 9 deletions.
@@ -370,10 +370,27 @@ shift $((OPTIND-1)) # remove parsed options and args from $@ list

    [ -x "$(command -v which)" ] || die "program 'which' dependency not found. aborting."
    kill_path=$(which kill)

    for dep in "$kill_path" zpool readlink iostat ts awk stdbuf paste mkfifo fuser mktemp; do
      [ -x "$(command -v "$dep")" ] || die "program '$dep' dependency not found. aborting."
    done

    # the script depends on gawk
    awkname="$(readlink -f "$(which awk)")"
    awkname="${awkname##*/}"
    if [ "gawk" = "$awkname" ]; then
      : # we assume gawk is available, nothing to do
    elif [ "mawk" = "$awkname" ]; then
      die "mawk has been detected, this script requires gawk. exiting."
    else
      if { awk -Wversion || awk --version; } 2>/dev/null | grep -iq mawk; then
        die "mawk has been detected, this script requires gawk. exiting."
      elif { awk -Wversion || awk --version; } 2>/dev/null | grep -iq gnu; then
        : # we assume gawk is available, nothing to do
      else
        die "this script requires gawk but it could not be detected. exiting."
      fi
    fi

    # validate the specified zpool exists
    zpool status "$pool" 1>/dev/null 2>&1 || die "$pool does not seem to exist? aborting."

@@ -417,8 +434,8 @@ terminate() {

        fuser -k -TERM "$fifo" 1>/dev/null 2>&1
      done

      echo "fifo's processes signaled to terminate. waiting 3 seconds for processes to die..." 1>&2
      sleep 3

      # cleanup fifos
      for fifo in "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat"; do

@@ -442,12 +459,18 @@ trap "terminate || true" TERM INT EXIT

    ######################################################################
    # START main script

    # create fifo's: first mktemp to safely establish a unique path, then rm and mkfifo, mitigates attack vectors
    # safety context: https://stackoverflow.com/a/11636850
    fifo_iostat_device="$(mktemp "/tmp/${me}-iostat-device.XXXXXXX.fifo")"
    fifo_iostat_device_extended="$(mktemp "/tmp/${me}-iostat-device-extended.XXXXXXX.fifo")"
    fifo_zpool_iostat="$(mktemp "/tmp/${me}-zpool-iostat.XXXXXXX.fifo")"
    for fifo in "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat"; do
      if rm "$fifo"; then
        sync
        mkfifo "$fifo" || die "problem initialising fifo: $fifo"
      else
        die "problem setting up fifo: $fifo"
      fi
    done

    : <<INFO

@@ -681,4 +704,4 @@ stdbuf -oL paste "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpo

    ' | stdbuf -oL ts "$ts_format"

    # END of script
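The gawk check in this revision falls back to grepping the awk version banner for "mawk" or "gnu". The sketch below isolates that classification step so it can be tried against arbitrary banner text; `classify_awk_banner` is a hypothetical helper name, and the final probe reuses the same `-Wversion` / `--version` attempt shown in the hunk.

```shell
#!/bin/sh
# Sketch only: classify_awk_banner is a hypothetical helper that mirrors the
# grep-based fallback in the hunk above, applied to a banner string.

# Print "mawk", "gawk" or "unknown" for the version banner text in $1.
classify_awk_banner() {
  if printf '%s\n' "$1" | grep -iq mawk; then
    echo mawk
  elif printf '%s\n' "$1" | grep -iq gnu; then
    echo gawk
  else
    echo unknown
  fi
}

# Same probe as the script: try -Wversion first, then --version.
banner="$({ awk -Wversion || awk --version; } </dev/null 2>/dev/null | head -n 1)"
classify_awk_banner "$banner"
```

Checking for mawk before "gnu" follows the order used in the script, so a banner that happens to mention both is still rejected.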
kyle0r revised this gist
May 16, 2024 . 1 changed file with 93 additions and 15 deletions.
@@ -41,7 +41,7 @@ stdin_is_terminal=no

    # if stdin is a terminal i.e. an interactive session.
    if [ -t 0 ]; then
      stdin_is_terminal=yes
      [ -x "$(command -v stty)" ] || { die "program 'stty' dependency not found. aborting."; }
      original_tty_state=$(stty -g)
      # file descriptor for direct writes to tty, e.g. when stdout/err are already being redirected.

@@ -74,7 +74,7 @@ $(print_version)

    EXAMPLES

    $me /dev/sda /dev/sdf 10
    Show zpool iostats for rpool (default) pool, and groupped iostats for sda+sdf, 10 sec interval

    $me -p tank /dev/sda /dev/sdf 1 30
    Show zpool iostats for tank pool, and groupped iostats for sda+sdf, 1 sec interval, 30 times (30 secs)

@@ -101,6 +101,8 @@ You can monitor how much physical MiB is read or written in a given interval.

    For IOPS, you can compare how much logical IO produced how much physical IO.
    The same is true for IO bandwidth.

    You can monitor the average physical read and write block size.

    The data is taken from 3 places, iostat -d, iostat -dx and zpool iostat and
    combined into a single view.

@@ -126,6 +128,69 @@ OPTIONS

    -p the zpool to monitor

    OUTPUT COLUMNS

    iostat - Physical device IO

    Grouping: $me groups iostat values together for the provided devices.

    tps - cite: man iostat
    Indicate the number of grouped transfers per second that were issued to
    the specified device(s). A transfer is an I/O request to a device.
    Multiple logical requests can be combined into a single I/O request to a
    device. A transfer is of indeterminate size.

    r io/s - renamed iostat r/s - cite: man iostat
    Read IOPS. The number (after merges) of grouped read requests completed
    per second for the specified device(s).

    rMB/s - cite: man iostat
    The number of grouped sectors (kilobytes, megabytes) read from the
    specified device(s) per second.

    r-sz - renamed iostat rareq-sz - cite: man iostat
    The average grouped size (in kilobytes) of the read requests that were
    issued to the specified device(s).

    MB_read - cite: man iostat
    The total number of grouped blocks (megabytes) read from the specified
    device(s) since the last interval.

    w io/s - renamed iostat w/s - cite: man iostat
    Write IOPS. The number (after merges) of grouped write requests completed
    per second for the specified device(s).

    wMB/s - cite: man iostat
    The number of grouped sectors (kilobytes, megabytes) written to the
    specified device(s) per second.

    w-sz - renamed iostat wareq-sz - cite: man iostat
    The average grouped size (in kilobytes) of the write requests that were
    issued to the specified device(s).

    MB_wrtn - cite: man iostat
    The total number of grouped blocks (megabytes) writen to the specified
    device(s) since the last interval.

    --

    zpool iostat - logical ZFS IO pool

    r io/s
    Read IOPS.

    r bw/s
    Read bandwidth per second.

    w io/s
    Write IOPS.

    w bw/s
    Write bandwidth per second.

    THINGS TO KEEP IN MIND

    If the scenario arises that output lines appear to be out of sync, it is

@@ -271,7 +336,6 @@ if [ ! -x "$(command -v "$PAGER")" ]; then # less not found

    fi

    ######################################################################
    # START getopts related code
    # Should be POSIX compatible

@@ -295,8 +359,9 @@ while getopts hi:p:-: OPT; do # allow -a, -b with arg, -c, and -- "with arg"

    done
    shift $((OPTIND-1)) # remove parsed options and args from $@ list
    # END getopts

    ######################################################################
    # START validation of options, arguments and dependencies
    [ -z "$*" ] && die "missing one or more devices to monitor. aborting."

@@ -334,8 +399,9 @@ for device in "$@"; do

    done
    # END validation

    ######################################################################
    # START trap
    # trap function - the script attempts a clean shutdown and finally terminates the process group
    terminate() {

@@ -369,8 +435,10 @@ terminate() {

    # Define the trap for various signals
    # Modified signal names for POSIX compliance
    trap "terminate || true" TERM INT EXIT
    # END trap

    ######################################################################
    # START main script

@@ -442,10 +510,10 @@ the start of the script, the output *should* be *relatively* in-sync.

    I say relative because the maximum human-friendly time resolution that
    iostat and zpool iostat can display is 1 second aka 1000ms.

    The running kernel HZ constant determines the number of ticks aka jiffies
    per second. On the script development system HZ=250. This means 0.004
    seconds aka 4ms per kernel jiffy. So a given background process will start
    up, and then start blocking at a given jiffy within a given second.

    Note: There is also the USER_HZ constant, which AFAIK affects the
    granularity of userland programs reporting date and time. Not to be
    confused with the

@@ -466,7 +534,7 @@ However, these topics are somewhat outside the scope of a simple POSIX shell

    script.

    Based on observations during development, paste will not perform any
    synchronisation on or between the 3 fifos it is reading. paste *should*
    start reading all fifos at *nearly* the same time. I'd expect the delta to
    be a few jiffies. We could probably observe this by watching strace with
    ns/ms granularity on the timestamps.

@@ -477,16 +545,19 @@ STOP signal after they are started. Then, just before the script starts reading

    the fifos (and outputting to the console), send the processes the CONT
    signal.

    This should not be necessary, as this is essentially what the fifo
    provides us with, the background processes should be blocked as soon as
    they open their respective fifo, so the block should occur prior to the
    first fifo write.

    I'm not sure exactly what side effect this blocking has on the accuracy of
    the values in the first few intervals of iostat and zpool iostat, perhaps
    the blocking happens early enough that it doesn't mess with the internal
    timers and such of the waiting programs. Just something to keep in mind.

    Reference:
    Kernel HZ aka CONFIG_HZ constant can be calculated using awk per:
    https://stackoverflow.com/a/17371631/490487
    USER_HZ aka CLOCKS_PER_SEC constant can be retrieved via: getconf CLK_TCK

    INFO

    : <<TODO
    NOTES ON TABLE AND COLUMN LAYOUT

@@ -549,11 +620,11 @@ TODO

    # shellcheck disable=SC2016
    stdbuf -oL paste "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat" | \
    stdbuf -oL awk 'BEGIN {OFS="\t"} {sep ("" == $0)?"":"|"} {print $2,$10,$11,$15,$6,$16,$17,$21,$7," |",$32,$35,$37,$36,$38}' | \
    stdbuf -oL awk -v stdin_is_terminal="$stdin_is_terminal" -v interval="$interval" '
    BEGIN {
      term_size()
      OFS="\t"
      print "START - collecting data for first interval: "interval" seconds."
    }
    function is_term() {

@@ -594,13 +665,20 @@ stdbuf -oL paste "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpo

      }
      print ""; output++
    }
    # first record? print header and skip the record
    (1 == NR) {
      header=$0
      print_header()
      next
    }
    # non-header records/lines
    {print;output++}
    # repeating header logic based on terminal rows/lines
    (is_term() && output >= term_lines && 0 == (output % term_lines)) {print ""; print_header();output++;output++}
    ' | stdbuf -oL ts "$ts_format"

    # END of script
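The timing notes in this revision state that HZ=250 corresponds to 4ms per kernel jiffy and that USER_HZ can be read with `getconf CLK_TCK`. The arithmetic can be sketched as follows; `jiffy_ms` is a hypothetical helper name introduced only for this example.

```shell
#!/bin/sh
# Sketch only: jiffy_ms is a hypothetical helper.

# Print the length of one kernel tick in milliseconds for HZ value $1.
jiffy_ms() {
  awk -v hz="$1" 'BEGIN { printf "%g\n", 1000 / hz }'
}

jiffy_ms 250    # the development system's HZ=250 gives 4 (ms per jiffy)

# USER_HZ as reported to userland; prints nothing if getconf is unavailable.
getconf CLK_TCK 2>/dev/null || true
```

With a 1 second reporting interval, a handful of jiffies of start-up skew between the three background processes is small, which is consistent with the notes describing the output as only *relatively* in sync.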
kyle0r revised this gist
May 16, 2024 . 1 changed file with 1 addition and 1 deletion.
@@ -58,7 +58,7 @@ usage() {

    NAME

    $me - side-by-side iostat and zpool iostat with pool devices grouped together

    SYNOPSIS
kyle0r revised this gist
May 16, 2024 . 1 changed file with 0 additions and 15 deletions.
@@ -374,21 +374,6 @@ trap "terminate || true" TERM INT EXIT

    ######################################################################
    # START main script

    # create fifo's
    fifo_iostat_device="/tmp/${me}-iostat-device.fifo"
    fifo_iostat_device_extended="/tmp/${me}-iostat-device-extended.fifo"
kyle0r revised this gist
May 16, 2024 . 1 changed file with 621 additions and 1 deletion.
@@ -1 +1,621 @@

    #!/bin/sh
    # This script was created on Debian and should be POSIX compliant
    # shell check OK, some exceptions created

    version='2024.20.1'
    # version convention: DIN ISO 8601 date +%G\.%V\.1 (YEAR.WEEK.RELEASE)
    #+ where 1 is incremented per release within the given week

    me="${0##*/}" # basename
    me="${me%.sh}" # strip .sh suffix

    print_version() { printf "%s version: %s\n" "$me" "$version"; }

    ts_format='[%Y-%m-%dT%H:%M:%S]'

    ######################################################################
    # START defaults
    interval=1
    count=
    pool=rpool

    ######################################################################
    # START functions

    # https://stackoverflow.com/a/61835747/490487
    is_num() { case ${1#[-+]} in '' | . | *[!0-9.]* | *.*.* ) return 1;; esac ;}

    # handy POSIX pause function https://unix.stackexchange.com/a/293941/19406
    pause() { printf "%s" "Press ENTER to continue... or CTRL+C to abort."; read -r _; }

    # complain to STDERR and exit with error
    die() { echo "$*" >&2; exit 2; }

    # handle options that require an arg
    needs_arg() { [ -z "$OPTARG" ] && die "No arg for --$OPT option"; }

    # func to check if stdin is a terminal
    is_term() { [ 'yes' = "$stdin_is_terminal" ] && return 0 || return 1; }

    ######################################################################
    # START tty handling
    stdin_is_terminal=no
    # if stdin is a terminal i.e. an interactive session.
    if [ -t 0 ]; then
      stdin_is_terminal=yes
      [ -x "$(command -v stty)" ] || { die 'program 'stty' dependency not found. aborting.'; }
      original_tty_state=$(stty -g)
      # file descriptor for direct writes to tty, e.g. when stdout/err are already being redirected.
      exec 3<> /dev/tty
    else
      exec 3<> /dev/null
    fi

    ######################################################################
    # START usage related code
    usage() {
    usage=$(cat <<USAGE
    NAME

    $me - side-by-side iostat and zpool iostat with pool devices group together

    SYNOPSIS

    $me [ -p zpool ] device ... [ interval [ count ] ]

    VERSION

    $(print_version)

    EXAMPLES

    $me /dev/sda /dev/sdf 10
    Show zpool iostats for rpool pool, and groupped iostats for sda+sdf, 10 sec interval

    $me -p tank /dev/sda /dev/sdf 1 30
    Show zpool iostats for tank pool, and groupped iostats for sda+sdf, 1 sec interval, 30 times (30 secs)

    $me -p storage /dev/sdc /dev/sdm /dev/sdv 300 12
    Show zpool iostats for storage pool, and groupped iostats for sdc+sdm,sdv, 5min interval, 12 times (1 hour)

    DESCRIPTION

    This script was created to unify the output of iostat and zpool iostat.
    The details can be viewed interactively OR logged OR passed to another
    script for parsing.

    The primary goal is to give a sysop a quick overview of the physical
    device iostats and the logical ZFS zpool iostats, and to be able to
    compare them side by side.

    That is, for a given IO workload on a zpool, what does the IO look like
    when the IO requests hit the physical pool devices.

    You can monitor how much physical MiB is read or written in a given interval.
    For IOPS, you can compare how much logical IO produced how much physical IO.
    The same is true for IO bandwidth.

    The data is taken from 3 places, iostat -d, iostat -dx and zpool iostat and
    combined into a single view.

    The -p option is optional and tells zpool iostat which pool to monitor.
    The -p option defaults to: rpool

    At least one device argument is required, this tells iostat which devices
    to monitor and group together.

    The interval and count arguments are optional and default to:
    interval=1 count=null

    Mimicking iostat, the interval parameter specifies the time in seconds
    between each report. The count parameter can be specified in conjunction
    with the interval parameter.
    If the count parameter is specified, the value of count determines the
    number of reports generated at interval seconds apart.

    OPTIONS

    -h shows script usage information.

    -p the zpool to monitor

    THINGS TO KEEP IN MIND

    If the scenario arises that output lines appear to be out of sync, it is
    important to remember that this does not automatically mean that there is
    an output time drift or output synchronicity issue.

    It is natural for logical and physical IO to sometimes be aligned within
    the same second, but more probable that there is some natural delay
    between logical and physical IO. The longer you observe at short
    intervals, the more likely it is that this "it looks out of sync"
    phenomenon will occur.

    Remember, there is a lot going on in an IO subsystem in 1 second. You have
    the kernel, the kernel device IO scheduler, the ZFS code paths and the IO
    scheduler. These factors naturally cause things to happen at different
    times, and observing perfectly synchronised IO within a 1 second time
    window is relatively unlikely.

    Just keep in mind that observing IO at short intervals has the advantage
    of seeing "what is happening in real time", but also has the disadvantage
    that 1 second is a short observation window for the IO to remain aligned.

    KNOWN ISSUES

    It seems that even if the output timestamps of the three fifos stay in
    sync (accuracy ~1000ms), the internal clocks/ticks of iostat and zpool
    iostat are slightly different, causing drift.

    Or maybe it is the observer effect, where observing 3 different processes
    in parallel, which were not started at exactly the same nanosecond, or did
    not start to output at exactly the same nanosecond, will inherently lead
    to time drift and/or delay in the observations, and then there is the
    factor of the clock/tick of the processes themselves.

    Since the shortest interval between lines of output is 1 second, it does
    not take many ms of drift for one line to fall out of sync with another
    (fall into another second/line).
    This issue visualises itself as values being out of sync/sequence between
    iostat and zpool iostat.

    It is also possible that the time taken by this script to process and
    output the data may cause or contribute to the drift.

    This issue becomes more apparent at shorter, more granular intervals.

    My assessment is that this is not a major problem because typically
    shorter intervals are observed by a human interactively for a short period
    of time, so the problem does not have time to occur in this scenario. It
    is only after a few minutes with a 1 second interval setting that the
    issue starts to occur. Being aware of this issue, the sysop can simply
    restart the script.

    The longer the interval, the more irrelevant this drift problem becomes.
    This makes the script suitable for analysis over a longer period of time.

    I'm not sure how to solve this issue. Suggestions are welcome. Perhaps
    this is a non-issue and just the reality of observation at short
    intervals? It may not be worth the effort to fix it, or it may not be
    fixable. I think it is unlikely to discover a primary key (other than
    time) to link and synchronise iostat and zpool iostat together.

    One idea that came to mind was to try using GNU parallel to run the 3
    sub-processes, as this supports line-buffered grouped/synchronised
    sub-process output. I suspect the same drift will happen. Would be worth
    testing. If this solves the issue, it would create a dependency on GNU
    parallel. This could be mitigated by offering an option to choose between
    the existing logic and GNU parallel.

    See the code notes section: NOTES ON PROCESS & OUTPUT SYNCHRONICITY

    --

    There are some challenges with the alignment of columns when a values char
    width overflows certain sizes. I've written a details TODO inline.

    PORTABILITY / COMPATIBILITY

    The script should be POSIX compliant.
    The script does have a number of dependencies which may be missing on some
    compact distros like alpine, or distros that use busybox.
    ENVIRONMENT

    Nothing specific

    AUTHORS

    2024 https://github.com/kyle0r

    BUGS

    Post on the GitHub gist: https://gist.github.com/kyle0r/0b3c7894b19f9aedf1633ba75197a28e

    CONTRIBUTING

    Feel free to make requests or fork the gist (see BUGS).

    LICENSE

    MIT

    DISCLAIMER

    This script is provided "AS IS" and any express or implied warranties,
    including, but not limited to, the implied warranties of merchantability
    and fitness for a particular purpose are disclaimed. See the LICENSE
    distributed file or for complete details.

    TODO / IDEAS

    See known issue on handling column values and widths and table formatting.

    --

    See known issue about time/tick drift between the three fifos.
    Try testing GNU parallel to see if this mitigates the drift.

    USAGE
    )
      printf "%s" "$usage" | "$PAGER"; exit 1
    }

    usage_with_prompt() {
      printf "\n%s\n" "PAGER ($PAGER) will now run to display help/usage info."
      pause
      usage
    }

    # use less as PAGER fallback if no env pager is defined
    # https://stackoverflow.com/a/28085062/490487
    : "${PAGER:=less}"

    # check for PAGER dependencies
    if [ ! -x "$(command -v "$PAGER")" ]; then # less not found
      PAGER="cat"
      if [ ! -x "$(command -v "$PAGER")" ]; then # cat not found
        echo "$PAGER and cat not found in PATH. aborting." 1>&2
        exit 1
      fi
    fi

    ######################################################################
    # START getopts related code
    # Should be POSIX compatible
    # Thank you: https://stackoverflow.com/users/519360/adam-katz
    # https://stackoverflow.com/a/28466267/490487
    while getopts hi:p:-: OPT; do # allow -a, -b with arg, -c, and -- "with arg"
      # support long options: https://stackoverflow.com/a/28466267/519360
      if [ "$OPT" = "-" ]; then # long option: reformulate OPT and OPTARG
        OPT="${OPTARG%%=*}" # extract long option name
        OPTARG="${OPTARG#"$OPT"}" # extract long option argument (may be empty)
        OPTARG="${OPTARG#=}" # if long option argument, remove assigning `=`
      fi
      case "$OPT" in
        h | help ) usage ;;
        i | interval ) interval="${OPTARG:-$interval}" ;; # optional argument
        p | pool ) pool="${OPTARG:-$pool}" ;; # optional argument
        \? ) usage_with_prompt ;; # error reported via getopts
        * ) echo "Illegal option --$OPT" 1>&2 ; usage_with_prompt ;; # bad long option
      esac
    done
    shift $((OPTIND-1)) # remove parsed options and args from $@ list
    # END getopts

    ######################################################################
    # START validation of options, arguments and dependencies
    [ -z "$*" ] && die "missing one or more devices to monitor. aborting."

    # we require which to find the kill program not the builtin
    [ -x "$(command -v which)" ] || die "program 'which' dependency not found. aborting."
    kill_path=$(which kill)

    for dep in "$kill_path" zpool readlink iostat ts awk stdbuf paste mkfifo fuser; do
      [ -x "$(command -v "$dep")" ] || die "program '$dep' dependency not found. aborting."
    done

    # validate the specified zpool exists
    zpool status "$pool" 1>/dev/null 2>&1 || die "$pool does not seem to exist? aborting."
    # reverse the args: makes for easier validation logic
    # https://unix.stackexchange.com/a/560698/19406
    arg=''; for a in "$@"; do
      # shellcheck disable=SC2086
      set -- "$a" ${arg-"$@"} # note the $@ is quoted and expansion is not
      unset arg
    done

    # parse args to determine if count and interval are present
    if is_num "$1" && is_num "$2"; then
      count="$1"; shift
      interval="$1"; shift
    elif is_num "$1"; then
      interval="$1"; shift
    fi

    # are the remaining arguments block devices?
    for device in "$@"; do
      [ -b "$(readlink -f -- "$device")" ] || die "$device is not a valid block device. aborting."
    done
    # END validation

    ######################################################################
    # START trap
    # trap function - the script attempts a clean shutdown and finally terminates the process group
    terminate() {
      trap '' TERM INT EXIT # ignore further signals
      printf "\n%s\n" "$me has been signaled to clean up and terminate..." 1>&2

      # reset terminal after CTRL+C INT signal.
      #+ https://stackoverflow.com/a/31810254
      is_term && stty "$original_tty_state"

      for fifo in "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat"; do
        # send TERM signal to any processes using the fifos
        fuser -k -TERM "$fifo" 1>/dev/null 2>&1
      done

      echo "fifo's processes signaled to terminate. waiting for processes to die..." 1>&2
      wait # wait for background processes to die

      # cleanup fifos
      for fifo in "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat"; do
        rm "$fifo"
      done

      echo "fifo's cleaned up. final shutdown..." 1>&2
      # kill the process group -$$
      # Try to kill any lingering procs spawned by this script
      # https://stackoverflow.com/a/2173421/490487
      "$kill_path" -- -$$;
    }

    # Define the trap for various signals
    # Modified signal names for POSIX compliance
    trap "terminate || true" TERM INT EXIT
    # END trap

    ######################################################################
    # START main script

    # read the rows and cols size of the current shell window
    # should be POSIX compliant
    # https://stackoverflow.com/a/39937626/490487
    # shellcheck disable=SC2034,SC2162
    #read rows cols << EOF
    #$(stty size)
    #EOF

    : <<TODO
    update rows/cols based on shell window resize. SIGWINCH?
    I was thinking that the script could auto-adjust to the number of rows,
    once per header output? This would require writing a simple awk function
    to check if stty sizes has changed since the previous check
    The logic could perhaps be moved inside awk, this would avoid having to
    pass the variable to awk.
    TODO

    # create fifo's
    fifo_iostat_device="/tmp/${me}-iostat-device.fifo"
    fifo_iostat_device_extended="/tmp/${me}-iostat-device-extended.fifo"
    fifo_zpool_iostat="/tmp/${me}-zpool-iostat.fifo"

    for fifo in "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat"; do
      mkfifo "$fifo"
    done

    : <<INFO
    ATTENTION // ACHTUNG

    Note how the invocation of "iostat" and "zpool iostat" use unquoted
    $interval and $count variable expansion. Without the expansion, quoting
    these variables, especially the optional $count, when the variable is
    empty, will cause an issue for shell to interpret the arguments and
    options.

    This script is trying to be POSIX compliant, so using an array to store
    the commands options and arguments is not possible.
    E.g.: https://unix.stackexchange.com/a/388477/19406
    ^^ this answer is "nice" but is not POSIX compliant/compatible.

    However, using the :+ expansion will return a null string if the variable
    is not set. Note the inner variable is quoted, the outer is not.
E.g.: https://stackoverflow.com/a/20306982/490487
INFO

######################################################################
# start a background proc for iostat device info, writing to fifo
# shellcheck disable=SC2086
iostat -m -y -H -g ALL -d "$@" ${interval:+"$interval"} ${count:+"$count"} | stdbuf -oL awk '
  (/^Device/ && 4==NR) {print;next}
  /ALL/ {print}
' > "$fifo_iostat_device" &

######################################################################
# start background proc for iostat extended device info, writing to fifo
# shellcheck disable=SC2086
iostat -m -y -H -g ALL -dx "$@" ${interval:+"$interval"} ${count:+"$count"} | stdbuf -oL awk '
  (/^Device/ && 4==NR) {print;next}
  /ALL/ {print}
' > "$fifo_iostat_device_extended" &

######################################################################
# start background proc for zpool iostat, writing to fifo
# shellcheck disable=SC2016,SC2086
stdbuf -oL zpool iostat -ny "$pool" ${interval:+"$interval"} ${count:+"$count"} | \
  stdbuf -oL awk -v pool="$pool" '
  BEGIN {pool_re="^"pool}
  1 == NR {next}
  2 == NR {print;next}
  ($0 ~ pool_re) {print}
' > "$fifo_zpool_iostat" &

: <<INFO
NOTES ON PROCESS & OUTPUT SYNCHRONICITY

When the first background process starts, it will open the specified fifo,
at which point the process should be blocked from writing (the nature of a
fifo) until the reading program (paste in our case) starts reading from the
other end of the named pipe.
The same applies to the subsequent background processes.
This means that each background process should be blocked from writing as
it waits for a reader on the other end of its respective pipe.

This means that at the start of the script, the output *should* be
*relatively* in-sync. I say relative because the maximum human-friendly
time resolution that iostat and zpool iostat can display is 1 second aka
1000ms.

The running kernel HZ constant determines the number of ticks aka jiffies
per second, on the script development system HZ=250.
This means 0.004 seconds aka 4ms per kernel jiffy.
So a given background process will start blocking at a given jiffy within
a given second.

Note: There is also the USER_HZ constant, which AFAIK affects the
granularity of userland programs reporting date and time. Not to be
confused with the kernel HZ constant.

This means that each background process started at a slightly different
time (different jiffies of a given second).
This also means that the point at which the write process started AND the
fifo started to block the write process due to the write process opening
the fifo is very likely to be a slightly different point in time.

These small granular aspects of process time, system calls and interrupts
will affect the synchronicity of the 3 separate processes output by this
script.

It may be possible to reduce the jiffy deltas and achieve more accurate
output synchronicity by using a thread pool or similar utility such as
GNU parallel. However, these topics are somewhat outside the scope of a
simple POSIX shell script.

Based on observations during development, paste will not perform any
synchronisation on or between the 3 fifos it is reading, BUT it should
start reading all fifos at *nearly* the same time. I'd expect the delta to
be a few jiffies. We could probably observe this by watching strace with
ns/ms granularity on the timestamps.

I was wondering if a rudimentary improvement to the overall output
synchronisation of the script might be to send the background processes
the STOP signal after they are started. Then, just before the script
starts reading the fifos (and outputting to the console), send the
processes the CONT signal.
This should not be necessary, as this is essentially what the fifo
provides us with, the background processes should be blocked as soon as
they open their respective fifo.
I'm not sure exactly what side effect this blocking has on the accuracy of
the values in the first few intervals of iostat and zpool iostat, perhaps
the blocking happens early enough that it doesn't mess with the internal
timers and such of the waiting programs. Just something to keep in mind.
INFO

# give the background procs a chance to start doing their work
#sleep 0.5

: <<TODO
NOTES ON TABLE AND COLUMN LAYOUT

IDEA: It would make sense to study the code of zpool iostat and iostat to
see how they are dealing with these topics.

TODO: table/column output improvements to address cosmetic issues.
I'm fairly certain this is a cosmetic issue which should not affect
machine parsability.

In the current version, awk uses tabs to separate columns (OFS).
This gives the impression of table output.
There is some specific handling of the iostat and zpool iostat | separator.
There are no logic checks around col value widths, nor truncation, so
cosmetic issues can arise for wider values.

Remember each line is being processed and output individually.
There is no concept of an overall table to constrain the content.
Naturally at the start of the program, we don't know the max char width of
a future column. However, a col spec could be introduced...
Once a line is output, it's static in the shell scrollback buffer.
Not something we can change in the future.

If a col value char width is > than 7? then the line formatting will be
pushed right for that line.
This causes cosmetic misalignment with the header line, and other lines.
This misalignment will be amplified if multiple col values go over the
mentioned char width.
As mentioned, this creates a cosmetic issue BUT should not impact machine
parsing, as col separator remains the same.

We don't want to truncate column values but maybe we want to do some
rounding of decimals?
E.g. iostat uses two decimal place precision, and zpool iostat uses whole
numbers for IOPS
So, this decimal precision could be standardised?

"column -t" cannot handle this because it relies on seeing the entire
content first.

Further development of the cosmetic logic required.

Observation: zpool IOPS use whole numbers, iostat uses fractional.
This could be standardised.

Perhaps a decimal rounding standard + truncation is the way?
Truncation could be clearly marked?
Perhaps a decimal rounding standard + individual column width spec is
required - I think this is how iostat works
zpool iostat -nyp <pool> 1 looks like it uses fixed width columns.

Another factor is standardising the units per column. This would require a
spec per column.
This would require using zpool iostat -nyp because the default changes the
units dynamically.
And perhaps POSIXLY_CORRECT=1 for iostat to get blocks, 512 byte blocks vs
KiB's and MiB's*
* This could be impacted by 4kn drives which don't support 512 byte blocks.
^^ How does iostat handle this today?
TODO

######################################################################
# use paste to merge the fifos and awk to format the output
# shellcheck disable=SC2016
stdbuf -oL paste "$fifo_iostat_device" "$fifo_iostat_device_extended" "$fifo_zpool_iostat" | \
  stdbuf -oL awk 'BEGIN {OFS="\t"} {sep ("" == $0)?"":"|"} {print $2,$10,$11,$15,$6,$16,$17,$21,$7," |",$32,$35,$37,$36,$38}' | \
  stdbuf -oL awk -v stdin_is_terminal="$stdin_is_terminal" '
  BEGIN {
    term_size()
    OFS="\t"
    print "START"
  }
  function is_term() {
    if ("yes" == stdin_is_terminal) return 1
    else return 0
  }
  # set term_lines to the height of the terminal
  function term_size() {
    if (is_term()) {
      cmd = "stty size <&3"; cmd | getline $0; close(cmd); term_lines = $1--
    }
  }
  function print_header() {
    # update term_lines in case of terminal resize
    term_size()
    $0 = header
    $2 = "r io/s"
    $4 = "r-sz"
    $6 = "w io/s"
    $8 = "w-sz"
    $10 = " |"
    $12 = "r io/s"
    $13 = "r bw/s"
    $14 = "w io/s"
    $15 = "w bw/s"
    # for(i = 1; i <= NF; i++) { $i = $i" " }
    # looks like need solution to make all cols fixed width spaced without tabs
    # printf can probably do this
    # for example, determine width of largest column for a row, right pad whitespace all cols to this width?
    # update a var for the widest column seen, use this for all future rows
    print; output++
    for(i = 1; i <= NF; i++) {
      if ($i !~ /\|/) {
        while(x++ < length($i)) printf "%s", "-"
        x=0
        printf "%s", OFS
      } else {
        printf " %s%s", "|", OFS
      }
    }
    print ""; output++
  }
  (1 == NR) {
    header=$0
    print_header()
    next
  }
  {print;output++}
  (is_term() && output >= term_lines && 0 == (output % term_lines)) {print ""; print_header();output++;output++}
' | stdbuf -oL ts "$ts_format"
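The script's ATTENTION note about the unquoted `${interval:+"$interval"}` and `${count:+"$count"}` expansions can be seen in isolation with a minimal sketch. The helper `count_args` and the device name `sda` are hypothetical and exist only for this illustration; they are not part of `iostats-for-zpool`.

```shell
#!/bin/sh
# Hypothetical helper for illustration only: prints how many arguments
# it received.
count_args() { echo "$#"; }

interval=1   # set by the user
count=''     # empty, as when the user gives no count

# Quoted directly: the empty $count is still passed on as one (empty)
# argument, which a program like iostat would have to interpret.
quoted=$(count_args sda "$interval" "$count")

# :+ expansion: the outer expansion is unquoted, so an empty or unset
# variable contributes no argument at all. The inner expansion is quoted,
# so a non-empty value stays a single word.
# shellcheck disable=SC2086
expanded=$(count_args sda ${interval:+"$interval"} ${count:+"$count"})

echo "quoted:   $quoted"
echo "expanded: $expanded"
```

With an empty `count`, the quoted form passes three arguments and the `:+` form passes two, which is why the optional count can be left out without resorting to arrays.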
kyle0r created this gist
May 16, 2024
@@ -0,0 +1 @@
.
@@ -0,0 +1,21 @@
Released under MIT License

Copyright (c) 2024 kyle0r

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
@@ -0,0 +1 @@
.