@kyletaylored
Last active April 7, 2026 15:17
Query Azure Log Data

Azure Log Analytics Workspace Usage Exporter

This script pulls monthly usage trends (record count and volume in bytes) for all Log Analytics workspaces in your Azure account. It uses the Azure CLI and supports:

  • All subscriptions by default (or target one via --subscription / -Subscription)
  • Region and resource group filtering
  • Parallel processing with timeout protection
  • CSV output with clean, deduplicated structure
  • Live progress reporting
  • Error handling and debug logging

Usage

Bash (Linux/macOS):

./query_az_logs.sh [OPTIONS]

PowerShell (Windows/macOS/Linux):

./query_az_logs.ps1 [OPTIONS]

Options

Bash                    PowerShell             Description
--subscription <ID>     -Subscription <ID>     Azure subscription ID (if omitted, queries all)
--region <REGION>       -Region <REGION>       Filter by region (e.g. eastus, centralus)
--resource-group <RG>   -ResourceGroup <RG>    Filter by resource group name
--timeout <SECONDS>     -Timeout <SECONDS>     Per-workspace query timeout in seconds (default: 15)
-h / --help             -? / Get-Help          Show usage instructions

Install

Bash

Download the script and make it executable:

chmod +x query_az_logs.sh

Log in to Azure:

az login

PowerShell

No installation is required beyond saving the script, but you may need to allow script execution:

Set-ExecutionPolicy -Scope CurrentUser RemoteSigned

Log in to Azure:

az login

Examples

Query all subscriptions:

./query_az_logs.sh
./query_az_logs.ps1

Target a single subscription with filters:

./query_az_logs.sh \
  --subscription 11111111-2222-3333-4444-555555555555 \
  --region eastus \
  --resource-group my-analytics-rg \
  --timeout 20
./query_az_logs.ps1 `
  -Subscription 11111111-2222-3333-4444-555555555555 `
  -Region eastus `
  -ResourceGroup my-analytics-rg `
  -Timeout 20

KQL Query (Run Directly in Azure)

You can run this query manually in Azure Monitor → Logs or the Log Analytics workspace query editor without using the scripts.

let startDate = startofmonth(datetime_add('month', -11, startofmonth(now())));
union *
| where TimeGenerated >= startDate
| extend Month = format_datetime(TimeGenerated, 'yyyy-MM')
| summarize
    MonthlyRecords = count(),
    MonthlyVolume_GB = round(sum(_BilledSize) / 1073741824, 3)
  by Month
| sort by Month asc

What it returns: One row per month for the last 12 months, with total record count and billed volume in GB across all tables in the workspace.

Retention period is a workspace-level property and is not accessible from within a KQL log query. The export scripts (query_az_logs.sh / query_az_logs.ps1) capture retention via az monitor log-analytics workspace list and include it as the RetentionInDays column in the CSV output. To check retention for a specific workspace manually: Log Analytics workspace → Usage and estimated costs → Data retention.
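As a sketch of what the export scripts do for retention, the value can be pulled out of the workspace-list JSON. Here the az output is stubbed with a trimmed sample document so the extraction step can be shown on its own (the field names mimic a `--query "[].{name:name, retention:retentionInDays}"` projection; real output may span multiple lines, so treat this as illustrative):

```shell
# Sample (abridged) JSON shaped like the output of:
#   az monitor log-analytics workspace list --query "[].{name:name, retention:retentionInDays}"
json='[{"name": "MyWorkspace", "retention": 30}]'

# Pull the retention value out with sed (a jq-free sketch)
echo "$json" | sed -n 's/.*"retention": \([0-9]*\).*/\1/p'
```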

To run it:

  1. Open the Azure Portal
  2. Navigate to your Log Analytics workspace (search "Log Analytics workspaces")
  3. Select Logs from the left menu
  4. Paste the query and click Run

Note: This query must be run inside a Log Analytics workspace → Logs. It will not work in Azure Resource Graph Explorer (portal.azure.com/#view/...resourceGraph), which uses a different schema and does not expose TimeGenerated or _BilledSize.

To adjust the lookback window, change -11 to the number of additional months to go back (e.g. -23 for 24 months).
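In other words, for an N-month window the offset passed to datetime_add is -(N - 1), since the current (partial) month counts as one. A quick shell sanity check:

```shell
# Offset for an N-month lookback window: the current month is month 1
MONTHS=24
OFFSET=$(( -(MONTHS - 1) ))
echo "$OFFSET"   # -23
```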

Volume by Table (Index)

To break down volume by individual table (the Azure equivalent of a Splunk index), add Type to the grouping:

let startDate = startofmonth(datetime_add('month', -11, startofmonth(now())));
union *
| where TimeGenerated >= startDate
| extend Month = format_datetime(TimeGenerated, 'yyyy-MM')
| summarize
    MonthlyRecords = count(),
    MonthlyVolume_GB = round(sum(_BilledSize) / 1073741824, 3)
  by Month, Table = Type
| sort by Month asc, MonthlyVolume_GB desc

What it returns: One row per table per month. Sorted by volume descending within each month so the noisiest tables surface first. Useful for identifying which data sources are driving ingestion costs.

In Excel, pivot on Table to compare volume trends across tables side by side.
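Outside a spreadsheet, the same per-table rollup can be done from the CSV directly. A minimal awk sketch, assuming a Month,Table,MonthlyVolume_GB column layout (an assumption about how the per-table query results get exported, shown here with made-up rows):

```shell
# Sum volume per table across months from Month,Table,Volume rows (illustrative data)
printf '%s\n' \
  '2025-01,AzureActivity,1.5' \
  '2025-01,Syslog,0.3' \
  '2025-02,AzureActivity,2.0' |
awk -F, '{ total[$2] += $3 } END { for (t in total) printf "%s,%.1f\n", t, total[t] }' | sort
```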


Output Files

File                 Description
workspace_logs.csv   Main output: one row per workspace per month
errors.log           Captures failed or timed-out queries
debug.log            Diagnostic info and status logs

Sample Output (workspace_logs.csv)

SubscriptionID,WorkspaceID,Name,ResourceGroup,Location,CreatedDate,RetentionInDays,DailyQuotaGb,Month,MonthlyRecords,MonthlyVolume
abc-sub-123,...,MyWorkspace,...,...,2023-01-01T12:00:00Z,30,5.0,2024-12,124234,203948573
abc-sub-123,...,MyWorkspace,...,...,2023-01-01T12:00:00Z,30,5.0,2025-01,122003,200330123
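Note that MonthlyVolume in the CSV is raw bytes (sum of _BilledSize), not GB. To mirror the GB rounding the standalone KQL query uses, divide by 1073741824; for example with awk over the CSV (the row below is illustrative):

```shell
# Convert the MonthlyVolume column (bytes, field 11) to GB, keyed by Month (field 9)
echo 'abc-sub-123,ws-1,MyWorkspace,my-rg,eastus,2023-01-01T12:00:00Z,30,5.0,2024-12,124234,203948573' |
awk -F, '{ printf "%s,%.3f\n", $9, $11 / 1073741824 }'
```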

Prerequisites

  • Azure CLI (az)
  • Azure credentials with Reader or Log Analytics Reader role

Bash only:

  • Bash-compatible shell (Linux/macOS)
  • timeout or gtimeout (macOS: brew install coreutils)

PowerShell only:

  • PowerShell 7+ (pwsh) recommended for cross-platform use
  • Windows PowerShell 5.1 is also supported

Known Limitations

  • Workspaces with no recent data won't return rows (this is not an error).
  • Queries are billed per run (beware of large environments).
  • Script does not yet support parallel throttling (all workspaces are queried concurrently).
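The missing throttling noted above could be retrofitted with simple batching: launch up to a cap, then wait for the batch to drain before launching more. A hedged sketch, where MAX_JOBS is an invented knob and the sleep stands in for the per-workspace az query:

```shell
MAX_JOBS=4      # hypothetical concurrency cap (not a current script option)
count=0
for ws in ws1 ws2 ws3 ws4 ws5 ws6; do
    ( sleep 1; echo "queried $ws" ) &   # placeholder for the real query job
    count=$((count + 1))
    if [ "$count" -ge "$MAX_JOBS" ]; then
        wait        # drain the current batch before launching more
        count=0
    fi
done
wait
```

Batching is coarser than a true job pool (a slow workspace stalls its whole batch) but needs no extra tooling.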

Tips

  • Visualize the CSV in Excel or Google Sheets to analyze trends.
  • Use filters to scope large environments and reduce cost.
  • Combine with cron (Bash) or Task Scheduler (PowerShell) to collect monthly snapshots for auditing.
  • Consider enriching with workspace.id or tags if needed.
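For the cron route mentioned above, a monthly entry might look like this (path, schedule, and log destination are all illustrative; add via crontab -e):

```shell
# m h dom mon dow: run at 02:00 on the 1st of each month
0 2 1 * * /path/to/query_az_logs.sh --subscription <sub-id> >> /path/to/cron.log 2>&1
```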

License

MIT — free to use, fork, and modify.

#!/usr/bin/env pwsh
<#
.SYNOPSIS
Pulls monthly log volume trends from all Log Analytics workspaces in an Azure subscription.
.PARAMETER Subscription
Azure subscription ID.
.PARAMETER Region
Filter by Azure region (e.g. eastus).
.PARAMETER ResourceGroup
Filter by resource group name.
.PARAMETER Timeout
Query timeout in seconds (default: 15).
.EXAMPLE
./query_az_logs.ps1 -Subscription <sub-id> -Region eastus -ResourceGroup my-rg
#>
param(
    [string]$Subscription = "",
    [string]$Region = "",
    [string]$ResourceGroup = "",
    [int]$Timeout = 15
)
# Output files
$OutputFile = "workspace_logs.csv"
$ErrorLog = "errors.log"
$DebugLog = "debug.log"
$WorkspacesTmp = "workspaces.tmp"
# Verify az CLI is available
if (-not (Get-Command az -ErrorAction SilentlyContinue)) {
    Write-Error "Azure CLI ('az') not found. Install it from https://aka.ms/installazurecliwindows"
    exit 1
}
# Wizard mode if no parameters were provided
if (-not $Subscription -and -not $Region -and -not $ResourceGroup) {
    Write-Host "No options provided, entering wizard mode..."
    Write-Host "-----------------------------------------------"
    $Subscription = Read-Host "Enter your Azure Subscription ID (leave blank to use all)"
    $Region = Read-Host "Filter by region (press Enter to skip)"
    $ResourceGroup = Read-Host "Filter by resource group (press Enter to skip)"
}
# Determine subscriptions
if ($Subscription) {
    $SubscriptionIds = @($Subscription)
} else {
    # Tee to the debug log first; Write-Host alone would bypass the pipeline
    "[INFO] No subscription provided. Querying all accessible subscriptions..." | Tee-Object -FilePath $DebugLog -Append | Write-Host
    $SubscriptionIds = az account list --query "[?state=='Enabled'].id" -o tsv 2>&1
    if ($LASTEXITCODE -ne 0) {
        Write-Error "Failed to list subscriptions. Ensure you are logged in with 'az login'."
        exit 1
    }
    $SubscriptionIds = $SubscriptionIds | Where-Object { $_ -match '\S' }
}
# Prepare output files
"SubscriptionID,WorkspaceID,Name,ResourceGroup,Location,CreatedDate,RetentionInDays,DailyQuotaGb,Month,MonthlyRecords,MonthlyVolume" | Set-Content $OutputFile
Clear-Content $ErrorLog -ErrorAction SilentlyContinue; New-Item $ErrorLog -ItemType File -Force | Out-Null
Clear-Content $DebugLog -ErrorAction SilentlyContinue; New-Item $DebugLog -ItemType File -Force | Out-Null
# Build JMESPath filter (compose the clauses explicitly; appending to "[]" would produce invalid syntax)
$QueryFilter = "[]"
if ($Region -and $ResourceGroup) { $QueryFilter = "[?location=='$Region' && resourceGroup=='$ResourceGroup']" }
elseif ($Region) { $QueryFilter = "[?location=='$Region']" }
elseif ($ResourceGroup) { $QueryFilter = "[?resourceGroup=='$ResourceGroup']" }
# KQL query for last 12 months of monthly volume
$KqlQuery = @"
let startDate = startofmonth(datetime_add('month', -11, startofmonth(now())));
union *
| where TimeGenerated >= startDate
| extend Month = format_datetime(TimeGenerated, 'yyyy-MM')
| summarize MonthlyRecords = count(), MonthlyVolume = sum(_BilledSize) by Month
| sort by Month asc
"@
# Collect all workspace info across subscriptions
$AllWorkspaces = [System.Collections.Generic.List[hashtable]]::new()
foreach ($SubId in $SubscriptionIds) {
    "[INFO] Processing subscription $SubId" | Tee-Object -FilePath $DebugLog -Append | Write-Host
    $null = az account set --subscription $SubId
    if ($LASTEXITCODE -ne 0) {
        "[$(Get-Date)] Failed to set subscription $SubId" | Add-Content $ErrorLog
        continue
    }
    $WorkspacesJson = az monitor log-analytics workspace list `
        --query "$QueryFilter | [].{id:customerId, name:name, rg:resourceGroup, loc:location, created:createdDate, retention:retentionInDays, quota:workspaceCapping.dailyQuotaGb}" `
        --output json 2>&1
    if ($LASTEXITCODE -ne 0) {
        "[$(Get-Date)] Failed to list workspaces for subscription $SubId" | Add-Content $ErrorLog
        $WorkspacesJson | Add-Content $ErrorLog
        continue
    }
    $Workspaces = $WorkspacesJson | ConvertFrom-Json
    foreach ($ws in $Workspaces) {
        $AllWorkspaces.Add(@{
            SubId     = $SubId
            Id        = $ws.id
            Name      = $ws.name
            Rg        = $ws.rg
            Loc       = $ws.loc
            Created   = $ws.created
            Retention = $ws.retention
            Quota     = if ($ws.quota) { $ws.quota } else { "N/A" }
        })
    }
}
$Total = $AllWorkspaces.Count
$Launched = 0
if ($Total -eq 0) {
    Write-Host "No workspaces found matching the specified filters."
    exit 0
}
# Scriptblock executed in each parallel job
$JobScript = {
    param($ws, $KqlQuery, $Timeout, $OutputFile, $ErrorLog, $DebugLog)
    $Id = $ws.Id
    $Name = $ws.Name
    $SubId = $ws.SubId
    $Rg = $ws.Rg
    $Loc = $ws.Loc
    $Created = $ws.Created
    $Retention = $ws.Retention
    $Quota = $ws.Quota
    "[INFO] Starting job for $Id" | Add-Content $DebugLog
    # Run the query with a timeout via a nested job
    $QueryJob = Start-Job -ScriptBlock {
        param($Id, $KqlQuery)
        az monitor log-analytics query -w $Id --analytics-query $KqlQuery --output yaml 2>&1
    } -ArgumentList $Id, $KqlQuery
    $Finished = Wait-Job $QueryJob -Timeout $Timeout
    if (-not $Finished) {
        Stop-Job $QueryJob
        Remove-Job $QueryJob -Force
        "[$(Get-Date)] Query timed out for workspace $Id" | Add-Content $ErrorLog
        "[INFO] Finished job for $Id (timeout)" | Add-Content $DebugLog
        return
    }
    $Result = Receive-Job $QueryJob
    $JobState = $QueryJob.State
    Remove-Job $QueryJob -Force
    # Succeed only if the job completed and no output line is an az error
    $ErrorLines = @($Result | Where-Object { $_ -match '^ERROR' })
    if ($JobState -eq "Completed" -and $ErrorLines.Count -eq 0) {
        # Parse YAML-like output from az cli; each record ends with a TableName line
        $Month = $null; $Records = $null; $Volume = $null
        foreach ($line in $Result) {
            if ($line -match '^- Month:\s*(.+)') {
                $Month = $Matches[1].Trim().Trim("'")
            } elseif ($line -match '^\s+MonthlyRecords:\s*(.+)') {
                $Records = $Matches[1].Trim().Trim("'")
            } elseif ($line -match '^\s+MonthlyVolume:\s*(.+)') {
                $Volume = $Matches[1].Trim().Trim("'")
            } elseif ($line -match '^\s+TableName:') {
                if ($Month -and $Records -and $Volume) {
                    "$SubId,$Id,$Name,$Rg,$Loc,$Created,$Retention,$Quota,$Month,$Records,$Volume" |
                        Add-Content $OutputFile
                }
                $Month = $null; $Records = $null; $Volume = $null
            }
        }
        # Flush last record if no trailing TableName line
        if ($Month -and $Records -and $Volume) {
            "$SubId,$Id,$Name,$Rg,$Loc,$Created,$Retention,$Quota,$Month,$Records,$Volume" |
                Add-Content $OutputFile
        }
    } else {
        "[$(Get-Date)] Query failed for workspace $Id" | Add-Content $ErrorLog
        $Result | Add-Content $ErrorLog
        "----------------------------------------" | Add-Content $ErrorLog
    }
    "[INFO] Finished job for $Id" | Add-Content $DebugLog
}
# Launch parallel jobs
$Jobs = @()
foreach ($ws in $AllWorkspaces) {
    "[DEBUG] Launching job for $($ws.Id) in $($ws.SubId)" | Add-Content $DebugLog
    $Jobs += Start-Job -ScriptBlock $JobScript -ArgumentList $ws, $KqlQuery, $Timeout, $OutputFile, $ErrorLog, $DebugLog
    $Launched++
}
# Progress tracking
while ($true) {
    $Running = ($Jobs | Where-Object { $_.State -eq "Running" }).Count
    $Completed = $Launched - $Running
    Write-Host -NoNewline "`rProcessed $Completed of $Launched workspaces... ($Running running)"
    if ($Running -eq 0) { break }
    Start-Sleep -Seconds 2
}
# Collect and clean up jobs
foreach ($job in $Jobs) {
    $null = Receive-Job $job -ErrorAction SilentlyContinue
    Remove-Job $job -Force
}
Write-Host "`nDone. Output written to $OutputFile"
if ((Get-Item $ErrorLog).Length -gt 0) {
    Write-Host "Some errors were logged to $ErrorLog"
}
#!/bin/sh
# Initialize defaults
SUBSCRIPTION_ID=""
REGION_FILTER=""
RG_FILTER=""
SHOW_HELP="false"
CHILD_PIDS=""
QUERY_TIMEOUT=15 # seconds
# Output files
OUTPUT_FILE="workspace_logs.csv"
ERROR_LOG="errors.log"
DEBUG_LOG="debug.log"
WORKSPACES_TMP="workspaces.tmp"
# Determine timeout command
if command -v timeout >/dev/null 2>&1; then
    TIMEOUT_CMD="timeout"
elif command -v gtimeout >/dev/null 2>&1; then
    TIMEOUT_CMD="gtimeout"
else
    echo "Error: 'timeout' or 'gtimeout' not found. Install coreutils (e.g. 'brew install coreutils')"
    exit 1
fi
# Cleanup
cleanup() {
    # printf: 'echo "\n"' is not portable across /bin/sh implementations
    printf '\nTerminating... cleaning up temporary files and background jobs.\n'
    for pid in $CHILD_PIDS; do
        kill "$pid" 2>/dev/null
    done
    rm -f "$WORKSPACES_TMP"
    exit 1
}
trap cleanup INT TERM
# Help text
print_help() {
    cat <<EOF
Usage: ./query_az_logs.sh [OPTIONS]

Pulls monthly log volume trends from all Log Analytics workspaces in an Azure subscription.

Options:
  --subscription <ID>     Azure subscription ID
  --region <REGION>       Filter by Azure region (e.g. eastus)
  --resource-group <RG>   Filter by resource group
  --timeout <SECONDS>     Query timeout in seconds (default: $QUERY_TIMEOUT)
  -h, --help              Show this help message

Example:
  ./query_az_logs.sh --subscription <sub-id> --region eastus --resource-group my-rg
EOF
}
# Parse CLI args
while [ $# -gt 0 ]; do
    case "$1" in
        --subscription) SUBSCRIPTION_ID=$2; shift 2 ;;
        --region) REGION_FILTER=$2; shift 2 ;;
        --resource-group) RG_FILTER=$2; shift 2 ;;
        --timeout) QUERY_TIMEOUT=$2; shift 2 ;;
        -h|--help) SHOW_HELP="true"; shift 1 ;;
        *) echo "Unknown option: $1"; print_help; exit 1 ;;
    esac
done
if [ "$SHOW_HELP" = "true" ]; then
    print_help
    exit 0
fi
# Prompt if needed
if [ -z "$SUBSCRIPTION_ID$REGION_FILTER$RG_FILTER" ]; then
    echo "No CLI options provided, entering wizard mode..."
    echo "-----------------------------------------------"
    printf "Enter your Azure Subscription ID (leave blank to use all): "
    read SUBSCRIPTION_ID
    printf "Filter by region (press Enter to skip): "
    read REGION_FILTER
    printf "Filter by resource group (press Enter to skip): "
    read RG_FILTER
fi
# Determine subscriptions
if [ -n "$SUBSCRIPTION_ID" ]; then
    SUBSCRIPTION_IDS="$SUBSCRIPTION_ID"
else
    echo "[INFO] No subscription provided. Querying all accessible subscriptions..." | tee -a "$DEBUG_LOG"
    SUBSCRIPTION_IDS=$(az account list --query "[?state=='Enabled'].id" -o tsv)
fi
# Prepare output
echo "SubscriptionID,WorkspaceID,Name,ResourceGroup,Location,CreatedDate,RetentionInDays,DailyQuotaGb,Month,MonthlyRecords,MonthlyVolume" > "$OUTPUT_FILE"
: > "$ERROR_LOG"
: > "$DEBUG_LOG"
# Build JMESPath filter (compose the clauses explicitly; appending to "[]" would produce invalid syntax)
QUERY_FILTER="[]"
if [ -n "$REGION_FILTER" ] && [ -n "$RG_FILTER" ]; then
    QUERY_FILTER="[?location=='$REGION_FILTER' && resourceGroup=='$RG_FILTER']"
elif [ -n "$REGION_FILTER" ]; then
    QUERY_FILTER="[?location=='$REGION_FILTER']"
elif [ -n "$RG_FILTER" ]; then
    QUERY_FILTER="[?resourceGroup=='$RG_FILTER']"
fi
LAUNCHED=0
# Loop through all subscriptions
for SUB_ID in $SUBSCRIPTION_IDS; do
    echo "[INFO] Processing subscription $SUB_ID" | tee -a "$DEBUG_LOG"
    if ! az account set --subscription "$SUB_ID"; then
        echo "[$(date)] Failed to set subscription $SUB_ID" >> "$ERROR_LOG"
        continue
    fi
    az monitor log-analytics workspace list \
        --query "$QUERY_FILTER | [].{id:customerId, name:name, rg:resourceGroup, loc:location, created:createdDate, retention:retentionInDays, quota:workspaceCapping.dailyQuotaGb}" \
        --output tsv > "$WORKSPACES_TMP"
    TOTAL=$(wc -l < "$WORKSPACES_TMP" | tr -d '[:space:]')  # workspace count (informational)
    while IFS= read -r LINE || [ -n "$LINE" ]; do
        (
            ID=$(echo "$LINE" | cut -f1)
            NAME=$(echo "$LINE" | cut -f2)
            RG=$(echo "$LINE" | cut -f3)
            LOC=$(echo "$LINE" | cut -f4)
            CREATED=$(echo "$LINE" | cut -f5)
            RETENTION=$(echo "$LINE" | cut -f6)
            QUOTA=$(echo "$LINE" | cut -f7)
            [ -z "$QUOTA" ] && QUOTA="N/A"
            echo "[INFO] Starting job for $ID" >> "$DEBUG_LOG"
            QUERY="let startDate = startofmonth(datetime_add('month', -11, startofmonth(now())));
union *
| where TimeGenerated >= startDate
| extend Month = format_datetime(TimeGenerated, 'yyyy-MM')
| summarize MonthlyRecords = count(), MonthlyVolume = sum(_BilledSize) by Month
| sort by Month asc"
            CMD="$TIMEOUT_CMD $QUERY_TIMEOUT az monitor log-analytics query -w \"$ID\" --analytics-query \"$QUERY\" --output yaml"
            RESULT=$(eval "$CMD" 2>&1)
            EXIT_CODE=$?
            if [ $EXIT_CODE -eq 0 ]; then
                # Each YAML record ends with a TableName line; flush one CSV row per record.
                # Indented-field regexes use [[:space:]]+ since az indents with two spaces.
                echo "$RESULT" | awk '
                    /^- Month:/ { month=$3; gsub(/^'\''|'\''$/, "", month) }
                    /^[[:space:]]+MonthlyRecords:/ { records=$2; gsub(/^'\''|'\''$/, "", records) }
                    /^[[:space:]]+MonthlyVolume:/ { volume=$2; gsub(/^'\''|'\''$/, "", volume) }
                    /^[[:space:]]+TableName:/ {
                        if (month && records && volume) {
                            printf "%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s\n", "'"$SUB_ID"'", "'"$ID"'", "'"$NAME"'", "'"$RG"'", "'"$LOC"'", "'"$CREATED"'", "'"$RETENTION"'", "'"$QUOTA"'", month, records, volume
                        }
                        month=""; records=""; volume=""
                    }
                ' >> "$OUTPUT_FILE"
            else
                echo "[$(date)] Query failed or timed out for workspace $ID (exit $EXIT_CODE)" >> "$ERROR_LOG"
                echo "$CMD" >> "$ERROR_LOG"
                echo "$RESULT" >> "$ERROR_LOG"
                echo "----------------------------------------" >> "$ERROR_LOG"
            fi
            echo "[INFO] Finished job for $ID (exit $EXIT_CODE)" >> "$DEBUG_LOG"
        ) & pid=$!
        CHILD_PIDS="$CHILD_PIDS $pid"
        LAUNCHED=$((LAUNCHED + 1))
        # $ID is only set inside the subshell, so log just the PID and subscription here
        echo "[DEBUG] Launched PID $pid in $SUB_ID" >> "$DEBUG_LOG"
    done < "$WORKSPACES_TMP"
done
# Wait with progress tracking
COMPLETED=0
while [ "$COMPLETED" -lt "$LAUNCHED" ]; do
    count_running=0
    for pid in $CHILD_PIDS; do
        if kill -0 "$pid" 2>/dev/null; then
            count_running=$((count_running + 1))
        fi
    done
    COMPLETED=$((LAUNCHED - count_running))
    printf "\rProcessed %s of %s workspaces... (%s running)" "$COMPLETED" "$LAUNCHED" "$count_running"
    sleep 2
done
# Clean up leftover PIDs (if stuck)
for pid in $CHILD_PIDS; do
    if kill -0 "$pid" 2>/dev/null; then
        echo "[WARNING] Forcibly killing PID $pid" >> "$DEBUG_LOG"
        kill "$pid" 2>/dev/null
    fi
done
rm -f "$WORKSPACES_TMP"
printf "\nDone. Output written to %s\n" "$OUTPUT_FILE"
[ -s "$ERROR_LOG" ] && echo "Some errors were logged to $ERROR_LOG"