Average Origin Latency:
avg_over_time(aws_cloudfront_origin_latency{<DIMENSION NAME="DIMENSION VALUE">}[5m])
Application Latency p99:
avg_over_time(aws_elasticbeanstalk_application_latency_p99{<DIMENSION NAME="DIMENSION VALUE">}[5m])
Disk Read Ops:
avg(rate(aws_ec2_disk_read_ops{<DIMENSION NAME="DIMENSION VALUE">}[5m]))
Disk Write Ops:
avg(rate(aws_ec2_disk_write_ops{<DIMENSION NAME="DIMENSION VALUE">}[5m]))
Read Latency:
avg(avg_over_time(aws_rds_read_latency{<DIMENSION NAME="DIMENSION VALUE">}[5m]))
Write Latency:
avg(avg_over_time(aws_rds_write_latency{<DIMENSION NAME="DIMENSION VALUE">}[5m]))
Total Requests:
sum(rate(aws_cloudfront_requests{<DIMENSION NAME="DIMENSION VALUE">}[1m]))
Total Application Requests:
sum(rate(aws_elasticbeanstalk_application_requests_total{<DIMENSION NAME="DIMENSION VALUE">}[1m]))
Network In:
sum(rate(aws_ec2_network_in{<DIMENSION NAME="DIMENSION VALUE">}[1m]))
Network Out:
sum(rate(aws_ec2_network_out{<DIMENSION NAME="DIMENSION VALUE">}[1m]))
Database Connections:
sum(avg_over_time(aws_rds_database_connections{<DIMENSION NAME="DIMENSION VALUE">}[1m]))
4XX Error Rate:
avg(avg_over_time(aws_cloudfront_4xx_error_rate{<DIMENSION NAME="DIMENSION VALUE">}[1m]))
5XX Error Rate:
avg(avg_over_time(aws_cloudfront_5xx_error_rate{<DIMENSION NAME="DIMENSION VALUE">}[1m]))
4XX Requests:
sum(rate(aws_elasticbeanstalk_application_requests4xx{<DIMENSION NAME="DIMENSION VALUE">}[1m]))
5XX Requests:
sum(rate(aws_elasticbeanstalk_application_requests5xx{<DIMENSION NAME="DIMENSION VALUE">}[1m]))
Status Check Failed:
increase(aws_ec2_status_check_failed{<DIMENSION NAME="DIMENSION VALUE">}[5m])
Deadlocks:
increase(aws_rds_deadlocks{<DIMENSION NAME="DIMENSION VALUE">}[5m])
Cache Hit Rate:
avg_over_time(aws_cloudfront_cache_hit_rate{<DIMENSION NAME="DIMENSION VALUE">}[15m])
CPU Utilization (Elastic Beanstalk):
avg(avg_over_time(aws_elasticbeanstalk_cpuutilization{<DIMENSION NAME="DIMENSION VALUE">}[5m]))
CPU Utilization (EC2):
avg(avg_over_time(aws_ec2_cpuutilization{<DIMENSION NAME="DIMENSION VALUE">}[5m]))
CPU Utilization (RDS):
avg(avg_over_time(aws_rds_cpuutilization{<DIMENSION NAME="DIMENSION VALUE">}[5m]))
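Every query above carries a <DIMENSION NAME="DIMENSION VALUE"> placeholder that must be replaced with a real label matcher before the query will parse. The following is a minimal Python sketch of that substitution. The helper name fill_dimension, the label name DistributionId, and the value EXAMPLE123 are illustrative assumptions; the actual label names depend on how your exporter maps CloudWatch dimensions to Prometheus labels.

```python
PLACEHOLDER = '<DIMENSION NAME="DIMENSION VALUE">'


def fill_dimension(query: str, name: str, value: str) -> str:
    """Replace every dimension placeholder in `query` with name="value".

    `name` is assumed to already be a valid Prometheus label name.
    """
    # Escape backslashes and double quotes so the value stays a valid
    # double-quoted PromQL string literal.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return query.replace(PLACEHOLDER, f'{name}="{escaped}"')


# Hypothetical label name and value, for illustration only.
template = (
    'avg_over_time(aws_cloudfront_cache_hit_rate'
    '{<DIMENSION NAME="DIMENSION VALUE">}[15m])'
)
print(fill_dimension(template, "DistributionId", "EXAMPLE123"))
```

Because the helper replaces every occurrence, it also works for the compound queries that repeat the placeholder several times.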
95th Percentile of Origin Latency Over Time
Calculates the 95th percentile of origin response times, highlighting the worst latency experiences:
quantile_over_time(0.95, aws_cloudfront_origin_latency{<DIMENSION NAME="DIMENSION VALUE">}[5m])
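To make the percentile concrete: assuming the latency metric is exposed as a plain gauge (one value per scrape) rather than as histogram buckets, a percentile over a window is a rank into the sorted samples with linear interpolation between neighbours, which is how PromQL's quantile_over_time treats the samples in a range. The sketch below is illustrative only; the function name and the sample values are hypothetical.

```python
import math


def quantile(q: float, samples: list[float]) -> float:
    """Return the q-quantile (0 <= q <= 1) of `samples`.

    Uses linear interpolation between the two closest ranks.
    Returns NaN for an empty window.
    """
    if not samples:
        return math.nan
    ordered = sorted(samples)
    rank = q * (len(ordered) - 1)             # fractional index into the sorted list
    lower = math.floor(rank)
    upper = min(len(ordered) - 1, lower + 1)  # clamp at the last sample
    weight = rank - lower                     # how far we are towards `upper`
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


# Hypothetical origin latencies (ms) observed in one 5-minute window.
window = [120.0, 95.0, 110.0, 400.0, 105.0]
print(quantile(0.95, window))
```

A single slow sample dominates the result here, which is the behaviour that makes high percentiles useful for surfacing the worst latency experiences.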
Error Rate Increase
Measures the net change in the 4XX and 5XX error rates over the past hour, capturing spikes in client or server errors:
delta(aws_cloudfront_4xx_error_rate{<DIMENSION NAME="DIMENSION VALUE">}[1h]) + delta(aws_cloudfront_5xx_error_rate{<DIMENSION NAME="DIMENSION VALUE">}[1h])
Cache Hit Ratio vs. Total Requests
Divides the total request rate by the average cache hit rate, providing insight into cache effectiveness relative to demand:
sum(rate(aws_cloudfront_requests{<DIMENSION NAME="DIMENSION VALUE">}[5m])) / avg(avg_over_time(aws_cloudfront_cache_hit_rate{<DIMENSION NAME="DIMENSION VALUE">}[5m]))
Latency Distribution Change
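Tracks the day-over-day change in the 99th percentile latency by comparing the last day's peak with the previous day's peak, identifying significant shifts in application performance:
max_over_time(aws_elasticbeanstalk_application_latency_p99{<DIMENSION NAME="DIMENSION VALUE">}[1d]) - max_over_time(aws_elasticbeanstalk_application_latency_p99{<DIMENSION NAME="DIMENSION VALUE">}[1d] offset 1d)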
Traffic vs. CPU Utilization Correlation
Relates the total application request rate to CPU utilization (requests per second per percentage point of CPU), assessing how traffic impacts resource consumption:
sum(rate(aws_elasticbeanstalk_application_requests_total{<DIMENSION NAME="DIMENSION VALUE">}[5m])) / avg(avg_over_time(aws_elasticbeanstalk_cpuutilization{<DIMENSION NAME="DIMENSION VALUE">}[5m]))
Disk I/O vs. CPU Saturation
Evaluates disk I/O in relation to CPU utilization, identifying potential bottlenecks in data processing or resource saturation:
(sum(rate(aws_ec2_disk_read_bytes{<DIMENSION NAME="DIMENSION VALUE">}[5m])) + sum(rate(aws_ec2_disk_write_bytes{<DIMENSION NAME="DIMENSION VALUE">}[5m]))) / avg(avg_over_time(aws_ec2_cpuutilization{<DIMENSION NAME="DIMENSION VALUE">}[5m]))
Network Throughput Efficiency
Calculates total network throughput per EC2 instance, offering a measure of network efficiency across the fleet:
(sum(rate(aws_ec2_network_in{<DIMENSION NAME="DIMENSION VALUE">}[5m])) + sum(rate(aws_ec2_network_out{<DIMENSION NAME="DIMENSION VALUE">}[5m]))) / count(aws_ec2_info)
Read/Write Ratio
Computes the ratio of read IOPS to write IOPS, showing whether the database workload is read-heavy (above 1) or write-heavy (below 1):
sum(avg_over_time(aws_rds_read_iops{<DIMENSION NAME="DIMENSION VALUE">}[5m])) / sum(avg_over_time(aws_rds_write_iops{<DIMENSION NAME="DIMENSION VALUE">}[5m]))
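As a worked example of the ratio, the sketch below divides read IOPS by write IOPS and mirrors PromQL's floating-point behaviour when the denominator is zero (a positive value divided by zero gives +Inf, zero divided by zero gives NaN). The function name and the sample figures are hypothetical.

```python
def read_write_ratio(read_iops: float, write_iops: float) -> float:
    """Return read IOPS divided by write IOPS.

    Follows IEEE 754 semantics for a zero denominator, as PromQL does:
    positive / 0 -> +Inf, 0 / 0 -> NaN.
    """
    if write_iops == 0:
        return float("inf") if read_iops > 0 else float("nan")
    return read_iops / write_iops


# Hypothetical 5-minute averages for one database instance.
ratio = read_write_ratio(300.0, 100.0)
print("read-heavy" if ratio > 1 else "write-heavy or balanced", ratio)
```

A dashboard panel or alert built on this ratio should account for the +Inf and NaN cases, which appear whenever the instance sees no writes in the window.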
Database Connection Spikes
Monitors the net change in database connections over 15 minutes, helping identify unexpected spikes in demand or potential DDoS attacks:
delta(aws_rds_database_connections{<DIMENSION NAME="DIMENSION VALUE">}[15m])
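The idea behind a connection-spike check can be sketched as the difference between the last and first sample in a window, compared against a threshold. This is a simplification under stated assumptions: the function names and the threshold of 100 connections are hypothetical, and PromQL range functions such as delta additionally extrapolate to the window boundaries, which this sketch does not.

```python
def net_change(samples: list[tuple[float, float]]) -> float:
    """Return last value minus first value for (unix_timestamp, value) pairs.

    Returns 0.0 when the window holds fewer than two samples.
    """
    if len(samples) < 2:
        return 0.0
    ordered = sorted(samples)  # sort by timestamp
    return ordered[-1][1] - ordered[0][1]


def is_spike(samples: list[tuple[float, float]], threshold: float) -> bool:
    """Flag the window when connections grew by at least `threshold`."""
    return net_change(samples) >= threshold


# Hypothetical connection counts sampled over a 15-minute (900 s) window.
window = [(0.0, 40.0), (300.0, 55.0), (600.0, 90.0), (900.0, 180.0)]
print(net_change(window), is_spike(window, threshold=100.0))
```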
The metric names below were sourced by running aws cloudwatch list-metrics --namespace "CWAgent" | rg "MetricName" and removing duplicates.
Names and dimension examples to filter on:
"MetricName": "ethtool_rx_packets",
"MetricName": "ethtool_tx_packets",
"MetricName": "mem_used_percent",
"MetricName": "ethtool_bw_in_allowance_exceeded",
"MetricName": "ethtool_pps_allowance_exceeded",
"MetricName": "ethtool_bw_out_allowance_exceeded",
"MetricName": "ethtool_conntrack_allowance_exceeded",
"MetricName": "ethtool_linklocal_allowance_exceeded",
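The deduplication step can also be done programmatically. The sketch below assumes the JSON printed by aws cloudwatch list-metrics, which wraps its entries in a top-level "Metrics" array, has been captured as a string; the function name unique_metric_names is hypothetical and the sample payload is abbreviated.

```python
import json


def unique_metric_names(list_metrics_json: str) -> list[str]:
    """Return metric names in first-seen order with duplicates removed."""
    payload = json.loads(list_metrics_json)
    seen: dict[str, None] = {}  # dicts preserve insertion order
    for metric in payload.get("Metrics", []):
        seen.setdefault(metric["MetricName"], None)
    return list(seen)


# Abbreviated sample: the same metric is listed once per dimension set.
sample = json.dumps({
    "Metrics": [
        {"Namespace": "CWAgent", "MetricName": "mem_used_percent", "Dimensions": []},
        {"Namespace": "CWAgent", "MetricName": "ethtool_rx_packets", "Dimensions": []},
        {"Namespace": "CWAgent", "MetricName": "mem_used_percent", "Dimensions": []},
    ]
})
print(unique_metric_names(sample))
```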
{
    "MetricName": "mem_used_percent",
    "Dimensions": [
        {
            "Name": "InstanceId",
            "Value": "i-0426944dd2ab4de8e"
        }
    ]
},
{
    "MetricName": "ethtool_rx_packets",
    "Dimensions": [
        {
            "Name": "driver",
            "Value": "ixgbevf"
        },
        {
            "Name": "InstanceId",
            "Value": "i-0426944dd2ab4de8e"
        },
        {
            "Name": "interface",
            "Value": "eth0"
        }
    ]
},
{
    "MetricName": "ethtool_tx_packets",
    "Dimensions": [
        {
            "Name": "driver",
            "Value": "ixgbevf"
        },
        {
            "Name": "InstanceId",
            "Value": "i-0426944dd2ab4de8e"
        },
        {
            "Name": "interface",
            "Value": "eth0"
        }
    ]
},
{
    "MetricName": "ethtool_bw_in_allowance_exceeded",
    "Dimensions": [
        {
            "Name": "driver",
            "Value": "ena"
        },
        {
            "Name": "InstanceId",
            "Value": "i-031b6cd4fd1ce0e96"
        },
        {
            "Name": "interface",
            "Value": "eth0"
        }
    ]
},
{
    "MetricName": "ethtool_linklocal_allowance_exceeded",
    "Dimensions": [
        {
            "Name": "driver",
            "Value": "ena"
        },
        {
            "Name": "InstanceId",
            "Value": "i-031b6cd4fd1ce0e96"
        },
        {
            "Name": "interface",
            "Value": "eth0"
        }
    ]
},
{
    "MetricName": "ethtool_pps_allowance_exceeded",
    "Dimensions": [
        {
            "Name": "driver",
            "Value": "ena"
        },
        {
            "Name": "InstanceId",
            "Value": "i-031b6cd4fd1ce0e96"
        },
        {
            "Name": "interface",
            "Value": "eth0"
        }
    ]
},
{
    "MetricName": "ethtool_bw_out_allowance_exceeded",
    "Dimensions": [
        {
            "Name": "driver",
            "Value": "ena"
        },
        {
            "Name": "InstanceId",
            "Value": "i-031b6cd4fd1ce0e96"
        },
        {
            "Name": "interface",
            "Value": "eth0"
        }
    ]
},
{
    "MetricName": "ethtool_conntrack_allowance_exceeded",
    "Dimensions": [
        {
            "Name": "driver",
            "Value": "ena"
        },
        {
            "Name": "InstanceId",
            "Value": "i-031b6cd4fd1ce0e96"
        },
        {
            "Name": "interface",
            "Value": "eth0"
        }
    ]
},
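To see at a glance which dimensions each metric can be filtered on, entries like the ones above can be indexed by metric name. The sketch below assumes the entries have already been parsed into Python dictionaries; the function name dimensions_by_metric is hypothetical, and the sample reuses two entries from the listing.

```python
from collections import defaultdict


def dimensions_by_metric(metrics: list[dict]) -> dict[str, dict[str, set[str]]]:
    """Map each metric name to {dimension name: set of observed values}."""
    index: dict[str, dict[str, set[str]]] = defaultdict(lambda: defaultdict(set))
    for metric in metrics:
        dims = index[metric["MetricName"]]  # also registers metrics with no dimensions
        for dim in metric.get("Dimensions", []):
            dims[dim["Name"]].add(dim["Value"])
    # Convert the nested defaultdicts to plain dicts for predictable output.
    return {name: dict(dims) for name, dims in index.items()}


entries = [
    {
        "MetricName": "mem_used_percent",
        "Dimensions": [{"Name": "InstanceId", "Value": "i-0426944dd2ab4de8e"}],
    },
    {
        "MetricName": "ethtool_rx_packets",
        "Dimensions": [
            {"Name": "driver", "Value": "ixgbevf"},
            {"Name": "InstanceId", "Value": "i-0426944dd2ab4de8e"},
            {"Name": "interface", "Value": "eth0"},
        ],
    },
]
for name, dims in dimensions_by_metric(entries).items():
    print(name, sorted(dims))
```

How these dimension names appear as Prometheus labels depends on the exporter in use, so treat the names printed here as CloudWatch-side names only.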