Metrics & SLOs ¶
x-pdb exposes Prometheus metrics on the /metrics path. To enable them, set the serviceMonitor.enabled Helm value to true.
Metrics ¶
x-pdb ¶
Name | Type | Description |
---|---|---|
pod_eviction_rejected | Counter | Number of evictions that have been rejected by x-pdb. |
pod_matches_multiple_xpdbs | Counter | Number of eviction attempts observed for a pod that matches multiple XPDBs. This is an invalid configuration and must be fixed. |
lock_errors | Counter | Number of errors encountered while obtaining locks for an XPDB. |
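As a rough sketch of how these counters might be queried, the expressions below look at the rejection rate, misconfigured pods, and lock errors. They assume the metrics are exposed under exactly the names listed in the table; a deployment may add a prefix or a _total suffix, so check the /metrics output before relying on them. The time windows are illustrative choices.

```promql
# Rejected evictions per second, averaged over the last 5 minutes
sum(rate(pod_eviction_rejected[5m]))

# Returns a result only if an eviction hit a pod matched by more than one XPDB in the last hour
sum(increase(pod_matches_multiple_xpdbs[1h])) > 0

# Lock acquisition errors over the last 5 minutes
sum(increase(lock_errors[5m]))
```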
grpc metrics ¶
x-pdb exposes gRPC metrics for both client and server. They give insight into the latency and availability of the remote x-pdb servers.
SLOs ¶
Availability ¶
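There should be at least one pod ready to serve traffic at any time, preferably measured from both the kube-apiserver and x-pdb on other clusters. The following query counts admission webhook calls to x-pdb that failed open during the last 5 minutes: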
sum(increase(apiserver_admission_webhook_fail_open_count{name=~".*x-pdb.*"}[5m]))
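One way to turn the fail-open query into a signal is to compare it against zero. The kube-apiserver only fails open when the webhook's failurePolicy is Ignore, so a non-zero value means an eviction was let through without x-pdb checking it. The second expression is a sketch for webhooks configured with failurePolicy Fail; it assumes your Kubernetes version exposes apiserver_admission_webhook_rejection_count with the error_type label.

```promql
# Returns a result only when at least one webhook call failed open in the last 5 minutes
sum(increase(apiserver_admission_webhook_fail_open_count{name=~".*x-pdb.*"}[5m])) > 0

# Assumption: requests rejected because the kube-apiserver could not call the webhook
sum(increase(apiserver_admission_webhook_rejection_count{name=~".*x-pdb.*", error_type="calling_webhook_error"}[5m])) > 0
```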
Latency ¶
The amount of time x-pdb needs to respond to an admission webhook request, preferably measured from the kube-apiserver. x-pdb should respond to admission requests in less than 150ms at the 99th percentile (p99). The threshold may vary in your environment, depending on the cross-cluster latency.
histogram_quantile(0.99,
sum(rate(apiserver_admission_webhook_admission_duration_seconds_bucket{name=~".*x-pdb.*"}[5m])) by (le, name)
)
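To check the latency objective directly, the p99 query can be compared against the threshold. The metric is reported in seconds, so 150ms becomes 0.15. This is a sketch rather than a tuned alert: histogram_quantile interpolates within histogram buckets, so the precision of the result depends on the bucket boundaries your kube-apiserver exposes.

```promql
# Returns a result only when the p99 webhook latency exceeds 150ms (0.15s)
histogram_quantile(0.99,
  sum(rate(apiserver_admission_webhook_admission_duration_seconds_bucket{name=~".*x-pdb.*"}[5m])) by (le, name)
) > 0.15
```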