Kubernetes Checks Reference

KubeBuddy runs checks to find issues and misconfigurations in your Kubernetes cluster. These checks power the health report and help you fix problems, reduce risk, and improve stability. This page lists all checks by category, with their ID, name, description, severity, and score weight.

Overview

Each check targets a specific part of your cluster, such as nodes, pods, workloads, or security. Tables group the checks by category. Use them to understand what is being evaluated, how serious each issue is, and how much it affects your overall health score.

Section Mapping: YAML to Report Tabs

Each check includes a section value in its YAML. This table shows how those values map to the tabs in the HTML report:

YAML Section Value	Report Tab Name
`Nodes`	Nodes
`Namespaces`	Namespaces
`Workloads`	Workloads
`Pods`	Pods
`Jobs`	Jobs
`Networking`	Networking
`Storage`	Storage
`Configuration`	Configuration Hygiene
`Security`	Security
`Kubernetes Events`	Kubernetes Events

Use this when defining or updating checks to control where they appear in the report.
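
As a minimal sketch, the mapping above can be expressed as a lookup table. The names `SECTION_TO_TAB` and `report_tab` are hypothetical helpers for illustration, not part of KubeBuddy.

```python
# Mapping copied from the table above: YAML section value -> report tab name.
SECTION_TO_TAB = {
    "Nodes": "Nodes",
    "Namespaces": "Namespaces",
    "Workloads": "Workloads",
    "Pods": "Pods",
    "Jobs": "Jobs",
    "Networking": "Networking",
    "Storage": "Storage",
    "Configuration": "Configuration Hygiene",
    "Security": "Security",
    "Kubernetes Events": "Kubernetes Events",
}


def report_tab(section: str) -> str:
    """Return the report tab for a section value, or raise if it is unmapped."""
    try:
        return SECTION_TO_TAB[section]
    except KeyError:
        raise ValueError(f"Unknown section value: {section!r}") from None


print(report_tab("Configuration"))
```

`Configuration` is the only value whose tab name differs from the section value, so it is the one most worth checking when a new check lands in an unexpected tab.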

Checks by Category

Each table includes:

ID – Identifier for the check
Name – Short label
Description – What it checks and why it matters
Severity – Low / Medium / High / Critical / Warning / Info
Weight – Contribution to health score

Some low-severity checks are marked as advisories in the check catalog. Advisories surface useful review items without implying the resource is immediately broken.
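
The five columns can be modeled as a small record type when processing the catalog in scripts. This is a sketch only: the `Check` class and its validation rules are assumptions for illustration, not KubeBuddy's actual schema. The severity labels are the ones that appear in the tables on this page, and the weight range comes from the usage notes.

```python
from dataclasses import dataclass

# Severity labels as they appear in the tables on this page.
SEVERITIES = ("Low", "Medium", "High", "Critical", "Warning", "Info")


@dataclass(frozen=True)
class Check:
    """Hypothetical record for one table row; not KubeBuddy's actual schema."""

    id: str
    name: str
    description: str
    severity: str
    weight: int

    def __post_init__(self) -> None:
        # Reject values that the tables on this page never use.
        if self.severity not in SEVERITIES:
            raise ValueError(f"Unknown severity: {self.severity!r}")
        if not 1 <= self.weight <= 5:
            raise ValueError(f"Weight must be between 1 and 5, got {self.weight}")


node_readiness = Check(
    id="NODE001",
    name="Node Readiness",
    description="Nodes not ready or with critical conditions.",
    severity="High",
    weight=3,
)
print(node_readiness.id, node_readiness.severity, node_readiness.weight)
```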

Performance

ID	Name	Description	Severity	Weight
PROM001	High CPU Pods (Prometheus)	Checks for pods with sustained high CPU usage over the last 24 hours using Prometheus.	Warning	3
PROM002	High Memory Usage Pods (Prometheus)	Detects pods with high memory usage over the last 24 hours based on Prometheus metrics.	Warning	3

Configuration

ID	Name	Description	Severity	Weight
CFG001	Orphaned ConfigMaps	Unused ConfigMaps that can be removed.	Medium	1
CFG002	Duplicate ConfigMap Names	Same name used in different namespaces—creates confusion.	Medium	1
CFG003	Large ConfigMaps	Oversized ConfigMaps that may affect performance.	Medium	2

Events

ID	Name	Description	Severity	Weight
EVENT001	Grouped Warning Events	Groups frequent warnings to help identify recurring issues.	Low	1
EVENT002	Full Warning Event Log	Lists all recent warning events.	Low	1

Jobs

ID	Name	Description	Severity	Weight
JOB001	Stuck Kubernetes Jobs	Jobs stuck in start or finish states due to controller issues.	High	2
JOB002	Failed Kubernetes Jobs	Jobs that failed or hit backoff limits.	High	2
JOB003	CronJob Hygiene	Flags CronJobs with risky scheduling or retention settings.	Warning	2

Networking

ID	Name	Description	Severity	Weight
NET001	Services Without Endpoints	Identifies services that have no backing endpoints.	High	2
NET002	Publicly Accessible Services	Detects services of type LoadBalancer or NodePort that may be publicly exposed.	High	4
NET003	Ingress Health Validation	Validates ingress classes, TLS secrets, and backend service references.	High	3
NET004	Namespace Missing Network Policy	Flags namespaces that do not define any NetworkPolicy.	Warning	2
NET005	Ingress Host/Path Conflicts	Detects duplicate host/path combinations across ingresses in the same namespace.	High	2
NET006	Ingress Using Wildcard Hosts	Detects ingress rules that use wildcard hosts.	Warning	2
NET007	Service TargetPort Mismatch	Detects services whose `targetPort` does not exist on backing pods.	High	2
NET008	ExternalName Service to Internal IP	Identifies `ExternalName` services that point to internal IP addresses.	Warning	2
NET009	Overly Permissive Network Policy	Identifies NetworkPolicies with empty rules or broad all-IP blocks.	High	4
NET010	Network Policy Overly Permissive IPBlock	Flags NetworkPolicies that allow `0.0.0.0/0` through `ipBlock` rules.	High	5
NET011	Network Policy Missing PolicyTypes	Detects NetworkPolicies that do not explicitly define `policyTypes`.	Low	1
NET012	Pod HostNetwork Usage	Identifies pods configured with `hostNetwork: true`.	High	2
NET013	Ingress Present Without Gateway API Adoption	Detects clusters still using Ingress without any Gateway API resources.	Warning	2
NET014	HTTPRoute Missing or Unaccepted Parent	Detects HTTPRoutes with missing `parentRefs` or no accepted parent Gateway.	High	3
NET015	Gateways Without Attached HTTPRoutes	Detects Gateway resources that have no attached HTTPRoutes.	Warning	2
NET016	Gateway API Readiness Conditions	Detects Gateway resources that are not accepted or programmed.	High	3
NET017	Gateway TLS Secret and Cross-Namespace ReferenceGrant Validation	Validates Gateway `certificateRefs` against existing Secrets and ReferenceGrants.	High	3
NET018	Duplicate Service Selectors	Detects multiple Services in the same namespace with identical selectors.	Warning	3
NET019	Services Using External IPs	Detects Services that use `spec.externalIPs`, which can bypass normal load balancer ownership and create traffic interception risk.	High	4
NET020	Ingress-NGINX Controller Detected	Detects Ingress-NGINX controller components so teams can review maintenance and migration plans.	Low	1
PROM003	High Network Receive Rate (Prometheus)	Detects pods receiving large amounts of network traffic over the last 24 hours.	Medium	2
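
To illustrate the kind of condition NET010 describes, the sketch below scans a NetworkPolicy manifest (already loaded into a Python dict) for `ipBlock` peers that allow `0.0.0.0/0`. The function name `open_ip_blocks` is hypothetical, and KubeBuddy's own detection logic may differ.

```python
def open_ip_blocks(policy: dict) -> list[str]:
    """List ingress/egress rules whose ipBlock peer allows 0.0.0.0/0."""
    findings = []
    spec = policy.get("spec", {})
    # Ingress rules list their peers under "from", egress rules under "to".
    for direction, peer_key in (("ingress", "from"), ("egress", "to")):
        for index, rule in enumerate(spec.get(direction) or []):
            for peer in rule.get(peer_key) or []:
                cidr = (peer.get("ipBlock") or {}).get("cidr")
                if cidr == "0.0.0.0/0":
                    findings.append(f"{direction}[{index}] allows {cidr}")
    return findings


# Example manifest fragment: the first ingress rule is open to all IPv4 addresses.
policy = {
    "spec": {
        "policyTypes": ["Ingress"],
        "ingress": [
            {"from": [{"ipBlock": {"cidr": "0.0.0.0/0"}}]},
            {"from": [{"ipBlock": {"cidr": "10.0.0.0/8"}}]},
        ],
    }
}
print(open_ip_blocks(policy))
```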

Nodes

ID	Name	Description	Severity	Weight
NODE001	Node Readiness	Nodes not ready or with critical conditions.	High	3
NODE002	Node Resource Pressure	High usage of CPU, memory, or disk.	High	3
NODE003	Max Pods per Node	Node pod count exceeds configured threshold.	Warning	2

Control Plane

ID	Name	Description	Severity	Weight
PROM004	API Server High Latency (Prometheus)	Detects high latency in Kubernetes API server requests over 24 hours.	High	5

Capacity

ID	Name	Description	Severity	Weight
PROM005	Overcommitted CPU (Prometheus)	Checks if CPU requests on nodes exceed allocatable capacity over the last 24 hours.	Info	2
PROM006	Node Sizing Insights (Prometheus)	Uses Prometheus p95 CPU and memory usage to identify underutilized or saturated nodes and suggest sizing actions.	Info	3
PROM007	Pod Sizing Insights (Prometheus)	Uses 7-day p95 per-container CPU/memory usage to recommend right-sized requests and memory limits. CPU limit recommendation defaults to `none`.	Info	4

Namespaces

ID	Name	Description	Severity	Weight
NS001	Empty Namespaces	No resources; can be cleaned up.	Low	1
NS002	Weak or Missing ResourceQuotas	No quotas or soft limits; risks resource overuse.	Medium	2
NS003	Missing LimitRanges	No resource caps; enables excessive use.	Medium	2

Pods

ID	Name	Description	Severity	Weight
POD001	High Restart Count	Pods restarting too often; indicates instability.	Medium	2
POD002	Long Running Pods	Pods running longer than expected.	Medium	2
POD003	Failed Pods	Pods in failed state.	High	3
POD004	Pending Pods	Pods stuck Pending—often resource/scheduling issues.	Medium	2
POD005	CrashLoopBackOff	Pods stuck restarting in CrashLoopBackOff.	High	3
POD006	Leftover Debug Pods	Debug pods not cleaned up.	Medium	2
POD007	Images Using `latest` Tag	Risk of inconsistent deployments due to floating tags.	Low	1
POD009	Unhealthy Allocated Device Resources	Detects pods whose allocated device resources report `Unhealthy` or `Unknown` status.	High	3
POD010	Naked Pods	Detects pods that are not owned by a workload controller.	Warning	2
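
As an example of what POD007 looks for, the sketch below decides whether an image reference resolves to the floating `latest` tag. It assumes that an untagged image counts as `latest` (the default a container runtime applies) and that a digest-pinned image does not. Whether KubeBuddy treats these cases the same way is not stated here, and `uses_latest_tag` is a hypothetical helper.

```python
def uses_latest_tag(image: str) -> bool:
    """Return True if the image reference resolves to the 'latest' tag."""
    if "@" in image:
        # Pinned by digest (name@sha256:...), so the tag is not floating.
        return False
    # Only look at the last path segment so a registry port is not read as a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return True  # no tag given; assumed to default to 'latest'
    return last_segment.rsplit(":", 1)[1] == "latest"


for image in ("nginx", "nginx:latest", "nginx:1.27", "registry.local:5000/app"):
    print(image, uses_latest_tag(image))
```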

RBAC

ID	Name	Description	Severity	Weight
RBAC001	Misconfigurations	Missing or incorrect role bindings.	High	3
RBAC002	Overexposed Roles	Roles with overly broad permissions.	High	3
RBAC003	Orphaned ServiceAccounts	Not in use; can be removed.	Medium	2
RBAC004	Ineffective Roles	Unused roles cluttering the system.	Medium	2
RBAC005	Kubelet Proxy RBAC Access	Bound Roles or ClusterRoles that grant broad `nodes/proxy` kubelet access.	High	4
RBAC006	Dangerous RBAC Verbs and Subresources	Detects bound Roles or ClusterRoles granting impersonation, bind/escalate, exec, port-forward, or broad secret access.	High	4

Security

ID	Name	Description	Severity	Weight
SEC001	Orphaned Secrets	Not used; safe to delete.	Medium	2
SEC002	hostPID/hostNetwork Usage	Shared host namespaces increase risk.	High	3
SEC003	Pods Running as Root	Containers should avoid root for security.	High	3
SEC004	Privileged Containers	Grants unnecessary access.	High	3
SEC005	hostIPC Usage	Sharing IPC namespace with host is a security risk.	Medium	2
SEC006	Pods Missing Secure Defaults	Missing recommended `securityContext` fields (e.g. `runAsNonRoot`).	Medium	3
SEC007	Missing Pod Security Admission Labels	Namespaces lacking `pod-security.kubernetes.io/enforce` labels.	Low	1
SEC008	Secrets in Environment Variables	Exposed via `env.valueFrom.secretKeyRef`; can leak via logs or `/proc`.	High	4
SEC009	Missing Capabilities Drop	Containers not dropping all capabilities.	Medium	3
SEC010	HostPath Volume Usage	Use of `hostPath` volumes exposes host filesystem.	High	3
SEC011	Containers Running as UID 0	Containers that define a `securityContext` but explicitly set `runAsUser: 0`.	High	3
SEC012	Added Linux Capabilities	Use of extra Linux capabilities via `securityContext.capabilities.add`.	Medium	2
SEC013	EmptyDir Volume Usage	`emptyDir` volumes are non-persistent and cleared on restart.	Low	1
SEC014	Untrusted Image Registries	Pulling from unapproved registries.	High	3
SEC015	Pods Using Default ServiceAccount	Flags pods using the default service account, which may have broad permissions.	Warning	3
SEC016	Unconfined Seccomp Profiles	Detects pod or container seccomp profiles explicitly set to `Unconfined`.	High	3
SEC017	Non-Default ProcMount	Detects containers that set a `procMount` value other than `Default`.	High	3
SEC018	Automounting API Credentials Enabled in ServiceAccounts	Flags ServiceAccounts where API token automounting is enabled.	Warning	3
SEC019	Unsupported AppArmor Values	Detects unsupported AppArmor annotations or structured profile types.	High	2
SEC020	Seccomp Profile Not Configured	Detects pods and containers without an explicit seccomp profile.	Warning	2
SEC021	Host Ports in Pod Specs	Detects containers that bind host ports directly on the node.	Critical	4
SEC022	Non-Existent Secret References	Flags pods referencing Secrets that do not exist.	Critical	4
SEC023	Disallowed Sysctls	Detects sysctls outside the Kubernetes baseline allowlist.	Critical	3
SEC024	ValidatingAdmissionPolicy Ignore Failure Policy	Flags policies where `failurePolicy: Ignore` silently allows requests when CEL evaluation fails.	High	4
SEC025	ValidatingAdmissionPolicy With No Binding	Detects `ValidatingAdmissionPolicy` resources with no associated binding, meaning the policy is never enforced.	Warning	3
SEC026	ValidatingAdmissionPolicy With No Validation Rules	Detects `ValidatingAdmissionPolicy` resources with an empty `spec.validations` list — the policy enforces nothing.	Medium	2
SEC027	GitRepo Volume Usage	Detects pods that use the legacy `gitRepo` volume source.	High	3
SEC028	Image Pull Secrets in Use	Flags Pods or ServiceAccounts that reference imagePullSecrets for credential rotation review.	Low	1
SEC029	Sensitive HostPath Mounts	Detects hostPath volumes that mount container runtime sockets or sensitive host filesystem paths.	Critical	5
SEC030	Admission Webhook Fail-Open or Broad Scope	Flags admission webhooks that fail open, omit sideEffects, or apply broadly without namespace scoping.	High	4
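
Several of these checks inspect the container-level `securityContext`. The sketch below shows how three of them (SEC004, SEC009, SEC011) could be evaluated against a Pod manifest loaded as a dict. It is a simplified illustration: `container_findings` is a hypothetical helper, it ignores pod-level `securityContext` settings, and KubeBuddy's real rules may be stricter or looser.

```python
def container_findings(pod: dict) -> list[tuple[str, str]]:
    """Return (check ID, container name) pairs for simple securityContext issues."""
    findings = []
    spec = pod.get("spec", {})
    containers = (spec.get("initContainers") or []) + (spec.get("containers") or [])
    for container in containers:
        context = container.get("securityContext") or {}
        if context.get("privileged") is True:
            findings.append(("SEC004", container["name"]))
        if context.get("runAsUser") == 0:
            findings.append(("SEC011", container["name"]))
        dropped = (context.get("capabilities") or {}).get("drop") or []
        if "ALL" not in dropped:
            findings.append(("SEC009", container["name"]))
    return findings


pod = {
    "spec": {
        "containers": [
            {"name": "app", "securityContext": {"privileged": True, "runAsUser": 0}},
            {
                "name": "sidecar",
                "securityContext": {
                    "runAsUser": 1000,
                    "capabilities": {"drop": ["ALL"]},
                },
            },
        ]
    }
}
print(container_findings(pod))
```

In this example only the `app` container is reported; `sidecar` drops all capabilities and runs as a non-root UID, so it produces no findings.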

Storage

ID	Name	Description	Severity	Weight
PV001	Orphaned Persistent Volumes	Detects Persistent Volumes that are not bound to any Persistent Volume Claim.	Warning	3
PVC001	Unused Persistent Volume Claims	Detects PVCs not attached to any pod.	Warning	2
PVC002	PVCs Using Default StorageClass	Detects PVCs that do not explicitly specify a storageClassName.	Low	1
PVC003	ReadWriteMany PVCs on Incompatible Storage	Detects PVCs requesting ReadWriteMany access mode where the underlying storage is typically block-based and does not support concurrent writes from multiple nodes.	High	5
PVC004	Unbound Persistent Volume Claims	Detects Persistent Volume Claims that are in a Pending phase and have not been bound to a Persistent Volume.	High	3
PVC005	PVC Expansion Failures	Detects PersistentVolumeClaims with failed volume expansion status or resize failure events.	High	3
SC001	Deprecated StorageClass Provisioners	Detects StorageClasses using deprecated or legacy in-tree provisioners, which should be migrated to CSI drivers.	High	4
SC004	StorageClass Prevents Volume Expansion	Identifies StorageClasses that do not permit volume expansion, which can limit dynamic scaling of stateful applications.	Medium	2
SC003	High Cluster Storage Usage	Monitors the overall percentage of used storage across the cluster.	Warning	4
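
Two of the PVC checks depend only on fields of the claim itself. A minimal sketch, assuming the PersistentVolumeClaim manifest is available as a dict (`pvc_findings` is a hypothetical name, and the real checks may consider more context):

```python
def pvc_findings(pvc: dict) -> list[str]:
    """Return the IDs of simple PVC checks that this claim would trip."""
    findings = []
    # PVC002: the claim does not set storageClassName at all.
    if pvc.get("spec", {}).get("storageClassName") is None:
        findings.append("PVC002")
    # PVC004: the claim is still Pending, so it has not been bound to a volume.
    if pvc.get("status", {}).get("phase") == "Pending":
        findings.append("PVC004")
    return findings


pvc = {
    "spec": {"accessModes": ["ReadWriteOnce"]},
    "status": {"phase": "Pending"},
}
print(pvc_findings(pvc))
```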

Workloads

ID	Name	Description	Severity	Weight
WRK001	DaemonSets Not Fully Running	Some pods unscheduled or not ready.	High	2
WRK002	Deployment Missing Replicas	Fewer replicas than specified.	High	2
WRK003	Incomplete StatefulSet Rollout	Rollout not finished; may cause issues.	Medium	2
WRK004	HPA Misconfig or Inactivity	HPAs that are inactive or that target a resource that does not exist.	Medium	2
WRK005	Missing Resource Requests	Missing CPU or memory requests on one or more containers.	High	3
WRK006	PodDisruptionBudget Coverage	Missing or misconfigured PDBs.	Medium	2
WRK007	Missing Health Probes	No liveness or readiness probes; risks silent failures.	Medium	2
WRK008	Deployment Selector Without Matching Pods	Selectors that don’t match any pods, leading to 0 replicas.	Medium	2
WRK009	Deployment, Pod, and Service Label Consistency	Mismatched labels between Deployments, Pods, or Services; breaks routing.	Medium	3
WRK010	HPA Metrics Without Matching Resource Requests	HPAs scale on CPU/memory while target containers miss matching requests.	Warning	3
WRK011	VPA Update Mode and Declarative Resource Conflict Risk	Flags VPA Auto/Recreate targets likely to conflict with declarative ownership or HPA.	Warning	2
WRK012	PodDisruptionBudget Adequacy for Replicated Workloads	Detects missing/overly strict/overly permissive PDB settings on 2+ replica workloads.	Warning	2
WRK013	CrashLoopBackOff and OOMKilled Guardrail	Highlights unstable pods to guard against unsafe right-sizing decisions.	High	3
WRK014	Missing Memory Limits	Detects workloads whose containers do not define a memory limit.	Warning	2
WRK015	Replicated Workloads Missing Spread Constraints	Detects Deployments or StatefulSets with 2+ replicas that define neither anti-affinity nor topology spread constraints.	Warning	2
WRK016	Missing Recommended Application Labels	Detects workloads that do not use the recommended `app.kubernetes.io` label set.	Low	1
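
The resource-related checks read the `resources` block of each container in the workload's pod template. The sketch below approximates WRK005 and WRK014 for a Deployment-style manifest loaded as a dict. `resource_findings` is a hypothetical helper and only covers regular containers, not init containers.

```python
def resource_findings(workload: dict) -> list[tuple[str, str]]:
    """Return (check ID, container name) pairs for missing requests or limits."""
    pod_spec = workload.get("spec", {}).get("template", {}).get("spec", {})
    findings = []
    for container in pod_spec.get("containers") or []:
        resources = container.get("resources") or {}
        requests = resources.get("requests") or {}
        limits = resources.get("limits") or {}
        # WRK005: a CPU or memory request is missing.
        if "cpu" not in requests or "memory" not in requests:
            findings.append(("WRK005", container["name"]))
        # WRK014: no memory limit is defined.
        if "memory" not in limits:
            findings.append(("WRK014", container["name"]))
    return findings


deployment = {
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "api",
                        "resources": {
                            "requests": {"cpu": "250m", "memory": "256Mi"},
                            "limits": {"memory": "512Mi"},
                        },
                    },
                    {"name": "worker", "resources": {"requests": {"cpu": "100m"}}},
                ]
            }
        }
    }
}
print(resource_findings(deployment))
```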

Usage Notes

Severity
  Low: Cosmetic or cleanup
  Medium: May affect performance or reliability
  High: Can cause downtime or poses a security risk
  Warning/Info: Advisory thresholds

Weight
  Weights range from 1 (low impact) to 5 (high impact)
  A higher weight has a greater effect on the cluster health score

Interpreting Reports
  Passed: no items listed
  Failed: lists affected resources and suggested fixes
  Click IDs to jump to detailed recommendations in the report
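
To see why weight matters, consider a simple weighted pass ratio. This is an assumption made purely for illustration: the page does not describe KubeBuddy's actual scoring formula, and `health_score` is a hypothetical function.

```python
def health_score(results: dict[str, tuple[int, bool]]) -> float:
    """Weighted pass ratio as a percentage.

    `results` maps a check ID to (weight, passed). This formula is an
    illustrative assumption, not KubeBuddy's documented calculation.
    """
    total = sum(weight for weight, _ in results.values())
    if total == 0:
        return 100.0
    passed = sum(weight for weight, ok in results.values() if ok)
    return round(100.0 * passed / total, 1)


# Weights taken from the tables on this page; pass/fail values are made up.
results = {
    "NODE001": (3, True),
    "NET010": (5, False),
    "CFG001": (1, True),
    "SEC029": (5, True),
}
print(health_score(results))  # 64.3
```

Under this assumed formula, failing NET010 (weight 5) removes five of the fourteen available points, whereas failing CFG001 (weight 1) would remove only one.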

AKS Automatic Readiness Metadata

Some shared checks also carry AKS Automatic-specific metadata, which is used to derive the AKS Automatic migration readiness view in reports.

AutomaticRelevance
  blocker
  warning

AutomaticScope
  workload
  cluster
  platform

AutomaticReason
  used to group remediation actions in the AKS Automatic action plan

AutomaticAdmissionBehavior
  denies_on_enforce
  warns_only
  mutates_on_enforce

AutomaticMutationOutcome
  optional observed/documented AKS Automatic behavior note shown in HTML, text, and JSON outputs

This metadata does not create a separate check engine. It allows KubeBuddy to derive AKS Automatic readiness from the same shared checks used elsewhere in the product.

In the AKS Automatic standalone action plan, this metadata is also used to:

split actions into blocker-driven and warning-driven migration sections
build per-action migration cards instead of a single wide table
group structured affected resources into namespace/workload/resource/Helm tables
attach manifest examples for common remediation themes
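
The enumerated metadata values and the blocker/warning split can be sketched as follows. The field names and allowed values come from this page. The dict layout, the `ID` key, the example check IDs, and both helper functions are hypothetical.

```python
# Allowed values as listed on this page. AutomaticReason and
# AutomaticMutationOutcome have no fixed value list here, so they are skipped.
ALLOWED_VALUES = {
    "AutomaticRelevance": {"blocker", "warning"},
    "AutomaticScope": {"workload", "cluster", "platform"},
    "AutomaticAdmissionBehavior": {
        "denies_on_enforce",
        "warns_only",
        "mutates_on_enforce",
    },
}


def invalid_fields(metadata: dict) -> list[str]:
    """Return the names of fields whose value is not in the documented set."""
    return [
        field
        for field, allowed in ALLOWED_VALUES.items()
        if field in metadata and metadata[field] not in allowed
    ]


def split_by_relevance(checks: list[dict]) -> dict[str, list[str]]:
    """Group check IDs into blocker-driven and warning-driven sections."""
    sections: dict[str, list[str]] = {"blocker": [], "warning": []}
    for check in checks:
        relevance = check.get("AutomaticRelevance")
        if relevance in sections:
            sections[relevance].append(check["ID"])
    return sections


# Hypothetical checks; the IDs and metadata values are made up for illustration.
checks = [
    {"ID": "EXAMPLE001", "AutomaticRelevance": "blocker",
     "AutomaticScope": "workload", "AutomaticAdmissionBehavior": "denies_on_enforce"},
    {"ID": "EXAMPLE002", "AutomaticRelevance": "warning",
     "AutomaticScope": "cluster", "AutomaticAdmissionBehavior": "warns_only"},
    {"ID": "EXAMPLE003"},  # no AKS Automatic metadata, so it is left out
]
print(split_by_relevance(checks))
```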