compliance-scan
ugallu-compliance-scan evaluates compliance baselines (CIS
Kubernetes Benchmark via kube-bench,
runtime policy via Falco, or custom CEL rules)
and produces a uniform ComplianceScanResult CR plus a
SecurityEvent when the scan crosses a configured threshold.
Backends
| Backend | Mechanism |
|---|---|
| kube-bench | The controller creates a privileged Job in --job-namespace (default ugallu-system-privileged) using --kube-bench-image, tails the pod log, and parses the report. |
| falco | Opens a gRPC connection to Falco’s outputs service (mTLS, certs from a Secret) and pulls a window of events. |
| cel | Evaluates user-provided CEL expressions against live Pod / Namespace state through the controller-runtime client. |
A backend that fails to start (missing image, Falco endpoint
unreachable) does not crash the operator - the run is recorded
with Phase=Failed and a structured reason, so dashboards see the
gap rather than a silent skip.
Threshold model
Each finding carries severity ∈ {info, low, medium, high, critical}.
The result’s worstSeverity is the maximum across findings; the SecurityEvent
class flips from Compliance to Detection when worstSeverity ≥ spec.alertOn (default high). That decoupling lets you keep
informational scans noise-free without losing the audit trail.
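The threshold rule can be sketched in a few lines of Go. This is a minimal illustration, not the controller's code: the names severityRank, worstSeverity and eventClass are hypothetical, and returning info for an empty finding list is an assumption.

```go
package main

import "fmt"

// severityRank orders the five severities named in the text, lowest first.
// An unknown severity string ranks as 0 (same as info) in this sketch.
var severityRank = map[string]int{
	"info": 0, "low": 1, "medium": 2, "high": 3, "critical": 4,
}

// worstSeverity returns the highest-ranked severity among the findings.
// Returning "info" for an empty list is an assumption of this sketch.
func worstSeverity(findings []string) string {
	worst := "info"
	for _, s := range findings {
		if severityRank[s] > severityRank[worst] {
			worst = s
		}
	}
	return worst
}

// eventClass applies the documented rule: Detection when
// worstSeverity >= alertOn, otherwise Compliance.
func eventClass(worst, alertOn string) string {
	if severityRank[worst] >= severityRank[alertOn] {
		return "Detection"
	}
	return "Compliance"
}

func main() {
	findings := []string{"low", "medium", "info"}
	worst := worstSeverity(findings)
	fmt.Println(worst, eventClass(worst, "high")) // medium Compliance

	findings = append(findings, "critical")
	worst = worstSeverity(findings)
	fmt.Println(worst, eventClass(worst, "high")) // critical Detection
}
```

With alertOn left at high, a scan whose worst finding is medium stays in the Compliance class, while a single critical finding is enough to flip it to Detection.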
Example
    apiVersion: security.ugallu.io/v1alpha1
    kind: ComplianceScanRun
    metadata: { name: cis-control-plane, namespace: ugallu-system }
    spec:
      backend: kube-bench
      benchmark: cis-1.9
      alertOn: high
      targetNodeSelector:
        kubernetes.io/role: master
      timeout: 5m
Internals
State machine
ComplianceScanRun.status.phase: "" (Pending) -> Running ->
Succeeded | Failed. No finalizer. Spec is immutable.
ComplianceScanResult is created by the controller once the run
reaches a terminal phase, with status.worstSeverity set to the
highest severity among the checks.
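The phase transitions described above can be written down as a small guard function. This is a sketch under assumptions: the Phase type and the helper names are hypothetical, and only the transitions stated in the text are allowed.

```go
package main

import "fmt"

// Phase mirrors ComplianceScanRun.status.phase as described in the text.
type Phase string

const (
	PhasePending   Phase = "" // an empty phase is treated as Pending
	PhaseRunning   Phase = "Running"
	PhaseSucceeded Phase = "Succeeded"
	PhaseFailed    Phase = "Failed"
)

// isTerminal reports whether a run in phase p needs no further work.
func isTerminal(p Phase) bool {
	return p == PhaseSucceeded || p == PhaseFailed
}

// canTransition allows Pending -> Running and Running -> Succeeded | Failed.
// Terminal phases never transition again.
func canTransition(from, to Phase) bool {
	switch from {
	case PhasePending:
		return to == PhaseRunning
	case PhaseRunning:
		return isTerminal(to)
	default:
		return false
	}
}

func main() {
	fmt.Println(canTransition(PhasePending, PhaseRunning))   // true
	fmt.Println(canTransition(PhaseRunning, PhaseSucceeded)) // true
	fmt.Println(canTransition(PhaseSucceeded, PhaseRunning)) // false
	fmt.Println(canTransition(PhasePending, PhaseSucceeded)) // false
}
```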
Reconcile loop
    on each ComplianceScanRun event:
      run := Get(req)
      if run.Status.Phase in {Succeeded, Failed}: return
      if run.Status.Phase == "":
        patch Status.Phase=Running, StartTime=now
        emitSE(ComplianceScanStarted, info)
        return Requeue
      scanner := scannerFor(run.Spec.Backend)   # kube-bench | falco | cel-custom
      checks, err := scanner.Scan(ctx, run.Spec.Profile)
      decorate checks with run.Spec.ControlMappings   # SOC2 / ISO27001
      upsert ComplianceScanResult (status subresource carries worstSeverity)
      patch Status.Phase = (success ? Succeeded : Failed)
      emitSE(ComplianceScanCompleted | ComplianceScanFailed)
Error recovery
Restart with Phase=Running: the controller re-instantiates the backend
and re-runs Scan(). For the kube-bench backend, the privileged Job is keyed
by run name; a Create returning AlreadyExists is taken as a
recovery hint and the controller polls the existing Job’s status
instead of starting a new one. The Result CR is safe to re-create
(same name + namespace) - the Status update overwrites
worstSeverity.
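The AlreadyExists-as-recovery-hint pattern can be illustrated with an in-memory stand-in for the Job API. Everything here is hypothetical scaffolding (memoryJobs, errAlreadyExists, ensureJob); a real controller would more likely check the error with apierrors.IsAlreadyExists from k8s.io/apimachinery, and using the run name directly as the Job name is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyExists stands in for the Kubernetes AlreadyExists API error.
var errAlreadyExists = errors.New("job already exists")

// memoryJobs is a toy Job store: job name -> status.
type memoryJobs struct {
	jobs map[string]string
}

// Create registers a new Job or reports that one with this name exists.
func (m *memoryJobs) Create(name string) error {
	if _, ok := m.jobs[name]; ok {
		return errAlreadyExists
	}
	m.jobs[name] = "Running"
	return nil
}

// ensureJob creates the Job keyed by run name. An AlreadyExists error is
// not a failure: it means a previous controller instance started the Job,
// so the caller should adopt it and poll its status.
func ensureJob(store *memoryJobs, runName string) (adopted bool, err error) {
	err = store.Create(runName)
	switch {
	case err == nil:
		return false, nil
	case errors.Is(err, errAlreadyExists):
		return true, nil
	default:
		return false, err
	}
}

func main() {
	store := &memoryJobs{jobs: map[string]string{}}
	first, _ := ensureJob(store, "cis-control-plane")  // fresh start
	second, _ := ensureJob(store, "cis-control-plane") // after a restart
	fmt.Println(first, second)                         // false true
}
```

Because the Job name is derived deterministically from the run, the second call cannot start a duplicate scan; it can only discover the one already in flight.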
Crash recovery scenario
Pod killed while the kube-bench Job is running: the new pod
re-Gets the run (Phase=Running), recognises that the kube-bench Job
already exists, polls until it is terminal, parses the report, and writes
the Result. The Job is deleted on a best-effort basis at terminal phase.
Edge cases
- Falco endpoint missing. Backend degrades gracefully: a stub finding with code=falco-endpoint-unreachable and severity=info is added; run still completes Succeeded.
- Privileged Job namespace. Defaults to ugallu-system-privileged; configurable via --job-namespace.
- Timeout. A ValidatingAdmissionPolicy caps spec.timeout at 30m to keep runaway Jobs out of the cluster.
- Control mappings. A single scan can populate multiple framework reports (SOC2 + ISO27001) without re-running.
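The timeout cap is enforced in-cluster by the ValidatingAdmissionPolicy, but the same bound is easy to mirror in a client-side pre-check. The sketch below is an assumption-laden illustration: validateTimeout and maxTimeout are hypothetical names, and rejecting zero or negative durations is an extra guard not stated in the text.

```go
package main

import (
	"fmt"
	"time"
)

// maxTimeout is the 30m cap described for spec.timeout.
const maxTimeout = 30 * time.Minute

// validateTimeout parses a spec.timeout value such as "5m" and checks it
// against the cap. Rejecting non-positive values is an assumption.
func validateTimeout(raw string) (time.Duration, error) {
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("spec.timeout %q is not a duration: %w", raw, err)
	}
	if d <= 0 || d > maxTimeout {
		return 0, fmt.Errorf("spec.timeout %s must be in (0, %s]", d, maxTimeout)
	}
	return d, nil
}

func main() {
	for _, raw := range []string{"5m", "30m", "45m"} {
		_, err := validateTimeout(raw)
		fmt.Println(raw, err == nil)
	}
	// Prints:
	// 5m true
	// 30m true
	// 45m false
}
```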
Full RBAC (ClusterRole)
    rules:
      - apiGroups: [security.ugallu.io]
        resources: [compliancescanruns, compliancescanruns/status, compliancescanresults, compliancescanresults/status]
        verbs: [get, list, watch, create, update, patch]
      - apiGroups: [security.ugallu.io]
        resources: [securityevents]
        verbs: [create]
      - apiGroups: [batch]
        resources: [jobs]
        verbs: [get, list, watch, create, delete]
      - apiGroups: [""]
        resources: [pods, pods/log, namespaces]
        verbs: [get, list, watch]
      - apiGroups: [""]
        resources: [events]
        verbs: [create, patch]
      - apiGroups: [coordination.k8s.io]
        resources: [leases]
        verbs: [get, list, watch, create, update, patch, delete]
CRDs owned
- ComplianceScanRun: one CR per scan, immutable spec.
- ComplianceScanResult: one per run; status holds worstSeverity and per-check breakdown.
Key flags
--cluster-id, --cluster-name, --job-namespace,
--kube-bench-image, --falco-host, --falco-port,
--falco-cert-file / --falco-key-file / --falco-ca-root-file.
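For orientation, the flag set could be declared with Go's standard flag package roughly as follows. This is a sketch, not the operator's actual wiring: the options struct, the flag types (for example --falco-port as an int) and the help strings are assumptions, and the only default taken from this page is ugallu-system-privileged for --job-namespace.

```go
package main

import (
	"flag"
	"fmt"
)

// options collects the documented flags. Field types are assumptions.
type options struct {
	ClusterID, ClusterName       string
	JobNamespace, KubeBenchImage string
	FalcoHost                    string
	FalcoPort                    int
	FalcoCertFile, FalcoKeyFile  string
	FalcoCARootFile              string
}

// parseOptions parses args into options. Only --job-namespace has a
// documented default; every other default here is a placeholder.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("ugallu-compliance-scan", flag.ContinueOnError)
	fs.StringVar(&o.ClusterID, "cluster-id", "", "cluster identifier")
	fs.StringVar(&o.ClusterName, "cluster-name", "", "cluster display name")
	fs.StringVar(&o.JobNamespace, "job-namespace", "ugallu-system-privileged",
		"namespace for privileged kube-bench Jobs")
	fs.StringVar(&o.KubeBenchImage, "kube-bench-image", "", "kube-bench image")
	fs.StringVar(&o.FalcoHost, "falco-host", "", "Falco gRPC host")
	fs.IntVar(&o.FalcoPort, "falco-port", 0, "Falco gRPC port")
	fs.StringVar(&o.FalcoCertFile, "falco-cert-file", "", "client certificate")
	fs.StringVar(&o.FalcoKeyFile, "falco-key-file", "", "client key")
	fs.StringVar(&o.FalcoCARootFile, "falco-ca-root-file", "", "CA bundle")
	err := fs.Parse(args)
	return o, err
}

func main() {
	// Sample values only; they are not taken from the documentation.
	o, err := parseOptions([]string{"--cluster-id=demo", "--falco-port=5060"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(o.ClusterID, o.JobNamespace, o.FalcoPort)
	// demo ugallu-system-privileged 5060
}
```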
Deployment
Deployment (1 replica) in ugallu-system, leader election on. Falco
mTLS certs are mounted from falco-client-certs; kube-bench Jobs
are created in the privileged workload namespace because they need
hostPath into /etc, /var/lib/etcd, and the kubelet config.
Telemetry
ugallu_compliance_scan_runs_total{backend,outcome},
ugallu_compliance_scan_findings_total{benchmark,severity},
ugallu_compliance_scan_duration_seconds{backend}.