audit-detection

ugallu-audit-detection consumes the Kubernetes apiserver audit log, evaluates user-supplied SigmaRule CRs against each event, and emits SecurityEvent CRs of class Detection (or Audit for low-severity matches) through the emitter SDK. No sidecars - the detector runs inside the operator process.

Two interchangeable backends, picked at startup with --source=:

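| Source  | Workload   | Audit-log path                                          |
|---------|------------|---------------------------------------------------------|
| file    | DaemonSet  | /host/var/log/audit/audit.log (kubelet hostPath mount)  |
| webhook | Deployment | apiserver --audit-webhook-config-file HTTPS POST        |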

The webhook source authenticates the apiserver with a bearer token (AUDIT_WEBHOOK_TOKEN env, or any name passed via --webhook-secret-env) and optionally enforces mTLS (--webhook-client-ca). Both sources emit onto the same engine channel - the rule surface is source-agnostic.
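
The bearer-token check described above can be sketched as follows. This is a minimal illustration, not the detector's actual handler: the names bearerOK and requireBearer are hypothetical, and the sketch assumes the token has already been read from the environment (for example from AUDIT_WEBHOOK_TOKEN). Comparison is constant-time so the check does not leak how many leading bytes matched.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"strings"
)

// bearerOK reports whether the Authorization header carries exactly the
// expected token. An empty expected token never authenticates.
// (Hypothetical helper, not part of the documented API.)
func bearerOK(authHeader, want string) bool {
	const prefix = "Bearer "
	if want == "" || !strings.HasPrefix(authHeader, prefix) {
		return false
	}
	got := strings.TrimPrefix(authHeader, prefix)
	return subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1
}

// requireBearer wraps next and answers 401 when the token check fails.
func requireBearer(token string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !bearerOK(r.Header.Get("Authorization"), token) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	fmt.Println(bearerOK("Bearer s3cret", "s3cret")) // true
	fmt.Println(bearerOK("Bearer wrong", "s3cret"))  // false
}
```

A failed check is where a counter such as ugallu_audit_webhook_auth_failures_total would plausibly be incremented; the sketch leaves metrics out.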

The engine implements a deliberately small subset of the Sigma matching language - the parts that matter against K8s audit events:

  • objectRef filters: apiGroup, apiVersion, resource, subresource, namespace, name
  • verb / stage set membership
  • userGlob, nameGlob, namespaceGlob glob lists
  • requestObjectGlob: JSONPath into request.requestObject plus a glob list (supports $.x.y[*].z-style wildcard array steps)
  • compositional anyOf / not (single-level - deliberate, keeps the generated OpenAPI schema finite)
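
To make the requestObjectGlob predicate concrete, here is a small sketch of a restricted JSONPath walk followed by a glob test. It is an assumption-laden illustration rather than the engine's code: resolve and matchAny are hypothetical names, only dotted keys and [*] array steps are handled, and globbing is delegated to the standard library's path.Match, whose * does not cross "/" (the real engine's glob semantics may differ).

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// resolve walks a restricted JSONPath such as "$.a.b[*].c" over decoded JSON
// and returns every string leaf it reaches. Unsupported shapes yield nothing.
func resolve(doc any, jsonPath string) []string {
	nodes := []any{doc}
	for _, step := range strings.Split(strings.TrimPrefix(jsonPath, "$."), ".") {
		wildcard := strings.HasSuffix(step, "[*]")
		key := strings.TrimSuffix(step, "[*]")
		var next []any
		for _, n := range nodes {
			obj, ok := n.(map[string]any)
			if !ok {
				continue
			}
			child, ok := obj[key]
			if !ok {
				continue
			}
			if wildcard {
				// Fan out over every array element.
				if arr, ok := child.([]any); ok {
					next = append(next, arr...)
				}
				continue
			}
			next = append(next, child)
		}
		nodes = next
	}
	var out []string
	for _, n := range nodes {
		if s, ok := n.(string); ok {
			out = append(out, s)
		}
	}
	return out
}

// matchAny reports whether any value matches any glob pattern.
func matchAny(values, globs []string) bool {
	for _, v := range values {
		for _, g := range globs {
			if ok, err := path.Match(g, v); err == nil && ok {
				return true
			}
		}
	}
	return false
}

func main() {
	req := map[string]any{"spec": map[string]any{"containers": []any{
		map[string]any{"image": "nginx:1.27"},
		map[string]any{"image": "busybox:latest"},
	}}}
	images := resolve(req, "$.spec.containers[*].image")
	fmt.Println(images)                                 // [nginx:1.27 busybox:latest]
	fmt.Println(matchAny(images, []string{"*:latest"})) // true
}
```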

No event aggregation, no time windows. Per-rule rate limiting uses a token bucket (burst + sustainedPerSec). Matches that exceed the budget are dropped and counted in Status.DroppedRateLimit.
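
The burst + sustainedPerSec budget behaves like a classic token bucket. The sketch below shows the arithmetic under stated assumptions: bucket, newBucket, allow and simulate are hypothetical names, the bucket is not goroutine-safe, and the clock is passed in explicitly so the behaviour is reproducible.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// bucket is a minimal token bucket: it holds at most burst tokens and
// refills at perSec tokens per second.
type bucket struct {
	burst  float64
	perSec float64
	tokens float64
	last   time.Time
}

func newBucket(burst int, sustainedPerSec float64, now time.Time) *bucket {
	return &bucket{burst: float64(burst), perSec: sustainedPerSec, tokens: float64(burst), last: now}
}

// allow refills according to elapsed time, then spends one token if available.
func (b *bucket) allow(now time.Time) bool {
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens = math.Min(b.burst, b.tokens+elapsed*b.perSec)
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// simulate feeds one match per offset (seconds since start) through a fresh
// bucket and counts how many were emitted and how many were dropped.
func simulate(burst int, sustainedPerSec float64, offsets []float64) (allowed, dropped int) {
	start := time.Unix(0, 0)
	b := newBucket(burst, sustainedPerSec, start)
	for _, off := range offsets {
		if b.allow(start.Add(time.Duration(off * float64(time.Second)))) {
			allowed++
		} else {
			dropped++
		}
	}
	return allowed, dropped
}

func main() {
	// Eight matches at t=0 and one at t=1s against burst=5, sustainedPerSec=1.
	allowed, dropped := simulate(5, 1, []float64{0, 0, 0, 0, 0, 0, 0, 0, 1})
	fmt.Println(allowed, dropped) // 6 3
}
```

With the budget from a rule that sets burst: 5 and sustainedPerSec: 1, the three drops in this run are what would be added to Status.DroppedRateLimit.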

The reconciler hot-swaps the in-memory rule set on every CR write - counters (MatchCount, DroppedRateLimit, LastMatchedAt) survive a re-compile so an edit doesn’t erase history; only the limiter is rebuilt with the new budget. Compile errors land on Status.ParseError and disable the rule.
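
One way to picture the hot-swap is an in-memory map guarded by a mutex, where installing a new compile copies the old counters forward and takes the limiter budget from the new spec. The types below (entry, ruleSet) and the simplified predicate signature are assumptions made for this sketch, not the operator's real data structures.

```go
package main

import (
	"fmt"
	"sync"
)

// entry is one compiled rule. The predicate is simplified to a verb test.
type entry struct {
	match      func(verb string) bool
	burst      int   // limiter budget, always taken from the new spec
	matchCount int64 // carried over across recompiles
	dropped    int64 // carried over across recompiles
}

type ruleSet struct {
	mu    sync.Mutex
	rules map[string]*entry
}

func newRuleSet() *ruleSet { return &ruleSet{rules: map[string]*entry{}} }

// install replaces the compiled rule under name, keeping prior counters.
func (rs *ruleSet) install(name string, match func(string) bool, burst int) {
	rs.mu.Lock()
	defer rs.mu.Unlock()
	next := &entry{match: match, burst: burst}
	if prev, ok := rs.rules[name]; ok {
		next.matchCount, next.dropped = prev.matchCount, prev.dropped
	}
	rs.rules[name] = next
}

// remove drops the rule, for example after a compile error or a delete.
func (rs *ruleSet) remove(name string) {
	rs.mu.Lock()
	defer rs.mu.Unlock()
	delete(rs.rules, name)
}

// evaluate runs the predicate and bumps matchCount on a hit.
func (rs *ruleSet) evaluate(name, verb string) bool {
	rs.mu.Lock()
	defer rs.mu.Unlock()
	e, ok := rs.rules[name]
	if !ok || !e.match(verb) {
		return false
	}
	e.matchCount++
	return true
}

// count returns the current matchCount, or 0 for an unknown rule.
func (rs *ruleSet) count(name string) int64 {
	rs.mu.Lock()
	defer rs.mu.Unlock()
	if e, ok := rs.rules[name]; ok {
		return e.matchCount
	}
	return 0
}

func main() {
	rs := newRuleSet()
	rs.install("r", func(v string) bool { return v == "create" }, 5)
	rs.evaluate("r", "create")
	rs.evaluate("r", "create")
	// Edit the rule: new predicate, new budget, same history.
	rs.install("r", func(v string) bool { return v == "delete" }, 10)
	fmt.Println(rs.count("r")) // 2
}
```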

The engine maps objectRef.resource onto SecurityEvent.subject.kind via a small allowlist (pods, nodes, secrets, clusterrolebindings, …). Unknown resources fall back to External so the SE remains valid against the SubjectKind enum. The SE name is derived deterministically from (auditID, type, subjectUID), so an apiserver replay produces the same SE name and the repeated Create returns AlreadyExists, which makes emission idempotent.
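
Both mappings can be illustrated in a few lines. Everything specific here is an assumption for the sake of the example: the kind strings other than External, the "se-" prefix, and the choice of SHA-256 truncated to 32 hex characters are not taken from the project; only the inputs (auditID, type, subjectUID) and the External fallback come from the description above.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// kindByResource is a hypothetical allowlist; the real one is longer and its
// kind values may be spelled differently.
var kindByResource = map[string]string{
	"pods":                "Pod",
	"nodes":               "Node",
	"secrets":             "Secret",
	"clusterrolebindings": "ClusterRoleBinding",
}

// subjectKind maps an audit objectRef.resource to a subject kind and falls
// back to External for anything not on the allowlist.
func subjectKind(resource string) string {
	if k, ok := kindByResource[resource]; ok {
		return k
	}
	return "External"
}

// eventName derives a stable object name from the three identifying fields.
// The same inputs always yield the same name, so a replayed audit event
// collides with the SecurityEvent that already exists.
func eventName(auditID, eventType, subjectUID string) string {
	h := sha256.New()
	for _, part := range []string{auditID, eventType, subjectUID} {
		h.Write([]byte(part))
		h.Write([]byte{0}) // separator, so ("ab","c") and ("a","bc") differ
	}
	return "se-" + hex.EncodeToString(h.Sum(nil))[:32]
}

func main() {
	a := eventName("audit-1", "ClusterAdminGranted", "uid-1")
	b := eventName("audit-1", "ClusterAdminGranted", "uid-1")
	fmt.Println(a == b)                 // true
	fmt.Println(subjectKind("pods"))    // Pod
	fmt.Println(subjectKind("widgets")) // External
}
```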

Every emit stamps the configured cluster identity (--cluster-id / chart clusterIdentity.clusterID) so downstream consumers (attestor, forensics) can partition WORM keys by cluster.

apiVersion: security.ugallu.io/v1alpha1
kind: SigmaRule
metadata:
  name: cluster-admin-granted
spec:
  description: Detect creation of a wildcard ClusterRoleBinding to cluster-admin.
  match:
    verb: [create, update]
    objectRef:
      resource: clusterrolebindings
    requestObjectGlob:
      - jsonPath: "$.roleRef.name"
        glob: ["cluster-admin"]
  emit:
    type: ClusterAdminGranted
    severity: high
    class: Detection
  rateLimit:
    burst: 5
    sustainedPerSec: 1

Source-level Prometheus counters:

  • ugallu_audit_file_lines_total
  • ugallu_audit_file_parse_errors_total
  • ugallu_audit_webhook_events_total
  • ugallu_audit_webhook_parse_errors_total
  • ugallu_audit_webhook_auth_failures_total
  • ugallu_audit_webhook_backpressure_total

Per-rule:

  • ugallu_audit_rule_matches_total{rule}
  • ugallu_audit_rule_dropped_total{rule}
  • ugallu_audit_rule_emit_errors_total{rule}
  • ugallu_audit_rule_compile_errors_total{rule}

The monitoring subchart surfaces these counters as alerts and a Grafana dashboard.

SigmaRule has no terminal phase. Every CR write is a recompile request; the outcome is reflected on the status:

  • Status.Conditions[type=Compiled] flips True after a successful compile, False on parse error.
  • Status.ParseError carries the human-readable reason when Compiled=False.
  • Status.MatchCount, Status.LastMatchedAt, Status.DroppedRateLimit are updated by the engine, not by the reconciler.

No finalizer. Deleting a SigmaRule removes it from the in-memory ruleset on the next reconcile tick.

on each SigmaRule event:
  rule := Get(req.Name)
  if rule.Spec.Enabled:
    compiled, err := sigma.Compile(rule.Spec.Match)
    if err:
      patch Status.Conditions[Compiled]=False, ParseError=err
      remove from in-memory RuleSet
      return
    install in RuleSet (replaces any prior compile)
  else:
    remove from RuleSet
  patch Status.Conditions[Compiled]=True
  RequeueAfter: 30s   # keeps Status.MatchCount fresh

Compile is pure - same input always produces the same RuleSet entry. Operator restart re-Gets every rule and recompiles in parallel; in-flight MatchCount history is lost (lives in the in-memory entry) but the persisted lifetime metric counters survive in Prometheus. A rule disabled in spec stays out of the RuleSet across restarts.

Pod killed mid-compile: the new pod re-Gets the rule, recompiles deterministically, and re-applies the Compiled condition. No half-state to clean up.

  • Hot reload. Mutations to a SigmaRule trigger a recompile immediately; the next match against the rule uses the new predicate.
  • Per-rule rate limit. Drops over budget surface on Status.DroppedRateLimit.
  • Compile error reasons. ErrTooManyWildcards -> Reason=GlobBudget, ErrInvalidJSONPath -> Reason=JSONPath.
  • Source mode swap (file <-> webhook) requires a chart upgrade plus restart - it changes the workload shape.
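
The compile-error mapping above lends itself to sentinel errors checked with errors.Is, so wrapped errors still resolve to the right reason. In this sketch the sentinel names come from the list above, while their messages, the compileErr wrapper, and the "CompileError" fallback reason are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Sentinel errors named after the ones listed above; the messages are made up.
var (
	ErrTooManyWildcards = errors.New("too many wildcards")
	ErrInvalidJSONPath  = errors.New("invalid JSONPath")
)

// compileErr wraps a sentinel with the offending field for a readable
// Status.ParseError. (Hypothetical helper.)
func compileErr(field string, cause error) error {
	return fmt.Errorf("%s: %w", field, cause)
}

// reasonFor maps a compile error to the condition Reason.
func reasonFor(err error) string {
	switch {
	case err == nil:
		return ""
	case errors.Is(err, ErrTooManyWildcards):
		return "GlobBudget"
	case errors.Is(err, ErrInvalidJSONPath):
		return "JSONPath"
	default:
		return "CompileError" // assumed fallback, not from the source
	}
}

func main() {
	err := compileErr("match.requestObjectGlob[0].jsonPath", ErrInvalidJSONPath)
	fmt.Println(reasonFor(err)) // JSONPath
	fmt.Println(err)            // match.requestObjectGlob[0].jsonPath: invalid JSONPath
}
```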

rules:
  - apiGroups: [security.ugallu.io]
    resources: [sigmarules]
    verbs: [get, list, watch]
  - apiGroups: [security.ugallu.io]
    resources: [sigmarules/status]
    verbs: [update, patch]
  - apiGroups: [security.ugallu.io]
    resources: [securityevents]
    verbs: [create]
  - apiGroups: [""]
    resources: [events]
    verbs: [create, patch]
  - apiGroups: [coordination.k8s.io]
    resources: [leases]
    verbs: [get, list, watch, create, update, patch, delete]

  • SigmaRule - one CR per rule; the engine compiles and reloads on every write.
  • AuditDetectionConfig - singleton; toggles source mode and declares event-bus consumers.

Ships as the audit-detection Helm subchart of the umbrella chart. The chart provisions the bearer-token Secret and the TLS Secret for the apiserver audit webhook, plus a Service (webhook) or DaemonSet+hostPath (file). mTLS-only mode is enabled by setting webhook.sharedSecretRef to empty.