seccomp-gen

ugallu-seccomp-gen produces seccomp profiles from observation rather than guesswork. Point a SeccompTrainingRun at a pod, give it a duration, and the operator subscribes to the tetragon-bridge syscall stream filtered to that pod for the training window. At the end, an OCI seccomp.json is emitted as a SeccompTrainingProfile CR.

Flow

User creates a SeccompTrainingRun with targetPodRef + duration.
Reconciler spawns a per-run goroutine; each opens an independent subscription to the bridge - concurrent training sessions don’t interfere.
The engine accumulates the unique syscall surface, debounced with a 30-second sliding window so a flurry of fork/exec at workload start doesn’t drown out the steady-state surface.
On duration expiry, the engine assembles the seccomp profile in the OCI format: defaultAction: SCMP_ACT_ERRNO, plus an allow rule for every syscall observed.
SeccompTrainingProfile is created with the profile bytes embedded; an optional ConfigMap is also created for easy securityContext.seccompProfile.localhostProfile consumption.

Profile shape

{
  "defaultAction": "SCMP_ACT_ERRNO",
  "syscalls": [
    {
      "names": ["read", "write", "openat", "close", "fstat",
                "mmap", "munmap", "brk", "rt_sigaction", "futex",
                "clone", "execve", "exit_group"],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}

The output is intentionally bare - no architecture filters, no syscall arg matchers. We trade granularity for portability: the profile loads on any architecture the kernel supports.

Example

apiVersion: security.ugallu.io/v1alpha1
kind: SeccompTrainingRun
metadata: { name: payments-canary, namespace: ugallu-system }
spec:
  targetPodRef:
    namespace: payments
    name: api-canary-7c4d
  duration: 10m
  emitConfigMap: true

Internals

State machine

SeccompTrainingRun.status.phase: "" (Pending) -> Running -> Succeeded | Failed. No finalizer. The engine runs as a detached goroutine bounded by spec.duration; the reconciler polls engine.IsRunning(name) on a 30s requeue.

Reconcile loop

on each SeccompTrainingRun event:
  run := Get(req)
  if run.Status.Phase in {Succeeded, Failed}: return
  if run.Status.Phase == "":
    pods := selectTargetPods(run.Spec.TargetSelector, replicaRatio)
    engine.Start(run.Name, pods, run.Spec.Duration)
    patch Status.Phase=Running, SelectedReplicas=len(pods), StartTime=now
    emitSE(SeccompTrainingStarted, info)
    RequeueAfter: 30s; return
  if engine.IsRunning(run.Name):
    RequeueAfter: 30s; return
  result := engine.Result(run.Name)
  if result == nil:                  # orphaned engine
    patch Status.Phase=Failed
    emitSE(SeccompTrainingFailed, high)
    return
  upsert SeccompTrainingProfile (carrying result.profileJSON)
  patch Status.Phase=Succeeded, ProfileRef, ObservedSyscallCount, CompletionTime
  emitSE(SeccompTrainingCompleted, info)
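
The branching in the loop above can be read as a pure decision over three inputs: the current phase, whether the engine still reports the run as active, and whether a result exists. The Go sketch below models only that decision. The `decide` function and the `action` values are hypothetical names for this example; patching status, emitting events, and requeueing are left out.

```go
package main

import "fmt"

type phase string

const (
	phasePending   phase = "" // an empty phase means the run has not started
	phaseRunning   phase = "Running"
	phaseSucceeded phase = "Succeeded"
	phaseFailed    phase = "Failed"
)

// action names are hypothetical labels for the branches of the loop.
type action string

const (
	actNone    action = "none"
	actStart   action = "start-engine"
	actRequeue action = "requeue-30s"
	actFail    action = "mark-failed"
	actPublish action = "publish-profile"
)

// decide mirrors the order of checks in the reconcile pseudocode.
func decide(p phase, engineRunning, hasResult bool) action {
	switch {
	case p == phaseSucceeded || p == phaseFailed:
		return actNone // terminal phases are never revisited
	case p == phasePending:
		return actStart
	case engineRunning:
		return actRequeue
	case !hasResult:
		return actFail // orphaned engine: neither running nor finished
	default:
		return actPublish
	}
}

func main() {
	fmt.Println(decide(phasePending, false, false)) // start-engine
	fmt.Println(decide(phaseRunning, true, false))  // requeue-30s
	fmt.Println(decide(phaseRunning, false, false)) // mark-failed
	fmt.Println(decide(phaseRunning, false, true))  // publish-profile
}
```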

Error recovery

Engine ctx is detached from Reconcile ctx so a reconcile completion does not stop the engine. On operator restart, the engine in-memory state is lost. The new pod re-Gets the run; if Phase=Running, engine.IsRunning(name) returns false (engine was in the dead pod), so the run is marked Failed with reason=engine-orphaned. The user re-applies a fresh SeccompTrainingRun to retry.

Crash recovery scenario

If the operator pod is killed during a 30-minute training window, the user creates a new run, with a duration long enough to cover the time that was missed. The operator does not auto-resume the old run because the syscall surface during the gap is unknown - producing a profile from partial data could lock out a code path that was about to run.

Edge cases

ReplicaRatio. A bounded fraction of the matched pods (default 50%) participates in the training; the rest run untraced as a control. Useful when training in production.
Cancel on delete. Deleting the CR calls engine.Cancel(name) synchronously - the bridge subscription is closed cleanly.
DefaultAction. Profile ships with defaultAction=SCMP_ACT_ERRNO; can be overridden to SCMP_ACT_KILL / SCMP_ACT_LOG / SCMP_ACT_TRACE per run.
Bare profile. The output carries no architecture filters and no syscall arg matchers; it trades granularity for portability.
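
For the ReplicaRatio case, the arithmetic of turning a ratio into a pod count might look like the Go sketch below. The `tracedCount` helper is hypothetical, and the rounding rule (round up, so a non-zero ratio always traces at least one pod) is an assumption; the text above does not specify how fractional counts are handled.

```go
package main

import (
	"fmt"
	"math"
)

// tracedCount returns how many of n matched pods would be traced.
// Assumption: fractional results round up, and ratios at or above 1
// trace every pod. Neither rule is confirmed by the documentation.
func tracedCount(n int, ratio float64) int {
	if n <= 0 || ratio <= 0 {
		return 0
	}
	if ratio >= 1 {
		return n
	}
	return int(math.Ceil(float64(n) * ratio))
}

func main() {
	fmt.Println(tracedCount(4, 0.5)) // 2
	fmt.Println(tracedCount(7, 0.5)) // 4
	fmt.Println(tracedCount(3, 1.0)) // 3
}
```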

Full RBAC (ClusterRole)

rules:
  - apiGroups: [security.ugallu.io]
    resources:
      - seccomptrainingruns
      - seccomptrainingruns/status
      - seccomptrainingprofiles
      - seccomptrainingprofiles/status
    verbs: [get, list, watch, create, update, patch]
  - apiGroups: [security.ugallu.io]
    resources: [securityevents]
    verbs: [create]
  - apiGroups: [""]
    resources: [pods]
    verbs: [get, list, watch]   # target selection
  - apiGroups: [""]
    resources: [events]
    verbs: [create, patch]
  - apiGroups: [coordination.k8s.io]
    resources: [leases]
    verbs: [get, list, watch, create, update, patch, delete]

CRDs owned

SeccompTrainingRun
- per-run, immutable spec.
SeccompTrainingProfile
- produced once when the training completes.

Key flags

--cluster-id, --cluster-name, --bridge-endpoint (default ugallu-tetragon-bridge.ugallu-system-privileged.svc:50051), --bridge-token.

Deployment

Deployment (1 replica) in ugallu-system-privileged. The operator is placed there because the bridge endpoint lives in the privileged namespace, which keeps the network policy intra-namespace; the operator process itself doesn’t need privileged caps. Leader election on.

Telemetry

ugallu_seccomp_runs_active, ugallu_seccomp_runs_total{outcome}, ugallu_seccomp_syscalls_observed_total{run}, ugallu_seccomp_bridge_disconnects_total.

Note on safety

The profile is deny-by-default with an allowlist of observed syscalls. If a workload executes a code path that was not exercised during training, the syscalls it needs are rejected: under the default SCMP_ACT_ERRNO they fail with an error return (typically EPERM), and under an override to SCMP_ACT_KILL the offending thread is killed with SIGSYS. Treat training duration like a coverage exercise: the longer the run and the broader the workload, the safer the resulting profile. The operator doesn’t deploy the profile for you; that’s an explicit human step, by design.
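
One way to approach the coverage question before applying a profile is to compare its allowlist against syscalls seen in a separate observation, such as a second training run. The Go sketch below is a hypothetical check, not a feature of the operator; the `missing` helper and its inputs are illustrative names.

```go
package main

import (
	"fmt"
	"sort"
)

// missing returns the syscalls in observed that the allowlist does not
// cover, deduplicated and sorted. A non-empty result suggests the training
// run did not exercise every code path.
func missing(allowed, observed []string) []string {
	seen := make(map[string]struct{}, len(allowed))
	for _, a := range allowed {
		seen[a] = struct{}{}
	}
	var out []string
	for _, o := range observed {
		if _, ok := seen[o]; !ok {
			out = append(out, o)
			seen[o] = struct{}{} // record it so repeats are reported once
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	allowed := []string{"read", "write", "openat", "close"}
	later := []string{"read", "connect", "connect", "accept4"}
	fmt.Println(missing(allowed, later)) // [accept4 connect]
}
```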