Hardening

The umbrella chart’s defaults optimise for “boots on a fresh Kubernetes cluster on the first try”. For production you want a different set of switches flipped.

```yaml
hardening:
  strict: true
```

Setting hardening.strict=true is shorthand for the rest of this page. It flips on every preflight check the chart owns, and it makes the chart fail to install when a prerequisite is missing rather than degrade silently.

The preflights are runtime-evaluated, not just chart-rendered:

  • policy/v1beta1.PodSecurityPolicy is not present (PSPs were removed in Kubernetes 1.25; their presence signals a mis-versioned cluster).
  • the namespace ugallu-system-privileged carries the pod-security.kubernetes.io/enforce=privileged label - without it the privileged DaemonSets won’t be admitted.
  • the policy.sigstore.dev CRDs from sigstore-policy-controller are present (signature gating for ugallu’s own images).
  • spire.io CRDs are present when attestor.signingMode=fulcio-keyless - the operator needs a trusted SVID issuer.
  • the worm.endpoint hostname resolves in DNS and a TCP probe to it succeeds.
  • NTP skew reported by chronyd is under 250 ms (the attestor’s Rekor inclusion proofs are timestamp-sensitive).

A CRD missing from the cluster yields a Reason plus a remediation link in the chart’s helm install --debug output, so the operator team knows what to install before retrying.
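The skew preflight, for instance, reduces to parsing chronyd’s tracking report. A minimal sketch in Python, assuming the standard `chronyc tracking` output format (the helper name and parsing are illustrative, not chart code):

```python
import re

# Threshold mirrored from the preflight description: Rekor inclusion
# proofs are timestamp-sensitive, so skew must stay under 250 ms.
MAX_SKEW_SECONDS = 0.250

def ntp_skew_ok(tracking_output: str) -> bool:
    """Check the system-time offset in `chronyc tracking` output.

    The relevant line looks like:
        System time     : 0.000123456 seconds fast of NTP time
    """
    m = re.search(r"System time\s*:\s*([0-9.]+) seconds (fast|slow)",
                  tracking_output)
    if m is None:
        return False  # can't prove the skew is in bounds -> fail closed
    return float(m.group(1)) < MAX_SKEW_SECONDS
```

Failing closed on unparseable output matches the strict posture: a missing answer is treated like a bad one.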

```yaml
worm:
  endpoint: https://worm.example.internal
  bucket: ugallu-worm-prod
  encryption:
    mode: sse-kms
    kmsKeyID: arn:aws:kms:eu-west-1:...:key/...
  retention:
    bundle: 10y
    forensicsFs: 5y
    forensicsMem: 1y
```

Object Lock must be enabled at the bucket level in COMPLIANCE mode (not GOVERNANCE - GOVERNANCE allows a sufficiently privileged user to break retention). The chart can’t enforce this for you because the setting lives on the bucket, not in the cluster; it WILL emit a WormBucketUnlocked SecurityEvent on every startup if it sees a non-locked bucket.
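The startup check amounts to inspecting the bucket’s Object Lock configuration. A hedged sketch against the shape of S3’s GetObjectLockConfiguration response (the helper is illustrative, not chart code):

```python
def object_lock_is_compliance(lock_config: dict) -> bool:
    """Return True only if Object Lock is enabled in COMPLIANCE mode.

    `lock_config` is the ObjectLockConfiguration body of an S3
    GetObjectLockConfiguration response. GOVERNANCE is rejected on
    purpose: a principal with s3:BypassGovernanceRetention can shorten
    or remove retention, which defeats the WORM guarantee.
    """
    if lock_config.get("ObjectLockEnabled") != "Enabled":
        return False
    rule = lock_config.get("Rule", {}).get("DefaultRetention", {})
    return rule.get("Mode") == "COMPLIANCE"
```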

```yaml
attestor:
  signingMode: dual
  fulcio:
    issuer: https://kubernetes.default.svc.cluster.local
    fulcioURL: https://fulcio.sigstore.dev
  openbao:
    address: https://openbao.openbao:8200
    transitMount: transit
    keyName: ugallu-attestor-prod
    authRole: ugallu-attestor
  rekor:
    enabled: true
    url: https://rekor.sigstore.dev
```

signingMode=dual is the recommended production posture: every bundle is co-signed by Fulcio (identity-rooted) and OpenBao (key-rooted). If either backend goes down, bundles are still signed by the other; both going down at once is the signal that something infrastructural is broken, not just a flaky external dependency.
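The failure semantics can be sketched as a collect-what-you-can loop (a hedged Python illustration; `sign_bundle` and `AllSignersFailed` are our names, not the attestor’s API):

```python
class AllSignersFailed(Exception):
    """Raised only when every backend fails - the infrastructural signal."""

def sign_bundle(bundle: bytes, signers: dict) -> dict:
    """Collect signatures from every reachable backend.

    `signers` maps a backend name ("fulcio", "openbao") to a callable
    returning a signature for the bundle. Any single failure is
    tolerated: a bundle with at least one signature is still valid.
    """
    signatures, errors = {}, {}
    for name, sign in signers.items():
        try:
            signatures[name] = sign(bundle)
        except Exception as exc:  # one flaky backend is expected; keep going
            errors[name] = exc
    if not signatures:
        raise AllSignersFailed(errors)  # both down -> escalate, don't degrade
    return signatures
```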

```yaml
resolver:
  tls:
    mode: spire
    spire:
      trustDomain: example.internal
```

In mode=spire the resolver’s gRPC service mounts SPIRE-issued SVIDs and refuses connections without a verified peer certificate. The DNS-detect operator uses the same trust domain to authenticate, so a misconfigured trust domain produces a clear refusal at startup rather than a silent fallback.
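The refusal amounts to a trust-domain comparison on the peer’s SPIFFE ID. A minimal sketch (function name is ours; the real check lives in the mTLS handshake):

```python
def peer_allowed(spiffe_id: str, trust_domain: str = "example.internal") -> bool:
    """Accept a peer only if its SVID's SPIFFE ID is in our trust domain.

    A SPIFFE ID has the shape spiffe://<trust-domain>/<workload-path>.
    Matching the full "spiffe://<domain>/" prefix (including the slash)
    avoids accepting look-alike domains such as example.internal.evil.
    """
    return spiffe_id.startswith(f"spiffe://{trust_domain}/")
```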

The chart ships a default NetworkPolicy per operator under charts/ugallu/charts/<op>/templates/04-networkpolicy.yaml. The default permits:

  • ingress from the Prometheus operator scrape namespace (configured via monitoring.prometheusNamespace)
  • egress to kube-apiserver, the configured WORM endpoint, the resolver service, and (where applicable) the audit-bus / bridge

For Cilium clusters a parallel CiliumNetworkPolicy with stricter identity-based selection ships under the same template. Toggle the backend with umbrella.networkPolicy.backend=cilium|coreV1.
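For orientation, a coreV1 policy in the spirit of those defaults might look like the following (labels, namespace names, and ports are placeholders, not the chart’s actual selectors):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ugallu-resolver            # illustrative; the real name comes from the template
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: resolver
  policyTypes: [Ingress, Egress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring  # monitoring.prometheusNamespace
  egress:
    - to: []                       # the shipped template scopes this to the
      ports:                       # apiserver, WORM endpoint, and audit-bus
        - protocol: TCP
          port: 443
```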

```yaml
ui:
  auth:
    mode: oidc
    issuer: https://keycloak.example.internal/realms/ugallu
    clientID: ugallu-ui
    redirectURL: https://ugallu.example.internal/oauth/callback
```

The BFF supports OIDC + PKCE only - no password grant, no client credentials, no implicit flow. ServiceAccount impersonation (mode=sa-impersonation) is supported as a fallback for clusters without an OIDC issuer, but the audit log entries it produces attribute every UI action to the BFF’s ServiceAccount rather than to the human behind it.
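For reference, the PKCE S256 exchange the flow relies on is small enough to sketch (per RFC 7636; `pkce_pair` is an illustrative helper, not BFF code):

```python
import base64
import hashlib
import secrets

def pkce_pair() -> tuple[str, str]:
    """Generate a PKCE code_verifier and its S256 code_challenge.

    The verifier is 32 random bytes, base64url-encoded without padding;
    the challenge is the base64url-encoded SHA-256 of the verifier.
    """
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge
```

The challenge goes with the authorization request, the verifier with the token exchange; the issuer recomputes the hash to bind the two, which is what makes the grant safe without a client secret.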

A few hardening responsibilities live outside the chart:

  • Container image signing for non-ugallu workloads. The chart signs ugallu’s own images and configures sigstore-policy-controller to gate them; gating workloads from other vendors is a policy you author.
  • Audit logging at the apiserver. The audit-detection operator consumes the audit log; it does not configure the apiserver to produce it. On managed control planes use the platform’s audit webhook output; on self-managed clusters add --audit-webhook-config-file.
  • Backup creation. backup-verify checks Velero / etcd snapshots; it doesn’t take them. Velero schedules and etcd snapshot CronJobs are the cluster operator’s responsibility.
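On the backup point, a minimal Velero Schedule that backup-verify could then check might look like this (name, cadence, and TTL are placeholders, not shipped defaults):

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: nightly-cluster-backup
  namespace: velero
spec:
  schedule: "0 2 * * *"          # 02:00 daily, cron syntax
  template:
    includedNamespaces: ["*"]
    ttl: 720h                    # keep 30 days; backup-verify checks, never creates
```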