Agent Skill
2/7/2026

platform-engineer

**Master Skill**: Unified Platform, SRE & Release Engineering. Covers OpenShift 4.20+, GitOps (ArgoCD/Tekton), Container Hardening, Service Mesh, Feature Flags, Progressive Rollouts, Observability (LGTM Stack), Chaos Engineering, and Disaster Recovery.

fajjarnr
npx skills add fajjarnr/payu

SKILL.md


name: platform-engineer
version: 3.0.0
maturity: stable
updated: 2026-05-04
author: payu-platform-team
requires: []
tags: [devops, k8s, openshift, infrastructure, gitops, argocd, tekton, helm, sre, reliability, releases, feature-flags]
related: [cybersecurity-architect, integration-architect, finops-engineer]
description: "Master Skill: Unified Platform, SRE & Release Engineering. Covers OpenShift 4.20+, GitOps (ArgoCD/Tekton), Container Hardening, Service Mesh, Feature Flags, Progressive Rollouts, Observability (LGTM Stack), Chaos Engineering, and Disaster Recovery."

📚 Reference Implementation Patterns

For detailed patterns and historical context on PayU infrastructure, see:

PayU Platform Architect Master Skill

You are the Lead Platform Engineer for the PayU Platform. You design and maintain the enterprise-grade automated delivery infrastructure on top of Red Hat OpenShift 4.20+.

⚡ 2026 Platform Engineering Trends

  1. Internal Developer Portal (IDP): Backstage/Red Hat Developer Hub is the golden path interface.
  2. eBPF Observability: Using Pixie/Cilium for zero-instrumentation monitoring.
  3. GreenOps: Carbon-aware scheduling for batch jobs.
  4. Policy as Code: Kyverno/OPA for strict governance enforcement at the cluster level.
  5. Container Port Standardization: All 22 microservices MUST listen on internal port 8080 to simplify networking, healthchecks, and service mesh routing.

🚀 GitOps & Continuous Delivery (ArgoCD)

1. ApplicationSet for Multi-Environment

# infrastructure/platform/argocd-gitops/applicationsets/payu-applicationsets.yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: payu-environments
  namespace: openshift-gitops
spec:
  goTemplate: true
  goTemplateOptions:
    - missingkey=error
  generators:
    - list:
        elements:
          - name: dev
            path: infrastructure/workloads/overlays/dev
            namespace: payu-dev
            project: payu-dev
          - name: sit
            path: infrastructure/workloads/overlays/sit
            namespace: payu-sit
            project: payu-sit
          - name: uat
            path: infrastructure/workloads/overlays/uat
            namespace: payu-uat
            project: payu-uat
          - name: preprod
            path: infrastructure/workloads/overlays/preprod
            namespace: payu-preprod
            project: payu-preprod
          - name: prod
            path: infrastructure/workloads/overlays/prod
            namespace: payu
            project: payu
  template:
    metadata:
      name: "payu-{{.name}}"
    spec:
      project: "{{.project}}"
      source:
        repoURL: https://github.com/fajjarnr/payu.git
        targetRevision: main
        path: "{{.path}}"
      destination:
        server: https://kubernetes.default.svc
        namespace: "{{.namespace}}"
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - CreateNamespace=true
          - RespectIgnoreDifferences=true

Repo alignment note:

  • Stable environments are generated from infrastructure/workloads/overlays/{dev,sit,uat,preprod,prod}.
  • Preview environments must override namespace to payu-dev-pr-* because the dev overlay hardcodes payu-dev.

2. Sync Windows for Production Safety

# infrastructure/platform/argocd-gitops/projects/payu-projects.yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: payu
  namespace: openshift-gitops
spec:
  sourceRepos:
    - https://github.com/fajjarnr/payu.git
  destinations:
    - namespace: payu
      server: https://kubernetes.default.svc
  syncWindows:
    - kind: allow
      schedule: "0 1 * * 1-5"
      duration: 8h
      applications:
        - payu-prod
      namespaces:
        - payu
    - kind: deny
      schedule: "0 0 * * 0,6"
      duration: 24h
      applications:
        - payu-prod
      namespaces:
        - payu
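
The allow window above can be reasoned about with a small sketch. It assumes the schedule is evaluated in UTC (a `timeZone` field on the window would change this) and models only the weekday allow window, not the weekend deny window. The function name `in_allow_window` is illustrative and not part of Argo CD.

```python
from datetime import datetime, timedelta, timezone

# Values taken from the allow window: schedule "0 1 * * 1-5", duration 8h.
ALLOW_START_HOUR = 1
ALLOW_DURATION = timedelta(hours=8)
ALLOW_WEEKDAYS = {0, 1, 2, 3, 4}  # Python weekday(): Monday=0 ... Friday=4


def in_allow_window(now: datetime) -> bool:
    """Return True if `now` falls inside the weekday 01:00-09:00 UTC window."""
    now = now.astimezone(timezone.utc)
    start = now.replace(hour=ALLOW_START_HOUR, minute=0, second=0, microsecond=0)
    # The window is shorter than 24h and does not cross midnight, so it is
    # enough to compare against the window that starts on the same day.
    return now.weekday() in ALLOW_WEEKDAYS and start <= now < start + ALLOW_DURATION


if __name__ == "__main__":
    monday_morning = datetime(2026, 1, 5, 2, 0, tzinfo=timezone.utc)
    print(in_allow_window(monday_morning))
```

A check like this can be useful in a deploy pipeline to fail fast with a clear message instead of waiting on a sync that the window will block.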

3. Automated Rollback

# infrastructure/platform/argocd-gitops/applicationsets/payu-applicationsets.yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
spec:
  template:
    spec:
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        retry:
          limit: 5
          backoff:
            duration: 5s
            factor: 2
            maxDuration: 3m
      ignoreDifferences:
        - group: apps
          kind: Deployment
          jqPathExpressions:
            - .spec.template.metadata.annotations
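
To see what the retry settings above mean in wall-clock terms, here is a minimal sketch. It assumes the delay before each retry is `duration * factor^n`, capped at `maxDuration`; the helper name `backoff_schedule` is hypothetical.

```python
def backoff_schedule(limit: int, base_seconds: int, factor: int, max_seconds: int) -> list[int]:
    """Delay in seconds before each retry, growing exponentially up to a cap."""
    return [min(base_seconds * factor ** attempt, max_seconds) for attempt in range(limit)]


# limit: 5, duration: 5s, factor: 2, maxDuration: 3m (180 seconds)
delays = backoff_schedule(limit=5, base_seconds=5, factor=2, max_seconds=180)
print(delays)       # [5, 10, 20, 40, 80]
print(sum(delays))  # 155
```

Under these assumptions the five retries are exhausted after roughly two and a half minutes, and the 3m cap is never reached with this limit.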

🔧 Tekton CI/CD Pipelines

1. Modular Pipeline Structure

# infrastructure/platform/tekton-pipelines/build-pipeline.yaml
apiVersion: tekton.dev/v1
kind: Pipeline
metadata:
  name: payu-build-pipeline
  namespace: payu-cicd
spec:
  tasks:
    - name: fetch-repository
      taskRef:
        name: git-clone

    - name: secret-scan
      runAfter: [fetch-repository]
      taskRef:
        name: gitleaks

    - name: deep-secret-scan
      runAfter: [fetch-repository]
      taskRef:
        name: trufflehog

    - name: semgrep-scan
      runAfter: [fetch-repository]
      taskRef:
        name: semgrep

    - name: service-sast-sca
      runAfter: [semgrep-scan]
      taskRef:
        name: security-scan

    - name: build-image
      runAfter: [secret-scan, deep-secret-scan, service-sast-sca]
      taskRef:
        name: buildah

    - name: trivy-image-scan
      runAfter: [build-image]
      taskRef:
        name: trivy

    - name: rhacs-policy-check
      runAfter: [trivy-image-scan]
      taskRef:
        name: rhacs-image-check

    - name: generate-sbom
      runAfter: [rhacs-policy-check]
      taskRef:
        name: syft-sbom

    - name: grype-sbom-check
      runAfter: [generate-sbom]
      taskRef:
        name: grype-scan

    - name: sign-image
      runAfter: [grype-sbom-check]
      taskRef:
        name: cosign-sign

Repo alignment note:

  • PayU build pipelines in payu-cicd enforce gitleaks -> trufflehog -> semgrep -> service SAST/SCA -> build -> trivy -> RHACS -> Syft -> Grype -> Cosign.
  • Deploy pipelines gate environment promotion with Argo sync wait plus ZAP/Litmus in SIT, Schemathesis/k6 in UAT, and Cerberus/Kraken in preprod.
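
The `runAfter` fields in the build pipeline form a dependency graph, so some tasks run in parallel. The sketch below copies the task names and `runAfter` lists from the Pipeline above and groups them into stages using Python's standard `graphlib`; the `stages` helper is illustrative and is not how Tekton itself schedules tasks.

```python
from graphlib import TopologicalSorter

# Task name -> tasks it must run after (copied from the Pipeline spec).
RUN_AFTER = {
    "fetch-repository": [],
    "secret-scan": ["fetch-repository"],
    "deep-secret-scan": ["fetch-repository"],
    "semgrep-scan": ["fetch-repository"],
    "service-sast-sca": ["semgrep-scan"],
    "build-image": ["secret-scan", "deep-secret-scan", "service-sast-sca"],
    "trivy-image-scan": ["build-image"],
    "rhacs-policy-check": ["trivy-image-scan"],
    "generate-sbom": ["rhacs-policy-check"],
    "grype-sbom-check": ["generate-sbom"],
    "sign-image": ["grype-sbom-check"],
}


def stages(run_after: dict[str, list[str]]) -> list[list[str]]:
    """Group tasks into stages; tasks in one stage have no mutual dependencies."""
    sorter = TopologicalSorter(run_after)
    sorter.prepare()  # raises CycleError if the runAfter graph has a cycle
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


if __name__ == "__main__":
    for number, tasks in enumerate(stages(RUN_AFTER), start=1):
        print(number, ", ".join(tasks))
```

The second stage contains the three scanners (gitleaks, trufflehog, semgrep) side by side, and `build-image` only becomes ready once both secret scans and the SAST/SCA task have finished.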

2. Pipeline Trigger for Git Events

# tekton/triggers/github-push-trigger.yaml
apiVersion: triggers.tekton.dev/v1beta1
kind: TriggerTemplate
metadata:
  name: java-service-trigger
spec:
  params:
    - name: gitrevision
    - name: gitrepositoryurl
    - name: servicename
  resourcetemplates:
    - apiVersion: tekton.dev/v1
      kind: PipelineRun
      metadata:
        generateName: "$(tt.params.servicename)-"
      spec:
        pipelineRef:
          name: java-service-pipeline
        params:
          - name: git-url
            value: $(tt.params.gitrepositoryurl)
          - name: git-revision
            value: $(tt.params.gitrevision)
          - name: service-name
            value: $(tt.params.servicename)
        workspaces:
          - name: source
            volumeClaimTemplate:
              spec:
                accessModes:
                  - ReadWriteOnce
                resources:
                  requests:
                    storage: 1Gi
---
apiVersion: triggers.tekton.dev/v1beta1
kind: EventListener
metadata:
  name: github-listener
spec:
  serviceAccountName: tekton-triggers-sa
  triggers:
    - name: github-push
      interceptors:
        - ref:
            name: github
          params:
            - name: secretRef
              value:
                secretName: github-webhook-secret
                secretKey: token
            - name: eventTypes
              value: ["push"]
      bindings:
        - ref: github-push-binding
      template:
        ref: java-service-trigger

🏗️ Container Hardening (Podman/UBI9)

PayU uses Podman exclusively: its daemonless architecture and native rootless execution avoid the root-owned daemon that a default Docker setup depends on, which reduces the attack surface.

1. Production Containerfile Template

# Containerfile (Podman) - Multi-stage build for Java service
# Stage 1: Build
FROM registry.access.redhat.com/ubi9/openjdk-21:1.18 AS builder
WORKDIR /build
COPY pom.xml .
COPY src ./src
RUN mvn clean package -DskipTests -Dmaven.repo.local=/build/.m2

# Stage 2: Runtime (minimal)
FROM registry.access.redhat.com/ubi9/ubi-minimal:9.3

# Security: Create non-root user
RUN microdnf install -y java-21-openjdk-headless shadow-utils && \
    microdnf clean all && \
    groupadd -r payu -g 1001 && \
    useradd -r -g payu -u 1001 -d /app payu

WORKDIR /app

# Copy only the built artifact
COPY --from=builder --chown=payu:payu /build/target/*.jar app.jar

# Security: Run as non-root
USER 1001

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=60s --retries=3 \
    CMD curl -f http://localhost:8080/actuator/health/liveness || exit 1

# Security: dropping all capabilities, a read-only root filesystem, and
# no-new-privileges are enforced at runtime through the pod securityContext,
# not in the Containerfile.
EXPOSE 8080

ENTRYPOINT ["java", \
    "-XX:+UseContainerSupport", \
    "-XX:MaxRAMPercentage=75.0", \
    "-Djava.security.egd=file:/dev/./urandom", \
    "-jar", "app.jar"]

2. Security Context in Kubernetes

# deployment.yaml
spec:
  template:
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1001
        runAsGroup: 1001
        fsGroup: 1001
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: app
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: logs
              mountPath: /app/logs
      volumes:
        - name: tmp
          emptyDir: {}
        - name: logs
          emptyDir: {}

3. SELinux Guardrails (Red Hat Best Practices)

The PayU platform relies on SELinux as a defense layer, running in Enforcing mode by default. Never disable SELinux (setenforce 0) in production.

Volume Labeling (:Z vs :z)

When mounting a volume in Podman, the SELinux label must be managed so that the container process is allowed to access the data.

  • :Z: Private, unshared volume. Prevents other containers from accessing the data. (Recommended.)
  • :z: Shared volume. Can be accessed by several containers.
# Example: running rootless Podman with SELinux labeling
podman run -v /data/db:/var/lib/postgresql/data:Z postgres:16

OpenShift MCS (Multi-Category Security)

In OpenShift, each namespace is assigned a unique SELinux category pair (for example, s0:c12,c34). This prevents a container in Namespace A from accessing a volume in Namespace B even when both run with the same UID.

Security Context Constraints (SCC)

Use the restricted-v2 SCC (the default in OCP 4.12+), which automatically:

  1. Allocates a unique UID from the namespace range.
  2. Applies the container_t SELinux type.
  3. Enforces a seccompProfile of type RuntimeDefault.

Troubleshooting Commands

If you get Permission Denied even though the file permissions on the (Linux) host are already 777:

  1. Check the audit log: ausearch -m avc -ts recent
  2. Inspect the file context: ls -Z /path/to/data
  3. Restore the label: restorecon -Rv /path/to/data

⚓ Platform Port Standardization

All PayU backend services follow the 8080 Standard for internal container networking. This reduces configuration complexity and aligns with OpenShift/Kubernetes networking patterns.

1. Port Mapping Principles

  • Internal Port: Always 8080. All applications (Spring Boot, Quarkus, FastAPI) must listen on this port inside the container.
  • External Port: Managed via docker-compose or podman-compose host mapping (e.g., 8001:8080).
  • Service Discovery: Internal communication between containers uses the service name and port 8080 (e.g., http://account-service:8080).

2. Implementation Checklist

  • Dockerfile: EXPOSE 8080.

  • Application Config: server.port=8080 (Spring) or quarkus.http.port=8080.

  • Health Check: Endpoint must be matched to port 8080 (e.g., http://localhost:8080/actuator/health).

  • Gateway Routes: All ROUTES_URL must point to port 8080 of the target service.
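
The checklist above can be partly automated. The following is a minimal sketch that assumes compose-style `host:container` port strings (for example `8001:8080`) have already been extracted per service; the function names and the sample service map are illustrative only.

```python
STANDARD_PORT = 8080


def container_port(mapping: str) -> int:
    """Extract the container-side port from a 'host:container' mapping."""
    # The container port is the last colon-separated field; an optional
    # protocol suffix such as '/tcp' is dropped.
    return int(mapping.rsplit(":", 1)[-1].split("/", 1)[0])


def violations(services: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return, per service, the mappings whose container port is not 8080."""
    bad = {}
    for name, mappings in services.items():
        wrong = [m for m in mappings if container_port(m) != STANDARD_PORT]
        if wrong:
            bad[name] = wrong
    return bad


if __name__ == "__main__":
    sample = {
        "account-service": ["8001:8080"],
        "wallet-service": ["127.0.0.1:8002:8081/tcp"],  # deliberately wrong
    }
    print(violations(sample))
```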


4. OCI & Metadata Standards (Legacy Container Engineer)

All PayU container images MUST carry standard metadata for auditability and traceability, following the OCI (Open Container Initiative) standard.

Containerfile Labels (Build Time)

# Build arguments referenced by the labels below (pass each with --build-arg)
ARG VERSION
ARG BUILD_DATE
ARG GIT_COMMIT

# Standard OCI Labels
LABEL org.opencontainers.image.vendor="PayU Digital Banking" \
      org.opencontainers.image.authors="platform@payu.fajjjar.my.id" \
      org.opencontainers.image.title="Wallet Service" \
      org.opencontainers.image.description="Core ledger and balance management service" \
      org.opencontainers.image.licenses="Proprietary" \
      org.opencontainers.image.source="https://github.com/payu/wallet-service" \
      org.opencontainers.image.documentation="https://docs.payu.internal/services/wallet" \
      org.opencontainers.image.version="${VERSION}" \
      org.opencontainers.image.created="${BUILD_DATE}" \
      org.opencontainers.image.revision="${GIT_COMMIT}"

# PayU Specific Metadata
LABEL id.payu.service.tier="1" \
      id.payu.service.domain="transaction" \
      id.payu.compliance.pci-dss="true" \
      id.payu.security.scan-level="critical"

Kubernetes Annotations (Runtime)

metadata:
  annotations:
    # Build Info
    image.openshift.io/triggers: '[{"from":{"kind":"ImageStreamTag","name":"wallet-service:latest"},"fieldPath":"spec.template.spec.containers[?(@.name==\"app\")].image"}]'

    # Ownership & Contact
    start.payu.fajjjar.my.id/owner: "Wallet Team <wallet@payu.fajjjar.my.id>"
    start.payu.fajjjar.my.id/slack-channel: "#dev-wallet"

    # Operational Metadata
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/actuator/prometheus"

    # Documentation
    link.argocd.argoproj.io/external-link: "https://docs.payu.internal/services/wallet"

📦 Helm Chart Standards

1. Chart Structure

helm/
└── wallet-service/
    ├── Chart.yaml
    ├── values.yaml
    ├── values-dev.yaml
    ├── values-sit.yaml
    ├── values-prod.yaml
    ├── templates/
    │   ├── _helpers.tpl
    │   ├── deployment.yaml
    │   ├── service.yaml
    │   ├── configmap.yaml
    │   ├── secret.yaml
    │   ├── hpa.yaml
    │   ├── pdb.yaml
    │   ├── networkpolicy.yaml
    │   ├── servicemonitor.yaml
    │   └── NOTES.txt
    └── tests/
        └── test-connection.yaml

2. Values Schema

# values.yaml
replicaCount: 2

image:
  repository: registry.payu.internal/payu/wallet-service
  tag: "latest"
  pullPolicy: IfNotPresent

resources:
  requests:
    cpu: 250m
    memory: 512Mi
  limits:
    cpu: 1000m
    memory: 1Gi

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilization: 70
  targetMemoryUtilization: 80

podDisruptionBudget:
  enabled: true
  minAvailable: 1

networkPolicy:
  enabled: true
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: payu-gateway
      ports:
        - port: 8080

monitoring:
  enabled: true
  path: /actuator/prometheus
  port: 8080

🔗 Service Mesh (Istio)

1. Traffic Management

# VirtualService for Canary Deployment
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: wallet-service
spec:
  hosts:
    - wallet-service
  http:
    - match:
        - headers:
            x-canary:
              exact: "true"
      route:
        - destination:
            host: wallet-service
            subset: canary
          weight: 100
    - route:
        - destination:
            host: wallet-service
            subset: stable
          weight: 90
        - destination:
            host: wallet-service
            subset: canary
          weight: 10
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: wallet-service
spec:
  host: wallet-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        h2UpgradePolicy: UPGRADE
        http1MaxPendingRequests: 100
        http2MaxRequests: 1000
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
  subsets:
    - name: stable
      labels:
        version: stable
    - name: canary
      labels:
        version: canary
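
The routing rules above can be modeled in a few lines to sanity-check the expected traffic split. This is a simplified simulation, not Istio's implementation: it assumes an exact match on a lowercase `x-canary` header and an independent weighted choice per request. The `route` function and the `ROUTES` constant are illustrative names.

```python
import random

# Default route weights from the VirtualService (must sum to 100).
ROUTES = [("stable", 90), ("canary", 10)]


def route(headers: dict[str, str], rng: random.Random) -> str:
    """Pick a subset: header override first, then the weighted default route."""
    if headers.get("x-canary") == "true":
        return "canary"  # first http rule: 100% to the canary subset
    subsets, weights = zip(*ROUTES)
    return rng.choices(subsets, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(7)
    hits = sum(route({}, rng) == "canary" for _ in range(10_000))
    print(f"canary share without header: {hits / 10_000:.3f}")
    print(route({"x-canary": "true"}, rng))
```

Requests carrying the header always reach the canary, which is what allows testers to exercise the new version without waiting for the 10% default split to select them.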

2. Mutual TLS (mTLS) Strict Mode

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: payu
spec:
  mtls:
    mode: STRICT
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: wallet-service-authz
  namespace: payu
spec:
  selector:
    matchLabels:
      app: wallet-service
  rules:
    - from:
        - source:
            principals:
              - cluster.local/ns/payu/sa/gateway-service
              - cluster.local/ns/payu/sa/transaction-service
      to:
        - operation:
            methods: ["GET", "POST", "PUT"]
            paths: ["/api/*"]

🌍 Multi-Region Disaster Recovery

1. Architecture Pattern

┌─────────────────────────────────────────────────────────────────┐
│                   Global Load Balancer (GSLB)                   │
│                    (Cloudflare/AWS Route53)                     │
└─────────────────────────┬───────────────────────────────────────┘
                          │
           ┌──────────────┴───────────────┐
           │                              │
           ▼                              ▼
┌──────────────────────┐       ┌──────────────────────┐
│   Region 1 (Active)  │       │  Region 2 (Standby)  │
│   Jakarta DC         │       │  Singapore DC        │
├──────────────────────┤       ├──────────────────────┤
│ OpenShift Cluster    │       │ OpenShift Cluster    │
│ - All services       │──────▶│ - All services       │
│ - Kafka (Primary)    │ Sync  │ - Kafka (Mirror)     │
│ - PostgreSQL (RW)    │──────▶│ - PostgreSQL (RO)    │
│ - Redis (Master)     │──────▶│ - Redis (Replica)    │
└──────────────────────┘       └──────────────────────┘

2. Failover Configuration

# Multi-region Kafka MirrorMaker2
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaMirrorMaker2
metadata:
  name: payu-mm2
spec:
  version: 3.6.0
  replicas: 3
  connectCluster: "region-2"
  clusters:
    - alias: "region-1"
      bootstrapServers: kafka-region1.payu.internal:9092
    - alias: "region-2"
      bootstrapServers: kafka-region2.payu.internal:9092
  mirrors:
    - sourceCluster: "region-1"
      targetCluster: "region-2"
      sourceConnector:
        config:
          replication.factor: 3
          offset-syncs.topic.replication.factor: 3
      topicsPattern: "payu.*"
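
One detail worth checking in the mirror configuration is how broad `topicsPattern: "payu.*"` is. The sketch below assumes the pattern must match the whole topic name and uses Python's `re` as a stand-in for the Java regex engine (the two agree for a pattern this simple). The topic names are hypothetical.

```python
import re

CURRENT = re.compile(r"payu.*")     # pattern as written: '.' matches any character
STRICTER = re.compile(r"payu\..*")  # alternative: require a literal dot after 'payu'

# Hypothetical topic names, for illustration only.
topics = ["payu.wallet.events", "payuments-test", "audit.payu"]

for topic in topics:
    print(
        f"{topic:20} current={bool(CURRENT.fullmatch(topic))} "
        f"stricter={bool(STRICTER.fullmatch(topic))}"
    )
```

If only dot-namespaced topics should be mirrored, escaping the dot avoids replicating unrelated topics that merely start with the same letters.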

💰 Cloud FinOps

1. Resource Right-Sizing with VPA

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: wallet-service-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: wallet-service
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: 100m
          memory: 256Mi
        maxAllowed:
          cpu: 2
          memory: 4Gi
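
The effect of `minAllowed` and `maxAllowed` is that any recommendation gets clamped into that range. The following sketch illustrates the arithmetic with a deliberately small quantity parser (only the `m`, `Ki`, `Mi`, and `Gi` suffixes used here); the helper names and the sample recommendations are assumptions, not VPA output.

```python
MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def cpu_millicores(quantity: str) -> int:
    """Convert '100m' or '2' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def memory_bytes(quantity: str) -> int:
    """Convert '256Mi' or '4Gi' (or a plain byte count) to bytes."""
    for suffix, multiplier in MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * multiplier
    return int(quantity)


def clamp(value: int, low: int, high: int) -> int:
    """Bound a recommendation to the [low, high] range."""
    return max(low, min(value, high))


if __name__ == "__main__":
    # Hypothetical raw recommendations: 50m CPU and 6Gi memory.
    cpu = clamp(cpu_millicores("50m"), cpu_millicores("100m"), cpu_millicores("2"))
    mem = clamp(memory_bytes("6Gi"), memory_bytes("256Mi"), memory_bytes("4Gi"))
    print(cpu, mem)
```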

2. Cost Attribution Labels

# Required labels for all resources
metadata:
  labels:
    app.kubernetes.io/name: wallet-service
    app.kubernetes.io/version: "1.2.3"
    app.kubernetes.io/component: backend
    app.kubernetes.io/part-of: payu-platform
    cost-center: platform-team
    environment: prod
    owner: wallet-team

🐛 Container Build Debugging (Podman/UBI9)

Learned from: E2E test infrastructure setup - February 1, 2026

Common Build Failure Patterns

1. Parent POM Resolution Failure

Symptom: Maven build fails with Could not resolve dependencies or parent POM not found

Root Cause: Containerfile copies only service pom.xml, but Spring Boot services reference parent POM at ../pom.xml

# ❌ WRONG - Only copies service pom.xml
COPY pom.xml ./
RUN mvn dependency:go-offline -B
COPY src ./src

# ✅ CORRECT - Copies entire project for parent POM access
COPY . .
RUN mvn clean package -DskipTests

Fix: Change COPY pom.xml ./ to COPY . . in Containerfiles

2. Maven Build Hanging (4+ hours)

Symptom: Maven build process hangs indefinitely during dependency download or compilation

Root Cause:

  • Parallel builds (-T 1C) causing deadlock in certain services
  • Network issues accessing Maven Central during container build
  • Large dependency downloads timing out

Fix - Use Pre-Built JARs:

# Build stage: Skip Maven, use pre-built JAR
# Runtime stage only
FROM registry.access.redhat.com/ubi9/openjdk-21-runtime:1.24-2

# Copy pre-built JAR from local build
COPY target/*.jar /app/app.jar

USER 1001
ENTRYPOINT ["java", "-jar", "/app/app.jar"]

Build Strategy:

  1. Build all JARs first with Maven from backend directory:

    cd /home/ubuntu/payu/backend
    mvn clean package -DskipTests -T 1C
    
  2. Create runtime-only Containerfiles that copy pre-built JARs

  3. Build images much faster (minutes vs hours)

3. UBI9 Runtime Image Conflicts

Symptom: curl-minimal conflicts when trying to install curl

Root Cause: UBI9 runtime images have curl-minimal pre-installed, conflicts with installing regular curl

Fix: Remove the curl installation from Containerfiles. For health checks, either drop the curl-based check or use the curl binary that the pre-installed curl-minimal package already provides:

# ❌ WRONG - Tries to install curl (conflicts with curl-minimal)
RUN microdnf install -y curl

# ✅ CORRECT - the curl-minimal package already provides the curl command
HEALTHCHECK CMD curl --fail http://localhost:8080/actuator/health || exit 1

4. User Creation Conflicts (GID 185)

Symptom: groupadd: GID '185' already exists when creating non-root user

Root Cause: UBI9 images already have user jboss with GID 185

Fix: Use the existing jboss user (UID 185) instead of creating a new user:

# ❌ WRONG - Tries to create a user with GID 185, which already exists
RUN groupadd -r payu -g 185 && \
    useradd -r -g payu -u 185 -d /app payu

# ✅ CORRECT - Use existing jboss user
USER 185

5. Ignore File Excludes Target Directory

Symptom: COPY target/*.jar /app/app.jar fails with "no such file or directory"

Root Cause: .dockerignore or .containerignore excludes target/ directory

Fix: Either:

  1. Build from parent directory with proper context
  2. Remove target/ from ignore files
  3. Use --ignorefile=.containerignore to bypass dockerignore

Debugging Commands

# Check if parent POM is accessible
cd backend/some-service
cat ../pom.xml  # Should show parent POM content

# Check Maven can resolve parent
mvn help:evaluate -Dexpression=project.parent.groupId -q -DforceStdout
mvn help:evaluate -Dexpression=project.parent.artifactId -q -DforceStdout

# Check what's in target directory
ls -la target/ | grep -E "\.jar$"

# Test Maven build locally (without container)
mvn clean package -DskipTests

# Check whether an ignore file excludes target/
grep -n target .dockerignore .containerignore 2>/dev/null

Build Performance Optimization

| Strategy | Build Time | Disk Space | Use When |
|---|---|---|---|
| Full container build | 10-30 min/service | High | Initial setup, CI/CD |
| Pre-built JARs | 1-2 min/service | Medium | Development, fast iteration |
| Multi-stage with cache | 5-10 min/service | Medium | Production, optimized |
| Runtime-only (local JAR) | <1 min/service | Low | Debugging, testing |

PayU Build Standards

  1. All Spring Boot services use payu-backend-parent (not direct spring-boot-starter-parent)
  2. Containerfiles use COPY . . for parent POM resolution
  3. Non-root user with UID 1001 or existing jboss user (185)
  4. UBI9 images: ubi9/openjdk-21:1.24-2 for build, ubi9/openjdk-21-runtime:1.24-2 for runtime
  5. Node.js images: ubi9/nodejs-20:9.7 for frontend

Known Working Services

| Service | Image | Build Method |
|---|---|---|
| account-service | ✅ payu-account-service:test | Pre-built JAR |
| auth-service | ✅ payu-auth-service:test | Pre-built JAR |
| wallet-service | ✅ payu-wallet-service:test | Pre-built JAR |
| transaction-service | ✅ payu-transaction-service:test | Pre-built JAR |
| investment-service | ✅ payu-investment-service:test | Pre-built JAR |
| gateway-service | ✅ payu-gateway-service:test | Pre-built JAR |
| bi-fast-simulator | ✅ payu-bifast-simulator:test | Pre-built JAR |
| dukcapil-simulator | ✅ payu-dukcapil-simulator:test | Full build |
| qris-simulator | ✅ payu-qris-simulator:test | Pre-built JAR |



🛡️ Platform Integrity Checklist

Security

  • Containerfile uses UBI9-minimal and a non-root USER

  • Runs under rootless Podman (UID 1001)

  • SecurityContext drops all capabilities

  • NetworkPolicies isolate service traffic

  • Secrets managed via Vault + External Secrets Operator (not Git)

Delivery

  • Service deployed via ArgoCD (GitOps)

  • Sync windows configured for production

  • Automated rollback enabled

  • Tekton pipeline includes secret scan, SAST/SCA, SBOM, vulnerability gating, and Cosign signing

Observability

  • PodMonitor/ServiceMonitor configured

  • Distributed tracing enabled (Jaeger/OpenTelemetry)

  • Log aggregation configured (Loki)

  • eBPF probes enabled for network visibility

Resilience

  • PodDisruptionBudget defined

  • HPA configured with appropriate thresholds

  • Multi-region DR tested quarterly

  • Chaos testing run in SIT automatically


📚 References

Merged Skill References (Consolidated)

| Category | Topic | File |
|---|---|---|
| Releases | Feature Flags, Progressive Rollouts, Blue-Green/Canary | release-engineering.md |
| SRE | Observability, SLO/SLI, Chaos Engineering, DR | sre-practices.md |
| K8s | Kubernetes manifest generator patterns | k8s-manifest-generator.md |



Last Updated: January 2026
