GAP introduction
Current state
GAP currently offers full support for operating stateless applications at Gjensidige.
What is GAP
GAP - Gjensidige Application Platform.
GAP is a modern, Kubernetes-based internal platform that provides secure, paved roads for application teams. The Platform Squad delivers a fully managed developer foundation, while Application Owners maintain responsibility for the functionality, deployments, and runtime health of their services.
GAP – Capabilities & Components
GAP provides a complete, managed foundation for building, deploying and operating applications. It consists of the following key capabilities:
1. Kubernetes Compute (AKS):
- Managed AKS clusters for running all applications
- Automated upgrades, security patching, and cluster operations
- Standardized networking, RBAC, and security policies
2. Continuous Integration (GitHub Actions):
- Central CI infrastructure with managed runners
- Shared templates and secure defaults for build pipelines
- Supports team‑owned CI workflows and application builds
3. Continuous Delivery / GitOps (ArgoCD):
- Declarative deployments to AKS using GitOps
- Automated sync, health checks, and rollout visibility
- Platform‑operated ArgoCD instance with team‑owned manifests
- Gappynator operator providing a unified abstraction layer for the Kubernetes functionality supported in GAP
4. Container Management (Docker Registry):
- Centralized image storage
- Retention policies, cleanup, and security scanning
- Supports team‑built application images
5. Observability Stack (Loki, Grafana, Mimir, Tempo):
- Logs: Loki
- Metrics: Mimir (Prometheus‑compatible)
- Traces: Tempo
- Dashboards & Alerts: Grafana
- RUM (Real User Monitoring) & frontend telemetry: Faro
- Platform‑operated stack with team‑owned service dashboards
Newest features in GAP
- Ephemeral test environments
Planned for GAP:
- Automation of bootstrapping new applications
- Firewall management
- Setup of stateful resources (e.g. PostgreSQL)
Principles to follow in order to run on GAP
1. Application State & Behavior
- The application does not store state locally; all state is kept in external services such as databases, caches, or object storage.
Ref: The Twelve‑Factor App -- Processes (https://12factor.net/processes)
- All requests are independent and self‑contained, without relying on prior requests.
Ref: REST Stateless Constraint (https://restfulapi.net/statelessness/)
- Operations should be idempotent where possible to support retries in distributed environments.
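One way to make an operation idempotent is to key each request on a client-supplied identifier and keep the result in an external store, so a retry returns the first result instead of repeating the work. The sketch below is illustrative only: `ExternalStore`, `handle_request`, and the idempotency key are hypothetical names, and the in-memory dict is a stand-in for a real database or cache.

```python
from dataclasses import dataclass, field


@dataclass
class ExternalStore:
    """In-memory stand-in for an external service (database or cache).

    Hypothetical helper for illustration; a real application would call
    its external datastore here instead of keeping data in the process.
    """
    records: dict = field(default_factory=dict)

    def put_if_absent(self, key, value):
        # Keep the first value written for a key and return whatever is stored.
        return self.records.setdefault(key, value)


def handle_request(store, idempotency_key, payload):
    """Process a request once per idempotency key.

    A retry with the same key returns the originally stored result,
    so repeated delivery does not repeat the side effect.
    """
    result = {"status": "processed", "payload": payload}
    return store.put_if_absent(idempotency_key, result)
```

Because the handler holds no state of its own, any instance of the application can serve the retry, provided all instances share the same external store.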
2. Configuration & Secrets
- Configuration and secrets are injected via environment variables or mounted volumes.
Ref: The Twelve‑Factor App -- Config (https://12factor.net/config)
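As a rough illustration of reading configuration at runtime, the sketch below loads settings from environment variables and fails fast when a required value is missing. The variable names (`DATABASE_URL`, `LOG_LEVEL`, `REQUEST_TIMEOUT_SECONDS`) and the `Settings` class are assumptions made for the example, not names defined by GAP; it covers the environment-variable path only, not mounted volumes.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Runtime configuration; field names are hypothetical examples."""
    database_url: str
    log_level: str
    request_timeout_seconds: float


def load_settings(env=None):
    """Build Settings from environment variables.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env
    required = ("DATABASE_URL",)
    missing = [name for name in required if name not in env]
    if missing:
        # Fail at startup instead of running with incomplete configuration.
        raise RuntimeError(f"missing required environment variables: {missing}")
    return Settings(
        database_url=env["DATABASE_URL"],
        log_level=env.get("LOG_LEVEL", "INFO"),
        request_timeout_seconds=float(env.get("REQUEST_TIMEOUT_SECONDS", "5")),
    )
```

Keeping all lookups in one function means the same image can run unchanged in every environment, with only the injected values differing.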
3. Observability
- Logs are JSON formatted and written to stdout/stderr, not local files.
Ref: The Twelve‑Factor App -- Logs (https://12factor.net/logs)
- Metrics are exposed via a Prometheus‑compatible endpoint.
- A health endpoint provides liveness and readiness information for Kubernetes probes.
Ref: Kubernetes Probes (https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/)
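The logging and health-endpoint principles could look roughly like this using only the Python standard library. The endpoint paths (`/healthz/live`, `/healthz/ready`), the JSON field names, and the helper names are assumptions for the example; GAP may prescribe different ones. The metrics endpoint is left out, since it normally comes from a Prometheus client library.

```python
import json
import logging
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def build_logger(name="app", stream=None):
    """Return a logger that writes JSON lines to stdout (or a given stream)."""
    handler = logging.StreamHandler(sys.stdout if stream is None else stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


class HealthHandler(BaseHTTPRequestHandler):
    """Serve separate liveness and readiness endpoints (hypothetical paths)."""

    ready = True  # set to False while required dependencies are unavailable

    def do_GET(self):
        if self.path == "/healthz/live":
            self._reply(200, {"status": "alive"})
        elif self.path == "/healthz/ready":
            code = 200 if self.ready else 503
            self._reply(code, {"status": "ready" if self.ready else "not ready"})
        else:
            self._reply(404, {"error": "not found"})

    def _reply(self, code, body):
        payload = json.dumps(body).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep probe traffic out of the application log


def serve_health(port=8080):
    """Block and serve the health endpoints on the given port."""
    HTTPServer(("", port), HealthHandler).serve_forever()
```

Separating liveness from readiness lets Kubernetes stop routing traffic to an instance that is temporarily not ready without restarting it.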
4. Scalability & Deployment
- The application should be able to run multiple instances simultaneously.
- Multiple versions of the application can run in parallel to support canary and blue/green deployment patterns.
- The application performs a graceful shutdown on SIGTERM and exits before the terminationGracePeriodSeconds expires.
Ref: Kubernetes Pod Lifecycle (https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/)
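A graceful shutdown can be sketched as a signal handler that only sets a flag, with the main loop finishing its current unit of work before exiting. The names `request_shutdown`, `run_worker`, and `process_one` are hypothetical, and the loop stands in for whatever work the application actually does.

```python
import signal
import threading

# Set once SIGTERM arrives; the worker loop checks it between units of work.
shutdown_requested = threading.Event()


def request_shutdown(signum, frame):
    """Signal handler: record the request and return immediately."""
    shutdown_requested.set()


def run_worker(process_one, poll_interval=0.1):
    """Call process_one repeatedly until shutdown is requested.

    The unit of work in progress always completes; no new one starts
    after the flag is set.
    """
    while not shutdown_requested.is_set():
        process_one()
        # Wait between iterations, waking early if shutdown is requested.
        shutdown_requested.wait(poll_interval)


def main():
    signal.signal(signal.SIGTERM, request_shutdown)
    run_worker(lambda: None)
    # Close connections and flush buffers here, then return so the
    # process exits well within the configured grace period.
```

The application would call `main()` from its entry point. Keeping the handler minimal avoids doing real work inside the signal context; cleanup happens in normal code after the loop ends.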
5. Build & Runtime
- Application builds produce immutable Docker container images, with configuration applied only at runtime.
RACI Table for applications running on GAP
| Activity / Responsibility | Platform Squad | Application Owners |
|---|---|---|
| AKS cluster operations (scaling, backups, node pools, networking) | R/A | C |
| AKS security patching & upgrades | R/A | I |
| Cluster-wide security baselines (RBAC, policies) | R/A | I/C |
| Operate GitHub Actions runners / CI infrastructure | R/A | C |
| Maintain shared CI templates / reusable workflows | R | C |
| Implement CI pipelines for their app (build, test) | I | R/A |
| ArgoCD platform operation | R/A | I |
| Define GitOps standards, folder structure, and deployment patterns | R | C |
| Application GitOps manifests | I/C | R/A |
| Application deployments (via ArgoCD automation) | C | R/A |
| Docker registry operation (availability, retention, scanning) | R/A | I |
| Build and publish application images | I | R/A |
| Application runtime ownership (config, scaling settings, healthchecks) | C | R/A |
| Operate observability stack (Loki, Grafana, Mimir, Tempo) | R/A | I |
| Provide dashboards, alerting templates, shared panels | R | C |
| Create application-level dashboards & alerts | C | R/A |
| Incident response: platform/cluster-level issues | R/A | I |
| Incident response: application-level issues | I | R/A |
| Code | Definition |
|---|---|
| R | Responsible: Executes the work |
| A | Accountable: Final authority / decision maker |
| C | Consulted: Provides input before work is done |
| I | Informed: Updated after work is done |