GAP introduction
Current state
GAP currently provides full support for operating stateless applications at Gjensidige.
What is GAP
GAP stands for Gjensidige Application Platform.
GAP is a modern, Kubernetes-based internal platform that provides secure, paved roads for application teams. The Platform Squad delivers a fully managed developer foundation, while Application Owners maintain responsibility for the functionality, deployments, and runtime health of their services.
GAP – Capabilities & Components
GAP provides a complete, managed foundation for building, deploying and operating applications. It consists of the following key capabilities:
1. Kubernetes Compute (AKS):
- Managed AKS clusters for running all applications
- Automated upgrades, security patching, and cluster operations
- Standardized networking, RBAC, and security policies
2. Continuous Integration (GitHub Actions):
- Central CI infrastructure with managed runners
- Shared templates and secure defaults for build pipelines
- Supports team‑owned CI workflows and application builds
3. Continuous Delivery / GitOps (ArgoCD):
- Declarative deployments to AKS using GitOps
- Automated sync, health checks, and rollout visibility
- Platform‑operated ArgoCD instance with team‑owned manifests
- Gappynator operator providing a unified abstraction layer for the Kubernetes functionality supported in GAP
4. Container Management (Docker Registry):
- Centralized image storage
- Retention policies, cleanup, and security scanning
- Supports team‑built application images
5. Observability Stack (Loki, Grafana, Mimir, Tempo):
- Logs: Loki
- Metrics: Mimir (Prometheus‑compatible)
- Traces: Tempo
- Dashboards & Alerts: Grafana
- Real User Monitoring (RUM) & Frontend Telemetry: Faro
- Platform‑operated stack with team‑owned service dashboards
For a full overview, see Observability Stack Overview.
Security in GAP
Security is embedded throughout GAP — from the moment a team is onboarded to how applications run in production. Key layers include:
- Team onboarding — Azure environments are provisioned through Terraform with hardened defaults, tiered Entra ID groups, and self-service access packages with approval workflows and annual reviews
- Source code security — every repository is created with enforced branch protection, signed commits, GitHub Advanced Security (code scanning, dependency scanning, secret scanning with push protection), and a continuous Security Score that tracks posture across the organisation
- Container deployment — images are scanned, signed, attested (SLSA provenance + SBOM), and pushed through a standardised pipeline with tag immutability and OIDC authentication
- Kubernetes runtime — clusters enforce Entra ID-only authentication, Gatekeeper policies (restricted pod security), Falco runtime threat detection, Zero Trust networking with Cilium, and mutual TLS via Linkerd
- Frontend deployment — static assets are secret-scanned, vulnerability-scanned, attested, and uploaded to per-team isolated CDN storage with OIDC credentials
- Terraform deployment — infrastructure changes are scanned, plan-reviewed, encrypted, and applied through reusable workflows with OIDC authentication and environment approval gates
- Credential management — all credentials are short-lived (OIDC tokens, GitHub App installation tokens, Workload Identity), and access follows the Principle of Least Privilege with Just-in-Time elevation
For a full overview, see Security in GAP.
Newest features in GAP
- Ephemeral test environments
Planned for GAP:
- Automation of bootstrapping new applications
- Firewall management
- Setup of stateful resources (e.g. PostgreSQL)
Future state
The following describes the direction we expect GAP to evolve over the next 1–4 years.
Unified resource management
Today, developers must manage access and resources across multiple systems — GitHub, Azure, databases, and more. GAP will provide a single interface for provisioning firewall openings, key vaults, databases, AI models, Kafka topics, authentication, and other resources, reducing cognitive overhead and manual coordination.
AI-native platform
GAP will expose interfaces and skills that enable AI models to interact with the platform directly — making it easier to automate workflows, generate configurations, and assist developers in getting the most out of GAP.
Proactive incident management
By combining improved monitoring with AI-assisted root cause analysis, GAP will detect problems before they become user-facing. Where possible, issues will be resolved automatically; otherwise, the platform will surface actionable remediation suggestions for teams to approve.
Principles to follow in order to run on GAP
1. Application State & Behavior
- The application does not store state locally; all state is kept in external services such as databases, caches, or object storage.
Ref: The Twelve‑Factor App -- Processes (https://12factor.net/processes)
- All requests are independent and self‑contained, without relying on prior requests.
Ref: REST Stateless Constraint (https://restfulapi.net/statelessness/)
- Operations should be idempotent where possible to support retries in distributed environments.
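One way to make an operation safe to retry is to key it on a client-supplied idempotency key and keep the result in an external store. The sketch below is illustrative only: `ExternalStore`, `apply_payment`, and the key format are hypothetical names, and the in-memory dictionaries stand in for a real database or cache outside the pod.

```python
from dataclasses import dataclass, field


@dataclass
class ExternalStore:
    """Stand-in for an external database or cache (hypothetical, in-memory for the demo)."""
    results: dict = field(default_factory=dict)
    balances: dict = field(default_factory=dict)


def apply_payment(store: ExternalStore, idempotency_key: str, account: str, amount: int) -> int:
    """Apply a payment at most once per idempotency key and return the new balance."""
    # A retried request carries the same key, so the stored result is returned
    # instead of applying the payment a second time.
    if idempotency_key in store.results:
        return store.results[idempotency_key]
    new_balance = store.balances.get(account, 0) + amount
    store.balances[account] = new_balance
    store.results[idempotency_key] = new_balance
    return new_balance


store = ExternalStore()
first = apply_payment(store, "req-123", "acct-1", 50)
retry = apply_payment(store, "req-123", "acct-1", 50)  # same key: applied only once
print(first, retry, store.balances["acct-1"])  # prints: 50 50 50
```

Because no state lives in the process itself, any instance of the application can serve the retry and return the same answer.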
2. Configuration & Secrets
- Configuration and secrets are injected via environment variables or mounted volumes.
Ref: The Twelve‑Factor App -- Config (https://12factor.net/config)
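As a minimal sketch of reading injected configuration, the example below builds a settings object from environment variables and fails fast when a required value is missing. The variable names (`DATABASE_URL`, `LOG_LEVEL`, `REQUEST_TIMEOUT_SECONDS`) and the `Settings` class are assumptions for illustration, not names defined by GAP.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    log_level: str
    request_timeout_seconds: float


def load_settings(env=os.environ) -> Settings:
    """Build settings from environment variables; fail fast if a required one is missing."""
    try:
        database_url = env["DATABASE_URL"]  # hypothetical variable name
    except KeyError:
        raise RuntimeError("DATABASE_URL must be set") from None
    return Settings(
        database_url=database_url,
        log_level=env.get("LOG_LEVEL", "INFO"),  # optional, with a default
        request_timeout_seconds=float(env.get("REQUEST_TIMEOUT_SECONDS", "5")),
    )
```

Calling `load_settings()` once at startup keeps the same container image usable in every environment, since only the injected values differ.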
3. Observability
- Logs are JSON formatted and written to stdout/stderr, not local files.
Ref: The Twelve‑Factor App -- Logs (https://12factor.net/logs)
- Metrics are exposed via a Prometheus‑compatible endpoint.
- A health endpoint provides liveness and readiness information for Kubernetes probes.
Ref: Kubernetes Probes (https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/)
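The logging and health-check principles above could be sketched as follows, using only the Python standard library. The field names in the JSON payload, the logger name `app`, and the probe paths `/healthz` and `/readyz` are assumptions chosen for illustration; GAP may prescribe different conventions. The metrics endpoint is not shown.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(stream=sys.stdout) -> logging.Logger:
    """Send JSON logs to stdout (or another stream), never to a local file."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("app")  # hypothetical logger name
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def probe_response(path: str, ready: bool) -> tuple[int, str]:
    """Map a probe path to an HTTP status and body (paths are assumptions)."""
    if path == "/healthz":  # liveness: the process is up
        return 200, "ok"
    if path == "/readyz":  # readiness: dependencies are available
        return (200, "ready") if ready else (503, "not ready")
    return 404, "not found"
```

Keeping liveness and readiness separate lets Kubernetes stop routing traffic to a pod that is temporarily not ready without restarting it.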
4. Scalability & Deployment
- The application should be able to run multiple instances simultaneously.
- Multiple versions of the application can run in parallel to support canary and blue/green deployment patterns.
- The application performs a graceful shutdown on SIGTERM and exits before the terminationGracePeriodSeconds expires.
Ref: Kubernetes Pod Lifecycle (https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/)
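A graceful shutdown can be sketched with a signal handler that only sets a flag, leaving the main loop to finish its current unit of work and clean up. The function names (`install_signal_handlers`, `run`, `process_one_item`) are hypothetical, and a real service would typically hook this into its web framework's shutdown mechanism instead.

```python
import signal
import threading

shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record that the process was asked to stop; the main loop does the cleanup."""
    shutdown_requested.set()


def install_signal_handlers() -> None:
    """Register the SIGTERM handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run(process_one_item, poll_seconds: float = 0.1) -> None:
    """Process work until shutdown is requested, then return after the current item."""
    while not shutdown_requested.is_set():
        process_one_item()
        # Sleep between items, but wake up immediately if shutdown is requested.
        shutdown_requested.wait(poll_seconds)
    # Close connections and flush buffers here, well within
    # terminationGracePeriodSeconds (30 seconds by default in Kubernetes).
```

In an entry point, `install_signal_handlers()` would be called once before `run(...)`. If the process has not exited when the grace period expires, Kubernetes sends SIGKILL, so cleanup work should be kept short.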
5. Build & Runtime
- Application builds are immutable Docker container images, with configuration applied only at runtime.
RACI Table for applications running on GAP
| Activity / Responsibility | Platform Squad | Application Owners |
|---|---|---|
| AKS cluster operations (scaling, backups, node pools, networking) | R/A | C |
| AKS security patching & upgrades | R/A | I |
| Cluster-wide security baselines (RBAC, policies) | R/A | I/C |
| Operate GitHub Actions runners / CI infrastructure | R/A | C |
| Maintain shared CI templates / reusable workflows | R | C |
| Implement CI pipelines for their app (build, test) | I | R/A |
| ArgoCD platform operation | R/A | I |
| Define GitOps standards, folder structure, and deployment patterns | R | C |
| Application GitOps manifests | I/C | R/A |
| Application deployments (via ArgoCD automation) | C | R/A |
| Docker registry operation (availability, retention, scanning) | R/A | I |
| Build and publish application images | I | R/A |
| Application runtime ownership (config, scaling settings, healthchecks) | C | R/A |
| Operate observability stack (Loki, Grafana, Mimir, Tempo) | R/A | I |
| Provide dashboards, alerting templates, shared panels | R | C |
| Create application-level dashboards & alerts | C | R/A |
| Incident response: platform/cluster-level issues | R/A | I |
| Incident response: application-level issues | I | R/A |

| Code | Definition |
|---|---|
| R | Responsible: Executes the work |
| A | Accountable: Final authority / decision maker |
| C | Consulted: Provides input before work is done |
| I | Informed: Updated after work is done |