# Creating an Auto-scalable Deployment
In this guide, we will create a production-ready deployment with automatic scaling capabilities. You should generally use autoscaling for your deployments to:
- increase resilience when load is high during peak hours
- decrease cost by using only the necessary resources to run your application at any given time
Let's get started! 🌟 🚀
If you encounter errors or get stuck along the way, refer to "A visual guide on troubleshooting Kubernetes deployments" for help debugging Kubernetes deployments 🐛
## Prerequisites
- A container image on Gjensidige's Container Registry
Learn how to push your container image to Gjensidige's Container Registry in this guide
## Get secrets from Azure Key Vault
We start off by defining which secrets your app depends on. If your app doesn't depend on any secret values, you can skip this step, but remember to remove the secret-related properties from the Deployment in the next step.
Replace the values in the YAML file below to match your preferences:
```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  namespace: "your-team-namespace"
  name: "your-app-name-secrets"
  labels:
    app: "your-app-name"
spec:
  provider: "azure"
  secretObjects:
    - secretName: "your-app-name-secrets"
      type: Opaque
      data:
        - objectName: "your-keyvault-secret-name-1"
          key: "your-keyvault-secret-name-1"
        - objectName: "your-keyvault-secret-name-2"
          key: "your-keyvault-secret-name-2"
  parameters:
    usePodIdentity: "true"
    keyvaultName: "your-keyvault-name"
    objects: |
      array:
        - |
          objectName: "your-keyvault-secret-name-1"
          objectType: secret
        - |
          objectName: "your-keyvault-secret-name-2"
          objectType: secret
    tenantId: "azure-tenant-uuid"
```
Learn about the Secrets Store CSI Driver and SecretProviderClass in this guide.
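Because `secretObjects` syncs the Key Vault secrets into a regular Kubernetes Secret named `your-app-name-secrets`, your container can also read them as environment variables. A minimal sketch, using the placeholder names above (the env var name is hypothetical):

```yaml
# Excerpt from a container spec: reads one synced Key Vault secret as an env var.
# Note: the Secret "your-app-name-secrets" is only created while a Pod mounts the
# secrets-store volume, so keep the volumeMount from the Deployment below.
env:
  - name: "YOUR_SECRET_ENV_VAR"           # hypothetical variable name
    valueFrom:
      secretKeyRef:
        name: "your-app-name-secrets"     # matches secretObjects.secretName
        key: "your-keyvault-secret-name-1"
```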
## ServiceAccount
You can think of a `ServiceAccount` as an identity for your application deployment. It's used to set fine-grained permissions for accessing other resources in Kubernetes. The `ServiceAccount` specified below is given a minimal set of privileges and will work for most deployments. Read more about how to configure service accounts in the Kubernetes docs.
Every Deployment must use a dedicated `ServiceAccount`. This ensures identities are not shared across applications. Do not use the "default" `ServiceAccount` in your namespace; always specify `serviceAccountName` in your `Deployment` manifest.
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: "your-team-namespace"
  name: "your-app-name"
  labels:
    app: "your-app-name"
automountServiceAccountToken: false
```
This `ServiceAccount` is used by our `Deployment` in the next section by specifying `spec.template.spec.serviceAccountName`.
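For reference, this is where that field sits in the manifest (a trimmed excerpt of the full `Deployment` below):

```yaml
# Excerpt: the Pod template in the Deployment references the ServiceAccount by name.
spec:
  template:
    spec:
      serviceAccountName: "your-app-name"
```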
## Deployment
Deployments represent a set of multiple, identical Pods with no unique identities. A Deployment runs multiple replicas of your application and automatically replaces any instances that fail or become unresponsive.
The following Kubernetes Deployment manifest makes some assumptions:
- Your app is running on port `8080`
- Your app exposes metrics and probes on port `8081` with the following endpoints:
  - liveness probe at `/actuator/health/liveness`
  - readiness probe at `/actuator/health/readiness`
  - Prometheus metrics at `/actuator/prometheus`

If these assumptions don't match your app, change the values in the YAML file below accordingly.
The example Deployment uses `strategy.type=RollingUpdate` with `maxSurge=1` and `maxUnavailable=0`. This enables resilient deployments with zero downtime, but only if you have configured a well-functioning `readinessProbe`. Probes enable Kubernetes to know whether your Pod is healthy and ready to receive traffic. Click here to learn more about zero downtime deployments.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: "your-team-namespace"
  name: "your-app-name"
  labels:
    app: "your-app-name"
spec:
  selector:
    matchLabels:
      app: "your-app-name"
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  progressDeadlineSeconds: 300
  revisionHistoryLimit: 5
  template:
    metadata:
      name: "your-app-name"
      labels:
        app: "your-app-name"
        aadpodidbinding: "your-team-managed-identity"
        environment: "dev/test/prod"
        owner: "your-team-name"
        cost-center: "your-cost-center"
        service-code: "your-service-code"
    spec:
      containers:
        - name: "your-app-name"
          image: "gjensidige.azurecr.io/your-team-name/your-app-name:image-tag"
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
              name: "http-container"
              protocol: TCP
            - containerPort: 8081
              name: "metrics"
              protocol: TCP
          securityContext:
            privileged: false
            allowPrivilegeEscalation: false
            runAsNonRoot: true
            runAsUser: 10001
            readOnlyRootFilesystem: true
          resources:
            limits:
              cpu: "500m"
              memory: "400Mi"
            requests:
              cpu: "100m"
              memory: "300Mi"
          livenessProbe:
            httpGet:
              path: "/actuator/health/liveness"
              port: "metrics"
            initialDelaySeconds: 20
            failureThreshold: 3
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: "/actuator/health/readiness"
              port: "metrics"
            initialDelaySeconds: 20
            failureThreshold: 3
            periodSeconds: 10
          lifecycle:
            preStop:
              exec:
                command: ["sh", "-c", "sleep 10"] # Check https://spring.io/guides/topicals/spring-on-kubernetes/
          env:
            - name: "NODE_IP"
              valueFrom:
                fieldRef:
                  fieldPath: "status.hostIP"
            - name: "ENVIRONMENT"
              valueFrom:
                fieldRef:
                  fieldPath: "metadata.labels['environment']"
            - name: "OTEL_SERVICE_NAME"
              valueFrom:
                fieldRef:
                  fieldPath: "metadata.labels['app']"
            - name: "OTEL_EXPORTER_OTLP_ENDPOINT"
              value: "http://$(NODE_IP):4317"
            - name: "OTEL_RESOURCE_ATTRIBUTES"
              value: "deployment.environment=$(ENVIRONMENT)"
          volumeMounts:
            - name: "jwt-config-volume"
              mountPath: "/mnt/jwt-config"
            - name: "cacerts-volume"
              mountPath: "/etc/ssl/certs/java/cacerts"
              subPath: "cacerts"
            - name: "ca-certificates-volume"
              mountPath: "/etc/ssl/certs/ca-certificates.crt"
              subPath: "ca-certificates.crt"
            - name: "secrets-store-inline"
              mountPath: "/mnt/secrets-store"
              readOnly: true
            - name: "tmp" # Needed by the Splunk Observability Tracing agent
              mountPath: "/tmp"
      imagePullSecrets:
        - name: "acr-credentials"
      serviceAccountName: "your-app-name"
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      volumes:
        - name: "jwt-config-volume"
          configMap:
            name: "jwt-config"
        - name: "cacerts-volume"
          configMap:
            name: "cacerts"
        - name: "ca-certificates-volume"
          configMap:
            name: "ca-certificates"
        - name: "secrets-store-inline"
          csi:
            driver: secrets-store.csi.k8s.io
            readOnly: true
            volumeAttributes:
              secretProviderClass: "your-app-name-secrets"
        - name: "tmp"
          emptyDir: {}
```
This Deployment has support for:
- Gjensidige JWT auth, learn configuration details in this guide
- Java Trust Store, learn configuration details in this guide
- Azure AD Pod Identity, learn configuration details in this guide
- Secrets Store CSI Driver, learn configuration details in this guide
- Prometheus Metrics, learn configuration details in this guide
- Splunk Observability Tracing, learn configuration details in this guide
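The probe paths and the metrics endpoint in the manifest assume an app configured as described in "Preparing a Spring Boot App for Kubernetes". As a rough, minimal sketch of what that app-side configuration might look like (Spring Boot Actuator property names; the Prometheus endpoint additionally requires the `micrometer-registry-prometheus` dependency):

```yaml
# application.yaml (Spring Boot) - a minimal sketch, not the full guide's config.
management:
  server:
    port: 8081                # serve probes and metrics on a separate port
  endpoints:
    web:
      exposure:
        include: "health,prometheus"
  endpoint:
    health:
      probes:
        enabled: true         # exposes /actuator/health/liveness and /actuator/health/readiness
  health:
    livenessstate:
      enabled: true
    readinessstate:
      enabled: true
```

If you use another framework, any HTTP endpoints that report liveness, readiness, and Prometheus metrics on port 8081 will satisfy the manifest's assumptions.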
## HorizontalPodAutoscaler
HorizontalPodAutoscaler (HPA) changes the shape of your Kubernetes workload by automatically increasing or decreasing the number of Pods in response to the workload's CPU or memory consumption. The example below scales your deployment between 2 and 4 Pods depending on CPU load. If average CPU utilization exceeds 75% of the requested CPU, the HPA scales up; if it falls below 75%, a scale-down is triggered.
```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  namespace: "your-team-namespace"
  name: "your-app-name"
  labels:
    app: "your-app-name"
spec:
  minReplicas: 2
  maxReplicas: 4
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: "your-app-name"
  targetCPUUtilizationPercentage: 75
```
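The HPA controller uses the documented formula `desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)`; for example, 2 Pods at 90% average CPU with a 75% target gives `ceil(2 * 90 / 75) = 3` Pods. If the `autoscaling/v2` API is available in your cluster (an assumption, check with your platform team), the same target can be expressed as a metrics list, which also opens up memory and custom metrics. A sketch:

```yaml
# Equivalent HPA using the autoscaling/v2 API - same 2-4 replica range and 75% CPU target.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  namespace: "your-team-namespace"
  name: "your-app-name"
spec:
  minReplicas: 2
  maxReplicas: 4
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: "your-app-name"
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 75   # same target as targetCPUUtilizationPercentage above
```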
## Service
A Kubernetes Service exposes the Pods in your Deployment as a network service. The Service below is of type `ClusterIP` and creates an internal IP for your deployment inside the cluster. Kubernetes has an internal DNS which makes your Service reachable at:
`<your-service-name>.<your-namespace>.svc.cluster.local`
Consider the following Kubernetes manifest:
```yaml
apiVersion: v1
kind: Service
metadata:
  namespace: "your-team-namespace"
  name: "your-app-name"
  labels:
    app: "your-app-name"
spec:
  selector:
    app: "your-app-name" # [1]
  type: ClusterIP
  ports:
    - port: 80
      name: "http-service"
      protocol: TCP
      targetPort: "http-container" # [2]
    - port: 8081
      name: "metrics"
      protocol: TCP
      targetPort: "metrics"
```
1. The `selector` must match the `app` label on the Pods in the `Deployment` created earlier
2. `targetPort` must match the name of a container port in your `Deployment`
## Monitoring with ServiceMonitor and PrometheusRule
All of Gjensidige's Kubernetes clusters have the Prometheus Operator installed. A ServiceMonitor resource can be applied to enable Prometheus to automatically discover your metrics endpoint.
The `ServiceMonitor` and `PrometheusRule` resources below are configured using metrics generated by Spring Boot. These metrics will be available at the required endpoints if you have followed the guide "Preparing a Spring Boot App for Kubernetes". If you are using another framework, you need to update the resources to match your Prometheus configuration.
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: "your-app-name"
  namespace: "your-team-namespace"
  labels:
    app: "your-app-name"
spec:
  endpoints:
    - interval: "15s"
      path: "/actuator/prometheus"
      port: "metrics"
  selector:
    matchLabels:
      app: "your-app-name"
```
Now your metrics should be available in Prometheus and Grafana, and you can set up alerts with Alertmanager. To receive alerts from Alertmanager, you first have to define one or more alert rules in Prometheus. Alertmanager will send an alert to your team's Slack channel when one of these rules is triggered. We will use the `PrometheusRule` resource to define a set of alerts for this deployment:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: "your-app-name"
  namespace: "your-team-namespace"
  labels:
    app: "your-app-name"
    role: "alert-rules"
spec:
  groups:
    - name: "your-team-namespace.your-app-name"
      rules:
        - alert: "HighAmountOfHTTPServerErrors" # More than 1% of all requests result in a 5xx response
          annotations:
            description: "High amount of HTTP server errors in '{{ $labels.container }}' in namespace '{{ $labels.namespace }}'"
          expr: "(100 * sum by (container) (rate(http_server_requests_seconds_count{container='your-app-name',namespace='your-team-namespace',status=~'5.+'}[3m])) / sum by (container) (rate(http_server_requests_seconds_count{container='your-app-name',namespace='your-team-namespace'}[3m]))) > 1"
          for: "3m"
          labels:
            severity: "warning"
            namespace: "your-team-namespace"
        - alert: "HighAmountOfHTTPClientErrors" # More than 10% of all requests result in a 4xx response
          annotations:
            description: "High amount of HTTP client errors in '{{ $labels.container }}' in namespace '{{ $labels.namespace }}'"
          expr: "(100 * sum by (container) (rate(http_server_requests_seconds_count{container='your-app-name',namespace='your-team-namespace',status=~'4.+'}[3m])) / sum by (container) (rate(http_server_requests_seconds_count{container='your-app-name',namespace='your-team-namespace'}[3m]))) > 10"
          for: "3m"
          labels:
            severity: "warning"
            namespace: "your-team-namespace"
        - alert: "HighAmountOfErrorsOrWarnsInLogs" # More than 10% of all log entries are of level WARN or ERROR
          annotations:
            description: "High amount of log entries with level WARN or ERROR in '{{ $labels.container }}' in namespace '{{ $labels.namespace }}'"
          expr: "(100 * sum by (container) (rate(logback_events_total{container='your-app-name',namespace='your-team-namespace',level=~'warn|error'}[3m])) / sum by (container) (rate(logback_events_total{container='your-app-name',namespace='your-team-namespace'}[3m]))) > 10"
          for: "3m"
          labels:
            severity: "warning"
            namespace: "your-team-namespace"
```
You now have a basic monitoring setup ready for your deployment. Head over to metrics or alerts to learn more about these topics 🚀
## Ingress
An Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster.
Let's assume "cluster-ingress" is ".apps-int.testgjensidige.io". Then the following Ingress will expose your app on https://your-app-name.apps-int.testgjensidige.io.
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  namespace: "your-team-namespace"
  name: "your-app-name"
  labels:
    app: "your-app-name"
  annotations:
    kubernetes.io/ingress.allow-http: "false"
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
spec:
  tls:
    - hosts:
        - "your-app-name.<cluster-ingress>"
  rules:
    - host: "your-app-name.<cluster-ingress>"
      http:
        paths:
          - path: "/"
            pathType: Prefix
            backend:
              service:
                name: "your-app-name" # [1]
                port:
                  name: "http-service" # [2]
```
1. Must match the name of the `Service` created earlier
2. Must match a port name defined in the `Service`
Make sure you know how to select an available Ingress and how to configure your Ingress correctly. Read about it here.
To access your app from another app within the cluster, you should use http://your-service-name.your-team-namespace. Remember to use http instead of https; Linkerd will ensure encryption for calls within the cluster.
Before trying to connect to your app from another app in the cluster, make sure both apps have set up ingress/egress rules for each other, as sketched below. See overriding the default policy for more information.
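As an illustration of what such an ingress rule can look like, here is a minimal, hypothetical NetworkPolicy sketch. The namespace, labels, and the actual policy mechanism used in Gjensidige's clusters are assumptions here, so follow the linked guide for the real convention:

```yaml
# Hypothetical sketch: allow traffic into "your-app-name" from a client app
# in another namespace. Names and labels are placeholders, and your cluster's
# default policy may be managed differently - see the linked guide.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  namespace: "your-team-namespace"
  name: "allow-other-app-to-your-app"
spec:
  podSelector:
    matchLabels:
      app: "your-app-name"          # the Pods this policy applies to
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: "other-team-namespace"
          podSelector:
            matchLabels:
              app: "other-app-name" # hypothetical client app
```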