In a previous post we saw the differences between horizontal and vertical scaling in Kubernetes. GKE, Google’s managed Kubernetes service, offers a third option: multidimensional scaling.

To avoid having to configure two different types of scaling separately, Google offers multidimensional Pod autoscaling in GKE; that is, you can use CPU-based horizontal scaling and memory-based vertical scaling at the same time. A MultidimPodAutoscaler object modifies memory requests and adds replicas so that the average CPU usage of each replica matches the target usage.
However, if we have a Standard cluster (and not Autopilot), we first have to enable vertical Pod autoscaling on the cluster, since it is disabled by default in Standard clusters.
An example implementation would look like this:
apiVersion: autoscaling.gke.io/v1beta1
kind: MultidimPodAutoscaler
metadata:
  name: php-apache-autoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  goals:
    metrics:
    - type: Resource
      resource:
        # Define the target CPU utilization request here
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
  constraints:
    global:
      minReplicas: 1
      maxReplicas: 5
    containerControlledResources: [ memory ]
    container:
    - name: '*'
      # Define boundaries for the memory request here
      requests:
        minAllowed:
          memory: 1Gi
        maxAllowed:
          memory: 2Gi
  policy:
    updateMode: Auto
As you can see, the manifest uses a GKE-specific object type (offered only by Google) called “MultidimPodAutoscaler”. It sets a minimum and maximum number of replicas (1 and 5) and a minimum and maximum memory request (1Gi and 2Gi), and it adds or removes replicas to keep the average CPU usage at 60%.
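To get a feel for how the replica count reacts to CPU load, here is a minimal Python sketch. It assumes the horizontal part of the MultidimPodAutoscaler behaves like the formula documented for the standard Kubernetes Horizontal Pod Autoscaler (desired = ceil(current replicas × current usage / target usage)); the source does not confirm this for GKE, and the sketch ignores the tolerance window and stabilization delays that a real autoscaler applies. The function name `desired_replicas` is hypothetical; the defaults (60%, 1 to 5 replicas) come from the manifest above.

```python
import math


def desired_replicas(current_replicas, current_cpu_utilization,
                     target_cpu_utilization=60,
                     min_replicas=1, max_replicas=5):
    """Estimate the replica count for a given average CPU utilization.

    Assumption: uses the standard HPA formula from the Kubernetes docs,
    not a confirmed description of GKE's MultidimPodAutoscaler internals.
    Utilization values are percentages of the CPU request (e.g. 90 for 90%).
    """
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if target_cpu_utilization <= 0:
        raise ValueError("target_cpu_utilization must be positive")

    # Scale the replica count by the ratio of observed to target usage.
    raw = math.ceil(
        current_replicas * current_cpu_utilization / target_cpu_utilization
    )
    # Clamp the result to the minReplicas/maxReplicas constraints.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    print(desired_replicas(2, 90))   # 3: load is above target, scale out
    print(desired_replicas(4, 120))  # 5: capped by maxReplicas
    print(desired_replicas(3, 10))   # 1: load is low, scale in to minReplicas
```

Memory is handled separately: under the same constraints, the vertical part would adjust each container's memory request somewhere between 1Gi and 2Gi.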
If you want to know more about this kind of autoscaler, I invite you to take a look at the official Google documentation.