What is the concept of Predictable Demands?
To make good use of cluster resources, Kubernetes needs to know the resource requirements of each workload. The Predictable Demands pattern is how an application declares those requirements, so that Kubernetes can understand its resource profile and make smart decisions about container placement. It also helps developers with capacity planning.
Predictable Demands essentially means that the declaration of each Kubernetes entity carries the complete context of its runtime dependencies and resource profile.
What is a Resource Profile?
A resource in K8s is something that can be requested by, allocated to, and consumed by a container. Requests and limits are the mechanisms Kubernetes uses to manage resources such as CPU and memory. A request is the amount of a resource that a container is guaranteed to be allocated. A limit, on the other hand, is the maximum a container may use; anything above the request depends on what is available on the node. They are analogous to soft and hard limits.
CPU is a compressible resource (i.e., it can be throttled), and memory is an incompressible resource. If a container consumes too much of a compressible resource, it is not killed but throttled. By contrast, consuming too much of an incompressible resource gets the container killed (out of memory) or the Pod evicted.
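As a rough illustration of the requests and limits described above, the sketch below builds a Pod manifest as a plain Python dictionary and prints it as JSON. The Pod name, image, and quantities are hypothetical and chosen only for the example. The `resources.requests` and `resources.limits` fields are the standard container fields.

```python
import json

# Hypothetical Pod: the name, image, and quantities are illustrative only.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo-app"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "example.com/demo-app:1.0",
                "resources": {
                    # Requests: the amount the scheduler reserves on a node.
                    "requests": {"cpu": "250m", "memory": "128Mi"},
                    # Limits: the ceiling. CPU use above it is throttled;
                    # memory use above it gets the container OOM-killed.
                    "limits": {"cpu": "500m", "memory": "256Mi"},
                },
            }
        ]
    },
}

if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(pod, indent=2))
```

Here the request acts as the soft limit and the limit as the hard one: the container can burst from 250m up to 500m of CPU when the node has spare capacity.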
Why are Resource Requests critical?
A resource request is a contract, and it is central to the decision-making process of the K8s Scheduler. It is therefore highly recommended that each Pod declare resource requests and limits. Based on whether requests, limits, or both are specified, Kubernetes assigns the Pod a Quality of Service (QoS) class.
Best-Effort: The Pod has neither requests nor limits specified for its containers. Such a Pod is considered lowest priority and is evicted first when the node is overwhelmed.
Burstable: The Pod has requests or limits defined, but they are not equal for every container (limits are generally higher than requests). Such Pods have minimal resource guarantees but can consume more resources when available. They are likely to be killed under node pressure once no Best-Effort Pods remain.
Guaranteed: Every container in the Pod has requests equal to limits for both CPU and memory. These are the highest-priority Pods and are generally killed only when no Best-Effort or Burstable Pods remain.
The trend is that the more certain the resource contract, the more stability the Pod enjoys, which is what makes resource requests crucial.
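The three QoS classes above can be approximated with a short function. This is a simplified sketch and not the kubelet's actual code. The function name `qos_class` and the input shape are hypothetical, only CPU and memory are considered, and quantities are compared as plain strings (so "500m" and "0.5" would wrongly count as different).

```python
from typing import Dict, List

# One entry per container: {"requests": {...}, "limits": {...}} (hypothetical shape).
Resources = Dict[str, Dict[str, str]]


def qos_class(containers: List[Resources]) -> str:
    """Approximate a Pod's QoS class from its containers' CPU/memory settings."""
    any_set = False         # does any container set any request or limit?
    all_guaranteed = True   # does every container have request == limit for both?
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            req, lim = requests.get(resource), limits.get(resource)
            if req is not None or lim is not None:
                any_set = True
            # Kubernetes defaults a missing request to the limit, if one is set.
            if req is None:
                req = lim
            if lim is None or req != lim:
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"
```

For example, a Pod whose single container requests 100m CPU and 64Mi memory with limits of 200m and 128Mi would be classified as Burstable, while the same Pod with limits equal to its requests would be Guaranteed.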
Resource Requests gone wrong:
There are two ways to deviate from the ideal resource request: requesting too little or requesting too much. Neither is a good situation to be in, for the following reasons.
If we request fewer resources than the Pod needs, the typical results are CPU starvation and out-of-memory issues, which lead to workload evictions.
If we request too much, we end up using more nodes than required, which leads to resource wastage and added cost.
While a Kubernetes cluster might work fine without resource requests and limits, stability issues start to appear as complexity grows, which makes adding resource requests to Pods essential.
To build discipline around setting resource requests, and to keep workloads from setting them too high and taking more than their fair share of the cluster, one can use ResourceQuota and LimitRange objects at the namespace level.
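The cost of over-requesting can be made concrete with some back-of-the-envelope arithmetic. The helper below is a hypothetical sketch. It assumes identical Pods, identical nodes, and scheduling driven purely by requests. It ignores system reservations and other workloads, and all the numbers are made up for illustration.

```python
import math


def nodes_needed(replicas: int, cpu_request_m: int, mem_request_mi: int,
                 node_cpu_m: int, node_mem_mi: int) -> int:
    """Nodes required when identical Pods are packed by their requests."""
    # A node fits as many Pods as its scarcer resource allows.
    pods_per_node = min(node_cpu_m // cpu_request_m,
                        node_mem_mi // mem_request_mi)
    if pods_per_node == 0:
        raise ValueError("a single Pod does not fit on a node")
    return math.ceil(replicas / pods_per_node)


if __name__ == "__main__":
    # 40 replicas on nodes with 4000m CPU and 16384Mi memory allocatable.
    right_sized = nodes_needed(40, 250, 512, 4000, 16384)
    over_requested = nodes_needed(40, 1000, 2048, 4000, 16384)
    print(right_sized, over_requested)  # prints "3 10"
```

Under these assumptions, requesting four times what the workload needs more than triples the node count, even though actual usage is unchanged.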
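A minimal sketch of what such namespace-level guardrails might look like is shown below, again as Python dictionaries dumped to JSON. The namespace `team-a`, the object names, and every quantity are hypothetical. The field names (`spec.hard` for ResourceQuota; `default`, `defaultRequest`, `min`, and `max` for LimitRange) are the standard ones.

```python
import json

NAMESPACE = "team-a"  # hypothetical namespace

# Caps the total requests and limits of all Pods in the namespace.
resource_quota = {
    "apiVersion": "v1",
    "kind": "ResourceQuota",
    "metadata": {"name": "compute-quota", "namespace": NAMESPACE},
    "spec": {
        "hard": {
            "requests.cpu": "4",
            "requests.memory": "8Gi",
            "limits.cpu": "8",
            "limits.memory": "16Gi",
        }
    },
}

# Bounds each container and fills in defaults when a Pod omits them.
limit_range = {
    "apiVersion": "v1",
    "kind": "LimitRange",
    "metadata": {"name": "container-bounds", "namespace": NAMESPACE},
    "spec": {
        "limits": [
            {
                "type": "Container",
                "defaultRequest": {"cpu": "100m", "memory": "128Mi"},
                "default": {"cpu": "500m", "memory": "256Mi"},
                "min": {"cpu": "50m", "memory": "64Mi"},
                "max": {"cpu": "2", "memory": "2Gi"},
            }
        ]
    },
}

if __name__ == "__main__":
    for manifest in (resource_quota, limit_range):
        print(json.dumps(manifest, indent=2))
```

The quota bounds the namespace as a whole, while the limit range keeps any single container from claiming an outsized share and supplies defaults for Pods that declare nothing.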