This article assumes some basic knowledge of Kubernetes. If you need an introduction to Kubernetes you can look at:
- Playing with Kubernetes locally with Minikube
- Introduction to Kubernetes
- Managing Kubernetes Objects With Yaml Configurations
Why do we need to know about requests and limit
If we tried to run a pod that requires 8GB of memory in a host that only has 4GB of memory, the host would run out of memory and it would crash. In this article we are going to learn how to specify how much resources a pod needs so the scheduler can make better decisions about where to run pods.
There are two main resources Kubernetes supports: CPU and memory. What they are is self explanatory, but how they are specified deserves a little more attention.
CPU is specified as the amount of CPU cores we need for our containers. We can use
1 to say we need one core or
5 to say we need five cores.
It’s also possible to request fractions of a core, down to a single milicore using
1m. Some examples:
- One core:
- One core:
- Half a core:
- A tenth of a core:
- Ten cores:
- One and a half cores:
Memory is specified in bytes. When specifying memory, we can use the following suffixes:
k- Kilobytes (
M- Megabytes (
G- Gigabytes (
T- Terabytes (
P- Petabytes (
Ki- Kibibytes (
Mi- Mebibytes (
Gi- Gibibytes (
Ti- Tebibytes (
Pi- Petibytes (
I was surprised, but It seems that at some point in history 1 kilobyte stopped meaning
1024 bytes and now it means
1000 bytes. The correct term for
1024 bytes is kibibyte: https://en.wikipedia.org/wiki/Byte#Multiple-byte_units
Here are some examples:
- One mebibyte:
- Five gigabytes:
- Five hunded bytes:
Even though the scheduling unit in Kubernetes is a pod, resources are specified per container. The resources that a pod needs is just the sum of the resources all the containers in the pod need.
Requests are specified for containers using the
resources option. For example:
1 2 3 4 5 6 7 8 9 10 11 12 apiVersion: apps/v1 kind: Pod metadata: name: echo-server spec: containers: - image: ealen/echo-server name: echo-server resources: requests: memory: 100Mi cpu: 100m
Requests are used by the Kubernetes scheduler to decide in which host (node) a pod will be scheduled. A pod like the one specified above will only be scheduled in hosts with
100Mi of memory and
100m of CPU available.
Another thing to keep in mind is that requests are guaranteed. This means, if we have a host with
2Gi of memory, the scheduler can schedule two pods requesting
1Gi each. If one pod is not using much memory and the other pod needs more than the requested
1Gi, it will not be allowed to use more since the rest of the memory belongs to the other pod.
With CPU, it’s a little different because even if one pod is allowed to use the CPU allocated to another pod, CPU can be made immediately available on request. If two pods are running in a host with
1 core and each of them requests
500m cores. Any pod can use the whole
1 core if the other one doesn’t need it. If at some point both pods are very busy and need to use 100% of their requested CPU, each will get their requested time (Each pod would get half of the Host’s CPU time).
Limits are specified similarly to requests:
1 2 3 4 5 6 7 8 9 10 11 12 apiVersion: apps/v1 kind: Pod metadata: name: echo-server spec: containers: - image: ealen/echo-server name: echo-server resources: limits: memory: 100Mi cpu: 100m
limits are specified, but no
requests will be set to the same values as
limits. If no
limits is specified, the limits are set to
no limits (i.e. Can use all resources in the host).
requests are used to make scheduling decisions,
limits are used to
limit how much resources a container can use (this is typically enforced using cgroups).
If we set a CPU limit for a container, the OS Kernel will not allocate more CPU time even if there is CPU time available.
If we set a memory limit and a container tries to use more memory than this limit, the process running in the container will most likely be killed and the container will be restarted.
If you want to see requests and limits in action, take a look at: Resource management in Kubernetes
You’ll find some examples you can run on your own. Follow the instructions in the README and try changing things some see what happens.
In this post we learned the importance of setting requests and limits for our containers. In a production environment it’s recommended to set these values for all our containers so it’s easier to undestand the scheduler’s decisions.