This article assumes some basic knowledge of Kubernetes. If you need an introduction to Kubernetes, you can look at:

Why do we need to know about requests and limits?

If we tried to run a pod that needs 8GB of memory on a host that only has 4GB of memory, the host would run out of memory and processes would start being killed. In this article we are going to learn how to specify how many resources a pod needs so the scheduler can make better decisions about where to run pods.

Resources

There are two main resources Kubernetes supports: CPU and memory. What they are is self-explanatory, but how they are specified deserves a little more attention.

CPU is specified as the amount of CPU cores we need for our containers. We can use 1 to say we need one core or 5 to say we need five cores.

It’s also possible to request fractions of a core, down to a single millicore using 1m. Some examples:

  • One core: 1
  • One core: 1000m
  • Half a core: 500m
  • A tenth of a core: 100m
  • Ten cores: 10
  • One and a half cores: 1500m
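As a sketch, here is a fragment of a container spec using the millicore notation (Kubernetes also accepts the equivalent decimal form, shown in the comment):

```yaml
# Fragment of a container spec: two equivalent ways to write half a core.
resources:
  requests:
    cpu: 500m    # millicore notation
    # cpu: "0.5" # decimal notation, equivalent to 500m
```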

Memory is specified in bytes. When specifying memory, we can use the following suffixes:

  • k - Kilobytes (1k = 1000)
  • M - Megabytes (1M = 1000k)
  • G - Gigabytes (1G = 1000M)
  • T - Terabytes (1T = 1000G)
  • P - Petabytes (1P = 1000T)
  • Ki - Kibibytes (1Ki = 1024)
  • Mi - Mebibytes (1Mi = 1024Ki)
  • Gi - Gibibytes (1Gi = 1024Mi)
  • Ti - Tebibytes (1Ti = 1024Gi)
  • Pi - Pebibytes (1Pi = 1024Ti)

I was surprised, but it seems that at some point in history 1 kilobyte stopped meaning 1024 bytes and now means 1000 bytes. The correct term for 1024 bytes is kibibyte: https://en.wikipedia.org/wiki/Byte#Multiple-byte_units

Here are some examples:

  • One mebibyte: 1Mi
  • Five gigabytes: 5G
  • Five hundred bytes: 500
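To see the difference between the decimal and binary suffixes in practice, here is a sketch of a container spec fragment (the values are illustrative):

```yaml
# Mi is a binary (power of 1024) suffix; M is a decimal (power of 1000) one.
resources:
  requests:
    memory: 128Mi   # 128 * 1024 * 1024 = 134,217,728 bytes
    # memory: 128M  # would be 128 * 1000 * 1000 = 128,000,000 bytes
```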

Even though the scheduling unit in Kubernetes is a pod, resources are specified per container. The resources that a pod needs are just the sum of the resources all the containers in the pod need.
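For example, in this hypothetical two-container pod the pod's effective request is the sum of its containers' requests: 300m of CPU and 384Mi of memory (the pod name and images are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar   # hypothetical name
spec:
  containers:
  - name: web
    image: nginx           # illustrative image
    resources:
      requests:
        cpu: 200m
        memory: 256Mi
  - name: log-sidecar
    image: busybox         # illustrative image
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
```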

Requests

Requests are specified for containers using the resources option. For example:

apiVersion: v1
kind: Pod
metadata:
  name: echo-server
spec:
  containers:
  - image: ealen/echo-server
    name: echo-server
    resources:
      requests:
        memory: 100Mi
        cpu: 100m

Requests are used by the Kubernetes scheduler to decide on which host (node) a pod will be scheduled. A pod like the one specified above will only be scheduled on hosts with at least 100Mi of memory and 100m of CPU available.

Another thing to keep in mind is that requests are guaranteed: the scheduler never promises the same resources to two pods. If we have a host with 2Gi of memory, it can schedule two pods requesting 1Gi each. Each pod is guaranteed to get its 1Gi, but anything above that is not: a pod may temporarily use spare memory beyond its request, but if the node runs low on memory, pods using more than they requested are the first candidates to be evicted.

With CPU it’s a little different, because CPU time can be reallocated immediately on request. If two pods are running on a host with 1 core and each of them requests 500m, either pod can use the whole core while the other one doesn’t need it. If at some point both pods are busy and need 100% of their requested CPU, each will get its requested share (each pod would get half of the host’s CPU time).
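This bursting behavior follows from setting a CPU request without a CPU limit. As a sketch, a container spec fragment like the one below is guaranteed half a core when the node is busy, but may use the whole core when other workloads are idle:

```yaml
# Half a core guaranteed; no cpu limit, so bursting above the request
# is allowed when the node has idle CPU time.
resources:
  requests:
    cpu: 500m
```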

Limits

Limits are specified similarly to requests:

apiVersion: v1
kind: Pod
metadata:
  name: echo-server
spec:
  containers:
  - image: ealen/echo-server
    name: echo-server
    resources:
      limits:
        memory: 100Mi
        cpu: 100m

If limits are specified but no requests, the requests are set to the same values as the limits. If no limits are specified, the container is unbounded and can use all the resources available on the host.
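So the manifest above, which only sets limits, behaves as if requests had been written out explicitly:

```yaml
# Only limits are given; Kubernetes copies them into requests, so this
# fragment is equivalent to also writing requests: 100Mi / 100m.
resources:
  limits:
    memory: 100Mi
    cpu: 100m
  # effective requests: memory: 100Mi, cpu: 100m
```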

While requests are used to make scheduling decisions, limits are used to cap how many resources a container can use (this is typically enforced using cgroups).

If we set a CPU limit for a container, the OS kernel will not allocate more CPU time to it than the limit, even if there is idle CPU time available (the container is throttled instead).

If we set a memory limit and a container tries to use more memory than that limit, the process running in the container will most likely be killed by the OOM killer and the container will be restarted.
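A hypothetical pod to observe this behavior is sketched below. The image and its flags are assumptions (any memory-hungry process would do): the container tries to allocate about 100M, well above its 50Mi limit, so it should end up in the OOMKilled state and be restarted:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: oom-demo               # hypothetical name
spec:
  containers:
  - name: stress
    image: polinux/stress      # assumed stress-testing image
    command: ["stress"]
    # Allocate ~100M in one worker, exceeding the 50Mi limit below.
    args: ["--vm", "1", "--vm-bytes", "100M"]
    resources:
      limits:
        memory: 50Mi
```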

Practical examples

If you want to see requests and limits in action, take a look at: Resource management in Kubernetes

You’ll find some examples you can run on your own. Follow the instructions in the README and try changing some things to see what happens.

Conclusion

In this post we learned the importance of setting requests and limits for our containers. In a production environment it’s recommended to set these values for all our containers so it’s easier to understand the scheduler’s decisions.
