# Kubernetes cheatsheet - [Kubernetes cheatsheet](#kubernetes-cheatsheet) - [Getting Started](#getting-started) - [Sample yaml](#sample-yaml) - [Workflow](#workflow) - [Physical components](#physical-components) - [Master](#master) - [Node](#node) - [Everything is an object - persistent entities](#everything-is-an-object---persistent-entities) - [Namespaces](#namespaces) - [Labels](#labels) - [ClusterIP](#clusterip) - [Controller manager](#controller-manager) - [Kube-scheduler](#kube-scheduler) - [Pod](#pod) - [Status](#status) - [Probe](#probe) - [Pod priorities](#pod-priorities) - [Multi-Container Pods](#multi-container-pods) - [Init containers](#init-containers) - [Lifecycle hooks](#lifecycle-hooks) - [Quality of Service (QoS)](#quality-of-service-qos) - [PodPreset](#podpreset) - [ReplicaSet](#replicaset) - [Deployments](#deployments) - [ReplicationController](#replicationcontroller) - [DaemonSet](#daemonset) - [StatefulSet](#statefulset) - [Job (batch/v1)](#job-batchv1) - [Cronjob](#cronjob) - [Horizontal pod autoscaler](#horizontal-pod-autoscaler) - [Services](#services) - [Volumes](#volumes) - [Persistent volumes](#persistent-volumes) - [Role-Based Access Control (RBAC)](#role-based-access-control-rbac) - [Custom Resource Definitions](#custom-resource-definitions) - [Notes](#notes) - [Basic commands](#basic-commands) - [jsonpath](#jsonpath) - [Resource limit](#resource-limit) - [CPU](#cpu) - [Memory](#memory) - [Chapter 13. Integrating storage solutions and Kubernetes](#chapter-13-integrating-storage-solutions-and-kubernetes) - [Downward API](#downward-api) - [Labs](#labs) - [Guaranteed Scheduling For Critical Add-On Pods](#guaranteed-scheduling-for-critical-add-on-pods) - [Set command or arguments via env](#set-command-or-arguments-via-env) ## Getting Started - Fault tolerance - Rollback - Auto-healing - Auto-scaling - Load-balancing - Isolation (sandbox) ## Sample yaml ```yaml apiVersion: <> kind: <> metadata: name: <> labels: ... annotations: ... spec: containers: ... initContainers: ... priorityClassName: <> ``` ## Workflow - (kube-scheduler, controller-manager, etcd) --443--> API Server - API Server --10055--> kubelet - non-verified certificate - MITM - Solution: - set kubelet-certificate-authority - ssh tunneling - API server --> (nodes, pods, services) - Plain HTTP (unsafe) ## Physical components ### Master - API Server (443) - kube-scheduler - controller-manager - cloud-controller-manager - kube-controller-manager - etcd Other components talk to API server, no direct communication ### Node - Kubelet - Container Engine - CRI - The protocol which used to connect between Kubelet & container engine - Kube-proxy ## Everything is an object - persistent entities - maintained in etcd, identified using - names: client-given - UIDs: system-generated - Both need to be unique - three management methods - Imperative commands (kubectl) - Imperative object configuration (kubectl + yaml) - repeatable - observable - auditable - Declarative object configuration (yaml + config files) - Live object configuration - Current object configuration file - Last-applied object configuration file ```text Node Capacity --------------------------- | kube-reserved | |---------------------------| | system-reserved | | ------------------------- | | eviction-threshold | | ------------------------- | | | | allocatable | | (available for pods) | | | | | --------------------------- ``` ### Namespaces - Three pre-defined - default - kube-system - kube-public: auto-readable by all users - Objects without namespaces - Nodes - PersistentVolumes - Namespaces ### Labels - key / value - loose coupling via selectors - need not be unique #### ClusterIP - Independent of lifespan of any backend pod - Service object has a static port assigned to it ### Controller manager - ReplicaSet, deployment, daemonset, statefulSet - Actual state <-> desired state - reconciliation loop ### Kube-scheduler - nodeSelector - Affinity & Anti-Affinity - Node - Steer pod to node - Pod - Steer pod towards or away from pods - Taints & tolerations (anti-affinity between node and pod!) - Base on predefined configuration (env=dev:NoSchedule) ```yaml ... tolerations: - key: "dev" operator: "equal" value: "env" effect: NoSchedule ... ``` - Base on node condition (alpha in v1.8) - taints added by node controller ### Pod ```bash kubectl run name --image= ``` What's available inside the container? - File system - Image - Associated Volumes - ordinary - persistent - Container - Hostname - Pod - Pod name - User-defined envs - Services - List of all services Access with: - Symlink (important): - /etc/podinfo/labels - /etc/podinfo/annotations - Or: ```yaml volumes: - name: podinfo downwardAPI: items: - path: "labels" fieldRef: fieldPath: metadata.labels - path: "annotations" fieldRef: fieldPath: metadata.annotations ``` #### Status - Pending - Running - Succeeded - Failed - Unknown #### Probe - Liveness - Failed? Restart policy applied - Readiness - Failed? Removed from service #### Pod priorities - available since 1.8 - PriorityClass object - Affect scheduling order - High priority pods could jump the queue - Preemption - Low priority pods could be pre-empted to make way for higher one (if no node is available for high priority) - These preempted pods would have a graceful termination period #### Multi-Container Pods - Share access to memory space - Connect to each other using localhost - Share access to the same volume - entire pod is host on the same node - all in or nothing - no auto healing or scaling #### Init containers - run before app containers - always run to completion - run serially #### Lifecycle hooks - PostStart - PreStop (blocking) Handlers: - Exec - HTTP ```yaml ... spec: containers: lifecycle: postStart: exec: command: <> preStop: http: ... ``` Could invoke multiple times #### Quality of Service (QoS) When Kubernetes creates a Pod it assigns one of these QoS classes to the Pod: - Guaranteed (all containers have limits == requests) >If a Container specifies its own memory limit, but does not specify a memory request, Kubernetes automatically assigns a memory request that matches the limit. Similarly, if a Container specifies its own cpu limit, but does not specify a cpu request, Kubernetes automatically assigns a cpu request that matches the limit. - Burstable (at least 1 has limits or requests) - BestEffort (no limits or requests) #### PodPreset You can use a podpreset object to inject information like secrets, volume mounts, and environment variables etc into pods at creation time. This task shows some examples on using the PodPreset resource ```yaml apiVersion: settings.k8s.io/v1alpha1 kind: PodPreset metadata: name: allow-database spec: selector: matchLabels: role: frontend env: - name: DB_PORT value: "6379" volumeMounts: - mountPath: /cache name: cache-volume volumes: - name: cache-volume emptyDir: {} ``` ### ReplicaSet Features: - Scaling and healing - Pod template - number of replicas Components: - Pod template - Pod selector (could use matchExpressions) - Label of replicaSet - Number of replica - Could delete replicaSet without its pods using `--cascade =false` - Isolating pods from replicaSet by changing its labels ### Deployments - versioning and rollback - Contains spec of replicaSet within it - advanced deployment - blue-green - canary - Update containers --> new replicaSet & new pods created --> old RS still exists --> reduced to zero - Every change is tracked - Append `--record` in kubectl to keep history - Update strategy - Recreate - Old pods would be killed before new pods come up - RollingUpdate - progressDeadlineSeconds - minReadySeconds - rollbackTo - revisionHistoryLimit - paused - spec.Paused - `kubectl rollout undo deployment/<> --to-revision=<>` - `kubectl rollout statua deployment/<>` - `kubectl set image deployment/<> <>=<>:<>` - `kubectl rollout resume/pause <>` ### ReplicationController - RC = ( RS + deployment ) before - Obsolete ### DaemonSet - Ensure all nodes run a copy of pod - Cluster storage, log collection, node monitor ... ### StatefulSet - Maintains a sticky identity - Not interchangeable - Identifier maintains across any rescheduling Limitation - volumes must be pre-provisioned - Deleting / Scaling will not delete associated volumes Flow - Deployed 0 --> (n-1) - Deleted (n-1) --> 0 (successor must be completely shutdown before proceed) - Must be all ready and running before scaling happens ### Job (batch/v1) - Non-parallel jobs - Parallel jobs - Fixed completion count - job completes when number of completions reaches target - With work queue - requires coordination - Use spec.activeDeadlineSeconds to prevent infinite loop ### Cronjob - Job should be idempotent ### Horizontal pod autoscaler - Targets: replicaControllers, deployments, replicaSets - CPU or custom metrics - Won't work with non-scaling objects: daemonSets - Prevent thrashing (upscale/downscale-delay) ### Services Credit: https://www.reddit.com/user/__brennerm/ ![](https://i.redd.it/brjcbq9xk7261.png) - Logical set of backend pods + frontend - Frontend: static IP + port + dns name - Backend: set of backend pods (via selector) - Static IP and networking. - Kube-proxy route traffic to VIP. - Automatically create endpoint based on selector. - CluterIP - NodePort - external --> NodeIP + NodePort --> kube-proxy --> ClusterIP - LoadBalancer - Need to have cloud-controller-manager - Node controller - Route controller - Service controller - Volume controller - external --> LB --> NodeIP + NodePort --> kube-proxy --> ClusterIP - ExternalName - Can only resolve with kube-dns - No selector `Service discovery` - SRV record for named port - port-name.port-protocol.service-name.namespace.svc.cluster.local - Pod domain - pod-ip-address.namespace.pod.cluster.local - hostname is `metadata.name` `spec.dnsPolicy` - default - inherit node's name resolution - ClusterFirst - Any DNS query that does not match the configured cluster domain suffix, such as “www.kubernetes.io”, is forwarded to the upstream nameserver inherited from the node - ClusterFirstWithHostNet - if host network = true - None (since k8s 1.9) - Allow custom dns server usage Headless service - with selector? --> associate with pods in cluster - without selector? --> forward to externalName Could specify externalIP to service ### Volumes Credit: https://www.reddit.com/user/__brennerm/ ![](https://i.redd.it/iaflueca8m261.png) Lifetime longer than any containers inside a pod. 4 types: - configMap - emptyDir - share space / state across containers in same pod - containers can mount at different times - pod crash --> data lost - container crash --> ok - gitRepo - secret - store on RAM - hostPath #### Persistent volumes ### Role-Based Access Control (RBAC) Credit: https://www.reddit.com/user/__brennerm/ ![](https://i.redd.it/868lf3pp70361.png) - Role - Apply on namespace resources - ClusterRole - cluster-scoped resources (nodes,...) - non-resources endpoint (/healthz) - namespace resources across all namespaces ### Custom Resource Definitions CustomResourceDefinitions themselves are non-namespaced and are available to all namespaces. ```yaml apiVersion: apiextensions.k8s.io/v1beta1 kind: CustomResourceDefinition metadata: # name must match the spec fields below, and be in the form: . name: crontabs.stable.example.com spec: # group name to use for REST API: /apis// group: stable.example.com # version name to use for REST API: /apis// version: v1 # either Namespaced or Cluster scope: Namespaced names: # plural name to be used in the URL: /apis/// plural: crontabs # singular name to be used as an alias on the CLI and for display singular: crontab # kind is normally the CamelCased singular type. Your resource manifests use this. kind: CronTab # shortNames allow shorter string to match your resource on the CLI shortNames: - ct # categories is a list of grouped resources the custom resource belongs to. categories: - all validation: # openAPIV3Schema is the schema for validating custom objects. openAPIV3Schema: properties: spec: properties: cronSpec: type: string pattern: '^(\d+|\*)(/\d+)?(\s+(\d+|\*)(/\d+)?){4}$' replicas: type: integer minimum: 1 maximum: 10 # subresources describes the subresources for custom resources. subresources: # status enables the status subresource. status: {} # scale enables the scale subresource. scale: # specReplicasPath defines the JSONPath inside of a custom resource that corresponds to Scale.Spec.Replicas. specReplicasPath: .spec.replicas # statusReplicasPath defines the JSONPath inside of a custom resource that corresponds to Scale.Status.Replicas. statusReplicasPath: .status.replicas # labelSelectorPath defines the JSONPath inside of a custom resource that corresponds to Scale.Status.Selector. labelSelectorPath: .status.labelSelector ``` ## Notes ### Basic commands ```bash # show current context kubectl config current-context # get specific resource kubectl get (pod|svc|deployment|ingress) # Get pod logs kubectl logs -f # Get nodes list kubectl get no -o custom-columns=NAME:.metadata.name,AWS-INSTANCE:.spec.externalID,AGE:.metadata.creationTimestamp # Run specific command | Drop to shell kubectl exec -it # Describe specific resource kubectl describe (pod|svc|deployment|ingress) # Set context kubectl config set-context $(kubectl config current-context) --namespace= # Run a test pod kubectl run -it --rm --generator=run-pod/v1 --image=alpine:3.6 tuan-shell -- sh ``` - from @so0k [link](https://gist.github.com/so0k/42313dbb3b547a0f51a547bb968696ba#gistcomment-2040702) - access dashboard ```bash # bash kubectl -n kube-system port-forward $(kubectl get pods -n kube-system -o wide | grep dashboard | awk '{print $1}') 9090 # fish kubectl -n kube-system port-forward (kubectl get pods -n kube-system -o wide | grep dashboard | awk '{print $1}') 9090 ``` ### jsonpath From [link](https://github.com/kubernetes/website/blob/master/content/en/docs/reference/kubectl/jsonpath.md) ```json { "kind": "List", "items":[ { "kind":"None", "metadata":{"name":"127.0.0.1"}, "status":{ "capacity":{"cpu":"4"}, "addresses":[{"type": "LegacyHostIP", "address":"127.0.0.1"}] } }, { "kind":"None", "metadata":{"name":"127.0.0.2"}, "status":{ "capacity":{"cpu":"8"}, "addresses":[ {"type": "LegacyHostIP", "address":"127.0.0.2"}, {"type": "another", "address":"127.0.0.3"} ] } } ], "users":[ { "name": "myself", "user": {} }, { "name": "e2e", "user": {"username": "admin", "password": "secret"} } ] } ``` | Function | Description | Example | Result | |-------------------|---------------------------|---------------------------------------------------------------|-------------------------------------------------| | text | the plain text | kind is {.kind} | kind is List | | @ | the current object | {@} | the same as input | | . or [] | child operator | {.kind} or {['kind']} | List | | .. | recursive descent | {..name} | 127.0.0.1 127.0.0.2 myself e2e | | \* | wildcard. Get all objects | {.items[*].metadata.name} | [127.0.0.1 127.0.0.2] | | [start:end :step] | subscript operator | {.users[0].name} | myself | | [,] | union operator | {.items[*]['metadata.name', 'status.capacity']} | 127.0.0.1 127.0.0.2 map[cpu:4] map[cpu:8] | | ?() | filter | {.users[?(@.name=="e2e")].user.password} | secret | | range, end | iterate list | {range .items[*]}[{.metadata.name}, {.status.capacity}] {end} | [127.0.0.1, map[cpu:4]] [127.0.0.2, map[cpu:8]] | | '' | quote interpreted string | {range .items[*]}{.metadata.name}{'\t'}{end} | 127.0.0.1 127.0.0.2 | Below are some examples using jsonpath: ```shell $ kubectl get pods -o json $ kubectl get pods -o=jsonpath='{@}' $ kubectl get pods -o=jsonpath='{.items[0]}' $ kubectl get pods -o=jsonpath='{.items[0].metadata.name}' $ kubectl get pods -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.startTime}{"\n"}{end}' ``` ### Resource limit #### CPU The CPU resource is measured in cpu units. One cpu, in Kubernetes, is equivalent to: - 1 AWS vCPU - 1 GCP Core - 1 Azure vCore - 1 Hyperthread on a bare-metal Intel processor with Hyperthreading #### Memory The memory resource is measured in bytes. You can express memory as a plain integer or a fixed-point integer with one of these suffixes: E, P, T, G, M, K, Ei, Pi, Ti, Gi, Mi, Ki. For example, the following represent approximately the same value: 128974848, 129e6, 129M , 123Mi ### Chapter 13. Integrating storage solutions and Kubernetes - External service without selector (access with `external-database.svc.default.cluster` endpoint) ```yaml kind: Service apiVersion: v1 metadata: name: external-database spec: type: ExternalName externalName: "database.company.com ``` - external service with IP only ```yaml kind: Service apiVersion: v1 metadata: name: external-ip-database --- kind: Endpoints apiVersion: v1 metadata: name: external-ip-database subsets: - addresses: - ip: 192.168.0.1 ports: - port: 3306 ``` #### Downward API The following information is available to containers through environment variables and downwardAPI volumes: Information available via fieldRef: - spec.nodeName - the node’s name - status.hostIP - the node’s IP - metadata.name - the pod’s name - metadata.namespace - the pod’s namespace - status.podIP - the pod’s IP address - spec.serviceAccountName - the pod’s service account name - metadata.uid - the pod’s UID - metadata.labels[''] - the value of the pod’s label (for example, metadata.labels['mylabel']); available in Kubernetes 1.9+ - metadata.annotations[''] - the value of the pod’s annotation (for example, metadata.annotations['myannotation']); available in Kubernetes 1.9+ - Information available via resourceFieldRef: - A Container’s CPU limit - A Container’s CPU request - A Container’s memory limit - A Container’s memory request In addition, the following information is available through downwardAPI volume fieldRef: - metadata.labels - all of the pod’s labels, formatted as label-key="escaped-label-value" with one label per line - metadata.annotations - all of the pod’s annotations, formatted as annotation-key="escaped-annotation-value" with one annotation per line ## Labs ### Guaranteed Scheduling For Critical Add-On Pods See [link](https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/) - Marking pod as critical when using Rescheduler. To be considered critical, the pod has to: - Run in the `kube-system` namespace (configurable via flag) - Have the `scheduler.alpha.kubernetes.io/critical-pod` annotation set to empty string - Have the PodSpec’s tolerations field set to `[{"key":"CriticalAddonsOnly", "operator":"Exists"}]`. > The first one marks a pod a critical. The second one is required by Rescheduler algorithm. - Marking pod as critical when priorites are enabled. To be considered critical, the pod has to: - Run in the `kube-system` namespace (configurable via flag) - Have the priorityClass set as `system-cluster-critical` or `system-node-critical`, the latter being the highest for entire cluster - `scheduler.alpha.kubernetes.io/critical-pod` annotation set to empty string(This will be deprecated too). ### Set command or arguments via env ```yaml env: - name: MESSAGE value: "hello world" command: ["/bin/echo"] args: ["$(MESSAGE)"] ```