@allamand
Last active April 28, 2021 14:28

Revisions

  1. allamand renamed this gist Apr 28, 2021. 1 changed file with 0 additions and 0 deletions.
    File renamed without changes.
  2. allamand revised this gist Feb 18, 2021. No changes.
  3. allamand revised this gist Dec 22, 2020. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions 2048-deployment.yaml
    @@ -1,3 +1,4 @@
    # Adapt the nodeSelector according to your configuration
    apiVersion: apps/v1
    kind: Deployment
    metadata:
  4. allamand revised this gist Dec 22, 2020. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions Readme.md
    @@ -41,6 +41,7 @@ If you are in this configuration, and you execute the `2048-deployment.yaml` you

    ## Using Custom Networking

    > You can follow the tutorial on [eksworkshop.com](https://www.eksworkshop.com/beginner/160_advanced-networking/secondary_cidr/)
    VPC CNI allows you to have custom networking. By default, when new network interfaces are allocated for pods, ipamD uses the node's primary network interface's security groups and subnet. You might want your pods to use a different security group or subnet, within the same VPC as your control plane security group. Examples:
    - There are a limited number of IP addresses available in a subnet. This might limit the number of pods that can be created in the cluster. Using different subnets for pods allows you to increase the number of available IP addresses.
    - For security reasons, your pods must use different security groups or subnets than the node's primary network interface.
  5. allamand revised this gist Dec 22, 2020. 2 changed files with 74 additions and 13 deletions.
    File renamed without changes.
    87 changes: 74 additions & 13 deletions Readme.md
    @@ -3,10 +3,19 @@
    When using VPC CNI, there are a number of things that can influence the number of pods an instance can have.
    Here, I won't go into detail about the Kubernetes recommended limit of [100 pods per instance](https://kubernetes.io/docs/setup/best-practices/cluster-large/), nor will I check CPU/memory limits (I'll use very small pods), in order to focus on the IP allocation limits.

    Our goal here will be to use as many IPs as possible on an instance.

    ## How this works

    Amazon EKS supports native VPC networking with the Amazon VPC Container Network Interface (CNI) plugin for Kubernetes.
    Every instance can have multiple Elastic Network Interfaces (ENIs); one is attached by default and is called the primary ENI. Any additional network interface attached to the instance is called a secondary network interface.
    Each network interface can be assigned multiple private IP addresses.

    The Amazon VPC Container Network Interface (CNI) plugin for Kubernetes is deployed on each of your Amazon EC2 nodes in a DaemonSet named `aws-node`:
    - It is responsible for creating network interfaces and attaching them to Amazon EC2 instances, assigning secondary IP addresses to network interfaces, and maintaining a warm pool of IP addresses on each node for assignment to Kubernetes pods when they are scheduled.
    - When the number of pods running on the node exceeds the number of addresses that can be assigned to a single network interface, the plugin allocates a new network interface, as long as the maximum number of network interfaces for the instance isn't already reached.
    - You can tweak the default behaviour if needed; see https://github.com/aws/amazon-vpc-cni-k8s/blob/master/docs/eni-and-ip-target.md (a sketch follows this list).
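
    For example, a minimal sketch of tuning the warm pool (`WARM_IP_TARGET` is one of the variables documented in the link above; the value here is only illustrative):

    ```bash
    # Keep a small warm pool of individual IPs instead of a whole warm ENI (illustrative value)
    kubectl set env daemonset aws-node -n kube-system WARM_IP_TARGET=5
    ```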

    The number of network interfaces and the number of IPs allowed on each network interface depend on the instance type; see [IP addresses per network interface per instance type](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html#AvailableIpPerENI).

    For instance, for a c5d.4xlarge we have:
    @@ -15,19 +24,71 @@ For instance, for a c5d.4xlarge we have:
    | Instance type | Maximum network interfaces | Private IPv4 addresses per interface | IPv6 addresses per interface |
    | -------- | -------- | -------- | ----------- |
    | c5d.4xlarge | 8 | 30 | 30 |

    Let's test the limit on the number of pods on c5d.4xlarge instances.

    I can see that the max seems to be 225 pods for my 2048 game.
    The instance is using the 8 available ENIs with 30 IPs each. The first IP of each ENI is reserved for the ENI itself, and 2 slots are added back for the pods that use host networking (aws-node, kube-proxy), so the maximum number of pods would be:
    8 ENIs * (30 - 1) IPs + 2 = 234 pods

    You can check the maximum number of pods for each instance type in [eni-max-pods.txt](https://github.com/awslabs/amazon-eks-ami/blob/master/files/eni-max-pods.txt).
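
    For example, a quick lookup (a sketch, assuming the raw GitHub URL of the file linked above):

    ```bash
    # Look up the pre-computed max-pods value for our instance type
    curl -s https://raw.githubusercontent.com/awslabs/amazon-eks-ami/master/files/eni-max-pods.txt | grep c5d.4xlarge
    # c5d.4xlarge 234
    ```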

    We can also see this information from the node itself:
    ```bash
    kubectl describe node ip-10-0-83-246.eu-west-1.compute.internal | grep -i pods
    pods: 234
    ```

    If you are in this configuration and you apply the `2048-deployment.yaml`, you'll see that you can schedule up to 225 pods, and that the node reports a capacity of 234 pods (including the other workloads present on my nodes, such as DaemonSets: aws-node, cloudwatch-agent, aws-for-fluent-bit, ...).
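
    A minimal way to verify this (a sketch; the namespace and file name come from this gist):

    ```bash
    kubectl apply -f 2048-deployment.yaml
    # Count how many 2048 pods are actually Running
    kubectl get pods -n 2048-game --field-selector=status.phase=Running --no-headers | wc -l
    ```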

    ## Using Custom Networking

    VPC CNI allows you to have custom networking. By default, when new network interfaces are allocated for pods, ipamD uses the node's primary network interface's security groups and subnet. You might want your pods to use a different security group or subnet, within the same VPC as your control plane security group. Examples:
    - There are a limited number of IP addresses available in a subnet. This might limit the number of pods that can be created in the cluster. Using different subnets for pods allows you to increase the number of available IP addresses.
    - For security reasons, your pods must use different security groups or subnets than the node's primary network interface.

    To use custom networking, we need to configure the vpc-cni plugin to use a custom network configuration:

    ```bash
    kubectl set env daemonset aws-node -n kube-system AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true
    ```

    > This requires a recent version of the CNI plugin (I have 1.7.8):
    > `kubectl describe daemonset aws-node --namespace kube-system | grep Image | cut -d "/" -f 2`

    Then we need to create an ENIConfig custom resource for each subnet we want to schedule pods in.

    We recommend creating one file per Availability Zone, named after the AZ, specifying the subnet and security group to use:
    ```yaml
    apiVersion: crd.k8s.amazonaws.com/v1alpha1
    kind: ENIConfig
    metadata:
      name: <us-west-2a>
    spec:
      securityGroups:
        - <sg-0dff111a1d11c1c11>
      subnet: <subnet-011b111c1f11fdf11>
    ```
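
    Nodes then need to be associated with their ENIConfig. A sketch of the two usual options (the ENIConfig name follows the example above; on older CNI versions the zone label may be `failure-domain.beta.kubernetes.io/zone`):

    ```bash
    # Option 1: let the CNI pick the ENIConfig whose name matches the node's AZ label
    kubectl set env daemonset aws-node -n kube-system ENI_CONFIG_LABEL_DEF=topology.kubernetes.io/zone

    # Option 2: annotate a node explicitly with the ENIConfig to use
    kubectl annotate node <node-name> k8s.amazonaws.com/eniConfig=us-west-2a
    ```
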
    When using custom networking, the formula to calculate the maximum number of pods is different, because the primary ENI of each instance is no longer used to assign pod IPs:
    ```bash
    maxPods = (number of interfaces - 1) * (max IPv4 addresses per interface - 1) + 2
    ```
    In our case we have:
    (8-1) * (30-1) + 2 = 205 pods

    By playing with the `replicas` field of the 2048-deployment.yaml file, I can push my instance to 205 pods:
    ```bash
    k describe node ip-10-0-83-246.eu-west-1.compute.internal | grep -i pods
    pods: 234
    pods: 234
    Non-terminated Pods: (205 in total)
    ```

    In this case we see that the instance still reports a max of 234 pods. It may be useful in this case to update the bootstrap arguments used to configure your nodes, adding for instance:

    ```bash
    --use-max-pods false --kubelet-extra-args '--max-pods=<205>'
    ```
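
    These arguments would typically be passed to the bootstrap script in the node's user data (a sketch; `my-cluster` is a placeholder and `/etc/eks/bootstrap.sh` is the script shipped with the EKS-optimized AMI):

    ```bash
    #!/bin/bash
    # Register the node with an explicit pod limit instead of the default max-pods value
    /etc/eks/bootstrap.sh my-cluster \
      --use-max-pods false \
      --kubelet-extra-args '--max-pods=205'
    ```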

    The instance is using the 8 available ENIs with 30 IPs each, so the maximum would be: 8 * (30 - 1) + 2 = 234 pods.
    > If you have any nodes in your cluster that had pods placed on them before you completed this procedure, you should terminate them. Only new nodes that are registered with the k8s.amazonaws.com/eniConfig label use the new custom networking feature.
    There are also other pods on the instance; in fact I have 234 pods in total, which seems coherent with the limit:
    amazon-cloudwatch cloudwatch-agent-wp97d 200m (1%) 200m (1%) 200Mi (0%) 200Mi (0%) 28d
    kube-system aws-for-fluent-bit-9h5nk 500m (3%) 0 (0%) 500Mi (1%) 500Mi (1%) 28d
    kube-system aws-load-balancer-controller-79499bdd8b-6f2qj 0 (0%) 0 (0%) 0 (0%) 0 (0%) 27d
    kube-system aws-node-td4wq 10m (0%) 0 (0%) 0 (0%) 0 (0%) 28d
    kube-system coredns-6987776bbd-z9t7d 100m (0%) 0 (0%) 70Mi (0%) 170Mi (0%) 14d
    kube-system ebs-csi-node-fp6pq 0 (0%) 0 (0%) 0 (0%) 0 (0%) 28d
    kube-system kube-proxy-ncrpc 100m (0%) 0 (0%) 0 (0%) 0 (0%) 28d
    kube-system metrics-server-7578984995-2jkrl 0 (0%) 0 (0%) 0 (0%) 0 (0%) 14d
    kube-system statefulclusterchartspotinterrupthandler061f1c88-aws-node-8tlx8 50m (0%) 100m (0%) 64Mi (0%) 128Mi (0%) 28d
  6. allamand revised this gist Dec 22, 2020. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion Readme.md
    @@ -10,8 +10,9 @@ Each network interface can be assigned multiple private IP addresses.
    The number of network interfaces and the number of IPs allowed on each network interface depend on the instance type; see [IP addresses per network interface per instance type](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html#AvailableIpPerENI).

    For instance, for a c5d.4xlarge we have:
    | - | - | - | - |

    | Instance type | Maximum network interfaces | Private IPv4 addresses per interface | IPv6 addresses per interface |
    | -------- | -------- | -------- | ----------- |
    | c5d.4xlarge | 8 | 30 | 30 |

    Test limits in number of pods on c5d.4xlarge instances
  7. allamand revised this gist Dec 22, 2020. 2 changed files with 18 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions 2048-daemponset.yaml → 2048-daemonset.yaml
    @@ -7,7 +7,7 @@ metadata:
    namespace: 2048-game
    spec:
    progressDeadlineSeconds: 600
    replicas: 225
    replicas: 200
    revisionHistoryLimit: 10
    selector:
    matchLabels:
    @@ -42,7 +42,7 @@ spec:
    terminationMessagePolicy: File
    dnsPolicy: ClusterFirst
    nodeSelector:
    kubernetes.io/hostname: ip-10-0-61-197.eu-west-1.compute.internal
    topology.kubernetes.io/zone: eu-west-1c
    restartPolicy: Always
    schedulerName: default-scheduler
    securityContext: {}
    16 changes: 16 additions & 0 deletions Readme.md
    @@ -1,3 +1,19 @@
    # Test limit of Pods regarding IP allocation with VPC CNI

    When using VPC CNI, there are a number of things that can influence the number of pods an instance can have.
    Here, I won't go into detail about the Kubernetes recommended limit of [100 pods per instance](https://kubernetes.io/docs/setup/best-practices/cluster-large/), nor will I check CPU/memory limits (I'll use very small pods), in order to focus on the IP allocation limits.

    Amazon EKS supports native VPC networking with the Amazon VPC Container Network Interface (CNI) plugin for Kubernetes.
    Every instance can have multiple Elastic Network Interfaces (ENIs); one is attached by default and is called the primary ENI. Any additional network interface attached to the instance is called a secondary network interface.
    Each network interface can be assigned multiple private IP addresses.

    The number of network interfaces and the number of IPs allowed on each network interface depend on the instance type; see [IP addresses per network interface per instance type](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html#AvailableIpPerENI).

    For instance, for a c5d.4xlarge we have:
    | - | - | - | - |
    | Instance type | Maximum network interfaces | Private IPv4 addresses per interface | IPv6 addresses per interface |
    | c5d.4xlarge | 8 | 30 | 30 |

    Test limits in number of pods on c5d.4xlarge instances

    I can see that the max seems to be 225 pods for my 2048 game.
  8. allamand revised this gist Dec 17, 2020. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion 2048-daemponset.yaml
    @@ -25,7 +25,7 @@ spec:
    spec:
    automountServiceAccountToken: false
    containers:
    - image: alexwhen/docker-2048
    - image: public.ecr.aws/u0b4h6b4/docker-2048
    imagePullPolicy: Always
    name: "2048"
    ports:
  9. allamand created this gist Dec 16, 2020.
    50 changes: 50 additions & 0 deletions 2048-daemponset.yaml
    @@ -0,0 +1,50 @@
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: My2048App
      name: "2048"
      namespace: 2048-game
    spec:
      progressDeadlineSeconds: 600
      replicas: 225
      revisionHistoryLimit: 10
      selector:
        matchLabels:
          app: My2048App
      strategy:
        rollingUpdate:
          maxSurge: 25%
          maxUnavailable: 25%
        type: RollingUpdate
      template:
        metadata:
          creationTimestamp: null
          labels:
            app: My2048App
        spec:
          automountServiceAccountToken: false
          containers:
            - image: alexwhen/docker-2048
              imagePullPolicy: Always
              name: "2048"
              ports:
                - containerPort: 80
                  protocol: TCP
              resources:
                limits:
                  cpu: 1m
                  memory: 50Mi
                requests:
                  cpu: 1m
                  memory: 20Mi
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          nodeSelector:
            kubernetes.io/hostname: ip-10-0-61-197.eu-west-1.compute.internal
          restartPolicy: Always
          schedulerName: default-scheduler
          securityContext: {}
          shareProcessNamespace: false
          terminationGracePeriodSeconds: 30
    16 changes: 16 additions & 0 deletions Readme.md
    @@ -0,0 +1,16 @@
    Test limits in number of pods on c5d.4xlarge instances

    I can see that the max seems to be 225 pods for my 2048 game.

    The instance is using the 8 available ENIs with 30 IPs each, so the maximum would be: 8 * (30 - 1) + 2 = 234 pods.

    There are also other pods on the instance; in fact I have 234 pods in total, which seems coherent with the limit:
    amazon-cloudwatch cloudwatch-agent-wp97d 200m (1%) 200m (1%) 200Mi (0%) 200Mi (0%) 28d
    kube-system aws-for-fluent-bit-9h5nk 500m (3%) 0 (0%) 500Mi (1%) 500Mi (1%) 28d
    kube-system aws-load-balancer-controller-79499bdd8b-6f2qj 0 (0%) 0 (0%) 0 (0%) 0 (0%) 27d
    kube-system aws-node-td4wq 10m (0%) 0 (0%) 0 (0%) 0 (0%) 28d
    kube-system coredns-6987776bbd-z9t7d 100m (0%) 0 (0%) 70Mi (0%) 170Mi (0%) 14d
    kube-system ebs-csi-node-fp6pq 0 (0%) 0 (0%) 0 (0%) 0 (0%) 28d
    kube-system kube-proxy-ncrpc 100m (0%) 0 (0%) 0 (0%) 0 (0%) 28d
    kube-system metrics-server-7578984995-2jkrl 0 (0%) 0 (0%) 0 (0%) 0 (0%) 14d
    kube-system statefulclusterchartspotinterrupthandler061f1c88-aws-node-8tlx8 50m (0%) 100m (0%) 64Mi (0%) 128Mi (0%) 28d