
@mikkimonroe
Forked from thimslugga/01_setup-docker-al2023.md
Created February 10, 2025 08:49

Revisions

  1. @thimslugga revised this gist Dec 18, 2024. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion setup-docker-al2023.md
    @@ -2,7 +2,7 @@

    The following guide is for setting up Docker with docker-compose v2 on Amazon Linux 2023. The steps are intended for AL2023 on EC2 but should mostly work for AL2023 VMs running on other hypervisors.

    ## Install and configure Docker on Amazon Linux 2023
    ## Overview of Updating Amazon Linux 2023

    ### Check for new updates

    @@ -70,6 +70,8 @@ dnf repoinfo
    dnf repolist all --verbose
    ```

    ## Install and configure Docker on Amazon Linux 2023

    ### Install Base OS Packages

    Install the following packages, which are useful to have:
  2. @thimslugga revised this gist Dec 13, 2024. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion setup-docker-al2023.md
    @@ -417,7 +417,7 @@ cat <<'EOF' | sudo tee /etc/systemd/journald.conf.d/00-override.conf
    SystemMaxUse=100M
    RuntimeMaxUse=100M
    RuntimeMaxFileSize=10M
    RateLimitIntervals=1s
    RateLimitInterval=1s
    RateLimitBurst=10000
    EOF
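    A drop-in like this only takes effect once journald is reloaded. A minimal sketch for applying and checking it (standard systemd tooling, not part of the original diff):

    ```shell
    # Restart journald so the override in /etc/systemd/journald.conf.d/ is picked up
    sudo systemctl restart systemd-journald

    # Confirm how much disk space the journal is currently using
    journalctl --disk-usage
    ```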
  3. @thimslugga revised this gist Dec 13, 2024. 1 changed file with 200 additions and 3 deletions.
    203 changes: 200 additions & 3 deletions zz_cloudcfg.yml
    @@ -22,11 +22,117 @@ datasource:
    bootcmd:
    - systemctl stop amazon-ssm-agent

    package_update: false
    package_upgrade: false
    package_update: true
    package_upgrade: true
    package_reboot_if_required: false

    packages:
    - docker
    # Base OS packages
    - kernel-modules-extra
    - dnf-plugins-core
    - dnf-plugin-release-notification
    - dnf-plugin-support-info
    - dnf-utils
    - git-core
    - grubby
    - kexec-tools
    - chrony
    - audit
    - dbus
    - dbus-daemon
    - polkit
    - systemd-pam
    - systemd-container
    - udisks2
    - crypto-policies
    - crypto-policies-scripts
    - openssl
    - nss-util
    - nss-tools
    - dmidecode
    - nvme-cli
    - lvm2
    - dosfstools
    - e2fsprogs
    - xfsprogs
    - xfsprogs-xfs_scrub
    - attr
    - acl
    - shadow-utils
    - shadow-utils-subid
    - fuse3
    - squashfs-tools
    - star
    - gzip
    - pigz
    - bzip2
    - zstd
    - xz
    - unzip
    - p7zip
    - numactl
    - iproute
    - iproute-tc
    - iptables-nft
    - nftables
    - conntrack-tools
    - ipset
    - ethtool
    - net-tools
    - iputils
    - traceroute
    - mtr
    - telnet
    - whois
    #- socat
    - bind-utils
    #- tcpdump
    - cifs-utils
    - nfsv4-client-utils
    - nfs4-acl-tools
    - libseccomp
    - psutils
    - python3
    - python3-pip
    - python3-psutil
    - python3-policycoreutils
    - policycoreutils-python-utils
    - bash-completion
    - vim-minimal
    - wget
    - jq
    - awscli-2
    - ec2rl
    - ec2-utils
    #- htop
    #- sysstat
    - fio
    #- inotify-tools
    #- rsync
    # Docker related packages
    - docker
    - containerd
    - runc
    - container-selinux
    - cni-plugins
    - oci-add-hooks
    - amazon-ecr-credential-helper
    - udica
    # AWS related packages
    - amazon-ssm-agent
    - amazon-cloudwatch-agent
    - amazon-efs-utils
    - ec2-instance-connect
    - ec2-instance-connect-selinux
    # Optional utilities
    #- smart-restart
    #- python3-dnf-plugin-post-transaction-actions
    #- kpatch-dnf
    #- kpatch-runtime
    # Ansible
    #- ansible
    #- ansible-core
    #- sshpass

    manage_resolv_conf: true
    resolv_conf:
    @@ -48,6 +154,41 @@ locale: en_US.UTF-8
    disable_root: true

    write_files:
    - path: /etc/motd.d/31-banner
    content: |
    , #_
    ~\_ ####_
    ~~ \_#####\
    ~~ \###|
    ~~ \#/ ___ Amazon Linux 2023 (Docker Optimized)
    ~~ V~' '->
    ~~~ /
    ~~._. _/
    _/ _/
    _/m/'
    - path: /etc/systemd/journald.conf.d/00-override.conf
    content: |
    [Journal]
    SystemMaxUse=100M
    RuntimeMaxUse=100M
    RuntimeMaxFileSize=10M
    RateLimitIntervals=1s
    RateLimitBurst=10000
    - path: /etc/docker/daemon.json
    content: |
    {
    "debug": false,
    "experimental": false,
    "exec-opts": ["native.cgroupdriver=systemd"],
    "userland-proxy": false,
    "live-restore": true,
    "log-level": "warn",
    "log-driver": "json-file",
    "log-opts": {
    "max-size": "100m",
    "max-file": "3"
    }
    }
    - path: /etc/systemd/system/amazon-ssm-agent.service.d/00-override.conf
    permissions: "0644"
    content: |
    @@ -62,3 +203,59 @@ write_files:
    After=network-online.target nss-lookup.target cloud-init.target
    DefaultDependencies=no
    ConditionFileIsExecutable=/usr/bin/amazon-ssm-agent
    - path: /etc/systemd/system/amazon-cloudwatch-agent.d/00-override.conf
    content: |
    [Unit]
    Wants=network-online.target
    After=network-online.target nss-lookup.target cloud-init.target
    DefaultDependencies=no
    ConditionFileIsExecutable=/opt/aws/amazon-cloudwatch-agent/bin/start-amazon-cloudwatch-agent
    runcmd:
    # System Configuration
    - [ touch, /etc/dnf/vars/releasever ]
    #- [ bash, -c, 'echo "latest" > /etc/dnf/vars/releasever' ]
    - [ localectl, set-locale, LANG=en_US.UTF-8 ]
    - [ timedatectl, set-timezone, Etc/UTC ]
    - [ timedatectl, set-ntp, true ]
    - [ ln, -s, /dev/null, /etc/motd.d/30-banner ]
    # Enable and start services
    - [ systemctl, daemon-reload ]
    - [ systemctl, enable, --now, chronyd ]
    #- [ systemctl, enable, --now, amazon-ssm-agent ]
    #- [ systemctl, enable, --now, amazon-cloudwatch-agent ]
    #- [ systemctl, enable, --now, kpatch.service ]
    # Setup Docker
    - systemctl enable --now docker.service containerd.service
    - groupadd docker
    - usermod -aG docker ec2-user
    # Install Docker Compose v2
    - mkdir -p /usr/local/lib/docker/cli-plugins
    - curl -sL https://github.com/docker/compose/releases/latest/download/docker-compose-linux-$(uname -m) -o /usr/local/lib/docker/cli-plugins/docker-compose
    - chmod +x /usr/local/lib/docker/cli-plugins/docker-compose
    # Configure services
    - systemctl enable --now chronyd
    - systemctl enable --now amazon-ssm-agent
    - systemctl enable --now amazon-cloudwatch-agent
    # User environment setup
    - [ loginctl, enable-linger, ec2-user ]
    - mkdir -p /home/ec2-user/bin
    - mkdir -p /home/ec2-user/.config/{systemd,environment.d}
    - mkdir -p /home/ec2-user/.config/systemd/user/sockets.target.wants
    - mkdir -p /home/ec2-user/.local/share/systemd/user
    - mkdir -p /home/ec2-user/.local/bin
    - chown -R ec2-user:ec2-user /home/ec2-user
    # Configure AWS CLI for ec2-user
    - su - ec2-user -c "aws configure set default.region $(curl -s http://169.254.169.254/latest/meta-data/placement/region)"
    - su - ec2-user -c "aws configure set default.sts_regional_endpoints regional"
    - su - ec2-user -c "aws configure set default.credential_source Ec2InstanceMetadata"
    - su - ec2-user -c "aws configure set default.duration_seconds 3600"
    - su - ec2-user -c "aws configure set default.cli_pager ''"
    - su - ec2-user -c "aws configure set default.output json"

    final_message: "System configuration completed."

    power_state:
    mode: reboot
    message: Rebooting after system configuration
    condition: True
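    Before baking a user-data file like the one above into a launch template, it can be worth running it through cloud-init's schema checker (available as a top-level subcommand in recent cloud-init releases; the filename below is just the one used in this gist):

    ```shell
    # Validate the cloud-config against cloud-init's schema and annotate any problems
    cloud-init schema --config-file zz_cloudcfg.yml --annotate
    ```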
  4. @thimslugga revised this gist Oct 15, 2024. 1 changed file with 17 additions and 19 deletions.
    36 changes: 17 additions & 19 deletions zz-performance-tuning-al2023.md
    @@ -301,28 +301,27 @@ net.ipv4.tcp_wmem=4096 65536 16777216
    #net.core.busy_read=50
    #net.core.busy_poll=50
    # It's recommended to use a 'fair queueing' qdisc e.g. fq or fq_codel.
    # For queue management, sch_fq is/was recommended instead of fq_codel as of linux 3.12.
    # It's recommended to use a 'fair queueing' qdisc e.g. fq or fq_codel.
    #
    # - fq or fq_codel can be safely used as a drop-in replacement for pfifo_fast.
    # - fq or fq_codel is required to use tcp_bbr as it requires fair queuing.
    # - fq-codel is best for forwarding/routers which don't originate local traffic,
    # hypervisors and best general purpose qdisc.
    # - fq is best for fat servers with tcp-heavy workloads and particularly at 10GigE+.
    #
    # - BBR supports fq_codel in Linux Kernel version 4.13 and later.
    # - BBR must be used with fq qdisc with pacing enabled, since pacing is integral to the BBR design
    # and implementation. BBR without pacing would not function properly and may incur unnecessary
    # high packet loss rates.
    #
    # http://man7.org/linux/man-pages/man8/tc-fq.8.html
    # https://github.com/systemd/systemd/blob/main/sysctl.d/50-default.conf
    # https://www.bufferbloat.net/projects/codel/wiki/
    #
    # Note: fq can be safely used as a drop-in replacement for pfifo_fast.
    # Note: fq is required to use tcp_bbr as it requires fair queuing.
    # Note: fq is best for fat servers with tcp-heavy workloads and particularly at 10GigE and above.
    # Note: fq-codel is a better choice for forwarding/routers which don't originate local traffic,
    # hypervisors and best general purpose qdisc.
    net.core.default_qdisc = fq
    #net.core.default_qdisc = fq_codel
    # https://github.com/systemd/systemd/issues/9725#issuecomment-412286509
    # https://forum.vyos.io/t/bbr-and-fq-as-new-defaults/12344
    # https://research.google/pubs/pub45646/
    # https://github.com/google/bbr/blob/master/README
    #
    # Note: This is not an official Google product.. lol
    # Note: BBR will support fq_codel after linux-4.13.
    # Note: BBR must be used with fq qdisc with pacing enabled, since pacing is integral to the BBR design
    # and implementation. BBR without pacing would not function properly and may incur unnecessary
    # high packet loss rates.
    net.core.default_qdisc = fq_codel
    #net.ipv4.tcp_congestion_control = bbr
    # Negotiate TCP ECN for active and passive connections
    @@ -342,7 +341,6 @@ net.ipv4.tcp_ecn_fallback = 1
    net.ipv4.ip_default_ttl = 127
    # Enable forwarding so that docker networking works as expected.
    #
    # Enable IPv4 forwarding
    net.ipv4.ip_forward = 1
    net.ipv4.conf.all.forwarding = 1
    @@ -390,7 +388,7 @@ net.ipv4.tcp_adv_win_scale = 1
    # Disable the TCP timestamps option for better CPU utilization.
    #net.ipv4.tcp_timestamps = 0
    # Recommended for hosts with jumbo frames enabled
    # Recommended for hosts with jumbo frames enabled. Default in AWS.
    net.ipv4.tcp_mtu_probing = 1
    # Enable to send data in the opening SYN packet.
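    For the qdisc and congestion-control settings discussed above, a quick sanity check after loading the sysctl file might look like the following sketch (eth0 is an assumed interface name; on recent EC2 instance types it is often ens5):

    ```shell
    # Show the currently active default qdisc and TCP congestion control algorithm
    sysctl net.core.default_qdisc net.ipv4.tcp_congestion_control

    # Show the qdisc actually attached to the interface
    tc qdisc show dev eth0

    # List the congestion control algorithms the running kernel can use
    sysctl net.ipv4.tcp_available_congestion_control
    ```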
  5. @thimslugga revised this gist Oct 9, 2024. 1 changed file with 66 additions and 57 deletions.
    123 changes: 66 additions & 57 deletions zz-performance-tuning-al2023.md
    @@ -36,7 +36,8 @@ uname -sr; cat /proc/cmdline
    ```

    ```shell
    sudo grubby --update-kernel=ALL --args="intel_idle.max_cstate=1 processor.max_cstate=1 cpufreq.default_governor=performance swapaccount=1 psi=1"
    sudo grubby --update-kernel=ALL --args="intel_idle.max_cstate=1 processor.max_cstate=1 cpufreq.default_governor=performance"
    sudo grubby --update-kernel=ALL --args="swapaccount=1 psi=1"
    ```

    Verify:
    @@ -57,9 +58,12 @@ sudo systemctl reboot

    ```shell
    # start with 50-70
    echo 50 > /proc/sys/net/core/busy_read
    echo 50 > /proc/sys/net/core/busy_poll
    echo 0 > /proc/sys/net/ipv4/tcp_sack
    echo 50 | sudo tee /proc/sys/net/core/busy_read
    echo 50 | sudo tee /proc/sys/net/core/busy_poll
    ```

    ```shell
    echo 0 | sudo tee /proc/sys/net/ipv4/tcp_sack
    ```

    ```shell
    @@ -184,17 +188,20 @@ user.max_user_namespaces=28633
    # to their size.
    #
    # Setting min_free_kbytes to an extremely low value prevents the system from
    # reclaiming memory, which can result in system hangs and OOM-killing processes.
    # reclaiming memory, which can result in system hangs and OOM-killing processes.
    #
    # However, setting min_free_kbytes too high e.g. 5–10% of total system memory can
    # cause the system to enter an out-of-memory state immediately, resulting in the
    # system spending too much time trying to reclaim memory.
    #
    # As a rule of thumb, set this value to between 1-3% of available system
    # memory and adjust this value up or down to meet the needs of your application
    # workload.
    # workload. It is not recommended that the setting of vm.min_free_kbytes
    # exceed 5% of the system's physical memory.
    #
    # Ensure that the reserved kernel memory is sufficient to sustain a high
    # hrate of packet buffer allocations as the default value may be too small.
    # rate of packet buffer allocations as the default value may be too small.
    # awk 'BEGIN {OFMT = "%.0f";} /MemTotal/ {print "vm.min_free_kbytes =", $2 * .03;}' /proc/meminfo
    vm.min_free_kbytes=1048576
    # Maximum number of memory map areas a process may have (memory map areas are used
    @@ -249,33 +256,31 @@ fs.inotify.max_user_watches=524288
    # Suppress logging of net_ratelimit callback
    #net.core.message_cost=0
    # Increasing this value for high speed cards may help prevent losing packets
    # https://access.redhat.com/solutions/1241943
    net.core.netdev_max_backlog = 2000
    net.core.netdev_budget = 600
    # The maximum number of "backlogged sockets, accept and syn queues are governed by
    # net.core.somaxconn and net.ipv4.tcp_max_syn_backlog. The maximum number of
    # "backlogged sockets". The net.core.somaxconn setting caps both queue sizes.
    # Ensure that net.core.somaxconn is always set to a value equal to or greater than
    # tcp_backlog e.g. net.core.somaxconn >= 4096.
    #
    # Increase number of incoming connections
    #net.core.somaxconn = 1024
    #net.ipv4.tcp_max_syn_backlog = 2048
    #net.core.somaxconn = 4096
    #net.ipv4.tcp_max_syn_backlog = 8192
    net.core.somaxconn = 4096
    net.ipv4.tcp_max_syn_backlog = 4096
    # Increasing this value for high speed cards may help prevent losing packets
    # https://access.redhat.com/solutions/1241943
    net.core.netdev_max_backlog = 1000
    net.core.netdev_budget = 600
    # Increase the UDP buffer size
    # Increase UDP Buffers
    # Maximum Receive/Transmit Window Size
    # if netstat -us is reporting errors, another underlying issue may
    # be preventing the application from draining its receive queue.
    # https://github.com/quic-go/quic-go/wiki/UDP-Buffer-Sizes
    # https://medium.com/@CameronSparr/increase-os-udp-buffers-to-improve-performance-51d167bb1360
    # The maximum receive socket buffer (size in bytes)
    #net.core.rmem_max=7500000
    # The maximum send socket buffer (size in bytes)
    #net.core.wmem_max=7500000
    # OR allow testing with buffers up to 16MB
    #net.core.rmem_max=16777216
    #net.core.wmem_max=16777216
    # The maximum allowed (16MB) receive socket buffer (size in bytes)
    net.core.rmem_max=16777216
    # The maximum allowed (16MB) send socket buffer (size in bytes)
    net.core.wmem_max=16777216
    # The default socket receive buffer (size in bytes)
    #net.core.rmem_default=31457280
    @@ -292,6 +297,7 @@ net.ipv4.tcp_wmem=4096 65536 16777216
    # The downside of busy polling is higher CPU usage in the host that comes from polling
    # for new data in a tight loop. There are two global settings that control the number of
    # microseconds to wait for packets for all interfaces.
    # ethtool -k eth0
    #net.core.busy_read=50
    #net.core.busy_poll=50
    @@ -306,8 +312,8 @@ net.ipv4.tcp_wmem=4096 65536 16777216
    # Note: fq is best for fat servers with tcp-heavy workloads and particularly at 10GigE and above.
    # Note: fq-codel is a better choice for forwarding/routers which don't originate local traffic,
    # hypervisors and best general purpose qdisc.
    net.core.default_qdisc=fq
    #net.core.default_qdisc=fq_codel
    net.core.default_qdisc = fq
    #net.core.default_qdisc = fq_codel
    # https://research.google/pubs/pub45646/
    # https://github.com/google/bbr/blob/master/README
    @@ -317,7 +323,7 @@ net.core.default_qdisc=fq
    # Note: BBR must be used with fq qdisc with pacing enabled, since pacing is integral to the BBR design
    # and implementation. BBR without pacing would not function properly and may incur unnecessary
    # high packet loss rates.
    #net.ipv4.tcp_congestion_control=bbr
    #net.ipv4.tcp_congestion_control = bbr
    # Negotiate TCP ECN for active and passive connections
    #
    @@ -329,21 +335,20 @@ net.core.default_qdisc=fq
    #
    # https://github.com/systemd/systemd/pull/9143
    # https://github.com/systemd/systemd/issues/9748
    net.ipv4.tcp_ecn=2
    net.ipv4.tcp_ecn_fallback=1
    # Bump the TTL from the default i.e. 64 to 127 on AWS
    net.ipv4.ip_default_ttl=127
    net.ipv4.tcp_ecn = 2
    net.ipv4.tcp_ecn_fallback = 1
    ## Enable forwarding so that docker networking works as expected.
    # Bump the TTL from the default of 64 to 127 on AWS
    net.ipv4.ip_default_ttl = 127
    # Enable forwarding so that docker networking works as expected.
    #
    # Enable IPv4 forwarding
    net.ipv4.ip_forward=1
    net.ipv4.conf.all.forwarding=1
    net.ipv4.ip_forward = 1
    net.ipv4.conf.all.forwarding = 1
    # Enable IPv6 forwarding
    net.ipv6.conf.default.forwarding=1
    net.ipv6.conf.all.forwarding=1
    net.ipv6.conf.default.forwarding = 1
    net.ipv6.conf.all.forwarding = 1
    # Disables ICMP redirect sending
    net.ipv4.conf.eth0.send_redirects=0
    @@ -360,71 +365,75 @@ net.ipv4.conf.all.secure_redirects=0
    net.ipv4.conf.default.secure_redirects=0
    # Increase the local outgoing port range
    net.ipv4.ip_local_port_range=10000 65535
    net.ipv4.ip_local_port_range = 10000 65535
    #net.ipv4.ip_local_reserved_ports=
    # Enable Multipath TCP
    net.mptcp.enabled=1
    net.mptcp.enabled = 1
    # Enable low latency mode for TCP, intended to give preference to low latency
    # over higher throughput. Setting to 1 will disable IPv4 tcp pre-queue processing.
    #net.ipv4.tcp_low_latency=1
    #net.ipv4.tcp_low_latency = 1
    # Enable TCP Window Scaling
    net.ipv4.tcp_window_scaling=1
    net.ipv4.tcp_window_scaling = 1
    # RFC 1323, Support for IPV4 TCP window sizes larger than 64K, which is generally
    # needed on high bandwidth networks. Tells the kernel how much of the socket buffer
    # space should be used for TCP window size and how much to save for an application buffer.
    net.ipv4.tcp_adv_win_scale=1
    net.ipv4.tcp_adv_win_scale = 1
    #net.ipv4.tcp_no_metrics_save = 1
    #net.ipv4.tcp_moderate_rcvbuf = 1
    # Disable the TCP timestamps option for better CPU utilization.
    #net.ipv4.tcp_timestamps=0
    #net.ipv4.tcp_timestamps = 0
    # Recommended for hosts with jumbo frames enabled
    net.ipv4.tcp_mtu_probing=1
    net.ipv4.tcp_mtu_probing = 1
    # Enable to send data in the opening SYN packet.
    net.ipv4.tcp_fastopen=1
    net.ipv4.tcp_fastopen = 1
    # Protect Against TCP Time-Wait Assassination Attacks
    net.ipv4.tcp_rfc1337=1
    net.ipv4.tcp_rfc1337 = 1
    # Enable the TCP selective ACKs option for better throughput.
    #net.ipv4.tcp_sack=1
    #net.ipv4.tcp_sack = 1
    # https://blog.cloudflare.com/optimizing-the-linux-stack-for-mobile-web-per/
    # https://access.redhat.com/solutions/168483
    # Use this parameter to ensure that the maximum speed is used from beginning
    # also for previously idle TCP connections. Avoid falling back to slow start
    # after a connection goes idle keeps our cwnd large with the keep alive
    # connections (kernel > 3.6).
    net.ipv4.tcp_slow_start_after_idle=0
    net.ipv4.tcp_slow_start_after_idle = 0
    # The maximum times an IPV4 packet can be reordered in a TCP packet stream without
    # TCP assuming packet loss and going into slow start.
    #net.ipv4.tcp_reordering=3
    #net.ipv4.tcp_reordering = 3
    # The net.ipv4.tcp_tw_recycle option is quite problematic for public-facing servers as it
    # will not handle connections from two different computers behind the same NAT device, which
    # is a problem hard to detect and waiting to bite you in the ass.
    #net.ipv4.tcp_tw_recycle=
    net.ipv4.tcp_tw_reuse=1
    net.ipv4.tcp_tw_reuse = 1
    # Decrease the time default value for connections to keep alive.
    #net.ipv4.tcp_keepalive_time=300
    #net.ipv4.tcp_keepalive_probes=5
    #net.ipv4.tcp_keepalive_intvl=15
    #net.ipv4.tcp_keepalive_time = 300
    #net.ipv4.tcp_keepalive_probes = 5
    #net.ipv4.tcp_keepalive_intvl = 15
    # Decrease the time default value for tcp_fin_timeout connection, FIN-WAIT-2
    #net.ipv4.tcp_fin_timeout=15
    #net.ipv4.tcp_fin_timeout = 15
    # Reduce TIME_WAIT from the 120s default to 30-60s
    #net.netfilter.nf_conntrack_tcp_timeout_time_wait=30
    #net.netfilter.nf_conntrack_tcp_timeout_time_wait = 30
    # Reduce FIN_WAIT from the 120s default to 30-60s
    #net.netfilter.nf_conntrack_tcp_timeout_fin_wait=30
    #net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 30
    EOF
    ```
  6. @thimslugga revised this gist Oct 4, 2024. 1 changed file with 88 additions and 47 deletions.
    135 changes: 88 additions & 47 deletions zz-performance-tuning-al2023.md
    @@ -72,6 +72,8 @@ cat <<'EOF' | sudo tee /etc/sysctl.d/99-custom-tuning.conf
    # https://www.kernel.org/doc/Documentation/sysctl/net.txt
    # https://www.kernel.org/doc/Documentation/networking/proc_net_tcp.txt
    # https://www.kernel.org/doc/Documentation/networking/scaling.txt
    # https://www.kernel.org/doc/Documentation/networking/multiqueue.txt
    # https://www.kernel.org/doc/Documentation/networking/ena.txt
    #
    # For binary values, 0 is typically disabled, 1 is enabled.
    #
    @@ -91,17 +93,19 @@ cat <<'EOF' | sudo tee /etc/sysctl.d/99-custom-tuning.conf
    #
    # Misc References:
    #
    # https://github.com/leandromoreira/linux-network-performance-parameters
    # https://oxnz.github.io/2016/05/03/performance-tuning-networking/
    # https://www.speedguide.net/articles/linux-tweaking-121
    # https://www.tweaked.io/guide/kernel/
    # http://rhaas.blogspot.co.at/2014/06/linux-disables-vmzonereclaimmode-by.html
    # https://fasterdata.es.net/host-tuning/linux/
    # https://documentation.suse.com/sles/15-SP5/html/SLES-all/cha-tuning-network.html
    # https://blog.packagecloud.io/monitoring-tuning-linux-networking-stack-receiving-data/
    # https://github.com/myllynen/rhel-performance-guide
    # https://github.com/myllynen/rhel-troubleshooting-guide
    # https://www.brendangregg.com/linuxperf.html
    # - https://github.com/leandromoreira/linux-network-performance-parameters
    # - https://oxnz.github.io/2016/05/03/performance-tuning-networking/
    # - https://www.speedguide.net/articles/linux-tweaking-121
    # - https://www.tweaked.io/guide/kernel/
    # - http://rhaas.blogspot.co.at/2014/06/linux-disables-vmzonereclaimmode-by.html
    # - https://fasterdata.es.net/host-tuning/linux/
    # - https://documentation.suse.com/sles/15-SP5/html/SLES-all/cha-tuning-network.html
    # - https://blog.packagecloud.io/monitoring-tuning-linux-networking-stack-receiving-data/
    # - https://blog.packagecloud.io/monitoring-tuning-linux-networking-stack-sending-data/
    # - https://blog.cloudflare.com/optimizing-tcp-for-high-throughput-and-low-latency/
    # - https://github.com/myllynen/rhel-performance-guide
    # - https://github.com/myllynen/rhel-troubleshooting-guide
    # - https://www.brendangregg.com/linuxperf.html
    # Adjust the kernel printk to minimize serial console logging.
    # The defaults are very verbose and they can have a performance impact.
    @@ -153,6 +157,9 @@ kernel.sched_autogroup_enabled=0
    # A lower value e.g. like 500000 (0.5 ms) may improve the responsiveness for certain workloads.
    #kernel.sched_migration_cost_ns=500000
    # Security feature. No randomization, everything is static.
    #kernel.randomize_va_space=1
    # For rngd
    #kernel.random.write_wakeup_threshold=3072
    @@ -163,9 +170,11 @@ kernel.sched_autogroup_enabled=0
    # 0 = re-enable, 1 = disable, 2 = disable but allow admin to re-enable without a reboot
    #kernel.unprivileged_bpf_disabled=0
    # Rootless Containers
    # https://github.com/containers/podman/blob/main/docs/tutorials/rootless_tutorial.md
    user.max_user_namespaces=28633
    # When set to "enabled", all users are allowed to use userfaultfd syscalls.
    # https://lwn.net/Articles/782745/
    #vm.unprivileged_userfaultfd=1
    @@ -192,9 +201,13 @@ vm.min_free_kbytes=1048576
    # as a side-effect of calling malloc, directly by mmap and mprotect, and also when
    # loading shared libraries).
    vm.max_map_count=262144
    vm.overcommit_memory=1
    # Make sure the host does not try to swap too early.
    # https://access.redhat.com/solutions/6785021
    # https://access.redhat.com/solutions/7042476
    # vm.force_cgroup_v2_swappiness=1
    vm.swappiness=10
    # The maximum percentage of dirty system memory.
    @@ -241,34 +254,37 @@ fs.inotify.max_user_watches=524288
    # "backlogged sockets". The net.core.somaxconn setting caps both queue sizes.
    # Ensure that net.core.somaxconn is always set to a value equal to or greater than
    # tcp_backlog e.g. net.core.somaxconn >= 4096.
    #
    # Increase number of incoming connections
    #net.core.somaxconn = 1024
    #net.ipv4.tcp_max_syn_backlog = 2048
    #net.core.somaxconn = 4096
    #net.ipv4.tcp_max_syn_backlog = 8192
    # Increasing this value for high speed cards may help prevent losing packets
    #net.core.netdev_max_backlog=16384
    # https://access.redhat.com/solutions/1241943
    net.core.netdev_max_backlog = 1000
    net.core.netdev_budget = 600
    # Increase the UDP buffer size
    # https://github.com/quic-go/quic-go/wiki/UDP-Buffer-Sizes
    # https://medium.com/@CameronSparr/increase-os-udp-buffers-to-improve-performance-51d167bb1360
    # The default socket receive buffer (size in bytes)
    #net.core.rmem_default=31457280
    # The maximum receive socket buffer (size in bytes)
    #net.core.rmem_max=7500000
    # The maximum send socket buffer (size in bytes)
    #net.core.wmem_max=7500000
    # OR allow testing with buffers up to 16MB
    #net.core.rmem_max=16777216
    #net.core.wmem_max=16777216
    # Increase linux auto-tuning of TCP buffer limits to 16MB
    # The default socket receive buffer (size in bytes)
    #net.core.rmem_default=31457280
    #net.core.wmem_default=
    # Increase linux auto-tuning of TCP buffer limits to 16MB to prevent dropped packets.
    # https://blog.cloudflare.com/the-story-of-one-latency-spike/
    #net.ipv4.tcp_rmem=4096 87380 16777216
    #net.ipv4.tcp_wmem=4096 65536 16777216
    net.ipv4.tcp_rmem=4096 87380 16777216
    net.ipv4.tcp_wmem=4096 65536 16777216
    # Enable busy poll mode
    # Busy poll mode reduces latency on the network receive path. When you enable busy poll
    @@ -301,7 +317,7 @@ net.core.default_qdisc=fq
    # Note: BBR must be used with fq qdisc with pacing enabled, since pacing is integral to the BBR design
    # and implementation. BBR without pacing would not function properly and may incur unnecessary
    # high packet loss rates.
    #net.ipv4.tcp_congestion_control = bbr
    #net.ipv4.tcp_congestion_control=bbr
    # Negotiate TCP ECN for active and passive connections
    #
    @@ -313,11 +329,13 @@ net.core.default_qdisc=fq
    #
    # https://github.com/systemd/systemd/pull/9143
    # https://github.com/systemd/systemd/issues/9748
    #net.ipv4.tcp_ecn=1
    net.ipv4.tcp_ecn=2
    net.ipv4.tcp_ecn_fallback=1
    # Enable forwarding so that docker networking works as expected
    # Bump the TTL from the default i.e. 64 to 127 on AWS
    net.ipv4.ip_default_ttl=127
    ## Enable forwarding so that docker networking works as expected.
    # Enable IPv4 forwarding
    net.ipv4.ip_forward=1
    @@ -327,57 +345,80 @@ net.ipv4.conf.all.forwarding=1
    net.ipv6.conf.default.forwarding=1
    net.ipv6.conf.all.forwarding=1
    # Increase the outgoing port range
    net.ipv4.ip_local_port_range="10000 65535"
    # Disables ICMP redirect sending
    net.ipv4.conf.eth0.send_redirects=0
    net.ipv4.conf.all.send_redirects=0
    net.ipv4.conf.default.send_redirects=0
    # Disables ICMP redirect acceptance
    net.ipv4.conf.all.accept_redirects=0
    net.ipv4.conf.default.accept_redirects=0
    net.ipv6.conf.all.accept_redirects=0
    net.ipv6.conf.default.accept_redirects=0
    net.ipv4.conf.all.secure_redirects=0
    net.ipv4.conf.default.secure_redirects=0
    # Increase the local outgoing port range
    net.ipv4.ip_local_port_range=10000 65535
    #net.ipv4.ip_local_reserved_ports=
    # Enable Multipath TCP
    net.mptcp.enabled=1
    # Enable low latency mode for TCP, intended to give preference to low latency
    # over higher throughput. Setting to 1 will disable IPv4 tcp pre-queue processing.
    #net.ipv4.tcp_low_latency=1
    # Enable TCP Window Scaling
    net.ipv4.tcp_window_scaling=1
    # RFC 1323, Support for IPV4 TCP window sizes larger than 64K, which is generally
    # needed on high bandwidth networks. Tells the kernel how much of the socket buffer
    # space should be used for TCP window size and how much to save for an application buffer.
    net.ipv4.tcp_adv_win_scale=1
    # Bump the TTL from the default i.e. 64 to 127 on AWS
    net.ipv4.ip_default_ttl=127
    # Disable the TCP timestamps option for better CPU utilization.
    #net.ipv4.tcp_timestamps=0
    # Recommended for hosts with jumbo frames enabled
    net.ipv4.tcp_mtu_probing=1
    # Enable to send data in the opening SYN packet.
    net.ipv4.tcp_fastopen=1
    # Protect Against TCP Time-Wait Assassination Attacks
    net.ipv4.tcp_rfc1337=1
    #net.ipv4.tw_reuse=1
    # Disables ICMP redirect sending
    net.ipv4.conf.eth0.send_redirects=0
    net.ipv4.conf.all.send_redirects=0
    net.ipv4.conf.default.send_redirects=0
    # Disables ICMP redirect acceptance
    net.ipv4.conf.all.accept_redirects=0
    net.ipv4.conf.default.accept_redirects=0
    net.ipv6.conf.all.accept_redirects=0
    net.ipv6.conf.default.accept_redirects=0
    net.ipv4.conf.all.secure_redirects=0
    net.ipv4.conf.default.secure_redirects=0
    # Enable the TCP selective ACKs option for better throughput.
    #net.ipv4.tcp_sack=1
    # https://blog.cloudflare.com/optimizing-the-linux-stack-for-mobile-web-per/
    # https://access.redhat.com/solutions/168483
    # Use this parameter to ensure that the maximum speed is used from beginning
    # also for previously idle TCP connections. Avoid falling back to slow start
    # after a connection goes idle keeps our cwnd large with the keep alive
    # connections (kernel > 3.6).
    net.ipv4.tcp_slow_start_after_idle = 0
    net.ipv4.tcp_slow_start_after_idle=0
    # The maximum times an IPV4 packet can be reordered in a TCP packet stream without
    # TCP assuming packet loss and going into slow start.
    #net.ipv4.tcp_reordering=3
    # The net.ipv4.tcp_tw_recycle option is quite problematic for public-facing servers as it
    # will not handle connections from two different computers behind the same NAT device, which
    # is a problem hard to detect and waiting to bite you in the ass.
    #net.ipv4.tcp_tw_recycle=
    net.ipv4.tcp_tw_reuse=1
    # Decrease the time default value for connections to keep alive
    #net.ipv4.tcp_keepalive_time = 300
    #net.ipv4.tcp_keepalive_probes = 5
    #net.ipv4.tcp_keepalive_intvl = 15
    # Decrease the time default value for connections to keep alive.
    #net.ipv4.tcp_keepalive_time=300
    #net.ipv4.tcp_keepalive_probes=5
    #net.ipv4.tcp_keepalive_intvl=15
    # Decrease the time default value for tcp_fin_timeout connection, FIN-WAIT-2
    #net.ipv4.tcp_fin_timeout = 15
    #net.ipv4.tcp_fin_timeout=15
    # Reduce TIME_WAIT from the 120s default to 30-60s
    #net.netfilter.nf_conntrack_tcp_timeout_time_wait=30
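    Values written to /etc/sysctl.d/99-custom-tuning.conf are only picked up at boot or when explicitly loaded; a minimal sketch for applying and spot-checking them:

    ```shell
    # Load all sysctl drop-ins, including /etc/sysctl.d/99-custom-tuning.conf
    sudo sysctl --system

    # Spot-check a few of the values set above
    sysctl vm.swappiness fs.file-max net.ipv4.tcp_mtu_probing
    ```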
  7. @thimslugga revised this gist Sep 11, 2024. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions setup-docker-al2023.md
    @@ -143,6 +143,7 @@ sudo dnf install --allowerasing -y \
    psutils \
    python3 \
    python3-pip \
    python3-psutil \
    python3-policycoreutils \
    policycoreutils-python-utils \
    bash-completion \
  8. @thimslugga revised this gist Sep 5, 2024. 1 changed file with 16 additions and 5 deletions.
    21 changes: 16 additions & 5 deletions zz-performance-tuning-al2023.md
    @@ -169,12 +169,23 @@ user.max_user_namespaces=28633
    # https://lwn.net/Articles/782745/
    #vm.unprivileged_userfaultfd=1
    # Ensure that your reserved kernel memory is sufficient to sustain a
    # high rate of packet buffer allocations (the default value may be too small).
    # Specifies the minimum number of kilobytes to keep free across the system.
    # This is used to determine an appropriate value for each low memory zone,
    # each of which is assigned a number of reserved free pages in proportion
    # to their size.
    #
    # As a rule of thumb, you should set this value to between 1-3% of available
    # system memory and adjust this value up or down to meet the needs of your
    # application requirements.
    # Setting min_free_kbytes to an extremely low value prevents the system from
    # reclaiming memory, which can result in system hangs and OOM-killing processes.
    # However, setting min_free_kbytes too high e.g. 5–10% of total system memory can
    # cause the system to enter an out-of-memory state immediately, resulting in the
    # system spending too much time trying to reclaim memory.
    #
    # As a rule of thumb, set this value to between 1-3% of available system
    # memory and adjust this value up or down to meet the needs of your application
    # workload.
    #
    # Ensure that the reserved kernel memory is sufficient to sustain a high
    # rate of packet buffer allocations as the default value may be too small.
    vm.min_free_kbytes=1048576
    # Maximum number of memory map areas a process may have (memory map areas are used
  9. @thimslugga revised this gist Sep 5, 2024. 1 changed file with 19 additions and 19 deletions.
    38 changes: 19 additions & 19 deletions zz-performance-tuning-al2023.md
    @@ -103,25 +103,25 @@ cat <<'EOF' | sudo tee /etc/sysctl.d/99-custom-tuning.conf
    # https://github.com/myllynen/rhel-troubleshooting-guide
    # https://www.brendangregg.com/linuxperf.html
    # Minimize console logging level for kernel printk messages.
    # The defaults are very verbose and have a performance impact.
    # Note: 4 4 1 7 is also fine and works too
    # Adjust the kernel printk to minimize serial console logging.
    # The defaults are very verbose and they can have a performance impact.
    # Note: 4 4 1 7 should also be fine, just not debug i.e. 7
    kernel.printk=3 4 1 7
    # A feature aimed at improving system responsiveness under load by
    # man 7 sched
    #
    # This feature is aimed at improving system responsiveness under load by
    # automatically grouping task groups with similar execution patterns.
    # While beneficial for desktop responsiveness, in server environments,
    # especially those running Kubernetes, this behavior might not always
    # be desirable as it could lead to uneven distribution of CPU resources
    # among pods.
    #
    # man 7 sched
    #
    # The use of the cgroups(7) CPU controller to place processes in cgroups
    # other than the root CPU cgroup overrides the affect of autogrouping.
    # other than the root CPU cgroup overrides the effect of auto-grouping.
    #
    # This setting enables better interactivity for desktop workloads and not
    # generally suitable for many server workloads e.g. postgres db.
    # This setting enables better interactivity for desktop workloads and is
    # not typically suitable for many server type workloads e.g. postgresdb.
    #
    # https://cateee.net/lkddb/web-lkddb/SCHED_AUTOGROUP.html
    # https://www.postgresql.org/message-id/[email protected]
    @@ -183,10 +183,10 @@ vm.min_free_kbytes=1048576
    vm.max_map_count=262144
    vm.overcommit_memory=1
    # Make sure the host doesn't swap too early
    # Make sure the host does not try to swap too early.
    vm.swappiness=10
    # Maximum percentage of dirty system memory
    # The maximum percentage of dirty system memory.
    # https://www.suse.com/support/kb/doc/?id=000017857
    vm.dirty_ratio = 10
    @@ -195,27 +195,27 @@ vm.dirty_ratio = 10
    vm.dirty_background_ratio=5
    # Some kernels won't allow dirty_ratio to be set below 5%.
    # Therefore, in dealing with larger amounts of RAM, percentage ratios
    # might not be granular enough. If that is the case, then use the
    # below instead of the settings above.
    # Therefore when dealing with larger amounts of system memory,
    # percentage ratios might not be granular enough. If that is the
    # case, then use the below instead of the settings above.
    #
    # Configure 600 MB maximum dirty cache
    #vm.dirty_bytes=629145600
    # Spawn background write threads once the cache holds 300 MB
    #vm.dirty_background_bytes=314572800
    # The value in file-max denotes the maximum number of file- handles that the Linux kernel will allocate.
    # When you get lots of error messages about running out of file handles, you might want to increase this limit.
    # The value in file-max denotes the maximum number of file-handlers that the Linux kernel will allocate.
    # When you get lots of error messages about running out of file handlers, you will want to increase this limit.
    # Attempts to allocate more file descriptors than file-max are reported with printk, look for in the kernel logs.
    # VFS: file-max limit <number> reached
    fs.file-max=1048576
    # Maximum number of concurrent asynchronous I/O operations (you might need to
    # increase this limit further if you have a lot of workloads that use the AIO
    # subsystem e.g. MySQL, etc.
    # increase this limit further if you have a lot of workloads that uses the AIO
    # subsystem e.g. MySQL, MariaDB, etc.
    # 524288, 1048576, etc.
    fs.aio-max-nr=524288
    #fs.aio-max-nr=1048576
    # Upper limit on the number of watches that can be created per real user ID
    # Raise the limit for watches to the limit i.e. 524,288
  10. @thimslugga revised this gist Sep 5, 2024. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion zz-performance-tuning-al2023.md
    @@ -175,7 +175,7 @@ user.max_user_namespaces=28633
    # As a rule of thumb, you should set this value to between 1-3% of available
    # system memory and adjust this value up or down to meet the needs of your
    # application requirements.
    vm_min_free_kbytes=1048576
    vm.min_free_kbytes=1048576
    # Maximum number of memory map areas a process may have (memory map areas are used
    # as a side-effect of calling malloc, directly by mmap and mprotect, and also when
  11. @thimslugga revised this gist Sep 5, 2024. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion setup-docker-al2023.md
    @@ -468,7 +468,7 @@ mkdir -pv "${HOME}"/.local/bin
    ```

    ```shell
    loginctl enable-linger $(whoami)
    sudo loginctl enable-linger $(whoami)
    systemctl --user daemon-reload
    ```

  12. @thimslugga revised this gist Sep 4, 2024. 1 changed file with 24 additions and 2 deletions.
    26 changes: 24 additions & 2 deletions setup-docker-al2023.md
    @@ -655,10 +655,24 @@ sudo systemctl enable --now nitro-enclaves-allocator.service
    To install the Nvidia drivers:

    ```shell
    sudo dnf install -y wget kernel-modules-extra kernel-devel gcc
    sudo dnf install -y wget kernel-modules-extra kernel-devel gcc dkms
    ```

    Download the driver install script, run it to install the drivers and verify:
    Add the Nvidia Driver and CUDA repository:

    ```shell
    sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/amzn2023/x86_64/cuda-amzn2023.repo
    sudo dnf clean expire-cache
    ```

    Install the Nvidia driver + CUDA toolkit from the Nvidia repo:

    ```
    sudo dnf module install -y nvidia-driver:latest-dkms
    sudo dnf install -y cuda-toolkit
    ```

    (Alternative) Download the driver install script and run it to install the nvidia drivers:

    ```shell
    curl -sL 'https://us.download.nvidia.com/tesla/535.161.08/NVIDIA-Linux-x86_64-535.161.08.run' -O
    @@ -675,13 +689,17 @@ For the Nvidia container runtime, add the nvidia container repo:

    ```shell
    curl -sL 'https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo' | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
    sudo dnf clean expire-cache
    sudo dnf check-update
    ```

    Install and configure the `nvidia-container-toolkit`:

    ```shell
    sudo dnf install -y nvidia-container-toolkit
    ```

    ```shell
    sudo nvidia-ctk runtime configure --runtime=docker
    ```

    @@ -697,6 +715,10 @@ To create an Ubuntu based container with access to the host GPUs:
    docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
    ```

    ```shell
    docker run --rm --runtime=nvidia --gpus all public.ecr.aws/amazonlinux/amazonlinux:2023 nvidia-smi
    ```

    ### (Optional) Configure the aws-cli for the ec2-user

    ```shell
  13. @thimslugga revised this gist Aug 23, 2024. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion zz-setup-dnsmasq-al2023.md
    @@ -3,7 +3,7 @@
    The following steps can be used to setup a local DNS caching service (dnsmasq) to cache DNS lookups on AL2023.

    ```shell
    sudo dnf install -y dnsmasq bind-utils
    sudo dnf install --allowerasing -y dnsmasq bind-utils
    ```

    Back up the default configuration:
  14. @thimslugga revised this gist Aug 23, 2024. 1 changed file with 6 additions and 0 deletions.
    6 changes: 6 additions & 0 deletions setup-docker-al2023.md
    @@ -368,6 +368,11 @@ Configure the locale:

    ```shell
    sudo localectl set-locale LANG=en_US.UTF-8
    ```

    Verify:

    ```shell
    localectl
    ```

    @@ -376,6 +381,7 @@ Configure the hostname:
    ```shell
    sudo hostnamectl set-hostname --static <hostname>
    sudo hostnamectl set-chassis vm
    ```

    Verify:

  15. @thimslugga revised this gist Aug 20, 2024. 1 changed file with 11 additions and 0 deletions.
    11 changes: 11 additions & 0 deletions setup-docker-al2023.md
    @@ -270,6 +270,17 @@ sudo systemctl try-reload-or-restart amazon-ssm-agent.service
    sudo systemctl status amazon-ssm-agent.service
    ```

    Verify:

    ```shell
    systemd-delta --type=extended
    systemctl show amazon-ssm-agent --all
    # systemctl show <unit>.service --property=<PROPERTY_NAME>
    # systemctl show <unit>.service --property=<PROPERTY_NAME1>,<PROPERTY_NAME2>
    systemctl show amazon-ssm-agent.service --property=After,Wants
    ```


    - https://ubuntu.com/blog/cloud-init-v-18-2-cli-subcommands

    ### (Optional) Install and setup the Unified CloudWatch Agent
  16. @thimslugga revised this gist Aug 13, 2024. 1 changed file with 54 additions and 17 deletions.
    71 changes: 54 additions & 17 deletions zz-setup-dnsmasq-al2023.md
    @@ -19,49 +19,82 @@ cat <<'EOF' | sudo tee /etc/dnsmasq.conf
    # https://thekelleys.org.uk/dnsmasq/docs/dnsmasq-man.html
    # https://thekelleys.org.uk/gitweb/?p=dnsmasq.git
    ## Server Configuration
    # The user to which dnsmasq will change to after startup
    user=dnsmasq
    # The group which dnsmasq will run as
    group=dnsmasq
    # PID file
    pid-file=/var/run/dnsmasq.pid
    # The alternative would be just 127.0.0.1 without ::1
    listen-address=::1,127.0.0.1
    # Port 53
    # port=53 or port=0 to disable the dnsmasq DNS server functionality.
    port=53
    # For a local only DNS resolver use interface=lo + bind-interfaces
    # See for more details: https://serverfault.com/a/830737
    #
    # Listen only on the specified interface(s).
    interface=lo
    # dnsmasq binds to the wildcard address, even if it is listening
    # on only some interfaces. It then discards requests that it
    # shouldn't reply to. This has the advantage of working even
    # when interfaces come and go and change address.
    # dnsmasq binds to the wildcard address, even if it is only
    # listening on some interfaces. It then discards requests that
    # it shouldn't reply to. This has the advantage of working
    # even when interfaces come and go and change address.
    bind-interfaces
    #bind-dynamic
    # Do not listen on the specified interface.
    # Do not listen on the specified interface(s).
    #except-interface=eth0
    #except-interface=eth1
    ## DHCP Server
    # Turn off DHCP and TFTP Server features
    #no-dhcp-interface=eth0
    # The user to which dnsmasq will change to after startup
    user=dnsmasq
    #dhcp-authoritative
    # The group which dnsmasq will run as
    group=dnsmasq
    # Dynamic range of IPs to make available to LAN PC and the lease time.
    # Ideally set the lease time to 5m only at first to test everything
    # works okay before you set long-lasting records.
    #dhcp-range=192.168.1.100,192.168.1.253,255.255.255.0,16h
    # PID file
    pid-file=/var/run/dnsmasq.pid
    # Provide IPv6 DHCPv6 leases, where the range is constructed using the
    # network interface as prefix.
    #dhcp-range=::f,::ff,constructor:eth0
    # Whenever /etc/resolv.conf is re-read or the upstream servers are set via DBus, clear the
    # DNS cache. This is useful when new nameservers may have different data than that held in cache.
    #clear-on-reload
    # Set default gateway
    # dhcp-option=3,192.168.1.1
    #dhcp-option=option:router,192.168.1.1
    # If your dnsmasq server is also doing the routing for your network,
    # you can use option 121 to push a static route out. where x.x.x.x is
    # the destination LAN, yy is the CIDR notation (usually /24) and
    # z.z.z.z is the host that will be doing the routing.
    #dhcp-option=121,x.x.x.x/yy,z.z.z.z
    ## Name resolution options
    # Set DNS servers to announce
    # dhcp-option=6,192.168.1.10
    #dhcp-option=option:dns-server,192.168.1.10
    # Optionally set a domain name
    #domain=local
    # To have dnsmasq assign static IPs to some of the clients, you can specify
    # a static assignment i.e. Hosts NIC MAC addresses to IP address.
    #dhcp-host=aa:bb:cc:dd:ee:ff,fw01,192.168.1.1,infinite
    #dhcp-host=aa:bb:cc:dd:ee:ff,sw01,192.168.1.2,infinite
    #dhcp-host=aa:bb:cc:ff:dd:ee,dns01,192.168.1.10,infinite
    ## Name Resolution Options
    # Specify the upstream AWS VPC Resolver within this config file
    # https://docs.aws.amazon.com/vpc/latest/userguide/vpc-dns.html#AmazonDNS
    @@ -103,6 +136,10 @@ resolv-file=/etc/resolv.dnsmasq
    # /etc/resolv.conf. Get upstream servers only from cli or dnsmasq conf.
    #no-resolv
    # Whenever /etc/resolv.conf is re-read or the upstream servers are set via DBus, clear the
    # DNS cache. This is useful when new nameservers may have different data than that held in cache.
    #clear-on-reload
    # Additional hosts files to include
    #addn-hosts=/etc/dnsmasq-blocklist
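    Once the configuration is in place, a minimal sketch for enabling the service and confirming that the local cache answers queries (dig comes from the bind-utils package installed earlier):

    ```shell
    # Enable and start dnsmasq
    sudo systemctl enable --now dnsmasq.service

    # Query the local resolver directly to confirm it responds
    dig @127.0.0.1 amazon.com +short
    ```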
  17. @thimslugga revised this gist Aug 12, 2024. 2 changed files with 131 additions and 82 deletions.
    167 changes: 85 additions & 82 deletions setup-docker-al2023.md
    @@ -2,36 +2,6 @@

    The following guide is for setting up Docker with docker-compose v2 on Amazon Linux 2023. The steps are intended for AL2023 on EC2 but should mostly work for AL2023 VMs running on other hypervisors.

    ## Amazon Linux 2023 Resources

    - https://aws.amazon.com/linux/amazon-linux-2023/faqs/
    - https://github.com/amazonlinux/amazon-linux-2023/
    - https://cdn.amazonlinux.com/al2023/os-images/latest/
    - https://alas.aws.amazon.com/alas2023.html
    - https://docs.aws.amazon.com/linux/al2023/
    - https://docs.aws.amazon.com/linux/al2023/release-notes/relnotes.html
    - https://docs.aws.amazon.com/linux/al2023/ug/deterministic-upgrades.html
    - [Manage package and operating system updates in AL2023](https://docs.aws.amazon.com/linux/al2023/ug/managing-repos-os-updates.html)
    - https://lwn.net/Articles/926352/

    AL2023 Repository details:

    ```
    # cdn.amazonlinux.com (x86_64)
    https://cdn.amazonlinux.com/al2023/core/mirrors/latest/x86_64/mirror.list
    https://cdn.amazonlinux.com/al2023/core/guids/<guid>/x86_64/
    # cdn.amazonlinux.com (aarch64)
    https://cdn.amazonlinux.com/al2023/core/mirrors/latest/aarch64/mirror.list
    https://cdn.amazonlinux.com/al2023/core/guids/<guid>/aarch64/
    # al2023-repos-us-east-1-<guid>.s3.dualstack.<region>.amazonaws.com
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/core/mirrors/<releasever>/x86_64/mirror.list
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/core/guids/<guid>/x86_64/<rest_of_url>
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/core/mirrors/<releasever>/SRPMS/mirror.list
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/kernel-livepatch/mirrors/al2023/x86_64/mirror.list
    ```

    ## Install and configure Docker on Amazon Linux 2023

    ### Check for new updates
    @@ -381,7 +351,7 @@ sudo dnf install -y \
    sshpass
    ```

    ### Configure Sane OS Defaults
    ### Configure sane defaults for the OS

    Configure the locale:

    @@ -433,27 +403,44 @@ RateLimitIntervals=1s
    RateLimitBurst=10000
    EOF
    ```
    ```shell
    sudo systemctl daemon-reload
    sudo systemctl try-reload-or-restart systemd-journald.service
    sudo systemctl status systemd-journald.service
    ```

    ### Configure a sane user environment for the current user e.g. ec2-user
    Configure a custom MOTD banner:

    ```shell
    touch ~/.{profile,bashrc,bash_profile,bash_login,bash_logout,hushlogin}
    # Disable the AL2023 MOTD banner (found at /usr/lib/motd.d/30-banner):
    sudo ln -s /dev/null /etc/motd.d/30-banner
    cat <<'EOF' | sudo tee /etc/motd.d/31-banner
    , #_
    ~\_ ####_
    ~~ \_#####\
    ~~ \###|
    ~~ \#/ ___ Amazon Linux 2023 (Docker Optimized)
    ~~ V~' '->
    ~~~ /
    ~~._. _/
    _/ _/
    _/m/'
    EOF
    ```

    AL2023 uses pam-motd, see: http://www.linux-pam.org/Linux-PAM-html/sag-pam_motd.html


    ### Configure a sane user environment for the current user e.g. ec2-user

    ```shell
    mkdir -pv "${HOME}/bin"
    mkdir -pv "${HOME}/.config/environment.d"
    mkdir -pv "${HOME}/.config/systemd/user"
    mkdir -pv "${HOME}/.config/systemd/user/sockets.target.wants"
    mkdir -pv "${HOME}/.local/share/systemd/user"
    mkdir -pv "${HOME}/.local/bin"
    touch ~/.{profile,bashrc,bash_profile,bash_login,bash_logout,hushlogin}
    mkdir -pv "${HOME}"/bin
    mkdir -pv "${HOME}"/.config/{systemd,environment.d}
    mkdir -pv "${HOME}"/.config/systemd/user/sockets.target.wants
    mkdir -pv "${HOME}"/.local/share/systemd/user
    mkdir -pv "${HOME}"/.local/bin
    ```

    ```shell
    @@ -468,7 +455,7 @@ loginctl enable-linger $(whoami)
    systemctl --user daemon-reload
    ```

    If you need to switch to root user, use the following instead of `sudo su - <user>`.
    Note: If you need to switch to root user, use the following instead of `sudo su - <user>`.

    ```shell
    # sudo machinectl shell <username>@
    @@ -491,12 +478,20 @@ sudo dnf install --allowerasing -y \
    udica
    ```

    Add the current user e.g. `ec2-user` to the docker group:

    ```shell
    sudo groupadd docker
    sudo usermod -aG docker $USER
    sudo newgrp docker
    ```
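    Group membership only applies to new login sessions (or the newgrp shell above); a quick way to confirm it took effect:

    ```shell
    # List the invoking user's current group memberships
    id -nG "$USER"
    ```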

    Configure the following docker daemon settings:

    ```shell
    sudo mkdir -pv /etc/docker
    test -d /etc/docker || sudo mkdir -pv /etc/docker
    cat <<'EOF' | sudo tee /etc/docker/daemon.json
    test -f /etc/docker/daemon.json || cat <<'EOF' | sudo tee /etc/docker/daemon.json
    {
    "debug": false,
    "experimental": false,
    @@ -516,25 +511,20 @@ EOF
    - https://docs.docker.com/reference/cli/dockerd/#daemon-configuration-file
    - https://docs.docker.com/config/containers/logging/awslogs/

    Add the current user e.g. `ec2-user` to the docker group:
    Enable and start the docker and containerd service(s):

    ```shell
    sudo usermod -aG docker $USER
    sudo systemctl enable --now docker.service containerd.service
    sudo systemctl status docker containerd
    ```

    Enable and start the docker service:

    ```shell
    sudo systemctl enable --now docker
    sudo systemctl status docker
    ```
    ### Install the Docker Compose v2 CLI Plugin

    ### Install the Docker Compose v2 Plugin
    Install the Docker Compose plugin with the following commands.

    Install the Docker Compose plugin with the following commands:
    To install the docker compose plugin for all users:

    ```shell
    # Install the docker compose plugin for all users
    sudo mkdir -p /usr/local/lib/docker/cli-plugins
    sudo curl -sL https://github.com/docker/compose/releases/latest/download/docker-compose-linux-"$(uname -m)" \
    @@ -547,27 +537,27 @@ test -f /usr/local/lib/docker/cli-plugins/docker-compose \
    && sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-compose
    ```

    (Optional) To install for the local user, run the following commands:
    (Optional) To install only for the local user e.g. `ec2-user`, run the following commands:

    ```shell
    mkdir -p "${HOME}/.docker/cli-plugins" \
    && touch "${HOME}/.docker/config.json"
    cp /usr/local/lib/docker/cli-plugins/docker-compose "${HOME}/.docker/cli-plugins/docker-compose"
    mkdir -p "${HOME}/.docker/cli-plugins" && touch "${HOME}/.docker/config.json"
    curl -sL https://github.com/docker/compose/releases/latest/download/docker-compose-linux-"$(uname -m)" \
    -o "${HOME}/.docker/cli-plugins/docker-compose"
    cat <<'EOF' | tee -a "${HOME}/.bashrc"
    # https://specifications.freedesktop.org/basedir-spec/latest/index.html
    XDG_CONFIG_HOME="${HOME}/.config"
    XDG_DATA_HOME="${HOME}/.local/share"
    XDG_RUNTIME_DIR="${XDG_RUNTIME_DIR:-/run/user/$(id -u)}"
    DBUS_SESSION_BUS_ADDRESS="unix:path=${XDG_RUNTIME_DIR}/bus"
    export XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR DBUS_SESSION_BUS_ADDRESS
    # Docker
    DOCKER_TLS_VERIFY=1
    #DOCKER_CONFIG=/usr/local/lib/docker
    DOCKER_CONFIG="${DOCKER_CONFIG:-$HOME/.docker}"
    DOCKER_TLS_VERIFY=1
    export DOCKER_CONFIG DOCKER_TLS_VERIFY
    #DOCKER_HOST="unix:///run/user/$(id -u)/docker.sock"
    #export DOCKER_HOST
    @@ -585,7 +575,8 @@ docker compose version
    (Optional) Install docker scout with the following commands:

    ```shell
    <commands goes here>
    curl -sSfL https://raw.githubusercontent.com/docker/scout-cli/main/install.sh | sh -s --
    chmod +x $HOME/.docker/scout/docker-scout
    ```

    - https://github.com/docker/scout-cli
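    Assuming the install script placed the plugin under $HOME/.docker/scout where the Docker CLI can find it, a quick check that it is picked up:

    ```shell
    # Confirm the scout CLI plugin is discovered by docker
    docker scout version
    ```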
    @@ -622,7 +613,19 @@ This is mostly optional if needed, otherwise you can just skip this one.

    ```shell
    sudo dnf install --allowerasing -y aws-nitro-enclaves-cli aws-nitro-enclaves-cli-devel
    ```

    Add the user to the `ne` group:

    ```shell
    sudo groupadd ne
    sudo usermod -aG ne $USER
    sudo newgrp ne
    ```

    Enable and start the service:

    ```shell
    sudo systemctl enable --now nitro-enclaves-allocator.service
    ```
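A minimal end-to-end sketch follows. It assumes the instance was launched with Nitro Enclaves enabled and that the allocator (`/etc/nitro_enclaves/allocator.yaml`) reserves enough memory and vCPUs; the image and sizes are illustrative only:

```shell
# Convert a container image into an enclave image file (EIF)
nitro-cli build-enclave --docker-uri hello-world:latest --output-file hello.eif

# Launch the enclave with illustrative CPU/memory sizing (memory in MiB)
nitro-cli run-enclave --eif-path hello.eif --cpu-count 2 --memory 512 --debug-mode

# Confirm the enclave is running
nitro-cli describe-enclaves
```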

    @@ -638,22 +641,37 @@ To install the Nvidia drivers:
    sudo dnf install -y wget kernel-modules-extra kernel-devel gcc
    ```

    Download the driver install script, run it and verify:
Download the driver install script, run it to install the drivers and verify:

    ```shell
    curl -sL 'https://us.download.nvidia.com/tesla/535.161.08/NVIDIA-Linux-x86_64-535.161.08.run' -O
    sudo sh NVIDIA-Linux-x86_64-535.161.08.run -a -s --ui=none -m=kernel-open
    ```

    Verify:

```shell
    nvidia-smi
    ```

    For the Nvidia container runtime:
    For the Nvidia container runtime, add the nvidia container repo:

    ```shell
    curl -sL 'https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo' | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
    sudo dnf check-update
    ```

    Install and configure the `nvidia-container-toolkit`:

    ```shell
    sudo dnf install -y nvidia-container-toolkit
    sudo nvidia-ctk runtime configure --runtime=docker
    sudo systemctl restart docker
    ```

    Restart the docker and containerd services:

    ```shell
    sudo systemctl restart docker containerd
    ```

    To create an Ubuntu based container with access to the host GPUs:
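The usual invocation, matching the GPU test shown in earlier revisions of this gist, is:

```shell
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
```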
    @@ -699,19 +717,4 @@ To create an AL2023 based container:
    ```shell
    docker pull public.ecr.aws/amazonlinux/amazonlinux:2023
    docker run -it --security-opt seccomp=unconfined public.ecr.aws/amazonlinux/amazonlinux:2023 /bin/bash
    ```

    ## Docker Resources

    * https://mobyproject.org/
    * https://github.com/docker/docker-install
    * https://github.com/docker/docker-ce-packaging
    * https://download.docker.com/linux/static/stable/
    * https://docs.docker.com/compose/install/linux/
    * https://github.com/docker/compose/
    * https://github.com/docker/docker-credential-helpers
    * https://github.com/docker/buildx

    ## Containers

    * https://gallery.ecr.aws/
    ```
    46 changes: 46 additions & 0 deletions zz-README.md
    @@ -0,0 +1,46 @@
# Unofficial Guide to Amazon Linux 2023

    ## Amazon Linux 2023 Resources

    - https://aws.amazon.com/linux/amazon-linux-2023/faqs/
    - https://github.com/amazonlinux/amazon-linux-2023/
    - https://cdn.amazonlinux.com/al2023/os-images/latest/
    - https://alas.aws.amazon.com/alas2023.html
    - https://docs.aws.amazon.com/linux/al2023/
    - https://docs.aws.amazon.com/linux/al2023/release-notes/relnotes.html
    - https://docs.aws.amazon.com/linux/al2023/ug/deterministic-upgrades.html
    - [Manage package and operating system updates in AL2023](https://docs.aws.amazon.com/linux/al2023/ug/managing-repos-os-updates.html)
    - https://lwn.net/Articles/926352/

    ### AL2023 Repository Details

    ```
    # cdn.amazonlinux.com (x86_64)
    https://cdn.amazonlinux.com/al2023/core/mirrors/latest/x86_64/mirror.list
    https://cdn.amazonlinux.com/al2023/core/guids/<guid>/x86_64/
    # cdn.amazonlinux.com (aarch64)
    https://cdn.amazonlinux.com/al2023/core/mirrors/latest/aarch64/mirror.list
    https://cdn.amazonlinux.com/al2023/core/guids/<guid>/aarch64/
    # al2023-repos-us-east-1-<guid>.s3.dualstack.<region>.amazonaws.com
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/core/mirrors/<releasever>/x86_64/mirror.list
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/core/guids/<guid>/x86_64/<rest_of_url>
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/core/mirrors/<releasever>/SRPMS/mirror.list
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/kernel-livepatch/mirrors/al2023/x86_64/mirror.list
    ```
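To see where the package manager would actually fetch packages from, you can resolve the mirror list for the release the host is pinned to. The sketch below simply substitutes the host's system-release version into the CDN URL pattern above; whether every point release is published at exactly that path is an assumption:

```shell
# Releasever of the running host, e.g. 2023.6.20241212
releasever="$(rpm -q system-release --qf '%{VERSION}')"

# Fetch the x86_64 mirror list for that release from the public CDN
curl -s "https://cdn.amazonlinux.com/al2023/core/mirrors/${releasever}/x86_64/mirror.list"
```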

    ## Docker Resources

    - https://mobyproject.org/
    - https://github.com/docker/docker-install
    - https://github.com/docker/docker-ce-packaging
    - https://download.docker.com/linux/static/stable/
    - https://docs.docker.com/compose/install/linux/
    - https://github.com/docker/compose/
    - https://github.com/docker/docker-credential-helpers
    - https://github.com/docker/buildx

    ## Containers Resources

    * https://gallery.ecr.aws/
  18. @thimslugga thimslugga revised this gist Aug 9, 2024. 2 changed files with 68 additions and 49 deletions.
    115 changes: 67 additions & 48 deletions setup-docker-al2023.md
    @@ -100,7 +100,7 @@ dnf repoinfo
    dnf repolist all --verbose
    ```

    ### Install base os packages
    ### Install Base OS Packages

    Install the following packages, which are good to have installed:

    @@ -189,19 +189,21 @@ sudo dnf install --allowerasing -y \
    rsync
    ```

    ### (Optional) Install the EC2 Instance Connect Utility
    ### (Optional) Remove EC2 Hibernation Agent

    Run the following command to remove the EC2 Hibernation Agent:

    ```shell
    sudo dnf install --allowerasing -y ec2-instance-connect ec2-instance-connect-selinux
    sudo dnf remove -y ec2-hibinit-agent
    ```

    ### (Optional) Install the Amazon EFS Utils helper tool
    ### (Optional) Install EC2 Instance Connect Utility

    ```shell
    sudo dnf install --allowerasing -y amazon-efs-utils
    sudo dnf install --allowerasing -y ec2-instance-connect ec2-instance-connect-selinux
    ```

    ### (Optional) Install the smart-restart utility package
    ### (Optional) Install Smart-Restart Utility

Amazon Linux now ships with the [smart-restart](https://github.com/amazonlinux/smart-restart) package. The smart-restart utility restarts systemd services on system updates whenever a package is installed or removed with the system package manager, i.e. whenever a `dnf <update|upgrade|downgrade>` is executed.

    @@ -215,45 +217,49 @@ After the installation, the subsequent transactions will trigger the smart-resta

    - https://docs.aws.amazon.com/linux/al2023/ug/managing-repos-os-updates.html#automatic-restart-services

    ### (Optional) Enable FIPS Mode on the Host

    This step is very much optional and will be very end user environment specific. I would recommend reading into FIPS compliance and validation before enabling this on your EC2 instances.
    ### (Optional) Enable Kernel Live Patching (KLP)

    - https://docs.aws.amazon.com/linux/al2023/ug/fips-mode.html
    Run the following command to install the kernel live patching feature:

    ```shell
    sudo dnf install --allowerasing -y crypto-policies crypto-policies-scripts
    sudo dnf install --allowerasing -y kpatch-dnf kpatch-runtime
    ```

    Enable the service:

    ```shell
    sudo fips-mode-setup --check
    sudo fips-mode-setup --enable
    sudo fips-mode-setup --check
    sudo dnf kernel-livepatch -y auto
    sudo systemctl daemon-reload
    sudo systemctl enable --now kpatch.service
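# (Sketch) Confirm live patching is active; both commands assume patches
# have been published for the running kernel and may return nothing otherwise.
sudo dnf list installed "kernel-livepatch*"
sudo kpatch list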
    ```

    ### (Optional) Install Amazon EFS Utils

    ```shell
    sudo systemctl reboot
    sudo dnf install --allowerasing -y amazon-efs-utils
    ```

    ### (Optional) Install and enable the Kernel Live Patching (KLP) feature
    ### (Optional) Enable FIPS Mode on the Host

    Run the following command to install and enable the kernel live patching feature:
    This step is safe to skip as it will only apply to specific end user environments. I would recommend reading into FIPS compliance, validation and certification before enabling FIPS mode on EC2 instances.

    - https://docs.aws.amazon.com/linux/al2023/ug/fips-mode.html

    ```shell
    sudo dnf install --allowerasing -y kpatch-dnf kpatch-runtime
    sudo dnf kernel-livepatch -y auto
    sudo systemctl enable --now kpatch.service
    sudo dnf install --allowerasing -y crypto-policies crypto-policies-scripts
    ```

    ### (Optional) Remove the EC2 Hibernation agent

    Run the following command to remove the EC2 Hibernation Agent:
    ```shell
    sudo fips-mode-setup --check
    sudo fips-mode-setup --enable
    sudo fips-mode-setup --check
    ```

    ```shell
    sudo dnf remove -y ec2-hibinit-agent
    sudo systemctl reboot
    ```

    ### (Optional) Install and setup the Amazon SSM agent service
    ### (Optional) Setup Amazon SSM Agent

    Install the Amazon SSM Agent:

    @@ -268,8 +274,8 @@ The following is a tweak, which should resolve the following reported issue.

    Add the following drop-in to make sure networking is up, dns resolution works and cloud-init has finished before the amazon ssm agent is started.

    ```
    sudo mkdir -p /etc/systemd/system/amazon-ssm-agent.service.d
    ```shell
    sudo mkdir -pv /etc/systemd/system/amazon-ssm-agent.service.d

    cat <<'EOF' | sudo tee /etc/systemd/system/amazon-ssm-agent.service.d/00-override.conf
    [Unit]
    @@ -285,11 +291,10 @@ DefaultDependencies=no
    ConditionFileIsExecutable=/usr/bin/amazon-ssm-agent
    EOF
    sudo systemctl daemon-reload
    ```

    ```shell
    sudo systemctl daemon-reload
    sudo systemctl enable --now amazon-ssm-agent.service
    sudo systemctl try-reload-or-restart amazon-ssm-agent.service
    sudo systemctl status amazon-ssm-agent.service
    @@ -307,8 +312,8 @@ sudo dnf install --allowerasing -y amazon-cloudwatch-agent collectd

    Add the following drop-in to make sure networking is up, dns resolution works and cloud-init has finished before the unified cloudwatch agent is started.

    ```
    sudo mkdir -p /etc/systemd/system/amazon-cloudwatch-agent.d
    ```shell
    sudo mkdir -pv /etc/systemd/system/amazon-cloudwatch-agent.d

    cat <<'EOF' | sudo tee /etc/systemd/system/amazon-cloudwatch-agent.d/00-override.conf
    [Unit]
    @@ -324,11 +329,10 @@ DefaultDependencies=no
    ConditionFileIsExecutable=/opt/aws/amazon-cloudwatch-agent/bin/start-amazon-cloudwatch-agent
    EOF
    sudo systemctl daemon-reload
    ```

    ```shell
    sudo systemctl daemon-reload
    sudo systemctl enable --now amazon-cloudwatch-agent.service
    sudo systemctl try-reload-or-restart amazon-cloudwatch-agent.service
    sudo systemctl status amazon-cloudwatch-agent.service
    @@ -365,7 +369,7 @@ The current version of the `CloudWatchAgentServerPolicy`:
    }
    ```

    ### (Optional) Install Ansible on the host
    ### (Optional) Install Ansible

    Run the following to install ansible on the host:

    @@ -377,49 +381,64 @@ sudo dnf install -y \
    sshpass
    ```

    ### Configure some sane OS default settings
    ### Configure Sane OS Defaults

    Locale:
    Configure the locale:

    ```shell
    sudo localectl set-locale LANG=en_US.UTF-8
    localectl
    ```

    Hostname:
    Configure the hostname:

    ```shell
    sudo hostnamectl set-hostname <hostname>
    sudo hostnamectl set-hostname --static <hostname>
sudo hostnamectl set-chassis vm
```

    Verify:

    ```shell
    hostnamectl
    ```

    Set the system timezone to UTC and ensure chronyd is enabled and started:

    ```shell
    sudo systemctl enable --now chronyd
    ```

    ```shell
    sudo timedatectl set-timezone Etc/UTC
    sudo systemctl enable --now chronyd
    sudo timedatectl set-ntp true
    ```

    Verify:

    ```shell
    timedatectl
    ```

    Logging:
    Configure logging:

    ```shell
    sudo mkdir -p /etc/systemd/journald.conf.d
    sudo mkdir -pv /etc/systemd/journald.conf.d
    cat <<'EOF' | sudo tee /etc/systemd/journald.conf.d/00-override.conf
    [Journal]
    SystemMaxUse=100M
    RuntimeMaxUse=100M
    RuntimeMaxFileSize=10M
    RateLimitIntervals=1s
    RateLimitBurst=10000
    EOF
    ```

    ```shell
    sudo systemctl daemon-reload
    sudo systemctl restart systemd-journald.service
    sudo systemctl try-reload-or-restart systemd-journald.service
    sudo systemctl status systemd-journald.service
    ```

    ### Configure a sane user environment for the current user e.g. ec2-user
    @@ -475,7 +494,7 @@ sudo dnf install --allowerasing -y \
    Configure the following docker daemon settings:

    ```shell
    sudo mkdir -p /etc/docker
    sudo mkdir -pv /etc/docker
    cat <<'EOF' | sudo tee /etc/docker/daemon.json
    {
    @@ -494,8 +513,8 @@ cat <<'EOF' | sudo tee /etc/docker/daemon.json
    EOF
    ```

    * https://docs.docker.com/reference/cli/dockerd/#daemon-configuration-file
    * https://docs.docker.com/config/containers/logging/awslogs/
    - https://docs.docker.com/reference/cli/dockerd/#daemon-configuration-file
    - https://docs.docker.com/config/containers/logging/awslogs/

    Add the current user e.g. `ec2-user` to the docker group:

    @@ -671,13 +690,13 @@ aws sts get-caller-identity

    Login to the AWS ECR service:

    ```sh
    ```shell
    aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws
    ```

    To create an AL2023 based container:

    ```sh
    ```shell
    docker pull public.ecr.aws/amazonlinux/amazonlinux:2023
    docker run -it --security-opt seccomp=unconfined public.ecr.aws/amazonlinux/amazonlinux:2023 /bin/bash
    ```
    2 changes: 1 addition & 1 deletion zz-performance-tuning-al2023.md
    @@ -2,7 +2,7 @@

    ## EC2 Bandwidth Limits

    ```
    ```shell
    ethtool -S eth0 | grep -E 'err|exceeded|missed'
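# The pattern above casts a wide net; to look specifically at the ENA
# allowance counters AWS documents for instance-level limits, something
# like the following works (the interface may be ens5 instead of eth0):
ethtool -S eth0 | grep -E 'bw_in_allowance_exceeded|bw_out_allowance_exceeded|pps_allowance_exceeded|conntrack_allowance_exceeded|linklocal_allowance_exceeded'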
    ```

  19. @thimslugga thimslugga revised this gist Aug 8, 2024. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion zz-setup-dnsmasq-al2023.md
    @@ -240,10 +240,12 @@ Unlink the stub and re-create the /etc/resolv.conf file:

    ```shell
    sudo unlink /etc/resolv.conf
    ```

    ```shell
    cat <<'EOF' | sudo tee /etc/resolv.conf
    nameserver ::1
    nameserver 127.0.0.1
    nameserver ::1
    search ec2.internal
    options edns0 timeout:1 attempts:5
    #options trust-ad
  20. @thimslugga thimslugga revised this gist Jul 30, 2024. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion setup-docker-al2023.md
    @@ -108,8 +108,9 @@ Install the following packages, which are good to have installed:
    sudo dnf install --allowerasing -y \
    kernel-modules-extra \
    dnf-plugins-core \
    dnf-utils \
    dnf-plugin-release-notification \
    dnf-plugin-support-info \
    dnf-utils \
    git-core \
    git-lfs \
    grubby \
  21. @thimslugga thimslugga revised this gist Jul 30, 2024. 1 changed file with 11 additions and 5 deletions.
    16 changes: 11 additions & 5 deletions setup-docker-al2023.md
    @@ -48,23 +48,29 @@ OR
    cat /etc/amazon-linux-release
    ```

    To find out the latest Amazon Linux 2023 release:
    To find out the LATEST SYSTEM RELEASE of Amazon Linux 2023:

    ```shell
    sudo dnf check-release-update --latest-only --version-only

    # You can use the following command to get more verbose output
    sudo dnf check-release-update --refresh --latest-only --version-only
    #sudo dnf check-release-update
    ```

    To upgrade the host for the current system release i.e. (`cat /etc/amazon-linux-release`):
    To upgrade the Amazon Linux 2023 based host for the CURRENT SYSTEM RELEASE i.e. (`cat /etc/amazon-linux-release`):

    ```shell
    sudo dnf check-update --refresh
    sudo dnf upgrade --refresh
    ```

    To upgrade the host to the latest Amazon Linux 2023 system release:
    To upgrade the Amazon Linux 2023 based host to a SPECIFIC SYSTEM RELEASE:

    ```shell
    sudo dnf check-update --refresh --releasever=2023.5.20240722
    sudo dnf update --refresh --releasever=2023.5.20240722
    ```

    To upgrade the Amazon Linux 2023 based host to the LATEST SYSTEM RELEASE:

    ```shell
    sudo dnf check-update --refresh --releasever=latest
  22. @thimslugga thimslugga revised this gist Jul 30, 2024. 1 changed file with 18 additions and 18 deletions.
    36 changes: 18 additions & 18 deletions setup-docker-al2023.md
    Original file line number Diff line number Diff line change
    @@ -14,6 +14,24 @@ The following guide is for setting up Docker with docker-compose v2 on Amazon Li
    - [Manage package and operating system updates in AL2023](https://docs.aws.amazon.com/linux/al2023/ug/managing-repos-os-updates.html)
    - https://lwn.net/Articles/926352/

    AL2023 Repository details:

    ```
    # cdn.amazonlinux.com (x86_64)
    https://cdn.amazonlinux.com/al2023/core/mirrors/latest/x86_64/mirror.list
    https://cdn.amazonlinux.com/al2023/core/guids/<guid>/x86_64/
    # cdn.amazonlinux.com (aarch64)
    https://cdn.amazonlinux.com/al2023/core/mirrors/latest/aarch64/mirror.list
    https://cdn.amazonlinux.com/al2023/core/guids/<guid>/aarch64/
    # al2023-repos-us-east-1-<guid>.s3.dualstack.<region>.amazonaws.com
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/core/mirrors/<releasever>/x86_64/mirror.list
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/core/guids/<guid>/x86_64/<rest_of_url>
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/core/mirrors/<releasever>/SRPMS/mirror.list
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/kernel-livepatch/mirrors/al2023/x86_64/mirror.list
    ```

    ## Install and configure Docker on Amazon Linux 2023

    ### Check for new updates
    @@ -76,24 +94,6 @@ dnf repoinfo
    dnf repolist all --verbose
    ```

    AL2023 Repository details:

    ```
    # cdn.amazonlinux.com (x86_64)
    https://cdn.amazonlinux.com/al2023/core/mirrors/latest/x86_64/mirror.list
    https://cdn.amazonlinux.com/al2023/core/guids/<guid>/x86_64/
    # cdn.amazonlinux.com (aarch64)
    https://cdn.amazonlinux.com/al2023/core/mirrors/latest/aarch64/mirror.list
    https://cdn.amazonlinux.com/al2023/core/guids/<guid>/aarch64/
    # al2023-repos-us-east-1-<guid>.s3.dualstack.<region>.amazonaws.com
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/core/mirrors/<releasever>/x86_64/mirror.list
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/core/guids/<guid>/x86_64/<rest_of_url>
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/core/mirrors/<releasever>/SRPMS/mirror.list
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/kernel-livepatch/mirrors/al2023/x86_64/mirror.list
    ```

    ### Install base os packages

    Install the following packages, which are good to have installed:
  23. @thimslugga thimslugga revised this gist Jul 29, 2024. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion zz-setup-dnsmasq-al2023.md
    @@ -225,8 +225,8 @@ sudo mkdir -pv /etc/systemd/resolved.conf.d
    cat <<'EOF' | sudo tee /etc/systemd/resolved.conf.d/00-override.conf
    [Resolve]
    DNS=127.0.0.1
    FallbackDNS=169.254.169.253
    DNSStubListener=no
    FallbackDNS=
    MulticastDNS=no
    LLMNR=no
  24. @thimslugga thimslugga revised this gist Jul 15, 2024. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions zz-performance-tuning-al2023.md
    @@ -101,6 +101,7 @@ cat <<'EOF' | sudo tee /etc/sysctl.d/99-custom-tuning.conf
    # https://blog.packagecloud.io/monitoring-tuning-linux-networking-stack-receiving-data/
    # https://github.com/myllynen/rhel-performance-guide
    # https://github.com/myllynen/rhel-troubleshooting-guide
    # https://www.brendangregg.com/linuxperf.html
    # Minimize console logging level for kernel printk messages.
    # The defaults are very verbose and have a performance impact.
  25. @thimslugga thimslugga revised this gist Jul 15, 2024. 1 changed file with 33 additions and 17 deletions.
    50 changes: 33 additions & 17 deletions zz-performance-tuning-al2023.md
    Original file line number Diff line number Diff line change
    @@ -31,15 +31,24 @@ cat /proc/interrupts | grep Tx-Rx

    ## GRUB Configuration

    ```
    uname -sr
    cat /proc/cmdline
    ```shell
    uname -sr; cat /proc/cmdline
    ```

    ```
    ```shell
    sudo grubby --update-kernel=ALL --args="intel_idle.max_cstate=1 processor.max_cstate=1 cpufreq.default_governor=performance swapaccount=1 psi=1"
    ```

    Verify:

    ```shell
    sudo grubby --info=ALL
    #sudo systemctl reboot
    ```

    To reboot the host:

    ```shell
    sudo systemctl reboot
    ```

    - https://fasterdata.es.net/host-tuning/linux/100g-tuning/cpu-governor/
    @@ -57,7 +66,7 @@ echo 0 > /proc/sys/net/ipv4/tcp_sack
    cat <<'EOF' | sudo tee /etc/sysctl.d/99-custom-tuning.conf
    # Custom kernel sysctl configuration file
    #
    # Disclaimer: These settings are not a one size fits all, you will need to test and valid them in your environment.
    # Disclaimer: These settings are not a one size fits all and you will need to test and validate them in your own environment.
    #
    # https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt
    # https://www.kernel.org/doc/Documentation/sysctl/net.txt
    @@ -70,13 +79,15 @@ cat <<'EOF' | sudo tee /etc/sysctl.d/99-custom-tuning.conf
    #
    # AWS References:
    #
    # https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-network-bandwidth.html
    # https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ena-nitro-perf.html#ena-nitro-perf-considerations
    # https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ena-improve-network-latency-linux.html#ena-latency-kernel-config
    # https://github.com/amzn/amzn-drivers/blob/master/kernel/linux/ena/ENA_Linux_Best_Practices.rst#performance-optimizations-faqs
    # https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/processor_state_control.html
    # https://github.com/amzn/amzn-ec2-ena-utilities/tree/main
    # https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/install-and-configure-cloudwatch-agent-using-ec2-console.html
    # - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-network-bandwidth.html
    # - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ena-nitro-perf.html
    # - https://github.com/amzn/amzn-drivers/blob/master/kernel/linux/ena/ENA_Linux_Best_Practices.rst
    # - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ena-improve-network-latency-linux.html
    # - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ena-express.html
    # - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/processor_state_control.html
    # - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-network-performance-ena.html
    # - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/install-and-configure-cloudwatch-agent-using-ec2-console.html
    # - https://github.com/amzn/amzn-ec2-ena-utilities/tree/main
    #
    # Misc References:
    #
    @@ -175,12 +186,11 @@ vm.overcommit_memory=1
    vm.swappiness=10
    # Maximum percentage of dirty system memory
# Note: On SLES 12 and 15, the default is 20.
    # https://www.suse.com/support/kb/doc/?id=000017857
    vm.dirty_ratio = 10
    # Percentage of dirty system memory at which background
    # writeback will start (default 10).
    # Percentage of dirty system memory at which background writeback will start.
    # (default 10)
    vm.dirty_background_ratio=5
    # Some kernels won't allow dirty_ratio to be set below 5%.
    @@ -229,7 +239,7 @@ fs.inotify.max_user_watches=524288
    # Increasing this value for high speed cards may help prevent losing packets
    #net.core.netdev_max_backlog=16384
    # Increase UDP buffer size
    # Increase the UDP buffer size
    # https://github.com/quic-go/quic-go/wiki/UDP-Buffer-Sizes
    # https://medium.com/@CameronSparr/increase-os-udp-buffers-to-improve-performance-51d167bb1360
    # The default socket receive buffer (size in bytes)
    @@ -248,6 +258,12 @@ fs.inotify.max_user_watches=524288
    #net.ipv4.tcp_rmem=4096 87380 16777216
    #net.ipv4.tcp_wmem=4096 65536 16777216
    # Enable busy poll mode
    # Busy poll mode reduces latency on the network receive path. When you enable busy poll
    # mode, the socket layer code can directly poll the receive queue of a network device.
    # The downside of busy polling is higher CPU usage in the host that comes from polling
    # for new data in a tight loop. There are two global settings that control the number of
    # microseconds to wait for packets for all interfaces.
    #net.core.busy_read=50
    #net.core.busy_poll=50
  26. @thimslugga thimslugga revised this gist Jul 5, 2024. 1 changed file with 2 additions and 7 deletions.
    9 changes: 2 additions & 7 deletions setup-docker-al2023.md
    @@ -9,8 +9,9 @@ The following guide is for setting up Docker with docker-compose v2 on Amazon Li
    - https://cdn.amazonlinux.com/al2023/os-images/latest/
    - https://alas.aws.amazon.com/alas2023.html
    - https://docs.aws.amazon.com/linux/al2023/
    - https://docs.aws.amazon.com/linux/al2023/release-notes/relnotes.html
    - https://docs.aws.amazon.com/linux/al2023/ug/deterministic-upgrades.html
    - https://docs.aws.amazon.com/linux/al2023/ug/managing-repos-os-updates.html
    - [Manage package and operating system updates in AL2023](https://docs.aws.amazon.com/linux/al2023/ug/managing-repos-os-updates.html)
    - https://lwn.net/Articles/926352/

    ## Install and configure Docker on Amazon Linux 2023
    @@ -674,12 +675,6 @@ docker pull public.ecr.aws/amazonlinux/amazonlinux:2023
    docker run -it --security-opt seccomp=unconfined public.ecr.aws/amazonlinux/amazonlinux:2023 /bin/bash
    ```

    ## Amazon Linux 2023 Resources

    * https://docs.aws.amazon.com/linux/al2023/release-notes/relnotes.html
    * https://docs.aws.amazon.com/linux/al2023/ug/deterministic-upgrades-usage.html
    * [Manage package and operating system updates in AL2023](https://docs.aws.amazon.com/linux/al2023/ug/managing-repos-os-updates.html)

    ## Docker Resources

    * https://mobyproject.org/
  27. @thimslugga thimslugga revised this gist Jul 5, 2024. 1 changed file with 89 additions and 10 deletions.
    99 changes: 89 additions & 10 deletions setup-docker-al2023.md
    @@ -2,40 +2,96 @@

The following guide is for setting up Docker with docker-compose v2 on Amazon Linux 2023. The steps are intended for AL2023 on EC2 but should mostly work for AL2023 VMs running on other hypervisors.

    ## Amazon Linux 2023 Resources

    - https://aws.amazon.com/linux/amazon-linux-2023/faqs/
    - https://github.com/amazonlinux/amazon-linux-2023/
    - https://cdn.amazonlinux.com/al2023/os-images/latest/
    - https://alas.aws.amazon.com/alas2023.html
    - https://docs.aws.amazon.com/linux/al2023/
    - https://docs.aws.amazon.com/linux/al2023/ug/deterministic-upgrades.html
    - https://docs.aws.amazon.com/linux/al2023/ug/managing-repos-os-updates.html
    - https://lwn.net/Articles/926352/

    ## Install and configure Docker on Amazon Linux 2023

    ### Check for new updates

    Get the current release:
    Get the hosts current Amazon Linux 2023 release:

    ```shell
    rpm -q system-release --qf "%{VERSION}\n"
    ```

    Find out the latest release:
    OR

    ```
    cat /etc/amazon-linux-release
    ```

    To find out the latest Amazon Linux 2023 release:

    ```shell
    sudo dnf check-release-update --latest-only --version-only

    # Use the following for more verbose output
    # You can use the following command to get more verbose output
    #sudo dnf check-release-update
    ```

    To upgrade the host for the current release:

    To upgrade the host for the current system release i.e. (`cat /etc/amazon-linux-release`):

    ```shell
    sudo dnf check-update --refresh
    sudo dnf upgrade --refresh
    ```

    To upgrade the host to the latest release:
    To upgrade the host to the latest Amazon Linux 2023 system release:

    ```shell
    #sudo touch /etc/dnf/vars/releasever && echo 'latest' | sudo tee /etc/dnf/vars/releasever
    sudo dnf check-update --refresh --releasever=latest
    sudo dnf upgrade --refresh --releasever=latest
    ````
    ```

    Note: Using `sudo dnf upgrade --releasever=latest` updates all packages, including system-release. Then, the version remains locked to the new system-release unless you set the persistent override.

    To permanently switch the host to always get the latest system release updates:

    ```shell
    # This command only needs to be run once
    sudo touch /etc/dnf/vars/releasever && echo 'latest' | sudo tee /etc/dnf/vars/releasever
    ```

    Then it's just a matter of running the following commands to update via `latest`:

    ```shell
    sudo dnf check-update --refresh
    sudo dnf upgrade --refresh
    ```

    To get more details about the current repos:

    ```shell
    dnf repoinfo
    dnf repolist all --verbose
    ```

    AL2023 Repository details:

    ```
    # cdn.amazonlinux.com (x86_64)
    https://cdn.amazonlinux.com/al2023/core/mirrors/latest/x86_64/mirror.list
    https://cdn.amazonlinux.com/al2023/core/guids/<guid>/x86_64/
    # cdn.amazonlinux.com (aarch64)
    https://cdn.amazonlinux.com/al2023/core/mirrors/latest/aarch64/mirror.list
    https://cdn.amazonlinux.com/al2023/core/guids/<guid>/aarch64/
    # al2023-repos-us-east-1-<guid>.s3.dualstack.<region>.amazonaws.com
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/core/mirrors/<releasever>/x86_64/mirror.list
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/core/guids/<guid>/x86_64/<rest_of_url>
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/core/mirrors/<releasever>/SRPMS/mirror.list
    https://al2023-repos-<region>-<guid>.s3.dualstack.<region>.amazonaws.com/kernel-livepatch/mirrors/al2023/x86_64/mirror.list
    ```

    ### Install base os packages

    @@ -59,6 +115,9 @@ sudo dnf install --allowerasing -y \
    systemd-pam \
    systemd-container \
    udisks2 \
    crypto-policies \
    crypto-policies-scripts \
    openssl \
    nss-util \
    nss-tools \
    dmidecode \
    @@ -148,6 +207,26 @@ After the installation, the subsequent transactions will trigger the smart-resta

    - https://docs.aws.amazon.com/linux/al2023/ug/managing-repos-os-updates.html#automatic-restart-services

    ### (Optional) Enable FIPS Mode on the Host

    This step is very much optional and will be very end user environment specific. I would recommend reading into FIPS compliance and validation before enabling this on your EC2 instances.

    - https://docs.aws.amazon.com/linux/al2023/ug/fips-mode.html

    ```shell
    sudo dnf install --allowerasing -y crypto-policies crypto-policies-scripts
    ```

    ```shell
    sudo fips-mode-setup --check
    sudo fips-mode-setup --enable
    sudo fips-mode-setup --check
    ```

    ```shell
    sudo systemctl reboot
    ```

    ### (Optional) Install and enable the Kernel Live Patching (KLP) feature

    Run the following command to install and enable the kernel live patching feature:
    @@ -158,15 +237,15 @@ sudo dnf kernel-livepatch -y auto
    sudo systemctl enable --now kpatch.service
    ```

    ### (Optional) Remove the EC2 Hibernation Agent
    ### (Optional) Remove the EC2 Hibernation agent

    Run the following command to remove the EC2 Hibernation Agent:

    ```shell
    sudo dnf remove -y ec2-hibinit-agent
    ```

    ### (Optional) Install and setup the Amazon SSM Agent
    ### (Optional) Install and setup the Amazon SSM agent service

    Install the Amazon SSM Agent:

  28. @thimslugga thimslugga revised this gist Jun 11, 2024. 1 changed file with 86 additions and 56 deletions.
    142 changes: 86 additions & 56 deletions setup-docker-al2023.md
    @@ -83,12 +83,22 @@ sudo dnf install --allowerasing -y \
    unzip \
    p7zip \
    numactl \
    iproute \
    iproute-tc \
    iptables-nft \
    nftables \
    conntrack-tools \
    ipset \
    ethtool \
    net-tools \
    iputils \
    traceroute \
    mtr \
    telnet \
    whois \
    socat \
    bind-utils \
    tcpdump \
    cifs-utils \
    nfsv4-client-utils \
    nfs4-acl-tools \
    @@ -105,15 +115,11 @@ sudo dnf install --allowerasing -y \
    awscli-2 \
    ec2rl \
    ec2-utils \
    bind-utils \
    traceroute \
    mtr \
    telnet \
    whois \
    htop \
    sysstat \
    fio \
    inotify-tools
    inotify-tools \
    rsync
    ```

    ### (Optional) Install the EC2 Instance Connect Utility
    @@ -128,16 +134,14 @@ sudo dnf install --allowerasing -y ec2-instance-connect ec2-instance-connect-sel
    sudo dnf install --allowerasing -y amazon-efs-utils
    ```

    ### (Optional) Install the Smart-Restart Utility Package
    ### (Optional) Install the smart-restart utility package

    Amazon Linux now ships with the [smart-restart](https://github.com/amazonlinux/smart-restart) package, which the smart-restart utility restarts systemd services on system updates whenever a package is installed or deleted using the systems package manager. This occurs whenever a dnf <update|upgrade|downgrade> is executed.
Amazon Linux now ships with the [smart-restart](https://github.com/amazonlinux/smart-restart) package. The smart-restart utility restarts systemd services on system updates whenever a package is installed or removed with the system package manager. This occurs whenever a `dnf <update|upgrade|downgrade>` is executed.

    **How it works?**

    Smart-restart uses the needs-restarting package from dnf-utils and a custom denylisting mechanism to determine which services need to be restarted and whether a system reboot is advised. If a system reboot is advised, a reboot hint marker file is generated (/run/smart-restart/reboot-hint-marker).
The smart-restart utility uses needs-restarting from the dnf-utils package and a custom denylisting mechanism to determine which services need to be restarted and whether a system reboot is advised. If a system reboot is advised, a reboot hint marker file is generated (/run/smart-restart/reboot-hint-marker).

    ```shell
    sudo dnf install -y smart-restart
    sudo dnf install --allowerasing -y smart-restart python3-dnf-plugin-post-transaction-actions
    ```

    After the installation, the subsequent transactions will trigger the smart-restart logic.
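To see what a transaction decided, a quick check along these lines should work; `needs-restarting` ships with dnf-utils and the marker path is the one mentioned above:

```shell
# Ask dnf whether a full reboot is advised after recent updates
sudo dnf needs-restarting -r

# smart-restart drops this marker when it advises a reboot
test -f /run/smart-restart/reboot-hint-marker && echo 'reboot advised'
```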
    @@ -200,7 +204,7 @@ sudo systemctl daemon-reload
    ```shell
    sudo systemctl enable --now amazon-ssm-agent.service
    sudo systemctl restart amazon-ssm-agent.service
    sudo systemctl try-reload-or-restart amazon-ssm-agent.service
    sudo systemctl status amazon-ssm-agent.service
    ```

    @@ -211,7 +215,7 @@ sudo systemctl status amazon-ssm-agent.service
    Install the Unified CloudWatch Agent:

    ```shell
    sudo dnf install --allowerasing -y amazon-cloudwatch-agent
    sudo dnf install --allowerasing -y amazon-cloudwatch-agent collectd
    ```

    Add the following drop-in to make sure networking is up, dns resolution works and cloud-init has finished before the unified cloudwatch agent is started.
    @@ -239,11 +243,11 @@ sudo systemctl daemon-reload

    ```shell
    sudo systemctl enable --now amazon-cloudwatch-agent.service
    sudo systemctl restart amazon-cloudwatch-agent.service
    sudo systemctl try-reload-or-restart amazon-cloudwatch-agent.service
    sudo systemctl status amazon-cloudwatch-agent.service
    ```

    The current version of the CloudWatchAgentServerPolicy looks like this:
    The current version of the `CloudWatchAgentServerPolicy`:

    ```json
    {
    @@ -331,23 +335,29 @@ sudo systemctl daemon-reload
    sudo systemctl restart systemd-journald.service
    ```

    ### Configure sane user environment for ec2-user
    ### Configure a sane user environment for the current user e.g. ec2-user

    ```shell
    touch ~/.{profile,bashrc,bash_profile,bash_login,bash_logout,.hushlogin}
    touch ~/.{profile,bashrc,bash_profile,bash_login,bash_logout,hushlogin}
    ```

    mkdir -p "${HOME}/.config/environment.d"
    mkdir -p "${HOME}/.config/systemd/user"
    mkdir -p "${HOME}/.config/systemd/user/sockets.target.wants"
    mkdir -p "${HOME}/.local/share/systemd/user"
    mkdir -p "${HOME}/.local/bin"
    mkdir -p "${HOME}/bin"
    ```shell
    mkdir -pv "${HOME}/bin"
    mkdir -pv "${HOME}/.config/environment.d"
    mkdir -pv "${HOME}/.config/systemd/user"
    mkdir -pv "${HOME}/.config/systemd/user/sockets.target.wants"
    mkdir -pv "${HOME}/.local/share/systemd/user"
    mkdir -pv "${HOME}/.local/bin"
    ```

    ```shell
    #cat <<'EOF' | tee ~/.config/environment.d/environment_vars.conf
    #PATH="${HOME}/bin:${HOME}/.local/bin:${PATH}"
    #
    #EOF
    ```

    ```shell
    loginctl enable-linger $(whoami)
    systemctl --user daemon-reload
    ```
    @@ -359,31 +369,7 @@ If you need to switch to root user, use the following instead of `sudo su - <use
    sudo machinectl shell root@
    ```

    ### (Optional) Configure the aws-cli for the ec2-user

    ```shell
    # configure region
    aws configure set default.region $(curl --noproxy '*' -w "\n" -s -H "X-aws-ec2-metadata-token: $(curl --noproxy '*' -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")" http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r .region)
    # use regional endpoints
    aws configure set default.sts_regional_endpoints regional
    # get credentials from imds
    aws configure set default.credential_source Ec2InstanceMetadata
    # get credentials last for 1hr
    aws configure set default.duration_seconds 3600
    # set default pager
    aws configure set default.cli_pager ""
    # set output to json
    aws configure set default.output json
    ```

    Verify:

    ```shell
    aws configure list
    aws sts get-caller-identity
    ```

    ### Install and setup moby aka docker service
    ### Install and configure Moby aka Docker on the host

    Run the following command to install moby aka docker:

    @@ -424,10 +410,10 @@ EOF
    * https://docs.docker.com/reference/cli/dockerd/#daemon-configuration-file
    * https://docs.docker.com/config/containers/logging/awslogs/

    Add the `ec2-user` to the docker group:
    Add the current user e.g. `ec2-user` to the docker group:

    ```shell
    sudo usermod -aG docker ec2-user
    sudo usermod -aG docker $USER
    ```

    Enable and start the docker service:
    @@ -488,7 +474,7 @@ Verify the plugin is installed correctly with the following command(s):
    docker compose version
    ```

    ### Install the Docker Scout Plugin
    ### (Optional) Install the Docker Scout Plugin

    (Optional) Install docker scout with the following commands:

    @@ -498,11 +484,11 @@ docker compose version

    - https://github.com/docker/scout-cli

    ### Install the Docker Buildx Plugin
    ### (Skip) Install the Docker Buildx Plugin

    **Note: You can safely skip this step as it should not be necessary due to the version of Moby shipped in AL2023 bundling the buildx plugin by default.**

    (Optional) Install Docker buildx with the following commands:
    (Optional) Install the docker buildx plugin with the following commands:

    ```shell
    sudo curl -sSfL 'https://github.com/docker/buildx/releases/download/v0.14.0/buildx-v0.14.0.linux-amd64' \
    @@ -524,11 +510,31 @@ docker buildx install

    - https://github.com/docker/buildx

    ### (Optional) Install the EC2 Nitro Enclave CLI tool

    This is mostly optional if needed, otherwise you can just skip this one.

    ```shell
    sudo dnf install --allowerasing -y aws-nitro-enclaves-cli aws-nitro-enclaves-cli-devel
    sudo usermod -aG ne $USER
    sudo systemctl enable --now nitro-enclaves-allocator.service
    ```

    - https://docs.aws.amazon.com/enclaves/latest/user/nitro-enclave-cli-install.html
    - https://github.com/aws/aws-nitro-enclaves-cli


    ### (Optional) Install the Nvidia Drivers

    ### (Optional) Nvidia Drivers
    To install the Nvidia drivers:

    ```shell
    sudo dnf install -y wget kernel-modules-extra kernel-devel gcc
    ```

    Download the driver install script, run it and verify:

    ```shell
    curl -sL 'https://us.download.nvidia.com/tesla/535.161.08/NVIDIA-Linux-x86_64-535.161.08.run' -O
    sudo sh NVIDIA-Linux-x86_64-535.161.08.run -a -s --ui=none -m=kernel-open
    nvidia-smi
    @@ -550,7 +556,31 @@ To create an Ubuntu based container with access to the host GPUs:
    docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
    ```

    ### (Optional) Create Amazon Linux 2023 Containers
    ### (Optional) Configure the aws-cli for the ec2-user

    ```shell
    # configure region
    aws configure set default.region $(curl --noproxy '*' -w "\n" -s -H "X-aws-ec2-metadata-token: $(curl --noproxy '*' -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")" http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r .region)
    # use regional endpoints
    aws configure set default.sts_regional_endpoints regional
    # get credentials from imds
    aws configure set default.credential_source Ec2InstanceMetadata
    # get credentials last for 1hr
    aws configure set default.duration_seconds 3600
    # set default pager
    aws configure set default.cli_pager ""
    # set output to json
    aws configure set default.output json
    ```

    Verify:

    ```shell
    aws configure list
    aws sts get-caller-identity
    ```

    ### (Optional) Create your first Amazon Linux 2023 based container(s)

    Login to the AWS ECR service:

  29. @thimslugga thimslugga revised this gist Jun 10, 2024. 1 changed file with 124 additions and 94 deletions.
    218 changes: 124 additions & 94 deletions setup-docker-al2023.md
    @@ -1,10 +1,10 @@
    # Setup Docker on Amazon Linux 2023

    Setup Docker with docker-compose on Amazon Linux 2023
The following guide is for setting up Docker with docker-compose v2 on Amazon Linux 2023. The steps are intended for AL2023 on EC2 but should mostly work for AL2023 VMs running on other hypervisors.

    ## Install and setup docker on Amazon Linux 2023
    ## Install and configure Docker on Amazon Linux 2023

    ### Check for updates
    ### Check for new updates

    Get the current release:

    @@ -116,100 +116,33 @@ sudo dnf install --allowerasing -y \
    inotify-tools
    ```

    ### Configure sane OS default settings

    ```shell
    sudo localectl set-locale LANG=en_US.UTF-8
    sudo hostnamectl set-chassis vm
    sudo timedatectl set-timezone Etc/UTC
    sudo systemctl enable --now chronyd
    sudo timedatectl set-ntp true
    hostnamectl
    timedatectl
    ```
    ### (Optional) Install the EC2 Instance Connect Utility

    ```shell
    sudo mkdir -p /etc/systemd/journald.conf.d
    cat <<'EOF' | sudo tee /etc/systemd/journald.conf.d/00-override.conf
    [Journal]
    SystemMaxUse=100M
    RuntimeMaxUse=100M
    RuntimeMaxFileSize=10M
    RateLimitIntervals=1s
    RateLimitBurst=10000
    EOF
    sudo systemctl daemon-reload
    sudo systemctl restart systemd-journald.service
    sudo dnf install --allowerasing -y ec2-instance-connect ec2-instance-connect-selinux
    ```

    ### Configure sane user environment for ec2-user
    ### (Optional) Install the Amazon EFS Utils helper tool

    ```shell
    touch ~/.{profile,bashrc,bash_profile,bash_login,bash_logout,.hushlogin}
    mkdir -p "${HOME}/.config/environment.d"
    mkdir -p "${HOME}/.config/systemd/user"
    mkdir -p "${HOME}/.config/systemd/user/sockets.target.wants"
    mkdir -p "${HOME}/.local/share/systemd/user"
    mkdir -p "${HOME}/.local/bin"
    mkdir -p "${HOME}/bin"
    #cat <<'EOF' | tee ~/.config/environment.d/environment_vars.conf
    #PATH="${HOME}/bin:${HOME}/.local/bin:${PATH}"
    #
    #EOF
    loginctl enable-linger $(whoami)
    systemctl --user daemon-reload
    sudo dnf install --allowerasing -y amazon-efs-utils
    ```

    If you need to switch to root user, use the following instead of `sudo su - <user>`.

    ```shell
    # sudo machinectl shell <username>@
    sudo machinectl shell root@
    ```
    ### (Optional) Install the Smart-Restart Utility Package

    ### (Optional) Configure the aws-cli for the ec2-user
Amazon Linux now ships with the [smart-restart](https://github.com/amazonlinux/smart-restart) package. The smart-restart utility restarts systemd services on system updates whenever a package is installed or removed with the system package manager. This occurs whenever a dnf <update|upgrade|downgrade> is executed.

    ```shell
    # configure region
    aws configure set default.region $(curl --noproxy '*' -w "\n" -s -H "X-aws-ec2-metadata-token: $(curl --noproxy '*' -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")" http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r .region)
    # use regional endpoints
    aws configure set default.sts_regional_endpoints regional
    # get credentials from imds
    aws configure set default.credential_source Ec2InstanceMetadata
    # get credentials last for 1hr
    aws configure set default.duration_seconds 3600
    # set default pager
    aws configure set default.cli_pager ""
    # set output to json
    aws configure set default.output json
    ```
    **How it works?**

    Verify:
    Smart-restart uses the needs-restarting package from dnf-utils and a custom denylisting mechanism to determine which services need to be restarted and whether a system reboot is advised. If a system reboot is advised, a reboot hint marker file is generated (/run/smart-restart/reboot-hint-marker).

    ```shell
    aws configure list
    aws sts get-caller-identity
    sudo dnf install -y smart-restart
    ```

    ### (Optional) Install Ansible on the host

    Run the following to install ansible on the host:
    After the installation, the subsequent transactions will trigger the smart-restart logic.

    ```shell
    sudo dnf install -y \
    python3-psutil \
    ansible \
    ansible-core \
    sshpass
    ```
    - https://docs.aws.amazon.com/linux/al2023/ug/managing-repos-os-updates.html#automatic-restart-services

    ### (Optional) Install and enable the Kernel Live Patching (KLP) feature

    @@ -229,12 +162,6 @@ Run the following command to remove the EC2 Hibernation Agent:
    sudo dnf remove -y ec2-hibinit-agent
    ```

    ### (Optional) Install the EC2 Instance Connect Utility

    ```shell
    sudo dnf install --allowerasing -y ec2-instance-connect ec2-instance-connect-selinux
    ```

    ### (Optional) Install and setup the Amazon SSM Agent

    Install the Amazon SSM Agent:
    @@ -347,10 +274,113 @@ The current version of the CloudWatchAgentServerPolicy looks like this:
    }
    ```

    ### (Optional) Install the Amazon EFS Utils helper tool
    ### (Optional) Install Ansible on the host

    Run the following to install ansible on the host:

    ```shell
    sudo dnf install --allowerasing -y amazon-efs-utils
    sudo dnf install -y \
    python3-psutil \
    ansible \
    ansible-core \
    sshpass
    ```

    ### Configure some sane OS default settings

    Locale:

    ```shell
    sudo localectl set-locale LANG=en_US.UTF-8
    localectl
    ```

    Hostname:

    ```shell
    sudo hostnamectl set-hostname <hostname>
    sudo hostnamectl set-chassis vm
    hostnamectl
    ```

    Set the system timezone to UTC and ensure chronyd is enabled and started:

    ```
    sudo timedatectl set-timezone Etc/UTC
    sudo systemctl enable --now chronyd
    sudo timedatectl set-ntp true
    timedatectl
    ```

    Logging:

    ```shell
    sudo mkdir -p /etc/systemd/journald.conf.d
    cat <<'EOF' | sudo tee /etc/systemd/journald.conf.d/00-override.conf
    [Journal]
    SystemMaxUse=100M
    RuntimeMaxUse=100M
    RuntimeMaxFileSize=10M
    RateLimitIntervals=1s
    RateLimitBurst=10000
    EOF

    sudo systemctl daemon-reload
    sudo systemctl restart systemd-journald.service
    ```

    ### Configure sane user environment for ec2-user

    ```shell
    touch ~/.{profile,bashrc,bash_profile,bash_login,bash_logout,.hushlogin}

    mkdir -p "${HOME}/.config/environment.d"
    mkdir -p "${HOME}/.config/systemd/user"
    mkdir -p "${HOME}/.config/systemd/user/sockets.target.wants"
    mkdir -p "${HOME}/.local/share/systemd/user"
    mkdir -p "${HOME}/.local/bin"
    mkdir -p "${HOME}/bin"

    #cat <<'EOF' | tee ~/.config/environment.d/environment_vars.conf
    #PATH="${HOME}/bin:${HOME}/.local/bin:${PATH}"
    #
    #EOF

    loginctl enable-linger $(whoami)
    systemctl --user daemon-reload
    ```

    If you need to switch to root user, use the following instead of `sudo su - <user>`.

    ```shell
    # sudo machinectl shell <username>@
    sudo machinectl shell root@
    ```

    ### (Optional) Configure the aws-cli for the ec2-user

    ```shell
    # configure region
    aws configure set default.region $(curl --noproxy '*' -w "\n" -s -H "X-aws-ec2-metadata-token: $(curl --noproxy '*' -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")" http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r .region)
    # use regional endpoints
    aws configure set default.sts_regional_endpoints regional
    # get credentials from imds
    aws configure set default.credential_source Ec2InstanceMetadata
    # get credentials last for 1hr
    aws configure set default.duration_seconds 3600
    # set default pager
    aws configure set default.cli_pager ""
    # set output to json
    aws configure set default.output json
    ```

    Verify:

    ```shell
    aws configure list
    aws sts get-caller-identity
    ```

    ### Install and setup moby aka docker service
    @@ -407,7 +437,7 @@ sudo systemctl enable --now docker
    sudo systemctl status docker
    ```

    ### Install Docker Compose Plugin
    ### Install the Docker Compose v2 Plugin

    Install the Docker Compose plugin with the following commands:

    @@ -458,19 +488,19 @@ Verify the plugin is installed correctly with the following command(s):
    docker compose version
    ```

    ### Docker Scout plugin
    ### Install the Docker Scout Plugin

    (Optional) Install docker scout with the following commands:

    ```shell

    <commands goes here>
    ```

    - https://github.com/docker/scout-cli

    ### Docker Buildx plugin
    ### Install the Docker Buildx Plugin

    **You can safely skip this step as it should not be necessary due to the version of Moby shipped in AL2023 bundling the buildx plugin by default.**
    **Note: You can safely skip this step as it should not be necessary due to the version of Moby shipped in AL2023 bundling the buildx plugin by default.**

    (Optional) Install Docker buildx with the following commands:

  30. @thimslugga thimslugga revised this gist Jun 2, 2024. 1 changed file with 8 additions and 4 deletions.
    12 changes: 8 additions & 4 deletions zz-setup-dnsmasq-al2023.md
    @@ -27,15 +27,19 @@ listen-address=::1,127.0.0.1
    # Port 53
    port=53
    # For a local only DNS resolver use interface=lo + bind-interfaces
    # See for more details: https://serverfault.com/a/830737
    # Listen only on the specified interface(s).
    interface=lo
    # dnsmasq binds to the wildcard address, even if it is listening
    # on only some interfaces. It then discards requests that it
    # shouldn't reply to. This has the advantage of working even
    # when interfaces come and go and change address.
    # when interfaces come and go and change address.
    bind-interfaces
    #bind-dynamic
    # Listen only on the specified interface(s).
    interface=lo
    #bind-dynamic
    # Do not listen on the specified interface.
    #except-interface=eth0