The following guide covers setting up Docker with Docker Compose v2 on Amazon Linux 2023. The steps are intended for AL2023 on EC2 but should mostly work for AL2023 VMs running on other hypervisors.
Install the following packages, which are generally useful to have on a host:
sudo dnf install --allowerasing -y \
  kernel-modules-extra \
  dnf-plugins-core \
  dnf-plugin-release-notification \
  dnf-plugin-support-info \
  dnf-utils \
  git-core \
  git-lfs \
  grubby \
  kexec-tools \
  chrony \
  audit \
  dbus \
  dbus-daemon \
  dbus-broker \
  polkit \
  systemd-pam \
  systemd-container \
  udisks2 \
  crypto-policies \
  crypto-policies-scripts \
  openssl \
  nss-util \
  nss-tools \
  dmidecode \
  nvme-cli \
  device-mapper-multipath \
  device-mapper-persistent-data \
  lvm2 \
  dosfstools \
  e2fsprogs \
  xfsprogs \
  xfsprogs-xfs_scrub \
  attr \
  acl \
  shadow-utils \
  shadow-utils-subid \
  fuse3 \
  squashfs-tools \
  star \
  gzip \
  pigz \
  bzip2 \
  zstd \
  xz \
  unzip \
  p7zip \
  numactl \
  iproute \
  iproute-tc \
  iptables-nft \
  nftables \
  conntrack-tools \
  ipset \
  ethtool \
  net-tools \
  iputils \
  traceroute \
  mtr \
  telnet \
  whois \
  socat \
  bind-utils \
  tcpdump \
  cifs-utils \
  nfsv4-client-utils \
  nfs4-acl-tools \
  libseccomp \
  psutils \
  python3 \
  python3-pip \
  python3-psutil \
  python3-policycoreutils \
  policycoreutils-python-utils \
  bash-completion \
  vim-minimal \
  wget \
  jq \
  awscli-2 \
  ec2rl \
  ec2-utils \
  htop \
  sysstat \
  fio \
  inotify-tools \
  rsync

Run the following command to remove the EC2 Hibernation Agent:
sudo dnf remove -y ec2-hibinit-agent

Install EC2 Instance Connect:

sudo dnf install --allowerasing -y ec2-instance-connect ec2-instance-connect-selinux

Amazon Linux now ships with the smart-restart package. The smart-restart utility restarts systemd services on system updates whenever a package is installed or removed using the system's package manager, i.e. whenever a dnf <update|upgrade|downgrade> is executed.
smart-restart uses needs-restarting from the dnf-utils package and a custom denylisting mechanism to determine which services need to be restarted and whether a system reboot is advised. If a system reboot is advised, a reboot hint marker file is generated (/run/smart-restart/reboot-hint-marker).
sudo dnf install --allowerasing -y smart-restart python3-dnf-plugin-post-transaction-actions

After the installation, subsequent transactions will trigger the smart-restart logic.
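A minimal sketch of how the reboot hint marker can be checked, e.g. from a cron job or a config-management handler (the marker path is the one described above; the helper name is my own):

```shell
# Return 0 if smart-restart advised a reboot; the marker path
# defaults to the one smart-restart writes, but can be overridden.
reboot_advised() {
  marker="${1:-/run/smart-restart/reboot-hint-marker}"
  [ -f "$marker" ]
}

if reboot_advised; then
  echo "smart-restart: reboot advised"
else
  echo "smart-restart: no reboot hint"
fi
```

Alternatively, `sudo dnf needs-restarting -r` (from dnf-utils) exits non-zero when a reboot is required, which is handy in scripts.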
Run the following command to install the kernel live patching feature:
sudo dnf install --allowerasing -y kpatch-dnf kpatch-runtime

Enable the service:
sudo dnf kernel-livepatch -y auto
sudo systemctl daemon-reload
sudo systemctl enable --now kpatch.service

Install the Amazon EFS utilities:

sudo dnf install --allowerasing -y amazon-efs-utils

(Optional) Enable FIPS mode. This step is safe to skip, as it only applies to specific end-user environments. I would recommend reading up on FIPS compliance, validation, and certification before enabling FIPS mode on EC2 instances.
sudo dnf install --allowerasing -y crypto-policies crypto-policies-scripts

sudo fips-mode-setup --check
sudo fips-mode-setup --enable
sudo fips-mode-setup --check

sudo systemctl reboot

Install the Amazon SSM Agent:
sudo dnf install --allowerasing -y amazon-ssm-agent

The following tweak should resolve the issues reported here:
- https://repost.aws/questions/QU_tj7NQl6ReKoG53zzEqYOw/amazon-linux-2023-issue-with-installing-packages-with-cloud-init
- amazonlinux/amazon-linux-2023#397
Add the following drop-in to make sure networking is up, DNS resolution works, and cloud-init has finished before the Amazon SSM Agent is started.
systemctl is-enabled systemd-networkd-wait-online.service NetworkManager-wait-online.service

sudo mkdir -p /etc/systemd/system/amazon-ssm-agent.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/amazon-ssm-agent.service.d/00-override.conf
# To have a service start after cloud-init.target, it requires
# DefaultDependencies=no, because the default (DefaultDependencies=y)
# results in the default target, e.g. multi-user.target,
# depending on the service.
# See: https://serverfault.com/a/973985
[Unit]
Wants=network-online.target
After=network-online.target nss-lookup.target cloud-init.target
DefaultDependencies=no
ConditionFileIsExecutable=/usr/bin/amazon-ssm-agent
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now amazon-ssm-agent.service
sudo systemctl try-reload-or-restart amazon-ssm-agent.service
sudo systemctl status amazon-ssm-agent.service

Verify:
systemd-delta --type=extended
systemctl show amazon-ssm-agent --all
# systemctl show <unit>.service --property=<PROPERTY_NAME>
# systemctl show <unit>.service --property=<PROPERTY_NAME1>,<PROPERTY_NAME2>
systemctl show amazon-ssm-agent.service --property=After,Wants

Install the Unified CloudWatch Agent:
sudo dnf install --allowerasing -y amazon-cloudwatch-agent collectd

Add the following drop-in to make sure networking is up, DNS resolution works, and cloud-init has finished before the unified CloudWatch agent is started.
sudo mkdir -p /etc/systemd/system/amazon-cloudwatch-agent.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/amazon-cloudwatch-agent.service.d/00-override.conf
# To have a service start after cloud-init.target, it requires
# DefaultDependencies=no, because the default (DefaultDependencies=y)
# results in the default target, e.g. multi-user.target,
# depending on the service.
# See: https://serverfault.com/a/973985
[Unit]
Wants=network-online.target
After=network-online.target nss-lookup.target cloud-init.target
DefaultDependencies=no
ConditionFileIsExecutable=/opt/aws/amazon-cloudwatch-agent/bin/start-amazon-cloudwatch-agent
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now amazon-cloudwatch-agent.service
sudo systemctl try-reload-or-restart amazon-cloudwatch-agent.service
sudo systemctl status amazon-cloudwatch-agent.service

The current version of the CloudWatchAgentServerPolicy:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "cloudwatch:PutMetricData",
                    "ec2:DescribeVolumes",
                    "ec2:DescribeTags",
                    "logs:PutLogEvents",
                    "logs:DescribeLogStreams",
                    "logs:DescribeLogGroups",
                    "logs:CreateLogStream",
                    "logs:CreateLogGroup"
                ],
                "Resource": "*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "ssm:GetParameter"
                ],
                "Resource": "arn:aws:ssm:*:*:parameter/AmazonCloudWatch-*"
            }
        ]
    }

Run the following to install Ansible on the host:
sudo dnf install -y \
  python3-psutil \
  ansible \
  ansible-core \
  sshpass

Configure the locale:
sudo localectl set-locale LANG=en_US.UTF-8

Verify:
localectl

Configure the hostname:
sudo hostnamectl set-hostname --static <hostname_goes_here>
sudo hostnamectl set-chassis vm

Verify:
hostnamectl

Set the system timezone to UTC and ensure chronyd is enabled and started:
sudo timedatectl set-timezone Etc/UTC
sudo systemctl enable --now chronyd
sudo timedatectl set-ntp true

Verify:
timedatectl

Configure journal logging:
sudo mkdir -p /etc/systemd/journald.conf.d
cat <<'EOF' | sudo tee /etc/systemd/journald.conf.d/00-override.conf
[Journal]
SystemMaxUse=100M
RuntimeMaxUse=100M
RuntimeMaxFileSize=10M
RateLimitInterval=1s
RateLimitBurst=10000
EOF

sudo systemctl daemon-reload
sudo systemctl try-reload-or-restart systemd-journald.service
sudo systemctl status systemd-journald.service

Configure a custom MOTD banner:
# Disable the AL2023 MOTD banner (found at /usr/lib/motd.d/30-banner):
sudo ln -s /dev/null /etc/motd.d/30-banner
cat <<'EOF' | sudo tee /etc/motd.d/31-banner
   ,     #_
   ~\_  ####_
  ~~  \_#####\
  ~~     \###|
  ~~       \#/ ___   Amazon Linux 2023 (Docker Optimized)
   ~~       V~' '->
    ~~~         /
      ~~._.   _/
         _/ _/
       _/m/'
EOF

AL2023 uses pam_motd, see: http://www.linux-pam.org/Linux-PAM-html/sag-pam_motd.html
Prepare the user's shell files and XDG directories:

touch ~/.{profile,bashrc,bash_profile,bash_login,bash_logout,hushlogin}
mkdir -pv "${HOME}/bin"
mkdir -pv "${HOME}"/.config/{systemd,environment.d}
mkdir -pv "${HOME}/.config/systemd/user/sockets.target.wants"
mkdir -pv "${HOME}/.local/share/systemd/user"
mkdir -pv "${HOME}/.local/bin"

cat <<'EOF' | tee ~/.config/environment.d/00-environment_variables.conf
#PATH="${HOME}/bin:${HOME}/.local/bin:${PATH}"
EOF

Enable linger for the user:
sudo loginctl enable-linger $(whoami)
systemctl --user daemon-reload

Note: If you need to switch to the root user, use the following instead of sudo su - <user>:
#sudo machinectl shell <username>@
sudo machinectl shell root@

Run the following command to install Moby (aka Docker):
sudo dnf install --allowerasing -y \
  docker \
  containerd \
  runc \
  container-selinux \
  cni-plugins \
  oci-add-hooks \
  amazon-ecr-credential-helper \
  udica

Add the current user (e.g. ec2-user) to the docker group:
sudo groupadd docker
sudo usermod -aG docker $USER
newgrp docker

Configure the following Docker daemon settings:
test -d /etc/docker || sudo mkdir -p /etc/docker
test -f /etc/docker/daemon.json || cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "debug": false,
  "experimental": false,
  "exec-opts": ["native.cgroupdriver=systemd"],
  "userland-proxy": false,
  "live-restore": true,
  "log-level": "warn",
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  }
}
EOF

Enable and start the docker and containerd services:
sudo systemctl enable --now docker.service containerd.service
sudo systemctl status docker containerd

- https://mobyproject.org/
- https://github.com/docker/docker-install
- https://github.com/docker/docker-ce-packaging
- https://download.docker.com/linux/static/stable/
- https://docs.docker.com/compose/install/linux/
- https://github.com/docker/compose/
- https://github.com/docker/docker-credential-helpers
- https://github.com/docker/buildx
- https://docs.docker.com/reference/cli/dockerd/#daemon-configuration-file
- https://docs.docker.com/config/containers/logging/awslogs/
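One operational note on the daemon configuration above: dockerd refuses to start if /etc/docker/daemon.json is malformed, so it is worth validating the JSON before restarting after any future edit. A small sketch, assuming python3 from the package list above (the helper name is my own):

```shell
# Validate a Docker daemon.json; the path can be overridden for testing.
check_daemon_json() {
  python3 -m json.tool "${1:-/etc/docker/daemon.json}" >/dev/null 2>&1
}

if check_daemon_json; then
  echo "daemon.json is valid JSON"
else
  echo "daemon.json is missing or invalid"
fi
```

After a restart, the effective settings can be confirmed with docker info, e.g. `docker info --format '{{.CgroupDriver}} {{.LoggingDriver}}'` should print `systemd json-file` with the configuration above.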
Install the Docker Compose v2 plugin with the following commands.
To install the plugin for all users:
sudo mkdir -p /usr/local/lib/docker/cli-plugins
sudo curl -sL https://github.com/docker/compose/releases/latest/download/docker-compose-linux-"$(uname -m)" \
  -o /usr/local/lib/docker/cli-plugins/docker-compose
# Set ownership to root and make executable
test -f /usr/local/lib/docker/cli-plugins/docker-compose \
  && sudo chown root:root /usr/local/lib/docker/cli-plugins/docker-compose
test -f /usr/local/lib/docker/cli-plugins/docker-compose \
  && sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-compose

(Optional) To install only for the local user (e.g. ec2-user), run the following commands:
mkdir -p "${HOME}/.docker/cli-plugins" && touch "${HOME}/.docker/config.json"
curl -sL https://github.com/docker/compose/releases/latest/download/docker-compose-linux-"$(uname -m)" \
  -o "${HOME}/.docker/cli-plugins/docker-compose"
chmod +x "${HOME}/.docker/cli-plugins/docker-compose"
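The Compose release assets are named by CPU architecture, and on AL2023 $(uname -m) already returns the matching x86_64 or aarch64. If you script this across other environments, where the architecture may be reported as amd64 or arm64, a mapping sketch (assuming the asset naming used in the download URLs above):

```shell
# Map common `uname -m` spellings to Docker Compose release asset suffixes.
arch="$(uname -m)"
case "$arch" in
  x86_64 | amd64)  compose_arch="x86_64"  ;;
  aarch64 | arm64) compose_arch="aarch64" ;;
  armv7l)          compose_arch="armv7"   ;;
  *) echo "unsupported architecture: $arch" >&2; exit 1 ;;
esac
echo "docker-compose-linux-${compose_arch}"
```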
cat <<'EOF' | tee -a "${HOME}/.bashrc"
# https://specifications.freedesktop.org/basedir-spec/latest/index.html
XDG_CONFIG_HOME="${HOME}/.config"
XDG_DATA_HOME="${HOME}/.local/share"
XDG_RUNTIME_DIR="${XDG_RUNTIME_DIR:-/run/user/$(id -u)}"
DBUS_SESSION_BUS_ADDRESS="unix:path=${XDG_RUNTIME_DIR}/bus"
export XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR DBUS_SESSION_BUS_ADDRESS 
# Docker
DOCKER_TLS_VERIFY=1
#DOCKER_CONFIG=/usr/local/lib/docker
DOCKER_CONFIG="${DOCKER_CONFIG:-$HOME/.docker}"
export DOCKER_CONFIG DOCKER_TLS_VERIFY
#DOCKER_HOST="unix:///run/user/$(id -u)/docker.sock"
#export DOCKER_HOST
EOF

Verify the plugin is installed correctly with the following command(s):
docker compose version

(Optional) Install Docker Scout with the following commands:
curl -sSfL https://raw.githubusercontent.com/docker/scout-cli/main/install.sh | sh -s --
chmod +x $HOME/.docker/scout/docker-scout

Note: You can safely skip the next step; it should not be necessary because the version of Moby shipped in AL2023 bundles the buildx plugin by default.
(Optional) Install the docker buildx plugin with the following commands:
sudo curl -sSfL 'https://github.com/docker/buildx/releases/download/v0.14.0/buildx-v0.14.0.linux-amd64' \
  -o /usr/local/lib/docker/cli-plugins/docker-buildx
# Note: buildx release assets are versioned (buildx-v<version>.linux-<arch>),
# so there is no unversioned "latest" download URL like Compose has.
# Set ownership to root and make executable
test -f /usr/local/lib/docker/cli-plugins/docker-buildx \
  && sudo chown root:root /usr/local/lib/docker/cli-plugins/docker-buildx
test -f /usr/local/lib/docker/cli-plugins/docker-buildx \
  && sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-buildx
cp /usr/local/lib/docker/cli-plugins/docker-buildx "${HOME}/.docker/cli-plugins/docker-buildx"
docker buildx install

(Optional) Install the AWS Nitro Enclaves CLI. This is only needed for specific workloads; otherwise you can skip it.
sudo dnf install --allowerasing -y aws-nitro-enclaves-cli aws-nitro-enclaves-cli-devel

Add the user to the ne group:
sudo groupadd ne
sudo usermod -aG ne $USER
newgrp ne

Enable and start the service:
sudo systemctl enable --now nitro-enclaves-allocator.service

- https://docs.aws.amazon.com/enclaves/latest/user/nitro-enclave-cli-install.html
- https://github.com/aws/aws-nitro-enclaves-cli
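As a quick smoke test once the allocator service is up, running enclaves can be listed; a guarded sketch (the helper name is my own, nitro-cli describe-enclaves is the documented subcommand):

```shell
# List running enclaves if nitro-cli is installed; no-op elsewhere.
enclave_summary() {
  command -v nitro-cli >/dev/null 2>&1 || { echo "nitro-cli not found"; return 0; }
  nitro-cli describe-enclaves
}

enclave_summary
```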
To install the Nvidia drivers:
sudo dnf install -y wget kernel-modules-extra kernel-devel gcc dkms

Add the Nvidia driver and CUDA repository:
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/amzn2023/x86_64/cuda-amzn2023.repo
sudo dnf clean expire-cache

Install the Nvidia driver and CUDA toolkit from the Nvidia repo:
sudo dnf module install -y nvidia-driver:latest-dkms
sudo dnf install -y cuda-toolkit
(Alternative) Download the driver install script and run it to install the nvidia drivers:
curl -sL 'https://us.download.nvidia.com/tesla/535.161.08/NVIDIA-Linux-x86_64-535.161.08.run' -O
sudo sh NVIDIA-Linux-x86_64-535.161.08.run -a -s --ui=none -m=kernel-open

Verify:
nvidia-smi
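Beyond the default table, nvidia-smi can emit selected fields as CSV, which is handy for scripting; a guarded sketch (the helper name is my own; --query-gpu and --format are standard nvidia-smi flags):

```shell
# Print a short GPU summary if nvidia-smi is available; no-op elsewhere.
gpu_summary() {
  command -v nvidia-smi >/dev/null 2>&1 || { echo "nvidia-smi not found"; return 0; }
  nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
}

gpu_summary
```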
For the Nvidia container runtime, add the nvidia container repo:
curl -sL 'https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo' | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo dnf clean expire-cache
sudo dnf check-update

Install and configure the nvidia-container-toolkit:
sudo dnf install -y nvidia-container-toolkit

sudo nvidia-ctk runtime configure --runtime=docker

Restart the docker and containerd services:
sudo systemctl restart docker containerd

To create an Ubuntu-based container with access to the host GPUs:
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

To do the same with an AL2023-based container:

docker run --rm --runtime=nvidia --gpus all public.ecr.aws/amazonlinux/amazonlinux:2023 nvidia-smi

Configure the AWS CLI defaults:

# configure region
aws configure set default.region $(curl --noproxy '*' -w "\n" -s -H "X-aws-ec2-metadata-token: $(curl --noproxy '*' -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")" http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r .region)
# use regional endpoints
aws configure set default.sts_regional_endpoints regional
# get credentials from imds
aws configure set default.credential_source Ec2InstanceMetadata
# get credentials last for 1hr
aws configure set default.duration_seconds 3600
# set default pager
aws configure set default.cli_pager ""
# set output to json
aws configure set default.output json

Verify:
aws configure list
aws sts get-caller-identity

Log in to the AWS ECR Public registry:
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws

To create an AL2023-based container:
docker pull public.ecr.aws/amazonlinux/amazonlinux:2023
docker run -it --security-opt seccomp=unconfined public.ecr.aws/amazonlinux/amazonlinux:2023 /bin/bash
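To tie the Compose plugin from earlier back in, a minimal compose file using the same AL2023 image can serve as an end-to-end check (a hypothetical example; service name and command are my own):

```yaml
# compose.yaml — minimal example using the AL2023 image pulled above
services:
  al2023:
    image: public.ecr.aws/amazonlinux/amazonlinux:2023
    command: ["sleep", "infinity"]
    restart: unless-stopped
```

Running docker compose up -d and docker compose down in the file's directory exercises the daemon, the plugin, and the registry login together.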