aka what i did to get from nothing to done.
Purpose of Proxmox cluster project
Required Outcomes of cluster project
Start Date - First week of August 2023
End Date - none
Updates as of 9/27/2023: this cluster is no longer a PoC and is now my production cluster for my two domain controllers, Home Assistant, and 2 of 3 docker VM host nodes.
Note: these are designed primarily as a re-install guide for myself (writing things down helps me memorize the knowledge), so don't take any of this on blind faith - some areas are well tested and the docs are very robust; some items, less so. YMMV. I will regularly update until all outcomes and todos are achieved.
Enable OSPF Routing On Mesh network - deprecated - old gist here
Migrate my debian VM based docker swarm to proxmox - in progress 9/27/23
Extra Credit (optional):
- add TLS to the mail relay? with LE certs? maybe?
- maybe send syslog to my syslog server (securely)
- figure out ceph public/cluster running on different networks - unclear if it's needed for this size of install
- get all nodes listening to my network UPS and shut down before power runs out
- For the docker VMs, implement both cephfs via virtiofs and a ceph docker volume, and test which i like best in a swarm - using this ceph volume guide and this mounting guide by Drallas. For the volume, use one of these three ceph volume plugins: Brindster/docker-plugin-cephfs, flaviostutz/cepher, n0r1sk/docker-volume-cephfs. Each has different strengths and weaknesses (i will likely choose either the n0r1sk or the Brindster one).
I have been using Hyper-V for my docker swarm cluster VM hosts (see other gists). The original intention was to try to get Thunderbolt Networking going for a Hyper-V cluster, with clustered storage for the VMs. This turns out to be super hard when using NUCs as cluster nodes due to too few disks. I looked at solar winds as an alternative but this was both complex and not pervasive.
I had been watching proxmox for years and thought now was a good time to jump in and see what it is all about. (i had never booted or looked at proxmox UI before doing this - so this documentation is soup to nuts and intended for me to repro if needed)
- VMs running on clustered storage {completed}
- Use of Thunderbolt for ~26Gbe Cluster VM operations (replication, failover etc)
- Thunderbolt mesh with OSPF routing {completed}
- Ceph over thunderbolt mesh {completed}
- VM running with live migration {completed}
- VM running with HA failover on node failure {completed}
- Separate VM/CT Migration network over thunderbolt mesh {not started}
- Use low powered off the shelf Intel NUCs {completed}
- Migrate VMs from Hyper-V:
  - Windows Server Domain Controller / DNS / DHCP / CA / AAD SYNC VMs {not started}
  - Debian Docker Host VMs (for my running 3 node swarm) {not started}
  - HomeAssistant VM {not started}
- Sized to last me 5+ years (lol, yeah, right)
- 3x 13th Gen Intel NUCs (NUC13ANHi7):
  - Core i7-1360P Processor (12 Cores, 5.0 GHz, 16 Threads)
  - Intel Iris Xe Graphics
  - 64 GB DDR4 3200 CL22 RAM
  - Samsung 870 EVO SSD 1TB Boot Drive
  - Samsung 980 Pro NVME 2 TB Data Drive
  - 1x Onboard 2.5Gbe LAN Port
  - 2x Onboard Thunderbolt4 Ports
  - 1x 2.5Gbe using Intel NUCIOALUWS nvme expansion port
- 3 x OWC TB4 Cables
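
As a quick sanity check on the hardware list above, here is a minimal sketch that enumerates the point-to-point links a full mesh needs and confirms they fit within 2 Thunderbolt ports per node and 3 cables. The port and cable counts come from the list; the hostnames `pve1`..`pve3` are hypothetical placeholders, and the helper functions are just for illustration, not part of any Proxmox tooling.

```python
from itertools import combinations

# Hostnames are hypothetical placeholders (assumption, not from the build notes).
NODES = ["pve1", "pve2", "pve3"]
TB_PORTS_PER_NODE = 2  # from the hardware list: 2x onboard Thunderbolt4 ports
TB_CABLES = 3          # from the hardware list: 3x OWC TB4 cables


def full_mesh_links(nodes):
    """Return every node pair, i.e. the direct links a full mesh requires."""
    return list(combinations(nodes, 2))


def ports_used(links):
    """Count how many ports each node needs to terminate the given links."""
    usage = {}
    for a, b in links:
        usage[a] = usage.get(a, 0) + 1
        usage[b] = usage.get(b, 0) + 1
    return usage


links = full_mesh_links(NODES)
usage = ports_used(links)

# The mesh fits if there are enough cables and no node needs more ports than it has.
fits = len(links) <= TB_CABLES and all(n <= TB_PORTS_PER_NODE for n in usage.values())

for a, b in links:
    print(f"{a} <-> {b}")
print("fits the hardware:", fits)
```

With three nodes every port and every cable is used, so a fourth node could not join the mesh directly without more Thunderbolt ports per node.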
- Proxmox v8.x
- Ceph (included with Proxmox)
- LLDP (included with Proxmox)
- Free Range Routing - FRR OSPF - (included with Proxmox)
- nano ;-)
Proxmox/Ceph Guide from Packet Pushers
Proxmox Forum - several community members were invaluable in providing me a breadcrumb trail.