# Enable Dual Stack (IPv4 and IPv6) OpenFabric Routing

# !!WORK IN PROGRESS!!

## Version 2.1 (2025.04.24)

[this gist is part of this series](/76e94832927a89d977ea989da157e9dc)

This assumes you are running Proxmox 8.4 and that the line `source /etc/network/interfaces.d/*` is at the end of the interfaces file (this is automatically added to both new and upgraded installations of Proxmox 8.2).

This changes the previous file design thanks to @NRGNet and @tisayama to make the system much more reliable in general and more maintainable, esp for folks using IPv4 on the private cluster network ~(i still recommend the use of the IPv6 FC00 network you will see in these docs)~

Notable changes from original version [here](/4c664734535da122f4ab2951b22b2085)
- move IP address configuration from `interfaces.d/thunderbolt` to frr configuration
- new approach to remove dependency on post-up, new script in if-up.d that logs to the system log
- reminder to copy frr.conf > frr.conf.local to prevent breakage if you enable Proxmox SDN
- dependent on the changes to the udev link scripts [here](/67fdc9a517faefa68f730f82d7fa3570#set-interfaces-to-up-on-reboots-and-cable-insertions)

This will result in an IPv4 and IPv6 routable mesh network that can survive any one node failure or any one cable failure. All the steps in this section must be performed on each node.

## NOTES on Dual Stack

~I have included this for completeness, i only run the FC00:: IPv6 network as ceph does not support dual stack, i strongly recommend you consider only using IPv6. For example for ceph do not dual stack - either use IPv4 or IPv6 addresses for all the monitors, MDS and daemons - despite the docs implying it is ok my findings on quincy are that it is funky....~

I think with all the scripts and changes folks have contributed IPv4 should now be stable. I am recommending new folks use IPv4 ceph as documented in the gists in the series to avoid ongoing issues with SDN and IPv6.
I have yet to decide if i will migrate my ceph back to IPv4 so i can play with SDN or just wait for the SDN issues to be solved.

## Defining thunderbolt network

Create a new file using `nano /etc/network/interfaces.d/thunderbolt` and populate with the following

There should no longer be any IP addresses in this file for lo and lo:6

```
allow-hotplug en05
iface en05 inet manual
        mtu 65520

allow-hotplug en06
iface en06 inet manual
        mtu 65520
```

Save file, repeat on each node.

## Enable IPv4 and IPv6 forwarding

1. use `nano /etc/sysctl.conf` to open the file
2. uncomment `#net.ipv6.conf.all.forwarding=1` (remove the # symbol)
3. uncomment `#net.ipv4.ip_forward=1` (remove the # symbol)
4. save the file
5. issue `reboot now` for a complete reboot

## FRR Setup

### Install & enable FRR

Install Free Range Routing (FRR) `apt install frr`

Enable frr `systemctl enable frr`

### Enable the fabricd daemon

1. edit the frr daemons file (`nano /etc/frr/daemons`) to change `fabricd=no` to `fabricd=yes`
2. save the file
3. restart the service with `systemctl restart frr`

### Mitigate FRR Timing Issues (I need someone with an MS-101 to confirm if this helps solve their IPv4 issues)

#### create script that is automatically processed when en05/en06 are brought up to restart frr

this should make IPv4 more stable for all users (i ended up seeing IPv4 issues too, just less commonly than MS-101 users)

1. create a new file with `nano /etc/network/if-up.d/en0x`
2. add to file the following

```
#!/bin/bash
# note the logger entries log to the system journal in the pve UI etc

INTERFACE=$IFACE

if [ "$INTERFACE" = "en05" ] || [ "$INTERFACE" = "en06" ]; then
    logger "Checking if frr.service is running for $INTERFACE"

    if ! systemctl is-active --quiet frr.service; then
        logger -t SCYTO " [SCYTO SCRIPT ] frr.service not running. Starting service."
        if systemctl start frr.service; then
            logger -t SCYTO " [SCYTO SCRIPT ] Successfully started frr.service"
        else
            logger -t SCYTO " [SCYTO SCRIPT ] Failed to start frr.service"
        fi
        exit 0
    fi

    logger "Attempting to reload frr.service for $INTERFACE"
    if systemctl reload frr.service; then
        logger -t SCYTO " [SCYTO SCRIPT ] Successfully reloaded frr.service for $INTERFACE"
    else
        logger -t SCYTO " [SCYTO SCRIPT ] Failed to reload frr.service for $INTERFACE"
    fi
fi
```

3. make it executable with `chmod +x /etc/network/if-up.d/en0x`

### mitigate issues caused by things that reset the loopback

#### create script that is automatically processed when lo is reprocessed by ifreload, ifupdown2, pve set, etc

1. create a new file with `nano /etc/network/if-up.d/lo`
2. add to file the following

```
#!/bin/bash

INTERFACE=$IFACE

if [ "$INTERFACE" = "lo" ] ; then
    logger "Attempting to restart frr.service for $INTERFACE"
    if systemctl restart frr.service; then
        logger -t SCYTO " [SCYTO SCRIPT ] Successfully restart frr.service for $INTERFACE"
    else
        logger -t SCYTO " [SCYTO SCRIPT ] Failed to restart frr.service for $INTERFACE"
    fi
fi
```

3. make it executable with `chmod +x /etc/network/if-up.d/lo`

### Configure OpenFabric (perform on all nodes)

1. enter the FRR shell with `vtysh`
2. optionally show the current config with `show running-config`
3. enter the configure mode with `configure`
4. Apply the below configuration (it is possible to cut and paste this into the shell instead of typing it manually, you may need to press return to set the last !. Also check there were no errors in response to the pasted text.). **Note: the x should be the number of the node you are working on** For example node 1 would use 1 in place of x

```
ip forwarding
ipv6 forwarding
!
interface en05
ip router openfabric 1
ipv6 router openfabric 1
exit
!
interface en06
ip router openfabric 1
ipv6 router openfabric 1
exit
!
interface lo
ip address 10.0.0.8x/32
ipv6 address fc00::8x/128
ip router openfabric 1
ipv6 router openfabric 1
openfabric passive
exit
!
router openfabric 1
net 49.0000.0000.000x.00
exit
!
exit
```

5. you may need to press return after the last `exit` to get to a new line - if so do this
6. save the config with `write memory`
7. check the config applied correctly with `show running-config` - note the order of the items will be different to how you entered them and that's ok. (If you made a mistake i found the easiest way was to edit `/etc/frr/frr.conf` - but be careful if you do that.)
8. use the command `exit` to leave setup
9. repeat steps 1 to 8 on the other 2 nodes
10. once you have configured all 3 nodes issue the command `vtysh -c "show openfabric topology"` if you did everything right you will see:

```
Area 1:
IS-IS paths to level-2 routers that speak IP
Vertex               Type         Metric Next-Hop             Interface Parent
pve1
10.0.0.81/32         IP internal  0                                     pve1(4)
pve2                 TE-IS        10     pve2                 en06      pve1(4)
pve3                 TE-IS        10     pve3                 en05      pve1(4)
10.0.0.82/32         IP TE        20     pve2                 en06      pve2(4)
10.0.0.83/32         IP TE        20     pve3                 en05      pve3(4)

IS-IS paths to level-2 routers that speak IPv6
Vertex               Type         Metric Next-Hop             Interface Parent
pve1
fc00::81/128         IP6 internal 0                                     pve1(4)
pve2                 TE-IS        10     pve2                 en06      pve1(4)
pve3                 TE-IS        10     pve3                 en05      pve1(4)
fc00::82/128         IP6 internal 20     pve2                 en06      pve2(4)
fc00::83/128         IP6 internal 20     pve3                 en05      pve3(4)

IS-IS paths to level-2 routers with hop-by-hop metric
Vertex               Type         Metric Next-Hop             Interface Parent
```

Now you should be in a place to ping each node from every node across the thunderbolt mesh using IPv4 or IPv6 as you see fit.

### IMPORTANT - you need to do this to stop SDN breaking you in future

if all is working issue a `cp /etc/frr/frr.conf /etc/frr/frr.conf.local` this is because when enabling proxmox SDN proxmox will overwrite frr.conf - however it will read the .local file and apply that. So do this copy whenever you have edited the conf using vtysh or by hand.
I haven't yet tested to see if we can have just the settings in frr.conf.local when SDN is not configured....
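
If you want to script the last two steps (the ping test across the mesh and the copy to frr.conf.local), something like the following might be a starting point. This is only a sketch: the function names `mesh_addrs`, `check_mesh` and `sync_frr_local` are made up for illustration and are not part of FRR or Proxmox, and it assumes the `10.0.0.8x` / `fc00::8x` loopback addressing used above and a `ping` that accepts both IPv4 and IPv6 addresses.

```bash
#!/bin/bash
# Sketch only: function names and arguments are hypothetical, not from FRR or Proxmox.

# Print the IPv4 and IPv6 loopback addresses this guide assigns to each
# node number passed as an argument (10.0.0.8x and fc00::8x).
mesh_addrs() {
    for n in "$@"; do
        printf '10.0.0.8%s\n' "$n"
        printf 'fc00::8%s\n' "$n"
    done
}

# Ping each address once with a 2 second timeout and report the result.
# Returns non-zero if any address did not answer.
check_mesh() {
    failed=0
    for addr in $(mesh_addrs "$@"); do
        if ping -c 1 -W 2 "$addr" >/dev/null 2>&1; then
            echo "OK   $addr"
        else
            echo "FAIL $addr"
            failed=1
        fi
    done
    return "$failed"
}

# Copy frr.conf to frr.conf.local only when the two files differ.
# The default paths are the ones used in this guide.
sync_frr_local() {
    src="${1:-/etc/frr/frr.conf}"
    dst="${2:-/etc/frr/frr.conf.local}"
    if [ -f "$dst" ] && cmp -s "$src" "$dst"; then
        echo "unchanged"
    else
        cp "$src" "$dst" && echo "copied"
    fi
}
```

On a 3 node cluster you could source this file and run `check_mesh 1 2 3 && sync_frr_local` on each node, so the .local copy is only refreshed once every loopback answers.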