I have other devices that need to access the ceph mesh that are on my LAN. This gist is only needed if you want LAN clients to access the ceph mesh.
- let any client on LAN access the mesh
- avoid setting static routes on my router
- enable support for routing topology changes without having to reconfigure router
REMEMBER ceph clients want to access the MONs / OSDs / MGRs and MDSs on the lo interface loopback addresses - that's the goal!
- BGP is used to advertise routes for both LAN based clients and VM based clients
- BGP P2P links have to be used as other BGP mechanisms didn't seem to work
- BGP routes are explicitly defined for advertisement, no broadcast is used
- Ubiquiti unifi OS router with BGP feature (EFG and maybe others)
- all previous gists have been followed and are working perfectly, this will only detail changes to the overall setup
- you have a true dual stack setup on your LAN (if you only have IPv4, including for ceph, you can drop the IPv6 sections)
| Field | pve1 | pve2 | pve3 | 
|---|---|---|---|
| BGP Router-ID | 192.168.1.81 | 192.168.1.82 | 192.168.1.83 | 
| BGP ASN | 65001 | 65001 | 65001 | 
| IPv6 Neighbors (Node LAN IP) | 2001:db8:1000:1::82<br>2001:db8:1000:1::83<br>2001:db8:1000:1::1 | 2001:db8:1000:1::81<br>2001:db8:1000:1::83<br>2001:db8:1000:1::1 | 2001:db8:1000:1::81<br>2001:db8:1000:1::82<br>2001:db8:1000:1::1 | 
| IPv4 Neighbors (Node LAN IP) | 192.168.1.82<br>192.168.1.83<br>192.168.1.1 | 192.168.1.81<br>192.168.1.83<br>192.168.1.1 | 192.168.1.81<br>192.168.1.82<br>192.168.1.1 | 
| IPv6 Routes (mesh network) | fc00::81/128<br>fc00:81::/64 | fc00::82/128<br>fc00:82::/64 | fc00::83/128<br>fc00:83::/64 | 
| IPv4 Routes (mesh network) | 10.0.0.81/32<br>10.0.81.0/24 | 10.0.0.82/32<br>10.0.82.0/24 | 10.0.0.83/32<br>10.0.83.0/24 | 
**notes**
- `2001:db8:1000:1::` is not my real subnet; `2001:db8::` is a subnet reserved for documentation
- you should use your subnet addresses as appropriate
- Find out the MTU your router uses - in my case it is 9182
- Ensure the same MTU is set on vmbr0 - this can be done in the proxmox gui or by setting `mtu 9182` in `/etc/network/interfaces` in the vmbr0 stanza.
- `nano /etc/frr/daemons`, change `bgpd=no` to `bgpd=yes`, and save the file
- then systemctl reload frr
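
If you would rather script the edit than open nano, here is a minimal sketch. It assumes GNU `sed` (as shipped with Proxmox/Debian), and the helper name `enable_bgpd` is made up for illustration; it is not part of FRR.

```shell
#!/bin/sh
# enable_bgpd FILE: flip "bgpd=no" to "bgpd=yes" in an FRR daemons file.
# Hypothetical helper; only an exact "bgpd=no" line is changed.
enable_bgpd() {
    sed -i 's/^bgpd=no$/bgpd=yes/' "$1"
}
```

On a node this would be used as `enable_bgpd /etc/frr/daemons`, followed by `grep '^bgpd=' /etc/frr/daemons` to confirm the change and then the reload step.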
Use the settings from the table above and remember to change them as needed for each node.
- add this to `frr.conf` and restart frr
router bgp 65001
 bgp router-id 192.168.1.81
 no bgp ebgp-requires-policy
 neighbor 2001:db8:1000:1::82 remote-as 65001
 neighbor 2001:db8:1000:1::83 remote-as 65001
 neighbor 2001:db8:1000:1::1 remote-as 65001
 neighbor 192.168.1.82 remote-as 65001
 neighbor 192.168.1.83 remote-as 65001
 neighbor 192.168.1.1 remote-as 65001
 address-family ipv6 unicast
  network fc00::81/128
  network fc00:81::/64
  neighbor 2001:db8:1000:1::82 activate
  neighbor 2001:db8:1000:1::83 activate
  neighbor 2001:db8:1000:1::1 activate
 exit-address-family
 address-family ipv4 unicast
  network 10.0.0.81/32
  network 10.0.81.0/24
  neighbor 192.168.1.82 activate
  neighbor 192.168.1.83 activate
  neighbor 192.168.1.1 activate
 exit-address-family
Remember to change the IP addresses to match the table above. The easy way to think about this is:
- network = subnets on this node I want to advertise
- neighbor = other routers/nodes I have to talk to
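
Since only the last octet changes between nodes, the per-node stanza can be generated rather than hand-edited. The following is a rough sketch, not a tested deployment tool: the function name `gen_bgp` and the variable names are made up for illustration, and the addresses are the documentation values from the table.

```shell
#!/bin/sh
# Values copied from the table above; change them for your own LAN.
LAN4="192.168.1"            # IPv4 LAN prefix
LAN6="2001:db8:1000:1::"    # IPv6 LAN prefix (documentation range)
ASN=65001
NODES="81 82 83"            # last octet of each Proxmox node
ROUTER=1                    # last octet of the router

# gen_bgp ID: print the "router bgp" stanza for node ID (81, 82 or 83).
gen_bgp() {
    id="$1"
    peers=""
    # neighbors = every other node plus the router
    for n in $NODES $ROUTER; do
        [ "$n" = "$id" ] || peers="$peers $n"
    done
    echo "router bgp $ASN"
    echo " bgp router-id $LAN4.$id"
    echo " no bgp ebgp-requires-policy"
    for p in $peers; do echo " neighbor $LAN6$p remote-as $ASN"; done
    for p in $peers; do echo " neighbor $LAN4.$p remote-as $ASN"; done
    echo " address-family ipv6 unicast"
    # network = the mesh routes this node advertises
    echo "  network fc00::$id/128"
    echo "  network fc00:$id::/64"
    for p in $peers; do echo "  neighbor $LAN6$p activate"; done
    echo " exit-address-family"
    echo " address-family ipv4 unicast"
    echo "  network 10.0.0.$id/32"
    echo "  network 10.0.$id.0/24"
    for p in $peers; do echo "  neighbor $LAN4.$p activate"; done
    echo " exit-address-family"
}
```

Running `gen_bgp 82` prints the stanza for pve2; review the output before pasting it into `frr.conf`.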
Instead of explicitly defining the networks you could use one line that says `redistribute connected` in place of the two `network` values in each family. This would advertise all directly connected routes (not kernel or static routes) to your router.
I think this would be a better way as there is less reconfiguration. But given the unpredictability of this in different people's environments I elected to explicitly define the routes for predictability.
This has only been tested on an EFG running network app 9.1.92
| Field | Value | 
|---|---|
| BGP ASN | 65001 | 
| BGP Router ID | 192.168.1.1 | 
| IPv4 LAN Address | 192.168.1.1 | 
| IPv6 LAN Address | 2001:db8:1000:1::1/64 | 
**notes**
- the BGP AS must be in the range of `64512-65534` or your ISP will get very cross at you ;-)
- the BGP Router ID can be anything really, convention makes it the router IPv4 address
- the LAN addresses are the normal LAN addresses of your router's LAN port

Remember to create the file below in an editor that uses linux text formatting (LF line endings).
router bgp 65001
 bgp router-id 192.168.1.1
 no bgp ebgp-requires-policy
 neighbor 192.168.1.81 remote-as 65001
 neighbor 192.168.1.81 update-source 192.168.1.1
 neighbor 192.168.1.81 next-hop-self
 neighbor 192.168.1.82 remote-as 65001
 neighbor 192.168.1.82 update-source 192.168.1.1
 neighbor 192.168.1.82 next-hop-self
 neighbor 192.168.1.83 remote-as 65001
 neighbor 192.168.1.83 update-source 192.168.1.1
 neighbor 192.168.1.83 next-hop-self
 neighbor 2001:db8:1000:1::81 remote-as 65001
 neighbor 2001:db8:1000:1::81 update-source 2001:db8:1000:1::1
 neighbor 2001:db8:1000:1::82 remote-as 65001
 neighbor 2001:db8:1000:1::82 update-source 2001:db8:1000:1::1
 neighbor 2001:db8:1000:1::83 remote-as 65001
 neighbor 2001:db8:1000:1::83 update-source 2001:db8:1000:1::1
 address-family ipv6 unicast
  neighbor 2001:db8:1000:1::81 activate
  neighbor 2001:db8:1000:1::81 next-hop-self
  neighbor 2001:db8:1000:1::82 activate
  neighbor 2001:db8:1000:1::82 next-hop-self
  neighbor 2001:db8:1000:1::83 activate
  neighbor 2001:db8:1000:1::83 next-hop-self
 exit-address-family
 address-family ipv4 unicast
  neighbor 192.168.1.81 activate
  neighbor 192.168.1.81 next-hop-self
  neighbor 192.168.1.82 activate
  neighbor 192.168.1.82 next-hop-self
  neighbor 192.168.1.83 activate
  neighbor 192.168.1.83 next-hop-self
 exit-address-family

**note**
- the format above is ordered for easy reading
- `vtysh -c "show running-config"` will show a different layout
- the settings above won't be written to `/etc/frr/frr.conf`, so don't worry if that's empty
- in unifi network application go to settings > routing > bgp
- name = ceph-mesh
- device = name of router (should be on the drop down)
- click upload and upload file
- do NOT select the check box `override wan monitors` (uncheck it if it is checked)
Notes:
- I found that sometimes frr.service can crash on the router when you upload the file; if it does, just restart it
- The router learns `fc00::8x/128`, `fc00:8x::/64`, `10.0.0.8x/32`, and `10.0.8x.0/24` routes from the nodes.
- you can use the following commands to check everything looks good:
vtysh -c "show bgp ipv6 unicast summary"
vtysh -c "show bgp ipv4 unicast summary"
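It will look something like this.
Note the `i` in front of each prefix is the `internal` status code from the legend (these routes were learned over iBGP), not part of the address; this is how the frr 8.1 that my router is running prints it.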
root@EFG:/etc/frr# vtysh -c "show bgp ipv6 unicast"
BGP table version is 6, local router ID is 192.168.1.1, vrf id 0
Default local pref 100, local AS 65001
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
   Network          Next Hop            Metric LocPrf Weight Path
*>ifc00::81/128     fe80::4a21:bff:fe58:9c45
                                             0    100      0 i
*>ifc00::82/128     fe80::4a21:bff:fe56:a5d8
                                             0    100      0 i
*>ifc00::83/128     fe80::4a21:bff:fe56:a650
                                             0    100      0 i
*>ifc00:81::/64     fe80::4a21:bff:fe58:9c45
                                             0    100      0 i
*>ifc00:82::/64     fe80::4a21:bff:fe56:a5d8
                                             0    100      0 i
*>ifc00:83::/64     fe80::4a21:bff:fe56:a650
                                             0    100      0 i
Displayed  6 routes and 6 total paths
root@EFG:/etc/frr# vtysh -c "show bgp ipv4 unicast"
BGP table version is 6, local router ID is 192.168.1.1, vrf id 0
Default local pref 100, local AS 65001
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
   Network          Next Hop            Metric LocPrf Weight Path
*>i10.0.0.81/32     192.168.1.81             0    100      0 i
* i                 192.168.1.81             0    100      0 i
*>i10.0.0.82/32     192.168.1.82             0    100      0 i
* i                 192.168.1.82             0    100      0 i
*>i10.0.0.83/32     192.168.1.83             0    100      0 i
* i                 192.168.1.83             0    100      0 i
*>i10.0.81.0/24     192.168.1.81             0    100      0 i
* i                 192.168.1.81             0    100      0 i
*>i10.0.82.0/24     192.168.1.82             0    100      0 i
* i                 192.168.1.82             0    100      0 i
*>i10.0.83.0/24     192.168.1.83             0    100      0 i
* i                 192.168.1.83             0    100      0 i
Displayed  6 routes and 12 total paths
If you don't see the right things above, double check the router can see its neighbors with `vtysh -c "show bgp sum"`. You should get something like this:
IPv4 Unicast Summary (VRF default):
BGP router identifier 192.168.1.1, local AS number 65001 vrf-id 0
BGP table version 6
RIB entries 11, using 2024 bytes of memory
Peers 6, using 4338 KiB of memory
Neighbor                  V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
pve1(192.168.1.81)        4      65001        86        85        0    0    0 00:04:09            2        0 N/A
pve2(192.168.1.82)        4      65001        86        85        0    0    0 00:04:08            2        0 N/A
pve3(192.168.1.83)        4      65001        86        85        0    0    0 00:04:08            2        0 N/A
pve1(2001:db8:1000:1::81) 4      65001        88        86        0    0    0 00:04:09            2        0 N/A
pve2(2001:db8:1000:1::82) 4      65001        88        86        0    0    0 00:04:08            2        0 N/A
pve3(2001:db8:1000:1::83) 4      65001        88        86        0    0    0 00:04:08            2        0 N/A
Total number of neighbors 6
IPv6 Unicast Summary (VRF default):
BGP router identifier 192.168.1.1, local AS number 65001 vrf-id 0
BGP table version 6
RIB entries 11, using 2024 bytes of memory
Peers 3, using 2169 KiB of memory
Neighbor                  V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
pve1(2001:db8:1000:1::81) 4      65001        88        86        0    0    0 00:04:09            2        0 N/A
pve2(2001:db8:1000:1::82) 4      65001        88        86        0    0    0 00:04:08            2        0 N/A
pve3(2001:db8:1000:1::83) 4      65001        88        86        0    0    0 00:04:08            2        0 N/A
Total number of neighbors 3
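
To avoid reading the summary by eye, the output can be filtered for peers that are not established. This is a sketch that assumes the column layout shown above, where `State/PfxRcd` is the tenth column and holds a number for an established session and a state name such as `Active` otherwise. The function name `peers_down` is made up for illustration.

```shell
#!/bin/sh
# peers_down: read "show bgp summary" text on stdin and print every peer whose
# State/PfxRcd column is not a prefix count. Returns non-zero if any is found.
peers_down() {
    awk '
        # peer rows have BGP version 4 in column 2 and a numeric AS in column 3
        $2 == 4 && $3 ~ /^[0-9]+$/ && $10 !~ /^[0-9]+$/ {
            print $1, $10
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

On the router this would be used as `vtysh -c "show bgp summary" | peers_down`; no output and a zero exit status means every listed peer is up.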
If you see nothing, or something is missing, the most likely issues are:
- MTU mismatch between br0 on the router and vmbr0 on the proxmox nodes
- a firewall blocking BGP packets somewhere
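
To rule out the MTU mismatch, a small sketch like this compares an interface's MTU with the value you expect. It assumes a Linux host that exposes `/sys/class/net`; the function names `mtu_of` and `check_mtu` are made up for illustration.

```shell
#!/bin/sh
# mtu_of IFACE: print the MTU the kernel reports for IFACE.
mtu_of() {
    cat "/sys/class/net/$1/mtu"
}

# check_mtu IFACE EXPECTED: succeed only if IFACE uses the EXPECTED MTU.
check_mtu() {
    actual=$(mtu_of "$1") || return 2    # interface not found
    if [ "$actual" -eq "$2" ]; then
        echo "$1: MTU $actual OK"
    else
        echo "$1: MTU $actual, expected $2" >&2
        return 1
    fi
}
```

On a proxmox node you might run `check_mtu vmbr0 9182`, and the same check against `br0` on the router, substituting the MTU your own router uses.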
Example frr.conf after these changes (this is from my pve1)
root@pve1 16:29:49 ~ # cat  /etc/frr/frr.conf
frr version 8.5.2
frr defaults datacenter
hostname pve1
log syslog informational
service integrated-vtysh-config
interface en05
 ip router openfabric 1
 ipv6 router openfabric 1
 openfabric hello-interval 1
 openfabric hello-multiplier 3
 openfabric csnp-interval 5
 openfabric psnp-interval 2
exit
interface en06
 ip router openfabric 1
 ipv6 router openfabric 1
 openfabric hello-interval 1
 openfabric hello-multiplier 3
 openfabric csnp-interval 5
 openfabric psnp-interval 2
exit
interface lo
 ip router openfabric 1
 ipv6 router openfabric 1
 openfabric passive
exit
interface vmbr100
 ip router openfabric 1
 ipv6 router openfabric 1
 openfabric passive
exit
router openfabric 1
 net 49.0000.0000.0081.00
 lsp-gen-interval 5
exit
router bgp 65001
 bgp router-id 192.168.1.81
 no bgp ebgp-requires-policy
 neighbor 2600:a801:830:1::82 remote-as 65001
 neighbor 2600:a801:830:1::83 remote-as 65001
 neighbor 2600:a801:830:1::1 remote-as 65001
 neighbor 192.168.1.82 remote-as 65001
 neighbor 192.168.1.83 remote-as 65001
 neighbor 192.168.1.1 remote-as 65001
 address-family ipv6 unicast
  network fc00::81/128
  network fc00:81::/64
  neighbor 2600:a801:830:1::82 activate
  neighbor 2600:a801:830:1::83 activate
  neighbor 2600:a801:830:1::1 activate
 exit-address-family
 address-family ipv4 unicast
  network 10.0.0.81/32
  network 10.0.81.0/24
  neighbor 192.168.1.82 activate
  neighbor 192.168.1.83 activate
  neighbor 192.168.1.1 activate
 exit-address-family
If my router doesn't support BGP I suppose there isn't really any way around it? I need to upgrade, or this type of thunderbolt based network isn't going to work for me I imagine.