Created
March 21, 2019 08:02
knadh created this gist Mar 21, 2019.
# Running multiple active publishers on a NATS cluster for failover while avoiding message duplication

[NATS](https://nats.io) is an excellent, clustered, full-mesh PubSub messaging system, highly performant and a cakewalk to set up. Full mesh means every node (servers and clients) knows about every other node, which is great, but makes it tricky to keep multiple publishers on hot standby, for high availability of the publishers (not the NATS network), while avoiding duplicate publishes.

Here `--no_advertise` comes in handy if we're willing to sacrifice the automatic meshing and discovery mechanism. This may be acceptable in setups where only a fixed set of NATS servers run in a cluster and their addresses (either IPs or hostnames) are known.

### --no_advertise
The `gnatsd --no_advertise` flag makes a NATS server not advertise itself automatically to the mesh. For other nodes to discover `--no_advertise` nodes, the `--routes` have to be explicitly specified. If there are `N` servers, there should be `N` routes.

### -sl reload
`gnatsd -sl reload=pid` makes a running NATS server reload configuration from its config file (`-c`) without downtime. This can be used to take out (or add) a server's `routes` at runtime.

## Premise
- A cluster of two NATS servers, `server0` and `server1`, that have `N` subscribers listening to the subject `test`.
- A live publisher, `publisher0`, that is publishing on the subject `test`.
- A hot standby publisher, `publisher1`, that is also publishing on the subject `test`, but whose messages should only take effect in the cluster if `publisher0` goes down.
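To make the premise concrete, here is a minimal sketch of how the config file for one of the upstream servers might be generated. It is an illustration under assumptions, not the setup used here: `render_config` is a hypothetical helper, the `0.0.0.0` bind address is assumed, and the keys (`port`, `cluster`, `listen`, `no_advertise`, `routes`) follow the gnatsd configuration file format as we understand it, so check them against your server version.

```python
def render_config(client_port, cluster_port, routes):
    """Return the text of a gnatsd config with a fixed, non-advertised route list.

    `routes` holds "host:port" cluster addresses of the peers to connect to.
    An empty list leaves the server with no explicit routes.
    """
    route_lines = "".join(f"    nats-route://{r}\n" for r in routes)
    return (
        f"port: {client_port}\n"
        "cluster {\n"
        f"  listen: 0.0.0.0:{cluster_port}\n"
        "  no_advertise: true\n"
        "  routes = [\n"
        f"{route_lines}"
        "  ]\n"
        "}\n"
    )


if __name__ == "__main__":
    # server0 lists server1 as its only peer; server1 would list server0.
    print(render_config(4222, 4248, ["server1:4248"]))
```

The output would be saved to a file and passed to the server with `gnatsd -c <file>`.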
## Solution
- Each publisher gets its own local NATS server (here, `dummy-nats0` and `dummy-nats1` for publishers `publisher0` and `publisher1` respectively).
- The publishers do not publish directly to the upstream cluster, but to their local NATS servers.
- The primary publisher `publisher0`'s dummy NATS server `dummy-nats0` is clustered to the upstream NATS servers (via `routes`).
- The backup publisher `publisher1`'s dummy NATS server `dummy-nats1` is not clustered to the upstream NATS servers (empty `routes`).
- These configurations are specified in local [configuration files](https://github.com/nats-io/gnatsd#configuration-file).
- When `publisher0` goes down or there is a fault (assuming there's a healthcheck mechanism):
  1. Remove the upstream NATS `routes` from `dummy-nats0`'s configuration and issue a `gnatsd -sl reload`.
  2. Add the upstream NATS `routes` to `dummy-nats1`'s configuration and do a `gnatsd -sl reload`.

The messages `publisher0` had been publishing will immediately stop reaching the cluster, making way for `publisher1`'s. Even if `publisher0` or `dummy-nats0` comes back up, its messages will stay self-contained and not be pushed to the cluster, as `--no_advertise` prevents automatic discovery and cluster formation, avoiding duplicate messages.

```
+----------------------------------------------------------------------------+
|                                                                            |
|                             N ... subscribers                              |
|                                                                            |
+----------------------------------------------------------------------------+
                          -/                       -\
                        -/                           -\
                      -/                               -\
                    -/                                   -\
                  -/                                       -\
    +----------------------------+         +---------------------------------+
    |                            |         |                                 |
    |   NATS server0             |         |   NATS server1                  |
    |                            |         |                                 |
    |   listen :4222             |         |   listen :4222                  |
    |   cluster-listen :4248     |---------|   cluster-listen :4248          |
    |   no-advertise             |         |   no-advertise                  |
    |                            |         |                                 |
    |                            |         |   nats-routes server0:4248      |
    +-------------|--------------+         +---------------------------------+
                  |                                 -----/
                  |                           -----/
                  |                     -----/
                  |               -----/
                  |         -----/
                  |   -----/
                  |--/
+-----------------|--------------+          +--------------------------------+
|                                |          |                                |
|  NATS dummy-nats0              |          |  NATS dummy-nats1              |
|                                |          |                                |
|  listen :4222                  |          |  listen :4222                  |
|  cluster-listen :4248          |          |  cluster-listen :4248          |
|  no-advertise                  |          |  no-advertise                  |
|                                |          |                                |
|  routes server0:4248           |          |  routes []                     |
|         server1:4248           |          |                                |
+----------------|---------------+          +----------------|---------------+
                 |                                           |
                 |                                           |
                 |                                           |
      +----------|----------+                     +----------|----------+
      |                     |                     |                     |
      |  publisher0         |                     |  publisher1         |
      |  subject test       |                     |  subject test       |
      |                     |                     |                     |
      |                     |                     |                     |
      +---------------------+                     +---------------------+
```
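The two failover steps can be sketched as a small script that a healthcheck might call. This is an illustration under assumptions, not a tested implementation: `config_text`, `write_config`, `reload_command` and `fail_over` are hypothetical names, the config keys follow the gnatsd configuration file format as we understand it, and the script is assumed to be able to write both config files and signal both servers (which, if the dummy servers run on different hosts, would need some remote execution layer not shown here).

```python
import os
import subprocess
import tempfile

# Cluster addresses of the upstream servers, as shown in the diagram.
UPSTREAM_ROUTES = ["server0:4248", "server1:4248"]


def config_text(routes):
    """Config body for a dummy NATS server with the given explicit routes."""
    lines = [
        "port: 4222",
        "cluster {",
        "  listen: 0.0.0.0:4248",  # bind address is an assumption
        "  no_advertise: true",
        "  routes = [",
    ]
    lines += [f"    nats-route://{r}" for r in routes]
    lines += ["  ]", "}"]
    return "\n".join(lines) + "\n"


def write_config(path, routes):
    """Replace the config file at `path` atomically (temp file, then rename)."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(config_text(routes))
    os.replace(tmp_path, path)


def reload_command(pid):
    """Command line that asks the gnatsd process `pid` to reload its config."""
    return ["gnatsd", "-sl", f"reload={pid}"]


def fail_over(primary_conf, primary_pid, standby_conf, standby_pid,
              run=subprocess.run):
    # Step 1: cut dummy-nats0 off from the upstream cluster. The reload is
    # best-effort (check=False) because dummy-nats0 may itself be down; the
    # rewritten file still keeps it isolated if it is restarted later.
    write_config(primary_conf, [])
    run(reload_command(primary_pid), check=False)

    # Step 2: connect dummy-nats1 to the upstream cluster.
    write_config(standby_conf, UPSTREAM_ROUTES)
    run(reload_command(standby_pid), check=True)
```

A call such as `fail_over("dummy-nats0.conf", 1234, "dummy-nats1.conf", 5678)` (paths and PIDs are placeholders) performs both steps in order. Rewriting `dummy-nats0`'s file before touching `dummy-nats1` keeps the window in which both publishers could reach the cluster as small as possible.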