Disaster recovery procedure

    Backup procedure for DeltaDevOps

One popular solution that supports DeltaDevOps is the etcd Backup Operator, part of the etcd-operator project.

etcd-operator can:

- Periodically back up the data of an etcd cluster running on DeltaDevOps to remote storage such as a PV, AWS S3, or Azure Blob Store. Periodic backups are configured with the "backupIntervalInSecond" and "maxBackups" fields (see the example manifest after this list). Project official page: https://github.com/coreos/etcd-operator/blob/master/doc/user/walkthrough/backup-operator.md
- Restore a backup with the restore-operator: https://github.com/coreos/etcd-operator/blob/master/doc/user/walkthrough/restore-operator.md
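
A minimal EtcdBackup manifest with a periodic backup policy might look like the sketch below. The field layout follows the backup-operator walkthrough linked above; the endpoint, S3 path, and secret name ("aws") are placeholders rather than values from this environment:

apiVersion: "etcd.database.coreos.com/v1beta2"
kind: "EtcdBackup"
metadata:
  name: example-etcd-cluster-backup
spec:
  etcdEndpoints: ["http://example-etcd-cluster-client:2379"]
  storageType: S3
  backupPolicy:
    # Take a snapshot every 125 seconds and keep at most 4 backups
    backupIntervalInSecond: 125
    maxBackups: 4
  s3:
    # Format: "<s3-bucket-name>/<path-to-backup-file>"
    path: mybucket/etcd.backup
    awsSecret: aws

Restore works through the same operator family: the restore-operator watches for an EtcdRestore CR that points at the same S3 path and recreates the cluster from that snapshot.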

To install it you need to:
1) Set up RBAC and deploy an etcd operator. See the install guide: https://github.com/coreos/etcd-operator/blob/master/doc/user/install_guide.md
2) Have a running etcd cluster named "example-etcd-cluster". See the instructions to deploy it: https://github.com/coreos/etcd-operator#create-and-destroy-an-etcd-cluster
3) Create a deployment of the etcd backup operator: $ kubectl create -f example/etcd-backup-operator/deployment.yaml
4) Set up an AWS secret
5) Create an EtcdBackup CR (steps 3-5 are sketched as commands after this list)
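
Steps 3-5 might look roughly like the following; the secret name "aws" and credential file paths follow the backup-operator walkthrough, and etcd-backup.yaml is a placeholder name for a manifest like the EtcdBackup example shown earlier:

# 3) Deploy the etcd backup operator
$ kubectl create -f example/etcd-backup-operator/deployment.yaml

# 4) Create a secret holding the AWS credentials used for S3 access
$ kubectl create secret generic aws \
    --from-file=credentials=$AWS_DIR/credentials \
    --from-file=config=$AWS_DIR/config

# 5) Create the EtcdBackup CR
$ kubectl create -f etcd-backup.yaml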

An etcd cluster can be recovered from failure using snapshot and restore.
    etcd is designed to withstand machine failures. An etcd cluster automatically recovers from temporary failures (e.g., machine reboots) and tolerates up to (N-1)/2 permanent failures for a cluster of N members. When a member permanently fails, whether due to hardware failure or disk corruption, it loses access to the cluster. If the cluster permanently loses more than (N-1)/2 members then it disastrously fails, irrevocably losing quorum. Once quorum is lost, the cluster cannot reach consensus and therefore cannot continue accepting updates.

    To recover from disastrous failure, etcd v3 provides snapshot and restore facilities to recreate the cluster without v3 key data loss. To recover v2 keys, refer to the v2 admin guide.
    Project official page: https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/recovery.md
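
As a rough illustration of that flow (endpoints, member names, and hosts below are placeholders; the full procedure is in the recovery guide above), a snapshot is taken from a live member and then restored into a fresh data directory for each member of the new cluster:

# Take a snapshot from a running member
$ ETCDCTL_API=3 etcdctl --endpoints http://127.0.0.1:2379 snapshot save snapshot.db

# Restore the snapshot into a new data directory for one member of the new cluster;
# repeat with the matching --name and peer URL for each member
$ ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
    --name m1 \
    --initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
    --initial-cluster-token etcd-cluster-1 \
    --initial-advertise-peer-urls http://host1:2380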