**A quick "how to" on what you need to do to both setup AND recover a single-server PostgreSQL database using WAL-E** * WAL-E: https://github.com/wal-e/wal-e * Assuming Ubuntu 12.04 LTS ("Precise") * We'll be using S3. Make sure you have an IAM in a group with `GetObject`, `ListBucket` and `PutObject` on the bucket you want to use (and that it's not public). #### Setup: 1. These packages: ``` $ sudo apt-get install daemontools libevent-dev python-all-dev lzop pv $ sudo easy_install pip $ sudo pip install wal-e ``` Notes: daemontools provides us with the `envdir` program, which lets us (safely) store sensitive keys as environmental variables. We'll use it to store our S3 credentials. 2. An S3 bucket (i.e. `/myservice/pg-backups`) 3. An S3 user that can (at a minimum) GetObject, ListBucket and PutObject. I'd suggest turning on versioning on the bucket and adding GetObjectVersion to that list too. That'll make sure server compromise won't allow someone to overwrite the backups you made[3] 4. Set up your access keys: ``` $ mkdir -p /etc/wal-e.d/env $ echo "YOUR_AWS_ACCESS_KEY" > /etc/wal-e.d/env/AWS_ACCESS_KEY_ID $ echo "YOUR_AWS_SECRET" > /etc/wal-e.d/env/AWS_SECRET_ACCESS_KEY $ echo 's3://myservice/pg-backups' > /etc/wal-e.d/env/WALE_S3_PREFIX $ chown -R root:postgres /etc/wal-e.d ``` 5. Uncomment and modify these lines in your `postgresql.conf` file under `/etc/postgresql/9.3/main/` ``` wal_level = archive archive_mode = on archive_command = 'envdir /etc/wal-e.d/env /usr/local/bin/wal-e wal-push %p' archive_timeout = 60 ``` Note: Make sure to use the full path to wal-e so that PostgreSQL can find it. Now we're ready to make our first backup. You can also set the `PGDATA` environmental variable to the location of your PostgreSQL data dir (`/var/lib/postgresql/9.3/main` on Ubuntu/Debian) either as a shell variable or using `envdir` like we did for the AWS keys. #### Pushing Backups: 1. Make sure you're the `postgres` user (the default user PostgreSQL runs as) with `sudo su - postgres` 2. Create our first full backup, which is the reference point for our WAL logs with the following: `envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-push /var/lib/postgresql/9.3/main` 3. We can check the list of full backups with `envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-list` You can also delete older backups with `wal-e delete [--confirm] before ` where is from `backup-list`. Make sure to double check the dates on the backups. #### Fetching a Backup: We'll assume worst-case here: you have a blank machine with everything as it was *except* your data. Postgres has been re-installed as before and you're now wanting to recover your database. Note: Make sure to [read the WAL-E docs on user-created tablespaces](https://github.com/wal-e/wal-e#backup-fetch) and how that interacts with fetches first. 1. Make sure PostgreSQL isn't running and change to the `postgres` user. 2. Delete the default data directory as the restore process will re-create it. ``` $ rm -r /var/lib/postgresql/9.3/main # (under Ubuntu/Debian) ``` 3. Fetch the latest backup. You can replace `LATEST` with the name of a specific backup as identified from `backup-list`: ``` $ envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-fetch /var/lib/postgresql/9.3/main LATEST ``` 4. Ensure the permissions are correct: ``` $ chown -R postgres:postgres /var/lib/postgresql/9.3/main $ chmod 0700 /var/lib/postgresql/9.3/main ``` 5. Create a `recovery.conf` within `/var/lib/postgresql/9.3/main/` with the following contents: ``` restore_command = '/usr/bin/envdir /etc/wal-e.d/env /usr/local/bin/wal-e wal-fetch "%f" "%p"' ``` 6. Start your PostgreSQL server. It might take a little while (it will replay the the WAL files on top of your specified base backup). The `recovery.conf` file will be renamed to `recovery.done` once the process is complete. 7. Get back to work (read: check your data is there!) ##### References: * [1] https://gist.github.com/ruckus/2293434 * [2] https://coderwall.com/p/cwe2_a * [3] http://blog.opbeat.com/2013/01/07/postgresql-backup-to-s3-part-one/