cabbiepete · September 27, 2018 22:30 · Mar 29, 2016
diff --git a/README.md b/README.md
@@ -0,0 +1,205 @@
+# Taking back your Mandrill click-tracking links
+
+My company, like many, has recently switched away from using Mandrill
+for our transactional email.
+
+We'd been using Mandrill's [click-tracking][mandrill-click-tracking] feature,
+and became worried about what would happen to all those old emailed links
+after we cancel our Mandrill account.
+
+Why should we care about links in emails sent a month or more ago?
+Well, turns out a lot of our users treat emails as de facto bookmarks.
+To get back to our site, they dig up an old email from us and click a
+link in it. (This is more common than you might expect, particularly 
+for consumer-oriented sites.)
+
+Mandrill hasn't stated what will happen to click tracking redirects
+in cancelled accounts. There's no reason to believe they'd break them,
+but to be safe, we decided to take back our click-tracking domain and
+handle those redirects ourselves.
+
+And you can too.
+
+
+## Prerequisite: custom tracking domain
+
+This only works if you had set up your own [custom tracking domain][custom-tracking-domain]
+in Mandrill -- because you'll be able to repoint that domain to your own server.
+If all your old emails have links directly to mandrillapp.com, there's nothing
+you can do about that now.
+
+Say you'd been using `click.example.com` as your custom tracking domain, and you'd
+CNAMEd that to `mandrillapp.com`. Our goal is to write our own server code that can handle
+Mandrill's redirect links. We'll then change `click.example.com`
+to point at our code, and we'll have no more dependencies on Mandrill.
+
+
+## Decoding a Mandrill tracking link
+
+To do our own redirects, we'll need to figure out how Mandrill was encoding links.
+(Not interested in the gory details? Just [skip to the code](#handling-in-django).)
+
+Here's an actual Mandrill tracking link I extracted from one of my old emails:
+
+    http://go.planapple.com/track/click/11003603/www.planapple.com?p=eyJzIjoiZGk1ZDNtM2tHaFBjaXJvRWZKU2w3LXhqRnBzIiwidiI6MSwicCI6IntcInVcIjoxMTAwMzYwMyxcInZcIjoxLFwidXJsXCI6XCJodHRwczpcXFwvXFxcL3d3dy5wbGFuYXBwbGUuY29tXFxcL3N1cHBvcnRcXFwvP3V0bV9tZWRpdW09ZW1haWwmdXRtX3NvdXJjZT10cmFuc2FjdGlvbmFsJnV0bV9jYW1wYWlnbj1wYXNzd29yZF9yZXNldFwiLFwiaWRcIjpcIjk5ZGIyYjNiOTM1MzQ4Mjc5OTg1ZDY4ZGI3MWU4ODI0XCIsXCJ1cmxfaWRzXCI6W1wiY2U2OTJhMTlkMmUyMjc5OWJiM2E2YzU5OGNlN2NkMmNmMWYxYzQ2ZFwiXX0ifQ
+
+In the visible parts of that URL:
+
+* `go.planapple.com` is our custom tracking domain
+* `11003603` was (I'm guessing) our Mandrill account id
+* `www.planapple.com` is the host portion (only) of the target link.
+  (I'm guessing Mandrill includes it so users can see some hint about
+  where the link will lead.)
+
+But we want the full target link, not just the host. It must be in
+that `p` parameter, which looks like it might be base64 (without
+the trailing-equals padding). Sure enough, [decoding][base64-decoder] 
+it gives a JSON string:
+
+```json
+{
+  "s": "di5d3m3kGhPciroEfJSl7-xjFps",
+  "v": 1,
+  "p": "{\"u\":11003603,\"v\":1,\"url\":\"https:\\\/\\\/www.planapple.com\\\/support\\\/?utm_medium=email&utm_source=transactional&utm_campaign=password_reset\",\"id\":\"99db2b3b935348279985d68db71e8824\",\"url_ids\":[\"ce692a19d2e22799bb3a6c598ce7cd2cf1f1c46d\"]}"
+}
+```
+
+This appears to be some sort of signed JSON blob, where the real JSON of interest
+is in the `p` payload field, and the `s` field is a signature Mandrill can
+use to validate the payload.
+
+Validation is important. We don't want to create an [open redirect vulnerability][open-redirect]
+on our site. We won't be able to verify Mandrill's signature (since we don't have their secret),
+so we'll take a different approach. More about that later.
+
+Parsing the JSON from the `p` payload gives us the actual redirect params
+we were hoping for:
+
+```json
+{
+  "u": 11003603,
+  "v": 1,
+  "url": "https://www.planapple.com/support/?utm_medium=email&utm_source=transactional&utm_campaign=password_reset",
+  "id": "99db2b3b935348279985d68db71e8824",
+  "url_ids": [
+    "ce692a19d2e22799bb3a6c598ce7cd2cf1f1c46d"
+  ]
+}
+```
+
+* `u` is our Mandrill account ID, again
+* `v` is probably the version of the parameters format.
+  (I've only seen v:1, on recent emails. Hoping there wasn't a v:0 earlier.)
+* `url` is the target url we were looking for.
+  (It even has the Google Analytics params Mandrill added for us!)
+* `id` is the Mandrill message uuid -- which you could use
+  if you wanted to keep logging click tracking stats
+  from old emails. (We're not going to bother with that.)
+* `url_ids` is... I'm not sure. (Maybe related to Mandrill's url-tagging
+  feature. We're going to ignore it.)
+
+
+## Handling in Django
+
+Now that we've decoded Mandrill's redirect data format, handling it is easy.
+The code below is for Django, but it shouldn't be hard to adapt to other
+environments.
+
+The only (somewhat) tricky part is validating the target to make sure we don't
+create an [open redirect vulnerability][open-redirect] on our site. 
+In my case, our emails *only* contained links back to our own site, so I can 
+simply check the targets using Django's [`is_safe_url`][is-safe-url] helper with
+my site's hostname. If your emails linked to a variety of domains,
+you'll need to come up with some other way to validate the redirect targets.
+
+Here's a simple Django view to handle the redirects:
+
+```python
+import json
+from django.core.exceptions import SuspiciousOperation
+from django.http import HttpResponseRedirect
+from django.utils.http import is_safe_url, urlsafe_base64_decode
+
+TARGET_HOSTNAME = "www.example.com"  # You expect all redirects to go here
+
+def legacy_mandrill_click_tracking_redirect(request, mandrill_account=None, target_host=None):
+    """Handle a Mandrill click-tracking redirect link"""
+
+    try:
+        b64payload = request.GET['p']
+        # (Django's urlsafe_base64_decode handles missing '=' padding)
+        payload = json.loads(urlsafe_base64_decode(b64payload))
+        params = json.loads(payload['p'])
+        if payload['v'] != 1 or params['v'] != 1:  # we've only seen v:1 of each
+            raise ValueError("unsupported Mandrill payload version")
+        target = params['url']
+    except (KeyError, TypeError, ValueError):
+        # Missing/unparseable query params/payload format
+        raise SuspiciousOperation("tried to redirect with garbled payload")
+
+    # Verify we're only redirecting to our own site (don't be an open redirect server):
+    if not is_safe_url(target, TARGET_HOSTNAME):
+        raise SuspiciousOperation("tried to redirect to unsafe url '%s'" % target)
+
+    # If you want to be extra-paranoid, you could also check that:
+    #   int(mandrill_account) == params['u']  (the URL captures a string; `u` is a number)
+    #   target_host == urlparse(target).netloc
+
+    # Want to actually log the click for your own metrics?
+    # This would be a good place to do it.
+
+    return HttpResponseRedirect(target)
+```
+
+Add this view to your Django urlpatterns with something like this:
+
+```python
+from django.conf.urls import url
+from yourapp.views import legacy_mandrill_click_tracking_redirect
+
+urlpatterns = [  # plain list; patterns() is deprecated as of Django 1.8
+    # ...
+    url(r'^track/click/(?P<mandrill_account>(\d+))/(?P<target_host>([a-z0-9.-]+))/?$',
+        legacy_mandrill_click_tracking_redirect),
+    # ...
+]
+```
+
+You *may* also need to add your Mandrill click-tracking domain to Django's
+[ALLOWED_HOSTS][allowed-hosts] setting.
+
+This is a good time to do some local testing with redirect urls from your own Mandrill 
+messages. Edit your /etc/hosts file to point your Mandrill click-tracking domain to your
+dev server, then try clicking some links in old Mandrill emails. (Don't forget to
+edit /etc/hosts back when you're done.)
+
+Once it's working, you can edit your DNS to change your old Mandrill 
+click-tracking CNAME from `mandrillapp.com` to your live Django app.
+
+
+## Bonus: open-tracking
+
+If you were using Mandrill open-tracking, its tracking pixels will now be loaded from
+your site. You can just ignore them (users won't notice a 404 error in a 1x1
+pixel image). Or you could write some code to handle them and return a transparent
+image (and even log the open metrics, if you wanted).
+
+The format of the open-tracking pixel is:
+
+    http://go.planapple.com/track/open.php?u=11003603&id=99db2b3b935348279985d68db71e8824
+
+where:
+
+* `go.planapple.com` is your custom tracking domain
+* the `u` param is your Mandrill account id
+* the `id` param is the Mandrill message uuid
+
+Code to serve a transparent GIF on /track/open.php is "left as an exercise to the reader."
+
+
+[allowed-hosts]: https://docs.djangoproject.com/en/1.9/ref/settings/#allowed-hosts
+[base64-decoder]: https://www.base64decode.org/
+[custom-tracking-domain]: https://mandrill.zendesk.com/hc/en-us/articles/205582917
+[is-safe-url]: https://github.com/django/django/blob/1.8.11/django/utils/http.py#L269
+[mandrill-click-tracking]: https://mandrill.zendesk.com/hc/en-us/articles/205582897
+[open-redirect]: https://www.owasp.org/index.php/Unvalidated_Redirects_and_Forwards_Cheat_Sheet
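Note on the open-tracking exercise at the end of the README: the handler left to the reader could be sketched roughly as follows. This is only an illustration and not part of the original gist. `TRANSPARENT_GIF` and `open_tracking_pixel` are hypothetical names, and the function is deliberately framework-free: it returns a plain dict that a Django view could turn into something like `HttpResponse(body, content_type="image/gif")`. The `Cache-Control` header is an assumption about what you would want, not something Mandrill is known to have sent.

```python
import base64

# A commonly used 1x1 transparent GIF (42 bytes), kept as base64 for readability.
TRANSPARENT_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def open_tracking_pixel(query_params):
    """Describe the response for a /track/open.php request.

    query_params is a plain dict built from the request's query string.
    The `u` (account id) and `id` (message uuid) values are passed back
    under "log" so the caller can record the open if it wants to.
    """
    return {
        "status": 200,
        "headers": {
            "Content-Type": "image/gif",
            "Content-Length": str(len(TRANSPARENT_GIF)),
            # Assumption: discourage caching so repeat opens reach the server.
            "Cache-Control": "no-store",
        },
        "body": TRANSPARENT_GIF,
        "log": {
            "account": query_params.get("u"),
            "message": query_params.get("id"),
        },
    }


response = open_tracking_pixel({"u": "11003603", "id": "99db2b3b935348279985d68db71e8824"})
print(response["headers"]["Content-Type"], len(response["body"]))
```

Returning a 404 instead remains a perfectly reasonable choice, as the README says; the sketch only matters if you want the old emails to load cleanly or want to keep counting opens.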
diff --git a/urls.py b/urls.py
@@ -0,0 +1,9 @@
+from django.conf.urls import url
+from yourapp.views import legacy_mandrill_click_tracking_redirect
+
+urlpatterns = [  # plain list; patterns() is deprecated as of Django 1.8
+    # ...
+    url(r'^track/click/(?P<mandrill_account>(\d+))/(?P<target_host>([a-z0-9.-]+))/?$',
+        legacy_mandrill_click_tracking_redirect),
+    # ...
+]
diff --git a/views.py b/views.py
@@ -0,0 +1,34 @@
+import json
+from django.core.exceptions import SuspiciousOperation
+from django.http import HttpResponseRedirect
+from django.utils.http import is_safe_url, urlsafe_base64_decode
+
+TARGET_HOSTNAME = "www.example.com"  # You expect all redirects to go here
+
+def legacy_mandrill_click_tracking_redirect(request, mandrill_account=None, target_host=None):
+    """Handle a Mandrill click-tracking redirect link"""
+
+    try:
+        b64payload = request.GET['p']
+        # (Django's urlsafe_base64_decode handles missing '=' padding)
+        payload = json.loads(urlsafe_base64_decode(b64payload))
+        params = json.loads(payload['p'])
+        if payload['v'] != 1 or params['v'] != 1:  # we've only seen v:1 of each
+            raise ValueError("unsupported Mandrill payload version")
+        target = params['url']
+    except (KeyError, TypeError, ValueError):
+        # Missing/unparseable query params/payload format
+        raise SuspiciousOperation("tried to redirect with garbled payload")
+
+    # Verify we're only redirecting to our own site (don't be an open redirect server):
+    if not is_safe_url(target, TARGET_HOSTNAME):
+        raise SuspiciousOperation("tried to redirect to unsafe url '%s'" % target)
+
+    # If you want to be extra-paranoid, you could also check that:
+    #   int(mandrill_account) == params['u']  (the URL captures a string; `u` is a number)
+    #   target_host == urlparse(target).netloc
+
+    # Want to actually log the click for your own metrics?
+    # This would be a good place to do it.
+
+    return HttpResponseRedirect(target)
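Note on the "extra-paranoid" checks mentioned in the view's comments: the decoding steps and those two comparisons can be tried outside Django with a small sketch like the one below. It is an illustration under stated assumptions, not the gist's code. It targets Python 3 and the standard library only, `decode_tracking_params` and `passes_extra_checks` are hypothetical names, and the sample payload is made up to match the shape shown in the README rather than taken from a real email. One detail worth noticing is that the URL regex captures the account as a string while `u` is a JSON number, so the two have to be normalized before comparing.

```python
import base64
import json
from urllib.parse import urlparse


def decode_tracking_params(b64payload):
    """Return the inner params dict from a Mandrill-style `p` query value."""
    # The trailing '=' padding is omitted in the links, so restore it first.
    padded = b64payload + "=" * (-len(b64payload) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded).decode("utf-8"))
    # The outer object's own `p` field is itself a JSON-encoded string.
    return json.loads(payload["p"])


def passes_extra_checks(params, mandrill_account, target_host):
    """Compare the URL path components against the decoded params."""
    # Normalize both sides to strings: the path gives "12345", JSON gives 12345.
    same_account = str(params["u"]) == str(mandrill_account)
    same_host = urlparse(params["url"]).netloc == target_host
    return same_account and same_host


# Hypothetical payload shaped like the README example (not a real link).
inner = {"u": 12345, "v": 1, "url": "https://www.example.com/support/", "id": "0" * 32}
outer = {"s": "not-a-real-signature", "v": 1, "p": json.dumps(inner)}
sample = (
    base64.urlsafe_b64encode(json.dumps(outer).encode("utf-8"))
    .decode("ascii")
    .rstrip("=")
)

params = decode_tracking_params(sample)
print(params["url"])
print(passes_extra_checks(params, "12345", "www.example.com"))
```

These checks add to the `is_safe_url` test in the view; they do not replace it, since an attacker who forges the payload can also forge the path components to match.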