@cabbiepete
Forked from medmunds/README.md
Created September 27, 2018 22:30
Revisions

  1. @medmunds medmunds created this gist Mar 29, 2016.
    205 changes: 205 additions & 0 deletions README.md
    @@ -0,0 +1,205 @@
    # Taking back your Mandrill click-tracking links

    My company, like many, has recently switched away from using Mandrill
    for our transactional email.

    We'd been using Mandrill's [click-tracking][mandrill-click-tracking] feature,
    and became worried about what would happen to all those old emailed links
    after we cancel our Mandrill account.

    Why should we care about links in emails sent a month or more ago?
    Well, turns out a lot of our users treat emails as de facto bookmarks.
    To get back to our site, they dig up an old email from us and click a
    link in it. (This is more common than you might expect, particularly
    for consumer-oriented sites.)

    Mandrill hasn't stated what will happen to click tracking redirects
    in cancelled accounts. There's no reason to believe they'd break them,
    but to be safe, we decided to take back our click-tracking domain and
    handle those redirects ourselves.

    And you can too.


    ## Prerequisite: custom tracking domain

    This only works if you had set up your own [custom tracking domain][custom-tracking-domain]
    in Mandrill -- because you'll be able to repoint that domain to your own server.
    If all your old emails have links directly to mandrillapp.com, there's nothing
    you can do about that now.

    Say you'd been using `click.example.com` as your custom tracking domain, and you'd
    CNAMEd that to `mandrillapp.com`. Our goal is to write our own server code that can handle
    Mandrill's redirect links. We'll then change `click.example.com`
    to point at our code, and we'll have no more dependencies on Mandrill.


    ## Decoding a Mandrill tracking link

    To do our own redirects, we'll need to figure out how Mandrill was encoding links.
    (Not interested in the gory details? Just [skip to the code](#handling-in-django).)

    Here's an actual Mandrill tracking link I extracted from one of my old emails:

    http://go.planapple.com/track/click/11003603/www.planapple.com?p=eyJzIjoiZGk1ZDNtM2tHaFBjaXJvRWZKU2w3LXhqRnBzIiwidiI6MSwicCI6IntcInVcIjoxMTAwMzYwMyxcInZcIjoxLFwidXJsXCI6XCJodHRwczpcXFwvXFxcL3d3dy5wbGFuYXBwbGUuY29tXFxcL3N1cHBvcnRcXFwvP3V0bV9tZWRpdW09ZW1haWwmdXRtX3NvdXJjZT10cmFuc2FjdGlvbmFsJnV0bV9jYW1wYWlnbj1wYXNzd29yZF9yZXNldFwiLFwiaWRcIjpcIjk5ZGIyYjNiOTM1MzQ4Mjc5OTg1ZDY4ZGI3MWU4ODI0XCIsXCJ1cmxfaWRzXCI6W1wiY2U2OTJhMTlkMmUyMjc5OWJiM2E2YzU5OGNlN2NkMmNmMWYxYzQ2ZFwiXX0ifQ

    In the visible parts of that URL:

    * `go.planapple.com` is our custom tracking domain
    * `11003603` was (I'm guessing) our Mandrill account id
    * `www.planapple.com` is the host portion (only) of the target link.
    (I'm guessing Mandrill includes it so users can see some hint about
    where the link will lead.)

    But we want the full target link, not just the host. It must be in
    that `p` parameter, which looks like it might be base64 (without
    the trailing-equals padding). Sure enough, [decoding][base64-decoder]
    it gives a JSON string:

    ```json
    {
        "s": "di5d3m3kGhPciroEfJSl7-xjFps",
        "v": 1,
        "p": "{\"u\":11003603,\"v\":1,\"url\":\"https:\\\/\\\/www.planapple.com\\\/support\\\/?utm_medium=email&utm_source=transactional&utm_campaign=password_reset\",\"id\":\"99db2b3b935348279985d68db71e8824\",\"url_ids\":[\"ce692a19d2e22799bb3a6c598ce7cd2cf1f1c46d\"]}"
    }
    ```

    This appears to be some sort of signed JSON blob, where the real JSON of interest
    is in the `p` payload field, and the `s` field is a signature Mandrill can
    use to validate the payload.

    Validation is important. We don't want to create an [open redirect vulnerability][open-redirect]
    on our site. We won't be able to verify Mandrill's signature (since we don't have their secret),
    so we'll take a different approach. More about that later.

    Parsing the JSON from the `p` payload gives us the actual redirect params
    we were hoping for:

    ```json
    {
        "u": 11003603,
        "v": 1,
        "url": "https://www.planapple.com/support/?utm_medium=email&utm_source=transactional&utm_campaign=password_reset",
        "id": "99db2b3b935348279985d68db71e8824",
        "url_ids": [
            "ce692a19d2e22799bb3a6c598ce7cd2cf1f1c46d"
        ]
    }
    ```

    * `u` is our Mandrill account ID, again
    * `v` is probably the version of the parameters format.
    (I've only seen v:1, on recent emails. Hoping there wasn't a v:0 earlier.)
    * `url` is the target url we were looking for.
    (It even has the Google Analytics params Mandrill added for us!)
    * `id` is the Mandrill message uuid -- which you could use
    if you wanted to keep logging click tracking stats
    from old emails. (We're not going to bother with that.)
    * `url_ids` is... I'm not sure. (Maybe related to Mandrill's url-tagging
    feature. We're going to ignore it.)
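
The decoding steps above can be reproduced outside Django with just the Python standard library. This is a minimal sketch, not Mandrill's own code: `decode_tracking_param` is a hypothetical helper name, the sample payload is built locally rather than taken from a real email, and the URL-safe base64 alphabet is an assumption.

```python
import base64
import json


def decode_tracking_param(b64payload):
    """Decode the `p` query parameter of a Mandrill-style tracking link.

    Returns the inner params dict (keys such as 'u', 'v', 'url', 'id').
    """
    # The trailing '=' padding is missing in the link, so restore it first.
    padded = b64payload + "=" * (-len(b64payload) % 4)
    outer = json.loads(base64.urlsafe_b64decode(padded).decode("utf-8"))
    # The outer object's 'p' field is itself a JSON document stored as a string.
    return json.loads(outer["p"])


# Hypothetical sample built locally for illustration (not a real Mandrill link):
inner = json.dumps({"u": 12345, "v": 1, "url": "https://www.example.com/support/"})
outer = json.dumps({"s": "placeholder-signature", "v": 1, "p": inner})
sample = base64.urlsafe_b64encode(outer.encode("utf-8")).decode("ascii").rstrip("=")

params = decode_tracking_param(sample)
print(params["url"])  # https://www.example.com/support/
```

Pasting the `p` value from one of your own old links in place of `sample` is a quick way to check whether your emails used the same format.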


    ## Handling in Django

    Now that we've decoded Mandrill's redirect data format, handling it is easy.
    The code below is for Django, but it shouldn't be hard to adapt to other
    environments.

    The only (somewhat) tricky part is validating the target to make sure we don't
    create an [open redirect vulnerability][open-redirect] on our site.
    In my case, our emails *only* contained links back to our own site, so I can
    simply check the targets using Django's [`is_safe_url`][is-safe-url] helper with
    my site's hostname. If your emails linked to a variety of domains,
    you'll need to come up with some other way to validate the redirect targets.
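
If your emails did link to several domains, one possible approach is an explicit allow-list of hosts. The sketch below is only an illustration under assumptions: `ALLOWED_HOSTS` and `is_allowed_redirect` are hypothetical names, the host list is made up, and the check is deliberately strict (absolute http/https URLs only). It is not a replacement for a reviewed helper like Django's.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list: every host your old emails legitimately linked to.
ALLOWED_HOSTS = {"www.example.com", "blog.example.com"}


def is_allowed_redirect(target, allowed_hosts=ALLOWED_HOSTS):
    """Return True only for absolute http(s) URLs whose host is allow-listed."""
    if not isinstance(target, str) or not target:
        return False
    # Browsers treat backslashes like slashes and may ignore control
    # characters, so they can disagree with urlsplit about the host.
    # Reject such targets outright.
    if "\\" in target or any(ord(ch) <= 0x20 for ch in target):
        return False
    try:
        parts = urlsplit(target)
    except ValueError:  # e.g. a malformed IPv6 literal
        return False
    if parts.scheme not in ("http", "https"):
        return False
    # hostname is lowercased and excludes any userinfo and port.
    return parts.hostname in allowed_hosts


print(is_allowed_redirect("https://www.example.com/support/"))          # True
print(is_allowed_redirect("https://www.example.com@evil.example.net/"))  # False
```

Matching on the parsed hostname rather than on a string prefix is what keeps a target like `https://www.example.com@evil.example.net/` from slipping through.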

    Here's a simple Django view to handle the redirects:

    ```python
    import json
    from django.core.exceptions import SuspiciousOperation
    from django.http import HttpResponseRedirect
    from django.utils.http import is_safe_url, urlsafe_base64_decode

    TARGET_HOSTNAME = "www.example.com"  # You expect all redirects to go here

    def legacy_mandrill_click_tracking_redirect(request, mandrill_account=None, target_host=None):
        """Handle a Mandrill click-tracking redirect link"""

        try:
            b64payload = request.GET['p']
            # (Django's urlsafe_base64_decode handles missing '=' padding)
            payload = json.loads(urlsafe_base64_decode(b64payload))
            assert payload['v'] == 1  # we've only seen v:1 signed payloads
            params = json.loads(payload['p'])
            assert params['v'] == 1  # we've only seen v:1 params
            target = params['url']
        except (AssertionError, KeyError, TypeError, ValueError):
            # Missing/unparseable query params/payload format
            raise SuspiciousOperation("tried to redirect with garbled payload")

        # Verify we're only redirecting to our own site (don't be an open redirect server):
        if not is_safe_url(target, TARGET_HOSTNAME):
            raise SuspiciousOperation("tried to redirect to unsafe url '%s'" % target)

        # If you want to be extra-paranoid, you could also check that:
        #   int(mandrill_account) == params['u']  (the URL kwarg is a string)
        #   target_host == urlparse(target).netloc

        # Want to actually log the click for your own metrics?
        # This would be a good place to do it.

        return HttpResponseRedirect(target)
    ```

    Add this view to your Django urlpatterns with something like this:

    ```python
    from django.conf.urls import patterns, url
    from yourapp.views import legacy_mandrill_click_tracking_redirect

    urlpatterns = patterns('',
        # ...
        url(r'^track/click/(?P<mandrill_account>(\d+))/(?P<target_host>([a-z0-9.-]+))/?$',
            legacy_mandrill_click_tracking_redirect),
        # ...
    )
    ```
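
To sanity-check the URL pattern before wiring it into Django, you can run the same regular expression against the path of a real tracking link. This is just a quick standalone check; `TRACK_CLICK_RE` is a name made up for the example.

```python
import re

# Same pattern as in urlpatterns. Django matches it against the request
# path with the leading slash already removed; the ?p=... query string
# is not part of the path.
TRACK_CLICK_RE = re.compile(
    r'^track/click/(?P<mandrill_account>(\d+))/(?P<target_host>([a-z0-9.-]+))/?$'
)

match = TRACK_CLICK_RE.match("track/click/11003603/www.planapple.com")
print(match.group("mandrill_account"))  # 11003603
print(match.group("target_host"))       # www.planapple.com
```

Note that both captured values are strings, and that the host character class only accepts lowercase letters.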

    You *may* also need to add your Mandrill click-tracking domain to Django's
    [ALLOWED_HOSTS][allowed-hosts] setting.

    This is a good time to do some local testing with redirect urls from your own Mandrill
    messages. Edit your /etc/hosts file to point your Mandrill click-tracking domain to your
    dev server, then try clicking some links in old Mandrill emails. (Don't forget to
    edit /etc/hosts back when you're done.)
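
If you don't have many old emails handy, you can also generate test links yourself, including ones that should be rejected. The helper below is a hypothetical sketch (`build_test_tracking_url` is not part of Mandrill or Django). It mimics the link format decoded earlier and fills the `s` field with a placeholder, since Mandrill's real signature can't be reproduced and the view never checks it.

```python
import base64
import json
from urllib.parse import urlsplit


def build_test_tracking_url(tracking_domain, account_id, target_url):
    """Build a Mandrill-shaped click-tracking URL for local testing."""
    inner = json.dumps({"u": account_id, "v": 1, "url": target_url})
    # 's' is a placeholder, not a valid Mandrill signature.
    outer = json.dumps({"s": "not-a-real-signature", "v": 1, "p": inner})
    # URL-safe base64 with the trailing '=' padding stripped, as in real links.
    p = base64.urlsafe_b64encode(outer.encode("utf-8")).decode("ascii").rstrip("=")
    target_host = urlsplit(target_url).netloc
    return "http://%s/track/click/%d/%s?p=%s" % (tracking_domain, account_id, target_host, p)


# One link that should redirect, and one that the view should refuse:
good = build_test_tracking_url("click.example.com", 12345, "https://www.example.com/support/")
bad = build_test_tracking_url("click.example.com", 12345, "https://evil.example.net/")
print(good)
print(bad)
```

Requesting `bad` against your dev server should produce an error response rather than a redirect.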

    Once it's working, you can edit your DNS to change your old Mandrill
    click-tracking CNAME from `mandrillapp.com` to your live Django app.


    ## Bonus: open-tracking

    If you were using Mandrill open-tracking, its tracking pixels will now be loaded from
    your site. You can just ignore them (users won't notice a 404 error in a 1x1
    pixel image). Or you could write some code to handle them and return a transparent
    image (and even log the open metrics, if you wanted).

    The format of the open-tracking pixel is:

    http://go.planapple.com/track/open.php?u=11003603&id=99db2b3b935348279985d68db71e8824

    where:

    * `go.planapple.com` is your custom tracking domain
    * the `u` param is your Mandrill account id
    * the `id` param is the Mandrill message uuid

    Code to serve a transparent GIF on /track/open.php is "left as an exercise to the reader."
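
For anyone who would rather not do the exercise, here is one possible sketch. It is written as a framework-neutral WSGI app so it runs with only the standard library; `PIXEL_GIF` and `open_tracking_pixel_app` are names invented for this example, and the `Cache-Control` header is my own choice rather than anything Mandrill is known to send.

```python
import base64

# A 42-byte, 1x1 GIF whose single pixel is marked transparent.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def open_tracking_pixel_app(environ, start_response):
    """Minimal WSGI app: answer /track/open.php with the transparent GIF."""
    if environ.get("PATH_INFO") != "/track/open.php":
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found"]
    # environ["QUERY_STRING"] carries the u and id params,
    # should you want to log the open here.
    start_response("200 OK", [
        ("Content-Type", "image/gif"),
        ("Content-Length", str(len(PIXEL_GIF))),
        ("Cache-Control", "no-store"),
    ])
    return [PIXEL_GIF]
```

In a Django project, the equivalent would be a view routed at `track/open.php` that returns `HttpResponse(PIXEL_GIF, content_type="image/gif")`.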


    [allowed-hosts]: https://docs.djangoproject.com/en/1.9/ref/settings/#allowed-hosts
    [base64-decoder]: https://www.base64decode.org/
    [custom-tracking-domain]: https://mandrill.zendesk.com/hc/en-us/articles/205582917
    [is-safe-url]: https://github.com/django/django/blob/1.8.11/django/utils/http.py#L269
    [mandrill-click-tracking]: https://mandrill.zendesk.com/hc/en-us/articles/205582897
    [open-redirect]: https://www.owasp.org/index.php/Unvalidated_Redirects_and_Forwards_Cheat_Sheet
    9 changes: 9 additions & 0 deletions urls.py
    @@ -0,0 +1,9 @@
    from django.conf.urls import patterns, url
    from yourapp.views import legacy_mandrill_click_tracking_redirect

    urlpatterns = patterns('',
        # ...
        url(r'^track/click/(?P<mandrill_account>(\d+))/(?P<target_host>([a-z0-9.-]+))/?$',
            legacy_mandrill_click_tracking_redirect),
        # ...
    )
    34 changes: 34 additions & 0 deletions views.py
    @@ -0,0 +1,34 @@
    import json
    from django.core.exceptions import SuspiciousOperation
    from django.http import HttpResponseRedirect
    from django.utils.http import is_safe_url, urlsafe_base64_decode

    TARGET_HOSTNAME = "www.example.com"  # You expect all redirects to go here

    def legacy_mandrill_click_tracking_redirect(request, mandrill_account=None, target_host=None):
        """Handle a Mandrill click-tracking redirect link"""

        try:
            b64payload = request.GET['p']
            # (Django's urlsafe_base64_decode handles missing '=' padding)
            payload = json.loads(urlsafe_base64_decode(b64payload))
            assert payload['v'] == 1  # we've only seen v:1 signed payloads
            params = json.loads(payload['p'])
            assert params['v'] == 1  # we've only seen v:1 params
            target = params['url']
        except (AssertionError, KeyError, TypeError, ValueError):
            # Missing/unparseable query params/payload format
            raise SuspiciousOperation("tried to redirect with garbled payload")

        # Verify we're only redirecting to our own site (don't be an open redirect server):
        if not is_safe_url(target, TARGET_HOSTNAME):
            raise SuspiciousOperation("tried to redirect to unsafe url '%s'" % target)

        # If you want to be extra-paranoid, you could also check that:
        #   int(mandrill_account) == params['u']  (the URL kwarg is a string)
        #   target_host == urlparse(target).netloc

        # Want to actually log the click for your own metrics?
        # This would be a good place to do it.

        return HttpResponseRedirect(target)