@wcaleb
Last active January 16, 2021 22:55
Revisions

  1. wcaleb revised this gist Oct 22, 2015. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion wayback.py
    @@ -22,7 +22,7 @@ def wayback(k, v, f, m):
                 r = requests.get(base_url + '/save/' + url)
                 s = r.status_code
                 new_url = base_url + r.headers['content-location'] if s == requests.codes.ok else url
    -            return Link(v[0], (new_url, ""))
    +            return Link(v[0], (new_url, attrs[1]))

     if __name__ == "__main__":
         toJSONFilter(wayback)
  2. wcaleb created this gist Oct 22, 2015.
    28 changes: 28 additions & 0 deletions wayback.py
    @@ -0,0 +1,28 @@
    +#!/usr/local/bin/python
    +# -*- coding: utf-8 -*-
    +
    +# Usage: pandoc --filter=wayback.py input
    +# Install pandocfilters and requests with pip before using
    +# Warning: may take a while to process input with lots of links
    +# Note: Links that can't be saved to WBM or already point to WBM are left as is
    +
    +from pandocfilters import toJSONFilter, Link
    +import requests
    +
    +base_url = 'http://web.archive.org'
    +
    +def wayback(k, v, f, m):
    +    ''' Take a non-Wayback-Machine URL, save it to Wayback, replace with snapshot URL '''
    +    if k == 'Link':
    +        attrs = v[1]
    +        url = attrs[0]
    +        if base_url in url:
    +            return Link(v[0], attrs)
    +        else:
    +            r = requests.get(base_url + '/save/' + url)
    +            s = r.status_code
    +            new_url = base_url + r.headers['content-location'] if s == requests.codes.ok else url
    +            return Link(v[0], (new_url, ""))
    +
    +if __name__ == "__main__":
    +    toJSONFilter(wayback)
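
The URL-rewriting decision in `wayback()` can be pulled out and checked without touching the network. The following is a minimal sketch, not the gist author's code: `rewrite_target` and `BASE_URL` are hypothetical names, and it assumes the link target is a `(url, title)` pair, as in the gist. It keeps the link title, which is the change made in the later revision. It also falls back to the original URL when the response has no `content-location` header, where the gist would raise `KeyError`.

```python
# Hypothetical helper, not part of the gist: isolates the URL-rewriting
# decision from wayback() so it can be exercised without network access.
BASE_URL = 'http://web.archive.org'


def rewrite_target(target, status_code, headers, base_url=BASE_URL):
    """Return a (url, title) pair pointing at the snapshot when one exists."""
    url, title = target[0], target[1]
    # Links that already point to the Wayback Machine are left as is.
    if base_url in url:
        return (url, title)
    # Plain dicts are case-sensitive, so normalise header names first.
    lowered = {name.lower(): value for name, value in headers.items()}
    location = lowered.get('content-location')
    # Fall back to the original URL if the save failed or no snapshot
    # path was reported, instead of raising KeyError.
    if status_code == 200 and location:
        return (base_url + location, title)
    return (url, title)


if __name__ == "__main__":
    saved = rewrite_target(
        ['http://example.com/', 'Example'],
        200,
        {'Content-Location': '/web/20151022000000/http://example.com/'},
    )
    print(saved)
```

Inside the filter, the last three lines of the `else` branch could then become something like `return Link(v[0], rewrite_target(attrs, r.status_code, r.headers))`, assuming the same two-argument `Link` constructor the gist uses.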