Revisions

  1. EM3R50N revised this gist Dec 1, 2021. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion archive.org-scanned-book-downloader-bookmarklet.md
    @@ -1,4 +1,4 @@
    # Download Scanned Books from Archive.org via Bookmarklet
    # Archive.org Scanned Book Downloader Bookmarklet

    A simple "1-click" JavaScript approach to downloading a scanned book from archive.org to read at your leisure on the device of your choosing w/out having to manually screenshot every page of the book by hand. In short it's a glorified "Save Image As..." approach but consolidated down to "1 click". BTW there may be a much better option than this out there - I just built this as an autistic project to see if it would work.

  2. EM3R50N renamed this gist Dec 1, 2021. 1 changed file with 0 additions and 0 deletions.
  3. EM3R50N revised this gist Dec 1, 2021. 1 changed file with 3 additions and 0 deletions.
    3 changes: 3 additions & 0 deletions download-scanned-archive.org-book-via-bookmarklet.md
    @@ -2,6 +2,9 @@

    A simple "1-click" JavaScript approach to downloading a scanned book from archive.org to read at your leisure on the device of your choosing w/out having to manually screenshot every page of the book by hand. In short it's a glorified "Save Image As..." approach but consolidated down to "1 click". BTW there may be a much better option than this out there - I just built this as an autistic project to see if it would work.

    ## Demo Video
    [![Archive.org SBDL Demo](https://live.staticflickr.com/65535/51717277946_690c59115f_b.jpg)](https://player.vimeo.com/video/651998350 "Archive.org SBDL Demo")

    ## Obligatory Legal/Disclaimer:
    By using this script you agree to delete all book files/images after your 1 hour or 14 days is up! I don't support using this script for any other use cases. After all, none of us have ever kept a library book past its return date, right?

  4. EM3R50N revised this gist Dec 1, 2021. 1 changed file with 8 additions and 3 deletions.
    11 changes: 8 additions & 3 deletions download-scanned-archive.org-book-via-bookmarklet.md
    @@ -48,7 +48,9 @@ function getNewURL(pageCount){
    return newURL;
    }
    var pageCount = prompt('how many pages?');
    var confirm1 = confirm('Archive.org Scanned Book Downloader:\n\nReady Check: Are you on a window/tab viewing *just* the IMAGE of the 1st page of the book? If not cancel and run this when you are.');
    if(!confirm1) return false;
    var pageCount = prompt('Archive.org Scanned Book Downloader:\n\nHow many pages are in this book?');
    var pageCounter = 0;
    var pageInterval = null;
    if(pageCount == null || pageCount == undefined || isNaN(parseInt(pageCount))){
    @@ -58,8 +60,11 @@ if(pageCount == null || pageCount == undefined || isNaN(parseInt(pageCount))){
    if(pageCounter > parseInt(pageCount)){
    window.clearInterval(pageInterval);
    pageInterval = null;
    console.log('downloading done!..');
    alert('all pages downloaded!');
    console.log('downloading done!..');
    var pdfTime = confirm('All pages downloaded! (some files may still be downloading though)\n\nWould you like to go to a site to create a PDF with them now?');
    if(pdfTime){
    window.open('https://tools.pdf24.org/en/images-to-pdf','_blank');
    }
    }else{
    var nextFile = getNewURL(pageCounter);
    downloadFile(nextFile);
  5. EM3R50N revised this gist Dec 1, 2021. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion download-scanned-archive.org-book-via-bookmarklet.md
    @@ -13,7 +13,7 @@ By using this script you agree to delete all book files/images after your 1 hour


    ## Instructions
    1. Create a bookmark in your browser using the code below via https://mrcoles.com/bookmarklet/
    1. Create a bookmarklet in your browser using the code below via https://mrcoles.com/bookmarklet/
    2. Go to archive.org and "Borrow" the book for 1 hour or 14 days (only tested with the 1 hour)
    3. Once the borrowed book page reloads click zoom icon to zoom into the 1st page of book at least 2 times (otherwise you'll get low-res version of book images)
    4. Write down or make a mental note of how many pages the book has
  6. EM3R50N revised this gist Nov 30, 2021. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions download-scanned-archive.org-book-via-bookmarklet.md
    @@ -8,6 +8,7 @@ By using this script you agree to delete all book files/images after your 1 hour
    ## NOTES:
    * Scanned Books Only: This only works on "scanned" books where each page is an image file. This means A) you won't be able to search the text of the book and B) the book file size will be tens of megabytes not kilobytes like an EPUB/etc. Given the above always try to find the book in text format first (epub, etc) before using this method.
    * Compatibility: As of 11/2021 I've tested this on a few books w/no problems so it seems pretty stable but if someone finds a book that doesn't work w/it LMK in comments. It's very possible (likely?) at some point archive.org will change something that either requires some adjustments to this script and/or makes this approach no longer possible. Feel free to recommend tweaks or fixes if anyone has any suggestions btw.
    * Borrowed and?: I've only tested this for "Borrowed" books but I suppose you could use on Free books too - although normally those already offer a PDF download so not really a reason to do that.
    * Support: This is just a basic javascript thing so there's no real danger here but I can't/don't provide any support if this doesn't work for you and/or your browser crashes while trying it.


  7. EM3R50N revised this gist Nov 30, 2021. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion download-scanned-archive.org-book-via-bookmarklet.md
    @@ -1,6 +1,6 @@
    # Download Scanned Books from Archive.org via Bookmarklet

    A simple "1-click" JavaScript approach to downloading a scanned book from archive.org to read at your leisure on the device of your choosing w/out having to manually screenshot every page of the book by hand. BTW there may be a much better option than this out there - I just built this as an autistic project to see if it would work.
    A simple "1-click" JavaScript approach to downloading a scanned book from archive.org to read at your leisure on the device of your choosing w/out having to manually screenshot every page of the book by hand. In short it's a glorified "Save Image As..." approach but consolidated down to "1 click". BTW there may be a much better option than this out there - I just built this as an autistic project to see if it would work.

    ## Obligatory Legal/Disclaimer:
    By using this script you agree to delete all book files/images after your 1 hour or 14 days is up! I don't support using this script for any other use cases. After all, none of us have ever kept a library book past its return date, right?
  8. EM3R50N revised this gist Nov 30, 2021. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion download-scanned-archive.org-book-via-bookmarklet.md
    @@ -7,7 +7,7 @@ By using this script you agree to delete all book files/images after your 1 hour

    ## NOTES:
    * Scanned Books Only: This only works on "scanned" books where each page is an image file. This means A) you won't be able to search the text of the book and B) the book file size will be tens of megabytes not kilobytes like an EPUB/etc. Given the above always try to find the book in text format first (epub, etc) before using this method.
    * Compatibility: As of 11/2021 I've tested this on a few books w/no problems - however it's very possible at some point archive.org will change something that either requires some adjustments to this script and/or makes this approach no longer possible. Feel free to recommend tweaks or fixes if anyone has any suggestions btw.
    * Compatibility: As of 11/2021 I've tested this on a few books w/no problems so it seems pretty stable but if someone finds a book that doesn't work w/it LMK in comments. It's very possible (likely?) at some point archive.org will change something that either requires some adjustments to this script and/or makes this approach no longer possible. Feel free to recommend tweaks or fixes if anyone has any suggestions btw.
    * Support: This is just a basic javascript thing so there's no real danger here but I can't/don't provide any support if this doesn't work for you and/or your browser crashes while trying it.


  9. EM3R50N revised this gist Nov 30, 2021. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion download-scanned-archive.org-book-via-bookmarklet.md
    @@ -1,6 +1,6 @@
    # Download Scanned Books from Archive.org via Bookmarklet

    A simple "1-click" JavaScript approach to downloading a scanned book from archive.org to read at your leisure on the device of your choosing w/out having to manually screenshot every page of the book by hand.
    A simple "1-click" JavaScript approach to downloading a scanned book from archive.org to read at your leisure on the device of your choosing w/out having to manually screenshot every page of the book by hand. BTW there may be a much better option than this out there - I just built this as an autistic project to see if it would work.

    ## Obligatory Legal/Disclaimer:
    By using this script you agree to delete all book files/images after your 1 hour or 14 days is up! I don't support using this script for any other use cases. After all, none of us have ever kept a library book past its return date, right?
  10. EM3R50N revised this gist Nov 30, 2021. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion download-scanned-archive.org-book-via-bookmarklet.md
    @@ -1,4 +1,4 @@
    # Bookmarklet to download borrowed Archive.org books
    # Download Scanned Books from Archive.org via Bookmarklet

    A simple "1-click" JavaScript approach to downloading a scanned book from archive.org to read at your leisure on the device of your choosing w/out having to manually screenshot every page of the book by hand.

  11. EM3R50N revised this gist Nov 30, 2021. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion download-scanned-archive.org-book-via-bookmarklet.md
    @@ -3,7 +3,7 @@
    A simple "1-click" JavaScript approach to downloading a scanned book from archive.org to read at your leisure on the device of your choosing w/out having to manually screenshot every page of the book by hand.

    ## Obligatory Legal/Disclaimer:
    * By using this script you agree to delete all book files/images after your 1 hour or 14 days is up! I don't support using this script for any other use cases. After all, none of us have ever kept a library book past its return date, right?
    By using this script you agree to delete all book files/images after your 1 hour or 14 days is up! I don't support using this script for any other use cases. After all, none of us have ever kept a library book past its return date, right?

    ## NOTES:
    * Scanned Books Only: This only works on "scanned" books where each page is an image file. This means A) you won't be able to search the text of the book and B) the book file size will be tens of megabytes not kilobytes like an EPUB/etc. Given the above always try to find the book in text format first (epub, etc) before using this method.
  12. EM3R50N revised this gist Nov 30, 2021. 1 changed file with 16 additions and 21 deletions.
    37 changes: 16 additions & 21 deletions download-scanned-archive.org-book-via-bookmarklet.md
    @@ -24,36 +24,30 @@ A simple "1-click" javascript approach to downloading a scanned book from archiv
    10. At this point you'll have a bunch of book page images in your Downloads folder like mybookwhatever_000.jpg, mybookwhatever_001.jpg etc.
    11. If you want to make a PDF of the pages go to https://tools.pdf24.org/en/images-to-pdf and drag all these images into the upload area. When the images are uploading click the "A-Z sort" button at the bottom of the page to make sure the pages sort by filename.
    12. Click the "Create PDF" button when it's ready and download the PDF when it's done.
    13. Now you can enjoy reading the book at your leisure, wherever you want without having to wait for the annoying page load times of archive.org/etc!
    13. Now you can enjoy reading the book at your leisure, wherever you want without having to wait for the annoying page load times of archive.org, etc!

    `function downloadFile(filePath){
    ```
    function downloadFile(filePath){
    var link=document.createElement('a');
    link.href = filePath;
    link.download = filePath.substr(filePath.lastIndexOf('/') + 1);
    link.click();
    }`
    }
    `function getNewURL(pageCount){
    function getNewURL(pageCount){
    if(pageCount == null) pageCount = 1;
    var url = document.location.href;
    /* console.log(url); */
    var urlParts = url.split(".jp2");
    /* console.log(urlParts[0]); */
    var urlPrefixParts = urlParts[0].split("_");
    /* console.log(urlPrefixParts[0]); */
    var urlPageNumber = urlPrefixParts[urlPrefixParts.length-1];
    /* console.log(urlPageNumber); */
    var nextPageNumberString = String(parseInt(urlPageNumber)+pageCount).padStart(4,'0');
    /* console.log(nextPageNumberString); */
    var url = document.location.href;
    var urlParts = url.split(".jp2");
    var urlPrefixParts = urlParts[0].split("_");
    var urlPageNumber = urlPrefixParts[urlPrefixParts.length-1];
    var nextPageNumberString = String(parseInt(urlPageNumber)+pageCount).padStart(4,'0');
    var newURLPrefix = '';
    for(var p=0;p<urlPrefixParts.length-1;p++) newURLPrefix += urlPrefixParts[p] + '_';
    /* console.log(newURLPrefix); */
    var newURL = newURLPrefix + nextPageNumberString + '.jp2' + urlParts[1];
    /* console.log(newURL); */
    for(var p=0;p<urlPrefixParts.length-1;p++) newURLPrefix += urlPrefixParts[p] + '_';
    var newURL = newURLPrefix + nextPageNumberString + '.jp2' + urlParts[1];
    return newURL;
    }`
    }
    `var pageCount = prompt('how many pages?');
    var pageCount = prompt('how many pages?');
    var pageCounter = 0;
    var pageInterval = null;
    if(pageCount == null || pageCount == undefined || isNaN(parseInt(pageCount))){
    @@ -72,4 +66,5 @@ if(pageCount == null || pageCount == undefined || isNaN(parseInt(pageCount))){
    }
    pageCounter += 1;
    },900);
    }`
    }
    ```
  13. EM3R50N created this gist Nov 30, 2021.
    75 changes: 75 additions & 0 deletions download-scanned-archive.org-book-via-bookmarklet.md
    @@ -0,0 +1,75 @@
    # Bookmarklet to download borrowed Archive.org books

    A simple "1-click" JavaScript approach to downloading a scanned book from archive.org to read at your leisure on the device of your choosing w/out having to manually screenshot every page of the book by hand.

    ## Obligatory Legal/Disclaimer:
    * By using this script you agree to delete all book files/images after your 1 hour or 14 days is up! I don't support using this script for any other use cases. After all, none of us have ever kept a library book past its return date, right?

    ## NOTES:
    * Scanned Books Only: This only works on "scanned" books where each page is an image file. This means A) you won't be able to search the text of the book and B) the book file size will be tens of megabytes not kilobytes like an EPUB/etc. Given the above always try to find the book in text format first (epub, etc) before using this method.
    * Compatibility: As of 11/2021 I've tested this on a few books w/no problems - however it's very possible at some point archive.org will change something that either requires some adjustments to this script and/or makes this approach no longer possible. Feel free to recommend tweaks or fixes if anyone has any suggestions btw.
    * Support: This is just a basic javascript thing so there's no real danger here but I can't/don't provide any support if this doesn't work for you and/or your browser crashes while trying it.


    ## Instructions
    1. Create a bookmark in your browser using the code below via https://mrcoles.com/bookmarklet/
    2. Go to archive.org and "Borrow" the book for 1 hour or 14 days (only tested with the 1 hour)
    3. Once the borrowed book page reloads click zoom icon to zoom into the 1st page of book at least 2 times (otherwise you'll get low-res version of book images)
    4. Write down or make a mental note of how many pages the book has
    5. Use browser's "Inspect Element" on first page of book to find the page image URL and right-click to "open link" in a new tab.
    6. Once on the new tab looking at the book's 1st page image, click the bookmarklet button made in step 1 and type in the number of pages the book has that you noted in step 4. Tip: Add 5-10 more pages than the book has just in case the covers/final pages of the book actually add up to a higher number.
    7. As soon as you click 'OK' after entering the page count watch for the browser's "Allow Multiple Downloads from this Site" type message in your browser and click 'Accept' or whatever. Otherwise the process will fail. Some browsers may not do this - so disregard if this isn't an issue w/your browser.
    8. Wait for the process to finish - a 300 page book takes around 3-5 minutes. Note: You can minimize the browser tab/window while the pages are downloading.
    9. Once all pages have been downloaded, an "alert" message will pop up.
    10. At this point you'll have a bunch of book page images in your Downloads folder like mybookwhatever_000.jpg, mybookwhatever_001.jpg etc.
    11. If you want to make a PDF of the pages go to https://tools.pdf24.org/en/images-to-pdf and drag all these images into the upload area. When the images are uploading click the "A-Z sort" button at the bottom of the page to make sure the pages sort by filename.
    12. Click the "Create PDF" button when it's ready and download the PDF when it's done.
    13. Now you can enjoy reading the book at your leisure, wherever you want without having to wait for the annoying page load times of archive.org/etc!
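
    The 3-5 minute figure in step 8 is consistent with the script's timer, which fires once every 900 ms. Here is a rough sketch of that arithmetic, assuming each download keeps pace with the timer. `estimateRunSeconds` is a made-up helper for illustration and is not part of the bookmarklet:

    ```javascript
    /* Hypothetical helper: estimates how long the bookmarklet's timer loop runs. */
    function estimateRunSeconds(pageCount, intervalMs) {
      if (intervalMs === undefined) intervalMs = 900; /* delay the script passes to setInterval */
      /* The loop downloads offsets 0..pageCount (pageCount + 1 files), */
      /* then needs one more tick to notice it is done and stop. */
      var ticks = pageCount + 2;
      return (ticks * intervalMs) / 1000;
    }

    console.log((estimateRunSeconds(300) / 60).toFixed(1) + ' minutes');
    ```

    For a 300 page book this works out to roughly four and a half minutes, not counting any files that are still downloading when the timer stops.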

    `function downloadFile(filePath){
    var link=document.createElement('a');
    link.href = filePath;
    link.download = filePath.substr(filePath.lastIndexOf('/') + 1);
    link.click();
    }`

    `function getNewURL(pageCount){
    if(pageCount == null) pageCount = 1;
    var url = document.location.href;
    /* console.log(url); */
    var urlParts = url.split(".jp2");
    /* console.log(urlParts[0]); */
    var urlPrefixParts = urlParts[0].split("_");
    /* console.log(urlPrefixParts[0]); */
    var urlPageNumber = urlPrefixParts[urlPrefixParts.length-1];
    /* console.log(urlPageNumber); */
    var nextPageNumberString = String(parseInt(urlPageNumber)+pageCount).padStart(4,'0');
    /* console.log(nextPageNumberString); */
    var newURLPrefix = '';
    for(var p=0;p<urlPrefixParts.length-1;p++) newURLPrefix += urlPrefixParts[p] + '_';
    /* console.log(newURLPrefix); */
    var newURL = newURLPrefix + nextPageNumberString + '.jp2' + urlParts[1];
    /* console.log(newURL); */
    return newURL;
    }`

    `var pageCount = prompt('how many pages?');
    var pageCounter = 0;
    var pageInterval = null;
    if(pageCount == null || pageCount == undefined || isNaN(parseInt(pageCount))){
    console.log('no page count provided.. giving up.');
    }else{
    pageInterval = window.setInterval(function(){
    if(pageCounter > parseInt(pageCount)){
    window.clearInterval(pageInterval);
    pageInterval = null;
    console.log('downloading done!..');
    alert('all pages downloaded!');
    }else{
    var nextFile = getNewURL(pageCounter);
    downloadFile(nextFile);
    console.log('downloading next page! (' + nextFile + ')');
    }
    pageCounter += 1;
    },900);
    }`
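
    The URL rewriting and the page-count check can also be written as small pure functions that take their input as arguments instead of reading `document.location` and `prompt` directly, which makes them easy to try outside the browser. This is only a sketch: `nextPageUrl`, `parsePageCount` and the example.org URL are illustrative assumptions, not part of the gist or of archive.org's actual URL scheme.

    ```javascript
    /* Sketch only: nextPageUrl and parsePageCount are illustrative names. */

    /* Returns the URL of the page `offset` pages after the one in `url`, */
    /* or null if `url` does not look like ..._NNNN.jp2... */
    function nextPageUrl(url, offset) {
      var urlParts = url.split('.jp2');
      if (urlParts.length < 2) return null; /* no .jp2 in the URL */
      var prefixParts = urlParts[0].split('_');
      if (prefixParts.length < 2) return null; /* no _NNNN page suffix */
      var pageNumber = parseInt(prefixParts[prefixParts.length - 1], 10);
      if (isNaN(pageNumber)) return null;
      var prefix = prefixParts.slice(0, -1).join('_') + '_';
      var next = String(pageNumber + offset).padStart(4, '0');
      return prefix + next + '.jp2' + urlParts.slice(1).join('.jp2');
    }

    /* Returns the page count as a positive integer, or null if the input is unusable. */
    /* NaN never compares equal to anything (itself included), so use isNaN, not == NaN. */
    function parsePageCount(input) {
      var n = parseInt(input, 10);
      return (isNaN(n) || n < 1) ? null : n;
    }

    /* Made-up URL, shaped like a page image link. */
    var first = 'https://example.org/BookReaderImages.php?file=mybook_jp2/mybook_0001.jp2&scale=2';
    console.log(nextPageUrl(first, 5));
    console.log(parsePageCount('300'), parsePageCount('abc'), parsePageCount(null));
    ```

    Returning `null` for unexpected input lets the caller stop early with a message instead of requesting malformed URLs.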