// Converts an ArrayBuffer directly to base64, without any intermediate 'convert to string then
// use window.btoa' step. According to my tests, this appears to be a faster approach:
// http://jsperf.com/encoding-xhr-image-data/5

/*
MIT LICENSE
Copyright 2011 Jon Leighton
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/

function base64ArrayBuffer(arrayBuffer) {
  var base64 = ''
  var encodings = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

  var bytes = new Uint8Array(arrayBuffer)
  var byteLength = bytes.byteLength
  var byteRemainder = byteLength % 3
  var mainLength = byteLength - byteRemainder

  var a, b, c, d
  var chunk

  // Main loop deals with bytes in chunks of 3
  for (var i = 0; i < mainLength; i = i + 3) {
    // Combine the three bytes into a single integer
    chunk = (bytes[i] << 16) | (bytes[i + 1] << 8) | bytes[i + 2]

    // Use bitmasks to extract 6-bit segments from the triplet
    a = (chunk & 16515072) >> 18 // 16515072 = (2^6 - 1) << 18
    b = (chunk & 258048) >> 12 // 258048 = (2^6 - 1) << 12
    c = (chunk & 4032) >> 6 // 4032 = (2^6 - 1) << 6
    d = chunk & 63 // 63 = 2^6 - 1

    // Convert the raw binary segments to the appropriate ASCII encoding
    base64 += encodings[a] + encodings[b] + encodings[c] + encodings[d]
  }

  // Deal with the remaining bytes and padding
  if (byteRemainder == 1) {
    chunk = bytes[mainLength]

    a = (chunk & 252) >> 2 // 252 = (2^6 - 1) << 2

    // Set the 4 least significant bits to zero
    b = (chunk & 3) << 4 // 3 = 2^2 - 1

    base64 += encodings[a] + encodings[b] + '=='
  } else if (byteRemainder == 2) {
    chunk = (bytes[mainLength] << 8) | bytes[mainLength + 1]

    a = (chunk & 64512) >> 10 // 64512 = (2^6 - 1) << 10
    b = (chunk & 1008) >> 4 // 1008 = (2^6 - 1) << 4

    // Set the 2 least significant bits to zero
    c = (chunk & 15) << 2 // 15 = 2^4 - 1

    base64 += encodings[a] + encodings[b] + encodings[c] + '='
  }

  return base64
}
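To make the bit slicing in the main loop concrete, here is a minimal sketch that encodes a single three-byte group. The helper name `encodeTriplet` is hypothetical and not part of the gist; it uses shift-then-mask, which should give the same indices as the gist's mask-then-shift.

```javascript
// Hypothetical helper (not part of the gist): encodes exactly three bytes.
var encodings = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

function encodeTriplet(b0, b1, b2) {
  // Pack the three bytes into one 24-bit integer
  var chunk = (b0 << 16) | (b1 << 8) | b2

  // Slice the 24 bits into four 6-bit alphabet indices
  var a = (chunk >> 18) & 63
  var b = (chunk >> 12) & 63
  var c = (chunk >> 6) & 63
  var d = chunk & 63

  return encodings[a] + encodings[b] + encodings[c] + encodings[d]
}

// 'M' = 77, 'a' = 97, 'n' = 110
console.log(encodeTriplet(77, 97, 110)) // TWFu
```

The bytes 77, 97, 110 form the bit string 010011 010110 000101 101110, i.e. indices 19, 22, 5 and 46, which map to T, W, F and u.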
@LittleSaya Thanks man, your code's the bomb and works like a charm. I validated it with a fuzz test and it produces correct results!
I wrote a URL-safe version with a corresponding decoding function in TypeScript: base64ArrayBuffer.ts
@LittleSaya You’ve saved my life <3!
// Use a lookup table to find the index.
var lookup = new Uint8Array(256);
for (var i = 0; i < chars.length; i++) {
lookup[chars.charCodeAt(i)] = i;
}
Shouldn't the loop go to lookup.length?
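For what it's worth, here is a minimal sketch of how such a table is typically used. It assumes `chars` is the standard 64-character base64 alphabet (the quoted snippet does not show its definition), and `base64ToBytes` is a hypothetical decoder, not the code from the comment being quoted. Under that assumption the table is indexed by character code, so it has 256 slots, but only the 64 alphabet characters ever receive an entry; that is why the loop stops at `chars.length`.

```javascript
// Assumption: `chars` is the standard base64 alphabet; the quoted snippet does not define it.
var chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

// 256 slots so that any char code from 0 to 255 can be used as an index
var lookup = new Uint8Array(256)
// Only the 64 alphabet characters get an entry; every other slot stays 0
for (var i = 0; i < chars.length; i++) {
  lookup[chars.charCodeAt(i)] = i
}

// Hypothetical decoder built on the table.
// Assumes valid, padded input whose length is a multiple of 4.
function base64ToBytes(base64) {
  var len = base64.length
  var padding = 0
  if (base64[len - 1] === '=') padding++
  if (base64[len - 2] === '=') padding++

  var bytes = new Uint8Array((len / 4) * 3 - padding)
  var p = 0

  for (var i = 0; i < len; i += 4) {
    // Map four characters back to their 6-bit values ('=' maps to 0)
    var a = lookup[base64.charCodeAt(i)]
    var b = lookup[base64.charCodeAt(i + 1)]
    var c = lookup[base64.charCodeAt(i + 2)]
    var d = lookup[base64.charCodeAt(i + 3)]
    var chunk = (a << 18) | (b << 12) | (c << 6) | d

    // Write up to three bytes, skipping the ones that were only padding
    if (p < bytes.length) bytes[p++] = (chunk >> 16) & 255
    if (p < bytes.length) bytes[p++] = (chunk >> 8) & 255
    if (p < bytes.length) bytes[p++] = chunk & 255
  }

  return bytes
}

console.log(Array.from(base64ToBytes('TWFu'))) // [ 77, 97, 110 ]
```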
The problem I am seeing in all the encoder implementations above is that they seem to create a string for each byte that is being encoded and do a string concatenation for each byte. Am I missing something? Both string creation and string concatenation are very expensive when done in large numbers.
During encoding, isn't it better to get character codes and join them in bulk, using String.fromCharCode(...charCodes)?
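A minimal sketch of that idea, assuming the same alphabet as the gist: character codes are written into a preallocated `Uint8Array` and turned into a string in slices. The function name `base64ArrayBufferBulk` and the slice size are hypothetical choices, and whether this actually beats per-chunk concatenation would need to be measured on the target engine.

```javascript
var encodings = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

// Hypothetical variant of the gist's encoder that collects char codes first
function base64ArrayBufferBulk(arrayBuffer) {
  var bytes = new Uint8Array(arrayBuffer)
  var byteLength = bytes.length

  // Every group of up to 3 input bytes yields exactly 4 output characters
  var out = new Uint8Array(Math.ceil(byteLength / 3) * 4)
  var o = 0

  for (var i = 0; i < byteLength; i += 3) {
    var remaining = byteLength - i
    // Missing bytes in the final group are treated as zero
    var b1 = remaining > 1 ? bytes[i + 1] : 0
    var b2 = remaining > 2 ? bytes[i + 2] : 0
    var chunk = (bytes[i] << 16) | (b1 << 8) | b2

    out[o++] = encodings.charCodeAt((chunk >> 18) & 63)
    out[o++] = encodings.charCodeAt((chunk >> 12) & 63)
    // 61 is the char code of the '=' padding character
    out[o++] = remaining > 1 ? encodings.charCodeAt((chunk >> 6) & 63) : 61
    out[o++] = remaining > 2 ? encodings.charCodeAt(chunk & 63) : 61
  }

  // Convert in slices: passing one very large array to fromCharCode
  // can exceed the engine's argument limit
  var SLICE = 8192 // arbitrary slice size chosen for this sketch
  var result = ''
  for (var j = 0; j < out.length; j += SLICE) {
    result += String.fromCharCode.apply(null, out.subarray(j, j + SLICE))
  }
  return result
}

console.log(base64ArrayBufferBulk(new Uint8Array([77, 97]).buffer)) // TWE=
```

Note that `String.fromCharCode(...charCodes)` on the whole output would hit the same argument limit for large buffers, which is why the sketch slices the array.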
I didn't dig into what changed in the V8 engine, but the standard way (block 4) is now much faster than this implementation (block 3). I'm using Edge version 139.0.3405.102.
Updated benchmarks: https://jsben.ch/wnaZC
@soundlake The original gist goes back to 2011 (!!!!). I made an update in 2020, and using this was still the preferred approach. I remember testing the binary += String.fromCharCode(bytes[i]); solution and it was still slower. Somehow it got faster now, so you are right, the most time-efficient solution appears to be that one nowadays.
EDIT: Plus it correctly handles all values from 0 to 255, unlike some of the other proposed solutions that only work with binary strings coming from text and not ACTUAL binary arrays.
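For reference, a minimal sketch of the string-based approach being discussed. The function name `base64ViaBtoa` is hypothetical, and the sketch assumes a global `btoa` (browsers and recent Node.js versions provide one). Because `btoa` treats each character with a code from 0 to 255 as exactly one byte, building the string with `String.fromCharCode(bytes[i])` keeps a one-to-one mapping between bytes and characters.

```javascript
// Sketch of the string-based approach; the function name is hypothetical.
// Assumes a global btoa is available.
function base64ViaBtoa(arrayBuffer) {
  var bytes = new Uint8Array(arrayBuffer)
  var binary = ''

  // One character per byte: char codes 0-255 each stand for one byte
  for (var i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i])
  }

  return btoa(binary)
}

console.log(base64ViaBtoa(new Uint8Array([77, 97, 110]).buffer)) // TWFu
```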
Yeah I was gonna ask, isn't the btoa string approach just... wrong? As in, it actually wrongly encodes the bytes. It has to be, because Unicode characters 128-255 actually require two bytes to represent, so that's just wasted space in the base64 encode then.
That's the reason I've used this gist and not the string approach.