Last active
October 21, 2023 17:53
-
-
Save wolever/3c3fa1f23a7e2e19dcb39e74af3d9282 to your computer and use it in GitHub Desktop.
Revisions
-
wolever revised this gist
Jul 26, 2023 . 1 changed file with 3 additions and 1 deletion.There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode charactersOriginal file line number Diff line number Diff line change @@ -178,4 +178,6 @@ the cost of slightly larger values): https://github.com/rocicorp/fractional-indexing See also: * https://madebyevan.com/algos/crdt-fractional-indexing/ * https://observablehq.com/@dgreensp/implementing-fractional-indexing -
wolever revised this gist
Jul 26, 2023 . 1 changed file with 12 additions and 1 deletion.There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode charactersOriginal file line number Diff line number Diff line change @@ -161,10 +161,21 @@ division by 2 happens by shifting a 1 into the mantissa bits; ex:: 1: 0 01111110 10000000000000000000000 (0.75) 2: 0 01111110 11000000000000000000000 (0.875) 3: 0 01111110 11100000000000000000000 (0.9375) ... Which means there can only be approximately ``(number of mantissa bits)`` insertions before precision is exhausted. So, in practice, how many insertions between any two elements can we expect to be able to make before precision is exhausted? Using a 64 bit float (the default in JavaScript, Python, and most other languages), "about 50" is a reasonable estimate. But if there's any concern about precision, it's likely best to use a string instead of a float, which will allow for an arbitrary number of insertions (at the cost of slightly larger values): https://github.com/rocicorp/fractional-indexing See also: * https://madebyevan.com/algos/crdt-fractional-indexing/ -
wolever revised this gist
Jul 26, 2023 . 1 changed file with 3 additions and 2 deletions.There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode charactersOriginal file line number Diff line number Diff line change @@ -48,7 +48,8 @@ The value of the float is calculated with:: apply to the more common 64 bit floats) With this in mind, it's straightforward to determine that the *maximum* number of insertions that can be made are ``2 ^ (number of exponent bits - 1) + (number mantissa bits) - 1``, or ``2 ^ (8 - 1) + 23 - 1 = 150`` insertions for a 32 bit float, or ``2 ^ (11 - 1) + 52 - 1 = 1,075`` insertions for a 64 bit float. @@ -166,4 +167,4 @@ before precision is exhausted. See also: * https://madebyevan.com/algos/crdt-fractional-indexing/ -
wolever revised this gist
Jul 26, 2023 . 1 changed file with 1 addition and 1 deletion.There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode charactersOriginal file line number Diff line number Diff line change @@ -50,7 +50,7 @@ apply to the more common 64 bit floats) With this in mind, it's straightforward to determine that the *maximum* number of insertions that can be made are ``2 ^ (exponent length - 1) + mantissa length - 1``, or ``2 ^ (8 - 1) + 23 - 1 = 150`` insertions for a 32 bit float, or ``2 ^ (11 - 1) + 52 - 1 = 1,075`` insertions for a 64 bit float. This can be demonstrated with the following program: -
wolever revised this gist
Jul 26, 2023 . 1 changed file with 15 additions and 16 deletions.There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode charactersOriginal file line number Diff line number Diff line change @@ -15,8 +15,8 @@ require updating the rank on each subsequent item:: { name: "c", rank: 3 } // changed However, using fractional indexing, a new item can be introduced without updating any existing item by setting ``new_item.rank = (next_item.rank + prev_item.rank) / 2``:: { name: "a", rank: 1 } { name: "b", rank: 1.5 } // new @@ -111,17 +111,17 @@ number of insertions is substantially lower: Which produces:: $ python float-division-towards-1.py 0: 0 01111110 00000000000000000000000 (0.5) 1: 0 01111110 10000000000000000000000 (0.75) 2: 0 01111110 11000000000000000000000 (0.875) 3: 0 01111110 11100000000000000000000 (0.9375) 4: 0 01111110 11110000000000000000000 (0.96875) 5: 0 01111110 11111000000000000000000 (0.984375) 6: 0 01111110 11111100000000000000000 (0.9921875) 7: 0 01111110 11111110000000000000000 (0.99609375) 8: 0 01111110 11111111000000000000000 (0.998046875) 9: 0 01111110 11111111100000000000000 (0.9990234375) 10: 0 01111110 11111111110000000000000 (0.99951171875) 11: 0 01111110 11111111111000000000000 (0.999755859375) 12: 0 01111110 11111111111100000000000 (0.9998779296875) @@ -142,7 +142,7 @@ To understand why this is the case, notice that in the first instance - when the new item's rank converges to 0 - the division by 2 happens by subtracting 1 from the exponent bits; ex:: v-- notice the subtraction of 1 at each step 0: 0 01111111 00000000000000000000000 (1.0) 1: 0 01111110 00000000000000000000000 (0.5) 2: 0 01111101 00000000000000000000000 (0.25) @@ -166,5 +166,4 @@ before precision is exhausted. See also: * https://madebyevan.com/algos/crdt-fractional-indexing/ -
wolever revised this gist
Jul 26, 2023 . 1 changed file with 1 addition and 0 deletions.There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode charactersOriginal file line number Diff line number Diff line change @@ -166,4 +166,5 @@ before precision is exhausted. See also: * https://madebyevan.com/algos/crdt-fractional-indexing/ -
wolever revised this gist
Jul 26, 2023 . 1 changed file with 8 additions and 8 deletions.There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode charactersOriginal file line number Diff line number Diff line change @@ -15,8 +15,8 @@ require updating the rank on each subsequent item:: { name: "c", rank: 3 } // changed However, using fractional indexing, a new item can be introduced without updating any existing item by setting ``new_item.rank = (next_item.rank + prev_item.rank) / 2``:: { name: "a", rank: 1 } { name: "b", rank: 1.5 } // new @@ -48,9 +48,9 @@ The value of the float is calculated with:: apply to the more common 64 bit floats) With this in mind, it's straightforward to determine that the *maximum* number of insertions that can be made are ``2 ^ (exponent length - 1) + mantissa length - 1``, or ``2 ^ (8 - 1) + 23 - 1 = 150`` insertions for a 32 bit float, or ``2 ^ (11 - 1) + 52 - 1 = 2,073,741,823`` insertions for a 64 bit float. This can be demonstrated with the following program: @@ -149,7 +149,7 @@ from the exponent bits; ex:: 3: 0 01111100 00000000000000000000000 (0.125) ... Which means there can be approximately ``2^(number of exponent bits)`` insertions before precision is exhausted. However, in the second instance - when the new item's rank converges to 1 - the @@ -161,9 +161,9 @@ division by 2 happens by shifting a 1 into the mantissa bits; ex:: 2: 0 01111110 11000000000000000000000 (0.875) 3: 0 01111110 11100000000000000000000 (0.9375) Which means there can only be approximately ``(number of mantissa bits)`` insertions before precision is exhausted. See also: * https://madebyevan.com/algos/crdt-fractional-indexing/ -
wolever revised this gist
Jul 26, 2023 . 1 changed file with 1 addition and 1 deletion.There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode charactersOriginal file line number Diff line number Diff line change @@ -40,7 +40,7 @@ and a mantissa:: 1.25 = 0 01111111 01000000000000000000000 ^- sign ^- mantissa The value of the float is calculated with:: value = -1^sign * 2^(exponent - 127) * mantissa -
wolever revised this gist
Jul 26, 2023 . 1 changed file with 1 addition and 1 deletion.There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode charactersOriginal file line number Diff line number Diff line change @@ -34,7 +34,7 @@ To understand how many insertions can be made, it's necessary to understand how floating point numbers are represented in memory. A detailed explanation is out of scope of this document (ex: https://www.youtube.com/watch?v=PZRI1IfStY0), but in brief: floating point numbers are represented as a sign bit, an exponent and a mantissa:: v- exponent 1.25 = 0 01111111 01000000000000000000000 -
wolever created this gist
Jul 26, 2023 .There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode charactersOriginal file line number Diff line number Diff line change @@ -0,0 +1,169 @@ Fractional indexing is a technique for assigning a rank to ordered items, where an item's position can be changed - or a new item introduced - with only one rank value update. For example, given the items:: { name: "a", rank: 1 } { name: "c", rank: 2 } With a naive rank ordering scheme, introducing "b" in between "a" and "c" would require updating the rank on each subsequent item:: { name: "a", rank: 1 } { name: "b", rank: 2 } // new { name: "c", rank: 3 } // changed However, using fractional indexing, a new item can be introduced without updating any existing item by setting `new_item.rank = (next_item.rank + prev_item.rank) / 2`:: { name: "a", rank: 1 } { name: "b", rank: 1.5 } // new { name: "c", rank: 2 } If floating point numbers are used, the question naturally arises as to how many times a new item can be introduced before the floating point number precision is exhausted. (note: in practice, if precision exhaustion is a concern, a string value can be used to allow for arbitrarily many insertions https://github.com/rocicorp/fractional-indexing) To understand how many insertions can be made, it's necessary to understand how floating point numbers are represented in memory. A detailed explanation is out of scope of this document (ex: https://www.youtube.com/watch?v=PZRI1IfStY0), but in brief: floating point numbers are represented as a sign bit, an exponent and a mantissa: v- exponent 1.25 = 0 01111111 01000000000000000000000 ^- sign ^- mantissa The value of the float is calculated with: value = -1^sign * 2^(exponent - 127) * mantissa (note: the examples in this document use 32 bit floats, but the same principles apply to the more common 64 bit floats) With this in mind, it's straightforward to determine that the *maximum* number of insertions that can be made are `2 ^ (exponent length - 1) + mantissa length - 1`, or `2 ^ (8 - 1) + 23 - 1 = 150` insertions for a 32 bit float, or `2 ^ (11 - 1) + 52 - 1 = 2,073,741,823` insertions for a 64 bit float. This can be demonstrated with the following program: .. code-block:: python import struct import numpy as np def float2bin(num: np.float32): bits = ''.join('{:0>8b}'.format(c) for c in struct.pack('!f', num)) return f'{bits[0]} {bits[1:9]} {bits[9:]}' v = np.float32(1) iteration = 0 while True: print(f"{iteration:3d}: {float2bin(v)} ({v})") iteration += 1 next_v = np.float32(v / 2) if v == next_v: break v = next_v Which produces:: $ python float-division.py 0: 0 01111111 00000000000000000000000 (1.0) 1: 0 01111110 00000000000000000000000 (0.5) 2: 0 01111101 00000000000000000000000 (0.25) 3: 0 01111100 00000000000000000000000 (0.125) 4: 0 01111011 00000000000000000000000 (0.0625) 5: 0 01111010 00000000000000000000000 (0.03125) ... 146: 0 00000000 00000000000000000001000 (1.1210387714598537e-44) 147: 0 00000000 00000000000000000000100 (5.605193857299268e-45) 148: 0 00000000 00000000000000000000010 (2.802596928649634e-45) 149: 0 00000000 00000000000000000000001 (1.401298464324817e-45) 150: 0 00000000 00000000000000000000000 (0.0) Note, however, that this is the *maximum* number of insertions, and it assumes that items are always being added in between the "first" (rank = 0) and second items; ie, so the rank of new items converges to 0. If instead items are being added in between the "last" (rank = 1) and second to last items - ie, so the rank of new items converges to 1 - then the maximum number of insertions is substantially lower: .. code-block:: python v = np.float32(0.5) iteration = 0 while True: print(f"{iteration:3d}: {float2bin(v)} ({v})") iteration += 1 next_v = np.float32(v + (1 - v) / 2) if v == next_v: break v = next_v Which produces:: $ python float-addition.py 0: 0 01111110 00000000000000000000000 (0.5) 1: 0 01111110 10000000000000000000000 (0.75) 2: 0 01111110 11000000000000000000000 (0.875) 3: 0 01111110 11100000000000000000000 (0.9375) 4: 0 01111110 11110000000000000000000 (0.96875) 5: 0 01111110 11111000000000000000000 (0.984375) 6: 0 01111110 11111100000000000000000 (0.9921875) 7: 0 01111110 11111110000000000000000 (0.99609375) 8: 0 01111110 11111111000000000000000 (0.998046875) 9: 0 01111110 11111111100000000000000 (0.9990234375) 10: 0 01111110 11111111110000000000000 (0.99951171875) 11: 0 01111110 11111111111000000000000 (0.999755859375) 12: 0 01111110 11111111111100000000000 (0.9998779296875) 13: 0 01111110 11111111111110000000000 (0.99993896484375) 14: 0 01111110 11111111111111000000000 (0.999969482421875) 15: 0 01111110 11111111111111100000000 (0.9999847412109375) 16: 0 01111110 11111111111111110000000 (0.9999923706054688) 17: 0 01111110 11111111111111111000000 (0.9999961853027344) 18: 0 01111110 11111111111111111100000 (0.9999980926513672) 19: 0 01111110 11111111111111111110000 (0.9999990463256836) 20: 0 01111110 11111111111111111111000 (0.9999995231628418) 21: 0 01111110 11111111111111111111100 (0.9999997615814209) 22: 0 01111110 11111111111111111111110 (0.9999998807907104) 23: 0 01111110 11111111111111111111111 (0.9999999403953552) 24: 0 01111111 00000000000000000000000 (1.0) To understand why this is the case, notice that in the first instance - when the new item's rank converges to 0 - the division by 2 happens by subtracting 1 from the exponent bits; ex:: v-- notice the subtraction of 1 at each step 0: 0 01111111 00000000000000000000000 (1.0) 1: 0 01111110 00000000000000000000000 (0.5) 2: 0 01111101 00000000000000000000000 (0.25) 3: 0 01111100 00000000000000000000000 (0.125) ... Which means there can be approximately `2^(number of exponent bits)` insertions before precision is exhausted. However, in the second instance - when the new item's rank converges to 1 - the division by 2 happens by shifting a 1 into the mantissa bits; ex:: v-- notice the shift of 1 at each step 0: 0 01111110 00000000000000000000000 (0.5) 1: 0 01111110 10000000000000000000000 (0.75) 2: 0 01111110 11000000000000000000000 (0.875) 3: 0 01111110 11100000000000000000000 (0.9375) Which means there can only be approximately `(number of mantissa bits)` insertions before precision is exhausted. See also: * https://madebyevan.com/algos/crdt-fractional-indexing/