Last active
January 4, 2016 04:19
-
-
Save clinejj/8568197 to your computer and use it in GitHub Desktop.
An Algorithm to Blend Lists of Lists
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| ### Description | |
| These functions will take a list of lists of sorted objects and | |
| blends them together based on input target percentages and a target length, | |
| where each element from a given list appears in the final list sorted relatively | |
| to other elements from that list (and backfill from other lists if necessary). | |
| For example, let's say you have sorted lists A, B, and C each of size 10. You | |
| want to blend them into a single list of size 20 with 50% from A, 30% from B, | |
| and 10% from C. In the final list, you would expect 10 elements from A, 6 from | |
| B, and 4 from C. | |
| This algorithm assumes that a) we want to get as close to the target length as | |
| possible, and b) to do that we may need to "backfill" from other lists if a | |
| given list can't meet its target percentage. This may mean that the input blend | |
| percentages will not match up to the output if a given list doesn't contain | |
| enough elements to meet its quota. | |
| ### Algorithm Overview | |
| The basic blend function requires 3 inputs: | |
| - A list of sorted lists (each list can be any length) | |
| - A list of "blend percentages", where the nth element in the list is the | |
| goal percentage of the blended list to contain elements from the nth list | |
| in the list of lists | |
| - The target length of the blended lists | |
| From this function, the first step is to calculate the total size of our input | |
| lists. If all the elements across our list of lists is less than the target | |
| length, we can simply blend the lists according to our preferred blending method | |
| without having to do any extra work.* | |
| * This ends up ignoring the blend percentages since our algorithm calls to get | |
| * as close to the target length as possible | |
| When calculating the input sizes, we create a basic matrix with each list, the | |
| size of that list, the target number of elements from that list to include in | |
| the blended list, the difference between the size and the target, and then a | |
| "use" space that tells us later how many elements we need to use from that list. | |
| The matrix looks like: | |
| ID: 0 1 2 3 4 | |
| SIZE: 5 5 5 5 5 | |
| GOAL: 9 7 2 2 1 | |
| DIFF: -4 -2 3 3 4 | |
| USE: -4 -2 3 3 4 | |
| After creating the matrix, we sort it based on the "use" field, from smallest to | |
| largest. If this value is negative, it means that a given list is not long | |
| enough to meet its target percentage and those slots need to be reallocated to | |
| the other lists. We reallocate slots to the remaining lists based on their | |
| relative blend percentages, keeping the ratio of reallocated slots in parity | |
| with the ratio of the goals. | |
| To do so, we iterate through the columns in the matrix, until we reach the first | |
| column that has a positive (greater than 0) "use" value. We assign this column a | |
| number of slots in proportion to its blend ratio with the remaining columns that | |
| have a positive use value. This is reflected by updating it's "use" value | |
| (decreasing it by the number of slots it was assigned) and increasing the first | |
| column's "use" value by the same amount. We continue iterating through the rest | |
| of the columns and reallocate as we go. | |
| After we iterate through all the columns once, the first column will be | |
| reallocated. We re-sort the matrix again in ascending order of "use" value, and | |
| proceed with the reallocation steps again. This sort and reallocate process is | |
| repeated until all columns in our matrix have 0 or positive use values and the | |
| size of our elements in use is equal to the original target blended list length. | |
| The size of our elements in use can be calculated by substracting the absolute | |
| value of the "use" field from the size of each column, and summing these values | |
| for each column. | |
| Once our size matrix has been completed, we can move to the blending step. Our | |
| size matrix now tells us how many elements to use from each list (the same value | |
| used to sum the total - the size of the list minus the absolute value of the | |
| "use" field). We can use various blending methods, from a round robin (popping | |
| the 0th element from each list), to a random assignment (pick a random list, add | |
| the 0th element, repeat until the 0th element has been added for each list, then | |
| move to the 1st element, etc.), where we add each element from our lists in | |
| order into our blended list until the size limit for the list has been reached | |
| and the target blended list length is reached. The blending algorithm may vary | |
| by need, so I will not go into more detail on this portion. | |
| After adding the elements from our lists of lists as dicated by our size matrix, | |
| we will end up with a singular blended list no longer than our target length | |
| (possibly less if the input lists were not big enough). | |
| See the example file for a walkthrough. I have created a Java implementation of | |
| this algorithm, using a round-robin blending method, available for use at: | |
| https://github.com/clinejj/ListBlender |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Let's work through an example. Here are our inputs: | |
| LISTS | |
| 0 1 2 3 4 | |
| 0-0 1-12 2-24 3-36 4-48 | |
| 0-1 1-13 2-25 3-37 4-49 | |
| 0-2 1-14 2-26 3-38 4-50 | |
| 0-3 1-15 2-27 3-39 4-51 | |
| 0-4 1-16 2-28 3-40 4-52 | |
| PERCENTAGES | |
| 44 | |
| 33 | |
| 11 | |
| 8 | |
| 4 | |
| LENGTH | |
| 20 | |
| Our first step is to create our matrix: | |
| ID: 0 1 2 3 4 | |
| SIZE: 5 5 5 5 5 | |
| GOAL: 9 7 2 2 1 | |
| DIFF: -4 -2 3 3 4 | |
| USE: -4 -2 3 3 4 | |
| In this case, our inputs are already sorted so we can skip that step and move | |
| directly into the reallocation process. We need to reallocate 4 slots from | |
| column 0. The first column with a postive "use" value is column 2. It's ratio | |
| is 2/5 (the sum of the goals of the columns with a positive "use" value is 5), | |
| so with integer math 2/5 * 4 = 2. We reallocate two slots from column 0 to | |
| column 2, like so: | |
| ID: 0 1 2 3 4 | |
| SIZE: 5 5 5 5 5 | |
| GOAL: 9 7 2 2 1 | |
| DIFF: -4 -2 3 3 4 | |
| USE: -2 -2 1 3 4 | |
| The next column with a positive value is column 3, which gets the same | |
| allocation of 2 slots since it has the same ratio. | |
| ID: 0 1 2 3 4 | |
| SIZE: 5 5 5 5 5 | |
| GOAL: 9 7 2 2 1 | |
| DIFF: -4 -2 3 3 4 | |
| USE: 0 -2 1 1 4 | |
| There are no more reallocations needed. We re-calculate our current size, | |
| which shows we are so we re-sort the matrix in ascending order based on the | |
| "use" value: | |
| ID: 1 0 2 3 4 | |
| SIZE: 5 5 5 5 5 | |
| GOAL: 7 9 2 2 1 | |
| DIFF: -2 -4 3 3 4 | |
| USE: -2 0 1 1 4 | |
| And start the reallocation process again. We need to reallocate 2 slots from | |
| column 1. The first column with a positive value is column 2. We reallocate 1 | |
| slot to column 2 (2/5 * 2), like so: | |
| ID: 1 0 2 3 4 | |
| SIZE: 5 5 5 5 5 | |
| GOAL: 7 9 2 2 1 | |
| DIFF: -2 -4 3 3 4 | |
| USE: -1 0 0 1 4 | |
| And again, column 3 gets 1 slot as well: | |
| ID: 1 0 2 3 4 | |
| SIZE: 5 5 5 5 5 | |
| GOAL: 7 9 2 2 1 | |
| DIFF: -2 -4 3 3 4 | |
| USE: 0 0 0 0 4 | |
| There are no more reallocations, so we conclude with that step. You'll note | |
| that at this point there are 21 items to add to the list - this extra item | |
| is taken care of in the blending step, as you will see. We'll resort the | |
| matrix, this time based on ID: | |
| ID: 0 1 2 3 4 | |
| SIZE: 5 5 5 5 5 | |
| GOAL: 9 7 2 2 1 | |
| DIFF: -4 -2 3 3 4 | |
| USE: 0 0 0 0 4 | |
| Now that we have our sizes and we know how much we need to take from each | |
| list, we begin the blending process. This example will use a round-robin | |
| algorithm, where on each pass it will take the first element from each list | |
| (and then remove that element), until either the list size is met (in which | |
| case it will skip the list) or the target blended list size is reached. For | |
| reference here are our lists again: | |
| LISTS BLENDED LIST: | |
| 0 1 2 3 4 | |
| 0-0 1-12 2-24 3-36 4-48 | |
| 0-1 1-13 2-25 3-37 4-49 | |
| 0-2 1-14 2-26 3-38 4-50 | |
| 0-3 1-15 2-27 3-39 4-51 | |
| 0-4 1-16 2-28 3-40 4-52 | |
| In this example, we're taking the entirety of lists 0, 1, 2, and 3 to blend, | |
| and 1 element from list 4. After the first pass, our lists look like this: | |
| LISTS BLENDED LIST: | |
| 0 1 2 3 4 0-0 | |
| 0-1 1-13 2-25 3-37 4-49 1-12 | |
| 0-2 1-14 2-26 3-38 4-50 2-24 | |
| 0-3 1-15 2-27 3-39 4-51 3-36 | |
| 0-4 1-16 2-28 3-40 4-52 4-48 | |
| After the second pass: | |
| LISTS BLENDED LIST: | |
| 0 1 2 3 4 0-0 | |
| 0-2 1-14 2-26 3-38 4-49 1-12 | |
| 0-3 1-15 2-27 3-39 4-50 2-24 | |
| 0-4 1-16 2-28 3-40 4-51 3-36 | |
| 4-52 4-48 | |
| 0-1 | |
| 1-13 | |
| 2-25 | |
| 3-37 | |
| After the next pass: | |
| LISTS BLENDED LIST: | |
| 0 1 2 3 4 0-0 | |
| 0-3 1-15 2-27 3-39 4-49 1-12 | |
| 0-4 1-16 2-28 3-40 4-50 2-24 | |
| 4-51 3-36 | |
| 4-52 4-48 | |
| 0-1 | |
| 1-13 | |
| 2-25 | |
| 3-37 | |
| 0-2 | |
| 1-14 | |
| 2-26 | |
| 3-38 | |
| This continues until our final list, which looks like this: | |
| BLENDED LIST: | |
| 0-0 | |
| 1-12 | |
| 2-24 | |
| 3-36 | |
| 4-48 | |
| 0-1 | |
| 1-13 | |
| 2-25 | |
| 3-37 | |
| 0-2 | |
| 1-14 | |
| 2-26 | |
| 3-38 | |
| 0-3 | |
| 1-15 | |
| 2-27 | |
| 3-39 | |
| 0-4 | |
| 1-16 | |
| 2-28 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment