Quick and easy generation of int[] sequences for Codility and Leetcode algorithm challenge test cases
Recently I've been catching up on modern Java.. I haven't used it on the job since Java 5 came out. I've been practicing algorithms on Codility and Leetcode.
The algorithm puzzles on these sites almost always use int and/or int[] arguments and return types, and sometimes have BigO performance requirements. To test your algorithms properly, you need to generate very large input arrays, of sizes up to 100,000 with values from +/- 1 billion. My test cases weren't anywhere near as comprehensive as those run by Codility when you submit you solution, and I wanted a way to quickly create tests to cover edge cases and very large inputs.
I started using the data tables feature in the awesome Spock testing framework, and right away I started creating some helper methods to help with the generation of int[] sequences for input to my solutions.
Here's a simple example of what I was doing:
import spock.lang.*
class MySpecification extends Specification {
def solution = new AlgorithmSolution()
// helper for generation of a range of ints
ArrayList range( int from, int to ) {
return IntStream.rangeClosed( from, to ).boxed().collect( Collectors.toList() )
}
@Unroll
def "given input sequence #sequence with #indexReplacements, the result is #result"() {
given:
// replace values at specified positions if indexReplacements is not empty
if ( !!indexReplacements ) {
indexReplacements.each { position, value ->
sequence[ position ] = value
}
}
expect:
solution.solution( sequence ) == result
where:
result | sequence | indexReplacements
5 | [ 1, 2, 3 ] * 2 | [ : ] // 1, 2, 3, 1, 2, 3
5 | range(1, 5) * 2 | [ 1: 99, 8: 99 ] // 1, 99, 3, 4, 5, 1, 2, 3, 99, 5
42 | [ 9 ] * 10_000 | [ 0: 0, 9_999: 0 ] // 0, 9, 9, 9 ... 9, 9, 9, 0
42 | range(1, 100) | [ 0: 100, 99: 1 ] // 100, 2, 3, 4 ... 97, 98, 99, 1
I called my range() Specification helper method from within my Spock data column to generate a range of ints, and I specified values for arbitrary sequence positions in a separate indexReplacements column of position:value maps. The code that applied these replacements was included in the given: block.
This worked pretty well, but I didn't like cluttering the feature method (or even the Specification) so much - the helper code distracted from the readability of the spec.
Also, while Spock's data tables are very expressive, it can be a pretty tedious to add new facets to the generation of test data.
For example, say I wanted to add a bit more utility to the Specification - like a new valueExclusions column where I could specify values I want yanked out of the sequence:
// (previous code omitted)
given:
if ( !!indexReplacements ) {
indexReplacements.each { position, value ->
sequence[ position ] = value
}
}
if ( !!valueExclusions ) // NEW CODE
sequence.removeAll( valueExclusions )
expect:
solution.solution( sequence ) == result
where: // NEW COLUMN: values (i.e. empty lists) required
result | sequence | indexReplacements | valueExclusions
5 | [ 1, 2, 3 ] * 2 | [ : ] | [ ]
5 | range(1, 5) * 2) | [ 1: 99, 8: 99 ] | [ ]
42 | [ 9 ] * 10_000 | [ 0: 0, 9_999: 0 ] | [ ]
5 | range(1, 100) | [ 0: 100, 99: 1 ] | [ ]
42 | range(1, 10_000) | [ : ] | [ 5_000, 5_001 ] // <=== new test caseThe new column requires a value for each row - requiring the addition of an empty list ([]) to every test case just to keep everything working. That's a lot of table fiddling just to add something new; not very conducive for experimentation freedom. Plus, while this example only added two new lines of helper code, it further clutters the Specification and detracts from readability and clarity of the spec.
It's pretty easy to improve things by extracting helpers like the range() method above into a superclass Specification, but the other (non-method) utility code wasn't as easily abstracted. Besides, most of the fiddling I was doing was adding and changing the columns in the data tables.
So, I did a little digging on Spock Annotation extensions, and found a chance to catch up and play with Annotations, Reflection, Groovy, and more.
The extension I came up with is described below, and included in this Gist.
The extension is enabled by simply by applying the @TweakSequence Annotation to the Feature Method. The sequenceParameterName attribute can be used to specify which column contains the sequence you want to fiddle with. 'sequence' is the default, but 'A' is pretty common for int[] input of algorithm challenges.
@Unroll
@TweakSequence( sequenceParameterName = "A" )
def "the sum of #sequence is #result"() {
expect:
IntStream.of( A ).sum == result
where:
result | A
12 | [ 1, 2, 3, 1, 2, 3 ]
12 | [ 1, 2, 3 ] * 2Note: Henceforth, I use 'sequence column' to refer to the column specified to the annotation. Also, since I'm solely discussing data tables for data-driven testing in Spock, I'm referring to Spock data parameters as 'columns'.
Now let's use the extension to generate a range of numbers. All four rows below are equivalent:
@Unroll
@TweakSequence( sequenceParameterName = "A" )
def "the sum of #sequence is #result"() {
expect:
IntStream.of( A ).sum == result
where:
result | A
12 | [ 1, 2, 3, 1, 2, 3 ]
12 | [ 1, 2, 3 ] * 2
12 | [ range: [ start: 1, end: 3 ]] * 2
12 | [ range: [ start: 1, end: 3, repeat: 2 ]]
The first two rows don't use the extension at all.. Those values for A are List literals, and * is a Groovy operator override for List.multiply(), which repeats the List.
The last two rows are different. Instead of a List, we're providing a map literal containing a range mapping of range attributes, which the extension uses to generate the specified range. The third row again uses the Groovy * operator override, while for the fourth, the extension recognizes the repeat attribute for the same result.
Let's look at another range example - all three rows below are equivalent:
@Unroll
@TweakSequence
def "the sum of #sequence is #result"() {
expect:
IntStream.of( sequence ).sum == result
where:
result | sequence
-6 | [ 2, -1, -4, 2, -1, -4 ]
-6 | [ 2, -1, -4 ] * 2
-6 | [ range: [ start: 2, end: -4, step: 3, repeat: 2 ]]The step and repeat attributes do exactly what their names imply. Further, if the start value is higher than the end value, the resulting range is produced in descending order.
The absolute value for
stepis effectively used. Negation has no effect, but stepping works the same for ascending and descending ranges.
There are a few ways to use the tweaks from the extension.. To demonstrate, let's bring back the indexReplacements utility I was using earlier. Once again, all three rows in the example below are equivalent.
expect:
sequence == result
where:
result | sequence | indexReplacements
[ 9, 2, 0, 4 ] | [ 1, 2, 3, 4 ] | [ 0: 9, 2: 0 ]
[ 9, 2, 0, 4 ] | [ range: [ start: 1, end: 4 ], indexReplacements: [ 0: 9, 2: 0 ] ] | [ : ]
[ 9, 2, 0, 4 ] | [ sequence: [ 1, 2, 3, 4 ], indexReplacements: [ 0: 9, 2: 0 ] ] | [ : ]
In the first row, we see that if the table includes an indexReplacements data column of position:value mappings, the replacements are performed like before: the element at the index of the map's key is replaced by its value.
In the second row, we see that if we're using range to generate the sequence, we can include tweaks within the same map as the range. If we only need them for one or two rows, we are spared the trouble of adding columns out to the table.
Rows where the indexReplacements value is an empty map are intended to depict the convenience and tradeoffs of leaving the column out of the table altogether. In other words, for those rows, pretend the column isn't there. :-)
In the third and final row, we see that we can still avoid adding that column if we're not using range: we can specify both the sequence and the tweak(s) within a single map. Instead of range, simply specify the sequence using the same attribute name as the sequence column.. just look at the column header.
Hint: to include your sequence plus all your tweaks in a single data column, include them in a map with your sequence, the latter as either a
rangeor asequence.
There's one final way to specify a tweak. Let's bring back the valueExclusions tweak to demonstrate it:
expect:
sequence == result
where:
result | sequence | tweaks
[ 1, 4 ] | [ 1, 2, 3, 4 ] | [ valueExclusions: [ 2, 3 ] ]
[ 1, 4 ] | [ range: [ start: 1, end: 4 ] ] | [ valueExclusions: [ 2, 3 ] ]
[ 99, 4 ] | [ 1, 2, 3, 4 ] | [ valueExclusions: [ 2, 3 ], indexReplacements: [ 0: 99 ] ]The extension allows us to include a tweaks column containing a map of our desired tweaks. This way you can have a single column for all your tweaks, separate from the sequence column.
An alternate name for the tweaks column can be specified to the extension via the
tweaksParameterNameannotation attribute.
I find this is a decent separation and when setting up tests for a new challenge, I'll usually include a single tweaks column.
But I still find all the methods of specifying tweaks useful. For example, if there's one tweak that I'll apply to almost every row in a table, it's much less cluttered to include a dedicated column for it. On the other hand, including a tweak in a map (within the sequence or tweaks column) is a quick way to experiment with one-off tweaks here and there, without adding a new column.
Now let's review everything that's available with the extension:
| Tweak / Name | what it does | How specified | Contents |
|---|---|---|---|
sequence overridden via the sequenceParameterName annotation attribute. |
specifies the (pre-tweak) sequence, typically as a List literal or an expression resulting in a List | * as a data column or * as a map attribute, within the sequence column |
List, e.g.[ 1, 2, 3 ]or [ sequence: [ 1, 2, 3 ] ] |
range |
generates a range of values | as a map, within the sequence column | Map with attributes: start, end, step, repeat:[ range: [ start: 1, end: 3 ] ] |
tweaks overridden via the tweaksParameterName annotation attribute. |
specifies a map of tweaks as tweakName:data mappings |
as a data column | a Map with zero or more tweak names and the data used by the tweak (e.g. List, Map, or Number, etc) |
indexReplacements |
overwrites values at positions corresponding with mapping's key, with the mapping's value | * As a data column or * as a map, within the sequence column or the tweaks column |
Map of location:value mappings:[ 1:99, 3: 0 ] |
valueExclusions |
removes all occurrences of specified values from the sequence | * As a data column or * as a map, within the sequence column or the tweaks column |
List of values to exclude from the sequence:[ 1, 3, 5 ] |
You can easily add a new tweak from your Specification class.
@TweaksSequence
static List addNumberToSequence( List inputSequence, int numberToAdd ) {
return inputSequence.stream().mapToInt { i -> i + numberToAdd }.toArray()
}
@TweaksSequence
static List reverseSequence( List inputSequence, boolean whetherToReverse ) {
if ( !whetherToReverse ) return inputSequence
return new LinkedList( inputSequence ).descendingIterator().toList()
}
@Unroll
@TweakSequence
def "#iterationCount): added TweakFunctions are used correctly: #sequence"() {
expect:
sequence == result
where:
result | sequence
[ 1, 2, 3 ] | [ 1, 2, 3 ]
[ 1, 2, 3 ] | [ sequence: [ 1, 2, 3 ] ]
[ 2, 3, 4 ] | [ sequence: [ 1, 2, 3 ], addNumberToSequence: 1 ]
[ 1, 2, 3 ] | [ sequence: [ 1, 2, 3 ], reverseSequence: false ]
[ 3, 2, 1 ] | [ sequence: [ 1, 2, 3 ], reverseSequence: true ]
}Simply add a static method to your Specification, and apply the @TweaksSequence annotation (notice the additional 's' in the annotation name).
A few simple rules:
- Method should be static
- Signature: Two arguments
- The first argument should be
List inputList - The second argument can be any type. It will receive the value specified by the specifying map.
- The first argument should be
- The return type should be
List, and the method should return the modified List. - Apply the
TweaksSequenceannotation
Then, you can freely use the tweak (use the method name) in your data table anywhere you can use a tweak: in a dedicated column, or within the sequence or tweaks columns.
When you're generating very large sequences, you probably want to avoid including the entire sequence in the name of the feature method (i.e. when using Spock's @Unroll feature). (See the related TODO below)
Watch out for precendence of tweak operations. You might expect them to be performed in the order you wrote them in, but they'll actually be applied in the order of their addition to the tweakOperations map under the covers in the extension. Any added @TweaksSequence tweaks are also added to this map, so they'll be performed last.
I had a gas doing this. I got to play with creating Annotations, and using Groovy was especially cool. Aside from a bit of build and deploy scripting long ago, I hadn't used it.. but it seemed very natural and familiar from my recent years in ES6+ and Node.js. For example, the extension relies heavily on Closures. Groovy is like a perfect marriage of Java with the dynamic freedom I've grown accustomed to from using JS. Much easier to debug too!
TODOs:
- Inject an additional
sequenceDescriptiondata parameter to the feature method, containing the (pre-evaluation) contents of the data column, for use in@Unrolled feature method names. - Enforce that added
@TweaksSequencemethods are static (Spock does a lot of funky stuff with scope and it's probably best to avoid any instance state) - Use generics to enforce signatures of
@TweaksSequencesmethods: BiFunction<List, List, Object> - Include a couple of Specification features showing all/most usage possibilities, for reference