Over the last few weeks I've been practicing algorithms on Codility and Leetcode.
The algorithm puzzles on Codility and Leetcode are almost always dealing with int and/or int[] arguments and results, and sometimes have BigO performance requirements. I started using the data tables in the awesome Spock testing framework, and right away I started creating some helper methods to help with the generation of int sequences for input to the algorithms I was practicing. With a bit of code paired up with additional data columns (parameters, in Spock),
Here's a simple example - I was using a range() helper method from within my data column to generate a range of ints, and specifying overwriting values I wanted to apply at arbitrary spots in the sequence in a separate indexReplacements column of index:value maps, which was processed by some code in the fixture method if not empty.
import spock.lang.*
class MySpecification extends Specification {
def solution = new AlgorithmSolution()
// helper for generation of a range of ints
ArrayList range( int from, int to ) {
return IntStream.rangeClosed( from, to ).boxed().collect( Collectors.toList() )
}
@Unroll
def "given input sequence #sequence with #indexReplacements, the result is #result"() {
given:
// replace values at specified positions if indexReplacements is not empty
if ( !!indexReplacements ) {
indexReplacements.each { position, value ->
sequence[ position ] = value
}
}
expect:
solution.solution( sequence ) == result
where:
result | sequence | indexReplacements
5 | [ 1, 2, 3 ] * 2 | [ : ] // 1, 2, 3, 1, 2, 3
5 | range(1, 5) * 2 | [ 1: 99, 8: 99 ] // 1, 99, 3, 4, 5, 1, 2, 3, 99, 5
42 | [ 9 ] * 10_000 | [ 0: 0, 9_999: 0 ] // 0, 9, 9, 9 ... 9, 9, 9, 0
42 | range(1, 100) | [ 0: 100, 99: 1 ] // 100, 2, 3, 4 ... 97, 98, 99, 1
This worked pretty well, but I didn't like cluttering the feature method (or even the Specification) with code that distracted from the semantics of the solution and the functionality being tested.
Also, Spock's data tables are very expressive, but it can be a little tedious to add new facets to the generation of test data. Say I wanted to add a bit more utility to the example:
// (previous code omitted)
given:
if ( !!indexReplacements ) {
indexReplacements.each { position, value ->
sequence[ position ] = value
}
}
if ( !!valueExclusions ) // NEW CODE
sequence.removeAll( valueExclusions )
expect:
solution.solution( sequence ) == result
where: // NEW COLUMN: values (i.e. empty lists) required
result | sequence | indexReplacements | valueExclusions
5 | [ 1, 2, 3 ] * 2 | [ : ] | [ ]
5 | range(1, 5) * 2) | [ 1: 99, 8: 99 ] | [ ]
42 | [ 9 ] * 10_000 | [ 0: 0, 9_999: 0 ] | [ ]
5 | range(1, 100) | [ 0: 100, 99: 1 ] | [ ]
42 | range(1, 10_000) | [ : ] | [ 5_000, 5_001 ] // <=== new test caseThe new column requires a value for each row - requiring the addition of an empty list ([]) to every test case just to keep them working. Using the data tables is still worthwhile, but that's a lot of table fiddling just to something new for one test case. Plus, while this example only added two new lines of helper code, it further clutters the Specification and detracts from its value as readable documentation for my solution.
###Improvement options
It's pretty easy to improve things by extracting helpers like the range() method above into a subclass of Spock's Specification base class, but most of the fiddling I was doing was adding and changing the columns in the table. Adding code to use the data columns for manipulating the sequence, but this code isn't easily abstracted into a superclass like range(), which is used within the sequence column that is consumed by the solution.
Aside from learning and practicing algorithms, I've also been catching up on modern Java, which I used and loved for many years but haven't used on the job Java 5 came out. So, I saw this as a chance to practice custom Generics, Annotations, Lambdas, etc. I decided to write a Spock extension to replace the helper fixtures I had. Got to play with Groovy and practice custom Annotations.
Just a side note on Groovy - aside from a bit of build and deploy scripting about ten years ago, I hadn't done much with Groovy.. but I'm loving it.. a lot of Groovy's features (e.g. Closures) are pretty familiar from my recent years in ES6+ and Node.js.. it's like a perfect marriage of Java with the dynamic freedom I've grown accustomed to from using JS.
Here's what I wound up with. The extension is enabled by simply by applying the @TweakSequence Annotation to the Feature Method. It accepts one argument, for specifying which data parameter (column) contains the sequence you want to fiddle with ('sequence' is the default):
@Unroll
@TweakSequence("A") // ('sequence' is the default)
def "#iterationCount: result is the expected #result"() {
...
}
For the remainder of this article, I use 'sequence column' to refer to the column specified to the annotation, and since I'm solely discussing data tables for data-driven testing in Spock, I refer to Spock data variables as 'columns'. TODO - rename columns/parameters to data variables
Once this is done, the extension makes a few 'tweak' utilities available for the Spock feature: range, indexReplacements, valueReplacements, valueExclusions, sequence, and tweaks. It's easiest to show how they're used in Spock data tables:
@Unroll
@TweakSequence
def "the sum of #sequence is #result"() {
expect:
IntStream.of( sequence ).sum == result
where:
result | sequence
12 | [ 1, 2, 3, 1, 2, 3 ]
12 | [ 1, 2, 3 ] * 2
Both of the rows in the table above are equivalent - Spock (actually Groovy) allow us to easiy repeat a List via 'multiplying' it.
Now let's apply the TweakSequence Spock extension for another way - all of these are also equivalent:
@Unroll
@TweakSequence
def "the sum of #sequence is #result"() {
expect:
IntStream.of( sequence ).sum == result
where:
result | sequence
12 | [ 1, 2, 3, 1, 2, 3 ]
12 | [ 1, 2, 3 ] * 2
12 | [ range: [ start: 1, end: 3 ]] * 2
12 | [ range: [ start: 1, end: 3, repeat: 2 ]]In the last two rows, a Map is specified instead of a List in the sequence column, which in turn, contains a range map with start and end attribute. When a map containing range is provided, the extension uses its start and end values to generate the sequence. It also recognizes repeat and step values, which do exactly what their names imply. If the start value is greater than end, the sequence is reversed. So, all these are equivalent:
expect:
IntStream.of( sequence ).sum == result
where:
result | sequence
-6 | [ 2, -1, -4, 2, -1, -4 ]
-6 | [ 2, -1, -4 ] * 2
-6 | [ range: [ start: 2, end: -4, step: 3, repeat: 2 ]Now let's bring back the indexReplacements utility I was using earlier, now available via the extension as a 'tweak'. There are a few ways to use it:
expect:
sequence == result
where:
result | sequence | indexReplacements
[ 1, 99, 3 ] | [ 1, 2, 3 ] | [ 1: 99 ]
[ 1, 99, 3 ] | [ range: [ start: 1, end: 3 ], indexReplacements: [ 1: 99 ] ] | [ : ]
[ 1, 99, 3 ] | [ sequence: [ 1, 2, 3 ], indexReplacements: [ 1: 99 ] ] | [ : ]
In the first two rows, we see that if we provide an indexReplacements data column of position:value mappings, the replacements are performed like before.
In the third row, we see that if we're using range to generate the sequence, we can include the indexReplacements mappings within the same map as the range.
In the fourth and final row, we see something new: using the name of the sequence column again in the map allows us to specify a List literal as the source sequence, plus any other tweaks within the map, all within the one column. This way, we can specify all our sequences and tweaks without adding any columns. Essentially this is the same that we did earlier with range, but where we don't need to generate a range for the sequence.
There's one final way to specify a tweak. Let's bring back the valueExclusions tweak to demonstrate it:
result | sequence | tweaks
[ 1, 3 ] | [ 1, 2, 3 ] | [ valueExclusions: [ 2 ] ]
[ 1, 3 ] | [ range: [ start: 1, end: 3 ] ] | [ valueExclusions: [ 2 ] ]
[ 99, 3 ] | [ 1, 2, 3 ] | [ valueExclusions: [ 2 ], indexReplacements: 0, 99 ]Here we see that we can provide a tweaks column containing a map of our tweaks. This way you can have a single column for all your tweaks, but separate from the sequence column.
Note: an alternate name for the
tweakscolumn can be specified to the extension via the annotation:TweakSequence(tweaksParameterName="otherTweaks")
I find this is a decent separation and when setting up tests for a new challenge, I'll usually include a single tweaks column. But I still find all the methods of specifying tweaks useful. If there's one tweak that I'll apply to almost every row in a table, it's much less cluttered to include a dedicated column for it.
On the other hand, including a tweak in a map (within the sequence or tweaks column) is a quick way to experiment with one-off tweaks here and there, without adding a new column.
Now let's review everything that's available with the extension:
| Tweak | what it does | How available | Contents | Example | |
|---|---|---|---|---|---|
| sequence | specifies the pre-tweak starting point for the sequence | * As a data column * As a map, within the sequence column |
List: [ 1, 2, 3, ] | name can be changed via the value annotation attribute |
|
| range | * As a map, within the sequence column | Map - recognized attributes start, end, step, repeat: [ start: 1, end: 3, step: 2, repeat: 2 ] | attributes: start, end, step, repeat | ||
| tweaks | As a data column | each row specifies a Map with zero or more tweaks | name can be changed via the tweaksParameterName attribute |
||
| indexReplacements | * As a data column* As a map, within the sequence column or the tweaks column | Map - of location / value mappings | |||
| valueExclusions | * As a data column* As a map within the sequence column* As a map within the tweaks column | List |
Where available:
expected: map
expected: List
6 | [ 1, 3, 4 ] * 2 | [ indexReplacements: [ 3: 100 ] ]
3 | [ ].range( 1, 10 ) | [ indexReplacements: [ 5: 1 ] ]
3 | [ range: [ start: 1, end: 10 ] ] | [ indexReplacements: [ 5: 1 ], valueExclusions: [ 5, 6 ] ]
3 | [ range: [ start: 1, end: 10 ] ] | [ indexReplacements: [ 5: 1 ], valueExclusions: [ 5, 6 ] ]
20_002 | [ 10001 ] * 5 | [ : ]
10 | [ 5, 6, 5, 6 ] * 8 | [ : ]
2 | [ 1 ] * 5 | [ : ]
2 | [ 1 ] * 5 | [ : ]
2 | [ 1 ] * 10 | [ : ]
4 | [ 2 ] * 100_000 | [ indexReplacements: [ 0: 1, 100_000: 1 ] ]
2 | [ 2 ] * 100_000 | [ indexReplacements: [ 1: 1, 99_998: 1 ] ]
3 | [ range: [ start: 1, end: 100 ] ] | [ indexReplacements: [ 98: 1 ] ]
6 | [ range: [ start: 10, end: 1 ] ] | [ : ]
6 | [ range: [ start: 100, end: 1 ] ] | [ : ]
6 | [ range: [ start: 1, end: 100 ] ] | [ indexReplacements: [ 2: 1 ] ]
20 | [ range: [ start: 10, end: 100, repeat: 100 ] ] | [ indexReplacements: [ 2: 100 ] ]
3 | [ range: [ start: 1, end: 100 ] ] | [ indexReplacements: [ 3: 1 ] ]
6 | [ range: [ start: 1, end: 100_000 ] ] | [ valueExclusions: [ 5, 6 ] ]
2 | [ range: [ start: 1, end: 100_000 ] ] | [ indexReplacements: [ 1: 1, 99_998: 1 ] ]
3 | [ range: [ start: 1, end: 100_000 ] ] | [ indexReplacements: [ 99_998: 1 ] ]
12 | [ range: [ start: 100_000, end: 10 ] ] | [ indexReplacements: [ 5555: 1 ] ]
}
@Unroll
@TweakSequence
def "#iterationCount: result is the expected #result"() {
where:
result | sequence | tweaks
7 | [ 5 ] * 7 | [ indexReplacements: [ 1: 2, 2: 2 ] ]
6 | [ 1, 3, 4 ] * 2 | [ indexReplacements: [ 3: 100 ] ]
3 | [ ].range( 1, 10 ) | [ indexReplacements: [ 5: 1 ] ]
3 | [ range: [ start: 1, end: 10 ] ] | [ indexReplacements: [ 5: 1 ], valueExclusions: [ 5, 6 ] ]
3 | [ range: [ start: 1, end: 10 ] ] | [ indexReplacements: [ 5: 1 ], valueExclusions: [ 5, 6 ] ]
20_002 | [ 10001 ] * 5 | [ : ]
10 | [ 5, 6, 5, 6 ] * 8 | [ : ]
2 | [ 1 ] * 5 | [ : ]
2 | [ 1 ] * 5 | [ : ]
2 | [ 1 ] * 10 | [ : ]
4 | [ 2 ] * 100_000 | [ indexReplacements: [ 0: 1, 100_000: 1 ] ]
2 | [ 2 ] * 100_000 | [ indexReplacements: [ 1: 1, 99_998: 1 ] ]
3 | [ range: [ start: 1, end: 100 ] ] | [ indexReplacements: [ 98: 1 ] ]
6 | [ range: [ start: 10, end: 1 ] ] | [ : ]
6 | [ range: [ start: 100, end: 1 ] ] | [ : ]
6 | [ range: [ start: 1, end: 100 ] ] | [ indexReplacements: [ 2: 1 ] ]
20 | [ range: [ start: 10, end: 100, repeat: 100 ] ] | [ indexReplacements: [ 2: 100 ] ]
3 | [ range: [ start: 1, end: 100 ] ] | [ indexReplacements: [ 3: 1 ] ]
6 | [ range: [ start: 1, end: 100_000 ] ] | [ valueExclusions: [ 5, 6 ] ]
2 | [ range: [ start: 1, end: 100_000 ] ] | [ indexReplacements: [ 1: 1, 99_998: 1 ] ]
3 | [ range: [ start: 1, end: 100_000 ] ] | [ indexReplacements: [ 99_998: 1 ] ]
12 | [ range: [ start: 100_000, end: 10 ] ] | [ indexReplacements: [ 5555: 1 ] ]
}
PS: For convenience, the sequence, for which ArrayList is its natural form in Groovy, is converted to an int[]
Lots of learning.. Java.. and algorithms.
all pretty easy by subclassing Spock's Specification class, but I thought it would be fun (and convenient) to get it done with a simple annotation.
New to annotations
New to groovy, but Javascript concepts helped a lot.
New to Spock
Needing to generate very large input arrays, of sizes up to 100,000 and values from +/- 1 billion.
As I worked through the problems, I wanted to be able to quickly create test cases.. edge cases and very large inputs so I could check out their correctness and how they perform
I'd done some basic Groovy build scripting about ten years ago
To use, simply use the annotation on your feature method:
```groovy
There are several ways.. So you shouldn't have to rewrite your data table to add columns etc. just to use a tweak once or twice.
result | sequence | tweaks
7 | [ 5 ] * 7 | [ indexReplacements: [ 1: 2, 2: 2 ] ]
6 | [ 1, 3, 4 ] * 2 | [ indexReplacements: [ 3: 100 ] ]
3 | [ ].range( 1, 10 ) | [ indexReplacements: [ 5: 1 ] ]
3 | [ range: [ start: 1, end: 10 ] ] | [ indexReplacements: [ 5: 1 ], valueExclusions: [ 5, 6 ] ]
3 | [ range: [ start: 1, end: 10 ] ] | [ indexReplacements: [ 5: 1 ], valueExclusions: [ 5, 6 ] ]
20_002 | [ 10001 ] * 5 | [ : ]
10 | [ 5, 6, 5, 6 ] * 8 | [ : ]
2 | [ 1 ] * 5 | [ : ]
2 | [ 1 ] * 5 | [ : ]
2 | [ 1 ] * 10 | [ : ]
4 | [ 2 ] * 100_000 | [ indexReplacements: [ 0: 1, 100_000: 1 ] ]
2 | [ 2 ] * 100_000 | [ indexReplacements: [ 1: 1, 99_998: 1 ] ]
3 | [ range: [ start: 1, end: 100 ] ] | [ indexReplacements: [ 98: 1 ] ]
6 | [ range: [ start: 10, end: 1 ] ] | [ : ]
6 | [ range: [ start: 100, end: 1 ] ] | [ : ]
6 | [ range: [ start: 1, end: 100 ] ] | [ indexReplacements: [ 2: 1 ] ]
20 | [ range: [ start: 10, end: 100, repeat: 100 ] ] | [ indexReplacements: [ 2: 100 ] ]
3 | [ range: [ start: 1, end: 100 ] ] | [ indexReplacements: [ 3: 1 ] ]
6 | [ range: [ start: 1, end: 100_000 ] ] | [ valueExclusions: [ 5, 6 ] ]
2 | [ range: [ start: 1, end: 100_000 ] ] | [ indexReplacements: [ 1: 1, 99_998: 1 ] ]
3 | [ range: [ start: 1, end: 100_000 ] ] | [ indexReplacements: [ 99_998: 1 ] ]
12 | [ range: [ start: 100_000, end: 10 ] ] | [ indexReplacements: [ 5555: 1 ] ]
result | sequence | indexReplacements
10 | [ 1 ] * 5 | [ : ]
9 | [ 1, 2, 3 ] * 3 | [ : ]
18 | [ 1, 2, 3 ] * 4 | [ : ]
result | [ 1, 2, 3, 4, 5 ] | [ 1: 0, 3: 0 ]
result | sequence | indexReplacements
10 | [ 1 ] * 5 | [ : ]
9 | [ 1, 2, 3 ] * 3 | [ : ]
18 | [ 1, 2, 3 ] * 4 | [ : ]
15 | [ 1, 2, 3, 4, 5 ] * 3 | [ : ]
50 | [ 1, 2, 3, 4, 5 ] * 5 | [ : ]
49995000 | [ 1 ] * 10_000 | [ : ]
1_000_000_000 | [ range: [ start: 0, end: 20, repeat: 10_000 ] ] | [ : ]
1_000_000_000 | [ range: [ start: -10, end: 10, repeat: 10_000 ] ] | [ : ]
1_000_000_000 | [ range: [ start: -10, end: 10, repeat: 9_999 ] ] | [ : ]
0 | [ range: [ start: 50_000, end: -49_999 ] ] | [ : ]
1 | [ range: [ start: 50_000, end: -49_999 ] ] | [ 0: 0 ]
5 | [ range: [ start: 50_000, end: -49_999 ] ] | [ 0: 0, 100: 1, 200: 2, 30_000: 3, 70_000: 4 ]
21 | [ range: [ start: 50_000, end: -49_999 ] ] | [ 0: 0, 100: 0, 200: 0, 30_000: 0, 70_000: 0, 90_000: 0 ]
result | sequence | indexReplacements
10 | [ ] | [ indexReplacements: [ 1: 0, 3: 0 ], valueExclusions: [ 1, 2 ] ]
result | sequence | indexReplacements
10 | [ sequence: [ 1, 2, 3, 4, 5, ], indexReplacements: [ 1: 0, 3: 0 ], valueExclusions: [ 1, 2 ] ]
[ range: [ start: 1, end: 5 ], indexReplacements: [ 1: 0, 3: 0 ], valueExclusions: [ 1, 2 ] ]
Notes: watch out for order of precedence.at the moment, they're applied in this order:
when you're generating very large sequences, you probably want to avoid including the entire sequence in the name of the feature method (i.e..if you're naming using Spock's @Unroll annotation).
any provided are added after these in order
and of course any operations added afterward, outside the sequence