# Sequence Tweaking Spock Extension
*Quick and easy generation of int[] sequences for Codility and Leetcode algorithm challenge test cases*
Recently I've been catching up on modern Java.. I haven't used it on the job since Java 5 came out. I've been practicing algorithms on [Codility](http://www.codility.com) and [Leetcode](http://www.leetcode.com).
The algorithm puzzles on these sites almost always use `int` and/or `int[]` arguments and return types, and sometimes have BigO performance requirements. To test your algorithms properly, you need to generate very large input arrays, of sizes up to 100,000 with values from +/- 1 billion. My test cases weren't anywhere near as comprehensive as those run by Codility when you submit you solution, and I wanted a way to quickly create tests to cover edge cases and very large inputs.
I started using the [data tables](http://spockframework.org/spock/docs/1.1/data_driven_testing.html#data-tables) feature in the awesome [Spock](http://spockframework.org) testing framework, and right away I started creating some [helper methods](http://spockframework.org/spock/docs/1.1/spock_primer.html#_helper_methods) to help with the generation of `int[]` sequences for input to my solutions.
Here's a simple example of what I was doing:
### Simple beginnings
```groovy
import spock.lang.*
class MySpecification extends Specification {
def solution = new AlgorithmSolution()
// helper for generation of a range of ints
ArrayList range( int from, int to ) {
return IntStream.rangeClosed( from, to ).boxed().collect( Collectors.toList() )
}
@Unroll
def "given input sequence #sequence with #indexReplacements, the result is #result"() {
given:
// replace values at specified positions if indexReplacements is not empty
if ( !!indexReplacements ) {
indexReplacements.each { position, value ->
sequence[ position ] = value
}
}
expect:
solution.solution( sequence ) == result
where:
result | sequence | indexReplacements
5 | [ 1, 2, 3 ] * 2 | [ : ] // 1, 2, 3, 1, 2, 3
5 | range(1, 5) * 2 | [ 1: 99, 8: 99 ] // 1, 99, 3, 4, 5, 1, 2, 3, 99, 5
42 | [ 9 ] * 10_000 | [ 0: 0, 9_999: 0 ] // 0, 9, 9, 9 ... 9, 9, 9, 0
42 | range(1, 100) | [ 0: 100, 99: 1 ] // 100, 2, 3, 4 ... 97, 98, 99, 1
```
I called my `range()` Specification helper method from within my Spock data column to generate a range of ints, and I specified values for arbitrary sequence positions in a separate `indexReplacements` column of position:value maps. The code that applied these replacements was included in the `given:` block.
This worked pretty well, but I didn't like cluttering the feature method (or even the Specification) so much - the helper code distracted from the readability of the spec.
Also, while Spock's data tables are very expressive, it can be a pretty tedious to add new facets to the generation of test data.
For example, say I wanted to add a bit more utility to the Specification - like a new `valueExclusions` column where I could specify values I want yanked out of the sequence:
```groovy
// (previous code omitted)
given:
if ( !!indexReplacements ) {
indexReplacements.each { position, value ->
sequence[ position ] = value
}
}
if ( !!valueExclusions ) // NEW CODE
sequence.removeAll( valueExclusions )
expect:
solution.solution( sequence ) == result
where: // NEW COLUMN: values (i.e. empty lists) required
result | sequence | indexReplacements | valueExclusions
5 | [ 1, 2, 3 ] * 2 | [ : ] | [ ]
5 | range(1, 5) * 2) | [ 1: 99, 8: 99 ] | [ ]
42 | [ 9 ] * 10_000 | [ 0: 0, 9_999: 0 ] | [ ]
5 | range(1, 100) | [ 0: 100, 99: 1 ] | [ ]
42 | range(1, 10_000) | [ : ] | [ 5_000, 5_001 ] // <=== new test case
```
The new column requires a value for each row - requiring the addition of an empty list (`[]`) to every test case just to keep everything working. That's a lot of table fiddling just to add something new; not very conducive for experimentation freedom. Plus, while this example only added two new lines of helper code, it further clutters the Specification and detracts from readability and clarity of the spec.
### Options for improvement
It's pretty easy to improve things by extracting helpers like the `range()` method above into a superclass Specification, but the other (non-method) utility code wasn't as easily abstracted. Besides, most of the fiddling I was doing was adding and changing the columns in the data tables.
So, I did a little digging on Spock Annotation extensions, and found a chance to catch up and play with Annotations, Reflection, Groovy, and more.
The extension I came up with is described below, and included in this Gist.
### Enabling the extension
The extension is enabled by simply by applying the `@TweakSequence` Annotation to the Feature Method. The sequenceParameterName attribute can be used to specify which column contains the sequence you want to fiddle with. '`sequence`' is the default, but 'A' is pretty common for int[] input of algorithm challenges.
```groovy
@Unroll
@TweakSequence( sequenceParameterName = "A" )
def "the sum of #sequence is #result"() {
expect:
IntStream.of( A ).sum == result
where:
result | A
12 | [ 1, 2, 3, 1, 2, 3 ]
12 | [ 1, 2, 3 ] * 2
```
> Note: Henceforth, I use 'sequence column' to refer to the column specified to the annotation. Also, since I'm solely discussing [data tables](http://spockframework.org/spock/docs/1.1/data_driven_testing.html#data-tables) for data-driven testing in Spock, I'm referring to Spock data parameters as 'columns'.
### Generating ranges
Now let's use the extension to generate a range of numbers. All four rows below are equivalent:
```groovy
@Unroll
@TweakSequence( sequenceParameterName = "A" )
def "the sum of #sequence is #result"() {
expect:
IntStream.of( A ).sum == result
where:
result | A
12 | [ 1, 2, 3, 1, 2, 3 ]
12 | [ 1, 2, 3 ] * 2
12 | [ range: [ start: 1, end: 3 ]] * 2
12 | [ range: [ start: 1, end: 3, repeat: 2 ]]
```
The first two rows don't use the extension at all.. Those values for `A` are List literals, and `*` is a Groovy operator override for `List.multiply()`, which repeats the List.
The last two rows are different. Instead of a List, we're providing a __map__ literal containing a `range` mapping of range attributes, which the extension uses to generate the specified range. The third row again uses the Groovy `*` operator override, while for the fourth, the extension recognizes the `repeat` attribute for the same result.
Let's look at another range example - all three rows below are equivalent:
```groovy
@Unroll
@TweakSequence
def "the sum of #sequence is #result"() {
expect:
IntStream.of( sequence ).sum == result
where:
result | sequence
-6 | [ 2, -1, -4, 2, -1, -4 ]
-6 | [ 2, -1, -4 ] * 2
-6 | [ range: [ start: 2, end: -4, step: 3, repeat: 2 ]]
```
The `step` and `repeat` attributes do exactly what their names imply. Further, if the `start` value is higher than the `end` value, the resulting range is produced in descending order.
> The absolute value for `step` is effectively used. Negation has no effect, but stepping works the same for ascending and descending ranges.
### Ways to use tweaks
There are a few ways to use the tweaks from the extension.. To demonstrate, let's bring back the `indexReplacements` utility I was using earlier. Once again, all three rows in the example below are equivalent.
```groovy
expect:
sequence == result
where:
result | sequence | indexReplacements
[ 9, 2, 0, 4 ] | [ 1, 2, 3, 4 ] | [ 0: 9, 2: 0 ]
[ 9, 2, 0, 4 ] | [ range: [ start: 1, end: 4 ], indexReplacements: [ 0: 9, 2: 0 ] ] | [ : ]
[ 9, 2, 0, 4 ] | [ sequence: [ 1, 2, 3, 4 ], indexReplacements: [ 0: 9, 2: 0 ] ] | [ : ]
```
In the first row, we see that if the table includes an `indexReplacements` data column of `position:value` mappings, the replacements are performed like before: the element at the index of the map's key is replaced by its value.
In the second row, we see that if we're using `range` to generate the sequence, we can include tweaks within the same map as the range. If we only need them for one or two rows, we are spared the trouble of adding columns out to the table.
> Rows where the indexReplacements value is an empty map are intended to depict the convenience and tradeoffs of leaving the column out of the table altogether. In other words, for those rows, pretend the column isn't there. :-)
In the third and final row, we see that we can still avoid adding that column if we're not using `range`: we can specify both the sequence and the tweak(s) within a single map. Instead of `range`, simply specify the sequence using the same attribute name as the sequence column.. just look at the column header.
> Hint: to include your sequence plus all your tweaks in a single data column, include them in a map with your sequence, the latter as either a `range` or a `sequence`.
### The `tweaks` column
There's one final way to specify a tweak. Let's bring back the `valueExclusions` tweak to demonstrate it:
```groovy
expect:
sequence == result
where:
result | sequence | tweaks
[ 1, 4 ] | [ 1, 2, 3, 4 ] | [ valueExclusions: [ 2, 3 ] ]
[ 1, 4 ] | [ range: [ start: 1, end: 4 ] ] | [ valueExclusions: [ 2, 3 ] ]
[ 99, 4 ] | [ 1, 2, 3, 4 ] | [ valueExclusions: [ 2, 3 ], indexReplacements: [ 0: 99 ] ]
```
The extension allows us to include a `tweaks` column containing a map of our desired tweaks. This way you can have a single column for all your tweaks, separate from the sequence column.
> An alternate name for the tweaks column can be specified to the extension via the `tweaksParameterName` annotation attribute.
I find this is a decent separation and when setting up tests for a new challenge, I'll usually include a single tweaks column.
But I still find all the methods of specifying tweaks useful. For example, if there's one tweak that I'll apply to almost every row in a table, it's much less cluttered to include a dedicated column for it. On the other hand, including a tweak in a map (within the sequence or `tweaks` column) is a quick way to experiment with one-off tweaks here and there, without adding a new column.
Now let's review everything that's available with the extension:
## Available tweaks
| Tweak / Name | what it does | How specified | Contents |
|----------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------|
| `sequence`
overridden via the `sequenceParameterName` annotation attribute. | specifies the (pre-tweak) sequence, typically as a List literal or an expression resulting in a List | * as a data column
or
* as a map attribute, within the sequence column | List, e.g.
`[ 1, 2, 3 ]`
or
`[ sequence: [ 1, 2, 3 ] ]` |
| `range` | generates a range of values | as a map, within the sequence column | Map with attributes: `start`, `end`, `step`, `repeat`:
`[ range: [ start: 1, end: 3 ] ]` |
| `tweaks`
overridden via the `tweaksParameterName` annotation attribute. | specifies a map of tweaks as `tweakName:data` mappings | as a data column | a Map with zero or more tweak names and the data used by the tweak (e.g. List, Map, or Number, etc) |
| `indexReplacements` | overwrites values at positions corresponding with mapping's key, with the mapping's value | * As a data column
or
* as a map, within the sequence column or the tweaks column | Map of `location:value` mappings:
`[ 1:99, 3: 0 ]` |
| `valueExclusions` | removes all occurrences of specified values from the sequence | * As a data column
or
* as a map, within the sequence column or the tweaks column | List of values to exclude from the sequence:
`[ 1, 3, 5 ]` |
## Adding new tweaks
You can easily add a new tweak from your Specification class.
```groovy
@TweaksSequence
static List addNumberToSequence( List inputSequence, int numberToAdd ) {
return inputSequence.stream().mapToInt { i -> i + numberToAdd }.toArray()
}
@TweaksSequence
static List reverseSequence( List inputSequence, boolean whetherToReverse ) {
if ( !whetherToReverse ) return inputSequence
return new LinkedList( inputSequence ).descendingIterator().toList()
}
@Unroll
@TweakSequence
def "#iterationCount): added TweakFunctions are used correctly: #sequence"() {
expect:
sequence == result
where:
result | sequence
[ 1, 2, 3 ] | [ 1, 2, 3 ]
[ 1, 2, 3 ] | [ sequence: [ 1, 2, 3 ] ]
[ 2, 3, 4 ] | [ sequence: [ 1, 2, 3 ], addNumberToSequence: 1 ]
[ 1, 2, 3 ] | [ sequence: [ 1, 2, 3 ], reverseSequence: false ]
[ 3, 2, 1 ] | [ sequence: [ 1, 2, 3 ], reverseSequence: true ]
}
```
Simply add a static method to your Specification, and apply the `@TweaksSequence` annotation (notice the additional 's' in the annotation name).
A few simple rules:
* Method should be static
* Signature: Two arguments
* The first argument should be `List inputList`
* The second argument can be any type. It will receive the value specified by the specifying map.
* The return type should be `List`, and the method should return the modified List.
* Apply the `TweaksSequence` annotation
Then, you can freely use the tweak (use the method name) in your data table anywhere you can use a tweak: in a dedicated column, or within the sequence or tweaks columns.
## Final notes:
When you're generating very large sequences, you probably want to avoid including the entire sequence in the name of the feature method (i.e. when using Spock's @Unroll feature). (See the related TODO below)
Watch out for precendence of tweak operations. You might expect them to be performed in the order you wrote them in, but they'll actually be applied in the order of their addition to the `tweakOperations` map under the covers in the extension. Any added `@TweaksSequence` tweaks are also added to this map, so they'll be performed last.
I had a gas doing this. I got to play with creating Annotations, and using Groovy was especially cool. Aside from a bit of build and deploy scripting long ago, I hadn't used it.. but it seemed very natural and familiar from my recent years in ES6+ and Node.js. For example, the extension relies heavily on Closures. Groovy is like a perfect marriage of Java with the dynamic freedom I've grown accustomed to from using JS. Much easier to debug too!
TODOs:
* Inject an additional `sequenceDescription` data parameter to the feature method, containing the (pre-evaluation) contents of the data column, for use in `@Unroll`ed feature method names.
* Enforce that added `@TweaksSequence` methods are static (Spock does a lot of funky stuff with scope and it's probably best to avoid any instance state)
* Use generics to enforce signatures of `@TweaksSequences` methods: BiFunction
* Include a couple of Specification features showing all/most usage possibilities, for reference