Continuous streams and iterators ================================ Some comments leading from Alex's thought provoking email. Thank you Alex. Note that all the attached files are stand-alone Scala so they can be compiled and run from the command line with a command like: `scala Example1.scala`. First example ------------- Here's the first example from the email: ```scala var validCustomer = newCustomer if (newCustomer.isEvaluation) { var validCode = false while (!validCode) { val scalaRandom: Random = new Random val randomCustomerCode = scalaRandom.alphanumeric//nextString(5) val code = "eval-" + randomCustomerCode.take(5).mkString // (randomCustomerCode take 5).toString() platformEnv.customerAPI().getCustomerByCode(code).asScala match { case None => { validCustomer = Customer.safeCreate(newCustomer.customerId(), code, newCustomer.customerName(), newCustomer.host(), newCustomer.isEnabled, newCustomer.languageCode(), newCustomer.isServicesOnly, newCustomer.isEvaluation) validCode = true } case _ => // Do nothing } }// end of while } ``` So, firstly, lets' make this into a small single file testcase by adding the following code at the beginning which 'mocks' the context of this code in a way that suffices to make the tests work. This is not a normal test we would use in our codebase, it is a mechanism for testing this code in complete isolation for testing this example, but some of the techniques are applicable in both cases. ```scala import scala.util.Random import scala.language.reflectiveCalls class Customer(customerId: String, code: String, customerName: String, host: String, val isEnabled: Boolean, languageCode: String, val isServicesOnly: Boolean, val isEvaluation: Boolean) { def customerId(): String = customerId def customerName(): String = customerName def host(): String = host def languageCode(): String = languageCode override def toString = s"Customer($customerId, $code, $customerName, $host, $isEnabled, $languageCode, $isServicesOnly, $isEvaluation)" } object Customer { def safeCreate(customerId: String, code: String, customerName: String, host: String, isEnabled: Boolean, languageCode: String, isServicesOnly: Boolean, isEvaluation: Boolean) = new Customer(customerId, code, customerName, host, isEnabled, languageCode, isServicesOnly, isEvaluation) } val results = Iterator.continually(Seq(Some("A"),Some("B"),Some("C"),Some("D"),Some("E"),None)).flatten val platformEnv = new { def customerAPI() = new { def getCustomerByCode(code: String) = new { def asScala: Option[String] = results.next } } } val newCustomer = new Customer("Test", "Code", "Name", "Host", true, "Language", false, true) // <--- Example code goes here println(validCustomer) ``` So all we have done here is create a `Customer` class that behaves like the one we are testing and added a mock `platformEnv`. If the original had been Scala then the `Customer` class might have been created with a one-liner `case` class: ```scala case class Customer(customerId: String, code: String, customerName: String, host: String, isEnabled: Boolean, languageCode: String, isServicesOnly: Boolean, isEvaluation: Boolean) ``` However, the example code calls the 'get' methods using empty parenthesis, e.g.: `newCustomer.customerName()` rather than `newCustomer.customerName` (the preferred Scala way) so, to avoid compiler warnings we need to create non-typical implementations which require empty parenthesis. Why is Scala concerned about this you might ask? It's because Scala tries to blur the distinction between properties of objects and purely functional functions of objects that (given that they take no parameters) should always return the same result. This enables variables and functions to be switched later on without impacting the dependent code. We override the `toString` method to give us a nicer view of the results when we `println(validCustomer)`. Now, we could import the platformEnv and all that that entails, but we certainly don't need that here. All we need is a `getCustomerByCode(...)` with a `.asScala` method that returns either a `Some` or a `None`. So we create one-off anonymous objects here that provide just the functionality we need to make this work. As for the results, well the example code never worries about the type of the Customer object returned so I'm just returning a String instead of a Customer. For the first 5 result we give `Some[String]` for the fifth we return `None`. The line: ```scala val results = Iterator.continually(Seq(Some("A"),Some("B"),Some("C"),Some("D"),Some("E"),None).toStream).flatten ``` can be broken down like this: ```scala val fixedSequenceOfResults: Seq[Option[String]] = Seq(Some("A"),Some("B"),Some("C"),Some("D"),Some("E"),None) val iteratorOfSeqOfResults: Iterator[Seq[Option[String]]] = Iterator.continually(fixedSequenceOfResults) val iteratorOfResults: Iterator[Option[String]] = iteratorOfSeqOfResults.flatten ``` the 'magical' `flatten` function takes the collection of collections -- container of containers -- iterator of sequences -- and reduces it to a single collection that takes the form of the outer collection: the iterator. So results is now an iterator that provides an infinite iterator of `Option[String]`s where every fifth one is a `None`. Note that this is not very functional because iterators don't give the same result to each call of `next`. However, this serves a very useful and concise purpose within a very limited scope of code. --- So, that done the first example works fine with the extra ` case _ => // Do nothing` line that was missing from the email. Example 2 --------- Here's the more Scala like example from the email: ```scala val validCustomer = if (newCustomer.isEvaluation) { val scalaRandom: Random = new Random def randomGen() : String = { val randomCustomerCode = scalaRandom.alphanumeric "eval-" + randomCustomerCode.take(5).mkString } val it = Iterator.continually { randomGen() }.buffered it.takeWhile(platformEnv.customerAPI().getCustomerByCode(_).asScala.isDefined) val validRandomCustomerCode = it.head Customer.safeCreate(newCustomer.customerId(), validRandomCustomerCode, newCustomer.customerName(), newCustomer.host(), newCustomer.isEnabled, newCustomer.languageCode(), newCustomer.isServicesOnly, newCustomer.isEvaluation) } else { newCustomer } ``` It is fine, but it is still suffers from potential side-effects that could easily go wrong. The line: ```scala val validRandomCustomerCode = it.head ``` Could be written in pseudo code as saying: ```scala // validRandomCustomerCode = the next value of it (assuming that the next value is valid) val validRandomCustomerCode = it.head ``` which only works because of the preceding line which says: ```scala // Keep taking values from the iterator `it` whilst they are found to be 'invalid' (already used) ones. it.takeWhile(platformEnv.customerAPI().getCustomerByCode(_).asScala.isDefined) // When we get here we know that the next value is valid (unused). ``` Example 3 --------- An alternative to `Iterator` is `Stream`. Whilst `Iterator` keeps producing new values and forgets the previous, `Stream` is just a lazy way of producing an infinitely long collection. If you ask for the head of a `Stream` you will always get the first value again and again, later values will only be generated when they are first needed, but from then on they will be remembered until the Stream is destroyed. So, with a stream we avoid having to use the complex and unusual combination of `buffered` and `takeWhile`. We can treat the results like an ordinary collection. This means we can take the infinite lazy stream of results and filter it for the results we want. The results will be all valid but will be lazily generated only when we need them. Here's the resulting code (with a few other minor modifications): ```scala val validCustomer = newCustomer.isEvaluation match { case true => val randomCustomerCodeStream = Stream.continually("eval-" + Random.alphanumeric.take(5).mkString) val undefinedCustomerStream = randomCustomerCodeStream.filter(!platformEnv.customerAPI().getCustomerByCode(_).asScala.isDefined) val validRandomCustomerCode = undefinedCustomerStream.head new Customer(newCustomer.customerId(), validRandomCustomerCode, newCustomer.customerName(), newCustomer.host(), newCustomer.isEnabled, newCustomer.languageCode(), newCustomer.isServicesOnly, newCustomer.isEvaluation) case false => newCustomer } ``` Final note ---------- If `Customer` had been a standard Scala case class rather than a Java class then the `new Customer` line could have been reduced to: ```scala newCustomer.copy(code = validRandomCustomerCode) ``` as Scala case classes provide a built in `copy` method for just this kind of situation.