Balaji Sivaraman @balajisivaraman_twitter
Hi all, I need some help understanding a piece of Doobie code from the examples. It is the StreamingCopy one: (https://github.com/tpolecat/doobie/blob/series/0.4.x/yax/example/src/main/scala/example/StreamingCopy.scala). I am using a modified version of the fuseMap2 example from that file. Hereβs how Iβve modified it for my requirements:
def fuseMap[F[_]: Catchable: Monad, A, B](
source: Process[ConnectionIO, A],
sink: Vector[A] => ConnectionIO[B],
delete: ConnectionIO[Unit]
)(
sourceXA: Transactor[F],
sinkXA: Transactor[F]
): F[Unit] =
sinkXA.exec.apply(
source
.transact(sourceXA)
.chunk(10000)
.translate(Ξ»[F ~> Kleisli[F, Connection, ?]](a => Kleisli(_ => a)))
.evalMap(sink(_).foldMap(sinkXA.interpret))
.append(Process.eval_(delete.foldMap(sinkXA.interpret)))
.run
)Basically, I needed to use PostgresCopyManager to do Copy, instead of insert as in the example. Thatβs the reason for the chunking. Also I had to run a delete query as part of the same transaction. You can see the changes for this in the above code. I was able to make modifications purely by looking at the types and functions and using a bit of intuition. It works perfectly. The only problem is I have no idea why.
My knowledge of typed FP is basic. I can grok HKTs, understand why we do Catchable: Monad etc? But type lambdas and kind projector is where my brain becomes mush. I tried splitting up the above piece of code, but that didnβt help much, because I didnβt know how to break down the entire block that is passed to apply.
I'm still having difficulty wrapping my head around the types:
val exec: ~>[Kleisli[F, Connection, ?], F] = sinkXA.exec ->
This is the type of the exec expression. So my understanding is that we need to pass something of type Kleisli[F, Connection, A] to apply or something like that. Is that the type of the expression passed in to apply? For me, it shows the type is F[Unit]. My IntelliJ is also screwing up with the types. I also don't understand this line: translate(Ξ»[F ~> Kleisli[F, Connection, ?]](a => Kleisli(_ => a))). Would appreciate help on that?
I read this post (http://underscore.io/blog/posts/2016/12/05/type-lambdas.html) to get a basic understanding of type lambdas, but it's not of much help here.
Fabio Labella @SystemFw
are you familiar with ~>?
Balaji Sivaraman @balajisivaraman
@SystemFw Apart from following the code and being taken to the natural transformation source code, not really. I also read this post (http://eed3si9n.com/learning-scalaz/Natural-Transformation.html), and got a very rudimentary idea, but the last part of that post with the Cat Theory was lost on me.
Fabio Labella @SystemFw right, let's skip Cat Theory are you familiar with Free at all? (in code, not CT)
Balaji Sivaraman @balajisivaraman Have seen both Rob's presentation (https://www.youtube.com/watch?v=M5MF6M7FHPo) and Runar's (https://www.youtube.com/watch?v=M258zVn4m2M) , but this is my first real experience using a library with those concepts. I guess that's why I'm struggling.
Fabio Labella @SystemFw so, basically you have an ADT describing the instructions of your language
trait KVS[A]
case class Get(k: String) extends KVS[Option[String]]
case class Put(k: String, v: String) extends KVS[Unit]note that KVS has higher kind, i.e. KVS[_], like Option and List but unlike Int
Free represent a description of a program, and it takes two type parameters
Free[Instr[_], A]
Instr[_] is your ADT, it represents the individual instructions in your mini language. It needs to be [_] so that we can express the fact that each instruction returns a different type (e.g. Option[String] for Get, Unit for Put)
A represents the return type of your mini program
so Free[Instr[_], A] can be seen as a program written in the mini language whose primitives are described by Instr[_] which, when run, will eventually result in a value of type A
are you with me so far?
Fabio Labella @SystemFw
cool, now you have this data structure that represents a program
the way we run it is by interpreting it into another, more powerful data structure
e.g. IO, or Task
which you can look at as a Free with an unlimited set of instructions
once we have this more powerful type, we can keep composing it with other Task (let's pick Task for now)
and finally, when we're done, we call unsafePerformSync at the end of the world
and things actually happen
is this also clear?
Balaji Sivaraman @balajisivaraman Yeah. Except the part with "unlimited set of instructions", I'm not sure what unlimited means here.
Fabio Labella @SystemFw
it means that in IO you can do everything
because IO encapsulates arbitrary side effects from the impure subset of Scala
whereas in your own Free you can only do Get or Put
it's a fairly poor metaphor, apologies
Fabio Labella @SystemFw
yes, but IO and Task can do anything Scala can do
Task.delay { // all scala code can fit here, e.g. println()}
Fabio Labella @SystemFw
yes, albeit they are not side effects, because Task and IO encapsulate them, so nothing is happening until you explicitly run them
Fabio Labella @SystemFw
right, so interpreting a Free means going from a less powerful, but more precise language, to a more powerful but less precise language
Natural Transformations are involved in the process of interpretation
I'm going to try to explain this interpretation in english, then we can move to code
let's say Instr[_] is your ADT, and Target[_] is any Monad (we will pick Task)
interpreting Free means
"if you give me a way to translate each instruction in Instr[_] into an instruction in Target[_], I will translate each program in Free[Instr, A] into a program in Target[A]"
does this sentence make sense?
Balaji Sivaraman @balajisivaraman Okay. Normally (as per the YT presentations and my understanding) when we interpret a Free program, we get something of type A (Get interpreted would give A, Put interpreted would give Unit). Now instead of that, what we're saying is we will interpret it into another target Monad (Task) and then use Task to gain additional benefits.
Fabio Labella @SystemFw
when we interpret a Free program, we get something of type A ( No, that's incorrect a
Free monadis always translated into anotherMonadthe otherMonadthen does what's needed to produce anAthe only exception is when some examples useId, with side effects
Fabio Labella @SystemFw that's wrong, and I don't like there are so many examples of this, even on the cats website
Fabio Labella @SystemFw
cool, now let's see how the interpretation function looks in code
yeah, but without getting too technical, that sentence in quotes above is actually what being free means
anyway, the most common way to interpret a Free program (there are others) is through foldMap
trait Free[F[_], A] {
//...
def foldMap[G](translator: F ~> G)(implicit gmonad: Monad[G]): G[A] = ???
}Fabio Labella @SystemFw
if you give me a way to translate each instruction in
Instr[]into an instruction inTarget[]that's F ~> G, I'll explain more about it in a second I will translate each program in Free[Instr, A] that's whyfoldMapis onFree(implicit gmonad: Monad[G]) as long as the target is a monad yes
Fabio Labella @SystemFw
now the implementation of foldMap is not important for the end user
what's important is why we need ~>
this might be a bit tricky to understand, but it's doable
so what you want is pass to foldMap something that can translate each instruction in Instr[_] to something in Task[_]
which is normally written as a pattern match
how would you write that? any ideas?
I'm only interested in the signature, the implementation doesn't really matter
Fabio Labella @SystemFw
that's the most reasonable assumption, but it doesn't work
translator takes a type parameter B (let's rename it to B to avoid confusing it with the A in Free)
if you do foldMap(translator)
you will need to specify the type B (e.g. foldMap(translator[Int])
what type do you assign to B, instead of the bogus Int I put there?
Balaji Sivaraman @balajisivaraman Hmmm, my guess is we don't have one yet. We want it to work for all possible Bs, now and in the future.
Fabio Labella @SystemFw
wow
that's the correct answer!
we want to pass in a polymorphic function
so when we pass translator, we want to keep its polymorphism
because we will only specialise once we have a specific instruction
this difference can be made specific in types, but scala doesn't give you a nice notation for it
I'm going to write it in Haskell for a second, but it should be ok:
theTranslatorYouHave :: forall f g b. (f b -> g b) -- f b = Instr[B], g b = Task[B]
theTranslatorYouWant: forall f g. (forall b. f b -> g b)which basically means
"we know what F[_] and G[_] are when we pass translator to foldMap (Instr[_] and Task[_] respectively), so we can specialise them, but we want to still be polymorphic in B, because only the body of foldMap knows how to specialise B"
does this make any sense? (btw, very impressive that you came up with "Hmmm, my guess is we don't have one yet. We want it to work for all possible Bs, now and in the future.", this is a deep concept that normally takes time to sink in)
so it should be clear I hope that just a normal function can't work as an argument to foldMap, because we want it to work for all possible B, now and in the future
Balaji Sivaraman @balajisivaraman Thank you. Yes, because when we call foldMap, we won't have a specialized B. That will only come in higher up the chain when someone calls us. So when we write it, we want it to work for anything that could be passed in. Hopefully that is right.
Fabio Labella @SystemFw yes, it's roughly right the ability to pass polymorphic functions around, without losing their polymorphism is called higher-rank polymorphism scala does not have such a feature however, we can work our way around it let's create a trait
trait Translator[F[_], G[_]] {
def apply[B](in: F[B]): G[B]
}now we can write
def translator: Translator[Instr, Task] = new Translator[Instr, Task] {
def apply[B](in: Instr[B]): Task[B] = in match {
case Put(k, v) => Task.delay (DB.put(k, v))
case Get(k) => Task.delay(DB.get(k))
}we can then say ourFree.foldMap(translator)
notice that the type of translator doesn't mention B, but only Instr and Task
the apply method is still polymorphic in every B, so the body of foldMap can call it, and it'll work
does this make sense?
Balaji Sivaraman @balajisivaraman Yes. I'm with you so far. Just to clarify, within the body of foldMap, we have an A, which is basically the A of the Free. And this is the A that is applied when we call Translator.apply. Is that right?
Fabio Labella @SystemFw
not quite
within the body of foldMap, we pattern match on Free
which is a program containing instructions
so apply is called on each of the instructions, so Get is translated to Task[Option[String]], Put is translated to Task[Unit], and so on
after doing this translation (this is the map, in foldMap)
we combine them in the same way they were combined in the original program
Balaji Sivaraman @balajisivaraman Okay. That makes a whole lot of sense. Thanks a lot. And the translator in our case is the Natural Transformation.
Fabio Labella @SystemFw
exactly, A is the return type of the whole program
B is the return type of each individual instructions, that's why we need the polymorphism
for Category Theory reasons that are (as I hope I've proved) not necessary to understand how Free works to an usable extent, Translator is the natural transformation:
trait ~>[F[_], G[_]] {
def apply[A](f: F[A]): F[B]
}Balaji Sivaraman @balajisivaraman Nothing. I got it. (I think.) Thanks a lot. And the interpreter for Doobie is the one in transactor: ConnectionOp ~> Kleisli[M, Connection, ?]. This will translate any program in ConnectionIO to a Kleisli program. In my case, it will be a Kleisli[Task, Connection, Unit]. And when I run that, I will get a Task[Unit], which is the final output type.
Fabio Labella @SystemFw
yes
you need the type lambda because G[_] is of higher kind in ~>: you still need to be polymorphic in B
Balaji Sivaraman @balajisivaraman Ok. So we need that to mean Kleisli[M, Connection, _], substitute whatever type you want for the underscore. I assume this is also because of a limitation of Scala.
Fabio Labella @SystemFw
so Kleisli[M, Connection, Unit] is wrong, you need type Target[A] = Kleisli[M, Connection, A] and ConnectionOp ~> Target (note Target, not Target[Unit]). The type lambda is a shorthand for that
well, it's more like we put an ?, instead of an _
so it's inconsistent (_ means several different things in the scala type system, which is unfortunate)
but we can do it
Kleisli[M, Connection, _] is written Kleisli[M, Connection, ?]
and there's your type lambda
Balaji Sivaraman @balajisivaraman Okay. I got it. Just to clarify, based on my reading, that would be the same astype SomeType[A] = Kleisli[M, Connection, A], which falls neatly into the "shape" G[_].
Balaji Sivaraman @balajisivaraman Actually, your entire explanation from my question has made a whole lot of sense. Thank you for taking the time to do it. π I now feel I'm in a position to dig deeper into not only the fuseMap function, but also the surface-level of Doobie.
Fabio Labella @SystemFw
yes, translate is another variation of this pattern
when you want to go from Free[Lang1, A] to Free[Lang2, A]
so basically just change the instruction set
ConnectionIO is Free
fs2 Stream has (a lot of) extra machinery, but the basic idea is the same
we parameterise over the effect F, and then in the end interpret to an F[A]
Balaji Sivaraman @balajisivaraman I will revisit both fuseMap and also poke around Doobie and Scalaz-Stream.
Fabio Labella @SystemFw Yes, the important thing is to grasp the key concepts:
- A Free monad is always translated to another monad, which does the actual work
- The process of interpretation takes a translation between instructions, and gives a translation between programs
- You need higher-rank polymorphism (the trait trick)
- everything follows by having things fit into the right shape
Fabio Labella @SystemFw @ZizhengTai Thank you! :-) I will put this in a gist for future consultation, I think conversations are a great way to learn
Wonderfull explanation!