## Sonntag, 10. Juli 2011

### From Functions to Monads in Scala

"Monad" that's a word you hear quite often these days. There is a lot of unjustified FUD whenever this word pops up. People seem to be downright scared about it. You can tell by the amount of rhetorical trick words people use to scare others away from trying to understand monads at all. "Academic", "no use in real world programming" etc etc. I find it quite frustrating to hear something like that even within the Scala community from time to time (even though this is getting rarer and rarer), since these are the "arguments" people bring up when they try to scare others away from Scala.

First of all, I'm not a category theorist. These things just fascinate me, and I enjoy reading about this in my free time. I'm your ordinary "Joe, the programmer" working at a small software company.

Still, I find these things useful in my day job. Why? Because as a programmer I'm building mathematical machines (namely software) all day long and these concepts help me to order my thoughts and make at easier to reason about my software and design stuff.

In this blogpost I want to show how to derive Monads from what you already know and give names to things you maybe never even thought about.

Categories

Since monads come from Category Theory we first have to define what a category is.

A category C basically consists of 2 sets ob(C) and hom(C) where we call ob(C) the objects of C and hom(C) the morphisms of C. Morphisms are maps that go between objects.

Never saw a Category in your every day programming? Sure? There is a Category called Hask, its objects are the Haskell types and its morphisms are Haskell functions. So you can think about that in a Scala context:

Let `f` be a function of type `Int => String`. `Int` and `String` are both elements of ob(C) and `f` is an element of hom(C).

There's more to a Category, we need to be able to compose morphisms and there has to be a special element called identity. Lets looks at composition first.

Let f and g be functions:
``````

f: A => B

g: B => C
``````

We can compose those 2 functions in Scala by using the `compose` method to get a new function.
``````

g compose f // yields a function of type A => C
``````

OK, that's half the rent. Now we just need the special element called identity which basically is a function that does nothing. It just takes a value and returns it unmodified. So when we `compose` a function `f` with `identity` we get back a function that is equivalent to `f` and it should not matter if we compose `f` with `identity` or `identity` with `f`.

In short: The following 3 functions are equivalent (which means: when fed with the same input values they will yield the exact same result)
``````

f // A => B

f compose id // A => B

id compose f // A => B
``````

Now we know that a Category is (well, at least we know enough to start, a real category theorist can take you on a breathtaking journey from here :)). Let's move on!

Enter Higher-Kinded-Types

Next "scary word", but again: You use them all the time. First we need to know what a kind is. We know what types and values are. You can think of them as levels. A value like `1` and `x => x` (that's a function value, yes those are values too!) live a the value level while types like `Int` and `Int => Int` live at the type level. We group values into types. `true` and `false` are both values of type `Boolean` (think of types as sets of values).

OK, if we group values into types, can we group types? Yes, we can! We can group types into… *drumroll* KINDS! Now we have a third level and our world looks something like this:
``````

kinds

---------

types

---------

values
``````

In most languages we only deal with one kind called `*` (pronounced type) that groups all types, but there are languages that allow infinite levels (I might cover one of them in future blog posts).

We now have a basic idea of what kinds are, so what are higher kinded types? Like High-Order-Functions (functions that take functions as arguments and yield functions e.g. compose)
higher kinded types are type constructors that take type constructors as arguments and yield types (or maybe even type constructors).

"Wait a minute… are you talking about generics?"

Think about `List`. It's not a type by itself, you need to feed it a type as an argument to get a type. This is called a Type Constructor.`List` takes one type and returns a type, hence `List` has the kind `* -> *` (this notation is borrowed from Haskell). The Scala notation for a type constructor like `List` is `List[_]`.

So a higher kinded type looks something like this: `trait Functor[F[_]]` where F could be a `List`, `Option` or any other type constructor that takes one argument.

Functors

Now that we have higher kinded types, let's take a look at our functions `f` and `g` from the beginning and lets add a type constructor `F[_]` to the mix (this is Scala notation and means: takes one type and yields a type). We can now produce a whole new set of types with this higher kind (e.g. `F[Int]`, `F[String]` etc.). We could now create functions that work with these types… hey I know this. This sounds like… a Category! Yes, it is! It's actually a Subcategory of our Hask Category we introduced at the beginning. Let's call this new Category D (so we don't confuse it with the type constructor F). So elements of ob(D) would be something like F[A] and for hom(D) it would be something like F[A] => F[B].

Now wouldn't it be cool if there was a morphism between those two categories, one that preservers composition and the identity rule? Something that could map a function of type `A => B` to a function of type `F[A] => F[B]`? That's what's we call a Functor, it's a morphism between categories.

"Why is this useful?" you might ask. Just think about the kind of code reuse you could achieve. Let's look at `Option` as a concrete example. It's a type constructor. So how do we map a function of type `Int => Int` to a function of type `Option[Int] => Option[Int]`? Can't you tell by now? The answer is the `map` method of Option ;).

Let's check it out:
``````

scala> val f = (_: Int) + 1

f: (Int) => Int = <function1>

scala> Some(1) map f

res0: Option[Int] = Some(2)
``````

We could now reuse our function `f` with `Option`, there was no need to write a special version of `f` that works with `Option`.

The second ingredient we need to make our Functor complete is a morphism that maps a value of type `A` to a value of type `F[A]`. In the case of `Option` this is just the factory function:
``````

scala> Some(_: Int)

res0: (Int) => Some[Int] =

scala> Option(_: Int)

res1: (Int) => Option[Int] =

scala> Option(1)

res2: Option[Int] = Some(1)

scala> Some(1)

res3: Some[Int] = Some(1)
``````

Endofunctors

Now, let's assume that we want something like this:
``````

scala> Some(1) map {

| case x if x % 2 == 0 => Some("YES!")

| case _               => None

| }

res1: Option[Option[java.lang.String]] = Some(None)
``````

We want to use a function of type `Int => Option[String]` with `Option`'s `map` method. But what's that? We get an `Option[Option[String]]`. That's stupid, now I have to unpack one `Option` to get the result I actually wanted.

What happened here? Remember how we mapped a function of type `A => B` to a function of type `F[A] => F[B]` in the previous section about functors? What we did now is: we mapped a function of type `Int => Option[String]` to a function of `Option[Int] => Option[Option[String]]`. Yikes, we went a little bit over the top here, huh? We kind of like mapped something into an `Option` and into an `Option` this is how we ended up with an `Option[Option[String]]`. You can think of this as a result of a composition of 2 `Option` Functors.

This is a special kind of Functor that maps a category into itself and is called an Endofunctor

Natural Transformations

Now that we have morphisms between objects of a Category which are Scala functions and Functors which are basically morphisms between categories it's time to introduce morphisms between Functors.

Why do we need this? Well in the last example we ended up with an Endofunctor (we mapped a Category into itself). We need to get rid of one of the `Option`s. We do this with a morphism that maps the composition of 2 Functors F (F composed with F) to the Functor F.

Lets do a concrete example that involves `List`
``````

scala> List(1,2,3) map { x => List(x + 1) }

res0: List[List[Int]] = List(List(2), List(3), List(4))
``````

It happened again, but this time we ended up with a `List[List[Int]]`! We now need a Natural Transformation to join those lists together, this Natural Transformation is called `flatten` in Scala.
``````

scala> List(1,2,3) map { x => List(x + 1) } flatten

res1: List[Int] = List(2, 3, 4)
``````

`List` comes with a method that does both in one step, it's called `flatMap`.
``````

scala> List(1,2,3) flatMap { x => List(x + 1) }

res2: List[Int] = List(2, 3, 4)
``````

This enables us to chain functions together in a very convenient way.
``````

scala> Some(1) flatMap { x => Some(2) flatMap { y => Some(x + y) } }

res3: Option[Int] = Some(3)
``````

or a little more concise
``````

scala> Some(1) flatMap { x => Some(2) map { y => x + y } }

res4: Option[Int] = Some(3)
``````

This is actually what makes up a Monad. It's an Endofunctor with two Natural Transformations called unit and join (which is called `flatten` in Scala). They are such a fundamental concept that Scala features a syntactic sugar for them, the `for`-comprehension. Do NOT confuse this with a mere `for`-loop, this is a different animal. The last example above can be written in the following way:
``````

scala> for {

| x <- br="" some="">     | y <- br="" some="">     | } yield x + y

res5: Option[Int] = Some(3)

//translates to

scala> Some(1) flatMap { x => Some(2) map { y => x + y } }

res6: Option[Int] = Some(3)

``````

Conclusion

This is an introduction on how to think about monads and how to derive them from simple building blocks. They are a very powerful abstraction that pops up all the time. You use them all the time without knowing. Since programming is all about abstraction it's useful to recognize those underlying patterns. Don't let others fool you into believing that this is just some bizarre functional stuff that has no use in the real world. This is just the tip of the iceberg, an amazing journey lies ahead. Let's learn!

## Samstag, 9. Juli 2011

### DTrace and it's impact on JVM performance

So, I did this blogpost that gave a very shallow introduction to what DTrace can do with the JVM and I got an amazing feedback after that. Actually I'm quite flattered that it had such an impact.

One message I received pointed out that DTrace probes are having a significant impact on the JVM's performance, he (the author) would never recommend using the DTrace JVM integration, and then he pointed me… well, a product by his own company (which shall remain nameless, since I don't want to advertise commercial products on this blog). To be fair: I admit that I would do the same. When you put a lot of effort into a piece of software you do it for reasons you believe in.

However, he gave me a link to some benchmarks that showed some impressive results and diagrams about how much the performance of the JVM suffers when DTrace probes are enabled. The bar diagrams were scary, DTrace looked pretty bad compared to the "rival" technology (you can't really call it a rival since DTrace has a completely different objective). But something was fishy about this, and that was: the axes of the diagrams were not labeled. They did show a small blue bar and a big green bar and nothing else. The code for the test case was provided as a gif (no copy and paste to reproduce the results nice and easy). Numbers were not put into any perspective. And blog comments were disabled.

None the less, this was interesting enough to start a little bit of research on the topic.

The benchmarks seemed to focus on how much enabled probes did slow down method calls. I personally don't like benchmarks that use extremely unrealistic cases as a foundation (look how fast I can increment them integers in a tight loop suckaz! Mah language iz teh fast!) but this time I will just go with the flow because this is pretty much what they did in that benchmark (don't try this at home kids, use realistic conditions to benchmark your stuff). I'm not using the same test code here but the idea seems to be pretty much the same.

The system I'm running this on is a Thinkpad X201 with a core i7 and 8 gigs of RAM (yeah I know, I'm just compensating, get over it ;)). The operating system is OpenIndiana Build 151-beta with DTrace 1.7. Java is at 1.6.0_25.

The stupid test case I will be using is this Java program:
``class Test {  public int callTest(int i) {    if (i != 0)      callTest(i - 1);    return i;  }  public static void main(String[] args) {    Test t = new Test();    long starttime = System.currentTimeMillis();    for (int i = 0; i < 1000000; i++)      callTest(100)    long endtime = System.currentTimeMillis();    System.out.println(endtime - starttime);  }}``

Once again, this is not a realistic example. I will just use it to demonstrate that there really is an impact.
``> java Test118> java -XX:+ExtendedDTraceProbes Test4106``

Wow, this really seems to hurt… the programm is about 35 times slower with DTrace probes enabled.

To put this into perspective I will commit a capital crime in programming. I will compare this stupid program in language A to the same stupid program written in language B. Let language B be Python in this case. To be crystal clear about this: this is NOT a comparison of Python and Java performance. I just need some landmarks to make all those numbers meaningful (at least to some extent since this is a very badly chosen scenario to begin with).

Here is our Python test case.
``import time                                                                     def callTest(i):  if not i == 0:    callTest(i - 1)  return ir = range(0,1000000)starttime = time.time()for i in r:  callTest(100)endtime = time.time()print (endtime - starttime)``

And the result:
``> python test.py21.9892270565``

OK, our software runs slower with probes enabled, but we are still faster than Python and Python's performance is acceptable for a lot of use cases. We now have a slower JVM that can be instrumented. So I'd say: No real harm done.

Now let's use those probes and aggregate some real data. For this test I will use a slightly modified version of `j_methodcalls.d`, a script by Brendan Gregg that is shipped with the DTrace Toolkit. The script is licensed under CDDL but I did remove the license header here to make it more concise and blog-friendly.
``#!/usr/sbin/dtrace -Zs                                                          hotspot\$target:::method-entry{  this->class = (char *)copyin(arg1, arg2 + 1);   this->class[arg2] = '\0';  this->method = (char *)copyin(arg3, arg4 + 1);   this->method[arg4] = '\0';  this->name = strjoin(strjoin(stringof(this->class), "."),                       stringof(this->method));  @calls[pid, this->name] = count();}``

Let's run it!
``> pfexec ./j_methodcalls.d -c "java -XX:+ExtendedDTraceProbes Test"[…snip…]249126``

OK, this is A LOT slower, even slower than Python. 4 Minutes!

We are now aggregating data and that means we are copying data from userspace into kernelspace from where it can be fetched by DTrace consumers (in our case the `dtrace` command line tool). What data did we get in this amount of time? Actually: A LOT. It flooded my terminal and I had to pipe the result into `less` to be able to read all of it. DTrace recorded every method call that happened in the JVM while the program was running. It counted the calls per method, and the class the method belongs to. Keep in mind that we did not need to modify our code to get these results. We don't even need to restart a running JVM to enable the probes we can activate them by using the `jinfo` command. And we could have used DTrace to gather system wide data in the same script (something I might demonstrate sometime on this blog)

Now lets use the most naive debugging technique on earth. We will print out "CALLED!" every time our `callTest` method gets called (if you ever did this before you know how disastrous the result well be). This gives us pretty much no information. We just know that a particular method has been called and we need to modify our code, recompile and load it into running JVM.
``> time java Test[…snip…]CALLED!5958514real 1h39m18.15suser 3m53.29ssys 7m55.68s``

As we expected, the result is a disaster. Calling print in a tight loop is an extremely stupid thing to do. We could have used a counter that get incremented with every method call, proxy objects, interceptors etc. (all of them would have been significantly faster).

To do something similar like the print example with DTrace I just a another clause to the script:
``tick-1s {  printa(@calls);  trunc(@calls);}``

This addition prints out what happened in 1 second intervals
``  1  75779               :tick-1s      4028  Test.callTest          400379  1  75779               :tick-1s      4028  Test.callTest          404720  1  75779               :tick-1s      4028  Test.callTest          402135  1  75779               :tick-1s     4028  Test.callTest          398934253064dtrace: pid 4028 has exitedreal 4m14.23suser 4m13.89ssys 0m0.46s``

The performance impact stays pretty much the same with DTrace, we are done in 4 Minutes while we are presented with a readable stream of information.

There are a lot of ways to generate similar data, but most of them require code changes, are not able to do system wide tracing, are limited to one process and/or just one specific runtime.

Conclusion

Tracing the JVM costs (this shows especially in this pathological use case), but DTrace provides us with a very broad spectrum of probes. The JVM ones are just one source of data. We can actually instrument every part of the system with our DTrace script. Maybe a problem is not even related to our program at all, maybe it's NFS misbehaving, something is wrong with the database or there is some heavy IO going on. With DTrace the whole system becomes transparent. This changes the whole "blame game" and that's the whole point of DTrace. Looking at the system as a whole.

The bottom line is: trace the JVM only if you need to and be aware of the performance impact. This tool is for hunting down problems that are very hard or even impossible to analyze with traditional tools. I did use it to trace obscure memory leaks and dead-locks (both in non-Java contexts) and I was able to exactly pinpoint the culprit.

Don't use DTrace when there is a tool that does a better job for this specific task. Use it wisely. Anyway, it's a great utility to have in your toolbox.

Last but not least: use realistic use cases for benchmarking, label your diagram axes, and compare tools that have the same objective.

## Mittwoch, 6. Juli 2011

### DTrace and Scala

DTrace is one of my favorite technologies ever, it absolutely redefined my view on software and how it can be debugged. In a lot of situations you are basically forced to debug your software by vomiting print-statements just all over the place. To make things worse you have to take them out when you are actually shipping your software, leaving you blind to errors to come. In other cases you probably are missing some print-statements to trace down a very nasty bug, or the bug is hidden in some 3rd-party lib that would take hours to recompile with your new print statement (don't start thinking about libs that you don't have the code of…). In most cases this can become a very unpleasant situation (to put it mildly). DTrace takes a lot of the hassle away by providing you with a mechanism that can modify your program while it's running without stopping it.
Pretty amazing, huh? Wait, it gets better. You can even trace into the runtime of some high level languages (if they provide the probes). This is also true for the JVM, and that means we can instrument a running Scala program.

In this post I will walk you through a very basic script that shows what methods from `Predef` get called by your program. DTrace is a tool that reaches deep down into the lower levels of your machine. Sounds scary? Nah not really, one of the design goals of DTrace is safety. Your scripts run in a managed environment that keeps it from doing harmful things.

OK, enough sweet-talk. Time to get our hands dirty by writing a very simple Scala program:
``import scala.annotation.tailrecobject Main {  @tailrec  def sayHallo() {    println("HALLO!")    Thread.sleep(1000)    sayHallo()  }                                                                                  def main(args: Array[String]) {    sayHallo()  }}``

This just prints out "HALLO!" in 1 second intervals (not exactly rocket science, but I put a little sugar on top of it by replacing the while loop with a tail recursive function for fun and profit).

What's that? When running my program DTrace is not showing me ANY probes!!?!?! FFFFUUUUUUUU! That's because we have to enable them first, we can instruct a running JVM to do this by using `jinfo`. Since I only got one JVM running on this box I will fetch the PID with `pgrep`.
``jinfo -flag +ExtendedDTraceProbes \$(pgrep java)``

The JVM probes are now armed and "dangerous" (just kiddin') and you will have access to the `hotspot` provider.

Now lets write the DTrace script. Keep in mind: this script is running in kernel space, so we have to copy in some information from userspace, we do this by using `copyin` and we have to `NULL`-terminate the strings ourselves. Yep, this is what it feels like to program low-level stuff, it's not as pretty as FP but aaaaaaaaaanyway, here is the little bugger.
``#!/usr/sbin/dtrace -s                                                           #pragma D option quiethotspot\$\$target:::method-entry{  this->class = (char *) copyin(arg1, arg2 + 1);   this->class[arg2] = '\0';  self->tracefunc = stringof(this->class);}hotspot\$\$target:::method-entry/self->tracefunc == "scala/Predef\$"/{  this->method = (char *) copyin(arg3, arg4 + 1);   this->method[arg4] = '\0';  printf("%s %Y\n", stringof(this->method), walltimestamp);  self->tracefunc = 0;}hotspot\$\$target:::method-entry/self->tracefunc/{  self->tracefunc = 0;}``

This thing will fire whenever a function from `Predef` is called and will give us the function name (in our test case this is just `println`) and the time when this function was being called. I run this on OpenIndiana build151-beta by issuing `pfexec dtrace ./tracescript.d -p \$(pgrep java)` after I enabled the `hotspot` provider on the JVM. (`pfexec` is kind of like `sudo`, just use whatever gives you the permission to run `dtrace` on your box)
The output will look like this:
``println 2011 Jul  6 21:27:34println 2011 Jul  6 21:27:35println 2011 Jul  6 21:27:36println 2011 Jul  6 21:27:37println 2011 Jul  6 21:27:38println 2011 Jul  6 21:27:39println 2011 Jul  6 21:27:40println 2011 Jul  6 21:27:41println 2011 Jul  6 21:27:42println 2011 Jul  6 21:27:43println 2011 Jul  6 21:27:44println 2011 Jul  6 21:27:45println 2011 Jul  6 21:27:46println 2011 Jul  6 21:27:47println 2011 Jul  6 21:27:48println 2011 Jul  6 21:27:49println 2011 Jul  6 21:27:50println 2011 Jul  6 21:27:51println 2011 Jul  6 21:27:52println 2011 Jul  6 21:27:53println 2011 Jul  6 21:27:54println 2011 Jul  6 21:27:55``

WTF IS THIS I DON'T EVEN!?!?!? TL;DR

OK, this is not even the tip of the iceberg but I think I will wrap it up because there is a lot of ground to cover when it comes to DTrace. If you are hungry for more you should check out the "DTrace Review" by @bcantrill, this stuff will blow your mind (seriously, WATCH IT!) or buy the book by @brendangregg. I will make sure to dig deeper on the topic, so stay tuned. Tell me what you think, good tool or best tool? :P