Sonntag, 10. Juli 2011

From Functions to Monads in Scala

"Monad" that's a word you hear quite often these days. There is a lot of unjustified FUD whenever this word pops up. People seem to be downright scared about it. You can tell by the amount of rhetorical trick words people use to scare others away from trying to understand monads at all. "Academic", "no use in real world programming" etc etc. I find it quite frustrating to hear something like that even within the Scala community from time to time (even though this is getting rarer and rarer), since these are the "arguments" people bring up when they try to scare others away from Scala.

First of all, I'm not a category theorist. These things just fascinate me, and I enjoy reading about this in my free time. I'm your ordinary "Joe, the programmer" working at a small software company.

Still, I find these things useful in my day job. Why? Because as a programmer I'm building mathematical machines (namely software) all day long and these concepts help me to order my thoughts and make at easier to reason about my software and design stuff.

In this blogpost I want to show how to derive Monads from what you already know and give names to things you maybe never even thought about.


Since monads come from Category Theory we first have to define what a category is.

A category C basically consists of 2 sets ob(C) and hom(C) where we call ob(C) the objects of C and hom(C) the morphisms of C. Morphisms are maps that go between objects.

Never saw a Category in your every day programming? Sure? There is a Category called Hask, its objects are the Haskell types and its morphisms are Haskell functions. So you can think about that in a Scala context:

Let f be a function of type Int => String. Int and String are both elements of ob(C) and f is an element of hom(C).

There's more to a Category, we need to be able to compose morphisms and there has to be a special element called identity. Lets looks at composition first.

Let f and g be functions:

f: A => B

g: B => C

We can compose those 2 functions in Scala by using the compose method to get a new function.

g compose f // yields a function of type A => C

OK, that's half the rent. Now we just need the special element called identity which basically is a function that does nothing. It just takes a value and returns it unmodified. So when we compose a function f with identity we get back a function that is equivalent to f and it should not matter if we compose f with identity or identity with f.

In short: The following 3 functions are equivalent (which means: when fed with the same input values they will yield the exact same result)

f // A => B

f compose id // A => B

id compose f // A => B

Now we know that a Category is (well, at least we know enough to start, a real category theorist can take you on a breathtaking journey from here :)). Let's move on!

Enter Higher-Kinded-Types

Next "scary word", but again: You use them all the time. First we need to know what a kind is. We know what types and values are. You can think of them as levels. A value like 1 and x => x (that's a function value, yes those are values too!) live a the value level while types like Int and Int => Int live at the type level. We group values into types. true and false are both values of type Boolean (think of types as sets of values).

OK, if we group values into types, can we group types? Yes, we can! We can group types into… *drumroll* KINDS! Now we have a third level and our world looks something like this:






In most languages we only deal with one kind called * (pronounced type) that groups all types, but there are languages that allow infinite levels (I might cover one of them in future blog posts).

We now have a basic idea of what kinds are, so what are higher kinded types? Like High-Order-Functions (functions that take functions as arguments and yield functions e.g. compose)
higher kinded types are type constructors that take type constructors as arguments and yield types (or maybe even type constructors).

"Wait a minute… are you talking about generics?"

Think about List. It's not a type by itself, you need to feed it a type as an argument to get a type. This is called a Type Constructor.List takes one type and returns a type, hence List has the kind * -> * (this notation is borrowed from Haskell). The Scala notation for a type constructor like List is List[_].

So a higher kinded type looks something like this: trait Functor[F[_]] where F could be a List, Option or any other type constructor that takes one argument.


Now that we have higher kinded types, let's take a look at our functions f and g from the beginning and lets add a type constructor F[_] to the mix (this is Scala notation and means: takes one type and yields a type). We can now produce a whole new set of types with this higher kind (e.g. F[Int], F[String] etc.). We could now create functions that work with these types… hey I know this. This sounds like… a Category! Yes, it is! It's actually a Subcategory of our Hask Category we introduced at the beginning. Let's call this new Category D (so we don't confuse it with the type constructor F). So elements of ob(D) would be something like F[A] and for hom(D) it would be something like F[A] => F[B].

Now wouldn't it be cool if there was a morphism between those two categories, one that preservers composition and the identity rule? Something that could map a function of type A => B to a function of type F[A] => F[B]? That's what's we call a Functor, it's a morphism between categories.

"Why is this useful?" you might ask. Just think about the kind of code reuse you could achieve. Let's look at Option as a concrete example. It's a type constructor. So how do we map a function of type Int => Int to a function of type Option[Int] => Option[Int]? Can't you tell by now? The answer is the map method of Option ;).

Let's check it out:

scala> val f = (_: Int) + 1

f: (Int) => Int = <function1>

scala> Some(1) map f

res0: Option[Int] = Some(2)

We could now reuse our function f with Option, there was no need to write a special version of f that works with Option.

The second ingredient we need to make our Functor complete is a morphism that maps a value of type A to a value of type F[A]. In the case of Option this is just the factory function:

scala> Some(_: Int)

res0: (Int) => Some[Int] = 

scala> Option(_: Int)

res1: (Int) => Option[Int] = 

scala> Option(1)

res2: Option[Int] = Some(1)

scala> Some(1)

res3: Some[Int] = Some(1)


Now, let's assume that we want something like this:

scala> Some(1) map {

     | case x if x % 2 == 0 => Some("YES!")

     | case _               => None

     | }

res1: Option[Option[java.lang.String]] = Some(None)

We want to use a function of type Int => Option[String] with Option's map method. But what's that? We get an Option[Option[String]]. That's stupid, now I have to unpack one Option to get the result I actually wanted.

What happened here? Remember how we mapped a function of type A => B to a function of type F[A] => F[B] in the previous section about functors? What we did now is: we mapped a function of type Int => Option[String] to a function of Option[Int] => Option[Option[String]]. Yikes, we went a little bit over the top here, huh? We kind of like mapped something into an Option and into an Option this is how we ended up with an Option[Option[String]]. You can think of this as a result of a composition of 2 Option Functors.

This is a special kind of Functor that maps a category into itself and is called an Endofunctor

Natural Transformations

Now that we have morphisms between objects of a Category which are Scala functions and Functors which are basically morphisms between categories it's time to introduce morphisms between Functors.

Why do we need this? Well in the last example we ended up with an Endofunctor (we mapped a Category into itself). We need to get rid of one of the Options. We do this with a morphism that maps the composition of 2 Functors F (F composed with F) to the Functor F.

Lets do a concrete example that involves List

scala> List(1,2,3) map { x => List(x + 1) }

res0: List[List[Int]] = List(List(2), List(3), List(4))

It happened again, but this time we ended up with a List[List[Int]]! We now need a Natural Transformation to join those lists together, this Natural Transformation is called flatten in Scala.

scala> List(1,2,3) map { x => List(x + 1) } flatten

res1: List[Int] = List(2, 3, 4)

List comes with a method that does both in one step, it's called flatMap.

scala> List(1,2,3) flatMap { x => List(x + 1) }

res2: List[Int] = List(2, 3, 4)

This enables us to chain functions together in a very convenient way.

scala> Some(1) flatMap { x => Some(2) flatMap { y => Some(x + y) } }

res3: Option[Int] = Some(3)

or a little more concise

scala> Some(1) flatMap { x => Some(2) map { y => x + y } }

res4: Option[Int] = Some(3)


This is actually what makes up a Monad. It's an Endofunctor with two Natural Transformations called unit and join (which is called flatten in Scala). They are such a fundamental concept that Scala features a syntactic sugar for them, the for-comprehension. Do NOT confuse this with a mere for-loop, this is a different animal. The last example above can be written in the following way:

scala> for {

     | x <- br="" some="">     | y <- br="" some="">     | } yield x + y

res5: Option[Int] = Some(3)

//translates to

scala> Some(1) flatMap { x => Some(2) map { y => x + y } }

res6: Option[Int] = Some(3)


This is an introduction on how to think about monads and how to derive them from simple building blocks. They are a very powerful abstraction that pops up all the time. You use them all the time without knowing. Since programming is all about abstraction it's useful to recognize those underlying patterns. Don't let others fool you into believing that this is just some bizarre functional stuff that has no use in the real world. This is just the tip of the iceberg, an amazing journey lies ahead. Let's learn!

Samstag, 9. Juli 2011

DTrace and it's impact on JVM performance

So, I did this blogpost that gave a very shallow introduction to what DTrace can do with the JVM and I got an amazing feedback after that. Actually I'm quite flattered that it had such an impact.

One message I received pointed out that DTrace probes are having a significant impact on the JVM's performance, he (the author) would never recommend using the DTrace JVM integration, and then he pointed me… well, a product by his own company (which shall remain nameless, since I don't want to advertise commercial products on this blog). To be fair: I admit that I would do the same. When you put a lot of effort into a piece of software you do it for reasons you believe in.

However, he gave me a link to some benchmarks that showed some impressive results and diagrams about how much the performance of the JVM suffers when DTrace probes are enabled. The bar diagrams were scary, DTrace looked pretty bad compared to the "rival" technology (you can't really call it a rival since DTrace has a completely different objective). But something was fishy about this, and that was: the axes of the diagrams were not labeled. They did show a small blue bar and a big green bar and nothing else. The code for the test case was provided as a gif (no copy and paste to reproduce the results nice and easy). Numbers were not put into any perspective. And blog comments were disabled.

None the less, this was interesting enough to start a little bit of research on the topic.

The benchmarks seemed to focus on how much enabled probes did slow down method calls. I personally don't like benchmarks that use extremely unrealistic cases as a foundation (look how fast I can increment them integers in a tight loop suckaz! Mah language iz teh fast!) but this time I will just go with the flow because this is pretty much what they did in that benchmark (don't try this at home kids, use realistic conditions to benchmark your stuff). I'm not using the same test code here but the idea seems to be pretty much the same.

The system I'm running this on is a Thinkpad X201 with a core i7 and 8 gigs of RAM (yeah I know, I'm just compensating, get over it ;)). The operating system is OpenIndiana Build 151-beta with DTrace 1.7. Java is at 1.6.0_25.

The stupid test case I will be using is this Java program:

class Test {

public int callTest(int i) {
if (i != 0)
callTest(i - 1);

return i;

public static void main(String[] args) {
Test t = new Test();

long starttime = System.currentTimeMillis();
for (int i = 0; i < 1000000; i++)
long endtime = System.currentTimeMillis();

System.out.println(endtime - starttime);

Once again, this is not a realistic example. I will just use it to demonstrate that there really is an impact.

> java Test
> java -XX:+ExtendedDTraceProbes Test

Wow, this really seems to hurt… the programm is about 35 times slower with DTrace probes enabled.

To put this into perspective I will commit a capital crime in programming. I will compare this stupid program in language A to the same stupid program written in language B. Let language B be Python in this case. To be crystal clear about this: this is NOT a comparison of Python and Java performance. I just need some landmarks to make all those numbers meaningful (at least to some extent since this is a very badly chosen scenario to begin with).

Here is our Python test case.

import time

def callTest(i):
if not i == 0:
callTest(i - 1)

return i

r = range(0,1000000)

starttime = time.time()

for i in r:

endtime = time.time()
print (endtime - starttime)

And the result:

> python

OK, our software runs slower with probes enabled, but we are still faster than Python and Python's performance is acceptable for a lot of use cases. We now have a slower JVM that can be instrumented. So I'd say: No real harm done.

Now let's use those probes and aggregate some real data. For this test I will use a slightly modified version of j_methodcalls.d, a script by Brendan Gregg that is shipped with the DTrace Toolkit. The script is licensed under CDDL but I did remove the license header here to make it more concise and blog-friendly.

#!/usr/sbin/dtrace -Zs

this->class = (char *)copyin(arg1, arg2 + 1);
this->class[arg2] = '\0';
this->method = (char *)copyin(arg3, arg4 + 1);
this->method[arg4] = '\0';
this->name = strjoin(strjoin(stringof(this->class), "."),
@calls[pid, this->name] = count();

Let's run it!

> pfexec ./j_methodcalls.d -c "java -XX:+ExtendedDTraceProbes Test"

OK, this is A LOT slower, even slower than Python. 4 Minutes!

We are now aggregating data and that means we are copying data from userspace into kernelspace from where it can be fetched by DTrace consumers (in our case the dtrace command line tool). What data did we get in this amount of time? Actually: A LOT. It flooded my terminal and I had to pipe the result into less to be able to read all of it. DTrace recorded every method call that happened in the JVM while the program was running. It counted the calls per method, and the class the method belongs to. Keep in mind that we did not need to modify our code to get these results. We don't even need to restart a running JVM to enable the probes we can activate them by using the jinfo command. And we could have used DTrace to gather system wide data in the same script (something I might demonstrate sometime on this blog)

Now lets use the most naive debugging technique on earth. We will print out "CALLED!" every time our callTest method gets called (if you ever did this before you know how disastrous the result well be). This gives us pretty much no information. We just know that a particular method has been called and we need to modify our code, recompile and load it into running JVM.

> time java Test

real 1h39m18.15s
user 3m53.29s
sys 7m55.68s

As we expected, the result is a disaster. Calling print in a tight loop is an extremely stupid thing to do. We could have used a counter that get incremented with every method call, proxy objects, interceptors etc. (all of them would have been significantly faster).

To do something similar like the print example with DTrace I just a another clause to the script:

tick-1s {

This addition prints out what happened in 1 second intervals

1 75779 :tick-1s
4028 Test.callTest 400379

1 75779 :tick-1s
4028 Test.callTest 404720

1 75779 :tick-1s
4028 Test.callTest 402135

1 75779 :tick-1s
4028 Test.callTest 398934

dtrace: pid 4028 has exited

real 4m14.23s
user 4m13.89s
sys 0m0.46s

The performance impact stays pretty much the same with DTrace, we are done in 4 Minutes while we are presented with a readable stream of information.

There are a lot of ways to generate similar data, but most of them require code changes, are not able to do system wide tracing, are limited to one process and/or just one specific runtime.


Tracing the JVM costs (this shows especially in this pathological use case), but DTrace provides us with a very broad spectrum of probes. The JVM ones are just one source of data. We can actually instrument every part of the system with our DTrace script. Maybe a problem is not even related to our program at all, maybe it's NFS misbehaving, something is wrong with the database or there is some heavy IO going on. With DTrace the whole system becomes transparent. This changes the whole "blame game" and that's the whole point of DTrace. Looking at the system as a whole.

The bottom line is: trace the JVM only if you need to and be aware of the performance impact. This tool is for hunting down problems that are very hard or even impossible to analyze with traditional tools. I did use it to trace obscure memory leaks and dead-locks (both in non-Java contexts) and I was able to exactly pinpoint the culprit.

Don't use DTrace when there is a tool that does a better job for this specific task. Use it wisely. Anyway, it's a great utility to have in your toolbox.

Last but not least: use realistic use cases for benchmarking, label your diagram axes, and compare tools that have the same objective.

Mittwoch, 6. Juli 2011

DTrace and Scala

DTrace is one of my favorite technologies ever, it absolutely redefined my view on software and how it can be debugged. In a lot of situations you are basically forced to debug your software by vomiting print-statements just all over the place. To make things worse you have to take them out when you are actually shipping your software, leaving you blind to errors to come. In other cases you probably are missing some print-statements to trace down a very nasty bug, or the bug is hidden in some 3rd-party lib that would take hours to recompile with your new print statement (don't start thinking about libs that you don't have the code of…). In most cases this can become a very unpleasant situation (to put it mildly). DTrace takes a lot of the hassle away by providing you with a mechanism that can modify your program while it's running without stopping it.
Pretty amazing, huh? Wait, it gets better. You can even trace into the runtime of some high level languages (if they provide the probes). This is also true for the JVM, and that means we can instrument a running Scala program.

In this post I will walk you through a very basic script that shows what methods from Predef get called by your program. DTrace is a tool that reaches deep down into the lower levels of your machine. Sounds scary? Nah not really, one of the design goals of DTrace is safety. Your scripts run in a managed environment that keeps it from doing harmful things.

OK, enough sweet-talk. Time to get our hands dirty by writing a very simple Scala program:

import scala.annotation.tailrec

object Main {
def sayHallo() {

def main(args: Array[String]) {

This just prints out "HALLO!" in 1 second intervals (not exactly rocket science, but I put a little sugar on top of it by replacing the while loop with a tail recursive function for fun and profit).

What's that? When running my program DTrace is not showing me ANY probes!!?!?! FFFFUUUUUUUU! That's because we have to enable them first, we can instruct a running JVM to do this by using jinfo. Since I only got one JVM running on this box I will fetch the PID with pgrep.

jinfo -flag +ExtendedDTraceProbes $(pgrep java)

The JVM probes are now armed and "dangerous" (just kiddin') and you will have access to the hotspot provider.

Now lets write the DTrace script. Keep in mind: this script is running in kernel space, so we have to copy in some information from userspace, we do this by using copyin and we have to NULL-terminate the strings ourselves. Yep, this is what it feels like to program low-level stuff, it's not as pretty as FP but aaaaaaaaaanyway, here is the little bugger.

#!/usr/sbin/dtrace -s

#pragma D option quiet

this->class = (char *) copyin(arg1, arg2 + 1);
this->class[arg2] = '\0';
self->tracefunc = stringof(this->class);

/self->tracefunc == "scala/Predef$"/
this->method = (char *) copyin(arg3, arg4 + 1);
this->method[arg4] = '\0';
printf("%s %Y\n", stringof(this->method), walltimestamp);
self->tracefunc = 0;

self->tracefunc = 0;

This thing will fire whenever a function from Predef is called and will give us the function name (in our test case this is just println) and the time when this function was being called. I run this on OpenIndiana build151-beta by issuing pfexec dtrace ./tracescript.d -p $(pgrep java) after I enabled the hotspot provider on the JVM. (pfexec is kind of like sudo, just use whatever gives you the permission to run dtrace on your box)
The output will look like this:

println 2011 Jul 6 21:27:34
println 2011 Jul 6 21:27:35
println 2011 Jul 6 21:27:36
println 2011 Jul 6 21:27:37
println 2011 Jul 6 21:27:38
println 2011 Jul 6 21:27:39
println 2011 Jul 6 21:27:40
println 2011 Jul 6 21:27:41
println 2011 Jul 6 21:27:42
println 2011 Jul 6 21:27:43
println 2011 Jul 6 21:27:44
println 2011 Jul 6 21:27:45
println 2011 Jul 6 21:27:46
println 2011 Jul 6 21:27:47
println 2011 Jul 6 21:27:48
println 2011 Jul 6 21:27:49
println 2011 Jul 6 21:27:50
println 2011 Jul 6 21:27:51
println 2011 Jul 6 21:27:52
println 2011 Jul 6 21:27:53
println 2011 Jul 6 21:27:54
println 2011 Jul 6 21:27:55


OK, this is not even the tip of the iceberg but I think I will wrap it up because there is a lot of ground to cover when it comes to DTrace. If you are hungry for more you should check out the "DTrace Review" by @bcantrill, this stuff will blow your mind (seriously, WATCH IT!) or buy the book by @brendangregg. I will make sure to dig deeper on the topic, so stay tuned. Tell me what you think, good tool or best tool? :P

English, please!

It took me quite some time to make up my mind about this… should I switch the language of this blog to English or should leave it just the way it is. I think it's worth to give this a try, since twitter brought me towards a mostly non-German audience which I also would like to address. This is also a great opportunity to polish my language skills a little since they got kind of neglected in the last few years (so bear with me ;)

So, to all my new readers, here is a little overview on what this blog is about. I work as a developer at a German software company so quite a lot of stuff you will find here will focus on programming and debugging. In my free time I like to play around with different operating systems (mainly Illumos, FreeBSD and DragonFlyBSD) so this place will be stuffed with information about OS specific wizardry :3

OK, folks! Let's give this a shot!