Java (and CF) garbage collection: Why should you care?

by kai on 27/08/2005



Just recently I was asked why one should care about JVM tuning, garbage collection and other “very low-level settings” (low-level from the point of view of the average CF developer, that is). I talk and write about these things pretty often, so it’s a reasonable question.

Let me give you an example with figures… Let’s assume we’re talking about a Java application using Java’s built-in garbage collector. Let’s further assume the garbage collector needs 0.1 percent of the application’s total working time. And finally let’s assume that the garbage collector runs in a single thread and that the application has to wait until it’s done. Then it’s quite obvious that the application itself has 99.9 percent of the total time to do whatever it is supposed to do.

Well… this is true for a single-CPU machine. Let’s have a look at what happens on a machine with four CPUs. To calculate that, some very clever computer scientists came up with a formula called Amdahl’s Law. It gives the speedup you get from N CPUs; divide that speedup by N and you get the efficiency of each single CPU in a setting like this:

eff = (1 / (F+((1-F)/N)))/N

with F the so-called serial fraction (the share of the work that cannot run in parallel, in our case the GC time) and N the number of CPUs.

F is to be specified in the form 0.1 for 10% or 0.001 for 0.1% etc.

Let’s do the example from above, 4 CPUs with 0.1% of the working time spent in garbage collection: N=4, F=0.001… eff=99.7%

Oops, we lost efficiency? How come? While the single GC thread is running, the other three CPUs have nothing to do. Let’s have a look at 32 CPUs: N=32, F=0.001… eff=96.99%

So, we lost even more efficiency. And to make that point perfectly clear: we’re not talking about any side effects of clustering or whatever… we’re still on one server, running multiple CPUs to get good performance.

If we increase F (which means that the fraction of garbage collection time increases), this effect becomes even more dramatic:

For F=0.01 (still just 1% of the time is GC time) and N=4, eff=97.1%. Not too bad, hmmm? But try it with N=32: eff=76.3%! So be careful: on highly parallel machines, things like a serial garbage collector in Java really might result in a massive performance loss!
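If you want to play with the numbers yourself, here is a minimal sketch of the formula in Java. The class and method names (AmdahlEfficiency, efficiency) are made up for this example and not part of any library; the calculation is just the formula from above.

```java
import java.util.Locale;

// Hypothetical helper class for illustration only.
class AmdahlEfficiency {

    /**
     * Per-CPU efficiency as defined in the text:
     * eff = (1 / (F + (1 - F) / N)) / N
     *
     * @param serialFraction F, e.g. 0.001 for 0.1% serial (GC) time
     * @param cpus           N, the number of CPUs
     */
    static double efficiency(double serialFraction, int cpus) {
        if (serialFraction < 0.0 || serialFraction > 1.0 || cpus < 1) {
            throw new IllegalArgumentException("need 0 <= F <= 1 and N >= 1");
        }
        // Amdahl's Law: speedup achievable with N CPUs
        double speedup = 1.0 / (serialFraction + (1.0 - serialFraction) / cpus);
        // Efficiency: how much of each CPU is actually doing useful work
        return speedup / cpus;
    }

    public static void main(String[] args) {
        double[] fractions = {0.001, 0.01};
        int[] cpuCounts = {1, 4, 32};
        for (double f : fractions) {
            for (int n : cpuCounts) {
                System.out.printf(Locale.ROOT, "F=%.3f N=%2d eff=%.2f%%%n",
                        f, n, 100.0 * efficiency(f, n));
            }
        }
    }
}
```

Running it prints 100.00% for N=1 in both cases, 99.70% and 96.99% for F=0.001 with 4 and 32 CPUs, and 97.09% and 76.34% for F=0.01. By the way, the formula simplifies to eff = 1 / (1 + F*(N-1)), which makes it easy to see why the loss grows with every CPU you add.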

One might ask what my point is. I’ll tell you: Luckily for all of us CF developers, both Sun and Macromedia were clever enough to think about this and tons of other potential issues. Sun provides parallel garbage collectors, and Macromedia presets a fitting GC engine for your CF server. But honestly: Do you think the efficiency issue I explained above is the only thing about Java, GC and JVMs which might have a positive or negative effect on your application? Or do you think all CF apps are equal in terms of memory usage, good development style, creation of internal Java objects etc.?

The answers are no and no, and that’s the reason why CF developers too (or at least the people maintaining their machines) should care! As a rule of thumb: the bigger and more complex your environment is, the more you should invest in research about potential flaws!
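As a possible starting point for that kind of research, here is a rough sketch of how one could estimate the GC fraction F of a running JVM. It uses the standard java.lang.management API (available since Java 5); the class name GcTimeFraction and its methods are made up for this example. Please treat the result as a ballpark figure only: the reported collection time is an approximate accumulated elapsed time, and with parallel or concurrent collectors it is not the same thing as the time the application was actually stopped.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Locale;

// Hypothetical helper class for illustration only.
class GcTimeFraction {

    /** Sums the accumulated collection time (in ms) over all collectors of this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Rough estimate of F: accumulated GC time divided by JVM uptime. */
    static double gcFraction() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis > 0 ? (double) totalGcMillis() / uptimeMillis : 0.0;
    }

    public static void main(String[] args) {
        // Shows which collectors this JVM is actually using
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf(Locale.ROOT, "%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.printf(Locale.ROOT, "Estimated GC fraction of uptime: %.4f%n", gcFraction());
    }
}
```

In a standalone test like this the numbers will be close to zero, of course. To be useful, the same calls would have to run inside the JVM of the application you care about, after it has been under realistic load for a while.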
