Here we go – part II of an approach to sizing ColdFusion variables from within ColdFusion. In part I, I introduced the problem we’re trying to solve and a general solution (JVM instrumentation), and pointed you to the JavaSpecialists newsletter #142 for a working solution (from a Java point of view). Heinz also commented on part I and pointed out that there is a full-blown open source project named java.sizeOf() that provides a complete solution from a Java point of view.
How does it work?
This post is geared towards the problems the out-of-the-box Java solution runs into. Let’s first have a look at the two different ways of calculating the memory usage of a CF variable with the approach described so far: sizeOf() vs. deepSizeOf(), or shallow vs. deep. If you’ve done any work with structs in ColdFusion, this concept should be somewhat familiar (there are shallow copies/references as well as deep copies). In our use of the terminology, shallow refers to measuring the memory usage of the object itself, and deep refers to measuring the memory usage of the object as well as the objects its fields (members) refer to, including the fields declared in its parent class(es) up the object hierarchy.
How does the code do this? Basically it’s using a stack and a hash map to keep track of what to do and what has already been measured. The first step is to grab the CF variable (which is a Java object), check a few things (more below) and then throw all members of the object onto the stack for further processing. After that, the superclass gets processed the same way: more stuff gets thrown onto the stack, and things get popped off the stack again until we run out of things to process and measure.
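To make the stack-and-map idea a bit more tangible, here is a minimal sketch of such a walk. It is not the code from the newsletter or from java.sizeOf(); the class name DeepWalkSketch, the Node helper and the pluggable shallowSize function are my own assumptions for illustration. In a real agent the shallow size would come from Instrumentation.getObjectSize(); here it is passed in so the walk can run without an agent.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.function.ToLongFunction;

// Hypothetical sketch: walks an object graph with a stack and an identity-based "seen" set.
class DeepWalkSketch {

    // Tiny helper type used only to demonstrate the walk.
    static class Node {
        Node next;
        Object payload;
    }

    // shallowSize is an assumption: with a Java agent it would be instrumentation::getObjectSize.
    static long deepSizeOf(Object root, ToLongFunction<Object> shallowSize) {
        Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Object> stack = new ArrayDeque<>();
        if (root != null) {
            stack.push(root);
        }
        long total = 0;
        while (!stack.isEmpty()) {
            Object obj = stack.pop();
            if (!visited.add(obj)) {
                continue; // already measured (shared reference or cycle)
            }
            total += shallowSize.applyAsLong(obj);
            Class<?> type = obj.getClass();
            if (type.isArray()) {
                if (!type.getComponentType().isPrimitive()) {
                    for (int i = 0; i < Array.getLength(obj); i++) {
                        Object element = Array.get(obj, i);
                        if (element != null) {
                            stack.push(element);
                        }
                    }
                }
                continue;
            }
            // Process the class itself, then each superclass, the same way.
            for (Class<?> c = type; c != null; c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) {
                        continue; // statics are not part of the instance; primitives are in the shallow size
                    }
                    try {
                        f.setAccessible(true);
                        Object value = f.get(obj);
                        if (value != null) {
                            stack.push(value);
                        }
                    } catch (RuntimeException | IllegalAccessException e) {
                        // Field not reachable via reflection (e.g. module restrictions): skip it.
                    }
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Node a = new Node();
        Node b = new Node();
        a.next = b;
        b.next = a; // cycle: must not loop forever
        a.payload = b.payload = new Object(); // shared: must be counted once
        // Counting 1 per object yields the number of distinct reachable objects.
        System.out.println(deepSizeOf(a, o -> 1L));
    }
}
```

The identity-based set matters here: using equals()/hashCode() would treat two distinct but equal objects as one and under-report the total.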
This traversal alone holds a few interesting surprises. When I started measuring slightly more complex CF variables than simple Strings or Numbers, I got crazy figures of several MB for a simple struct. That obviously couldn’t be right. Using Java reflection on a few CF variables, such as structs or arrays, I easily found the “culprit”: reference(s) to coldfusion.monitor.memory.MemoryTrackerProxy. Sounds like the CF 8+ server monitor subsystem to me, but I might be wrong. But anyway – following this reference along up the object hierarchy creates the outrageous figures for a struct and other variables. The Java class behind a CF struct is coldfusion.runtime.Struct and it actually contains a getter and a setter (getMemoryTrackerProxy() etc.) to work with the MemoryTrackerProxy. For our purpose of measuring the memory used for the struct and its content, we certainly don’t need to follow the links to MemoryTrackerProxy when the instrumentation works its way through the object stack. What that basically means is that we have to ignore all members of type coldfusion.monitor.memory.MemoryTrackerProxy when we trace the object tree.
Just in case you’re interested, the inheritance structure for a CF struct from a Java point of view looks like this:
After having “fixed” the MemoryTrackerProxy issue, common CF variables worked fine. I still ran into issues with some persistent scopes. Further testing showed that some of those contain members of type javax.servlet.ServletContext. Now – following those when tracing memory usage also produces huge memory figures, because you’re pretty much pulling in a large part of the servlet container’s object graph. That happens in particular when measuring the application scope. Therefore javax.servlet.ServletContext is another class I’m filtering for and ignoring when measuring memory usage.
Important: Both decisions are somewhat arbitrary. One can argue that following the references and including javax.servlet.ServletContext for the application scope, for instance, would give the “real” amount of memory used for the application scope. My answer to that would be: yes, that’s fair enough and it might be a valid point. I’ve made the decision not to include it because it’s not what I want to measure here.
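One way such a filter could look is sketched below. The two type names are the ones mentioned above; everything else (the class name SkipFilterSketch, the isSkipped method, matching by class name) is an assumption of mine for illustration. Matching by name rather than by Class object means the sketch compiles and runs without the ColdFusion or servlet jars on the classpath.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: decides whether a value's type should be left out of the walk.
class SkipFilterSketch {

    // The two types this post decides not to follow.
    static final Set<String> SKIPPED_TYPES = Collections.unmodifiableSet(new HashSet<>(Arrays.asList(
            "coldfusion.monitor.memory.MemoryTrackerProxy",
            "javax.servlet.ServletContext")));

    // True if the type, any of its superclasses or any implemented interface has a skipped name.
    // Checking interfaces matters for ServletContext, which is an interface.
    static boolean isSkipped(Class<?> type, Set<String> skippedNames) {
        if (type == null) {
            return false;
        }
        if (skippedNames.contains(type.getName())) {
            return true;
        }
        for (Class<?> iface : type.getInterfaces()) {
            if (isSkipped(iface, skippedNames)) {
                return true;
            }
        }
        return isSkipped(type.getSuperclass(), skippedNames);
    }

    public static void main(String[] args) {
        // Stand-in names from the JDK, since the CF and servlet classes are not available here.
        Set<String> demo = Collections.singleton("java.util.Collection");
        System.out.println(isSkipped(java.util.ArrayList.class, demo));   // ArrayList implements Collection via List
        System.out.println(isSkipped(String.class, SKIPPED_TYPES));
    }
}
```

In a walker, the check would sit right before a field value is pushed onto the stack: if isSkipped(value.getClass(), SKIPPED_TYPES) returns true, the value is neither measured nor followed.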
Just a few quick words on flyweights or the Flyweight Pattern. I don’t want to dive into the specifics of this pattern, just follow the linked explanations if that’s of interest to you. But – I want to give you a plain language description of what it is (from 30,000 ft): Basically the Flyweight pattern is a way to manage memory very efficiently by sharing and pooling objects whenever possible. That’s pretty much all you need to know for the purpose of this exercise. Ah – no, hang on. What does this have to do with Instrumentation and tracking memory usage again?
Well – think of a pool of Integer objects or interned Strings. For us, it’s pretty much impossible to say if a particular object of that kind is used solely for our case or if it’s reused in a lot of other places in the app. Maybe we have our own instance of an Integer xyz and therefore its memory usage should be counted towards the usage of our complex ColdFusion variable. But maybe we just borrowed an Integer abc for the fraction of a blink that was somewhere in the pool anyway, and therefore quickly using it didn’t do any “harm”. We don’t know, and that’s why we check. For Integers we would basically do: obj == Integer.valueOf((Integer) obj). If that returns true, we know it’s a shared flyweight and shouldn’t contribute to the overall memory usage. For Strings it means comparing the object against the result of its .intern() method (as described and linked above). There are a few more types for which such testing makes sense.
Still, one can argue that one would want to see the memory usage ignoring any flyweight/JVM concerns, so I made that configurable with a switch. I personally think it’s better not to include them, but both will be possible.
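The flyweight checks and the switch could be sketched roughly as follows. The Integer and String checks are the ones described above; the class name FlyweightSketch, the method names, and the choice of Boolean, Long and enums as the “few more types” are my own assumptions for illustration.

```java
// Hypothetical sketch: detects objects that are likely shared by the JVM rather than owned by us.
class FlyweightSketch {

    static boolean isSharedFlyweight(Object obj) {
        if (obj instanceof String) {
            String s = (String) obj;
            // Note: intern() adds the string to the pool if no equal string is there yet.
            return s == s.intern();
        }
        if (obj instanceof Boolean) {
            return obj == Boolean.TRUE || obj == Boolean.FALSE;
        }
        if (obj instanceof Integer) {
            // valueOf() hands out cached instances for small values.
            return obj == Integer.valueOf((Integer) obj);
        }
        if (obj instanceof Long) {
            return obj == Long.valueOf((Long) obj);
        }
        if (obj instanceof Enum) {
            return true; // enum constants are shared singletons
        }
        return false;
    }

    // The switch: with skipFlyweights set to false, every object counts.
    static boolean shouldCount(Object obj, boolean skipFlyweights) {
        return !(skipFlyweights && isSharedFlyweight(obj));
    }

    public static void main(String[] args) {
        String literal = "cf-sizing";            // string literals are interned
        String copy = new String(literal);       // a separate instance with the same content
        System.out.println(isSharedFlyweight(literal));
        System.out.println(isSharedFlyweight(copy));
        System.out.println(shouldCount(literal, true));
    }
}
```

One thing to keep in mind with the String check: because intern() can add a previously unpooled string to the pool, the check itself has a side effect, so it is a heuristic rather than an exact ownership test.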
That’s it for now – part III will comprise the actual Java code and .jar file to be used in ColdFusion and some examples. I should actually write another post giving a few more explanations on using Java reflection in ColdFusion and what it can be good for (basically tinkering with undocumented features :-).