Taming side-effects with mutability

Clojure prefers immutable values and referentially transparent functions. But it also works well with the Java ecosystem, where everything is an object, and most of those objects are mutable.

I was recently dealing with ASM, the bytecode manipulation library. Most of the API is based on the Visitor pattern. To read a class from bytecode, you construct a ClassReader object from an array of bytes or an input stream, then tell it to accept an object that implements the ClassVisitor interface:

public interface ClassVisitor {
  void visit(int version, int access, String name, String signature, 
      String superName, String[] interfaces);
  void visitSource(String source, String debug);
  void visitOuterClass(String owner, String name, String desc);
  AnnotationVisitor visitAnnotation(String desc, boolean visible);
  void visitAttribute(Attribute attr);
  void visitInnerClass(String name, String outerName, 
      String innerName, int access);
  FieldVisitor visitField(int access, String name, String desc, 
      String signature, Object value);
  MethodVisitor visitMethod(int access, String name, String desc, 
      String signature, String[] exceptions);
  void visitEnd();
}

Well, now we see a bit of a problem. The ClassReader gets to call these methods, which mostly return void. When a ClassVisitor method does return an object, it returns yet another visitor that the ClassReader calls to descend into members of the class. The methods of the visitor get access to information about the class, but how do we get it out?

In other words, how do you use an API that is 100% about side-effects and mutability inside a language that favors immutable values and pure functions?

One solution is to use mutability to contain the side-effects within a function, such that the function is referentially transparent from the outside. In so doing, we tame the side-effects and put a boundary around them:

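(import '(org.objectweb.asm ClassReader ClassVisitor))

;; c is the ClassReader to analyze; the hint avoids a reflective .accept call
(defn analyze [^ClassReader c]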
  (let [classinfo (atom {})
        v         (reify ClassVisitor
                    (visit [this version access name sig supername interfaces]
                      (swap! classinfo assoc 
                             :static?    (static? access)
                             :interface? (interface? access)
                             :final?     (final? access)
                             :classname  name 
                             :super      supername))
                    (visitSource [this source debug])
                    (visitOuterClass [this owner name desc])
                    (visitAnnotation [this desc visible])
                    (visitAttribute [this attr])
                    (visitInnerClass [this name outername innername access])
                    (visitField [this access name desc sig value]
                      (swap! classinfo update-in [:members] conj 
                             {:type       :field
                              :visibility (visibility access)
                              :static?    (static? access)
                              :final?     (final? access)
                              :name       name})
                      nil)
                    (visitMethod [this access name desc sig exceptions]
                      (swap! classinfo update-in [:members] conj 
                             {:type       :method
                              :visibility (visibility access)
                              :static?    (static? access)
                              :final?     (final? access)
                              :abstract?  (abstract? access)
                              :name       name})
                      nil)
                    (visitEnd [this]))]
    (.accept c v 0)
    @classinfo))

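In the let form at the beginning, I create a new atom that holds an empty map. This will be my mutable state for this function. Then I create an instance of ClassVisitor, where most of the methods do nothing and return nil. Only visit, visitField, and visitMethod are interesting to me. In those, I mutate the atom to attach the info that ClassReader passed to the visitor. The explicit nil at the end of visitField and visitMethod matters: swap! returns the updated map, but those methods have to return a FieldVisitor or MethodVisitor, and returning nil tells ASM not to descend into that member.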
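
The listing calls static?, interface?, final?, abstract?, and visibility without showing them. The definitions below are a guess at what they might look like, not the originals. They assume the access argument is the usual JVM access-flag bitmask, and they lean on java.lang.reflect.Modifier, whose constants have the same values as ASM's ACC_ flags for these particular bits. The :package keyword for default visibility is my own choice.

```clojure
(import 'java.lang.reflect.Modifier)

;; Hypothetical helpers: each takes the int access bitmask that ASM
;; passes to visit, visitField, and visitMethod.
(defn static?    [access] (Modifier/isStatic access))
(defn final?     [access] (Modifier/isFinal access))
(defn abstract?  [access] (Modifier/isAbstract access))
(defn interface? [access] (Modifier/isInterface access))

(defn visibility
  "Maps the access bitmask to a keyword. :package is an assumed name
  for members with no explicit visibility modifier."
  [access]
  (cond
    (Modifier/isPublic access)    :public
    (Modifier/isProtected access) :protected
    (Modifier/isPrivate access)   :private
    :else                         :package))
```

Since these are pure functions of an int, they can be tried at the REPL without any class file at hand, for example (visibility (bit-or Modifier/PRIVATE Modifier/STATIC)).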

After the setup in the let, all that is left is to call ClassReader.accept with the new visitor. At that point the ClassReader does its bytecode interpretation and makes all its calls into the visitor. Finally, I get the resulting value out of the classinfo atom, which now has a nice, immutable Clojure-friendly map of information collected from that class file.
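
The same setup, call, and deref shape can be tried without ASM on the classpath. This sketch swaps in a much smaller callback API from the JDK, java.util.function.Consumer with Iterable.forEach, as a stand-in for ClassVisitor and ClassReader.accept. The function name collect-all is made up for illustration, and it assumes Java 8 or later.

```clojure
(import 'java.util.function.Consumer)

(defn collect-all
  "Hands every element of coll to a void callback and returns what the
  callback saw. The atom is created and dereferenced inside this
  function, so callers only ever see an immutable vector."
  [^Iterable coll]
  (let [seen     (atom [])
        callback (reify Consumer
                   ;; accept returns void, so the only way to get data
                   ;; out is a side effect on the atom
                   (accept [_ x]
                     (swap! seen conj x)))]
    (.forEach coll callback)
    @seen))
```

Each call makes its own atom, so two calls with the same input give equal results and cannot interfere with each other. That is what makes the function look pure from the outside.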

This technique is similar to using transients to build collections: allow mutability within a function, but don't let it escape. Transients do it for speed. The ClassVisitor does it to bridge to an object-oriented API.
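
For comparison, here is a small sketch of the transient version of the pattern. The function squares is just an illustration, not something from ASM or from the code above.

```clojure
(defn squares
  "Returns a persistent vector of the squares of 0 up to n-1."
  [n]
  (loop [i   0
         acc (transient [])]          ; mutable only inside this loop
    (if (< i n)
      (recur (inc i) (conj! acc (* i i)))
      (persistent! acc))))            ; frozen before it leaves the function
```

The transient never leaves the loop, just as the atom never leaves analyze. Callers get an ordinary immutable vector either way.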