Compiled queries in Scala

Why resort to the complexities of dynamic code generation and compilation at runtime?

Speed.

To demonstrate, lets build an interpreter that runs projection and filtering on a simulated database, then build a compiled version and look at some performance numbers.

This is our goal: compile a data structure representing a query into native code to speed up a query loop. We'll look at two specific code-generation tools to achieve our goal: ASM and Scala Quasiquotes.

ASM has been used by Java developers for years for all sorts of codegen purposes. It's is very fast and has low-memory requirements, but it's also very low-level, making it time-consuming and tedious to use and very difficult to debug.

Quasiquotes are a new Scala tool for code generation. They are similar to macros, except they can be used at runtime whereas macros are compile-time only. They're much higher-level than ASM, so we'll compare benchmarks against the two to see what cost these high-levelel semantics incur, if any.

A query system

First let's define our "database" consisting of a single dataset, represented as a simple in-memory data structure using tuples as rows.


_12// Store a mapping of column name to ordinal for fast projection
_12val schema = Seq("name", "birthYear", "dissertation").zipWithIndex.toMap
_12// schema: scala.collection.immutable.Map[String,Int] =
_12//   Map(name -> 0, birthYear -> 1, dissertation -> 2)
_12
_12val db = Seq(
_12  ("John McCarthy", 1927, "Projection Operators and Partial Differential Equations."),
_12  ("Haskell Curry", 1900, "Grundlagen der kombinatorischen Logik"),
_12  ("Philip Wadler", 1956, "Listlessness is Better than Laziness"),
_12  ("Alonzo Church", 1903, "Alternatives to Zermelo's Assumption"),
_12  ("Alan Turing", 1912, "Systems of Logic based on Ordinals")
_12)

Next, we need structures that hold:

fields to be projected
filter expression tree

An efficient representation of projection fields is simply storing the ordinals. For example, this is how we'd represent a projection of the name and dissertation columns:


_10val projections = Seq(0, 2)

The filter expression is a little more complicated, since it's treeish in nature. Even though we're not supporting SQL, let's use it to help understand:


_10WHERE name != null AND birthYear < 1910

A tree is a natural way to represent this syntax in abstract form:

In Scala, we can represent this with a Scalaz Tree[String].


_10type FilterExpr = Tree[String]
_10
_10val filterExpr: FilterExpr = And.node(
_10  IsNotNull.node(
_10    Item.node("name".leaf)),
_10  LessThan.node(
_10    Item.node("birthYear".leaf),
_10    Literal.node("1910".leaf)))

The root node of each tree or subtree is always an operator. The sub-forests are the operands, which may be made up of operator trees.

Using this we can build a filter interpreter that evaluates whether a row should be included. Some caveats about limitations of this simple example:

We'll run filtering before projection. In reality, which to run first is a decision that should be made by a query optimizer.
I'm only going to implement a few of the most common operators.
The filter interpreter is doing its own projection, which may overlap with the projected fields requested by the user and create duplicate work.
I'm only supporting Int for comparison, and I'm doing some nasty type casting. IRL, we'd keep better track of the types with a full blown schema.

These are all issues I won't address.

Here's our limited set of operators:


_10object Operators {
_10  val And = "AND"
_10  val Or = "OR"
_10  val IsNotNull = "IS_NOT_NULL"
_10  val LessThan = "LESS_THAN"
_10  val Item = "ITEM"
_10  val Literal = "LITERAL"
_10}
_10import Operators._

And the interpreter itself:


_27class FilterInterpreter(expr: FilterExpr, schema: Map[String, Int]) {
_27  import Operators._
_27  /** return true if row is filtered out by expr */
_27  def isFiltered(row: Product): Boolean = evalFilterOn(row)
_27
_27  private def evalFilterOn(row: Product): Boolean = {
_27    def eval(expr: FilterExpr): Boolean = {
_27      val (operator, operands: Stream[FilterExpr]) = (expr.rootLabel, expr.subForest)
_27      operator match {
_27        case And => operands.forall(eval)
_27        case Or => operands.exists(eval)
_27        case IsNotNull => valueFor(operands(0)) != null
_27        case LessThan => {
_27          val (x :: y :: _) = operands.map(o => valueFor(o).toString.toInt).toList
_27          x < y
_27        }
_27      }
_27    }
_27    def valueFor(node: FilterExpr) = node.rootLabel match {
_27      // Item expects a single operand
_27      case Item => row.productElement(schema(node.subForest.head.rootLabel))
_27      // Literal expects a single operand
_27      case Literal => node.subForest.head.rootLabel
_27    }
_27    eval(expr)
_27  }
_27}

And finally, a simple query loop:


_10def query(projections: Seq[Int], filterExpr: FilterExpr): Seq[Seq[Any]] = {
_10  val filterer = new FilterInterpreter(filterExpr, schema)
_10  db.flatMap { row =>
_10    // Filter
_10    if (filterer.isFiltered(row)) None
_10    // Project
_10    else Some(projections.map(row.productElement))
_10  }
_10}

NB: we could use ScalaBlitz to optimize that flatMap if this was real and we were ultra-concerned about performance.

Notice how we return Seq[Seq[Any]]. At compile-time we don't know how to typefully represent a row so we have to resort to the lowest-common type, Any. This is another issue that we'll fixup later in the compiled version.

With all this in place, let's run it:


_17// SELECT name, dissertation
_17val projections = Seq(0, 2)
_17
_17// WHERE name != null AND birthYear < 1910
_17val filterExpr: FilterExpr = And.node(
_17  IsNotNull.node(
_17    Item.node("name".leaf)),
_17  LessThan.node(
_17    Item.node("birthYear".leaf),
_17    Literal.node("1910".leaf)))
_17
_17val result = query(projections, filterExpr)
_17result.map(println)
_17
_17//=> List(John McCarthy, Projection Operators and Partial Differential Equations.)
_17//=> List(Philip Wadler, Listlessness is Better than Laziness)
_17//=> List(Alan Turing, Systems of Logic based on Ordinals)

So far so good. Next up, let's dive into some bytecodes to gain a basic understanding of what's going on when we generate code on the JVM.

Bytecode primer

Let's start by looking at the bytecode for a minimum viable Scala class:


_10class Foo

Compile it with scalac then view the bytecode with java -c:


_10scalac Foo.scala
_10javap -c Foo.class


_10public class Foo {
_10  public Foo();
_10    Code:
_10       0: aload_0
_10       1: invokespecial #12                 // Method java/lang/Object."<init>":()V
_10       4: return
_10}

From the Java bytecode instructions listings reference we can find out the meaning of these bytecode instructions:

aload_0 — load a reference onto the stack from local variable 0 (local var 0 is always this)
invokespecial — invoke instance method on object objectref (this would be the object that we just loaded onto the stack) and puts the result on the stack (might be void)
return — return void from method

This is the generated constructor for Foo. Since nothing else is going on, it simply returns void after calling the constructor. Now what if we actually do something, like instantiate a Foo?


_10class Foo
_10
_10object RunFoo {
_10  val f = new Foo
_10}

Compiling this yields a single Foo.class identical to the one above along with two class files: RunFoo.class and RunFoo$.class.


_10// RunFoo.class
_10public final class RunFoo {
_10  public static Foo f();
_10    Code:
_10       0: getstatic     #16                 // Field RunFoo$.MODULE$:LRunFoo$;
_10       3: invokevirtual #18                 // Method RunFoo$.f:()LFoo;
_10       6: areturn
_10}


_16// RunFoo$.class
_16public final class RunFoo$ {
_16  public static final RunFoo$ MODULE$;
_16
_16  public static {};
_16    Code:
_16       0: new           #2                  // class RunFoo$
_16       3: invokespecial #12                 // Method "<init>":()V
_16       6: return
_16
_16  public Foo f();
_16    Code:
_16       0: aload_0
_16       1: getfield      #17                 // Field f:LFoo;
_16       4: areturn
_16}

The JVM runs these opcodes in a stack machine: values are pushed on to the stack then used as operands to later operations. For example, this is how you could add two constants:


_10bipush 28
_10bipush 14
_10iadd

Push 28 onto the stack with bipush
Push 14 onto the stack with bipush
Execute iadd which adds two ints: it pops two values off the stack to use as its operands: first 14, then 28. It adds those two operands and pushes the result, 28, onto the stack.

At this point we could work with the new 42 value on the stack. This is how we would check that the value is indeed 42:


_1020: bipush 42
_1022: if_icmpne 28
_1024: ldc               #3
_1026: goto 32
_1028: ldc               #4
_1032: ...

Push the value 42 onto the stack
Use if_icmpne to compare two values from the stack. If they are not equal, jump to position 28, which pushes constant #4 onto the stack using ldc. If they are equal, the next code is executed, which instead pushes constant #3 onto the stack, then jumps to position 32.

Tedious, but simple.

This primer is only intended to wet your feet. If you want to learn more about bytecode, see the Further reading section at the end of this post.

ASM

ASM is a bytecode manipulation framework. It's one of several options for manipulating bytecode, but I chose it because it's one of the most mature, requires the least amount of memory, and it's very fast. The downside is it's also quite low level. If you aren't super-concerned with performance, you should check out other options, like Javassist, which is much easier to work with.

Now, to use ASM, the more familiar you are with bytecode the better off you'll be, but for newbs like us, there is ASMifier, which takes a compiled class and generates the ASM code for it. I'm going to avoid Scala for this excercise, since Java maps more closely to bytecode, and we can use the resulting bytecode from Scala either way. I want to use Tuples to represent fully typed rows, so let's see what the ASM code looks like for this Java class:


_14import scala.Tuple2;
_14
_14public class TupleFromJava {
_14
_14  Tuple2<String, Integer> tup = new Tuple2<String, Integer>("foo", 2);
_14
_14  public TupleFromJava() {
_14  }
_14
_14  public Tuple2<String, Integer> getTup() {
_14    return tup;
_14  }
_14
_14}

Compile it, making sure the scala-lib jar is on your classpath:


_10javac -cp $SCALA_LIB TupleFromJava.java

Then use the ASMifier on the classfile:


_10java -cp asm-5.0.3/lib/all/asm-all-5.0.3.jar \
_10  org.objectweb.asm.util.ASMifier TupleFromJava.class

Here is the generated ASM code, heavily annotated with comments. It's a little tedious to work through, but if you really want to understand bytecode and ASM, I encourage you to read the comments and work through every line until it's internalized and you feel comfortable with it.


_118import java.util.*;
_118import org.objectweb.asm.*;
_118public class TupleFromJavaDump implements Opcodes {
_118
_118  public static byte[] dump () throws Exception {
_118
_118    ClassWriter cw = new ClassWriter(0);
_118    FieldVisitor fv;
_118    MethodVisitor mv;
_118    AnnotationVisitor av0;
_118
_118    // Generate a public class inheriting from java.lang.Object
_118    cw.visit(52, ACC_PUBLIC + ACC_SUPER, "TupleFromJava", null,
_118      "java/lang/Object", null);
_118
_118    // Initialize the `tup` field with a null value. When fields are declared in
_118    // Java classes, they aren't fully initialized until the constructor runs.
_118    // Note the format of the string representation of `Tuple2`'s constructor.
_118    {
_118      fv = cw.visitField(0, "tup", "Lscala/Tuple2;",
_118        "Lscala/Tuple2<Ljava/lang/String;Ljava/lang/Integer;>;", null);
_118      fv.visitEnd();
_118    }
_118
_118    // Generate the public constructor named <init> by convention
_118    {
_118      mv = cw.visitMethod(ACC_PUBLIC, "<init>", "()V", null, null);
_118      mv.visitCode();
_118
_118      // Put `this` on the stack
_118      mv.visitVarInsn(ALOAD, 0);
_118
_118      // Invoke constructor on `this`. Note that it takes no params and returns
_118      // void, and it consumes the `this` that we put on the stack above.
_118      mv.visitMethodInsn(INVOKESPECIAL, "java/lang/Object", "<init>", "()V", false);
_118
_118      // Put `this` on the stack again
_118      mv.visitVarInsn(ALOAD, 0);
_118
_118      //
_118      // Start `tup` initialization {
_118      //
_118
_118      // Put a new scala.Tuple2 on the stack then duplicate the object reference
_118      // on top of the stack. Note that since it is an object reference and not
_118      // the object itself, any mutation we do to the object will be reflected in
_118      // any references to that object on the stack.
_118      mv.visitTypeInsn(NEW, "scala/Tuple2");
_118      mv.visitInsn(DUP);
_118
_118      // Push constant "foo" from the constant pool onto the stack
_118      mv.visitLdcInsn("foo");
_118
_118      // Load the int constant 2 onto the stack
_118      mv.visitInsn(ICONST_2);
_118
_118      // Autobox the int in a java.lang.Integer. This pops the int off the stack
_118      // and replaces it with the Integer.
_118      mv.visitMethodInsn(INVOKESTATIC, "java/lang/Integer", "valueOf",
_118        "(I)Ljava/lang/Integer;", false);
_118
_118      // At this point our stack looks like:
_118      // Integer.valueOf(2)
_118      // "foo"
_118      // Tuple2 (reference)
_118      // Tuple2 (reference)
_118
_118      // Initialize the Tuple2. Looking at the signature, we can see it takes
_118      // two java.lang.Objects (instead of a String and Integer, due to type
_118      // erasure), which it will consume from the stack in addition
_118      // to one of the Tuple2 references.
_118      mv.visitMethodInsn(INVOKESPECIAL, "scala/Tuple2", "<init>",
_118        "(Ljava/lang/Object;Ljava/lang/Object;)V", false);
_118
_118      // Finally, we store our Tuple2 value in the `tup` field. This consumes
_118      // the second copy of the Tuple2 reference we had on the stack.
_118      mv.visitFieldInsn(PUTFIELD, "TupleFromJava", "tup", "Lscala/Tuple2;");
_118
_118      //
_118      // End `tup` initialization }
_118      //
_118
_118      // Constructors always return void because constructors should only
_118      // initialize the object and do nothing else.
_118      mv.visitInsn(RETURN);
_118
_118      // Sets the max stack size and max number of local vars. This can be
_118      // calculated automatically for you if you use COMPUTE_FRAMES or COMPUTE_MAXS
_118      // in the ClassWriter constructor.
_118      mv.visitMaxs(5, 1);
_118      mv.visitEnd();
_118    }
_118
_118    // Create a public method `getTup` which takes no args and returns a
_118    // scala.Tuple2
_118    {
_118      mv = cw.visitMethod(ACC_PUBLIC, "getTup", "()Lscala/Tuple2;",
_118        "()Lscala/Tuple2<Ljava/lang/String;Ljava/lang/Integer;>;", null);
_118      mv.visitCode();
_118      // Load `this` onto the stack
_118      mv.visitVarInsn(ALOAD, 0);
_118      // Load a reference to the `tup` field on `this` onto the stack, consuming
_118      // `this` in the process
_118      mv.visitFieldInsn(GETFIELD, "TupleFromJava", "tup", "Lscala/Tuple2;");
_118      // Return the reference to the `tup` field
_118      mv.visitInsn(ARETURN);
_118      // We only needed a max stack size of 1 and maximum of 1 local vars
_118      mv.visitMaxs(1, 1);
_118      mv.visitEnd();
_118    }
_118
_118    // Finish writing the class
_118    cw.visitEnd();
_118
_118    return cw.toByteArray();
_118
_118  }
_118}

Not too bad. ASMifier helpfully wraps each logical chunk in blocks. The first block creates the tup field. The second block is our public constructor. It calls super, which invokes the constructor on the java.lang.Object superclass, then initializes the tup field, and finally returns void. The third block is the getTup getter.

NB: If you're having trouble following this, I recommend generating some ASM code with ASMifier, then annotating it yourself. It really helps to internalize the JVM bytecodes and how to work on a stack machine.

Inside a method, the parameters can be referenced by corresponding local variable indices. For example:


_10def add(x: Int, y: Int)

In this case, x is available at 1 and y is available at 2. To load and use these, we would use visitVarInsn which visits a local variable instruction. Using ASM, this is how we'd add x and y and store the result in local variable 3:


_10mv.visitVarInsn(ILOAD, 1);  // Load x onto the stack
_10mv.visitVarInsn(ILOAD, 2);  // Load y onto the stack
_10mv.visitInsn(IADD);         // Pop two values off the stack, add them,
_10                            // then put the result back on the stack
_10mv.visitVarInsn(ISTORE, 3); // Pop a value off the stack and store
_10                            // it in local variable 3

When you generate ASM using ASMifier it can generate all the labels and local var mappings, which is necessary information for debuggers to show you the correct names of the local variables in a given stack, since in the JVM indices are used instead of names. When writing by hand, you could opt to not write these instructions, or add them later if you need them.

Compiled queries with ASM

Let's use this knowledge to compile the query we originally interpreted. To start, let's write a fast but static version of the query so we can quickly figure out which parts need to be made dynamic. When using ASM it's a good idea to distil the essense of which part needs to be made dynamic so the generator code ends up being as small as possible. Since ASM is hard to read, write, and debug, this is very important.

Let's perform the same projection and filtering we used before, but this time without the interpretation.


_10object StaticQuery {
_10
_10  def query(db: Seq[(String, Int, String)]) = {
_10    db.flatMap { case (name, birthYear, dissertation) =>
_10      if (birthYear < 1910 && name != null) Some((name, dissertation))
_10      else None
_10    }
_10  }
_10
_10}

This should be much faster than our interpreted query because it's running a simple conditional instead of walking and evaluating a FilterExpr on every row.

Let's construct a quick benchmark using ScalaMeter to verify our assumptions.


_21object QueryBenchmark extends PerformanceTest.Quickbenchmark {
_21
_21  val birthYears = Gen.range("birthYears")(9990, 10000, 1)
_21
_21  val records = for {
_21    birthYear <- birthYears
_21  } yield (0 until birthYear).map(("name", _, "diss"))
_21
_21  performance of "Querying" in {
_21    measure method "StaticQuery" in {
_21      using (records) in { StaticQuery.query(_) }
_21    }
_21
_21    measure method "InterpretedQuery" in {
_21      using (records) in {
_21        InterpretedQuery.query(_, Main.projections, Main.filterExpr)
_21      }
_21    }
_21  }
_21
_21}

Results from the last result of each measurement:

StaticQuery: 1.329709ms
InterpretedQuery: 6.949921ms

The static query is between 5 and 6 times faster than interpreting.

Let's rewrite StaticQuery in Java and use it as a template for compiling queries. There are at least two reasons why I significantly dislike running ASMifier against Scala classes:

It generates a gnarly blob of unreadable bytes for the ScalaSignature (which is apparently used to store Scala-specific bits in class files and is required for reflection and for compiling against).
Scala objects and methods get split into separate class files when they're compiled, making it hard to stitch together the results with multiple ASMifier runs.


_26import scala.Tuple3;
_26import scala.collection.Seq;
_26import scala.collection.mutable.ArrayBuffer;
_26import scala.collection.Iterator;
_26
_26public class StaticJavaQuery {
_26
_26  public static Seq<Tuple3<String, Integer, String>> query(
_26      Seq<Tuple3<String, Integer, String>> db) {
_26    Iterator<Tuple3<String, Integer, String>> iter = db.iterator();
_26    ArrayBuffer<Tuple2<String, String>> acc =
_26      new ArrayBuffer<Tuple2<String, String>>();
_26    while (iter.hasNext()) {
_26      Tuple3<String, Integer, String> row = iter.next();
_26      Integer birthYear = row._2();
_26      if (birthYear.intValue() < 1910 && row._1() != null) {
_26        acc.$plus$eq(new Tuple2<String, String>(
_26          row._1(),
_26          row._3()
_26        ));
_26      }
_26    }
_26    return acc;
_26  }
_26
_26}

It's not pretty, but it works, and it happens to be even faster than the static Scala query. We'll start by feeding it to ASMifier, convert the output to Scala, then work on making the result dynamic. Converted output, heavily annotated:


_171import org.objectweb.asm._, Opcodes._
_171import Database.FilterExpr
_171
_171object CompiledQueryGen extends Opcodes {
_171
_171  def generate: Array[Byte] = {
_171
_171    val cw = new ClassWriter(ClassWriter.COMPUTE_FRAMES)
_171    var mv: MethodVisitor = null
_171
_171    cw.visit(52, ACC_PUBLIC + ACC_SUPER, "CompiledQuery", null,
_171      "java/lang/Object", null)
_171
_171    // Constructor
_171    {
_171      mv = cw.visitMethod(ACC_PUBLIC, "<init>", "()V", null, null)
_171      mv.visitCode()
_171      mv.visitVarInsn(ALOAD, 0)
_171      mv.visitMethodInsn(INVOKESPECIAL, "java/lang/Object", "<init>", "()V", false)
_171      mv.visitInsn(RETURN)
_171      mv.visitMaxs(1, 1)
_171      mv.visitEnd()
_171    }
_171
_171    // Static query method
_171    {
_171      mv = cw.visitMethod(ACC_PUBLIC + ACC_STATIC, "query", "(Lscala/collection/Seq)Lscala/collection/Seq", "(Lscala/collection/Seq<Lscala/Tuple3<Ljava/lang/StringLjava/lang/IntegerLjava/lang/String>>)Lscala/collection/Seq<Lscala/Tuple3<Ljava/lang/StringLjava/lang/IntegerLjava/lang/String>>", null)
_171      mv.visitCode()
_171
_171      // Load the `db` argument onto the stack
_171      mv.visitVarInsn(ALOAD, 0)
_171      // Invoke the `iterator` method on db, putting the iterator on the stack
_171      mv.visitMethodInsn(INVOKEINTERFACE, "scala/collection/Seq", "iterator", "()Lscala/collection/Iterator", true)
_171      // Store the iterator object at index 1
_171      mv.visitVarInsn(ASTORE, 1)
_171
_171      // Stack size = 0
_171
_171      // Instantiate a new ArrayBuffer `acc` {
_171      mv.visitTypeInsn(NEW, "scala/collection/mutable/ArrayBuffer")
_171      // Duplicate the reference to it on the stack
_171      mv.visitInsn(DUP)
_171      // Initialize the `acc` ArrayBuffer
_171      mv.visitMethodInsn(INVOKESPECIAL, "scala/collection/mutable/ArrayBuffer", "<init>", "()V", false)
_171      // Store the ArrayBuffer at index 2
_171      mv.visitVarInsn(ASTORE, 2)
_171
_171      // Stack size = 0
_171
_171      // A label is a point we can jump to with GOTO-style instructions. l0
_171      // marks the start of the while loop. The point at which the label is
_171      // visited represents its position and l0 is visited immediately.
_171      // while (...) {
_171      val l0 = new Label
_171      mv.visitLabel(l0)
_171
_171      // Check the while condition
_171      // Load the iterator onto the stack from index 1
_171      mv.visitVarInsn(ALOAD, 1)
_171      // Call `hasNext` on the iterator, storing the boolean result on the
_171      // stack. The JVM stores boolean as int: 0 is false, 1 is true.
_171      mv.visitMethodInsn(INVOKEINTERFACE, "scala/collection/Iterator",
_171        "hasNext", "()Z", true)
_171
_171      // Stack size = 1, hasNext boolean
_171
_171      // Create another jump location for the end of the loop. l1 isn't visited
_171      // until later at the end of the loop body but we need to create the label
_171      // here in order to reference it in `IFEQ`.
_171      val l1 = new Label
_171      // A jump instruction with IFEQ ("if equals") checks the current value on
_171      // the stack. If it's 0 (false) it jumps to the label, thus ending our
_171      // while loop.
_171      mv.visitJumpInsn(IFEQ, l1)
_171
_171      // Stack size = 0
_171
_171      // Load iterator onto the stack again
_171      mv.visitVarInsn(ALOAD, 1)
_171      // Obtain the `row` value from the iterator
_171      mv.visitMethodInsn(INVOKEINTERFACE, "scala/collection/Iterator", "next", "()Ljava/lang/Object", true)
_171      // Ensure the value is of expected type, Tuple3. This instruction pops a
_171      // value off the stack, checks it, then puts it back on the stack.
_171      mv.visitTypeInsn(CHECKCAST, "scala/Tuple3")
_171      // Store the row Tuple3 at local variable index 3
_171      mv.visitVarInsn(ASTORE, 3)
_171      // Load it again
_171      mv.visitVarInsn(ALOAD, 3)
_171
_171      // Stack size = 1, row Tuple3
_171
_171      // Invoke the `_2` method on the row to get the birthYear
_171      mv.visitMethodInsn(INVOKEVIRTUAL, "scala/Tuple3", "_2", "()Ljava/lang/Object", false)
_171      // Ensure the expected type, Integer
_171      mv.visitTypeInsn(CHECKCAST, "java/lang/Integer")
_171      // Store birthYear at local var 4
_171      mv.visitVarInsn(ASTORE, 4)
_171      // Load birthYear from local var 4
_171      mv.visitVarInsn(ALOAD, 4)
_171      // Invoke the `intValue` method on birthYear
_171      mv.visitMethodInsn(INVOKEVIRTUAL, "java/lang/Integer", "intValue", "()I", false)
_171      // Push a short constant on to the stack
_171      mv.visitIntInsn(SIPUSH, 1910)
_171
_171      // Stack size = 2, birthYear int, 1910 short
_171
_171      // Any time we need to branch in some way, we need labels and jump
_171      // instructions. l2 marks the end of the filtering if statement, allowing
_171      //
_171      // us to jump over the body.
_171      val l2 = new Label()
_171
_171      // Jump instructions are always the inverse predicate because if it
_171      // evaluates to true then it jumps, skipping the body of the if block.
_171      // IF_ICMPGE is short for "if int compare greater than or equal", so:
_171      // If value1 >= value2 then jump to l2, where
_171      // value1 = birthYear
_171      // value2 = 1910
_171      mv.visitJumpInsn(IF_ICMPGE, l2)
_171      // That's the first half of the if predicate. Now we check the other half.
_171
_171      // Load the row Tuple3
_171      mv.visitVarInsn(ALOAD, 3)
_171      // Invoke the `_1` method on the row to get the name String
_171      mv.visitMethodInsn(INVOKEVIRTUAL, "scala/Tuple3", "_1", "()Ljava/lang/Object", false)
_171      // This condition is much simpler and the JVM even has an instruction to
_171      // check for null. If the name String is null, jump to l2.
_171      mv.visitJumpInsn(IFNULL, l2)
_171
_171      // Body of the if block {
_171
_171        // Load the `acc` ArrayBuffer
_171        mv.visitVarInsn(ALOAD, 2)
_171        // Load the `row` Tuple3
_171        mv.visitVarInsn(ALOAD, 3)
_171        // Invoke the `$plus$eq` method on `acc` which mutates it, appending the
_171        // `row` Tuple3, and stores the result (which is simply itself) on the
_171        // stack.
_171        mv.visitMethodInsn(INVOKEVIRTUAL, "scala/collection/mutable/ArrayBuffer", "$plus$eq", "(Ljava/lang/Object)Lscala/collection/mutable/ArrayBuffer", false)
_171        // Discard the last item on the stack since we no longer need it.
_171        mv.visitInsn(POP)
_171
_171      // }
_171
_171      // Mark the end of the if block
_171      mv.visitLabel(l2)
_171
_171      // Jump back to the start of the while loop
_171      mv.visitJumpInsn(GOTO, l0)
_171      // Mark the end of the while loop
_171      mv.visitLabel(l1)
_171      // } // end while
_171
_171      // Load the acc Tuple3
_171      mv.visitVarInsn(ALOAD, 2)
_171      // Return the object on the stack
_171      mv.visitInsn(ARETURN)
_171      // Compute the max stack size and number of local vars (computed
_171      // automatically for us via COMPUTE_FRAMES)
_171      mv.visitMaxs(0, 0)
_171      // End the method
_171      mv.visitEnd()
_171    }
_171
_171    // End the class
_171    cw.visitEnd()
_171
_171    // Return the bytes representing a generated classfile
_171    cw.toByteArray
_171  }
_171}

Since we're using ClassWriter.COMPUTE_FRAMES I was able to remove all the visitFrame calls that ASMifier generated. I also deleted the generated FieldVisitor and AnnotationVisitor as they were both unused. I used 0 for all arguments to visitMaxs as COMPUTE_FRAMES implies COMPUTE_MAXS, which still requires calls to visitMaxs but ignores the arguments.

Scala Quasiquotes

This part of the post is not yet written. The intent was to explore Scala Quasiquotes facilities for codegen.

Next steps

An interesting direction to take this, now that we have a foundation for dynamically compiling queries, would be to add a SQL interface. Apache Calcite is well-suited to do just that.