hi, i'm patience!

VOC Optimization - Primitive Object Preallocation (Google Summer of Code)

python

performance optimization

open source

I’m about a month into my Google Summer of Code project with BeeWare, and it’s high time I wrote about the specific library I'm working on and what I’ve been up to.

About VOC

VOC is a library that compiles Python code into Java bytecode. It fits into BeeWare’s overarching goal of enabling mobile development in Python because it allows you to run your Python source code on the JVM, and write Python code that interacts with native Android APIs. Essentially, Python source files are parsed into abstract syntax trees and converted into JVM bytecode instructions. Python types and the standard library are implemented in Java (though no Java file is ever created). The result is a bunch of Java class files that can be executed in the JVM.

My Project

My project goal is to significantly improve the runtime of VOC-generated bytecode. VOC is a fairly naive compiler, and no significant efforts have been made (until now!) to optimize performance, which is good for me because there are some pretty low-cost things we can do for a major boost. Find my initial project proposalhere (though it became outdated about a week into the work).

Object Creation is Expensive

Everything in Python is an Object. This means that even small programs can create tons of Objects, including integers, booleans, and strings. Creating an integer, for example, looks like this in bytecode:

NEW org/python/types/Int
DUP
LDC2_W (Long 42)
INVOKESPECIAL org/python/types/Int.init (J)V

Luckily we can exploit the fact that integers and booleans are immutable objects. Strings are immutable as well, but it’s hard to tell which strings are used often enough that they should be cached. Integers, on the other hand, especially small integers, are used all the time. This optimization idea comes from CPython, where integers -5 to 256 inclusive are pre-allocated, shared and created only once. That’s exactly what we did in VOC, by making the Int constructor private and checking if we can use a pre-allocated Int before calling the constructor.

So on the bytecode side, we can replace a constructor call with a static method call which is much cheaper:

LDC 42
INVOKESTATIC org/python/types/Int.getInt (J)Lorg/python/types/Int;

On a benchmarking test, this results in about a 15% performance improvement. While this number is synthetic in the sense that the test is written to exploit this and only this optimization, it is still a verification of performance improvement.

For booleans, since there are only two objects, the savings are even better:

// Without optimization
NEW org/python/types/Bool
DUP
ICONST_0
INVOKESPECIAL org/python/types/Bool.init (Z)V

// With optimization
GETSTATIC org/python/types/Bool.FALSE (Lorg/python/types/Bool;)

Amazingly, on the benchmark this shows about a 50% improvement.

Reflections

So far I am really enjoying my GSoC project. My mentor Russ is super knowledgeable and is great at giving me just enough guidance that I get unstuck, but not telling me I have to implement something a specific way -- anything goes as long as I can justify it or show a performance improvemnt. That freedom is empowering and refreshing, as I’m used to adhering to very specific acceptance criteria and usually pre-defined implementation strategies.

Surprisingly, it’s a great complement to my engineering experience and has forced me to work on a few skills that I’ve never had much reason to hone before, like:

  • Learning a new codebase without much “meta” information. VOC is a pretty new project and there’s not a lot of documentation about the code, architecture or design documents, etc. I’ve learned that answers always come from reading the code.

  • Consistently making progress without humans around. Usually, on a team of engineers in the same office or at least the same time zone, I’d work at a problem for a few hours or a day maximum before asking for help or someone to talk things out with. But I don’t want to pester Russ (who is in a different time zone) too much, and I never expect a real-time response to a question. Learning how to debug the codebase to move forward is truly the most important skill when onboarding.

  • Old school debugging! Debugging without good tools is a pain. There’s no white-box testing on VOC (yet?), so I’m relying on good ol' print statements (I haven't found tools that equally accommodate Python, Java and Java bytecode). Of the three, Java bytecode is definitely the most ornery, and tinkering with bytecode is a big part of this project.

You can follow the discussion and progression around this work here, here, and here. My github incompetence is woefully obvious here, but I am leaving those giant commit chains there in hope of having lots of improvement to show by the end of the summer.