The Container Engine is a Commodity

I started my career 20 years ago when Java was the new kid on the block. I worked at IBM for many years on their JVM and I was very proud of the innovation and work that we did there.

During that time, a curious thing happened. The JVM became a commodity. People didn’t really care too much about who had the better garbage collector or the faster JIT; rather, they cared about stability and function. As new versions of Java came out, a new bytecode would sometimes sneak in or profiling might be improved a little, but the bulk of the innovation and function was in the class libraries – the rich APIs sitting on top of this commoditized JVM and the portable bytecode spec. It’s easy to forget today just how much third-party innovation grew up around it. IBM bet their entire enterprise software stack on this humble little interpreter.

I believe that the exact same thing is happening in the container space and the parallels with Java are striking.

What does a container engine actually have to do, fundamentally? It has to provide some networking and storage virtualization, a sandpit for running a process, various means of getting data into and out of that process, a format for representing binary dependencies, and a control plane for lifecycle management. Beyond that, yes, you can have logging plugins, the ability to execute other processes, and snapshotting, but none of that is essential.
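To make that concrete, here is a minimal sketch of what that surface area might look like, expressed as Go interfaces. Everything in it is hypothetical and purely illustrative – the names and signatures don’t correspond to any existing project’s API:

```go
// Package primitives is a hypothetical sketch of the essentials a
// container engine has to provide, nothing more.
package primitives

import "io"

// ImageRef identifies a bundle of binary dependencies (an image) in
// whatever format the engine understands.
type ImageRef string

// NetworkSpec and VolumeSpec stand in for network and storage
// virtualization respectively.
type NetworkSpec struct{ Name string }
type VolumeSpec struct{ Name, MountPath string }

// ContainerConfig is the input to creating a sandboxed process.
type ContainerConfig struct {
	Image   ImageRef    // binary dependencies
	Cmd     []string    // the process to run in the sandpit
	Network NetworkSpec // network attachment
	Storage []VolumeSpec
}

// ContainerID identifies a created container.
type ContainerID string

// Engine is the control plane for lifecycle management.
type Engine interface {
	Create(cfg ContainerConfig) (ContainerID, error)
	Start(id ContainerID) error
	Stop(id ContainerID) error
	Remove(id ContainerID) error

	// Attach provides data ingress and egress for the running process.
	Attach(id ContainerID) (stdin io.WriteCloser, stdout, stderr io.Reader, err error)
}
```

Logging drivers, exec, snapshots and the rest layer on top of a surface like this rather than living inside it.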

Nothing captured this requirement more clearly than the CRI-O announcement earlier this year. It was a conscious effort to put a stake in the ground and say, “here is a straw man for what we need a basic container runtime to do”. Docker themselves came out with containerd, also this year, as a way of mapping their multi-process daemon model onto the synchronous, single-process runC as part of their OCI integration. The net effect is a secondary API boundary sitting below the Docker API that represents basic container compute primitives.

The VIC team did the exact same thing back in January when we were re-architecting the Bonneville codebase. We intentionally designed a container primitives layer consisting of five distinct services – persistence, network, compute, events and interaction. The network and storage services would mutate the underlying infrastructure, the compute service would create containers, and the event and interaction services would provide optional value-add around the running containers. We did this with the intent of being able to layer multiple personalities on top – everything from a Trivial Container Engine that only uses the compute service to a Docker personality. Beyond that, the goal was to end up with low-level interfaces that could also abstract away the underlying implementation – it should be possible to deploy a container host inside of a container host and manage them using the exact same client. Cool, huh?!
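Here is a rough sketch of how that five-way split might be expressed as Go interfaces. Again, this is purely illustrative – it is not the actual VIC port layer code, and every name in it is hypothetical:

```go
// Package portlayer sketches a hypothetical five-service container
// primitives layer: persistence, network, compute, events, interaction.
package portlayer

import "io"

// Persistence mutates the underlying storage infrastructure.
type Persistence interface {
	CreateVolume(name string, sizeMB int64) error
	DeleteVolume(name string) error
}

// Network mutates the underlying network infrastructure.
type Network interface {
	CreateNetwork(name string) error
	Connect(containerID, networkName string) error
}

// Compute creates and manages containers; it is the only service a
// "Trivial Container Engine" personality would need.
type Compute interface {
	Create(image string, cmd []string) (id string, err error)
	Start(id string) error
	Stop(id string) error
}

// Events is optional value-add: lifecycle notifications.
type Events interface {
	Subscribe(id string) (<-chan string, error)
}

// Interaction is optional value-add: attach-style I/O with a
// running container.
type Interaction interface {
	Attach(id string) (io.ReadWriteCloser, error)
}
```

The point of the split is composition: a Docker personality would bind to all five services, while a minimal personality could get away with the compute service alone.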

Almost a year on from that original white-boarding session, it’s worth reflecting on where we find ourselves. The Docker personality in VICe will never be fully API-complete*. There are many reasons for this, but primarily it’s because that API is a level above what VICe is really aiming at. The Docker API now covers development, clustering, image resolution, HA, deployment and build. VICe is focused on the runtime – provisioning and deployment with clustering, scheduling and HA transparently integrated at a lower layer.

I’m often asked, “how is VICe ever going to keep up with the pace of innovation?”. Well, if the industry can standardize around a container primitives API with an agreement to use explicit rather than implicit interfaces, then I think it’s very possible. If the Container Engine is a commodity and we get that bit right, most of the innovation will happen in the layers above it and should be compatible with it. It may take a few go-arounds to find the right abstractions, but in my opinion, it’s a goal well worth pursuing.

[* This is a prediction, not a commitment to not doing it 😉 ]