We love Python

In my last post, I described our intention to write code that writes code. Now we have to select the programming language we are going to use to build our platform. Well, the title gives it away, but how did we get there?

We agreed upon a serverless-first approach as our strategy to become cloud-native, and we selected AWS as our vendor to do so.

Our lock-in with AWS is an accepted one, due to the maturity of their serverless offerings. Gregor Hohpe wrote a helpful article about the considerations around lock-ins. Maybe I'll write a future post to discuss how we applied this to our strategy. For now, this rule of thumb should suffice:

We favor vendor-managed services over customization.

So, we treat our vendor lock-in as a liberating constraint. What languages does AWS Lambda support out of the box?

  • .NET
  • Java
  • NodeJS
  • Python
  • Ruby

When in doubt, stick with what you know. Luckily, this list contains three languages that we already know how to use. Now we have to select the one that suits us best: Java, NodeJS, or Python.

https://www.youtube.com/watch?v=oQFORsso2go&t=485s

When doing serverless or, more specifically, functions as a service, you can't just look at raw language performance. You also have to factor in cold starts. A cold start happens when your code has been dormant for a while and needs to be downloaded, containerized, booted, and primed before execution. Nathan Malishev wrote an article comparing cold starts across several languages.
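To make the cold start idea a bit more concrete, here is a minimal sketch in Python. It assumes the usual Lambda-style layout where module-level code runs once per execution environment and the handler runs on every invocation. The names `handler`, `_CONFIG`, and `_is_cold` are hypothetical and only serve the illustration; this is not our actual code.

```python
# Module-level code runs once per execution environment,
# which is the "boot and prime" part of a cold start.
_CONFIG = {"greeting": "hello"}  # hypothetical stand-in for expensive setup
_is_cold = True                  # flips to False after the first invocation


def handler(event, context):
    """Hypothetical Lambda-style handler that reports whether this call was cold."""
    global _is_cold
    was_cold = _is_cold
    _is_cold = False  # every later invocation in this environment is warm
    name = event.get("name", "world")
    return {"cold_start": was_cold, "message": f"{_CONFIG['greeting']} {name}"}
```

Only the first invocation in a fresh environment pays for the module-level setup. With low or spiky traffic, environments get recycled more often, so a larger share of requests ends up paying that price.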

And truth be told, if we look at our context, a newly developed web application never starts with high traffic (if it ever gets there). So if we are honest, cold starts may be more of an issue for us than raw runtime performance.

The AWS recommendation regarding language selection boils down to this: pick the language you are most comfortable with, but use an interpreted language if your use case is latency-sensitive and you expect spiky or infrequent traffic.

My interpretation of this advice: performance-wise, you can't go wrong with an interpreted language.

Give or take a year ago, we started prototyping, and for our first iterations we used NodeJS, guided by the sheer number of tutorials, how-tos, and packages for NodeJS on Lambda available in the community. But even then, the generator itself was written in Python, running in a Docker container within Lambda. Yes, we did some crazy, unnecessary stuff during prototyping, but that's how you learn, right? In a future post, I'll elaborate on our learning curve in the realm of code generation. But that is not the point. Somehow we selected NodeJS for our serverless code, and yet we used Python to convert a model to said code. Why? Well, honestly, it was just easier to do in Python, primarily due to the maturity of Python's packages and the community around it.

Although JavaScript is synchronous by default, the NodeJS movement leans toward asynchronous. All remote calls are asynchronous, calls to the database for example. Asynchronous processing is a powerful tool, but if you don't need it, it quickly becomes a burden. Like hanging a photo frame with a hydraulic jackhammer, the tool just doesn't fit the need. And this was noticeable in our code: awaits everywhere. This is a big reason why we chose to use Python for our generator. Maybe the asynchronicity would have made sense if our use case were centered around OLAP instead of OLTP.

For us, it makes more sense to use a runtime where synchronous is the default and asynchronous is the exception. Shifting our development from NodeJS to Python made a huge difference for readability and, subsequently, for the ability to reason about the code and its intended behavior. And the ability to reason about code and behavior is our top priority.
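As a rough illustration of "synchronous by default, asynchronous as the exception", consider the sketch below. All function names and data are made up for the example. The typical OLTP-style path reads top to bottom without a single await, and the one place that might benefit from concurrency is fenced off behind an ordinary synchronous function.

```python
import asyncio


def get_order(order_id):
    """Hypothetical synchronous lookup; stands in for a database call."""
    return {"id": order_id, "status": "open"}


def close_order(order_id):
    """The default path: plain, sequential, easy to reason about."""
    order = get_order(order_id)
    order["status"] = "closed"
    return order


async def _fetch_remote(name, delay):
    """Hypothetical remote call, simulated with a short sleep."""
    await asyncio.sleep(delay)
    return name


async def _fetch_many(names):
    # gather runs the coroutines concurrently and keeps the input order
    return await asyncio.gather(*(_fetch_remote(n, 0.01) for n in names))


def fetch_many(names):
    """The exception: async stays inside, callers remain synchronous."""
    return asyncio.run(_fetch_many(names))
```

Callers of `close_order` and `fetch_many` never see a coroutine, so the async machinery does not spread through the rest of the codebase.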

Maybe I'm biased by my background in bioinformatics or because my first personal projects were written in Django. Nonetheless, we love Python!