A few weeks ago, one of my colleagues at JUXT gave a presentation on Helm, and this started me thinking back over my own experiences with the tool. It appears I already had a lot to say on the subject back in 2018! Since then, I’ve made extensive use of Helm at CloudBees, where we had an umbrella chart to deploy the entire SaaS platform, and at R3. It’s that latter experience that I’m going to talk about in this post.
Helm and Corda
The main Helm chart in question is the one for R3’s Corda DLT, which you can find on GitHub. The corda.net website has, unfortunately, been sunset, but my blog post describing the rationale for using Helm is still available on the Internet Archive. Another article explains how the chart can be used, along with those for Kafka and Postgres, to spin up a complete Corda deployment quickly.
As an aside, it was a conscious decision not to provide a chart that packaged Corda along with those Kafka and PostgreSQL prereqs. The concern was that customers would take this and deploy it to production without thinking about what a production deployment of Kafka or Postgres entails. Not to mention wanting to make it clear that these were not components that we, as a company, were providing support for.
As a cautionary tale: despite its name, the corda-dev-prereqs chart referenced in that last article (which creates a decidedly non-HA deployment of Kafka and PostgreSQL) found itself being deployed in places it shouldn’t have been…
More Go than YAML
Whilst the consumer experience with the Helm chart was pretty good, things weren’t so rosy on the authoring side. The combined novelty of Kubernetes configuration and Go templating was just too much for many developers. While some did engage, ownership of the chart definitely remained with the DevOps team that authored the initial version, rather than the application developers.
The complexity of the chart also ramped up rapidly. With multiple services requiring almost identical configuration, we soon moved from YAML with embedded Go to Go with embedded YAML! That problem is not unique to Helm; I remember having the same issue with JSPs many moons ago.
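To give a flavour of what ‘Go with embedded YAML’ looks like in practice, here’s a simplified, hypothetical helper in the style the chart accumulated. Almost every line is template logic, with the YAML reduced to a skeleton (the names and structure here are illustrative, not lifted from the actual chart):

```yaml
{{/* Hypothetical helper rendering a container spec for one of many
     near-identical workers; the caller passes a context dict that
     includes workerName alongside the usual Values and Chart */}}
{{- define "corda.workerContainer" -}}
{{- $worker := index .Values.workers .workerName -}}
- name: {{ .workerName }}
  image: {{ printf "%s/%s:%s" .Values.image.registry $worker.image (default .Chart.AppVersion $worker.tag) }}
  {{- if $worker.debug }}
  args: ["--debug"]
  {{- end }}
  resources:
    {{- toYaml (default .Values.defaultResources $worker.resources) | nindent 4 }}
{{- end }}
```

Scale that up across a dozen near-identical workers and the ‘YAML’ files are really small Go programs.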
The lack of typing, combined with the fact that all functions return strings, started to make the chart fragile, particularly without any good testing of the output with different override values.
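To make that concrete: include, like every other call in a Helm template, hands back a string, so any helper that produces structured data has to be parsed back with fromYaml by every caller. A minimal sketch, with hypothetical helper and value names:

```yaml
{{/* Hypothetical helper that renders what looks like structured data */}}
{{- define "corda.kafkaConfig" -}}
bootstrapServers: {{ .Values.kafka.bootstrapServers | quote }}
replicationFactor: {{ .Values.kafka.replicationFactor }}
{{- end }}

{{/* 'include' returns a string, so the caller must remember to parse it */}}
{{- $cfg := include "corda.kafkaConfig" . | fromYaml }}
replicas: {{ $cfg.replicationFactor }}
{{/* Drop the 'fromYaml' and the field access above fails at render time,
     but only when rendered with override values that exercise this path */}}
```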
Two charts are not better than one
If you look at the GitHub repository, you might wonder why most of the logic for the chart sits in a separate library chart (corda-lib) on which the main corda chart depends. What you can’t see is that we had a separate Helm chart for use by paying customers. This was largely identical to the open-source chart, but included some additional configuration overrides. The library chart was an attempt to share as much logic as possible between the two.
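For anyone who hasn’t met the mechanism, a library chart is just a chart declared with type: library; it contributes templates but renders no resources of its own, and each consuming chart pulls it in as a dependency. Roughly, with illustrative versions and paths:

```yaml
# corda-lib/Chart.yaml: shared templates, renders nothing itself
apiVersion: v2
name: corda-lib
type: library
version: 1.0.0
---
# corda/Chart.yaml: the open-source chart (and its enterprise sibling) depends on it
apiVersion: v2
name: corda
version: 1.0.0
dependencies:
  - name: corda-lib
    version: 1.0.0
    repository: "file://../corda-lib"
```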
What we couldn’t share was the values.yaml itself and the corresponding JSON schema, and as a consequence, there was always a certain amount of double fixing that went on. What we really needed was a first-class mechanism for extending a chart.
Helm hooks
Although there were other niggles, the last issue I’m going to talk about is the use of Helm hooks. Corda has two mechanisms for bootstrapping PostgreSQL and Kafka: an administrator can use the CLI to generate the required SQL and topic definitions, or the chart can perform the setup automatically at install time. We expected customers to use the former mechanism, at least in production, but the latter was used in most of our development and testing, and by the services team in pilot projects. The automated approach used a pre-install hook to drive a containerised version of the CLI to perform the setup.
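The hook itself followed the standard pattern: a Job annotated so that Helm creates it, and waits for it to complete, before installing the rest of the chart. A simplified sketch; the image and commands are placeholders rather than the real CLI invocation:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ .Release.Name }}-setup
  annotations:
    "helm.sh/hook": pre-install
    "helm.sh/hook-weight": "0"
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: setup
          image: corda-cli:latest            # placeholder image
          args: ["bootstrap", "db", "kafka"] # placeholder commands
```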
So far, so good. We then started to look at using ArgoCD to deploy the chart. ArgoCD doesn’t install Helm charts directly; instead, it renders the templates and applies the resulting Kubernetes configuration. It does have some understanding of Helm hooks, converting them into its own sync hooks and waves, but it doesn’t distinguish between install and upgrade hooks. This would lead ArgoCD to try to rerun the setup during an upgrade.
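Concretely, pre-install and pre-upgrade hooks both come out of that translation as PreSync, which runs on every sync. As I understand the mapping (worth checking against the ArgoCD docs for your version):

```yaml
# As authored: Helm runs this once, at install time only
metadata:
  annotations:
    "helm.sh/hook": pre-install
---
# As ArgoCD treats it: PreSync fires on every sync, upgrades included
metadata:
  annotations:
    argocd.argoproj.io/hook: PreSync
```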
Now, some of the responsibility here must lie with the Corda team, as those setup commands should have been idempotent, but they weren’t. The answer, for us, was to use an alternative to ArgoCD (worth a separate post), but our customers might not have the luxury of that choice.
Summary
Does all of the above mean that I think Helm is a bad choice? As always, it depends. For ‘packaged’ Kubernetes configuration, I still believe it’s a better choice than requiring consumers to understand your YAML well enough to apply suitable modifications with Kustomize. In particular, pushing Kustomize opens your support organisation up to dealing with customers deploying your solution with essentially arbitrary YAML.
In the case of Corda, we underinvested in building the skills to make the best of Helm. Fundamentally, though, I’d suggest that we simply outgrew it. If I were still working on its evolution, the next step would undoubtedly have been to implement an operator and write all of that complicated logic in a language that properly supports testing and reuse.