Matt Butcher and Matt Farina, authors of the book Learning Helm, join SE Radio host Robert Blumen to discuss Helm, the package manager for Kubernetes. Beginning with a review of Kubernetes and Helm, this episode explores the history of Helm; the need for a package manager on Kubernetes; Helm terminology; how Helm handles package dependencies; how Helm packages are configured – including both settings and templates; expanding templates in preview mode; failure modes and rollback; Helm chart repositories; and Artifact Hub – the public package repository.
This transcript was automatically generated. To suggest improvements in the text, please contact firstname.lastname@example.org and include the episode number and URL.
Robert Blumen 00:00:21 For Software Engineering Radio, this is Robert Blumen. I have with me today two Matts: Matt Butcher and Matt Farina. Matt Butcher is the CEO at Fermyon Technologies. He is a founding member of many open source projects, including Helm. Matt Farina is a distinguished engineer at SUSE, the co-chair of Kubernetes SIG Apps, and a maintainer on Helm. Along with Josh Dolitsky, who is not here today, they are the authors of the book Learning Helm: Managing Apps on Kubernetes, and we will be talking about Helm. Before we get started, I want to refer listeners to Episode 446, about Kubernetes, and 489 on package management. Matt and Matt, welcome to Software Engineering Radio.
Matt Butcher 00:01:17 Thanks for having us.
Matt Farina 00:01:18 Thanks for having us.
Robert Blumen 00:01:19 This is our first ever episode with two Matts on the same episode, very distinguished. Before we get started, would either of you like to say anything about your background that I didn’t cover?
Matt Butcher 00:01:32 All right. I’ve been working in open source for a long time now. You know, most recently I worked for a startup called Deis, which got into the container ecosystem very early. I think we were using Kubernetes when it was about 1.0, 1.1. Some of the members on my team wrote things like the Docker volume system, or contributed to the Docker volume system. And we were kind of building a platform as a service at the time we discovered Kubernetes, and it was like a light bulb went on and we just sort of instantly fell in love. And that really got us sort of wholeheartedly invested in Kubernetes. Helm came out of that. A number of other tools came out of that. The Illustrated Children’s Guide to Kubernetes came out of that, and we never looked back. We went on from there, became part of Microsoft, spent years developing there. And then most recently I left Microsoft with some of my friends, and we started a company called Fermyon Technologies.
Robert Blumen 00:02:24 For the listeners, that was Matt Butcher. Matt Farina, would you like to add anything?
Matt Farina 00:02:30 Yeah, certainly. Thanks, Butcher, for going first. It gave me some time to think about it while you just off the top of your head had to rattle something off. I’m Matt Farina. You’ll probably hear me referred to as Farina on here. I came through a different route to all of this. So, I’m a distinguished engineer at SUSE, and more recently I’m on the Technical Oversight Committee rather than being on Kubernetes SIG Apps or Architecture anymore, because that’s a lot of work to have all those. And so, that’s where I’m at these days. And I came to Helm through a different route. At the time I was co-chair of Kubernetes SIG Apps, and Helm had become a sub-project as part of Kubernetes, before it had rolled off to be a full Cloud Native Computing Foundation project. I got pulled in just to start helping: working on the charts (those are the packages we’ll talk more about) and getting involved in how they work and in the automation and tooling around them. And I eventually became a full Helm maintainer through that process of contributing, and Matt and I have a long history of collaborating on things. So, it was very easy for me to get into the flow of working with him.
Matt Butcher 00:03:34 In fact, I think we’ve known each other since 2009; we were doing Drupal websites together back then.
Matt Farina 00:03:39 Yeah. Yes we were. Something like 2009, and we worked together at two companies. This book was our third book working on together. You’ve roped me into a lot of things.
Robert Blumen 00:03:51 And for the listeners, that was Matt Farina. Today, we will be talking about Helm, a package manager for Kubernetes. Before we get into the main part of the discussion, I’d like to do a brief review of Kubernetes and a brief review of package management. One of you pick each one of those and give a thumbnail.
Matt Butcher 00:04:13 You want to take Kubernetes? I’ll take package management.
Matt Farina 00:04:16 Sure. I’ll take Kubernetes. So Kubernetes is built as a container orchestration system, and it can be used more generally to orchestrate other things as well. But the way I like to think about it is it’s kind of like a cluster-wide operating system, and it can scale from one machine up to many. And in this case, your loads are the different containers that you’re running, and they can be scheduled across the hardware. I like to think about it sort of like you’ve got hot-swappable hardware when you have a cluster, where if something fails, it gets rescheduled elsewhere. You can easily add more to it, but it’s sort of a platform for running things, primarily containers, and doing it in a declarative way where you tell the system, here’s what you want to run. And then it figures out how to run that as best it can, doing things like bin packing on servers, scheduling things close to each other and doing that for you. You think that’s a pretty good explanation, Matt?
Matt Butcher 00:05:17 Yeah. In fact, the whole “Kubernetes is an operating system” thing is really my favorite way to describe Kubernetes. And that was sort of one of the early aha moments that led us to Helm, because one of the core features of pretty much all popular operating systems is they have some sort of package management. And, you know, sort of roughly conceived, package management is just a system that gives you, the user of an operating system or of a programming language or something, a pattern and a repository full of things that you can fetch and install locally. Right? So you’ll have a command to grab something and install it locally. You’ll have a command to package up something locally and push it back up into the repository, and then a whole bunch of auxiliary and helper commands.
Matt Butcher 00:06:03 And when we first started working in Kubernetes at Deis, we were building a PaaS application that was supposed to sit on top of Kubernetes. Things were going great. As far as building this PaaS system, this platform as a service, we were solving a lot of problems. Kubernetes was doing great things for us, but then when it came to installing, we were asking the installer to walk through a whole bunch of individual steps to get each little piece and part installed one at a time and configured. And as the story goes, we had an all-company meeting, and the purpose of the meeting was to announce to the entire company that we were going to pivot from multi-platform to just doing Kubernetes. And part of that meeting was a hackathon project. My team in the hackathon project went, wouldn’t it be cool if we solved this particular problem, if we tried to figure out how to do package management for Kubernetes so that others, as they come to Kubernetes, will be able to easily get started, easily install those first few bits, and then easily start building up their own packages that house the configuration for their own applications?
Matt Butcher 00:07:13 And that was really where Helm came from.
Robert Blumen 00:07:16 You got a little bit into this, Matt Butcher in your last comment. What does a developer experience look like on Kubernetes without Helm? And how does it change when you adopt Helm?
Matt Butcher 00:07:28 Without Helm, Kubernetes really is configured via a whole bunch of YAML files. You’ll have to write a file, in the YAML format, that describes each object that you want to put into the Kubernetes cluster. Some of these objects will be things like deployments, which describe to Kubernetes what the application is, how it should be deployed, how it should be upgraded. Other things will be more on the configuration side, like config maps or secrets, which will hold, essentially, configuration data: settings files, preference files, things like that. And then you’ll have other things like network-attached storage and information about how services come up. So, as you’re listening to the litany of things I’m describing, I want you to imagine writing about a 200-line to 500-line YAML file to describe each and every one of these things.
Matt Butcher 00:08:18 So to install your typical application, you’re talking about writing 600 to 800 lines of YAML just to get going, right? And then it grows from there, and then each different Kubernetes cluster with each different kind of Ingress controller, whatever its nuances and details are, would require different variations of that same YAML file. That works well when you have a very small number of things and a very well-known set of features that you need to support. But if you are trying to install somebody else’s application, it is no fun to attempt to generate all those things. Or if you are responsible for deploying the same application to dozens and dozens of different Kubernetes clusters, it’s no fun to do that. So, Helm really provided a way to package up these YAML files together, but also to parameterize them and templatize them and make it possible for someone to say, Hey, here’s my deployment.
Matt Butcher 00:09:08 But if I’m running on an AWS cluster with these constraints and these configurations, then tweak things over here, according to this template. But if we’re running in, say, Azure, then tweak these other things over here and run it this way. And if we’re running on-prem, here’s a third different version, right? So, in a sense it’s a packaging up of those YAML files, but also in such a way that the operator, at the time they install something into the cluster, has the ability to provide specific configuration values and turn on and off different dials and switches to make it install just right into their cluster.
Robert Blumen 00:09:42 If I understood, the difficulty of installing a complex system on Kubernetes is that there could be 10 or 20 different Kubernetes objects. And not only do the individual objects need to be configured correctly, but also the associations between them, where one thing needs to point to a field in another thing. Helm is a way of encapsulating all the objects and getting the associations between them correct, so you can install correctly. Is that more or less right?
Matt Farina 00:10:12 Sort of. So, this is Matt Farina, I’m going to jump in here. The way I like to look at it is, say I’m going to install something on Linux, right? Like Postgres. If you do it by hand, you’ve got to know where to put configuration files, where to put binaries, and how to wire it all up together. In Kubernetes, if you’re going to go install something, say WordPress – it’s a popular thing – you’re going to have a bunch of different resource types: secrets, deployments, stateful sets, maybe an Ingress controller. You might have volume claims, things like this, and you’ve got to wire all of those things up to go together. And so everybody who does it by hand has to know how all those manifests work in Kubernetes and how to wire them together. And they have to know how the business logic of the app works in order to do that.
Matt Farina 00:11:02 And just like, if I were going to go install something like Postgres on Linux, where I could do, you know, zypper install or apt install Postgres, and just get it without having to know this, that’s what you get with Helm. I could do helm install and give it some information and say, you know, do WordPress. And it can go install that with default values, and, just like you can with other package managers, you can override those defaults. And so it makes that user experience a lot simpler through using templates and parameterization, and trying to use intelligent defaults, which the package author gets to choose.
Robert Blumen 00:11:36 You mentioned WordPress, give some other examples of popular packages that you can install with Helm.
Matt Farina 00:11:44 Well, I guess for some of the other popular packages, you could do most of the databases, right? Postgres, MariaDB, MySQL, Mongo, Redis. So, you can get into some of those database systems. Most of the things that you can think about as installable services are there. There’s a website, artifacthub.io, which is another CNCF project, that lists lots of these things. And so you can find stuff over there. Butcher, do you have any other ideas? Other things are escaping my mind.
Matt Butcher 00:12:13 Yeah. I think you kind of see Helm charts break down into three big categories, right? I think there are the infrastructure layer categories, things that augment Kubernetes itself: service meshes, things that require custom resource definitions. You’ll see a number of those. And then the second one is really sort of that data plane, or the underpinnings that you would need to write an application: databases, key-value storage, NoSQL, things like that. You tend to see a good grouping of those. In fact, last I checked, I think almost every major database, NoSQL database, and key-value store had a Helm chart somewhere. And then the last one is those end-user-style applications where someone would want to install it and have it running and be able to immediately hit the front end of the web interface and start doing whatever they want to do. Content management systems like WordPress are a good example, and issue trackers, you know, those kinds of things that we all have toyed around with running locally at one point or another in the past. And now you want to have some sort of a productionized version running in your cluster. So those I think are really the three categories we tend to see best represented in places like Artifact Hub.
Robert Blumen 00:13:23 Matt Butcher, you gave a short description of how Helm came into being. I understand it has quite a long history now. We’re up to Helm 3. What are the major evolutions that have occurred in going from zero to three?
Matt Butcher 00:13:40 So Helm 1, which we now call Helm Classic, was originally conceived of just as sort of like a YAML file uploader. It didn’t originally have template support. It didn’t have a lot of management features for what to do after you’d installed something. You could kind of think of it as a tarball full of YAML files and a tool that would untar it and push all of those YAML files up into the cluster. Again, keeping in mind use case number one for us, we were trying to figure out a way to install Deis Workflow, our platform as a service. And that was a good first step. There was actually a lot of controversy at the time about whether YAML files should be templatized or parameterized. There were a lot of people who felt very strongly that they should not, that operators should have to hand tweak the YAML files and not rely on some kind of settings manager or something.
Matt Butcher 00:14:33 But as that conversation kind of began to die down, we began working on Helm 2, in which the template functions and the parameterization became sort of a focal feature set, but also in Helm 2, we made what I think was our biggest sort of misstep. It seemed like a logical thing to do at the time, but we broke apart the Helm client into two pieces, and there was Helm, which you ran locally on your machine, and there was Tiller, which ran inside of the cluster. And Helm would send the chart to Tiller, and Tiller would install it. And then Tiller would manage state, and the Helm client would just connect. But over time, we hit a number of limitations with this model — not the least of which was security: It was very, very hard to lock down Tiller so that you couldn’t have people install all kinds of things, sort of willy-nilly, as what was effectively sort of like the quote unquote root user of the Kubernetes cluster.
Matt Butcher 00:15:28 So that kicked us into our third development cycle for Helm 3, which was to move most of the logic back into the command line client, establish some better patterns, and finally take a chance to make some minor iterations on the chart format. And that was kind of the big focus there. It went really well, and in many ways, Helm 3 felt like it sort of finally realized the potential of what Helm could be for the ecosystem. You know, we talk here and there about Helm 4 — what will be the next big iteration? And it’s hard to really envision another major set of changes like we saw between one and two or as we saw between two and three, because effectively at this point, Helm is a good solid package manager for Kubernetes.
Robert Blumen 00:16:12 You’ve used the word “Chart” a few times. We should get a definition out there.
Matt Farina 00:16:17 Sure. I’ll jump in with this. A Chart is essentially the package of Helm, right? So, in the Kubernetes space, you’ll see most or many of the things use nautical terminology, right? Kubernetes: it’s nautical terminology; Helm: nautical terminology. And so in keeping with that thread, the package that Helm uses is called a Chart just to keep with that nautical terminology.
Robert Blumen 00:16:42 Many package managers have the ability for a package to specify dependencies on other packages. The package manager will figure out the closure of all the dependencies and pull everything in. Is that a feature of Helm?
Matt Butcher 00:17:00 So, Helm was not the first package manager Matt Farina and I wrote. We wrote one for the Go ecosystem, called Glide. And we worked on the dependency-resolution algorithms for quite a while, and one of the things that we sort of derived from this was an appreciation of the difference between an operating system package manager and a programming language package manager. And one of the desirable features of an operating system package manager, particularly one that’s installing into a cluster, is that you really want to know upfront exactly what you’re installing. And you also, in addition to that, may want to install, say, multiple versions of the same thing, right? A MySQL database, for example: you might want to install multiple versions of that in the same cluster. Or, in some cases, we have even seen multiple versions in the same application, as different microservices in the application had different dependencies.
Matt Butcher 00:17:52 And so, when we began working on Helm’s dependency model, our big experiment, which I think has largely turned out very successfully, has been to have the dependency graph sort of resolved, pinned, and included inside of the chart at build time. So, there is zero ambiguity about which version of which chart you’re going to get when you install. There’s no negotiation of versions or anything like that; it’s all predetermined at the time at which you package the software. That said, I mean, there is some dependency management that happens early on in the development cycle, but it’s not like what you would get with, say, Cargo or npm or systems like that, where you may want to intentionally pull whatever the latest version of a particular package is at build time, and then you produce a lock file when you want to stick to just one version or something like that.
Robert Blumen 00:18:40 Trying to think of an example. I’m guessing that if I use Helm to install MySQL, it doesn’t depend on anything else, but if I’m installing WordPress, it may want to pull in Postgres and NGINX. Can you think of any other examples, or is my example correct?
Matt Butcher 00:18:57 WordPress is actually a very good example of this because, as I just described it, sort of all the dependencies are pulled in at build time. If you want to allow the installer to decide between a Postgres database or a MySQL database, you as the package creator, when you create the package, say, “Okay, if you turn on this switch, you get this version of Postgres configured this way, and WordPress configured to use that. If you turn on this other switch, you get MySQL configured this way, with WordPress preconfigured to use that.” So in a way, you know, it pushes a lot of the original configuration work back to the chart developer, and the chart developer rightfully takes their place as the expert on the package they’re producing and says, okay, here’s the right way to configure Postgres. Here’s the right way to configure MySQL. It’s up to you which of those two you want to choose, but I can guarantee you that when you install them, they will each work correctly, because all the versions will be pinned to the correct number. And all of the configurations will have been things that have been tested and so on.
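The pinned-at-build-time model described here can be sketched in a few lines of Python. This is only an illustration, not Helm's implementation: the `PackagedChart` type, the `install_plan` function, and all chart names and version numbers are hypothetical. The point it shows is that installing only walks what is already bundled inside the package, so there is nothing to negotiate, and two versions of the same dependency can coexist.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackagedChart:
    """A chart as it exists after packaging: exact versions, subcharts bundled inside."""
    name: str
    version: str
    bundled: tuple = ()  # PackagedChart instances pinned at package time


def install_plan(chart):
    """Flatten the bundled tree into (name, version) pairs.

    No repository lookup and no version-range solving happens here:
    whatever was bundled at package time is exactly what gets installed.
    """
    plan = [(chart.name, chart.version)]
    for sub in chart.bundled:
        plan.extend(install_plan(sub))
    return plan


# Hypothetical application whose two microservices bundle different MySQL versions.
orders = PackagedChart("orders", "1.4.0", (PackagedChart("mysql", "8.0.1"),))
billing = PackagedChart("billing", "2.0.0", (PackagedChart("mysql", "5.7.9"),))
shop = PackagedChart("shop", "0.9.0", (orders, billing))

for name, version in install_plan(shop):
    print(name, version)
```

Because each bundled subchart carries its own exact version, the two `mysql` entries never have to be reconciled into one.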
Robert Blumen 00:19:59 What is the developer interface to a chart?
Matt Butcher 00:20:02 The primary way of developing charts these days has been through kind of a traditional development environment. One of the people on my team at Fermyon, Ivan, has produced the Kubernetes extension for VS Code, which is this great platform that gives you integration with Kubernetes. It gives you Helm chart-development tools and provides you a lot of autocomplete-style features and template-reference kinds of features that help you build charts very rapidly. Matt Farina, I’m curious, what do you use and what other systems have you seen?
Matt Farina 00:20:32 Well, I use the VS Code plugin. It’s hard to say, because that’s kind of where my typical workflow has been. The other way that I’ve seen it is people just using the helm create command, which is a command that will stub out a chart for you, and then doing copying and pasting from other sources a lot. But they tend to know their app’s business logic and Kubernetes fairly well, to kind of craft a user experience for a consumer, which I think kind of highlights something. In the Helm community, we talk a little bit about roles. And so we’ve got roles like the chart consumer, the Helm CLI user who’s going to use something. Then there’s the person who creates a chart and packages it up and distributes it. And we’ve got some of these different roles, and that end user we prioritize higher, to create a simple user experience. And so that developer who’s working on creating a chart, they tend to know Kubernetes and the manifests and the applications they’re working on and can kind of put things all together.
Robert Blumen 00:21:30 You’ve mentioned customization mechanisms, namely parameters and templates. I want to discuss each of those individually, but preface that by what is the need for the developer to customize a template? Do the defaults work pretty well most of the time, or does it need to be highly custom to the settings and configurations like DNS and IP ranges and sizes and volumes on my Kubernetes cluster?
Matt Farina 00:22:00 You know, it kind of depends on how the chart was created. Usually for things like IP ranges or volumes, you don’t have to configure too much. A lot of it has to do with your application itself. For example, in Kubernetes, you have to deal with scaling, right? Quite often, you don’t run one instance of something. You run multiple instances of something, or you set various configuration parameters, and Kubernetes can scale it up and down. And so you might tell it, you know, run a maximum of five instances, whereas the chart default might be one. And so there are certain things about it that may get into that. If you’re in a company, you may have pulled in the container image from upstream. The chart doesn’t contain the container image; it references it, because that’s how Kubernetes works.
Matt Farina 00:22:44 It goes and pulls it. And so if you’re in a company, you may have, you know, pulled that container image down and put it in your own registry after you’ve scanned it or something. And you need to tell the chart, here’s a different place to get that image from. And there are a number of things like this that are around the Kubernetisms that you might need to do and customize. Then there are things where people are now building application logic right into the chart. So for example, there are WordPress charts where I can tell it at install time, here’s the name of the blog to use, and that will pass it from the chart all the way down into WordPress itself. So when it comes up that first time, it has the right, you know, site name, it can have the right configuration, the right admin username and password. And so this is application business logic that’s passed all the way down, because you’re able to do that.
Robert Blumen 00:23:33 Let’s dive into parameters, starting with examples of some parameters. I think you just gave some, but a couple of more examples. And then how does a developer go about setting parameters on a chart?
Matt Butcher 00:23:48 Yeah, to kind of pick up right from where Matt Farina left off, I think one of the most interesting developments over the course of Helm’s history has not so much been the technology, but the way that chart developers have sort of figured out patterns for parameterizing applications. At the base level, templates will take kind of any of the values you pass in your values.yaml file. And these values can be specified by the chart developer as they build out the chart. And I think originally, you know, we shot for maybe five or six different parameters without really doing much to sort of specify boundaries around them or things like that. What we saw was this sort of burgeoning expertise among operators who were building these charts, who began parameterizing in a very structured and repeatable way where values should go in the chart.
Matt Butcher 00:24:40 And we saw really sort of like the professionalization of generating the Chart.yaml and the values.yaml, such that when you went from one chart to another, you could begin to see the patterns. And I think when you’re getting started, it still makes sense to start out with just trying a couple of simple name-value parameters. But if you take a look at some of the big chart repositories that you see out on the internet, what you’ll see is, in some cases, dozens or even hundreds of lines of possible values that you can configure as you pass them in. And another minor change that happened in Helm 3 was we allowed people to write JSON schema files that would say exactly what types of parameters something could be. So you could essentially assist tools like VS Code or other IDE-style tools to say, Hey, this parameter must be an integer, or must be a floating point between this value and this value, or a string or something like that. But I think really, kind of the bottom line here is we’ve built something that we thought would be very flexible, and that people would kind of go with just a few brief things. And what we’ve seen is really sort of a development of an ecosystem that values patterns, and that talks a lot about chart best practices, for example.
Robert Blumen 00:25:55 If I’m installing a chart such as WordPress which is going to go and pull in other charts, such as Postgres and maybe NGINX, I would need to not only possibly set parameters for WordPress, such as Matt Farina’s example of the name of the blog, but nested into the dependent packages as well. Is that correct?
Matt Farina 00:26:19 It can be, yes. And Helm provides a means to do that. So, say with your WordPress example, you wanted to alter some of the replication characteristics of your database. When you specify those parameters (we call them values), if you’re using a particular database and want to tweak it, and it provides parameters to tweak that, you have the ability to do that. So for your whole nested chain of dependencies, if you want to go configure one of the configurable parameters, that is open to you. Charts usually install very simply with sane defaults. And then from there, as you want to tweak things and become a little bit more of an expert on each part of it, you can go ahead and do that.
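The kind of check described here (a parameter must be an integer, or a number within a range, or a string) can be sketched as follows. This is a deliberately tiny stand-in written in Python and does not implement JSON Schema; the `SCHEMA` layout, the `validate` function, and the parameter names `replicaCount` and `blogName` are all assumptions made for illustration.

```python
# Hypothetical, much-reduced schema: one rule per top-level value.
SCHEMA = {
    "replicaCount": {"type": "integer", "minimum": 1, "maximum": 10},
    "blogName": {"type": "string"},
}

# Mapping from schema type names to the Python types that satisfy them.
TYPES = {"integer": int, "number": (int, float), "string": str}


def validate(values, schema):
    """Return a list of human-readable problems; an empty list means the values pass."""
    errors = []
    for key, rule in schema.items():
        if key not in values:
            continue  # absent values fall back to chart defaults in this sketch
        value = values[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, TYPES[rule["type"]]):
            errors.append(f"{key}: expected {rule['type']}")
            continue
        if "minimum" in rule and value < rule["minimum"]:
            errors.append(f"{key}: must be >= {rule['minimum']}")
        if "maximum" in rule and value > rule["maximum"]:
            errors.append(f"{key}: must be <= {rule['maximum']}")
    return errors


print(validate({"replicaCount": 3, "blogName": "My Blog"}, SCHEMA))
print(validate({"replicaCount": "3"}, SCHEMA))
print(validate({"replicaCount": 50}, SCHEMA))
```

A real schema file would be richer than this, but the benefit is the same: a typo such as passing the string "3" where an integer is expected is caught before anything reaches the cluster.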
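One way to picture defaults plus nested overrides is a recursive merge, sketched below in Python. In Helm, values intended for a dependency are nested under a key named after that dependency; everything else here is assumed for illustration, including the `merge_values` function and the specific keys (`blogName`, `replicaCount`, `mariadb`, `image`).

```python
import copy


def merge_values(defaults, overrides):
    """Recursively overlay user-supplied values on top of chart defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: descend so sibling defaults survive.
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults chosen by the chart author.
chart_defaults = {
    "blogName": "My Blog",
    "replicaCount": 1,
    "mariadb": {  # values for the database dependency live under its name
        "replicaCount": 1,
        "image": {"registry": "docker.io", "tag": "10.6"},
    },
}

# The consumer only states what they want to change.
user_values = {
    "blogName": "Two Matts",
    "mariadb": {"replicaCount": 3, "image": {"registry": "registry.example.internal"}},
}

final = merge_values(chart_defaults, user_values)
print(final)
```

The consumer changed three settings, two of them inside the dependency, and every default they did not mention (such as the image tag) is still in place.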
Robert Blumen 00:27:04 We’ve been talking a bit about parameters. The other major customization method is templates. What is the need for template and why are parameters by themselves not sufficient?
Matt Butcher 00:27:18 Yeah, our first attempt was to really try and stick to just parameterization, and just say, Hey, here’s a value, you just substitute it. We even used sort of like a bash-shell-style, dollar-sign-something notation. But what we discovered was that in a declarative syntax like Kubernetes, there are cases where you have to describe things using different structures, right? Different structural elements, not simply a string substitution. It’s not simply setting the replica count from three to five. It’s saying, Hey, if this condition obtains, then this whole section of the YAML file needs to be different. Or for configuration files, here’s nine name-value pairs. You know, I need them all organized into individual parameters plus values. Here’s a list of volumes; I need to iterate twice on them, once here and once here.
Matt Butcher 00:28:12 And as we got into those cases, the declarative format combined with mere value substitution meant the values would be many, many lines long, right? It’d be dollar volumes, and it would be a 40-line value on the other side. Not a terribly good experience, very difficult to manage. We gave up on that very, very quickly. I think Helm Classic had this feature, and then by Helm 2, we had moved on. Template languages gave us just the right level of flexibility to say, here’s sort of a minimalist language for expressing the logical relationships between things and for expressing a context that needs to surround particular values as we inject them. And in fact, the Go template language, the syntax that we chose, was really a fairly minimal template language that provided just kind of the features that we felt like we really needed.
Matt Butcher 00:29:05 Of course we were wrong in asserting that and ended up having to write a template function library that sort of augmented the base Go language, but with things that made sense, right? Here’s another good example of where mere parameterization didn’t work. Kubernetes in some cases names things with all caps and underscores, and in other places with all lowercase and dashes, and it might be the same object. Well, instead of having to maintain two versions of the same string that are differentiated only by the capitalization and the swapping of underscores and dashes, we could write template functions that allowed you to say, Hey, in this context, it needs to be kebab case, so use lowercase and dashes. In this case it needs to be shouty caps, so use all capital letters and underscores, and transform the same string back and forth. Ultimately then, we have never looked back since switching from value substitution to templates. Occasionally we’ve gone back and forth on whether we chose the right template language. And I’m sure people have opinions about that, but we chose the one that at the time felt like the best one for the job and have kind of stuck with it over the years.
Matt Farina 00:30:07 Yeah. I’d like to add just two quick things here on this. Because I came into Helm after the template system was in place, right? That’s when I could develop on it, and I was really drawn to it because I realized that value substitution is one thing, but a lot of developers, people who are used to creating things, are used to working with template systems. Whether it’s on the web or with text, it’s really common to work that way. And so by doing something like that, that works across programming languages and all these environments, it’s a kind of system people are used to, and it made it easy for people to jump in and create things. But I also think that was a really useful thing for Helm to add in and make it easy for people to use. Because if I go look at package managers for operating systems, I sometimes have to go learn a new scripting language or a new way of doing things.
Matt Farina 00:31:00 And a template system is fairly simple, and what Kubernetes needs in its YAML documents lends itself very well to templating systems. And so I think that worked really well in Helm’s favor. But I also think that it’s important to know here that it’s the chart creator who creates the templates, and the chart consumer doesn’t change them. The chart consumer only works with the parameters they pass in, and they actually don’t change or work on the templates themselves. It’s kind of the way it would be if I were working with Linux and there was a shell script inside of a package, right? The package creator would write the shell script and accept parameters into it. But you’re not necessarily going to find the package consumer going ahead and changing that shell script. Same kind of philosophy.
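The string-transformation idea can be illustrated with two small Python helpers. These are hypothetical stand-ins written for this example, not the functions that ship with Helm; the names `kebab_case` and `shouty_case` and the splitting rule (dashes, underscores, whitespace) are assumptions.

```python
import re


def _words(name):
    """Split a name on dashes, underscores, or whitespace, dropping empty pieces."""
    return [w for w in re.split(r"[-_\s]+", name) if w]


def kebab_case(name):
    """Lowercase words joined by dashes, e.g. for an object name."""
    return "-".join(w.lower() for w in _words(name))


def shouty_case(name):
    """Uppercase words joined by underscores, e.g. for an environment variable."""
    return "_".join(w.upper() for w in _words(name))


setting = "database-host"  # the single source string the chart author maintains
print(kebab_case(setting))
print(shouty_case(setting))
```

With helpers like these, one value is written once and rendered in whichever form each context requires, rather than being maintained as two nearly identical strings.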
Robert Blumen 00:31:44 So when you run Helm, after all the substitution and expanding all the templates, what you’re left with now is Kubernetes YAML files that can be deployed into a Kubernetes cluster. Is that correct?
Matt Butcher 00:31:58 You can run a chart to just spit out the YAML files for you, but Helm takes it one step further and says, well, just like any package installer, right? If I were to apt-get install something, it wouldn’t simply drop the binaries out in my local directory, it would install them into place and on occasion, right, it would start up a server for me, insert startup scripts, that kind of thing. Helm really very much is inspired by that level of package management. And so where we view the starting point for Helm is creating those charts and stuff like that. But where we view the functional endpoint for Helm is it should install something and bring it up to running. And once it’s installed all the YAML files into the cluster and put into place all the things that should be there, that’s the point at which it says, okay, my work here is done. And of course, then you’ve got other things like upgrade and delete. Essentially, an upgrade will be able to diff what’s there in the cluster and what this new version of the chart has and patch things sort of strategically so that it brings you up to date with where you want to be. And then deleting of course goes through and, using that same sort of YAML as context, says, okay, remove these items back out of the cluster.
Robert Blumen 00:33:08 I want to come back to upgrade and delete in a moment, but one more question about templates, even though I would not, as a Helm user be modifying the template, there is still the question of what does it look like before it gets expanded? If I’m looking at the code, and aiming at a certain result, I understand there is a way to preview the expanded templates before they get pushed up to Kubernetes. Can you explain that?
Matt Butcher 00:33:37 Yeah. So there are multiple phases as you’re rendering a template, right? So the Helm client will read in the chart, uncompress the file, read the Chart.yaml and then iterate through the templates directory, find all the templates, load them into memory and then take the given values and express them into YAML. At that stage right there, you can sort of interrupt it and say, just, you know, output the results of this and stop. That can be a very useful thing if you want to, say, check your rendered YAML into a GitHub repository, or if you want to pipe the results of that template out into another program that has to do some other kind of modification or ingestion of that. So it is definitely possible to do that. We have the command helm template to be able to do that: just render the templates, dump the result to standard output. That’s actually great for debugging as well, but that is actually sort of like a developer story, not typically what we tend to think the end users do as a matter of course, right? The people who are actually installing and upgrading things.
Robert Blumen 00:34:40 And I understand there are some subtleties where the preview template may not be identical to the way it runs on the Kubernetes cluster. Can you explain that?
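The rendering phases described above (load the templates, apply the values, emit YAML, and stop before anything is sent to the cluster) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Python's `string.Template` stands in for Helm's Go-based template engine, and the function name `render_chart`, the file names, and the values are all hypothetical.

```python
from string import Template


def render_chart(templates: dict, values: dict) -> str:
    """Render every template with the given values and join the results.

    Nothing is installed; the rendered text is simply returned, which is
    the idea behind a render-only preview step.
    """
    documents = []
    for name in sorted(templates):
        body = Template(templates[name]).substitute(values)
        documents.append(f"# Source: {name}\n{body}")
    # YAML uses '---' to separate multiple documents in one stream.
    return "\n---\n".join(documents)


# Hypothetical chart contents, kept tiny for illustration.
templates = {
    "templates/deployment.yaml": "kind: Deployment\nreplicas: $replicas",
    "templates/service.yaml": "kind: Service\nport: $port",
}
rendered = render_chart(templates, {"replicas": 3, "port": 8080})
print(rendered)
```

Because the output is plain text, it can be checked into version control or piped into another program, as described in the answer above.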
Matt Farina 00:34:50 Sure. I’ll jump in here. The difference ends up being that you can tell Helm to do things differently for different versions of Kubernetes. And so when you’re interacting with the cluster, Helm can detect the version of Kubernetes you’re running and then see what logic you’ll want to do for that particular version of Kubernetes. An example of this is Kubernetes APIs. Some of those manifests, those documents we’ve talked about, have changed over time. Many times things will be beta and not generally available, and people will start using them in production. And then when a generally available version comes out, you’ll want to switch to that. And you’ve got to deal with sometimes different versions of Kubernetes providing different versions. You can automate that. When you run something like helm template, we don’t have the actual cluster you’re interacting with. And so we have a default set of configuration and we’ll assume a certain version of Kubernetes. Usually, it’s one of the latest released versions at the time of that Helm release. And we’ll assume that version. And so some things might come out differently if you’re running a different version of Kubernetes. That’s probably one of the easiest examples.
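To make the version difference concrete, here is a minimal Python sketch of version-dependent rendering. It assumes the Ingress API as the example (served as `networking.k8s.io/v1` from Kubernetes 1.19 onward, with a beta version before that). The constant `DEFAULT_KUBE_VERSION` and the function name are hypothetical; they stand in for whatever default an offline render assumes, not for Helm's actual value.

```python
# Hypothetical default used when no cluster is available to ask.
DEFAULT_KUBE_VERSION = (1, 20)


def ingress_api_version(kube_version=None):
    """Pick an Ingress apiVersion for the given (major, minor) cluster version.

    When kube_version is None (an offline preview), fall back to the default,
    which may differ from the cluster the chart is eventually installed into.
    """
    version = kube_version if kube_version is not None else DEFAULT_KUBE_VERSION
    if version >= (1, 19):
        return "networking.k8s.io/v1"
    return "networking.k8s.io/v1beta1"


preview = ingress_api_version()            # offline render, default version
on_old_cluster = ingress_api_version((1, 18))  # render against an older cluster
```

The two results differ, which is the kind of mismatch between a preview and a real install that the answer describes.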
Robert Blumen 00:36:01 Let’s get back to upgrade and delete. Starting with upgrade, why would I want to upgrade?
Matt Farina 00:36:08 Well a simple reason you might want to upgrade is, your application has had a new version. And take a database, we’ve talked about databases. Say there’s a patch release version of your database that had bug fixes or security fixes. Just like if I were on Linux, I’d want to go upgrade my database to pull in those fixes. The same thing happens inside of a Kubernetes cluster. You want to get those new versions, right? The new revision of your actual software. And so that’s a big reason that people upgrade.
Matt Butcher 00:36:35 I think another one that was maybe a little surprising to us was that people over time decide to change their configuration, right? So when you think about the way an operating system package manager or Homebrew or something like that works, you tend to install the software and then configure it after it’s installed. And you don’t have to upgrade for a configuration. But in a cluster-managing package manager, like Kubernetes, you’re pushing the configuration into these same declarative files that hold all the operational information. And there’s no separation of concerns between configuration and operational information. And consequently, if you want to change the way that your Helm chart is working, you will often have to upgrade it by just merely supplying different configuration values and then running the upgrade command. The interesting thing about the way Kubernetes works is because it’s declarative and because one particular parameter might get injected into 15 or 20 different Kubernetes objects, what appears to be a simple one-line change to a configuration parameter may actually result in half a dozen or a dozen or more different Kubernetes objects being sort of redeployed. So our upgrading logic then had to handle, even for these cases where you weren’t changing from say Postgres 1 to Postgres 2, right, the ability of the package manager to be able to do this sort of smooth upgrade with strategic patches, just fixing the things that are needed and cycling the objects that need to be cycled and leaving everything else alone. That was all a very critical thing, even in these cases of simple, or seemingly simple, configuration change.
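The point that a one-line configuration change can touch several objects, while others are left alone, can be sketched in Python as follows. This is an illustration only: the `render` function, the object names, and the values are hypothetical, and real upgrades compute patches against the cluster rather than comparing dictionaries.

```python
def render(values: dict) -> dict:
    """Hypothetical chart: one value ('log_level') is injected into two objects."""
    return {
        "Deployment/web": {
            "image": f"web:{values['tag']}",
            "log_level": values["log_level"],
        },
        "ConfigMap/web": {"LOG_LEVEL": values["log_level"]},
        "Service/web": {"port": values["port"]},
    }


def changed_objects(old: dict, new: dict) -> list:
    """Return the names of objects that differ between two renders."""
    names = old.keys() | new.keys()
    return sorted(name for name in names if old.get(name) != new.get(name))


before = render({"tag": "1.0", "log_level": "info", "port": 80})
after = render({"tag": "1.0", "log_level": "debug", "port": 80})

# Only the objects that actually changed need to be patched.
to_patch = changed_objects(before, after)
print(to_patch)
```

Here a single changed value affects the Deployment and the ConfigMap, while the Service is untouched and would not need to be cycled.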
Robert Blumen 00:38:12 In these multiple object changes, can there be partial failure modes where the upgrade not only doesn’t complete but it modifies the system and leaves you in a partial state?
Matt Butcher 00:38:25 Yeah. One of the biggest risks in these kinds of declarative systems, where you would declare a bunch of things that all work together and are tied together in many cases by strings that the system interprets for you and connects in specific ways, is that there’s always a risk that one thing won’t quite attach to other things correctly, or a slight configuration modification in one thing will render it entirely incompatible with another object. There are some things in Kubernetes that are mutable and other things that are immutable, and there can be occasions where a mutable thing gets changed, but the system can’t change the immutable thing. So, there are a number of different cases where you can get yourself into a situation where one piece has failed or a couple pieces have failed after an upgrade, which is why Helm has a rollback command that will essentially say, okay, well, you know, reverse back out those patches we just applied and see if we can get ourselves back to a stable state.
Matt Butcher 00:39:24 That means Helm has to retain a little more state information about what your cluster looks like. But we found that to be an invaluable tool, right? Of course, every software developer ever says, oh well, before you install this in real life, go test it out. But we all know that there are those situations where it didn’t show up in the testing environment, or you were in a rush and forgot to test it out or something like that. So commands like rollback make it possible to get you out of holes like that when something goes wrong.
Matt Farina 00:39:52 And I think it’s important to also note that these things where you’re updating Kubernetes and something could go wrong, where one thing gets maybe patched and another thing can’t because it’s immutable and then you end up in a broken state, those are parts of Kubernetes, not so much Helm. If I were manually just working with these YAML files and I did the same thing, I could end up in the same bad state. It’s one of the reasons I like Helm rollback, because if I somehow screw up, I can easily roll back multiple configuration things, all part of that same chart.
Robert Blumen 00:40:24 Kubernetes itself has a rollback capability. Is Helm rollback built on top of Kubernetes rollback?
Matt Butcher 00:40:30 Helm is not built on Kubernetes rollback. It’s built on Kubernetes’s patch system. And basically we reverse out the last patch that we did by recalculating the patch to go back to its previous state. As one of my friends, Bridget, who’s one of the leads in the Helm community, likes to say, there’s no time machine included here. The process of rolling back is essentially saying, hey, we generated a diff of this YAML and that resulted in this YAML, and then we uploaded it and that resulted in a broken state. So we’re going to reverse the diff, generate a new YAML that resets it back to the way it used to be, and run that. So it is essentially an automated version of what you would do if you were manually repairing and said, okay, so what did I change? I changed these nine things. So I’ve got to reverse all of these back out again.
Matt Farina 00:41:16 It reminds me a little bit of, if I go to undo a commit on GitHub, it doesn’t just take out my top commit. It creates a new commit that undoes what the previous one did. And so it’s a little bit of trying to do that exact same kind of thing.
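The "no time machine" idea, where a rollback is a new change that restores an earlier state rather than erasing history, can be sketched in Python like this. The class `ReleaseHistory` and its methods are hypothetical names for illustration and do not reflect how Helm actually stores release state.

```python
class ReleaseHistory:
    """Keep every applied manifest as a numbered revision (1-based)."""

    def __init__(self):
        self.revisions = []

    def apply(self, manifest: dict) -> int:
        """Record a new revision and return its number."""
        self.revisions.append(dict(manifest))
        return len(self.revisions)

    def current(self) -> dict:
        return self.revisions[-1]

    def rollback(self, to_revision: int) -> int:
        """Re-apply an earlier manifest as a brand new revision.

        History is never rewritten; like a revert commit, the rollback
        is recorded as one more step forward.
        """
        target = self.revisions[to_revision - 1]
        return self.apply(target)


history = ReleaseHistory()
history.apply({"image": "db:1.0"})        # revision 1: install
history.apply({"image": "db:1.1-broken"})  # revision 2: bad upgrade
new_revision = history.rollback(1)         # revision 3: same content as 1
```

After the rollback there are three revisions, and the latest one matches the first, mirroring the revert-commit analogy above.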
Robert Blumen 00:41:32 The other topic that I said I’d get back to is delete. What does that do?
Matt Butcher 00:41:38 Because Helm knows which objects have been placed in the Kubernetes cluster, the Helm delete function will go in there and take all of those and remove them. It essentially runs the equivalent of a kubectl delete command on each thing that it knows is associated with the chart. There are some fascinating nuances with the way Kubernetes works that make delete a very dangerous operation in some cases, and Helm has gone to considerable lengths to avoid some of these. Kubernetes has the concept of ownership, where a deployment will spin up a replica set that it then claims to own. And a replica set will spin up pods, which it then claims to own. And the desirable impact is when you delete the deployment, you want it to delete the replica set and have the replica set delete all of the pods.
Matt Butcher 00:42:27 And so Helm doesn’t need to track where the replica sets are and what individual pods are running. It just needs to track the deployment. There are other cases that are iffy, like CRDs. You might create a CRD inside of your cluster, but when you delete a CRD, you don’t necessarily want to delete every single instance of the CRD. In fact, in many cases, you don’t actually want to delete a CRD at all. And so we put in a number of safeguards to prevent some of these edge cases from happening, but for the most part, Helm will track the top-level objects that are created and then trust that Kubernetes’s parent-child relationship will take care of cleaning up all the children that were created by the parent objects.
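The ownership chain described here (deployment owns replica set, replica set owns pods) can be modeled with a small Python sketch. This only simulates the cascading cleanup that Kubernetes performs; the function name `cascade_delete` and the object names are hypothetical.

```python
def cascade_delete(cluster: set, owners: dict, name: str) -> None:
    """Delete `name` and, recursively, everything it owns.

    `owners` maps each child object to the parent that owns it.
    """
    children = [child for child, owner in owners.items() if owner == name]
    for child in children:
        cascade_delete(cluster, owners, child)
    cluster.discard(name)
    owners.pop(name, None)


cluster = {"deployment/web", "replicaset/web-1", "pod/web-1-a", "pod/web-1-b", "configmap/other"}
owners = {
    "replicaset/web-1": "deployment/web",
    "pod/web-1-a": "replicaset/web-1",
    "pod/web-1-b": "replicaset/web-1",
}

# Only the top-level object needs to be tracked and deleted explicitly.
cascade_delete(cluster, owners, "deployment/web")
print(cluster)
```

Deleting the one tracked parent removes the replica set and both pods, while the unrelated object is left in place.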
Robert Blumen 00:43:07 We’ve been talking for the first part of the interview about more or less what it does going into detail. I want to change directions now a bit and talk about what kind of public repositories are available containing Open Source charts.
Matt Farina 00:43:26 Wow. There are a lot of repositories containing open source charts. Originally when Helm 2 came out, they created a charts repository, and it was example charts and people started adding more and more. And it turned from example charts to hundreds and hundreds of charts put together by people at different companies, and the growth became mostly unmanageable. And so we shifted. Helm already had this ability to handle many different repositories. And so we kind of shifted from having a central repository that everybody was using to many repositories. And we found that people at companies all over, or just individuals, would get together on their own and just create these open source charts. And you can search for those now on Artifact Hub, but there are some from companies like Bitnami, which is now part of VMware, which has a set of really excellent charts.
Matt Farina 00:44:21 I installed something just over this past weekend and it needed MariaDB and it got it from the Bitnami set because it’s really robust and they keep it up to date. And there’s just so many. Most of the major companies that I found, many of them, you know, Microsoft included, and Amazon, they’ll have charts out there that are public to install software. And all of this is open source. In fact, I’m not familiar with people doing widely distributed proprietary charts. They’re making all of these things, where they want people to consume and run their software, open source, as far as the charts go. And so there are thousands and thousands of charts for different pieces of software.
Matt Butcher 00:45:04 And I do think it’s right to look at Artifact Hub as sort of like the first place you go to find charts. It is sort of like the Docker Hub or the npm of the Helm world. It also has all kinds of other artifacts that are not just Helm charts. It’s a great place to kind of see what cloud native packages are out there and available for install, and what systems are supported. Matt Farina of course is one of the architects and lead developers on that project. But since Artifact Hub came around, it’s been so much easier to find and install not just Helm charts, but a wide variety of different Kubernetes and cloud native technologies.
Robert Blumen 00:45:40 If I picked some popular open source software, you mentioned MariaDB, Matt, and I did a search on Artifact Hub, would I be likely to get multiple search results, reflecting different opinions by practitioners on the best way to install that piece of software?
Matt Farina 00:46:50 There’s also things like verified repositories, so you can verify that the person who owns the repository listed it here, for that end-to-end verification. They’ll show off other characteristics, such as, for the container images, are there known vulnerabilities in those? And so, because you have all of these, you know, different ways people could package them up, you can’t just say there’s one, I’m going to install it. You want to be able to easily evaluate those. And Artifact Hub tries to bubble up those details to make it easy to figure out what those are, so you can make the decision that’s right for you.
Robert Blumen 00:47:25 If you are a software vendor now, and you want people to use your software to try it out, is it becoming almost a standard that you have to issue a Helm chart along with your software to make it simple for people to try it out?
Matt Farina 00:47:41 You know, I would say that it has become sort of a standard. There are DevOps people who like to work with their own raw YAML files, just to give an example here. And they would prefer to do that because they know Kubernetes, they know their applications really well, but when they want to distribute it widely, they still end up needing to create a Helm chart just to help them get the distribution of their core software. And so I think for some time, if you want to get something out there and easily consumed, you end up offering a Helm chart as an option to install it. And many people use that.
Robert Blumen 00:48:13 Within a large enterprise, if it’s large enough, you will have some software that is used in multiple places throughout the enterprise, or you have groups building something that another group needs. Can an enterprise set up an internal repo for sharing Helm charts within their boundaries?
Matt Butcher 00:48:35 Yeah, it is very easy to set up a Helm repository. And the reason we made it such was so that both enterprises and, you know, individuals and everything in between would be able to easily set up repositories the way they wanted. So we even had published instructions that say, hey, you want to set one up internally using these tools? Here’s how to do it. You want to set it up publicly using nothing but GitHub? Here’s how to do it. And we try and sort of stay on top of all the different ways that people could stand up a simple Helm repository for, you know, again, anything from the weekend project to the corporate Helm charts that have already passed the internal security reviews and things like that. Matt Farina and I have both worked at a number of places together. And one of the virtues of that is when we worked at HP, we understood what it meant to need a strong, secured internal-only repository, though when we worked at, uh, you know, the volunteer.net, doing websites, we understood the need to be able to publish something very simply and very quickly out on a website where other people could make use of it.
Matt Butcher 00:49:38 And we’ve kind of learned this lesson and tried to apply it, as have the rest of the Helm maintainers, you know, to make it as simple as possible to stand up Helm repositories that meet the needs of you and your organization.
Robert Blumen 00:49:51 We’re getting close to end of time, Matt Butcher. Is there anything you would really like the listeners to know that we haven’t talked about before we wrap up?
Matt Butcher 00:49:59 Yeah. I think that for me, the joy of working on a project like Helm has been to see it sort of flourish over the years, to have an increasing number of people join the community with different needs and work their way through those first hello world examples to the point where they’re producing their own charts. Now, as we enter this kind of what I think of as like the third phase of Helm’s life, right, where Helm is sort of present in every Kubernetes ecosystem, it becomes more and more important for us to kind of find the leaders in the community who are going to become, you know, the ones who lead others in the future into Helm and the ones who make the decisions of what’s going to go into Helm 4 and Helm 5 and Helm 6. So if that’s the kind of thing that resonates with you, you know, we’ve got an open public developer meeting every Thursday; details on that are on the Helm community website. You know, we have roles available for people who want to help triage issues and work their way into becoming core maintainers. We’re really excited as we get looking in the years beyond to what’s going to come in as features for the Helm 4 project, once we get going on that.
Robert Blumen 00:51:05 Matt, would you like to get anything covered that we missed so far?
Matt Farina 00:51:12 You know, in addition to what Matt Butcher said, I think I’m amazed at how many supporting tools there are for Helm now, right? There’s Helm itself, you know, the package manager, but whether you want to create charts or put them in through CI and testing and vetting, or just, as I learned this morning, somebody sent me a whole new package to help you work with charts that I’d never seen before. The ecosystem of people on their own, or at companies, and just all around, have created so many tools to help support people who want to work with Helm and charts that for almost anything where I’m like, ah, I want to go create this thing, it’s a neat idea, I jump into a search engine and look for it, and I find somebody already has, because there are so many people using it and trying to make themselves and others successful with what they’re doing. There’s just so many tools and methodologies out there.
Robert Blumen 00:52:04 Matt Butcher, where can people find you?
Matt Butcher 00:52:07 Yeah, the easiest place for people to find me is on Twitter. I’m @technosophos; pretty much everywhere, I’m technosophos. I hang out pretty regularly in the Kubernetes Slack and the CNCF Slack. As CEO of Fermyon, you’ll see me blogging fairly frequently at fermyon.com. Looking forward to seeing people in person in Valencia, Spain, at KubeCon.
Robert Blumen 00:52:29 And where can people find you?
Matt Farina 00:52:31 Usually, I have a very boring username everywhere. It’s Matt Farina, whether you’re on GitHub or Twitter or in CNCF or Kubernetes, Slack… If you want to find me in all the other places, if you go to MattFarina.com, I think I’ve got links off to most of the other places that you’ll find me.
Robert Blumen 00:52:46 Where can listeners find your book?
Matt Butcher 00:52:48 The Helm book was published by O’Reilly. So it’s easy to get the book anywhere that carries O’Reilly books, including, you know, the big ones like Amazon and Barnes and Noble and things like that. I believe it’s also available as an ebook directly from the O’Reilly website.
Matt Farina 00:53:02 And it’s also available, I think, through their Safari subscriptions.
Robert Blumen 00:53:06 Great. Matt Butcher and Matt Farina, thank you very much for speaking to Software Engineering Radio.
Matt Butcher 00:53:13 Thanks for having us.
Matt Farina 00:53:13 Yeah, thanks for having us.
Robert Blumen 00:53:15 For Software Engineering Radio, this has been Robert Blumen. Thank you for listening.
[End of Audio]