Kubernetes CRD and Operator — a basic exercise for beginners

Feel free to skip this introduction. I wanted to learn about CRDs and custom operators and how to build them, so I set myself an exercise. This blog describes how to get started, based on the path I followed. The first step is to understand what CRDs and operators are. After that, we pick a use case that we can try out hands-on. Then we look at the different ways to build it and choose one of them.

And finally let’s get our hands dirty!

Prerequisites:

  • Kubernetes cluster
  • Basic knowledge using Kubernetes
  • Golang setup for development

All code examples on this page are available on GitHub.

What are CRDs

If you have already used Kubernetes, you will have queried for a pod, a service, a deployment and so on. These are all “types” of resources in Kubernetes. They are part of a Kubernetes cluster by default, and each has its own parameters and states. We can think of them as data types in a programming language. When a resource of one of these types is created, a set of actions happens behind the scenes, based on the values in the resource. Let’s look at an example, without going too deep.

Think about creating a Pod. It has parameters such as the container image. When the Pod is created, actions happen behind the scenes to deploy containers on a worker node, assign the Pod an IP address and so on. The Pod also has different states, which keep getting updated in the Pod resource we created.

What if you need a custom resource type to manage some part of your infrastructure that is specific to your architecture? For example, a database that has some additional properties of its own, rather than a stateless Pod deployed in Kubernetes. That’s where CRDs come into the picture. A CRD (CustomResourceDefinition) defines a custom resource type with its own set of parameters and states, which you can create and deploy into your Kubernetes cluster. One thing to note is that the actions taken at different stages are not part of the CRD; we will see that in the next section.

What is an operator

In the above pod example, a set of actions happens when we create a pod record, and not just on create but also on update, delete or any state change. Similarly, for our custom resource, such actions might have to be performed when an event occurs. This is the job of an operator. We write the logic in a supported language to accomplish this. In this blog I am using Go. I am new to Go and took this as an opportunity to learn it along the way. I will write a quick start to Go later.

Once you build an operator, you can test it locally where you develop it by connecting it to a Kubernetes cluster. Finally this can be built into a container image and deployed as a pod into a cluster.
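To make the idea concrete, here is a minimal sketch of a reconcile step in plain Go, without any Kubernetes libraries. FakeCloud, Desired and Reconcile are hypothetical names made up for this illustration; a real operator would talk to the Kubernetes API and a cloud provider instead of an in-memory map.

```go
package main

import "fmt"

// Desired is what the user asked for (the "spec"). Hypothetical type.
type Desired struct {
	Name  string
	Shape string
}

// FakeCloud stands in for a cloud provider API. It only keeps VMs in memory.
type FakeCloud struct {
	vms map[string]string // VM name -> shape
}

// Reconcile compares the desired state with the actual state and
// returns the action it took to bring them in line.
func Reconcile(c *FakeCloud, d Desired) string {
	shape, exists := c.vms[d.Name]
	switch {
	case !exists:
		c.vms[d.Name] = d.Shape
		return "created"
	case shape != d.Shape:
		c.vms[d.Name] = d.Shape
		return "updated"
	default:
		return "unchanged"
	}
}

func main() {
	cloud := &FakeCloud{vms: map[string]string{}}
	d := Desired{Name: "demoinstance", Shape: "small"}

	fmt.Println(Reconcile(cloud, d)) // created
	fmt.Println(Reconcile(cloud, d)) // unchanged
	d.Shape = "large"
	fmt.Println(Reconcile(cloud, d)) // updated
}
```

Note that calling Reconcile twice with the same input does nothing the second time. That property matters because an operator can be asked to reconcile the same resource many times.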

Use case we will be building

This use case is just basic and for learning. What we are creating:

  • A CRD that represents a Compute VM/Bare Metal in a cloud infrastructure
  • An operator that will create an actual VM/Bare Metal when a custom resource of that type is created

What options do we have

We have multiple ways of creating a CRD and an operator and in this blog we’ll be using kubebuilder.

Installations

go

Go to: https://golang.org/doc/install
Install from the binary or from source
Test the installation by running:
go version

kubebuilder

Source: https://book.kubebuilder.io/quick-start.html
os=$(go env GOOS)
arch=$(go env GOARCH)
curl -L https://go.kubebuilder.io/dl/2.3.1/${os}/${arch} | tar -xz -C /tmp/
sudo mv /tmp/kubebuilder_2.3.1_${os}_${arch} /usr/local/kubebuilder
export PATH=$PATH:/usr/local/kubebuilder/bin

kustomize

Source: https://kubectl.docs.kubernetes.io/installation/kustomize/binaries/
curl -s "https://raw.githubusercontent.com/\
kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash

Let’s initialise our development directory

Before we can create a CRD or an operator, we need to get our development directory initialised. For this, we need to create a directory to work on and initialise a Go module in the directory. Let’s call it cloudbuilder.

mkdir ~/cloudbuilder
cd ~/cloudbuilder/
go mod init cloudbuilder

Initialise kubebuilder in the directory with a domain name. I am using example.com here. This will generate boilerplate code for building a CRD and an operator.

kubebuilder init --domain example.com

Now, let’s build the CRD

Here we create our new custom resource type. I am calling the group cloudbuilder and the kind Compute. The command below creates the scaffolding needed to get started. It will prompt for “Create Resource” and “Create Controller”. Answer ‘y’ (yes) to both, as we cover both on this page.

kubebuilder create api --group cloudbuilder --kind Compute --version v1alpha1
Create Resource [y/n]
y
Create Controller [y/n]
y

A few important generated files to note at this stage:

  • api/v1alpha1/compute_types.go — This defines the structure of your new type.
  • controllers/compute_controller.go — This defines the actions to be performed when a reconcile occurs on a custom resource of this type.

Time to start coding! The fields of the custom resource are defined as a Go struct in the file api/v1alpha1/compute_types.go. Sample code is shown below; the full version is available on GitHub. For this tutorial I am adding some fields that are common to most cloud infrastructure platforms. The file has struct definitions for spec, status and list. In this blog, I customise only the spec, by adding fields to the struct ComputeSpec.

type ComputeSpec struct {
	CloudProviderName string `json:"cloudprovidername"`
	ComputeName       string `json:"computename"`
	OSImage           string `json:"osimage"`
	Shape             string `json:"shape"`
	Region            string `json:"region"`
	Zone              string `json:"zone"`
}
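The json tags on each field decide the key names you will later use under spec in the resource YAML (kubectl converts YAML to JSON before sending it to the API server). As a small sketch of that mapping, the program below decodes a JSON document into the same struct using only the standard library. DecodeSpec and the sample values such as examplecloud are made up for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ComputeSpec mirrors the struct from compute_types.go shown above.
type ComputeSpec struct {
	CloudProviderName string `json:"cloudprovidername"`
	ComputeName       string `json:"computename"`
	OSImage           string `json:"osimage"`
	Shape             string `json:"shape"`
	Region            string `json:"region"`
	Zone              string `json:"zone"`
}

// DecodeSpec parses the JSON form of a spec into a ComputeSpec.
// Hypothetical helper, not part of the generated code.
func DecodeSpec(data []byte) (ComputeSpec, error) {
	var s ComputeSpec
	err := json.Unmarshal(data, &s)
	return s, err
}

func main() {
	// Placeholder values; replace with ones valid for your provider.
	raw := []byte(`{"cloudprovidername":"examplecloud","computename":"DemoInstance","osimage":"centos-8"}`)

	spec, err := DecodeSpec(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	// Keys are matched to fields through the json tags.
	fmt.Println(spec.CloudProviderName, spec.ComputeName, spec.OSImage)
}
```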

Ideally there should be more fields than this. For this blog, I will hard-code any other required fields while writing the controller, since we are concentrating on the concept and how to get started. Sorry, I am a bit lazy too :)

Now that we have added the required fields to the boilerplate code, we can generate the rest of the CRD.

make manifests

The CRD is generated inside the code base. In our case, the following file is generated:

  • config/crd/bases/cloudbuilder.example.com_computes.yaml

In our case this directory has only one file, because we created only one resource type. If there were more, their CRDs would be generated in the same directory.
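The file name is built from the values we passed to kubebuilder earlier. As a rough sketch of how the domain, group, version and kind fit together, the program below derives the names used on this page. The GVK type and its methods are hypothetical, and the pluralisation is deliberately naive (lowercase plus "s"); real tooling handles irregular plurals.

```go
package main

import (
	"fmt"
	"strings"
)

// GVK holds the values passed to "kubebuilder init" and "kubebuilder create api".
type GVK struct {
	Domain, Group, Version, Kind string
}

// APIGroup is the full API group, for example cloudbuilder.example.com.
func (g GVK) APIGroup() string { return g.Group + "." + g.Domain }

// APIVersion is the value used in the apiVersion field of a resource.
func (g GVK) APIVersion() string { return g.APIGroup() + "/" + g.Version }

// Plural is a naive plural of the kind; an assumption for this sketch.
func (g GVK) Plural() string { return strings.ToLower(g.Kind) + "s" }

// CRDName is the name of the CRD object itself: <plural>.<group>.
func (g GVK) CRDName() string { return g.Plural() + "." + g.APIGroup() }

// CRDFile is the path of the generated CRD manifest seen on this page.
func (g GVK) CRDFile() string {
	return "config/crd/bases/" + g.APIGroup() + "_" + g.Plural() + ".yaml"
}

func main() {
	g := GVK{Domain: "example.com", Group: "cloudbuilder", Version: "v1alpha1", Kind: "Compute"}
	fmt.Println(g.APIVersion()) // cloudbuilder.example.com/v1alpha1
	fmt.Println(g.CRDName())    // computes.cloudbuilder.example.com
	fmt.Println(g.CRDFile())    // config/crd/bases/cloudbuilder.example.com_computes.yaml
}
```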

It’s time to apply the CRD to a Kubernetes cluster. Set KUBECONFIG to point to the required cluster and then run:

kubectl apply -f config/crd/bases/cloudbuilder.example.com_computes.yaml

Now the CRD has been installed on your Kubernetes cluster. You will be able to create resources of type compute on this cluster.

Alternatively, you can run the command below. I used kubectl above to keep the step explicit and easy to follow.

# This is an alternative to above
make install

Okay, shall we create a resource on a cluster?

I am defining the YAML below; it is also on GitHub at the path config/samples/cloudbuilder_v1alpha1_compute.yaml. Please replace <cloud_provider> and the other placeholders with values for your provider.

apiVersion: cloudbuilder.example.com/v1alpha1
kind: Compute
metadata:
  name: demoinstance
spec:
  cloudprovidername: <cloud_provider>
  computename: DemoInstance
  osimage: centos-8
  shape: <shape_available_in_cloud_provider>
  region: <region>
  zone: <zone>
  network: <network>

Now create this resource on the cluster by running the command below:

kubectl apply -f config/samples/cloudbuilder_v1alpha1_compute.yaml

Please remember that, even though the resource will get created, no action will be taken, as the operator is not yet implemented and deployed.

And then build the operator

So far we have the CRD available on the Kubernetes cluster, but when a reconcile occurs on resource creation or update, no action is taken. In our scenario, we need to create a compute in the specified cloud provider.

Let’s add the code to do this in the file controllers/compute_controller.go. A snippet of the Reconcile method is shown below. Please replace the <cloud_provider_1> and <cloud_provider_2> placeholders as well. The full code with cloud providers added is available on GitHub.

// +kubebuilder:rbac:groups=cloudbuilder.example.com,resources=computes,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=cloudbuilder.example.com,resources=computes/status,verbs=get;update;patch
func (r *ComputeReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
	ctx := context.Background()
	log := r.Log.WithValues("compute", req.NamespacedName)

	log.Info("This is not a perfect code and is only for the purpose of the blog")

	// Fetch the Compute resource that triggered this reconcile.
	compute := &cloudbuilderv1alpha1.Compute{}
	err := r.Get(ctx, req.NamespacedName, compute)
	if err != nil {
		return ctrl.Result{}, err
	}

	computeName := compute.Spec.ComputeName
	log.Info("Compute resource record created in cluster. Creating compute on cloud platform with name: " + computeName)

	// Call the create method of the matching provider.
	// Replace the <cloud_provider_N> placeholders; they are not valid Go as written.
	switch compute.Spec.CloudProviderName {
	case "<cloud_provider_1>":
		if err := r.create<cloud_provider_1>Compute(ctx, log, compute); err != nil {
			// Log the error returned by the create call.
			log.Error(err, "Compute create failed for: "+computeName)
			return ctrl.Result{}, err
		}
	case "<cloud_provider_2>":
		if err := r.create<cloud_provider_2>Compute(ctx, log, compute); err != nil {
			log.Error(err, "Compute create failed for: "+computeName)
			return ctrl.Result{}, err
		}
	default:
		log.Info("Unknown cloud provider")
	}
	return ctrl.Result{}, nil
}
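The two +kubebuilder:rbac comment lines at the top of the snippet are markers from which kubebuilder generates the RBAC rules for the operator. To show what information one marker carries, here is a simplified parser written only for this illustration. It is not how the kubebuilder tooling is implemented, and Rule and ParseRBACMarker are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// Rule is a simplified view of one RBAC rule. Hypothetical type.
type Rule struct {
	Groups    []string
	Resources []string
	Verbs     []string
}

// ParseRBACMarker extracts groups, resources and verbs from a line such as
// "// +kubebuilder:rbac:groups=g,resources=r,verbs=get;list".
// It returns false if the line is not an rbac marker.
func ParseRBACMarker(line string) (Rule, bool) {
	const prefix = "+kubebuilder:rbac:"
	i := strings.Index(line, prefix)
	if i < 0 {
		return Rule{}, false
	}
	var r Rule
	// Fields are separated by commas; multiple values by semicolons.
	for _, kv := range strings.Split(line[i+len(prefix):], ",") {
		parts := strings.SplitN(kv, "=", 2)
		if len(parts) != 2 {
			continue
		}
		vals := strings.Split(parts[1], ";")
		switch strings.TrimSpace(parts[0]) {
		case "groups":
			r.Groups = vals
		case "resources":
			r.Resources = vals
		case "verbs":
			r.Verbs = vals
		}
	}
	return r, true
}

func main() {
	line := "// +kubebuilder:rbac:groups=cloudbuilder.example.com,resources=computes/status,verbs=get;update;patch"
	rule, ok := ParseRBACMarker(line)
	fmt.Println(ok, rule.Groups, rule.Resources, rule.Verbs)
}
```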

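The +kubebuilder:rbac comment lines above the method are markers that kubebuilder uses to generate the RBAC rules for the operator, that is, the permissions it needs on compute resources and their status. They do not control when the method runs; Reconcile is invoked when a Compute resource changes, because the generated controller is set up to watch that type. The switch case calls different methods that handle compute creation on a couple of major cloud providers. Please see the full code on GitHub. For authentication, I am using compute-based authentication in the respective provider, which goes by different names (Instance Principal, Service Account, Service Principal and so on), so the required IAM permissions have to be granted for the operator to create infrastructure resources.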
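As the number of providers grows, the switch can be replaced by a lookup table. The sketch below is one possible way to do that in plain Go, not the approach used in the GitHub code. Provider, fakeProvider, Dispatch and ErrUnknownProvider are hypothetical names, and unlike the snippet above it returns an error for an unknown provider instead of only logging it.

```go
package main

import (
	"errors"
	"fmt"
)

// ComputeSpec is trimmed to the two fields this sketch needs.
type ComputeSpec struct {
	CloudProviderName string
	ComputeName       string
}

// Provider is a hypothetical interface each cloud provider would implement.
type Provider interface {
	CreateCompute(spec ComputeSpec) error
}

// fakeProvider records the names it was asked to create; no real API calls.
type fakeProvider struct {
	created []string
}

func (f *fakeProvider) CreateCompute(spec ComputeSpec) error {
	f.created = append(f.created, spec.ComputeName)
	return nil
}

// ErrUnknownProvider is returned when no provider is registered for a name.
var ErrUnknownProvider = errors.New("unknown cloud provider")

// Dispatch looks up the provider named in the spec and asks it to create the compute.
func Dispatch(providers map[string]Provider, spec ComputeSpec) error {
	p, ok := providers[spec.CloudProviderName]
	if !ok {
		return fmt.Errorf("%w: %q", ErrUnknownProvider, spec.CloudProviderName)
	}
	if err := p.CreateCompute(spec); err != nil {
		return fmt.Errorf("compute create failed for %s: %w", spec.ComputeName, err)
	}
	return nil
}

func main() {
	fake := &fakeProvider{}
	providers := map[string]Provider{"examplecloud": fake}

	err := Dispatch(providers, ComputeSpec{CloudProviderName: "examplecloud", ComputeName: "DemoInstance"})
	fmt.Println(err, fake.created) // <nil> [DemoInstance]

	err = Dispatch(providers, ComputeSpec{CloudProviderName: "other", ComputeName: "x"})
	fmt.Println(errors.Is(err, ErrUnknownProvider)) // true
}
```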

Test it locally

The operator can be tested locally during development by configuring KUBECONFIG and running the project as below.

make run

This will run the operator locally and print output to the terminal. You can try a compute create as we did in the earlier section and watch the output.

Finally, publishing the Docker image

So far we have coded our operator and tested it locally. Now let’s publish it as a Docker image. To do this, run the following command.

make docker-build docker-push IMG=akfarooqnaveen/cloudbuilder-compute:v1alpha1

This will build the Docker image locally and push it to Docker Hub or your private registry. Here I am pushing it to Docker Hub. Please make sure you are logged in to the registry you are pushing to.

And, let’s deploy this to our cluster. Please make sure KUBECONFIG is configured and run the following command

make deploy IMG=akfarooqnaveen/cloudbuilder-compute:v1alpha1

Now the operator is deployed onto the Kubernetes cluster.

For all compute custom resources created in the cluster, a compute resource will be automatically created in the respective cloud platforms.

Thank you!

Thanks for going through this blog. This is my first technical blog. Hopefully I will add other blogs too on related topics. See you soon!

Related links

https://github.com/akfarooqnaveen/cloudbuilder
