Building an application platform with LKE and Argo CD - Part 2

In the first part of this series, I showed how to set up a basic Linode Kubernetes Engine (LKE) cluster using Argo CD Autopilot and how to configure Traefik ingress and ExternalDNS for routing with Linode Domains. In this article, we are going to add automatic HTTPS for our applications' ingress. To do this we will use cert-manager in conjunction with Let's Encrypt.

Certificates with cert-manager

Kubernetes cert-manager adds the capability to manage certificates for your services in an automated way. The easiest and cheapest way to get certificates for your services is to configure cert-manager to request them from Let's Encrypt, which provides free, trusted certificates to make encrypting web traffic easy and cheap.

Cert-manager uses the ACME protocol to verify that you control a domain. Domains can be verified in two different ways:

  • An HTTP request to a well-known URL (HTTP-01)
  • A DNS record created and verified (DNS-01)

We will be using the second mechanism, a DNS record that is created and verified. This has the advantage of not requiring a working ingress before domain ownership can be verified. Cert-manager ships with integrations for many DNS providers; however, it does NOT provide native integration with Linode Domains. It does allow third parties to integrate through a webhook interface. This is the mechanism we will use: the webhook approach is documented on the cert-manager site, and the Linode implementation is the cert-manager-webhook-linode project on GitHub.
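
For context, a DNS-01 challenge works by cert-manager publishing a one-time token as a TXT record under a well-known name derived from the hostname, which Let's Encrypt then looks up. The record looks roughly like this (the hostname and token value are illustrative):

_acme-challenge.whoami.yourdomain.com.  300  IN  TXT  "gfj9Xq...Rg85nM"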

As we did in the previous article, we will need to add an application spec to Argo CD along with the corresponding manifests for cert-manager. In this instance, we are going to load a Helm chart that merges the cert-manager dependency with the cert-manager-webhook-linode dependency.

First, let's look at the Argo CD Application. Again, this will be created in the bootstrap directory and will look something like this.

#./bootstrap/cert-manager.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  creationTimestamp: null
  labels:
    app.kubernetes.io/managed-by: argocd-autopilot
    app.kubernetes.io/name: cert-manager
  name: cert-manager
  namespace: argocd
spec:
  destination:
    namespace: cert-manager
    server: https://kubernetes.default.svc
  ignoreDifferences:
  - group: argoproj.io
    jsonPointers:
    - /status
    kind: Application
  project: default
  source:
    path: bootstrap/cert-manager
    repoURL: https://github.com/owner/repo.git
  syncPolicy:
    automated:
      allowEmpty: true
      prune: true
      selfHeal: true
    syncOptions:
    - allowEmpty=true
    - CreateNamespace=true
status:
  health: {}
  summary: {}
  sync:
    comparedTo:
      destination: {}
      source:
        repoURL: ""
    status: ""

In this file, you will need to change the source repo to match your git repo.

repoURL: https://github.com/owner/repo.git

Next, we need to create a cert-manager directory to hold the cert-manager Helm files:

mkdir ./bootstrap/cert-manager

Then we need to create an umbrella Helm chart that combines cert-manager and the Linode webhook.

#./bootstrap/cert-manager/Chart.yaml
apiVersion: v2
name: cert-manager
description: A Helm chart for Kubernetes

# A chart can be either an 'application' or a 'library' chart.
#
# Application charts are a collection of templates that can be packaged into versioned archives
# to be deployed.
#
# Library charts provide useful utilities or functions for the chart developer. They're included as
# a dependency of application charts to inject those utilities and functions into the rendering
# pipeline. Library charts do not define any templates and therefore cannot be deployed.
type: application

# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.1.0

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "1.16.0"

dependencies:
- name: "cert-manager"
  version: 1.10.1
  repository: https://charts.jetstack.io
- name: "cert-manager-webhook-linode"
  version: 0.2.0
  repository: "file://./chart/cert-manager-webhook-linode"

The main thing to notice in this file is that it simply merges two dependencies: cert-manager and cert-manager-webhook-linode.

Cert-manager is pulled from the Jetstack Helm chart repository. This makes updates very easy: when Jetstack releases a new version of the cert-manager Helm chart, you simply edit this file and bump the version of the cert-manager dependency.
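
If you have the Helm CLI installed locally, one quick way to see which chart versions are available is to query the Jetstack repository directly (an optional check, not required for the Argo CD workflow):

# list the published cert-manager chart versions
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm search repo jetstack/cert-manager --versions | head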

Unfortunately, cert-manager-webhook-linode is not published to a Helm repository and is only available on GitHub as a source repository. Helm dependencies can only reference a chart repository or a local directory (via a file:// path). For this reason, we will need to copy version v0.2.0 of cert-manager-webhook-linode into our ./bootstrap/cert-manager/chart/cert-manager-webhook-linode directory.

mkdir ./bootstrap/cert-manager/chart

Now copy the cert-manager-webhook-linode chart directory from your downloaded copy of the repository into the chart directory.
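
A minimal sketch of that copy, assuming the chart directory inside your v0.2.0 download is named cert-manager-webhook-linode (the source path is a placeholder; point it at wherever the chart actually lives in your checkout):

# <path-to-download> is a placeholder for wherever you extracted the v0.2.0 sources
cp -r <path-to-download>/cert-manager-webhook-linode ./bootstrap/cert-manager/chart/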

Unfortunately, v0.2.0 of cert-manager-webhook-linode has a small error in its Helm chart, which we need to fix in order for the container to launch correctly. Edit the file ./bootstrap/cert-manager/chart/cert-manager-webhook-linode/values.yaml and comment out the "logLevel: 6" line. The deployment section will then look like this.

#./bootstrap/cert-manager/chart/cert-manager-webhook-linode/values.yaml
deployment:
  secretName: linode-credentials
  secretKey: token
  # logLevel: 6
I have not had a chance to dig into the code behind the cert-manager-webhook-linode container to see why it no longer accepts the logLevel argument, but I will when I have some time. For this article, this is the best workaround I could find.

Next, we need to create a values file to set the variables in our dependent Helm charts.

#./bootstrap/cert-manager/values.yaml
chartVersion:
keyID:

cert-manager:
  installCRDs: true

cert-manager-webhook-linode:
  api:
    groupName: acme.<your.domain>
  image:
    tag: v0.2.0
  deployment:
    secretName: linode-api-token

There are a few variables being set in this values file. First is the cert-manager CRD install; the CRDs (Certificate, ClusterIssuer, and so on) will be needed for our use case.

cert-manager:
  installCRDs: true

Next, a few variables for the cert-manager-webhook-linode process need to be set. The api.groupName needs to be filled in with your domain name. The image tag needs to match the version we copied: v0.2.0. Finally, the deployment secret name needs to be set to our Linode API token secret: linode-api-token.

cert-manager-webhook-linode:
  api:
    groupName: acme.<your.domain>
  image:
    tag: v0.2.0
  deployment:
    secretName: linode-api-token

API Token

By default, secrets are only accessible within the namespace in which they were created: they are stored as namespace-scoped Kubernetes API objects, so pods and services in other namespaces cannot read them, even within the same cluster. Because of this, we need to copy our Linode API token secret into the cert-manager namespace.

kubectl create namespace cert-manager
kubectl get secret linode-api-token --namespace=external-dns -oyaml | grep -v '^\s*namespace:\s' | kubectl apply --namespace=cert-manager -f -
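
You can confirm the copy worked by listing the secret in its new namespace:

kubectl get secret linode-api-token --namespace=cert-manager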

Cluster Issuers

For cert-manager to work properly with Let's Encrypt, we need to configure a cert-manager ClusterIssuer. We are going to create more than one ClusterIssuer so that we can test our setup before requesting production Let's Encrypt certificates. This is useful because the production Let's Encrypt endpoint enforces strict rate limits.

First, let's create the ClusterIssuer for the Let's Encrypt staging server. Certificates from staging are not trusted by browsers, but its rate limits are far more generous. Use this configuration when testing workflows that would otherwise issue more certificates than the production Let's Encrypt limits allow.

#./bootstrap/cert-manager/templates/letsencrypt-staging.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    email: <your@email>
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
      - dns01:
          webhook:
            solverName: linode
            groupName: acme.<your.domain>

The next file to create is the Let's Encrypt production ClusterIssuer file.  This is the configuration to use when you want to issue a production Let's Encrypt certificate.

#./bootstrap/cert-manager/templates/letsencrypt-prod.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: <your@email>
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - dns01:
          webhook:
            solverName: linode
            groupName: acme.<your.domain>

In both of these files you will need to fill in your email address and update the groupName to match the groupName defined in the values file of the deployment.

groupName: acme.<your.domain>
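
Once Argo CD has synced the cert-manager application, you can check that both issuers registered successfully; the READY column should report True for each:

kubectl get clusterissuers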

Update our Ingress

The next thing we need to do to manage ingress certificates automatically is open the ingress HTTPS port. In our case, we are also going to automatically redirect all HTTP traffic to HTTPS so that we can enforce secure connections.

In my previous post, I showed how to configure Traefik to accept Kubernetes ingress traffic. We now need to add the following container arguments to our Traefik configuration in the file ./traefik/traefik.yaml.

- --entrypoints.web.http.redirections.entrypoint.to=websecure
- --entrypoints.websecure.address=:443
- --entrypoints.websecure.http.tls

We also need to open the HTTPS port on the container running the Traefik service by adding the following to the ports section of the Traefik container.

- name: websecure
  containerPort: 443

The HTTPS port on the Traefik Service also needs to be opened for the ingress to work. This requires adding the following to the traefik-web-service LoadBalancer.

    - name: websecure
      targetPort: websecure
      port: 443

The complete file will look like this.

#./traefik/traefik.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: traefik-account
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: traefik-role

rules:
  - apiGroups:
      - ""
    resources:
      - services
      - endpoints
      - secrets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
      - networking.k8s.io
    resources:
      - ingresses
      - ingressclasses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
      - networking.k8s.io
    resources:
      - ingresses/status
    verbs:
      - update
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: traefik-role-binding

roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: traefik-role
subjects:
  - kind: ServiceAccount
    name: traefik-account
    namespace: traefik
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: traefik-deployment
  labels:
    app: traefik

spec:
  replicas: 1
  selector:
    matchLabels:
      app: traefik
  template:
    metadata:
      labels:
        app: traefik
    spec:
      serviceAccountName: traefik-account
      containers:
        - name: traefik
          image: traefik:v2.9
          args:
            - --api.insecure
            - --api.dashboard=true
            - --providers.kubernetesingress
            - --providers.kubernetesingress.ingressendpoint.publishedservice=traefik/traefik-web-service
            - --entrypoints.web.address=:80
            - --entrypoints.web.http.redirections.entrypoint.to=websecure
            - --entrypoints.websecure.address=:443
            - --entrypoints.websecure.http.tls
          ports:
            - name: web
              containerPort: 80
            - name: websecure
              containerPort: 443
---
apiVersion: v1
kind: Service
metadata:
  name: traefik-web-service
spec:
  type: LoadBalancer
  ports:
    - name: web
      targetPort: web
      port: 80
    - name: websecure
      targetPort: websecure
      port: 443
  selector:
    app: traefik

With all these edits in place, we will need to commit our code changes to our git repo like this:

git add .
git commit -m "adding cert-manager and updating traefik to force ssl"
git push origin
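
After Argo CD syncs the change, you can confirm the Traefik deployment rolled out cleanly with the new arguments (the traefik namespace is assumed here, matching the publishedservice reference in the manifest above):

kubectl --namespace traefik rollout status deployment/traefik-deployment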

Let's test it!

Let's update our whoami application to use TLS with a certificate from Let's Encrypt. We will use the staging issuer to make sure that everything is working correctly before we switch to the production issuer. To do that we will need to update our whoami ingress in the ./apps/whoami/base/install.yaml file to add the following annotations.

  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-staging
    traefik.ingress.kubernetes.io/router.tls: "true"
    traefik.ingress.kubernetes.io/router.entrypoints: websecure

We will also need to add TLS to the ingress definition.

  tls:
  - hosts:
      - whoami.yourdomain.com
    secretName: secure-whoami-cert

The full file will look like this:

#./apps/whoami/base/install.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: whoami
  labels:
    app: whoami

spec:
  replicas: 1
  selector:
    matchLabels:
      app: whoami
  template:
    metadata:
      labels:
        app: whoami
    spec:
      containers:
        - name: whoami
          image: traefik/whoami
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: whoami
spec:
  ports:
    - name: http
      port: 80

  selector:
    app: whoami
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: whoami-ingress
  annotations:
    external-dns.alpha.kubernetes.io/hostname: whoami.yourdomain.com
    cert-manager.io/cluster-issuer: letsencrypt-staging
    traefik.ingress.kubernetes.io/router.tls: "true"
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
spec:
  tls:
  - hosts:
      - whoami.yourdomain.com
    secretName: secure-whoami-cert
  rules:
  - host: whoami.yourdomain.com
    http:
      paths:
      - path: /
        pathType: Exact
        backend:
          service:
            name:  whoami
            port:
              number: 80

Commit the changes:

git add .
git commit -m "update whoami ingress to use SSL"
git push origin

It may take anywhere from a few seconds to a few minutes for the verification process to complete and for the certificate to be issued. You can check on progress by watching the cert-manager application in the Argo CD UI.
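
You can also follow the issuance from the command line with cert-manager's own resources; the certificate is ready when its READY column turns True (the -A flag is used because the whoami resources may live in their own namespace):

kubectl get certificates -A
kubectl describe challenges -A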

You can also check the output of the web service using this curl command (-k is needed while the untrusted staging certificate is in place):

curl -k https://whoami.yourdomain.com
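
If you want to confirm which certificate authority actually signed the certificate, a quick openssl check will print the issuer; with the staging ClusterIssuer it will show a Let's Encrypt staging/test CA rather than the production one:

echo | openssl s_client -connect whoami.yourdomain.com:443 -servername whoami.yourdomain.com 2>/dev/null | openssl x509 -noout -issuer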

Success!! You should now have a secure whoami service. We have one problem, though: we are using the untrusted staging Let's Encrypt certificate. We need to make one last change to our ingress to use the trusted production Let's Encrypt certificate. We need to change the following line:

cert-manager.io/cluster-issuer: letsencrypt-staging

To the following:

cert-manager.io/cluster-issuer: letsencrypt-prod

Commit the change:

git add .
git commit -m "update whoami ingress to use production let's encrypt"
git push origin
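
Once the change syncs, cert-manager should re-issue the certificate from the production issuer on its own. Re-running the earlier openssl issuer check (or simply loading the site in a browser, which should no longer warn) confirms the switch. If the old staging certificate lingers, deleting the certificate secret forces a fresh issuance; the namespace placeholder below is whatever namespace your whoami app runs in:

kubectl delete secret secure-whoami-cert --namespace <whoami-namespace>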

Yay!! We are finally done!  You should now have cert-manager configured to provide your services with both staging and production certificates from Let's Encrypt.