The GARR Distribution of Kubernetes

Overview

This is a Kubernetes cluster composed of the following components and features:

  • Kubernetes (automated deployment, operations, and scaling)
    • Kubernetes cluster with one master and three worker nodes.
    • Optional Kubernetes worker with GPU.
    • TLS used for communication between nodes for security.
    • A CNI plugin (Flannel).
    • A load balancer for HA kubernetes-master.
    • Optional Ingress Controller (on worker).
    • Optional Dashboard addon (on master) including Heapster for cluster monitoring.
  • EasyRSA
    • Acts as a certificate authority, issuing self-signed certificates to the units of the cluster that request them.
  • Etcd (distributed key value store)
    • Three node cluster for reliability.

Usage

Installation

The server nodes to be used should be tagged as kubernetes in MAAS. At least one of them should also be tagged as public-ip, to denote a machine configured with a public IP. Servers with GPUs should also be tagged as gpu.

Customize the bundle configuration by editing the file bundle.yaml, following this guide. Alternatively, you can customize the configuration using an overlay file, like the one in bundle-config.yaml.

To deploy your customized bundle:

$ juju deploy ./kubernetes --overlay bundle-config.yaml
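
An overlay only needs to list the options that differ from bundle.yaml. The following is a minimal sketch, assuming the usual Juju overlay layout (applications, then options). The helper name write_overlay and the chosen values are illustrative; the bundle-config.yaml shipped with this bundle may contain different settings.

```shell
# Write a small overlay file that overrides a few charm options.
# write_overlay is a hypothetical helper; channel and ingress are
# options mentioned elsewhere in this README.
write_overlay() {
    cat > "$1" <<'EOF'
applications:
  kubernetes-master:
    options:
      channel: 1.9/stable
  kubernetes-worker:
    options:
      channel: 1.9/stable
      ingress: true
EOF
}

# Usage (not run here):
#   write_overlay my-overlay.yaml
#   juju deploy ./kubernetes --overlay my-overlay.yaml
```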

Note: If you're operating behind a proxy, remember to set the kubernetes-worker proxy configuration options as described in the Proxy configuration section below.

This bundle exposes the kubeapi-load-balancer and kubernetes-worker charms by default, so they are accessible through their public addresses.

If you would like to remove external access, unexpose them:

$ juju unexpose kubeapi-load-balancer
$ juju unexpose kubernetes-worker

To get the status of the deployment, run juju status. For constant updates, combine it with the watch command:

$ watch -c juju status --color

Using with your own resources

In order to support restricted-network deployments, the charms in this bundle support juju resources.

This allows you to juju attach the resources built for the architecture of your cloud.

$ juju attach kubernetes-master kubectl=/path/to/kubectl.snap
$ juju attach kubernetes-master kube-apiserver=/path/to/kube-apiserver.snap
$ juju attach kubernetes-master kube-controller-manager=/path/to/kube-controller-manager.snap
$ juju attach kubernetes-master kube-scheduler=/path/to/kube-scheduler.snap
$ juju attach kubernetes-master cdk-addons=/path/to/cdk-addons.snap

$ juju attach kubernetes-worker kubectl=/path/to/kubectl.snap
$ juju attach kubernetes-worker kubelet=/path/to/kubelet.snap
$ juju attach kubernetes-worker kube-proxy=/path/to/kube-proxy.snap
$ juju attach kubernetes-worker cni=/path/to/cni.tgz

Using a specific Kubernetes version

You can select a specific version or series of Kubernetes by configuring the charms to use a specific snap channel. For example, to use the 1.9 series:

$ juju config kubernetes-master channel=1.9/stable
$ juju config kubernetes-worker channel=1.9/stable

After changing the channel, you'll need to manually execute the upgrade action on each kubernetes-worker unit, e.g.:

$ juju run-action kubernetes-worker/N0 upgrade
$ juju run-action kubernetes-worker/N1 upgrade
$ juju run-action kubernetes-worker/N2 upgrade
...
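
With many workers, typing one command per unit gets tedious. A minimal sketch of a dry-run helper follows; print_upgrade_cmds is a hypothetical name, and the unit names are passed explicitly because unit numbers are not necessarily contiguous.

```shell
# Print one upgrade command per worker unit (dry run; nothing is executed).
# print_upgrade_cmds is a hypothetical helper name.
print_upgrade_cmds() {
    for unit in "$@"; do
        printf 'juju run-action %s upgrade\n' "$unit"
    done
}

# Usage: review the output first, then pipe it to sh to run it.
#   print_upgrade_cmds kubernetes-worker/0 kubernetes-worker/1 | sh
```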

By default, the channel is set to stable on the current minor version of Kubernetes, for example, 1.9/stable. This means your cluster will receive automatic upgrades for new patch releases (e.g. 1.9.2 -> 1.9.3), but not for new minor versions (e.g. 1.8.3 -> 1.9). To upgrade to a new minor version, configure the channel manually as described above.
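
The rule above amounts to comparing the MAJOR.MINOR part of two versions. The sketch below only illustrates that reasoning; the function names are hypothetical and it does not query the snap store.

```shell
# minor_series prints the MAJOR.MINOR part of a version such as 1.9.2.
minor_series() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# same_channel_series succeeds when both versions share a minor series,
# i.e. when moving between them is a patch upgrade within one channel.
same_channel_series() {
    [ "$(minor_series "$1")" = "$(minor_series "$2")" ]
}

for pair in "1.9.2 1.9.3" "1.8.3 1.9.0"; do
    set -- $pair
    if same_channel_series "$1" "$2"; then
        echo "$1 -> $2: automatic (patch release)"
    else
        echo "$1 -> $2: manual (change the channel)"
    fi
done
```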

Proxy configuration

If you are operating behind a proxy (i.e., your charms are running in a limited-egress environment and can not reach IP addresses external to their network), you will need to configure your model appropriately before deploying the Kubernetes bundle.

First, configure your model's http-proxy and https-proxy settings with your proxy (here we use squid.internal:3128 as an example):

$ juju model-config http-proxy=http://squid.internal:3128 https-proxy=https://squid.internal:3128

Because services often need to reach machines on their own network (including themselves), you will also need to add localhost to the no-proxy model configuration setting, along with any internal subnets you're using. The following example includes two subnets:

$ juju model-config no-proxy=localhost,10.5.5.0/24,10.246.64.0/21
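
If the list of internal subnets is long or generated elsewhere, the no-proxy value can be assembled rather than typed by hand. A minimal sketch, with no_proxy_value as a hypothetical helper name:

```shell
# Join "localhost" and any number of subnets with commas.
# no_proxy_value is a hypothetical helper name.
no_proxy_value() {
    value=localhost
    for subnet in "$@"; do
        value="$value,$subnet"
    done
    printf '%s\n' "$value"
}

# Usage (not run here):
#   juju model-config no-proxy="$(no_proxy_value 10.5.5.0/24 10.246.64.0/21)"
```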

After deploying the bundle, you need to configure the kubernetes-worker charm to use your proxy:

$ juju config kubernetes-worker http_proxy=http://squid.internal:3128 https_proxy=https://squid.internal:3128

Interacting with the Kubernetes cluster

Wait for the deployment to settle:

$ watch -c juju status --color

You may assume control over the Kubernetes cluster from any kubernetes-master or kubernetes-worker node.

Create the kubectl config directory.

$ mkdir -p ~/.kube

Copy the kubeconfig file to the default location.

$ juju scp kubernetes-master/N:config ~/.kube/config

Install kubectl locally.

$ sudo snap install kubectl --classic

If this fails with the following error:

- Setup snap "core" (3748) security profiles (cannot setup apparmor for snap "core": cannot load apparmor profile "snap.core.hook.configure": cannot load apparmor profile: exit status 243

copy the kubectl binary from a master node instead (make sure ~/bin is in your PATH):

$ mkdir -p ~/bin
$ juju scp kubernetes-master/N:/snap/kubectl/current/kubectl ~/bin/kubectl

Query the cluster.

$ kubectl cluster-info
$ kubectl get nodes

Accessing the Kubernetes Dashboard

The Kubernetes dashboard addon is installed by default, along with Heapster, Grafana and InfluxDB for cluster monitoring. The dashboard addons can be enabled (default) or disabled by setting the enable-dashboard-addons config on the kubernetes-master application:

$ juju config kubernetes-master enable-dashboard-addons=true

To reach the Kubernetes dashboard, visit http://<kubeapi-load-balancer-IP>/ui, where the public address of the kubeapi-load-balancer unit can be obtained with:

$ juju run --unit kubeapi-load-balancer/N unit-get public-address

To sign in, get a Bearer Token with:

$ kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | awk '/^kubernetes-dashboard-token-/{print $1}') | awk '/^token:/ {print $2}'

and paste it in the token field.
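
The token command combines two awk filters: one selects the dashboard secret by name, the other pulls the token value out of the secret's description. The sketch below separates the two steps and runs them on made-up sample input shaped like kubectl output; the function names and sample values are hypothetical.

```shell
# Pick the dashboard token secret from a `kubectl get secret` listing on stdin.
dashboard_secret_name() {
    awk '/^kubernetes-dashboard-token-/ {print $1}'
}

# Pick the token value from `kubectl describe secret` output on stdin.
extract_token() {
    awk '/^token:/ {print $2}'
}

# Made-up sample input, shaped like real kubectl output.
printf '%s\n' \
    'default-token-x1y2z               kubernetes.io/service-account-token   3   1d' \
    'kubernetes-dashboard-token-abcde  kubernetes.io/service-account-token   3   1d' \
    | dashboard_secret_name

printf '%s\n' 'namespace:  11 bytes' 'token:      EXAMPLE.TOKEN.VALUE' | extract_token
```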

Control the cluster

kubectl is the command line utility to interact with a Kubernetes cluster.

Minimal getting started

To check the state of the cluster:

$ kubectl cluster-info
Kubernetes master is running at https://10.4.4.76:443
Heapster is running at https://10.4.4.76:443/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://10.4.4.76:443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
kubernetes-dashboard is running at https://10.4.4.76:443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
Grafana is running at https://10.4.4.76:443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
InfluxDB is running at https://10.4.4.76:443/api/v1/namespaces/kube-system/services/monitoring-influxdb:http/proxy

List all nodes in the cluster:

$ kubectl get nodes
NAME         STATUS    ROLES     AGE       VERSION
ba1-r2-s15   Ready     <none>    1d        v1.9.1
ba1-r3-s04   Ready     <none>    1d        v1.9.1
ba1-r3-s05   Ready     <none>    1d        v1.9.1
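
To check the listing in a script rather than by eye, the STATUS column can be filtered. A minimal sketch, assuming the column layout shown above; not_ready_nodes is a hypothetical helper name.

```shell
# Print the names of nodes whose STATUS column is not exactly "Ready".
# Reads `kubectl get nodes` output on stdin and skips the header line.
# Note: a cordoned node ("Ready,SchedulingDisabled") is reported too.
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" {print $1}'
}

# Usage (not run here): empty output means every node is Ready.
#   kubectl get nodes | not_ready_nodes
```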

Now you can run pods inside the Kubernetes cluster.

Create the following configuration file example.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: web-site
  labels:
    app: web
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80

Then create the pod:

$ kubectl create -f example.yaml

List all pods in the cluster:

$ kubectl get pods
NAME                             READY     STATUS    RESTARTS   AGE
web-site                         1/1       Running   0          17s

List all services in the cluster:

$ kubectl get services
NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
default-http-backend   ClusterIP   10.152.183.108   <none>        80/TCP    9h
kubernetes             ClusterIP   10.152.183.1     <none>        443/TCP   9h

Delete the pod after use:

$ kubectl delete pod web-site
pod "web-site" deleted

For expanded information on kubectl beyond what this README provides, please see the kubectl overview which contains practical examples and an API reference.

Additionally if you need to manage multiple clusters, there is more information about configuring kubectl in the kubectl config guide

Using Ingress

The kubernetes-worker charm supports deploying an NGINX ingress controller. Ingress allows access from the Internet to containers running web services inside the cluster.

First allow Internet access to the kubernetes-worker charm with the following Juju command:

$ juju expose kubernetes-worker

In Kubernetes, workloads are declared using pod, service, and ingress definitions. An ingress controller is provided to you by default and deployed into the default namespace of the cluster. If one is not available, you may deploy it with:

$ juju config kubernetes-worker ingress=true

Ingress resources are DNS mappings to your containers, routed through endpoints.

Example

As an example for users unfamiliar with Kubernetes, we packaged an action to both deploy an example and clean it up.

To deploy 3 replicas of the microbot web application inside the Kubernetes cluster on worker number N, run the following command:

$ juju run-action kubernetes-worker/N microbot replicas=3

This action performs the following steps:

  • It creates a deployment named 'microbot' with the number of replicas given when the action is run (3 in this case). It also creates a service named 'microbot', whose endpoints are the 3 'microbot' pods.

  • Finally, it will create an ingress resource, which points at a xip.io domain to simulate a proper DNS service.

Wait for the action to complete:

$ juju show-action-output db7cc72b-5f35-4a4d-877c-284c4b776eb8
results:
  address: microbot.104.198.77.197.xip.io
status: completed
timing:
  completed: 2016-09-26 20:42:42 +0000 UTC
  enqueued: 2016-09-26 20:42:39 +0000 UTC
  started: 2016-09-26 20:42:41 +0000 UTC

Note: Your FQDN will be different and contain the address of the cloud instance.
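
The FQDN differs because a xip.io hostname embeds the IP address it resolves to. The sketch below, with hypothetical helper names, reads the address from the action output and recovers the embedded IP.

```shell
# Print the value of the "address:" key from `juju show-action-output`
# output read on stdin. action_address is a hypothetical helper name.
action_address() {
    awk '$1 == "address:" {print $2}'
}

# Recover the IP address embedded in a xip.io hostname of the form
# <name>.<ip>.xip.io by stripping the first label and the suffix.
xip_ip() {
    rest=${1#*.}
    printf '%s\n' "${rest%.xip.io}"
}

xip_ip microbot.104.198.77.197.xip.io

# Usage (not run here):
#   juju show-action-output <action-id> | action_address
```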

At this point, you can inspect the cluster to observe the workload coming online.

List the pods

$ kubectl get pods
NAME                             READY     STATUS    RESTARTS   AGE
default-http-backend-kh1dt       1/1       Running   0          1h
microbot-1855935831-58shp        1/1       Running   0          1h
microbot-1855935831-9d16f        1/1       Running   0          1h
microbot-1855935831-l5rt8        1/1       Running   0          1h
nginx-ingress-controller-hv5c2   1/1       Running   0          1h

List the services and endpoints

$ kubectl get services,endpoints
NAME                       CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
svc/default-http-backend   10.1.225.82   <none>        80/TCP    1h
svc/kubernetes             10.1.0.1      <none>        443/TCP   1h
svc/microbot               10.1.44.173   <none>        80/TCP    1h
NAME                      ENDPOINTS                               AGE
ep/default-http-backend   10.1.68.2:80                            1h
ep/kubernetes             172.31.31.139:6443                      1h
ep/microbot               10.1.20.3:80,10.1.68.3:80,10.1.7.4:80   1h

List the ingress resources

$ kubectl get ingress
NAME               HOSTS                          ADDRESS         PORTS     AGE
microbot-ingress   microbot.52.38.62.235.xip.io   172.31.26.109   80        1h

When all the pods are listed as Running, you are ready to visit the address listed in the HOSTS column of the ingress listing.

Note: It is normal to see a 502/503 error during initial application deployment.

As you refresh the page, you will be greeted with a microbot web page, serving from one of the microbot replica pods. Refreshing will show you another microbot with a different hostname as the requests are load-balanced across the replicas.

Clean up the example

There is also an action to clean up the microbot application. When you are done with it, you can remove it from the cluster with one Juju action:

$ juju run-action kubernetes-worker/N microbot delete=true

If you no longer need Internet access to your workers, remember to unexpose the kubernetes-worker charm:

$ juju unexpose kubernetes-worker

To learn more about Kubernetes Ingress and how to configure the Ingress Controller beyond the defaults (such as TLS and websocket support), see the nginx-ingress-controller project on GitHub.

Scale out Usage

Scaling kubernetes-worker

The kubernetes-worker nodes are the load-bearing units of a Kubernetes cluster.

By default, pods are automatically spread across the kubernetes-worker units that you have deployed.

To add more kubernetes-worker units to the cluster:

$ juju add-unit kubernetes-worker

or specify machine constraints to create larger nodes:

$ juju add-unit kubernetes-worker --constraints "cpu-cores=8 mem=32G"

Refer to the machine constraints documentation for other machine constraints that might be useful for the kubernetes-worker units.

Scaling Etcd

Etcd is the key-value store for the Kubernetes cluster. For reliability the bundle defaults to three instances in this cluster.

For more scalability, we recommend between 3 and 9 etcd nodes. If you want to add more nodes:

$ juju add-unit etcd

The CoreOS etcd documentation has a chart for the optimal cluster size to determine fault tolerance.
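
The arithmetic behind that chart is simple: etcd needs a majority of members (a quorum) to stay available, so the number of failures it tolerates is the cluster size minus the quorum. A minimal sketch, with etcd_fault_tolerance as a hypothetical helper name:

```shell
# For a cluster of n members, a majority (quorum) is n/2 + 1 members
# (integer division), so the cluster tolerates the loss of n - quorum members.
etcd_fault_tolerance() {
    n=$1
    quorum=$(( n / 2 + 1 ))
    echo $(( n - quorum ))
}

for n in 3 4 5 7 9; do
    echo "members=$n tolerated_failures=$(etcd_fault_tolerance "$n")"
done
```

An even-sized cluster tolerates no more failures than the next smaller odd size, which is why odd sizes are normally chosen.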

Adding optional storage

Using Juju Storage, the bundle allows you to connect with durable storage devices such as Ceph.

Deploy a minimum of three ceph-mon and three ceph-osd charms:

$ juju deploy cs:ceph-mon -n 3
$ juju deploy cs:ceph-osd -n 3

Relate the charms:

$ juju add-relation ceph-mon ceph-osd

List the storage pools available to Juju for your cloud:

$ juju storage-pools
Name     Provider  Attrs
ebs      ebs
ebs-ssd  ebs       volume-type=ssd
loop     loop
rootfs   rootfs
tmpfs    tmpfs

Note: This listing is for the Amazon Web Services public cloud. Different clouds will have different pool names.

Add storage to each ceph-osd unit, specified as NAME,SIZE,COUNT (this example uses a pool named cinder; substitute a pool available in your cloud):

$ juju add-storage ceph-osd/N0 osd-devices=cinder,10G,1
$ juju add-storage ceph-osd/N1 osd-devices=cinder,10G,1
$ juju add-storage ceph-osd/N2 osd-devices=cinder,10G,1

Next relate the storage cluster with the Kubernetes cluster:

$ juju add-relation kubernetes-master:ceph-storage ceph-mon:admin

We are now ready to enlist Persistent Volumes in Kubernetes, which our workloads can use via Persistent Volume Claims (PVC).

$ juju run-action kubernetes-master/N create-rbd-pv name=test size=50

This example creates a 50 MB RADOS Block Device (RBD) named "test".

You should see the PV become enlisted and be marked as available:

$ watch kubectl get pv
NAME   CAPACITY   ACCESSMODES   STATUS      CLAIM   REASON   AGE
test   50M        RWO           Available                    10s

To consume these Persistent Volumes, your pods will need a Persistent Volume Claim associated with them, a task that is outside the scope of this README. See the Persistent Volumes documentation for more information.

Known Limitations and Issues

The following are known issues and limitations with the bundle and charm code:

  • Destroying the easyrsa charm will result in loss of public key infrastructure (PKI).

  • Deployment locally on LXD will require the use of conjure-up to tune settings on the host's LXD installation to support Docker and other components.

  • If resources fail to download during initial deployment for any reason, you will need to download and install them manually. For example, if kubernetes-master is missing its resources, download them from the resources section in the sidebar of the charm's page and attach them with a command such as:

    $ juju attach kubernetes-master kube-apiserver=/path/to/snap

    You can find resources for the canonical-kubernetes charms here:

Kubernetes details

Flannel

Flannel is a virtual network that gives a subnet to each host for use with container runtimes.

Configuration

iface: The interface to which the flannel SDN binds. If this value is an empty string or undefined, the code attempts to find the default network adapter, similar to the following command:

$ route | grep default | head -n 1 | awk '{print $8}'

cidr: The network range that flannel declares when it registers its network setup with etcd. Make sure this range is not already in use on the VLAN you are deploying to; an overlapping range causes address collisions and unpredictable behavior.
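
Before choosing a cidr value, it can be compared against the ranges already in use. The sketch below tests whether two IPv4 CIDR ranges overlap using plain shell arithmetic; the function names and the example ranges are illustrative and are not part of the flannel charm.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed when two CIDR ranges share at least one address.
# Two ranges overlap exactly when they agree on the bits of the shorter prefix.
cidr_overlap() {
    a_ip=$(ip_to_int "${1%/*}"); a_len=${1#*/}
    b_ip=$(ip_to_int "${2%/*}"); b_len=${2#*/}
    len=$a_len
    if [ "$b_len" -lt "$len" ]; then len=$b_len; fi
    if [ "$len" -eq 0 ]; then return 0; fi
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    [ $(( a_ip & mask )) -eq $(( b_ip & mask )) ]
}

# Illustrative check: a candidate flannel range against an existing subnet.
if cidr_overlap 10.1.0.0/16 10.1.20.0/24; then
    echo "overlap: pick a different cidr"
else
    echo "no overlap"
fi
```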

Known Limitations

This subordinate does not support being co-located with other deployments of the flannel subordinate (to gain 2 vlans on a single application). If you require this support please file a bug.

This subordinate also leverages juju-resources, so it is currently only available on Juju 2.0+ controllers.

Further information


For more details read the documentation of the Kubernetes core bundle and the CharmScaler.