# The GARR Distribution of Kubernetes

## Overview

This is a `Kubernetes` cluster composed of the following components and features:

- `Kubernetes` (automated deployment, operations, and scaling)
  - `Kubernetes` cluster with one master and three worker nodes.
  - Optional `Kubernetes` worker with GPU.
  - TLS used for communication between nodes for security.
  - A CNI plugin (Flannel).
  - A load balancer for HA `kubernetes-master`.
  - Optional Ingress Controller (on worker).
  - Optional `Dashboard` addon (on master) including `Heapster` for cluster monitoring.
- EasyRSA
  - Performs the role of a certificate authority serving self-signed certificates to the requesting units of the cluster.
- Etcd (distributed key value store)
  - Three node cluster for reliability.

# Usage

## Installation

The server nodes to be used should be tagged as `kubernetes` in `MAAS`.
At least one of them should also be tagged as `public-ip`, to denote a machine configured with a public IP.
The server with GPUs should also be tagged as `gpu`.

Customize the bundle configuration by editing the file `bundle.yaml`, following [this guide](https://jujucharms.com/docs/stable/charms-bundles#setting-constraints-in-a-bundle).

Alternatively, you can customize the configuration using an overlay file, like the one in `bundle-config.yaml`.

To deploy your customized bundle:

```sh
$ juju deploy ./kubernetes --overlay bundle-config.yaml
```

> Note: If you're operating behind a proxy, remember to set the `kubernetes-worker` proxy configuration options as described in the `Proxy configuration` section below.

This bundle exposes the `kubeapi-load-balancer` and `kubernetes-worker` charms by default, so they are accessible through their public addresses.

If you would like to remove external access, unexpose them:

```sh
$ juju unexpose kubeapi-load-balancer
$ juju unexpose kubernetes-worker
```

To get the status of the deployment, run `juju status`.
For constant updates, combine it with the `watch` command:

```sh
watch -c juju status --color
```

### Using with your own resources

In order to support restricted-network deployments, the charms in this bundle support [juju resources](https://jujucharms.com/docs/stable/developer-resources#managing-resources).

This allows you to `juju attach` the resources built for the architecture of your cloud.

```sh
$ juju attach kubernetes-master kubectl=/path/to/kubectl.snap
$ juju attach kubernetes-master kube-apiserver=/path/to/kube-apiserver.snap
$ juju attach kubernetes-master kube-controller-manager=/path/to/kube-controller-manager.snap
$ juju attach kubernetes-master kube-scheduler=/path/to/kube-scheduler.snap
$ juju attach kubernetes-master cdk-addons=/path/to/cdk-addons.snap

$ juju attach kubernetes-worker kubectl=/path/to/kubectl.snap
$ juju attach kubernetes-worker kubelet=/path/to/kubelet.snap
$ juju attach kubernetes-worker kube-proxy=/path/to/kube-proxy.snap
$ juju attach kubernetes-worker cni=/path/to/cni.tgz
```

### Using a specific Kubernetes version

You can select a specific version or series of `Kubernetes` by configuring the charms to use a specific snap channel. For example, to use the 1.9 series:

```sh
$ juju config kubernetes-master channel=1.9/stable
$ juju config kubernetes-worker channel=1.9/stable
```

After changing the channel, you'll need to manually execute the upgrade action on each `kubernetes-worker` unit, e.g.:

```sh
$ juju run-action kubernetes-worker/N0 upgrade
$ juju run-action kubernetes-worker/N1 upgrade
$ juju run-action kubernetes-worker/N2 upgrade
...
```

By default, the channel is set to `stable` on the current minor version of `Kubernetes`, for example, `1.9/stable`. This means your cluster will receive automatic upgrades for new patch releases (e.g. 1.9.2 -> 1.9.3), but not for new minor versions (e.g. 1.8.3 -> 1.9). To upgrade to a new minor version, configure the channel manually as described above.
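The per-unit upgrade step in this section can be scripted. The following is a minimal sketch, assuming worker units numbered 0, 1 and 2 (check `juju status` for the real unit numbers). The helper name `print_upgrade_commands` is hypothetical and not part of the bundle; it only prints the commands so that you can review them before running anything.

```sh
#!/bin/sh
# Hypothetical helper: print one upgrade command per worker unit number.
# It only echoes the commands (dry run); nothing is sent to Juju.
print_upgrade_commands() {
  for unit in "$@"; do
    echo "juju run-action kubernetes-worker/${unit} upgrade"
  done
}

# Assumed unit numbers; replace them with those shown by `juju status`.
print_upgrade_commands 0 1 2
```

Once the printed commands look right, they can be executed by piping the output to a shell, for example `print_upgrade_commands 0 1 2 | sh`.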
## Proxy configuration

If you are operating behind a proxy (i.e., your charms are running in a limited-egress environment and cannot reach IP addresses external to their network), you will need to configure your model appropriately before deploying the `Kubernetes` bundle.

First, configure your model's `http-proxy` and `https-proxy` settings with your proxy (here we use `squid.internal:3128` as an example):

```sh
$ juju model-config http-proxy=http://squid.internal:3128 https-proxy=https://squid.internal:3128
```

Because services often need to reach machines on their own network (including themselves), you will also need to add `localhost` to the `no-proxy` model configuration setting, along with any internal subnets you're using. The following example includes two subnets:

```sh
$ juju model-config no-proxy=localhost,10.5.5.0/24,10.246.64.0/21
```

After deploying the bundle, you need to configure the `kubernetes-worker` charm to use your proxy:

```sh
$ juju config kubernetes-worker http_proxy=http://squid.internal:3128 https_proxy=https://squid.internal:3128
```

## Interacting with the Kubernetes cluster

Wait for the deployment to settle:

```sh
$ watch -c juju status --color
```

You may assume control over the `Kubernetes` cluster from any `kubernetes-master` or `kubernetes-worker` node.

Create the kubectl config directory.

```sh
$ mkdir -p ~/.kube
```

Copy the kubeconfig file to the default location.

```sh
$ juju scp kubernetes-master/N:config ~/.kube/config
```

Install `kubectl` locally.

```sh
$ sudo snap install kubectl --classic
```

If this fails with an error such as:

```sh
- Setup snap "core" (3748) security profiles (cannot setup apparmor for snap "core": cannot load apparmor profile "snap.core.hook.configure": cannot load apparmor profile: exit status 243
```

copy the `kubectl` binary from a master node instead:

```sh
$ mkdir -p ~/bin
$ juju scp kubernetes-master/N:/snap/kubectl/current/kubectl ~/bin/kubectl
```

Query the cluster.
```sh
$ kubectl cluster-info
$ kubectl get nodes
```

### Accessing the Kubernetes Dashboard

The `Kubernetes` dashboard addon is installed by default, along with `Heapster`, `Grafana` and `InfluxDB` for cluster monitoring. The dashboard addons can be enabled (default) or disabled by setting the `enable-dashboard-addons` config on the `kubernetes-master` application:

```sh
$ juju config kubernetes-master enable-dashboard-addons=true
```

To reach the `Kubernetes` dashboard, visit `http://<kubeapi-load-balancer-IP>/ui`, where the `kubeapi-load-balancer-IP` can be obtained by:

```sh
$ juju run --unit kubeapi-load-balancer/N unit-get public-address
```

To sign in, get a `Bearer Token` with:

```
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | awk '/^kubernetes-dashboard-token-/{print $1}') | awk '/^token:/ {print $2}'
```

and paste it in the token field.

### Control the cluster

`kubectl` is the command line utility to interact with a `Kubernetes` cluster.

#### Minimal getting started

To check the state of the cluster:

```
$ kubectl cluster-info
Kubernetes master is running at https://10.4.4.76:443
Heapster is running at https://10.4.4.76:443/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://10.4.4.76:443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
kubernetes-dashboard is running at https://10.4.4.76:443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
Grafana is running at https://10.4.4.76:443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
InfluxDB is running at https://10.4.4.76:443/api/v1/namespaces/kube-system/services/monitoring-influxdb:http/proxy
```

List all nodes in the cluster:

```
$ kubectl get nodes
NAME         STATUS    ROLES     AGE       VERSION
ba1-r2-s15   Ready     <none>    1d        v1.9.1
ba1-r3-s04   Ready     <none>    1d        v1.9.1
ba1-r3-s05   Ready     <none>    1d        v1.9.1
```

Now you can run pods inside the `Kubernetes` cluster.
Create the following configuration file `example.yaml`:

```
apiVersion: v1
kind: Pod
metadata:
  name: web-site
  labels:
    app: web
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
```

Then create the pod:

```sh
$ kubectl create -f example.yaml
```

List all pods in the cluster:

```sh
$ kubectl get pods
NAME       READY     STATUS    RESTARTS   AGE
web-site   1/1       Running   0          17s
```

List all services in the cluster:

```sh
$ kubectl get services
NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
default-http-backend   ClusterIP   10.152.183.108   <none>        80/TCP    9h
kubernetes             ClusterIP   10.152.183.1     <none>        443/TCP   9h
```

Delete the pod after use:

```sh
$ kubectl delete pod web-site
pod "web-site" deleted
```

For expanded information on `kubectl` beyond what this `README` provides, please see the [kubectl overview](https://kubernetes.io/docs/user-guide/kubectl-overview/), which contains practical examples and an API reference.

Additionally, if you need to manage multiple clusters, there is more information about configuring kubectl in the [kubectl config guide](https://kubernetes.io/docs/user-guide/kubectl/kubectl_config/).

### Using Ingress

The `kubernetes-worker` charm supports deploying an NGINX ingress controller. Ingress allows access from the Internet to containers running web services inside the cluster.

First, allow Internet access to the `kubernetes-worker` charm with the following Juju command:

```sh
$ juju expose kubernetes-worker
```

In `Kubernetes`, workloads are declared using pod, service, and ingress definitions. An ingress controller is provided to you by default and deployed into the [default namespace](https://kubernetes.io/docs/user-guide/namespaces/) of the cluster. If one is not available, you may deploy it with:

```sh
$ juju config kubernetes-worker ingress=true
```

Ingress resources are DNS mappings to your containers, routed through [endpoints](https://kubernetes.io/docs/user-guide/services/).
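To make the idea of a DNS mapping concrete, here is a minimal sketch of an ingress resource written from the shell. It assumes a `Kubernetes` 1.9-era cluster (where Ingress is served from the `extensions/v1beta1` API group), a hypothetical service named `web` listening on port 80, and a placeholder hostname `web.example.com`. None of these names come from the bundle; adapt them to your own workload.

```sh
#!/bin/sh
# Write a hypothetical Ingress manifest to a file in the current directory.
# Assumptions: a service named "web" on port 80, hostname web.example.com.
cat > web-ingress.yaml <<'EOF'
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
    - host: web.example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: web
              servicePort: 80
EOF

# Print the command that would apply the manifest; run it yourself
# once the backing service exists in the cluster.
echo "kubectl create -f web-ingress.yaml"
```

The script only writes the file and prints the `kubectl` command, so it is safe to run without a cluster at hand.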
### Example

As an example for users unfamiliar with `Kubernetes`, we packaged an action to both deploy an example and clean it up.

To deploy 3 replicas of the `microbot` web application inside the `Kubernetes` cluster on worker number `N`, run the following command:

```sh
$ juju run-action kubernetes-worker/N microbot replicas=3
```

This action performs the following steps:

- It creates a deployment named `microbot` with the 3 replicas requested when the action was run. It also creates a service named `microbot`, which binds an endpoint across all 3 `microbot` pods.
- Finally, it creates an ingress resource, which points at a [xip.io](https://xip.io) domain to simulate a proper DNS service.

Wait for the action to complete:

    $ juju show-action-output db7cc72b-5f35-4a4d-877c-284c4b776eb8
    results:
      address: microbot.104.198.77.197.xip.io
    status: completed
    timing:
      completed: 2016-09-26 20:42:42 +0000 UTC
      enqueued: 2016-09-26 20:42:39 +0000 UTC
      started: 2016-09-26 20:42:41 +0000 UTC

> **Note**: Your FQDN will be different and contain the address of the cloud
> instance.

At this point, you can inspect the cluster to observe the workload coming online.
#### List the pods

    $ kubectl get pods
    NAME                             READY     STATUS    RESTARTS   AGE
    default-http-backend-kh1dt       1/1       Running   0          1h
    microbot-1855935831-58shp        1/1       Running   0          1h
    microbot-1855935831-9d16f        1/1       Running   0          1h
    microbot-1855935831-l5rt8        1/1       Running   0          1h
    nginx-ingress-controller-hv5c2   1/1       Running   0          1h

#### List the services and endpoints

    $ kubectl get services,endpoints
    NAME                       CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
    svc/default-http-backend   10.1.225.82   <none>        80/TCP    1h
    svc/kubernetes             10.1.0.1      <none>        443/TCP   1h
    svc/microbot               10.1.44.173   <none>        80/TCP    1h

    NAME                      ENDPOINTS                               AGE
    ep/default-http-backend   10.1.68.2:80                            1h
    ep/kubernetes             172.31.31.139:6443                      1h
    ep/microbot               10.1.20.3:80,10.1.68.3:80,10.1.7.4:80   1h

#### List the ingress resources

    $ kubectl get ingress
    NAME               HOSTS                          ADDRESS         PORTS     AGE
    microbot-ingress   microbot.52.38.62.235.xip.io   172.31.26.109   80        1h

When all the pods are listed as `Running`, you are ready to visit the address listed in the HOSTS column of the ingress listing.

> Note: It is normal to see a 502/503 error during initial application deployment.

As you refresh the page, you will be greeted with a microbot web page, serving from one of the microbot replica pods. Refreshing will show you another microbot with a different hostname as the requests are load-balanced across the replicas.

#### Clean up the example

There is also an action to clean up the microbot applications.
When you are done with the microbot application, you can delete it from the cluster with one `Juju` action:

```sh
$ juju run-action kubernetes-worker/N microbot delete=true
```

If you no longer need Internet access to your workers, remember to unexpose the `kubernetes-worker` charm:

```sh
$ juju unexpose kubernetes-worker
```

To learn more about [Kubernetes Ingress](https://kubernetes.io/docs/user-guide/ingress.html) and how to configure the Ingress Controller beyond defaults (such as TLS and websocket support), view the [nginx-ingress-controller](https://github.com/kubernetes/contrib/tree/master/ingress/controllers/nginx) project on GitHub.

# Scale out Usage

### Scaling kubernetes-worker

The `kubernetes-worker` nodes are the load-bearing units of a `Kubernetes` cluster.

By default, pods are automatically spread across the `kubernetes-worker` units that you have deployed.

To add more `kubernetes-worker` units to the cluster:

```sh
$ juju add-unit kubernetes-worker
```

or specify machine constraints to create larger nodes:

```sh
$ juju add-unit kubernetes-worker --constraints "cpu-cores=8 mem=32G"
```

Refer to the [machine constraints documentation](https://jujucharms.com/docs/stable/charms-constraints) for other machine constraints that might be useful for the `kubernetes-worker` units.

### Scaling Etcd

`Etcd` is the key-value store for the `Kubernetes` cluster. For reliability the bundle defaults to three instances in this cluster.

For more scalability, we recommend between 3 and 9 `etcd` nodes. If you want to add more nodes:

```sh
$ juju add-unit etcd
```

The `CoreOS etcd` documentation has a chart for the [optimal cluster size](https://coreos.com/etcd/docs/latest/admin_guide.html#optimal-cluster-size) to determine fault tolerance.

# Adding optional storage

Using [Juju Storage](https://jujucharms.com/docs/2.0/charms-storage), the bundle allows you to connect with durable storage devices such as [Ceph](https://ceph.com).
Deploy a minimum of three `ceph-mon` and three `ceph-osd` charms:

```sh
$ juju deploy cs:ceph-mon -n 3
$ juju deploy cs:ceph-osd -n 3
```

Relate the charms:

```sh
$ juju add-relation ceph-mon ceph-osd
```

List the storage pools available to `Juju` for your cloud:

```sh
$ juju storage-pools
Name     Provider  Attrs
ebs      ebs
ebs-ssd  ebs       volume-type=ssd
loop     loop
rootfs   rootfs
tmpfs    tmpfs
```

> **Note**: This listing is for the `Amazon Web Services` public cloud.
> Different clouds will have different pool names.

Add a storage pool to the `ceph-osd` charm by `NAME`,`SIZE`,`COUNT`:

```sh
$ juju add-storage ceph-osd/N0 osd-devices=cinder,10G,1
$ juju add-storage ceph-osd/N1 osd-devices=cinder,10G,1
$ juju add-storage ceph-osd/N2 osd-devices=cinder,10G,1
```

Next, relate the storage cluster with the `Kubernetes` cluster:

```sh
$ juju add-relation kubernetes-master:ceph-storage ceph-mon:admin
```

We are now ready to enlist [Persistent Volumes](https://kubernetes.io/docs/user-guide/persistent-volumes/) in `Kubernetes`, which our workloads can use via `Persistent Volume Claims` (`PVC`).

```sh
$ juju run-action kubernetes-master/N create-rbd-pv name=test size=50
```

This example creates a `Rados Block Device` (`rbd`) named `test` with a size of 50 MB.

You should see the `PV` become enlisted and be marked as available:

```sh
$ watch kubectl get pv
NAME      CAPACITY   ACCESSMODES   STATUS      CLAIM     REASON    AGE
test      50M        RWO           Available                       10s
```

To consume these Persistent Volumes, your pods will need a `Persistent Volume Claim` associated with them, a task that is outside the scope of this `README`. See the [Persistent Volumes](https://kubernetes.io/docs/user-guide/persistent-volumes/) documentation for more information.

## Known Limitations and Issues

The following are known issues and limitations with the bundle and charm code:

- Destroying the easyrsa charm will result in loss of public key infrastructure (PKI).
- Deployment locally on LXD will require the use of conjure-up to tune settings on the host's LXD installation to support Docker and other components.
- If resources fail to download during initial deployment for any reason, you will need to download and install them manually. For example, if kubernetes-master is missing its resources, download them from the resources section of the sidebar [here](https://jujucharms.com/u/containers/kubernetes-master/) and install them by running, for example: `$ juju attach kubernetes-master kube-apiserver=/path/to/snap`.

You can find resources for the canonical-kubernetes charms here:

- [kubernetes-master](https://jujucharms.com/u/containers/kubernetes-master/)
- [kubernetes-worker](https://jujucharms.com/u/containers/kubernetes-worker/)
- [easyrsa](https://jujucharms.com/u/containers/easyrsa/)
- [etcd](https://jujucharms.com/u/containers/etcd/)
- [flannel](https://jujucharms.com/u/containers/flannel/)

## Kubernetes details

- [Kubernetes User Guide](https://kubernetes.io/docs/user-guide/)
- [The Canonical Distribution of Kubernetes](https://jujucharms.com/canonical-kubernetes/bundle/)
- [Official Bundle](https://api.jujucharms.com/charmstore/v5/canonical-kubernetes/archive/bundle.yaml)
- [Bundle Source](https://github.com/juju-solutions/bundle-canonical-kubernetes)
- [Bug tracker](https://github.com/juju-solutions/bundle-canonical-kubernetes/issues)

# Flannel

Flannel is a virtual network that gives a subnet to each host for use with container runtimes.

## Configuration

**iface** The interface to configure the flannel SDN binding. If this value is an empty string or undefined, the code will attempt to find the default network adapter, similar to the following command:

```bash
$ route | grep default | head -n 1 | awk '{print $8}'
```

**cidr** The network range to configure the flannel SDN to declare when establishing networking setup with etcd.
Ensure this network range is not active on the VLAN you're deploying to, as an overlapping range will cause collisions and odd behavior. Take care to select a CIDR range that is safe to assign to flannel.

## Known Limitations

This subordinate does not support being co-located with other deployments of the flannel subordinate (to gain 2 VLANs on a single application). If you require this support please file a bug.

This subordinate also leverages juju-resources, so it is currently only available on `Juju 2.0+` controllers.

## Further information

- [Flannel Charm Resource](https://jujucharms.com/u/containers/flannel/)
- [Flannel Homepage](https://coreos.com/flannel/docs/latest/flannel-config.html)

-----------------------

For more details read the documentation of the [Kubernetes core bundle](https://jujucharms.com/kubernetes-core/) and the [CharmScaler](https://jujucharms.com/u/elastisys/charmscaler/).