diff --git a/web/containers/dashboard.rst b/web/containers/dashboard.rst
index 31cccc00ca529aed2d2146d12aefd1bd04087331..e7007437a025cc163c2b0ffb111d4760a061d637 100644
--- a/web/containers/dashboard.rst
+++ b/web/containers/dashboard.rst
@@ -3,7 +3,7 @@ Dashboard Access
 
 You can access a `Kubernetes` dashboard for controlling your cluster through a
 GUI at the URL::
 
-    https://k8s.cloud.garr.it
+    https://container-platform-k8s.cloud.garr.it
 
 To log in to the dashboard you need to authenticate. Follow this procedure:
diff --git a/web/support/kb/ceph/ceph-upgrade-to-pacific.rst b/web/support/kb/ceph/ceph-upgrade-to-pacific.rst
new file mode 100644
index 0000000000000000000000000000000000000000..610ab37ae5e71baad7427ae09fb39348ea47cd4f
--- /dev/null
+++ b/web/support/kb/ceph/ceph-upgrade-to-pacific.rst
@@ -0,0 +1,156 @@
+=======================================
+ Ceph upgrade from Nautilus to Pacific
+=======================================
+
+It is possible to upgrade from Nautilus directly to Pacific, skipping
+the intermediate Octopus release.
+
+We followed the `official documentation <https://docs.ceph.com/en/pacific/releases/pacific/#upgrading-from-octopus-or-nautilus>`_.
+
+In the following we will proceed with:
+
+- prepare cluster
+- upgrade MON
+- upgrade MGR (in our setup, colocated with MON)
+- upgrade OSD
+- final steps
+
+It is assumed the cluster is managed via ``ceph-ansible``, although some
+commands and the overall procedure are valid in general.
+
+Prepare cluster
+===============
+
+Set the `noout` flag during the upgrade::
+
+    ceph osd set noout
+
+Upgrade MON
+===========
+
+Perform the following actions on each MON node, one by one, and
+check that after the upgrade the node manages to join the cluster::
+
+    sed -i -e 's/nautilus/pacific/' /etc/yum.repos.d/ceph_stable.repo
+    yum -y update
+    systemctl restart ceph-mon@<monID>
+
+Verify the mon has joined the cluster::
+
+    ceph -m <monIP> -s
+    ceph -m <monIP> mon versions
+
+Verify all monitors report the `pacific` string in the mon map::
+
+    ceph mon dump | grep min_mon_release
+
+Upgrade MGR
+===========
+
+Proceed as for MON, upgrading packages and restarting the Ceph daemons.
+In our setup, MON and MGR are co-located, so by the time you are here the MGR
+nodes have already been upgraded, as can be checked with::
+
+    ceph versions
+
+Upgrade OSD
+===========
+
+Proceed as above with MON, one node at a time, by first updating the package
+manager configuration file and then doing a package upgrade::
+
+    sed -i -e 's/nautilus/pacific/' /etc/yum.repos.d/ceph_stable.repo
+    yum -y update
+
+Finally, restart all OSD daemons on the node with::
+
+    systemctl restart ceph-osd.target
+
+Check with::
+
+    ceph versions
+
+Note that after the upgrade to Pacific, OSD daemons need to perform some
+initialization (read the docs for more info), which takes some time: this
+results in some PGs being `active+clean+laggy`.
+The consequence is that at some point you may see "slow ops": if this is the
+case, pause any OSD restart until your cluster is quiet, and wait for it to
+calm down before proceeding.
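+
+For example, one possible way to check that the cluster has settled before
+restarting the next OSD node (a generic status check we suggest here, not a
+step from the official upgrade guide) is::
+
+    # overall status (PG states, warnings) and any slow-ops detail
+    ceph -s
+    ceph health detail | grep -i slow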
+
+Final steps
+===========
+
+OSD omap update
+---------------
+
+After the upgrade, ``ceph -s`` will show ``HEALTH_WARN`` with a message similar to::
+
+    116 OSD(s) reporting legacy (not per-pool) BlueStore omap usage stats
+
+To fix that you will need to log into each OSD server and execute (more info
+`here <https://docs.ceph.com/en/latest/rados/operations/health-checks/#bluestore-no-per-pool-omap>`_)
+something similar to::
+
+    df | grep ceph | awk '{print $NF}' | awk -F- '{print "systemctl stop ceph-osd@"$NF" ; sleep 10 ; ceph osd set noup ; sleep 3 ; time ceph-bluestore-tool repair --path "$0" ; sleep 5 ; ceph osd unset noup ; sleep 3 ; systemctl start ceph-osd@"$NF" ; sleep 300"}' > /tmp/do
+    . /tmp/do
+
+Please note the above command may cause `slow ops`, both during the "repair" and
+during the OSD restart, so ensure you allow enough time between OSDs and
+carefully pick the time when you perform the upgrade.
+
+OSD enable RocksDB sharding
+---------------------------
+
+This needs to be done once if OSD disks were upgraded from previous versions;
+also read
+`here <https://docs.ceph.com/en/pacific/rados/configuration/bluestore-config-ref/#bluestore-rocksdb-sharding>`_.
+
+As it requires the OSD to be stopped, it may be useful to combine this step with
+the one above.
+The operation needs to be performed only on OSD disks which have not yet been
+sharded. Check the output of the following command::
+
+    systemctl stop ceph-osd@##
+    ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-## --command show-sharding
+    systemctl start ceph-osd@##
+
+If the OSD needs to be sharded, execute::
+
+    systemctl stop ceph-osd@##
+    ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-## --sharding="m(3) p(3,0-12) O(3,0-13)=block_cache={type=binned_lru} L P" reshard
+    systemctl start ceph-osd@##
+
+MON insecure global id reclaim
+------------------------------
+
+This warning comes from Ceph addressing a security vulnerability.
+The warning can be silenced; please check, for example,
+`this page <https://www.suse.com/support/kb/doc/?id=000019960>`_.
+
+If you want to address the issue, note that there are two sides to it: clients
+using insecure global id reclaim and MONs allowing insecure global id reclaim.
+The output of `ceph health detail` will clearly show whether you are affected by
+either one.
+
+Clients using insecure global id reclaim need to be updated before proceeding.
+They are clearly shown in the output of `ceph health detail`.
+
+Once all clients are updated and `ceph health detail` only complains about MONs,
+like this::
+
+    [WRN] AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED: mons are allowing insecure global_id reclaim
+        mon.cephmon1 has auth_allow_insecure_global_id_reclaim set to true
+
+you can disable the insecure mechanism with::
+
+    ceph config set mon auth_allow_insecure_global_id_reclaim false
+
+Tidying it up
+-------------
+
+Please take a minute to check the official docs: we assume that other suggested
+configurations have already been applied to your cluster (e.g., mon_v2 or straw2
+buckets), so we won't discuss them here.
+
+Finally, disallow pre-Pacific OSDs and unset the `noout` flag::
+
+    ceph osd require-osd-release pacific
+    ceph osd unset noout
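+
+As a last sanity check (our suggestion, not a step from the official guide), you
+can confirm that all daemons now report a Pacific version and that the new OSD
+release requirement has been recorded in the OSD map::
+
+    # standard Ceph status queries
+    ceph versions
+    ceph osd dump | grep require_osd_release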