Commit 3b3bb1db authored by Fulvio Galeazzi
2023-04-03: FG; Ceph added instructions to upgrade to Pacific, updated k8s dashboard address.

@@ -3,7 +3,7 @@ Dashboard Access
 You can access a `Kubernetes` dashboard for controlling your cluster through a GUI
 at theURL::
-    https://k8s.cloud.garr.it
+    https://container-platform-k8s.cloud.garr.it
 To log in to the dashboard you need to authenticate. Follow this procedure:

=======================================
Ceph upgrade from Nautilus to Pacific
=======================================
It is possible to upgrade from Nautilus directly to Pacific, skipping
the intermediate Octopus release.
We followed the `official documentation <https://docs.ceph.com/en/pacific/releases/pacific/#upgrading-from-octopus-or-nautilus>`_.
In the following we will proceed with:

- preparing the cluster
- upgrading the MONs
- upgrading the MGRs (in our setup, colocated with the MONs)
- upgrading the OSDs
- final steps

It is assumed the cluster is managed via ``ceph-ansible``, although some
commands and the overall procedure are valid in general.
Prepare cluster
===============
Set the `noout` flag during the upgrade::

  ceph osd set noout

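Before restarting anything, it is worth confirming the flag is actually active. The sketch below checks a sample ``ceph osd dump`` header; the here-doc (and its output shape) is an assumption standing in for the real command on a live cluster:

```shell
# Sample header of `ceph osd dump`; on the cluster, pipe the real
# command instead of the here-doc (the output shape is an assumption).
osd_dump=$(cat <<'EOF'
epoch 4711
flags noout,sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
EOF
)
# Succeeds (and prints a confirmation) only when `noout` is among the flags:
echo "$osd_dump" | grep -q '^flags .*noout' && echo "noout is set"
```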
Upgrade MON
===========
Perform the following actions on each MON node, one by one, and
check that after the upgrade the node manages to join the cluster::

  sed -i -e 's/nautilus/pacific/' /etc/yum.repos.d/ceph_stable.repo
  yum -y update
  systemctl restart ceph-mon@<monID>

Verify the MON has joined the cluster::

  ceph -m <monIP> -s
  ceph -m <monIP> mon versions

Verify all monitors report the `pacific` string in the mon map::

  ceph mon dump | grep min_mon_release

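The `min_mon_release` check can be scripted. The sketch below parses a sample ``ceph mon dump`` fragment; the here-doc is a stand-in for the live command, and the exact output format is an assumption:

```shell
# Sample fragment of `ceph mon dump`; replace the here-doc with the
# real command on the cluster (output shape is an assumption).
mon_dump=$(cat <<'EOF'
epoch 12
min_mon_release 16 (pacific)
EOF
)
# Last field of the min_mon_release line is the release name:
release=$(echo "$mon_dump" | awk '/^min_mon_release/ {print $NF}')
echo "release: $release"
```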
Upgrade MGR
===========
Proceed as for the MONs, upgrading packages and restarting the Ceph daemons.
In our setup, MON and MGR are co-located, so by the time you get here the MGR
nodes have already been upgraded, as can be checked with::

  ceph versions

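A quick way to spot daemons still running an old release is to filter the ``ceph versions`` output. The JSON below is a trimmed sample (an assumption, not real cluster output); on a live cluster, pipe the real command instead of the here-doc:

```shell
# Trimmed sample of `ceph versions` JSON (assumed shape).
versions=$(cat <<'EOF'
{
    "mon": { "ceph version 16.2.11 pacific (stable)": 3 },
    "mgr": { "ceph version 16.2.11 pacific (stable)": 3 },
    "osd": { "ceph version 14.2.22 nautilus (stable)": 116 }
}
EOF
)
# Lines that mention a ceph version but not pacific still need upgrading:
echo "$versions" | grep 'ceph version' | grep -v pacific
```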
Upgrade OSD
===========
Proceed as above for the MONs, one node at a time, by first updating the
package manager configuration file and then upgrading the packages::

  sed -i -e 's/nautilus/pacific/' /etc/yum.repos.d/ceph_stable.repo
  yum -y update

Finally, restart all OSD daemons on the node with::

  systemctl restart ceph-osd.target

Check with::

  ceph versions

Note that after the upgrade to Pacific, OSD daemons need to perform a one-time
internal conversion (see the official docs for more info), which takes some
time: this results in some PGs being `active+clean+laggy`.
The consequence is that at some point you may see "slow ops": if this is the
case, pause any OSD restart and wait for the cluster to calm down before
proceeding.
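One way to honour that advice is to poll the cluster between OSD restarts and only continue once nothing is laggy. In this sketch, `cluster_status` is a mock standing in for ``ceph -s``; the polling logic, not the mock, is the point:

```shell
# Mock for `ceph -s`: reports a laggy PG on the first two polls, then a
# clean cluster. On a real node, call `ceph -s` instead.
echo 0 > /tmp/poll_count
cluster_status() {
    n=$(cat /tmp/poll_count); n=$((n + 1)); echo "$n" > /tmp/poll_count
    if [ "$n" -lt 3 ]; then
        echo "pg 2.1f active+clean+laggy"
    else
        echo "all PGs active+clean"
    fi
}
# Pause between OSD restarts until no PG reports as laggy:
while cluster_status | grep -q 'laggy'; do
    sleep 1    # on a real cluster, wait much longer between polls
done
echo "cluster is quiet, safe to restart the next OSD"
```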
Final steps
===========
OSD omap update
---------------
After the upgrade, ``ceph -s`` will show ``HEALTH_WARN`` with a message similar to::

  116 OSD(s) reporting legacy (not per-pool) BlueStore omap usage stats

To fix that you will need to log into each OSD server and execute something
similar to the following (more info
`here <https://docs.ceph.com/en/latest/rados/operations/health-checks/#bluestore-no-per-pool-omap>`_)::

  df | grep ceph | awk '{print $NF}' | awk -F- '{print "systemctl stop ceph-osd@"$NF" ; sleep 10 ; ceph osd set noup ; sleep 3 ; time ceph-bluestore-tool repair --path "$0" ; sleep 5 ; ceph osd unset noup ; sleep 3 ; systemctl start ceph-osd@"$NF" ; sleep 300"}' > /tmp/do
  . /tmp/do

Please note that the above command may cause `slow ops`, both during the repair
and during the OSD restarts, so ensure you allow enough time between OSDs and
carefully pick the time when you perform the upgrade.
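The one-liner is dense; the sketch below generates the same per-OSD sequence into ``/tmp/do`` in a more readable form (sleeps omitted for brevity). The sample ``df`` output is an assumption; on a live OSD node use ``df | grep ceph`` instead of the here-doc:

```shell
# Sample `df` lines for two OSD mounts (assumed); on a live OSD node
# replace the here-doc with `df | grep ceph`.
df_sample=$(cat <<'EOF'
/dev/sdb1 3905109820 1022 3905108798 1% /var/lib/ceph/osd/ceph-12
/dev/sdc1 3905109820 1022 3905108798 1% /var/lib/ceph/osd/ceph-37
EOF
)
# Generate one stop/repair/start sequence per OSD into /tmp/do, to be
# reviewed and then sourced, exactly like the one-liner above
# (the `sleep` pauses between steps are omitted here for brevity):
echo "$df_sample" | awk '{print $NF}' | while read -r path; do
    id=${path##*-}    # trailing number of the mount point is the OSD id
    printf 'systemctl stop ceph-osd@%s\n' "$id"
    printf 'ceph osd set noup\n'
    printf 'time ceph-bluestore-tool repair --path %s\n' "$path"
    printf 'ceph osd unset noup\n'
    printf 'systemctl start ceph-osd@%s\n' "$id"
done > /tmp/do
cat /tmp/do
```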
OSD enable RocksDB sharding
---------------------------
This needs to be done once, if OSD disks were upgraded from previous versions; also read
`here <https://docs.ceph.com/en/pacific/rados/configuration/bluestore-config-ref/#bluestore-rocksdb-sharding>`_.
As it requires the OSD to be stopped, it may be useful to combine this step with the one above.
The operation needs to be performed only on OSD disks which have not yet been
sharded; check the output of the following command::

  systemctl stop ceph-osd@##
  ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-## --command show-sharding
  systemctl start ceph-osd@##

If the OSD needs to be sharded, execute::

  systemctl stop ceph-osd@##
  ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-## --sharding="m(3) p(3,0-12) O(3,0-13)=block_cache={type=binned_lru} L P" reshard
  systemctl start ceph-osd@##

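Resharding only the disks that need it can be scripted. In the sketch below, `show_sharding` is a mock standing in for ``ceph-bluestore-tool ... --command show-sharding``, and it is an assumption that an already-sharded OSD reports exactly the target spec:

```shell
# Target sharding spec, as used in the reshard command above.
target='m(3) p(3,0-12) O(3,0-13)=block_cache={type=binned_lru} L P'
# Mock for `ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-$1
#           --command show-sharding`:
# assume OSD 12 is already sharded and OSD 37 is not.
show_sharding() {
    if [ "$1" = "12" ]; then echo "$target"; else echo ""; fi
}
# List only the OSDs whose current sharding differs from the target:
needs=$(for id in 12 37; do
    if [ "$(show_sharding "$id")" != "$target" ]; then
        echo "OSD ${id} needs resharding"
    fi
done)
echo "$needs"
```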
MON insecure global id reclaim
------------------------------
This warning comes from Ceph addressing a security vulnerability.
The warning can be silenced: please check, for example,
`this page <https://www.suse.com/support/kb/doc/?id=000019960>`_.
If you want to address the issue, note that there are two sides to it: clients
using insecure global_id reclaim, and MONs allowing insecure global_id reclaim.
The output of `ceph health detail` will clearly show whether you are affected
by either one.
Clients using insecure global_id reclaim need to be updated before proceeding.
They are clearly shown in the output of `ceph health detail`.
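The two sides can be told apart by filtering ``ceph health detail``. The sample below is an assumed output shape (including the hypothetical client address) showing a cluster affected by both:

```shell
# Sample `ceph health detail` fragment (assumed shape); replace the
# here-doc with the real command's output.
health=$(cat <<'EOF'
[WRN] AUTH_INSECURE_GLOBAL_ID_RECLAIM: clients are using insecure global_id reclaim
    client.admin at 10.0.0.5:0/123 is using insecure global_id reclaim
[WRN] AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED: mons are allowing insecure global_id reclaim
    mon.cephmon1 has auth_allow_insecure_global_id_reclaim set to true
EOF
)
# The trailing colon distinguishes the client-side warning from the
# mon-side (_ALLOWED) one:
echo "$health" | grep -q 'AUTH_INSECURE_GLOBAL_ID_RECLAIM:' && echo "clients still need updating"
echo "$health" | grep -q 'AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED:' && echo "mons still allow insecure reclaim"
```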
Once all clients are updated and `ceph health detail` only complains about the MONs, like this::

  [WRN] AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED: mons are allowing insecure global_id reclaim
      mon.cephmon1 has auth_allow_insecure_global_id_reclaim set to true

you can disable the insecure mechanism with::

  ceph config set mon auth_allow_insecure_global_id_reclaim false
Tidying it up
-------------
Please take a minute to check the official docs: we assume that other suggested
configurations have already been applied to your cluster (e.g., msgr v2 or
straw2 buckets), so we won't discuss them here.
Finally, disallow pre-Pacific OSDs and unset the `noout` flag::

  ceph osd require-osd-release pacific
  ceph osd unset noout