Instructions for Ceph-Octopus can be found at the end of this post
I had an issue zapping the drives using ceph-deploy. Ceph-deploy will not clean drives that already have data on them, and running wipefs --all --force /dev/sdx on the target host didn't work either.
ceph-deploy disk zap ceph02 /dev/sdb
Error:
[ceph02][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph02][WARNIN] --> RuntimeError: could not complete wipefs on device: /dev/centos_ceph01/swap
[ceph02][ERROR ] RuntimeError: command returned non-zero exit status: 1
Solution: SSH into the target server (in this case ceph02) and run lvscan to list the leftover logical volumes.
ssh ceph02
lvscan
Active '/dev/cc_ceph02/swap'
Active '/dev/cc_ceph02/home'
Active '/dev/cc_ceph02/root'
You can now run lvremove to delete each logical volume reported by lvscan. Be careful not to wipe the ‘live’ OS drive; use lsblk to confirm which disk it is, as in the quick check below.
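A quick sanity check before removing anything (a minimal sketch; the device and volume names are the ones from this example):
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
The disk whose partitions carry / and /boot is the live OS disk; leave its logical volumes alone. Once you are sure, remove the leftover volumes: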
lvremove /dev/cc_ceph02/swap
lvremove /dev/cc_ceph02/home
lvremove /dev/cc_ceph02/root
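Alternatively, if every logical volume in the group sits on the disk you are about to zap, the whole volume group can be dropped in one step (a sketch; cc_ceph02 and /dev/sdb2 come from the lsblk output further down, and the assumption is that nothing in this VG belongs to the OS disk):
vgremove cc_ceph02
pvremove /dev/sdb2
vgremove asks for confirmation before removing each logical volume; pvremove then clears the LVM label from the old partition.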
Go back to the server where ceph-deploy is installed and zap the drives:
ceph-deploy disk zap ceph02 /dev/sdb
ceph-deploy disk zap ceph02 /dev/sdc
[ceph02][WARNIN] --> Zapping successful for: <Raw Device: /dev/sdb>
[ceph02][WARNIN] --> Zapping successful for: <Raw Device: /dev/sdc>
This was the disk layout before zapping the drives:
# lsblk
sdb 8:16 0 111.8G 0 disk
|-sdb1 8:17 0 1G 0 part
|-sdb2 8:18 0 110.8G 0 part
|-cc_ceph02-swap 253:5 0 5.9G 0 lvm
|-cc_ceph02-home 253:6 0 54.9G 0 lvm
|-cc_ceph02-root 253:7 0 50G 0 lvm
sdc 8:32 0 223.6G 0 disk
|-sdc1 8:33 0 1G 0 part
|-sdc2 8:34 0 222.6G 0 part
After zapping the drives:
# lsblk
sdb 8:16 0 111.8G 0 disk
sdc 8:32 0 223.6G 0 disk
We are now ready to create OSDs on ceph02 using /dev/sdb and /dev/sdc.
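A sketch of the OSD-creation step, assuming ceph-deploy 2.x syntax (same admin node, host and devices as above):
ceph-deploy osd create --data /dev/sdb ceph02
ceph-deploy osd create --data /dev/sdc ceph02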
CEPH-OCTOPUS
I had removed the partitions of some HDs using fdisk /dev/sdx, but in the process I didn't remove the filesystem signatures. So the following command to add OSDs failed:
ceph orch daemon add osd ceph01:/dev/sdb
RuntimeError: cephadm exited with an error code: 1, stderr:INFO:cephadm:/bin/podman:stderr Running command: /usr/bin/ceph-authtool --gen-print-key
INFO:cephadm:/bin/podman:stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 7a50a3cd-02c0-4239-a673-53b5dec2aa13
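As an aside, you can ask cephadm which disks it considers usable before adding OSDs; ceph orch device ls is a standard Octopus command that lists each device per host and whether it is available (exact columns vary by version):
ceph orch device ls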
The solution was to SSH to the server and use wipefs to remove the filesystem signatures:
[root@ceph01]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 223.6G 0 disk
|__ sdb1 8:17 0 223.6G 0 part
sdc 8:32 0 223.6G 0 disk
|__ sdc1 8:33 0 223.6G 0 part
Running:
wipefs -a /dev/sdb
wipefs -a /dev/sdc
[root@ceph01]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 223.6G 0 disk
sdc 8:32 0 223.6G 0 disk
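With the signatures gone, the add command from earlier should succeed, and the result can be checked with standard Ceph commands (host and devices as in this example):
ceph orch daemon add osd ceph01:/dev/sdb
ceph orch daemon add osd ceph01:/dev/sdc
ceph osd tree
The new OSDs should show up under the ceph01 host in the tree.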