I have an OKD cluster with GlusterFS as a storage class and Heketi as a frontend. Everything worked fine until the Heketi database was destroyed. Now I can't make any changes to the storage: I can't add new persistent volumes or remove existing ones. GlusterFS itself still works fine, serving existing persistent volumes to pods.
I tried to recreate the Heketi database by loading a topology file, but it looks like Heketi tries to create an LVM physical volume on a device that already contains the LVM setup backing the working GlusterFS. When I load the topology I see the following line in the Heketi logs:
(kubeexec) DEBUG 2021/01/23 17:04:39 heketi/pkg/remoteexec/log/commandlog.go:34:log.(*CommandLogger).Before: Will run command (/usr/sbin/lvm pvcreate -qq --metadatasize=128M --dataalignment=256K '/dev/sdb') on (pod:glusterfs-storage-vdm96 c:glusterfs ns:glusterfs (from host:okd-admdev-compute1 selector:glusterfs-node))
The Heketi client hangs while adding the device to the cluster and then times out:
(root@heketi-storage-12-wn652 tmp)# heketi-cli topology load --json=topo.json
Creating cluster ... ID: 6a65d3bce35760e5075db0cae6ed8e7e
	Allowing file volumes on cluster.
	Allowing block volumes on cluster.
	Creating node okd-admdev-compute1 ... ID: 7da6b2b1e4f9a723cfd769618ef36a51
		Adding device /dev/sdb ... Unable to add device: Initializing device /dev/sdb failed (failed to check device contents): timeout
	Creating node okd-admdev-compute2 ... ID: e63f5366838492219a8f929ee4cc67a7
		Adding device /dev/sdb ...
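For reference, the topology file I'm loading follows the usual Heketi shape, roughly like the sketch below (the storage IP and zone here are placeholders, not my real values; my actual topo.json lists both compute nodes, each with /dev/sdb):

```json
{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": ["okd-admdev-compute1"],
              "storage": ["192.168.0.11"]
            },
            "zone": 1
          },
          "devices": [
            { "name": "/dev/sdb", "destroydata": false }
          ]
        }
      ]
    }
  ]
}
```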
How can I recreate the Heketi database without reinitializing the devices, so that the devices with their existing data are reused?