[Gluster-users] rhev/gluster questions, possibly restoring from a failed node

Dan Lavu

2018-03-26 12:53:58 UTC

In my lab, one of my RAID cards started acting up and took one of my three
gluster nodes offline (two nodes with data and an arbiter node). I'm hoping
it's simply the backplane, but while I was troubleshooting and waiting for
parts, the hypervisors were fenced. Since then the firewall was replaced,
and now several VMs are not starting correctly. I have fsck, scandisk and
xfs_repair running on the problematic VMs, but it's been 2 days and
xfs_repair hasn't moved an inch.

When the gluster volume has an offline brick, should it be slow?
If it's supposed to be slow, does it make sense to remove the bricks and
re-add them when the failed node is back?
Is the VM unresponsiveness caused by the degraded gluster volume?

Lastly, in the event that the VMs are totally gone and I need to restore
from the failed node, I was going to simply copy/scp the contents of the
gluster directory onto the running nodes through glusterfs. Is there a
better way of restoring and overwriting the data on the good nodes?

Thanks,

Dan