[Gluster-users] set up unbalanced replication scheme?

Ian Malone

2018-09-18 09:37:36 UTC

Post by Ian Malone
Hi,
Maybe this is crazy, but I've been wondering if it's possible to
unevenly mix replication and distribution among bricks?
The reason is we have an academic department with an enterprise NAS and a
whole lot of Linux workstations. The NAS sits behind a server which
serves things over NFS (we could serve directly from the NAS, but we are
currently in a bit of a transition). The server is actually a VM on a
hypervisor cluster, so there is redundancy in both the storage and the
server, and things are also sent to an offsite backup.
However, we have home directories on those Linux workstations mounted
from NFS, and a couple of remote(-ish) sites. So from time to time, if
we have network issues, or an issue develops on the server that can't
be solved by a failover, Linux users cannot get into a working desktop
environment.
I've been wondering if putting these home directories on glusterfs is
an answer. Well, of course it is, but I've also been wondering if it's
still possible to have that data replicated onto our NAS (and then
onto backups) so we have all those archives and snapshotting features
available. One option is to set up a couple of extra small servers,
probably one for each offsite location, to host the home directory
data (to be kept fairly small), set these up as a storage pool, and
add a brick on the NAS-backed VM too. With this set up to replicate,
all three would have a copy of all the data. However, if we wanted to
use smaller servers, or potentially even host those bricks on some (or
all of?) the workstations, it would be nice to have all the data
replicated on the brick on the big VM, but distributed across the other
bricks. Is there any way of arranging that?
Aside: if anyone has suggestions for a glusterfs host a bit more
sturdy than a Raspberry Pi, but a lot cheaper than a PowerEdge, that'd
be very useful!

So after a bit more thought on my own question:
I considered the possibility of using geo-replication for this.
However, it wouldn't provide transparent failover. The way to use it
would probably be to make the distributed cluster the primary and the
VM+NAS the mirror, so that the VM+NAS is effectively just a backup,
and to point all clients at the distributed cluster.
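
For reference, a rough sketch of what that geo-replication setup might
look like, written as a small Python helper that only builds the
command lines. The volume names and hostname are hypothetical, and the
sketch assumes the SSH key setup between the two sides has already
been done; check the subcommands against the documentation for your
GlusterFS version.

```python
def geo_replication_commands(primary_volume, secondary_host, secondary_volume):
    """Build the gluster CLI calls for a one-way geo-replication session.

    All names passed in are placeholders chosen for illustration.
    """
    base = [
        "gluster", "volume", "geo-replication",
        primary_volume, f"{secondary_host}::{secondary_volume}",
    ]
    return [
        base + ["create", "push-pem"],  # create the session and distribute keys
        base + ["start"],               # begin syncing primary -> secondary
        base + ["status"],              # check that the session is healthy
    ]


# Hypothetical names: distributed cluster volume "homes", backup on "bigvm".
for command in geo_replication_commands("homes", "bigvm", "homes-backup"):
    print(" ".join(command))
```

Because the sync is one-way and asynchronous, clients would only ever
mount the primary volume, which matches the "just a backup" role above.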

Balancing bricks. If I understand correctly, the way to achieve what I
described in my first email is actually to have multiple bricks on the
VM+NAS ('big') server, one for each brick in the distributed cluster,
and run replication + distribution, setting up the scheme so each brick
on the 'big' server is a replica of one on the distributed nodes. If
another distributed node is added, another brick must be added to the
'big' server. Clients mounting directories using native glusterfs will
then just connect to whatever is most appropriate.
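
A minimal sketch of the brick layout described above, assuming that
GlusterFS groups consecutive bricks on the command line into replica
sets when a replica count is given. The hostnames, brick paths, and
volume name are hypothetical, and the helper only builds the command
lines rather than running them.

```python
BRICK_ROOT = "/data/glusterfs/home"  # hypothetical brick directory


def paired_bricks(small_hosts, big_host, brick_root=BRICK_ROOT):
    """Order bricks so each small host's brick is followed by its partner
    brick on the big server; each adjacent pair forms one replica set."""
    bricks = []
    for index, host in enumerate(small_hosts, start=1):
        bricks.append(f"{host}:{brick_root}/brick{index}")
        bricks.append(f"{big_host}:{brick_root}/brick{index}")
    return bricks


def create_command(volume, small_hosts, big_host):
    """Command for a distributed-replicated volume with 2-way replica sets."""
    return ["gluster", "volume", "create", volume, "replica", "2"] + \
        paired_bricks(small_hosts, big_host)


def add_node_command(volume, new_host, big_host, index, brick_root=BRICK_ROOT):
    """Command to grow the volume by one small node plus its partner brick."""
    return [
        "gluster", "volume", "add-brick", volume,
        f"{new_host}:{brick_root}/brick{index}",
        f"{big_host}:{brick_root}/brick{index}",
    ]


# Hypothetical hosts: three workstations paired against one big server.
print(" ".join(create_command("homes", ["ws1", "ws2", "ws3"], "bigvm")))
print(" ".join(add_node_command("homes", "ws4", "bigvm", 4)))
```

With this ordering the big server ends up holding a copy of every
file, while each small node holds only its share. Two-way replica sets
are prone to split-brain, so an arbiter brick per pair may be worth
considering.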

Still open to any hardware suggestions. I have been wondering about
Intel NUCs or HP ProLiant MicroServers.

--
imalone
http://ibmalone.blogspot.co.uk