Discussion:
[Gluster-users] How to use system.affinity/distributed.migrate-data on distributed/replicated volume?
Ingo Fischer
2018-10-24 10:54:45 UTC
Permalink
Hi,

I have setup a glusterfs volume gv0 as distributed/replicated:

***@pm1:~# gluster volume info gv0

Volume Name: gv0
Type: Distributed-Replicate
Volume ID: 64651501-6df2-4106-b330-fdb3e1fbcdf4
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: 192.168.178.50:/gluster/brick1/gv0
Brick2: 192.168.178.76:/gluster/brick1/gv0
Brick3: 192.168.178.50:/gluster/brick2/gv0
Brick4: 192.168.178.81:/gluster/brick1/gv0
Brick5: 192.168.178.50:/gluster/brick3/gv0
Brick6: 192.168.178.82:/gluster/brick1/gv0
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet


***@pm1:~# gluster volume status
Status of volume: gv0
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 192.168.178.50:/gluster/brick1/gv0 49152 0 Y
1665
Brick 192.168.178.76:/gluster/brick1/gv0 49152 0 Y
26343
Brick 192.168.178.50:/gluster/brick2/gv0 49153 0 Y
1666
Brick 192.168.178.81:/gluster/brick1/gv0 49152 0 Y
1161
Brick 192.168.178.50:/gluster/brick3/gv0 49154 0 Y
1679
Brick 192.168.178.82:/gluster/brick1/gv0 49152 0 Y
1334
Self-heal Daemon on localhost N/A N/A Y
5022
Self-heal Daemon on 192.168.178.81 N/A N/A Y
935
Self-heal Daemon on 192.168.178.82 N/A N/A Y
1057
Self-heal Daemon on pm2.fritz.box N/A N/A Y
1651


I use the filesystem to store VM images, so not many files, but big ones.

The distribution has now put four big files on one brick set and only one
file on another. This means that the one brick set will be "overcommitted"
as soon as all VMs use their maximum space. So I would like to manually
redistribute the files a bit more evenly.
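
For reference, the current placement can be checked from the FUSE mount like
this (the path below is just an example file on my volume):

# the virtual pathinfo xattr reports which bricks hold the file
getfattr -n trusted.glusterfs.pathinfo /mnt/pve/glusterfs/images/201/vm-201-disk-0.qcow2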

After a lot of googling I found that the following should work:
setfattr -n 'system.affinity' -v $location $filepath
setfattr -n 'distribute.migrate-data' -v 'force' $filepath

But I am having problems with it: it either gives errors or does nothing at all.

The mount looks like this:
192.168.178.50:gv0 on /mnt/pve/glusterfs type fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)


Here is what I tried for the first xattr:

***@pm1:~# setfattr -n 'system.affinity' -v 'gv0-client-5'
/mnt/pve/glusterfs/201/imagesvm201.qcow2
setfattr: /mnt/pve/glusterfs/201/imagesvm201.qcow2: Operation not supported

Google then pointed me to using trusted.affinity instead, and yes, that works.
I am just not sure whether the value "gv0-client-5" is correct to move the
file to "Brick5" from "gluster volume info gv0", or how this value is built.
The commit message from http://review.gluster.org/#/c/glusterfs/+/5233/ says:
"The value is the internal client or AFR brick name where you want the
file to be."

So what do I need to set there? Maybe I need the AFR name because the volume
is replicated? But where do I get that name from?
I also tried other client or replicate names like "gv0-replicate-0", which
seems more fitting for a replicated volume, but the result is the same.
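
As far as I understand, these internal names come from the client volfile: the
bricks appear there as gv0-client-0 ... gv0-client-5 (in the same order as in
"gluster volume info") and the replica pairs as gv0-replicate-0 ...
gv0-replicate-2. One way to list them (just a sketch; the getspec syntax and
the volfile path may differ between versions):

# dump the client volfile and show the internal subvolume names
gluster system:: getspec gv0 | grep '^volume '
# or read it directly on one of the servers
grep '^volume ' /var/lib/glusterd/vols/gv0/trusted-gv0.tcp-fuse.vol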


For the second command I get:
***@pm1:~# setfattr -n 'distribute.migrate-data' -v 'force'
/mnt/pve/glusterfs/201/imagesvm201.qcow2
setfattr: /mnt/pve/glusterfs/images/201/vm-201-disk-0.qcow2: Operation
not supported
***@pm1:~# setfattr -n 'trusted.distribute.migrate-data' -v 'force'
/mnt/pve/glusterfs/201/imagesvm201.qcow2
setfattr: /mnt/pve/glusterfs/images/201/vm-201-disk-0.qcow2: File exists

I also experimented with other "names" than "gv0-client-5" above, but the
result is always the same.

I saw that instead of the second command I could start a rebalance with
force, but this also did nothing: it ended after at most one second and moved
nothing.
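
The rebalance attempt was along these lines (roughly, from memory):

# force a rebalance and then check what, if anything, was migrated
gluster volume rebalance gv0 start force
gluster volume rebalance gv0 status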

Can someone please advise how to do this right?


Another idea was to enable NUFA and kind of "re-copy" the files on the
glusterfs, but here it seems that the documentation is wrong.
gluster volume set gv0 cluster.nufa enable on

Is

gluster volume set gv0 cluster.nufa 1

correct?
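
For reference, I would check which value glusterd has actually accepted like
this (assuming "volume get" knows this option):

# show the currently active value of the NUFA option
gluster volume get gv0 cluster.nufa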

Thank you very much!

Ingo
--
Ingo Fischer
Technical Director of Platform

Gameforge 4D GmbH
Albert-Nestler-Straße 8
76131 Karlsruhe
Germany

Tel. +49 721 354 808-2269

***@gameforge.com

http://www.gameforge.com
Amtsgericht Mannheim, Handelsregisternummer 718029
USt-IdNr.: DE814330106
Geschäftsführer Alexander Rösner, Jeffrey Brown
Ingo Fischer
2018-10-28 22:02:22 UTC
Permalink
Hi All,

does nobody have an idea on system.affinity / distribute.migrate-data?
Or how to correctly enable NUFA?

BTW: the gluster version in use is 4.1.5

Thank you for your help on this!

Ingo
Vlad Kopylov
2018-10-31 03:57:11 UTC
Permalink
NUFA helps you write to the local brick; if replication is involved it will
still copy the file to the other bricks (or is supposed to).
What might be happening is that when the initial file was created the other
nodes were down, so it didn't replicate properly and now heal is failing.
Check your

gluster vol heal Volname info

Maybe you will find out where the second copy of the file is supposed to be,
and can just copy it to that brick.
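
For this particular volume that would be something like:

# list entries pending heal on gv0
gluster volume heal gv0 info
# and, if anything looks stuck, check for split-brain entries
gluster volume heal gv0 info split-brain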
_______________________________________________
Gluster-users mailing list
https://lists.gluster.org/mailman/listinfo/gluster-users
n***@fischer-ka.de
2018-10-31 10:58:09 UTC
Permalink
Hi Vlad,

because I never got affinity and distribute.migrate-data via xattrs working,
I finally invested some time in NUFA and found out that (contrary to the
docs) "gluster volume set gv0 cluster.nufa 1" works to enable NUFA, so that
is what I did with the data I already had in place.

Then I copied the files off the storage and copied them back on the correct
host, and so redistributed them the way I wanted.
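
Roughly, per image (sketched here with rsync and example paths; --sparse keeps
the qcow2 files thin, and the copy-back is done on the node whose local brick
should receive the file, since NUFA prefers the local brick):

# 1) copy the image off the gluster mount to local scratch space
rsync --sparse --progress /mnt/pve/glusterfs/images/201/vm-201-disk-0.qcow2 /root/scratch/
# 2) remove the original so the copy-back creates a new file
rm /mnt/pve/glusterfs/images/201/vm-201-disk-0.qcow2
# 3) copy it back; with NUFA the new file lands on this node's brick (and its replica partner)
rsync --sparse --progress /root/scratch/vm-201-disk-0.qcow2 /mnt/pve/glusterfs/images/201/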

It would have been easier if setting a "custom" affinity had worked, but I
have now reached my goal.

Ingo