Discussion:
[Gluster-users] Stale locks on shards
Samuli Heinonen
2018-01-20 19:57:21 UTC
Hi all!

One hypervisor in our virtualization environment crashed, and now some of
the VM images cannot be accessed. After investigation we found out that
there were lots of images that still had an active lock held by the crashed
hypervisor. We were able to remove the locks from "regular files", but it
doesn't seem possible to remove locks from shards.

We are running GlusterFS 3.8.15 on all nodes.

Here is part of a statedump that shows a shard with an active lock from the
crashed node:
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21
mandatory=0
inodelk-count=1
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 3568, owner=14ce372c397f0000, client=0x7f3198388770, connection-id=ovirt8z2.xxx-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-1-7-0, granted at 2018-01-20 08:57:24
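
For reference, a statedump like this can be generated on a storage node roughly
as follows (the dump location is from memory and depends on the
server.statedump-path setting, so treat that part as an assumption):

# gluster volume statedump zone2-ssd1-vmstor1
# ls /var/run/gluster/*.dump.*    # one dump file per brick process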

If we try to run clear-locks, we get the following error message:
# gluster volume clear-locks zone2-ssd1-vmstor1
/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21 kind all inode
Volume clear-locks unsuccessful
clear-locks getxattr command failed. Reason: Operation not permitted
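
For comparison, the same kind of command did work for us on regular image
files, roughly like this (the image path below is a made-up example, not one
of the affected files):

# gluster volume clear-locks zone2-ssd1-vmstor1 /images/example-vm.img kind all inode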

Gluster vol info if needed:
Volume Name: zone2-ssd1-vmstor1
Type: Replicate
Volume ID: b6319968-690b-4060-8fff-b212d2295208
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: rdma
Bricks:
Brick1: sto1z2.xxx:/ssd1/zone2-vmstor1/export
Brick2: sto2z2.xxx:/ssd1/zone2-vmstor1/export
Options Reconfigured:
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
performance.client-io-threads: off
storage.linux-aio: off
performance.readdir-ahead: on
client.event-threads: 16
server.event-threads: 16
performance.strict-write-ordering: off
performance.quick-read: off
performance.read-ahead: on
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: on
cluster.quorum-type: none
network.ping-timeout: 22
performance.write-behind: off
nfs.disable: on
features.shard: on
features.shard-block-size: 512MB
storage.owner-uid: 36
storage.owner-gid: 36
performance.io-thread-count: 64
performance.cache-size: 2048MB
performance.write-behind-window-size: 256MB
server.allow-insecure: on
cluster.ensure-durability: off
config.transport: rdma
server.outstanding-rpc-limit: 512
diagnostics.brick-log-level: INFO

Any recommendations on how to proceed from here?

Best regards,
Samuli Heinonen
Samuli Heinonen
2018-01-21 19:03:38 UTC
Hi again,

here is more information regarding the issue described earlier.

It looks like self-healing is stuck. According to "heal statistics" the
crawl began at Sat Jan 20 12:56:19 2018 and it's still going on (it's
around Sun Jan 21 20:30 as I write this). However, glustershd.log says
that the last heal was completed at "2018-01-20 11:00:13.090697" (which is
13:00 UTC+2). Also, "heal info" has now been running for over 16 hours
without printing any information. In the statedump I can see that the
storage nodes hold locks on files and that some of those are blocked.
I.e. here again it says that ovirt8z2 holds an active lock even though
ovirt8z2 crashed after the lock was granted:

[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27
mandatory=0
inodelk-count=3
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid =
18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0,
granted at 2018-01-20 10:59:52
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid =
3420, owner=d8b9372c397f0000, client=0x7f8858410be0,
connection-id=ovirt8z2.xxx.com-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-0-7-0,
granted at 2018-01-20 08:57:23
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid =
18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0,
blocked at 2018-01-20 10:59:52
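
To be explicit, the "heal statistics" and "heal info" output mentioned above
come from the usual commands, i.e. roughly:

# gluster volume heal zone2-ssd1-vmstor1 statistics
# gluster volume heal zone2-ssd1-vmstor1 info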

I'd also like to add that the volume had an arbiter brick before the crash
happened. We decided to remove it because we thought that it was causing
issues; however, now I think that this was unnecessary. After the crash the
arbiter logs had lots of messages like this:
[2018-01-20 10:19:36.515717] I [MSGID: 115072]
[server-rpc-fops.c:1640:server_setattr_cbk] 0-zone2-ssd1-vmstor1-server:
37374187: SETATTR <gfid:a52055bd-e2e9-42dd-92a3-e96b693bcafe>
(a52055bd-e2e9-42dd-92a3-e96b693bcafe) ==> (Operation not permitted)
[Operation not permitted]

Is there any way to force self-heal to stop? Any help would be very much
appreciated :)

Best regards,
Samuli Heinonen
20 January 2018 at 21.57
Hi all!
One hypervisor on our virtualization environment crashed and now some
of the VM images cannot be accessed. After investigation we found out
that there was lots of images that still had active lock on crashed
hypervisor. We were able to remove locks from "regular files", but it
doesn't seem possible to remove locks from shards.
We are running GlusterFS 3.8.15 on all nodes.
Here is part of statedump that shows shard having active lock on
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21
mandatory=0
inodelk-count=1
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid =
3568, owner=14ce372c397f0000, client=0x7f3198388770, connection-id
ovirt8z2.xxx-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-1-7-0,
granted at 2018-01-20 08:57:24
# gluster volume clear-locks zone2-ssd1-vmstor1
/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21 kind all inode
Volume clear-locks unsuccessful
clear-locks getxattr command failed. Reason: Operation not permitted
Volume Name: zone2-ssd1-vmstor1
Type: Replicate
Volume ID: b6319968-690b-4060-8fff-b212d2295208
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: rdma
Brick1: sto1z2.xxx:/ssd1/zone2-vmstor1/export
Brick2: sto2z2.xxx:/ssd1/zone2-vmstor1/export
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
performance.client-io-threads: off
storage.linux-aio: off
performance.readdir-ahead: on
client.event-threads: 16
server.event-threads: 16
performance.strict-write-ordering: off
performance.quick-read: off
performance.read-ahead: on
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: on
cluster.quorum-type: none
network.ping-timeout: 22
performance.write-behind: off
nfs.disable: on
features.shard: on
features.shard-block-size: 512MB
storage.owner-uid: 36
storage.owner-gid: 36
performance.io-thread-count: 64
performance.cache-size: 2048MB
performance.write-behind-window-size: 256MB
server.allow-insecure: on
cluster.ensure-durability: off
config.transport: rdma
server.outstanding-rpc-limit: 512
diagnostics.brick-log-level: INFO
Any recommendations how to advance from here?
Best regards,
Samuli Heinonen
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users
Krutika Dhananjay
2018-01-23 07:19:12 UTC
Post by Samuli Heinonen
Hi again,
here is more information regarding issue described earlier
It looks like self healing is stuck. According to "heal statistics" crawl
began at Sat Jan 20 12:56:19 2018 and it's still going on (It's around Sun
Jan 21 20:30 when writing this). However glustershd.log says that last heal
was completed at "2018-01-20 11:00:13.090697" (which is 13:00 UTC+2). Also
"heal info" has been running now for over 16 hours without any information.
In statedump I can see that storage nodes have locks on files and some of
those are blocked. Ie. Here again it says that ovirt8z2 is having active
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27
mandatory=0
inodelk-count=3
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid =
18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0, granted at 2018-01-20 10:59:52
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid =
3420, owner=d8b9372c397f0000, client=0x7f8858410be0,
connection-id=ovirt8z2.xxx.com-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-0-7-0, granted at 2018-01-20 08:57:23
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid =
18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0, blocked at 2018-01-20 10:59:52
I'd also like to add that volume had arbiter brick before crash happened.
We decided to remove it because we thought that it was causing issues.
However now I think that this was unnecessary. After the crash arbiter logs
[2018-01-20 10:19:36.515717] I [MSGID: 115072] [server-rpc-fops.c:1640:server_setattr_cbk]
0-zone2-ssd1-vmstor1-server: 37374187: SETATTR
<gfid:a52055bd-e2e9-42dd-92a3-e96b693bcafe> (a52055bd-e2e9-42dd-92a3-e96b693bcafe)
==> (Operation not permitted) [Operation not permitted]
Is there anyways to force self heal to stop? Any help would be very much
appreciated :)
The locks are contending in the AFR self-heal and data-path domains. It's
possible that the deadlock is not caused by the hypervisor, since if that
were the case the locks should have been released when it
crashed/disconnected.

Adding AFR devs to check what's causing the deadlock in the first place.

-Krutika
Post by Samuli Heinonen
Best regards,
Samuli Heinonen
20 January 2018 at 21.57
Hi all!
One hypervisor on our virtualization environment crashed and now some of
the VM images cannot be accessed. After investigation we found out that
there was lots of images that still had active lock on crashed hypervisor.
We were able to remove locks from "regular files", but it doesn't seem
possible to remove locks from shards.
We are running GlusterFS 3.8.15 on all nodes.
Here is part of statedump that shows shard having active lock on crashed
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21
mandatory=0
inodelk-count=1
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid =
3568, owner=14ce372c397f0000, client=0x7f3198388770, connection-id
ovirt8z2.xxx-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-1-7-0,
granted at 2018-01-20 08:57:24
# gluster volume clear-locks zone2-ssd1-vmstor1 /.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21
kind all inode
Volume clear-locks unsuccessful
clear-locks getxattr command failed. Reason: Operation not permitted
Volume Name: zone2-ssd1-vmstor1
Type: Replicate
Volume ID: b6319968-690b-4060-8fff-b212d2295208
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: rdma
Brick1: sto1z2.xxx:/ssd1/zone2-vmstor1/export
Brick2: sto2z2.xxx:/ssd1/zone2-vmstor1/export
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
performance.client-io-threads: off
storage.linux-aio: off
performance.readdir-ahead: on
client.event-threads: 16
server.event-threads: 16
performance.strict-write-ordering: off
performance.quick-read: off
performance.read-ahead: on
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: on
cluster.quorum-type: none
network.ping-timeout: 22
performance.write-behind: off
nfs.disable: on
features.shard: on
features.shard-block-size: 512MB
storage.owner-uid: 36
storage.owner-gid: 36
performance.io-thread-count: 64
performance.cache-size: 2048MB
performance.write-behind-window-size: 256MB
server.allow-insecure: on
cluster.ensure-durability: off
config.transport: rdma
server.outstanding-rpc-limit: 512
diagnostics.brick-log-level: INFO
Any recommendations how to advance from here?
Best regards,
Samuli Heinonen
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users
Pranith Kumar Karampuri
2018-01-23 07:34:13 UTC
Post by Samuli Heinonen
Hi again,
here is more information regarding issue described earlier
It looks like self healing is stuck. According to "heal statistics" crawl
began at Sat Jan 20 12:56:19 2018 and it's still going on (It's around Sun
Jan 21 20:30 when writing this). However glustershd.log says that last heal
was completed at "2018-01-20 11:00:13.090697" (which is 13:00 UTC+2). Also
"heal info" has been running now for over 16 hours without any information.
In statedump I can see that storage nodes have locks on files and some of
those are blocked. Ie. Here again it says that ovirt8z2 is having active
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27
mandatory=0
inodelk-count=3
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid =
18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0, granted at 2018-01-20 10:59:52
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid =
3420, owner=d8b9372c397f0000, client=0x7f8858410be0,
connection-id=ovirt8z2.xxx.com-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-0-7-0, granted at 2018-01-20 08:57:23
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid =
18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0, blocked at 2018-01-20 10:59:52
I'd also like to add that volume had arbiter brick before crash happened.
We decided to remove it because we thought that it was causing issues.
However now I think that this was unnecessary. After the crash arbiter logs
[2018-01-20 10:19:36.515717] I [MSGID: 115072] [server-rpc-fops.c:1640:server_setattr_cbk]
0-zone2-ssd1-vmstor1-server: 37374187: SETATTR
<gfid:a52055bd-e2e9-42dd-92a3-e96b693bcafe> (a52055bd-e2e9-42dd-92a3-e96b693bcafe)
==> (Operation not permitted) [Operation not permitted]
Is there anyways to force self heal to stop? Any help would be very much
appreciated :)
Exposing .shard to a normal mount is opening a can of worms. You should
probably look at mounting the volume with the gfid aux-mount option, where
you can access a file as <path-to-mount>/.gfid/<gfid-string> to clear locks
on it.

Mount command: mount -t glusterfs -o aux-gfid-mount vm1:test /mnt/testvol

A gfid string will have some hyphens like: 11118443-1894-4273-9340-4b212fa1c0e4
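
As a rough sketch against your volume (the brick path is taken from your
volume info, the mount point is hypothetical, and the gfid you need is the
shard's own gfid; the hex value returned by getfattr maps to the hyphenated
gfid string):

# getfattr -n trusted.gfid -e hex /ssd1/zone2-vmstor1/export/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21
# mount -t glusterfs -o aux-gfid-mount sto1z2.xxx:/zone2-ssd1-vmstor1 /mnt/vmstor1
# stat /mnt/vmstor1/.gfid/<gfid-string-from-the-getfattr-output>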

That said, the next disconnect on the brick where you successfully ran
clear-locks will crash the brick. There was a bug in the 3.8.x series with
clear-locks which was fixed in 3.9.0 as part of a feature. The self-heal
deadlock that you witnessed is also fixed in the 3.10 release.

3.8.x is EOL, so I recommend you upgrade to a supported version soon.
Post by Samuli Heinonen
Best regards,
Samuli Heinonen
20 January 2018 at 21.57
Hi all!
One hypervisor on our virtualization environment crashed and now some of
the VM images cannot be accessed. After investigation we found out that
there was lots of images that still had active lock on crashed hypervisor.
We were able to remove locks from "regular files", but it doesn't seem
possible to remove locks from shards.
We are running GlusterFS 3.8.15 on all nodes.
Here is part of statedump that shows shard having active lock on crashed
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21
mandatory=0
inodelk-count=1
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid =
3568, owner=14ce372c397f0000, client=0x7f3198388770, connection-id
ovirt8z2.xxx-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-1-7-0,
granted at 2018-01-20 08:57:24
# gluster volume clear-locks zone2-ssd1-vmstor1 /.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21
kind all inode
Volume clear-locks unsuccessful
clear-locks getxattr command failed. Reason: Operation not permitted
Volume Name: zone2-ssd1-vmstor1
Type: Replicate
Volume ID: b6319968-690b-4060-8fff-b212d2295208
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: rdma
Brick1: sto1z2.xxx:/ssd1/zone2-vmstor1/export
Brick2: sto2z2.xxx:/ssd1/zone2-vmstor1/export
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
performance.client-io-threads: off
storage.linux-aio: off
performance.readdir-ahead: on
client.event-threads: 16
server.event-threads: 16
performance.strict-write-ordering: off
performance.quick-read: off
performance.read-ahead: on
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: on
cluster.quorum-type: none
network.ping-timeout: 22
performance.write-behind: off
nfs.disable: on
features.shard: on
features.shard-block-size: 512MB
storage.owner-uid: 36
storage.owner-gid: 36
performance.io-thread-count: 64
performance.cache-size: 2048MB
performance.write-behind-window-size: 256MB
server.allow-insecure: on
cluster.ensure-durability: off
config.transport: rdma
server.outstanding-rpc-limit: 512
diagnostics.brick-log-level: INFO
Any recommendations how to advance from here?
Best regards,
Samuli Heinonen
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users
--
Pranith
Pranith Kumar Karampuri
2018-01-23 07:36:38 UTC
On Tue, Jan 23, 2018 at 1:04 PM, Pranith Kumar Karampuri <
Post by Pranith Kumar Karampuri
Post by Samuli Heinonen
Hi again,
here is more information regarding issue described earlier
It looks like self healing is stuck. According to "heal statistics" crawl
began at Sat Jan 20 12:56:19 2018 and it's still going on (It's around Sun
Jan 21 20:30 when writing this). However glustershd.log says that last heal
was completed at "2018-01-20 11:00:13.090697" (which is 13:00 UTC+2). Also
"heal info" has been running now for over 16 hours without any information.
In statedump I can see that storage nodes have locks on files and some of
those are blocked. Ie. Here again it says that ovirt8z2 is having active
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27
mandatory=0
inodelk-count=3
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid =
18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-
zone2-ssd1-vmstor1-client-0-0-0, granted at 2018-01-20 10:59:52
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid =
3420, owner=d8b9372c397f0000, client=0x7f8858410be0, connection-id=
ovirt8z2.xxx.com-5652-2017/12/27-09:49:02:9468
25-zone2-ssd1-vmstor1-client-0-7-0, granted at 2018-01-20 08:57:23
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid =
18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-
zone2-ssd1-vmstor1-client-0-0-0, blocked at 2018-01-20 10:59:52
I'd also like to add that volume had arbiter brick before crash happened.
We decided to remove it because we thought that it was causing issues.
However now I think that this was unnecessary. After the crash arbiter logs
[2018-01-20 10:19:36.515717] I [MSGID: 115072]
[server-rpc-fops.c:1640:server_setattr_cbk] 0-zone2-ssd1-vmstor1-server:
37374187: SETATTR <gfid:a52055bd-e2e9-42dd-92a3-e96b693bcafe>
(a52055bd-e2e9-42dd-92a3-e96b693bcafe) ==> (Operation not permitted)
[Operation not permitted]
Is there anyways to force self heal to stop? Any help would be very much
appreciated :)
Exposing .shard to a normal mount is opening a can of worms. You should
probably look at mounting the volume with gfid aux-mount where you can
access a file with <path-to-mount>/.gfid/<gfid-string>to clear locks on it.
Please use this mount only for doing just this work and unmount it after
that. But my recommendation would be to do an upgrade as soon as possible.
Your bricks will crash on the next disconnect from 'sto2z2.xxx' if you are
not lucky.
Post by Pranith Kumar Karampuri
Mount command: mount -t glusterfs -o aux-gfid-mount vm1:test /mnt/testvol
A gfid string will have some hyphens like: 11118443-1894-4273-9340-4b212fa1c0e4
That said. Next disconnect on the brick where you successfully did the
clear-locks will crash the brick. There was a bug in 3.8.x series with
clear-locks which was fixed in 3.9.0 with a feature. The self-heal
deadlocks that you witnessed also is fixed in 3.10 version of the release.
3.8.x is EOLed, so I recommend you to upgrade to a supported version soon.
Post by Samuli Heinonen
Best regards,
Samuli Heinonen
20 January 2018 at 21.57
Hi all!
One hypervisor on our virtualization environment crashed and now some of
the VM images cannot be accessed. After investigation we found out that
there was lots of images that still had active lock on crashed hypervisor.
We were able to remove locks from "regular files", but it doesn't seem
possible to remove locks from shards.
We are running GlusterFS 3.8.15 on all nodes.
Here is part of statedump that shows shard having active lock on crashed
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21
mandatory=0
inodelk-count=1
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid =
3568, owner=14ce372c397f0000, client=0x7f3198388770, connection-id
ovirt8z2.xxx-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-1-7-0,
granted at 2018-01-20 08:57:24
# gluster volume clear-locks zone2-ssd1-vmstor1
/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21 kind all inode
Volume clear-locks unsuccessful
clear-locks getxattr command failed. Reason: Operation not permitted
Volume Name: zone2-ssd1-vmstor1
Type: Replicate
Volume ID: b6319968-690b-4060-8fff-b212d2295208
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: rdma
Brick1: sto1z2.xxx:/ssd1/zone2-vmstor1/export
Brick2: sto2z2.xxx:/ssd1/zone2-vmstor1/export
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
performance.client-io-threads: off
storage.linux-aio: off
performance.readdir-ahead: on
client.event-threads: 16
server.event-threads: 16
performance.strict-write-ordering: off
performance.quick-read: off
performance.read-ahead: on
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: on
cluster.quorum-type: none
network.ping-timeout: 22
performance.write-behind: off
nfs.disable: on
features.shard: on
features.shard-block-size: 512MB
storage.owner-uid: 36
storage.owner-gid: 36
performance.io-thread-count: 64
performance.cache-size: 2048MB
performance.write-behind-window-size: 256MB
server.allow-insecure: on
cluster.ensure-durability: off
config.transport: rdma
server.outstanding-rpc-limit: 512
diagnostics.brick-log-level: INFO
Any recommendations how to advance from here?
Best regards,
Samuli Heinonen
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users
--
Pranith
--
Pranith
Samuli Heinonen
2018-01-23 08:08:47 UTC
On Mon, Jan 22, 2018 at 12:33 AM, Samuli Heinonen
Post by Samuli Heinonen
Hi again,
here is more information regarding issue described earlier
It looks like self healing is stuck. According to "heal statistics"
crawl began at Sat Jan 20 12:56:19 2018 and it's still going on
(It's around Sun Jan 21 20:30 when writing this). However
glustershd.log says that last heal was completed at "2018-01-20
11:00:13.090697" (which is 13:00 UTC+2). Also "heal info" has been
running now for over 16 hours without any information. In statedump
I can see that storage nodes have locks on files and some of those
are blocked. Ie. Here again it says that ovirt8z2 is having active
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27
mandatory=0
inodelk-count=3
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid
= 18446744073709551610, owner=d0c6d857a87f0000,
client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0,
granted at 2018-01-20 10:59:52
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid
= 3420, owner=d8b9372c397f0000, client=0x7f8858410be0,
connection-id=ovirt8z2.xxx.com-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-0-7-0,
granted at 2018-01-20 08:57:23
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0,
pid = 18446744073709551610, owner=d0c6d857a87f0000,
client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0,
blocked at 2018-01-20 10:59:52
I'd also like to add that volume had arbiter brick before crash
happened. We decided to remove it because we thought that it was
causing issues. However now I think that this was unnecessary. After
[2018-01-20 10:19:36.515717] I [MSGID: 115072]
[server-rpc-fops.c:1640:server_setattr_cbk]
0-zone2-ssd1-vmstor1-server: 37374187: SETATTR
<gfid:a52055bd-e2e9-42dd-92a3-e96b693bcafe>
(a52055bd-e2e9-42dd-92a3-e96b693bcafe) ==> (Operation not permitted)
[Operation not permitted]
Is there anyways to force self heal to stop? Any help would be very
much appreciated :)
Exposing .shard to a normal mount is opening a can of worms. You
should probably look at mounting the volume with gfid aux-mount where
you can access a file with <path-to-mount>/.gfid/<gfid-string>to clear
locks on it.
Mount command: mount -t glusterfs -o aux-gfid-mount vm1:test
/mnt/testvol
11118443-1894-4273-9340-4b212fa1c0e4
That said. Next disconnect on the brick where you successfully did the
clear-locks will crash the brick. There was a bug in 3.8.x series with
clear-locks which was fixed in 3.9.0 with a feature. The self-heal
deadlocks that you witnessed also is fixed in 3.10 version of the release.
Thank you for the answer. Could you please tell me more about the crash?
What will actually happen, or is there a bug report about it? I just want
to make sure that we do everything we can to secure the data on the bricks.
We will look into upgrading, but we have to make sure that the new version
works for us, and of course get self-healing working before doing
anything :)

Br,
Samuli
3.8.x is EOLed, so I recommend you to upgrade to a supported version soon.
Post by Samuli Heinonen
Best regards,
Samuli Heinonen
Post by Samuli Heinonen
Samuli Heinonen
20 January 2018 at 21.57
Hi all!
One hypervisor on our virtualization environment crashed and now
some of the VM images cannot be accessed. After investigation we
found out that there was lots of images that still had active lock
on crashed hypervisor. We were able to remove locks from "regular
files", but it doesn't seem possible to remove locks from shards.
We are running GlusterFS 3.8.15 on all nodes.
Here is part of statedump that shows shard having active lock on
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21
mandatory=0
inodelk-count=1
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0,
pid = 3568, owner=14ce372c397f0000, client=0x7f3198388770,
connection-id
ovirt8z2.xxx-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-1-7-0,
granted at 2018-01-20 08:57:24
# gluster volume clear-locks zone2-ssd1-vmstor1
/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21 kind all inode
Volume clear-locks unsuccessful
clear-locks getxattr command failed. Reason: Operation not
permitted
Volume Name: zone2-ssd1-vmstor1
Type: Replicate
Volume ID: b6319968-690b-4060-8fff-b212d2295208
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: rdma
Brick1: sto1z2.xxx:/ssd1/zone2-vmstor1/export
Brick2: sto2z2.xxx:/ssd1/zone2-vmstor1/export
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
performance.client-io-threads: off
storage.linux-aio: off
performance.readdir-ahead: on
client.event-threads: 16
server.event-threads: 16
performance.strict-write-ordering: off
performance.quick-read: off
performance.read-ahead: on
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: on
cluster.quorum-type: none
network.ping-timeout: 22
performance.write-behind: off
nfs.disable: on
features.shard: on
features.shard-block-size: 512MB
storage.owner-uid: 36
storage.owner-gid: 36
performance.io-thread-count: 64
performance.cache-size: 2048MB
performance.write-behind-window-size: 256MB
server.allow-insecure: on
cluster.ensure-durability: off
config.transport: rdma
server.outstanding-rpc-limit: 512
diagnostics.brick-log-level: INFO
Any recommendations how to advance from here?
Best regards,
Samuli Heinonen
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users [1]
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users [1]
--
Pranith
------
[1] http://lists.gluster.org/mailman/listinfo/gluster-users
Pranith Kumar Karampuri
2018-01-23 08:30:40 UTC
Post by Samuli Heinonen
On Mon, Jan 22, 2018 at 12:33 AM, Samuli Heinonen
Hi again,
Post by Samuli Heinonen
here is more information regarding issue described earlier
It looks like self healing is stuck. According to "heal statistics"
crawl began at Sat Jan 20 12:56:19 2018 and it's still going on
(It's around Sun Jan 21 20:30 when writing this). However
glustershd.log says that last heal was completed at "2018-01-20
11:00:13.090697" (which is 13:00 UTC+2). Also "heal info" has been
running now for over 16 hours without any information. In statedump
I can see that storage nodes have locks on files and some of those
are blocked. Ie. Here again it says that ovirt8z2 is having active
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27
mandatory=0
inodelk-count=3
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid
= 18446744073709551610, owner=d0c6d857a87f0000,
client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-
zone2-ssd1-vmstor1-client-0-0-0,
granted at 2018-01-20 10:59:52
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid
= 3420, owner=d8b9372c397f0000, client=0x7f8858410be0,
connection-id=ovirt8z2.xxx.com-5652-2017/12/27-09:49:02:9468
25-zone2-ssd1-vmstor1-client-0-7-0,
granted at 2018-01-20 08:57:23
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0,
pid = 18446744073709551610, owner=d0c6d857a87f0000,
client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-
zone2-ssd1-vmstor1-client-0-0-0,
blocked at 2018-01-20 10:59:52
I'd also like to add that volume had arbiter brick before crash
happened. We decided to remove it because we thought that it was
causing issues. However now I think that this was unnecessary. After
[2018-01-20 10:19:36.515717] I [MSGID: 115072]
[server-rpc-fops.c:1640:server_setattr_cbk]
0-zone2-ssd1-vmstor1-server: 37374187: SETATTR
<gfid:a52055bd-e2e9-42dd-92a3-e96b693bcafe>
(a52055bd-e2e9-42dd-92a3-e96b693bcafe) ==> (Operation not permitted)
[Operation not permitted]
Is there anyways to force self heal to stop? Any help would be very
much appreciated :)
Exposing .shard to a normal mount is opening a can of worms. You
should probably look at mounting the volume with gfid aux-mount where
you can access a file with <path-to-mount>/.gfid/<gfid-string>to clear
locks on it.
Mount command: mount -t glusterfs -o aux-gfid-mount vm1:test
/mnt/testvol
11118443-1894-4273-9340-4b212fa1c0e4
That said. Next disconnect on the brick where you successfully did the
clear-locks will crash the brick. There was a bug in 3.8.x series with
clear-locks which was fixed in 3.9.0 with a feature. The self-heal
deadlocks that you witnessed also is fixed in 3.10 version of the release.
Thank you the answer. Could you please tell more about crash? What will
actually happen or is there a bug report about it? Just want to make sure
that we can do everything to secure data on bricks. We will look into
upgrade but we have to make sure that new version works for us and of
course get self healing working before doing anything :)
The locks xlator/module maintains a list of locks that are granted to a
client. clear-locks had an issue where it forgot to remove the lock from
this list, so after a clear-lock the connection's list ends up pointing at
freed data. When a disconnect happens, all the locks granted to that client
need to be unlocked, so the process starts traversing this list, and when
it touches the freed data it crashes. I found it while reviewing a feature
patch sent by the Facebook folks to the locks xlator
(http://review.gluster.org/14816) for 3.9.0, and they fixed this bug as
part of that feature patch.
Post by Samuli Heinonen
Br,
Samuli
3.8.x is EOLed, so I recommend you to upgrade to a supported version soon.
Best regards,
Post by Samuli Heinonen
Samuli Heinonen
Samuli Heinonen
20 January 2018 at 21.57
Hi all!
One hypervisor on our virtualization environment crashed and now
some of the VM images cannot be accessed. After investigation we
found out that there was lots of images that still had active lock
on crashed hypervisor. We were able to remove locks from "regular
files", but it doesn't seem possible to remove locks from shards.
We are running GlusterFS 3.8.15 on all nodes.
Here is part of statedump that shows shard having active lock on
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21
mandatory=0
inodelk-count=1
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0,
pid = 3568, owner=14ce372c397f0000, client=0x7f3198388770,
connection-id=ovirt8z2.xxx-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-1-7-0,
granted at 2018-01-20 08:57:24
# gluster volume clear-locks zone2-ssd1-vmstor1
/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21 kind all inode
Volume clear-locks unsuccessful
clear-locks getxattr command failed. Reason: Operation not
permitted
Volume Name: zone2-ssd1-vmstor1
Type: Replicate
Volume ID: b6319968-690b-4060-8fff-b212d2295208
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: rdma
Brick1: sto1z2.xxx:/ssd1/zone2-vmstor1/export
Brick2: sto2z2.xxx:/ssd1/zone2-vmstor1/export
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
performance.client-io-threads: off
storage.linux-aio: off
performance.readdir-ahead: on
client.event-threads: 16
server.event-threads: 16
performance.strict-write-ordering: off
performance.quick-read: off
performance.read-ahead: on
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: on
cluster.quorum-type: none
network.ping-timeout: 22
performance.write-behind: off
nfs.disable: on
features.shard: on
features.shard-block-size: 512MB
storage.owner-uid: 36
storage.owner-gid: 36
performance.io-thread-count: 64
performance.cache-size: 2048MB
performance.write-behind-window-size: 256MB
server.allow-insecure: on
cluster.ensure-durability: off
config.transport: rdma
server.outstanding-rpc-limit: 512
diagnostics.brick-log-level: INFO
Any recommendations how to advance from here?
Best regards,
Samuli Heinonen
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users [1]
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users [1]
--
Pranith
------
[1] http://lists.gluster.org/mailman/listinfo/gluster-users
--
Pranith
Samuli Heinonen
2018-01-24 20:57:03 UTC
Hi!

Thank you very much for your help so far. Could you please give an example
command showing how to use the aux-gfid mount to remove locks? "gluster vol
clear-locks" seems to mount the volume by itself.

Best regards,
Samuli Heinonen
23 January 2018 at 10.30
On Tue, Jan 23, 2018 at 1:38 PM, Samuli Heinonen
On Mon, Jan 22, 2018 at 12:33 AM, Samuli Heinonen
Hi again,
here is more information regarding issue described earlier
It looks like self healing is stuck. According to "heal statistics"
crawl began at Sat Jan 20 12:56:19 2018 and it's still going on
(It's around Sun Jan 21 20:30 when writing this). However
glustershd.log says that last heal was completed at "2018-01-20
11:00:13.090697" (which is 13:00 UTC+2). Also "heal info" has been
running now for over 16 hours without any information. In statedump
I can see that storage nodes have locks on files and some of those
are blocked. Ie. Here again it says that ovirt8z2 is having active
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27
mandatory=0
inodelk-count=3
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0,
len=0, pid
= 18446744073709551610, owner=d0c6d857a87f0000,
client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0,
granted at 2018-01-20 10:59:52
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0,
len=0, pid
= 3420, owner=d8b9372c397f0000, client=0x7f8858410be0,
connection-id=ovirt8z2.xxx.com-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-0-7-0,
granted at 2018-01-20 08:57:23
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0,
pid = 18446744073709551610, owner=d0c6d857a87f0000,
client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0,
blocked at 2018-01-20 10:59:52
I'd also like to add that volume had arbiter brick before crash
happened. We decided to remove it because we thought that it was
causing issues. However now I think that this was
unnecessary. After
[2018-01-20 10:19:36.515717] I [MSGID: 115072]
[server-rpc-fops.c:1640:server_setattr_cbk]
0-zone2-ssd1-vmstor1-server: 37374187: SETATTR
<gfid:a52055bd-e2e9-42dd-92a3-e96b693bcafe>
(a52055bd-e2e9-42dd-92a3-e96b693bcafe) ==> (Operation not
permitted)
[Operation not permitted]
Is there anyways to force self heal to stop? Any help would be very
much appreciated :)
Exposing .shard to a normal mount is opening a can of worms. You
should probably look at mounting the volume with gfid
aux-mount where
you can access a file with
<path-to-mount>/.gfid/<gfid-string>to clear
locks on it.
Mount command: mount -t glusterfs -o aux-gfid-mount vm1:test /mnt/testvol
11118443-1894-4273-9340-4b212fa1c0e4
That said. Next disconnect on the brick where you successfully did the
clear-locks will crash the brick. There was a bug in 3.8.x series with
clear-locks which was fixed in 3.9.0 with a feature. The self-heal
deadlocks that you witnessed also is fixed in 3.10 version of the release.
Thank you the answer. Could you please tell more about crash? What
will actually happen or is there a bug report about it? Just want
to make sure that we can do everything to secure data on bricks.
We will look into upgrade but we have to make sure that new
version works for us and of course get self healing working before
doing anything :)
Locks xlator/module maintains a list of locks that are granted to a
client. Clear locks had an issue where it forgets to remove the lock
from this list. So the connection list ends up pointing to data that
is freed in that list after a clear lock. When a disconnect happens,
all the locks that are granted to a client need to be unlocked. So the
process starts traversing through this list and when it starts trying
to access this freed data it leads to a crash. I found it while
reviewing a feature patch sent by facebook folks to locks xlator
(http://review.gluster.org/14816) for 3.9.0 and they also fixed this
bug as well as part of that feature patch.
Br,
Samuli
3.8.x is EOLed, so I recommend you to upgrade to a supported
version
soon.
Best regards,
Samuli Heinonen
Samuli Heinonen
20 January 2018 at 21.57
Hi all!
One hypervisor on our virtualization environment
crashed and now
some of the VM images cannot be accessed. After
investigation we
found out that there was lots of images that still had
active lock
on crashed hypervisor. We were able to remove locks
from "regular
files", but it doesn't seem possible to remove locks
from shards.
We are running GlusterFS 3.8.15 on all nodes.
Here is part of statedump that shows shard having
active lock on
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21
mandatory=0
inodelk-count=1
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0,
start=0, len=0,
pid = 3568, owner=14ce372c397f0000, client=0x7f3198388770,
connection-id
ovirt8z2.xxx-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-1-7-0,
granted at 2018-01-20 08:57:24
If we try to run clear-locks we get following error
# gluster volume clear-locks zone2-ssd1-vmstor1
/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21 kind
all inode
Volume clear-locks unsuccessful
clear-locks getxattr command failed. Reason: Operation not
permitted
Volume Name: zone2-ssd1-vmstor1
Type: Replicate
Volume ID: b6319968-690b-4060-8fff-b212d2295208
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: rdma
Brick1: sto1z2.xxx:/ssd1/zone2-vmstor1/export
Brick2: sto2z2.xxx:/ssd1/zone2-vmstor1/export
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
performance.client-io-threads: off
storage.linux-aio: off
performance.readdir-ahead: on
client.event-threads: 16
server.event-threads: 16
performance.strict-write-ordering: off
performance.quick-read: off
performance.read-ahead: on
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: on
cluster.quorum-type: none
network.ping-timeout: 22
performance.write-behind: off
nfs.disable: on
features.shard: on
features.shard-block-size: 512MB
storage.owner-uid: 36
storage.owner-gid: 36
performance.io-thread-count: 64
performance.cache-size: 2048MB
performance.write-behind-window-size: 256MB
server.allow-insecure: on
cluster.ensure-durability: off
config.transport: rdma
server.outstanding-rpc-limit: 512
diagnostics.brick-log-level: INFO
Any recommendations how to advance from here?
Best regards,
Samuli Heinonen
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users [1]
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users [1]
--
Pranith
------
[1] http://lists.gluster.org/mailman/listinfo/gluster-users
--
Pranith
21 January 2018 at 21.03
Hi again,
here is more information regarding issue described earlier
It looks like self healing is stuck. According to "heal statistics"
crawl began at Sat Jan 20 12:56:19 2018 and it's still going on (It's
around Sun Jan 21 20:30 when writing this). However glustershd.log
says that last heal was completed at "2018-01-20 11:00:13.090697"
(which is 13:00 UTC+2). Also "heal info" has been running now for over
16 hours without any information. In statedump I can see that storage
nodes have locks on files and some of those are blocked. Ie. Here
again it says that ovirt8z2 is having active lock even ovirt8z2
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27
mandatory=0
inodelk-count=3
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid =
18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0,
granted at 2018-01-20 10:59:52
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid =
3420, owner=d8b9372c397f0000, client=0x7f8858410be0,
connection-id=ovirt8z2.xxx.com-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-0-7-0,
granted at 2018-01-20 08:57:23
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid
= 18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0,
blocked at 2018-01-20 10:59:52
I'd also like to add that volume had arbiter brick before crash
happened. We decided to remove it because we thought that it was
causing issues. However now I think that this was unnecessary. After
[2018-01-20 10:19:36.515717] I [MSGID: 115072]
[server-rpc-fops.c:1640:server_setattr_cbk]
0-zone2-ssd1-vmstor1-server: 37374187: SETATTR
<gfid:a52055bd-e2e9-42dd-92a3-e96b693bcafe>
(a52055bd-e2e9-42dd-92a3-e96b693bcafe) ==> (Operation not permitted)
[Operation not permitted]
Is there anyways to force self heal to stop? Any help would be very
much appreciated :)
Best regards,
Samuli Heinonen
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users
20 January 2018 at 21.57
Hi all!
One hypervisor on our virtualization environment crashed and now some
of the VM images cannot be accessed. After investigation we found out
that there was lots of images that still had active lock on crashed
hypervisor. We were able to remove locks from "regular files", but it
doesn't seem possible to remove locks from shards.
We are running GlusterFS 3.8.15 on all nodes.
Here is part of statedump that shows shard having active lock on
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21
mandatory=0
inodelk-count=1
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid =
3568, owner=14ce372c397f0000, client=0x7f3198388770, connection-id
ovirt8z2.xxx-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-1-7-0,
granted at 2018-01-20 08:57:24
# gluster volume clear-locks zone2-ssd1-vmstor1
/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21 kind all inode
Volume clear-locks unsuccessful
clear-locks getxattr command failed. Reason: Operation not permitted
Volume Name: zone2-ssd1-vmstor1
Type: Replicate
Volume ID: b6319968-690b-4060-8fff-b212d2295208
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: rdma
Brick1: sto1z2.xxx:/ssd1/zone2-vmstor1/export
Brick2: sto2z2.xxx:/ssd1/zone2-vmstor1/export
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
performance.client-io-threads: off
storage.linux-aio: off
performance.readdir-ahead: on
client.event-threads: 16
server.event-threads: 16
performance.strict-write-ordering: off
performance.quick-read: off
performance.read-ahead: on
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: on
cluster.quorum-type: none
network.ping-timeout: 22
performance.write-behind: off
nfs.disable: on
features.shard: on
features.shard-block-size: 512MB
storage.owner-uid: 36
storage.owner-gid: 36
performance.io-thread-count: 64
performance.cache-size: 2048MB
performance.write-behind-window-size: 256MB
server.allow-insecure: on
cluster.ensure-durability: off
config.transport: rdma
server.outstanding-rpc-limit: 512
diagnostics.brick-log-level: INFO
Any recommendations how to advance from here?
Best regards,
Samuli Heinonen
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users
Pranith Kumar Karampuri
2018-01-25 05:09:16 UTC
Hi!
Thank you very much for your help so far. Could you please tell an example
command how to use aux-gid-mount to remove locks? "gluster vol clear-locks"
seems to mount volume by itself.
You are correct, sorry; this was implemented around 7 years back and I had
forgotten that bit about it :-(. Essentially it becomes a getxattr syscall
on the file.
Could you give me the clear-locks command you were trying to execute, and I
can probably convert it into the equivalent getfattr command?
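
If it was the command from your first mail, the shape of it should be roughly
as below. Please treat this as a sketch: the glusterfs.clrlk.t<type>.k<kind>
virtual xattr name is from memory rather than verified against 3.8, and the
mount point is hypothetical.

# mount -t glusterfs -o aux-gfid-mount sto1z2.xxx:/zone2-ssd1-vmstor1 /mnt/vmstor1
# getfattr -n glusterfs.clrlk.tinode.kall /mnt/vmstor1/.gfid/<gfid-of-the-shard>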
Best regards,
Samuli Heinonen
23 January 2018 at 10.30
On Mon, Jan 22, 2018 at 12:33 AM, Samuli Heinonen
Hi again,
here is more information regarding issue described earlier
It looks like self healing is stuck. According to "heal
statistics"
crawl began at Sat Jan 20 12:56:19 2018 and it's still going on
(It's around Sun Jan 21 20:30 when writing this). However
glustershd.log says that last heal was completed at "2018-01-20
11:00:13.090697" (which is 13:00 UTC+2). Also "heal info"
has been
running now for over 16 hours without any information. In
statedump
I can see that storage nodes have locks on files and some
of those
are blocked. Ie. Here again it says that ovirt8z2 is
having active
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27
mandatory=0
inodelk-count=3
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0,
len=0, pid
= 18446744073709551610, owner=d0c6d857a87f0000,
client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-
zone2-ssd1-vmstor1-client-0-0-0,
granted at 2018-01-20 10:59:52
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0,
len=0, pid
= 3420, owner=d8b9372c397f0000, client=0x7f8858410be0,
connection-id=ovirt8z2.xxx.com-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-0-7-0,
granted at 2018-01-20 08:57:23
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0,
pid = 18446744073709551610, owner=d0c6d857a87f0000,
client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-
zone2-ssd1-vmstor1-client-0-0-0,
blocked at 2018-01-20 10:59:52
I'd also like to add that volume had arbiter brick before crash
happened. We decided to remove it because we thought that it was
causing issues. However now I think that this was
unnecessary. After
[2018-01-20 10:19:36.515717] I [MSGID: 115072]
[server-rpc-fops.c:1640:server_setattr_cbk]
0-zone2-ssd1-vmstor1-server: 37374187: SETATTR
<gfid:a52055bd-e2e9-42dd-92a3-e96b693bcafe>
(a52055bd-e2e9-42dd-92a3-e96b693bcafe) ==> (Operation not
permitted)
[Operation not permitted]
Is there anyways to force self heal to stop? Any help
would be very
much appreciated :)
Exposing .shard to a normal mount is opening a can of worms. You
should probably look at mounting the volume with gfid
aux-mount where
you can access a file with
<path-to-mount>/.gfid/<gfid-string>to clear
locks on it.
Mount command: mount -t glusterfs -o aux-gfid-mount vm1:test
/mnt/testvol
11118443-1894-4273-9340-4b212fa1c0e4
That said. Next disconnect on the brick where you successfully did the
clear-locks will crash the brick. There was a bug in 3.8.x series with
clear-locks which was fixed in 3.9.0 with a feature. The self-heal
deadlocks that you witnessed also is fixed in 3.10 version of the
release.
Thank you the answer. Could you please tell more about crash? What
will actually happen or is there a bug report about it? Just want
to make sure that we can do everything to secure data on bricks.
We will look into upgrade but we have to make sure that new
version works for us and of course get self healing working before
doing anything :)
Locks xlator/module maintains a list of locks that are granted to a
client. Clear locks had an issue where it forgets to remove the lock from
this list. So the connection list ends up pointing to data that is freed in
that list after a clear lock. When a disconnect happens, all the locks that
are granted to a client need to be unlocked. So the process starts
traversing through this list and when it starts trying to access this freed
data it leads to a crash. I found it while reviewing a feature patch sent
by facebook folks to locks xlator (http://review.gluster.org/14816) for
3.9.0 and they also fixed this bug as well as part of that feature patch.
Br,
Samuli
3.8.x is EOLed, so I recommend you to upgrade to a supported
version
soon.
Best regards,
Samuli Heinonen
Samuli Heinonen
2018-01-25 08:19:30 UTC
Permalink
On Thu, Jan 25, 2018 at 2:27 AM, Samuli Heinonen wrote:
Post by Samuli Heinonen
Hi!
Thank you very much for your help so far. Could you please tell an
example command how to use aux-gfid-mount to remove locks? "gluster
vol clear-locks" seems to mount the volume by itself.
You are correct, sorry, this was implemented around 7 years back and I
forgot that bit about it :-(. Essentially it becomes a getxattr
syscall on the file.
Could you give me the clear-locks command you were trying to execute
and I can probably convert it to the getfattr command?
I have been testing this in a test environment with the command:
gluster vol clear-locks g1 /.gfid/14341ccb-df7b-4f92-90d5-7814431c5a1c kind all inode
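For reference, a getfattr conversion of that clear-locks call would
presumably look something like the sketch below; the mount point /mnt/g1,
the volume spec box1:g1 and the exact key name are assumptions based on
later messages in this thread, not a verified recipe:

# mount the test volume with gfid access enabled
mount -t glusterfs -o aux-gfid-mount box1:g1 /mnt/g1
# "kind all inode" should map to the key glusterfs.clrlk.tinode.kall
getfattr -n glusterfs.clrlk.tinode.kall /mnt/g1/.gfid/14341ccb-df7b-4f92-90d5-7814431c5a1c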
Best regards,
Samuli Heinonen
Pranith Kumar Karampuri
2018-01-25 08:22:08 UTC
Permalink
Post by Samuli Heinonen
gluster vol clear-locks g1 /.gfid/14341ccb-df7b-4f92-90d5-7814431c5a1c kind all inode
Could you do an strace of glusterd when this happens? It will have a getxattr
with "glusterfs.clrlk" in the key. You need to execute that on the
gfid aux-mount.
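A minimal sketch of how that trace could be captured (the exact strace flags
and file names are assumptions, not something prescribed in this thread):

# attach to glusterd, follow its children, log getxattr-family syscalls
strace -f -e trace=getxattr,lgetxattr -o /tmp/glusterd-clrlk.strace -p $(pidof glusterd)
# in another shell, re-run the failing command
gluster vol clear-locks g1 /.gfid/14341ccb-df7b-4f92-90d5-7814431c5a1c kind all inode
# then look for the key in the trace
grep glusterfs.clrlk /tmp/glusterd-clrlk.strace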
--
Pranith
Pranith Kumar Karampuri
2018-01-29 00:24:16 UTC
Permalink
Hi,
Did you find the command from strace?
--
Pranith
Samuli Heinonen
2018-01-29 05:20:11 UTC
Permalink
Hi!

Yes, thank you for asking. I found this line in the production
environment:
lgetxattr("/tmp/zone2-ssd1-vmstor1.s6jvPu//.shard/f349ffbd-a423-4fb2-b83c-2d1d5e78e1fb.32",
"glusterfs.clrlk.tinode.kblocked", 0x7f2d7c4379f0, 4096) = -1 EPERM
(Operation not permitted)

And this one in the test environment (with posix locks):
lgetxattr("/tmp/g1.gHj4Bw//file38", "glusterfs.clrlk.tposix.kblocked",
"box1:/gluster/1/export/: posix blocked locks=1 granted locks=0", 4096) = 77

In the test environment I tried running the following command, which seemed
to release the gluster locks:

getfattr -n glusterfs.clrlk.tposix.kblocked file38

So I think it would go like this in the production environment, with locks
on shards (using the aux-gfid-mount mount option):
getfattr -n glusterfs.clrlk.tinode.kall .shard/f349ffbd-a423-4fb2-b83c-2d1d5e78e1fb.32
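A minimal sketch of one way to confirm afterwards that the lock is gone; the
statedump output directory below is the usual default and an assumption on
my part, not something verified in this environment:

# trigger a fresh statedump for the volume
gluster volume statedump zone2-ssd1-vmstor1
# check whether the shard still shows inodelk entries in the brick dumps
grep -A 10 "f349ffbd-a423-4fb2-b83c-2d1d5e78e1fb.32" /var/run/gluster/*.dump.*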

I haven't been able to try this out in the production environment yet.

Is there perhaps something else to notice?

Would you be able to tell more about bricks crashing after releasing
locks? Under what circumstances does that happen? Is it only the process
exporting the brick that crashes, or is there also a possibility of data
corruption?

Best regards,
Samuli Heinonen
Post by Pranith Kumar Karampuri
Hi,
Did you find the command from strace?
On Thu, Jan 25, 2018 at 1:49 PM, Samuli Heinonen
On Thu, Jan 25, 2018 at 2:27 AM, Samuli Heinonen
Hi!
Thank you very much for your help so far. Could you
please tell an
example command how to use aux-gid-mount to remove
locks? "gluster
vol clear-locks" seems to mount volume by itself.
You are correct, sorry, this was implemented around 7 years
back and I
forgot that bit about it :-(. Essentially it becomes a getxattr
syscall on the file.
Could you give me the clear-locks command you were trying to
execute
and I can probably convert it to the getfattr command?
gluster vol clear-locks g1
/.gfid/14341ccb-df7b-4f92-90d5-7814431c5a1c kind all inode
Could you do strace of glusterd when this happens? It will have a
getxattr with "glusterfs.clrlk" in the key. You need to execute that
on the gfid-aux-mount
Best regards,
Samuli Heinonen
23 January 2018 at 10.30
On Tue, Jan 23, 2018 at 1:38 PM, Samuli Heinonen
On Mon, Jan 22, 2018 at 12:33 AM, Samuli Heinonen
Hi again,
here is more information regarding issue described
earlier
It looks like self healing is stuck. According to
"heal
statistics"
crawl began at Sat Jan 20 12:56:19 2018 and it's still
going on
(It's around Sun Jan 21 20:30 when writing this).
However
glustershd.log says that last heal was completed at
"2018-01-20
11:00:13.090697" (which is 13:00 UTC+2). Also "heal
info"
has been
running now for over 16 hours without any information.
In
statedump
I can see that storage nodes have locks on files and
some
of those
are blocked. Ie. Here again it says that ovirt8z2 is
having active
lock even ovirt8z2 crashed after the lock was
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27
mandatory=0
inodelk-count=3
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0,
start=0,
len=0, pid
= 18446744073709551610, owner=d0c6d857a87f0000,
client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0,
granted at 2018-01-20 10:59:52
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0,
start=0,
len=0, pid
= 3420, owner=d8b9372c397f0000, client=0x7f8858410be0,
connection-id=ovirt8z2.xxx.com
<http://ovirt8z2.xxx.com> [1]
<http://ovirt8z2.xxx.com>-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-0-7-0,
granted at 2018-01-20 08:57:23
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0,
start=0,
len=0,
pid = 18446744073709551610, owner=d0c6d857a87f0000,
client=0x7f885845efa0,
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0,
blocked at 2018-01-20 10:59:52
I'd also like to add that volume had arbiter brick
before
crash
happened. We decided to remove it because we thought
that
it was
causing issues. However now I think that this was
unnecessary. After
[2018-01-20 10:19:36.515717] I [MSGID: 115072]
[server-rpc-fops.c:1640:server_setattr_cbk]
0-zone2-ssd1-vmstor1-server: 37374187: SETATTR
<gfid:a52055bd-e2e9-42dd-92a3-e96b693bcafe>
(a52055bd-e2e9-42dd-92a3-e96b693bcafe) ==> (Operation
not
permitted)
[Operation not permitted]
Is there anyways to force self heal to stop? Any help
would be very
much appreciated :)
Exposing .shard to a normal mount is opening a can of
worms. You
should probably look at mounting the volume with gfid
aux-mount where
you can access a file with
<path-to-mount>/.gfid/<gfid-string>to clear
locks on it.
Mount command: mount -t glusterfs -o aux-gfid-mount
vm1:test
/mnt/testvol
11118443-1894-4273-9340-4b212fa1c0e4
That said. Next disconnect on the brick where you
successfully
did the
clear-locks will crash the brick. There was a bug in
3.8.x
series with
clear-locks which was fixed in 3.9.0 with a feature. The
self-heal
deadlocks that you witnessed also is fixed in 3.10
version
of the
release.
Thank you the answer. Could you please tell more
about crash?
What
will actually happen or is there a bug report about
it? Just
want
to make sure that we can do everything to secure data on
bricks.
We will look into upgrade but we have to make sure
that new
version works for us and of course get self healing
working
before
doing anything :)
Locks xlator/module maintains a list of locks that
are granted to
a client. Clear locks had an issue where it forgets
to remove the
lock from this list. So the connection list ends up
pointing to
data that is freed in that list after a clear lock.
When a
disconnect happens, all the locks that are granted
to a client
need to be unlocked. So the process starts
traversing through this
list and when it starts trying to access this freed
data it leads
to a crash. I found it while reviewing a feature
patch sent by
facebook folks to locks xlator
(http://review.gluster.org/14816
<http://review.gluster.org/14816>
[2]) for 3.9.0 and they also fixed this bug as well
as part of
that feature patch.
Br,
Samuli
3.8.x is EOLed, so I recommend you to upgrade to a
supported
version
soon.
Best regards,
Samuli Heinonen
21 January 2018 at 21.03
Hi again,

here is more information regarding the issue described earlier.

It looks like self healing is stuck. According to "heal statistics" the
crawl began at Sat Jan 20 12:56:19 2018 and it's still going on (it's
around Sun Jan 21 20:30 when writing this). However glustershd.log says
that the last heal was completed at "2018-01-20 11:00:13.090697" (which
is 13:00 UTC+2). Also "heal info" has been running now for over 16 hours
without any information. In the statedump I can see that the storage
nodes have locks on files and some of those are blocked. I.e. here again
it says that ovirt8z2 is holding an active lock even though ovirt8z2
crashed after the lock was granted:

[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
path=/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27
mandatory=0
inodelk-count=3
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0, connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0, granted at 2018-01-20 10:59:52
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 3420, owner=d8b9372c397f0000, client=0x7f8858410be0, connection-id=ovirt8z2.xxx.com-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-0-7-0, granted at 2018-01-20 08:57:23
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0, connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0, blocked at 2018-01-20 10:59:52

I'd also like to add that the volume had an arbiter brick before the
crash happened. We decided to remove it because we thought that it was
causing issues. However, now I think that this was unnecessary. After
the crash the arbiter logs had lots of messages like:

[2018-01-20 10:19:36.515717] I [MSGID: 115072] [server-rpc-fops.c:1640:server_setattr_cbk] 0-zone2-ssd1-vmstor1-server: 37374187: SETATTR <gfid:a52055bd-e2e9-42dd-92a3-e96b693bcafe> (a52055bd-e2e9-42dd-92a3-e96b693bcafe) ==> (Operation not permitted) [Operation not permitted]

Is there any way to force self heal to stop? Any help would be very much
appreciated :)

Best regards,
Samuli Heinonen
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users
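For reference, statedumps like the one quoted above are generated per
volume with the gluster CLI; the dump files land on each brick node,
typically under /var/run/gluster unless server.statedump-path says
otherwise (the exact location is an assumption here, check your
configuration):

  # dump the state of the volume's brick processes, including the locks tables
  gluster volume statedump zone2-ssd1-vmstor1

  # then inspect the newest dump files on each storage node
  ls -t /var/run/gluster/*.dump.* | head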
Pranith Kumar Karampuri
2018-01-29 05:32:11 UTC
Permalink
On 29 Jan 2018 10:50 am, "Samuli Heinonen" <***@neutraali.net> wrote:

Hi!

Yes, thank you for asking. I found out this line in the production
environment:
lgetxattr("/tmp/zone2-ssd1-vmstor1.s6jvPu//.shard/f349ffbd-a423-4fb2-b83c-2d1d5e78e1fb.32", "glusterfs.clrlk.tinode.kblocked", 0x7f2d7c4379f0, 4096) = -1 EPERM (Operation not permitted)


I was expecting .kall instead of .blocked,
did you change the cli to kind blocked?


And this one in the test environment (with posix locks):
lgetxattr("/tmp/g1.gHj4Bw//file38", "glusterfs.clrlk.tposix.kblocked", "box1:/gluster/1/export/: posix blocked locks=1 granted locks=0", 4096) = 77

In the test environment I tried running the following command, which
seemed to release the gluster locks:

getfattr -n glusterfs.clrlk.tposix.kblocked file38

So I think it would go like this in the production environment with locks
on shards (using the aux-gfid-mount mount option):
getfattr -n glusterfs.clrlk.tinode.kall .shard/f349ffbd-a423-4fb2-b83c-2d1d5e78e1fb.32

I haven't been able to try this out in the production environment yet.

Is there perhaps something else to notice?

Would you be able to tell more about bricks crashing after releasing
locks? Under what circumstances does that happen? Is it only the process
exporting the brick that crashes, or is there a possibility of data
corruption?


No data corruption. Brick process where you did clear-locks may crash.


Best regards,
Samuli Heinonen
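Putting the commands from this exchange together, the procedure being
discussed would look roughly like the sketch below. The mount points are
my own placeholders, and running this against the production shard is
exactly what has not been verified yet in the thread, so treat it as an
illustration rather than a tested recipe:

  # test environment: clear the blocked posix locks on a plain file
  mount -t glusterfs -o aux-gfid-mount box1:/g1 /mnt/g1
  getfattr -n glusterfs.clrlk.tposix.kblocked /mnt/g1/file38

  # production idea: clear all inode locks on the stuck shard
  # (this volume uses config.transport rdma, so a transport mount option may be needed)
  mount -t glusterfs -o aux-gfid-mount sto1z2.xxx:/zone2-ssd1-vmstor1 /mnt/vmstor1
  getfattr -n glusterfs.clrlk.tinode.kall /mnt/vmstor1/.shard/f349ffbd-a423-4fb2-b83c-2d1d5e78e1fb.32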
Post by Pranith Kumar Karampuri
Hi,
Did you find the command from strace?
On Thu, Jan 25, 2018 at 1:49 PM, Samuli Heinonen
On Thu, Jan 25, 2018 at 2:27 AM, Samuli Heinonen
Hi!
Thank you very much for your help so far. Could you please give an
example command of how to use the aux-gfid-mount to remove locks?
"gluster vol clear-locks" seems to mount the volume by itself.
You are correct, sorry, this was implemented around 7 years back and I
forgot that bit about it :-(. Essentially it becomes a getxattr
syscall on the file.
Could you give me the clear-locks command you were trying to execute
and I can probably convert it to the getfattr command?
gluster vol clear-locks g1 /.gfid/14341ccb-df7b-4f92-90d5-7814431c5a1c kind all inode
Could you do strace of glusterd when this happens? It will have a
getxattr with "glusterfs.clrlk" in the key. You need to execute that
on the gfid-aux-mount.
Best regards,
Samuli Heinonen
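A minimal sketch of that strace step, assuming glusterd's PID can be taken
from pidof; the flags and output file name are my own choices, the thread
only says "strace of glusterd":

  # attach to glusterd and record the getxattr calls it makes
  strace -f -e trace=getxattr,lgetxattr -o /tmp/glusterd-clrlk.strace -p "$(pidof glusterd)" &

  # in another shell, run the failing command again
  gluster volume clear-locks g1 /.gfid/14341ccb-df7b-4f92-90d5-7814431c5a1c kind all inode

  # the trace should show the exact "glusterfs.clrlk.*" key the CLI used
  grep glusterfs.clrlk /tmp/glusterd-clrlk.strace
  kill %1   # stop the trace when done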
--
Pranith
Samuli Heinonen
2018-01-29 07:56:01 UTC
Permalink
Post by Samuli Heinonen
Hi!
Yes, thank you for asking. I found out this line in the production
environment:
lgetxattr("/tmp/zone2-ssd1-vmstor1.s6jvPu//.shard/f349ffbd-a423-4fb2-b83c-2d1d5e78e1fb.32", "glusterfs.clrlk.tinode.kblocked", 0x7f2d7c4379f0, 4096) = -1 EPERM (Operation not permitted)
I was expecting .kall instead of .blocked,
did you change the cli to kind blocked?
Yes, I was testing this with different commands. Basically it seems that
the name of the attribute is
glusterfs.clrlk.t{posix,inode,entry}.k{all,blocked,granted}, am I
correct? Is it necessary to set any value or just request the attribute
with getfattr?
Post by Samuli Heinonen
lgetxattr("/tmp/g1.gHj4Bw//file38", "glusterfs.clrlk.tposix.kblocked", "box1:/gluster/1/export/: posix blocked locks=1 granted locks=0", 4096) = 77
In the test environment I tried running the following command, which
seemed to release the gluster locks:
getfattr -n glusterfs.clrlk.tposix.kblocked file38
So I think it would go like this in the production environment with locks
on shards (using the aux-gfid-mount mount option):
getfattr -n glusterfs.clrlk.tinode.kall .shard/f349ffbd-a423-4fb2-b83c-2d1d5e78e1fb.32
I haven't been able to try this out in the production environment yet.
Is there perhaps something else to notice?
Would you be able to tell more about bricks crashing after releasing
locks? Under what circumstances does that happen? Is it only the process
exporting the brick that crashes, or is there a possibility of data
corruption?
No data corruption. Brick process where you did clear-locks may crash.
Post by Samuli Heinonen
Best regards,
Samuli Heinonen
Pranith Kumar Karampuri
2018-01-29 08:20:11 UTC
Permalink
Post by Samuli Heinonen
Hi!
Yes, thank you for asking. I found out this line in the production
environment:
lgetxattr("/tmp/zone2-ssd1-vmstor1.s6jvPu//.shard/f349ffbd-a423-4fb2-b83c-2d1d5e78e1fb.32", "glusterfs.clrlk.tinode.kblocked", 0x7f2d7c4379f0, 4096) = -1 EPERM (Operation not permitted)
I was expecting .kall instead of .blocked,
did you change the cli to kind blocked?
Yes, I was testing this with different commands. Basically it seems that
the name of the attribute is glusterfs.clrlk.t{posix,inode,entry}.k{all,blocked,granted},
am I correct?

That is correct.

Is it necessary to set any value or just request the attribute with
getfattr?

Nope. No I/O is going on the file, right? Just request the attribute with
getfattr in that case.
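To spell out the confirmed pattern with getfattr (the file argument is a
placeholder; on the production volume it would be the stuck shard, reached
through a mount):

  # type t{posix,inode,entry} and kind k{all,blocked,granted} combine into the
  # xattr name; reading it is enough, no value is set
  getfattr -n glusterfs.clrlk.tinode.kblocked <file>   # drop only blocked inode locks
  getfattr -n glusterfs.clrlk.tinode.kgranted <file>   # drop only granted inode locks
  getfattr -n glusterfs.clrlk.tinode.kall <file>       # drop both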
--
Pranith