[Gluster-users] Gluster FUSE mount sometimes reports that files do not exist until ls is performed on parent directory

Discussion:

Niels Hendriks

2018-04-16 08:24:30 UTC

Hi,

We have a 3-node gluster setup where each node is both a gluster server and
a client.
Every few days we have some $random file or directory that does not exist
according to the FUSE mountpoint. When we try to access the file (stat,
cat, etc...) the filesystem reports that the file/directory does not exist,
even though it does. When we try to create the file/directory we receive
the following error which is also logged in
/var/log/glusterfs/bricks/$brick.log:

[2018-04-10 12:51:26.755928] E [MSGID: 113027] [posix.c:1779:posix_mkdir] 0-www-posix: mkdir of /storage/gluster/path/to/dir failed [File exists]

We don't see this issue on all of the servers, but only on the servers that
did not create the file/directory (so 2 of the 3 gluster nodes).

We found that this issue does not resolve itself automatically. However,
when we perform an ls command on the parent directory the issue will be
resolved for the other nodes.

We are running glusterfs 3.12.6 on Debian 8.

Mount options in /etc/fstab:
/dev/storage-gluster/gluster /storage/gluster xfs rw,inode64,noatime,nouuid 0 2
localhost:/www /var/www glusterfs backup-volfile-servers=10.0.0.2:10.0.0.3,log-level=WARNING 0 0

gluster volume info www

Volume Name: www
Type: Replicate
Volume ID: e0579d53-f671-4868-863b-ba85c4cfacb3
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: n01c01-gluster:/storage/gluster/www
Brick2: n02c01-gluster:/storage/gluster/www
Brick3: n03c01-gluster:/storage/gluster/www
Options Reconfigured:
performance.read-ahead: on
performance.client-io-threads: on
nfs.disable: on
transport.address-family: inet
performance.md-cache-timeout: 600
diagnostics.brick-log-level: WARNING
network.ping-timeout: 3
features.cache-invalidation: on
server.event-threads: 4
performance.cache-invalidation: on
performance.quick-read: on
features.cache-invalidation-timeout: 600
network.inode-lru-limit: 90000
performance.cache-priority: *.php:3,*.temp:3,*:1
performance.nl-cache: on
performance.cache-size: 1GB
performance.readdir-ahead: on
performance.write-behind: on
cluster.readdir-optimize: on
performance.io-thread-count: 64
client.event-threads: 4
cluster.lookup-optimize: on
performance.parallel-readdir: off
performance.write-behind-window-size: 4MB
performance.flush-behind: on
features.bitrot: on
features.scrub: Active
performance.io-cache: off
performance.stat-prefetch: on

We suspected that md-cache could be the cause, but its timeout is 600
seconds, whereas the issue can persist for hours (until we run an ls to fix
it), so that would not explain it.

Does anyone have an idea of what could be the cause of this?

Thanks!

Raghavendra Gowdappa

2018-04-16 08:37:26 UTC

For files, it could be because of:
* cluster.lookup-optimize set to on
* datafile is present on non hashed subvol, but linkto file is absent on
hashed subvol

I see that lookup-optimize is on. Can you get the following information
for the problematic file?

* Name of the file
* All xattrs on the parent directory from all bricks
* stat of the file from all bricks where it is present
* All xattrs on the file from all bricks where it is present

If you are seeing the problem on a directory:
* Name of the directory
* xattrs of the directory and its parent from all bricks

regards,
Raghavendra
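
The information requested above could be gathered on each node with something like the sketch below. It assumes the brick root /storage/gluster/www and the mount point /var/www from the original post. The helper names to_brick_path and collect_info are made up for illustration. It has to run as root on every brick host, since the trusted.* xattrs are not shown to ordinary users.

```shell
#!/bin/sh
# Sketch only: both paths are taken from the volume info and fstab in the
# original post; adjust them for other setups.
BRICK_ROOT="/storage/gluster/www"
MOUNT_ROOT="/var/www"

# Hypothetical helper: map a path under the FUSE mount to the same entry on
# the local brick. Paths outside the mount are returned unchanged, so a
# brick path can be passed in directly.
to_brick_path() {
    case "$1" in
        "$MOUNT_ROOT"/*) printf '%s/%s\n' "$BRICK_ROOT" "${1#"$MOUNT_ROOT"/}" ;;
        *) printf '%s\n' "$1" ;;
    esac
}

# Hypothetical helper: print stat output and all xattrs (hex-encoded) for the
# entry and its parent directory, read directly from the brick.
collect_info() {
    entry=$(to_brick_path "$1")
    parent=$(dirname "$entry")
    for p in "$parent" "$entry"; do
        echo "== $(hostname): $p"
        stat "$p"
        getfattr -d -m . -e hex "$p"
    done
}

# Usage (as root, on each of the three nodes):
#   collect_info /var/www/path/to/dir
```

Reading from the brick rather than from /var/www matters here: the FUSE mount is the layer that reports the entry as missing, while the brick shows what is actually stored on that node.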

_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users

Nithya Balachandran

2018-04-16 09:29:37 UTC

Post by Raghavendra Gowdappa

For files, it could be because of:
* cluster.lookup-optimize set to on
* datafile is present on non hashed subvol, but linkto file is absent on
hashed subvol

This is a pure replicate volume:

Type: Replicate
Volume ID: e0579d53-f671-4868-863b-ba85c4cfacb3
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3

So it is unlikely to be a lookup-optimize problem.


Nithya Balachandran

2018-04-16 09:40:07 UTC

Hi Niels,

As this is a pure replicate volume, lookup-optimize is not going to be of
much use, so you can turn it off if you wish.

Do you see any error messages in the FUSE mount logs when this happens?
If it happens again, a tcpdump of the FUSE mount would help.

Regards,
Nithya
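
The suggestions above could be carried out roughly as in the sketch below. The volume name www and the mount point /var/www come from the original post. The function names, the capture file path, and the log file location are assumptions: the log path follows the usual client naming of /var/log/glusterfs/<mount-path-with-dashes>.log and should be checked on the node. Because the mount uses log-level=WARNING, only warning and error lines will be present in that log.

```shell
#!/bin/sh
# Sketch only. VOLUME and MOUNT_POINT are taken from the original post.
VOLUME="www"
MOUNT_POINT="/var/www"

# Assumption: the FUSE client logs to /var/log/glusterfs/<mount-path>.log,
# with the leading slash dropped and the remaining slashes turned into dashes.
mount_log_path() {
    p=${1#/}
    p=${p%/}
    printf '/var/log/glusterfs/%s.log\n' "$(printf '%s' "$p" | tr '/' '-')"
}

# Turn lookup-optimize off for the volume (run once, on any node).
disable_lookup_optimize() {
    gluster volume set "$VOLUME" cluster.lookup-optimize off
}

# Show error (E) and warning (W) lines from the FUSE mount log.
show_mount_errors() {
    grep -E '^\[[^]]+\] (E|W) ' "$(mount_log_path "$MOUNT_POINT")"
}

# Capture TCP traffic on an affected node while reproducing the failed
# stat/cat; stop with Ctrl-C. The output file name is an arbitrary choice.
capture_traffic() {
    tcpdump -i any -s 0 -w "${1:-/var/tmp/gluster-fuse.pcap}" tcp and not port 22
}

# Usage (as root, on a node where the file appears to be missing):
#   show_mount_errors
#   capture_traffic /var/tmp/gluster-fuse.pcap
```

Capturing on all interfaces is deliberate: each node mounts the volume from localhost, so part of the client traffic goes over the loopback interface and would be missed by a capture on the physical interface alone.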
