Discussion:
[Gluster-users] Quick and small file read/write optimization
Pedro Costa
2018-10-09 13:58:47 UTC
Hi,

I have a 1 x 3 replicated GlusterFS 4.1.5 volume, mounted via FUSE on each server at /www for various Node apps that are proxied with nginx. The servers are then load balanced to split traffic. Here's the gvol1 configuration at the moment:

Volume Name: gvol1
Type: Replicate
Volume ID: 384acec2-XXXX-40da-YYYY-5c53d12b3ae2
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: vm0:/srv/brick1/gvol1
Brick2: vm1:/srv/brick1/gvol1
Brick3: vm2:/srv/brick1/gvol1
Options Reconfigured:
cluster.strict-readdir: on
client.event-threads: 4
cluster.lookup-optimize: on
network.inode-lru-limit: 90000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.cache-samba-metadata: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on
storage.fips-mode-rchecksum: on
features.utime: on
storage.ctime: on
server.event-threads: 4
performance.cache-size: 500MB
performance.read-ahead: on
cluster.readdir-optimize: on
cluster.shd-max-threads: 6
performance.strict-o-direct: on
server.outstanding-rpc-limit: 128
performance.enable-least-priority: off
cluster.nufa: on
performance.nl-cache: on
performance.nl-cache-timeout: 60
performance.cache-refresh-timeout: 10
performance.rda-cache-limit: 128MB
performance.readdir-ahead: on
performance.parallel-readdir: on
disperse.eager-lock: off
network.ping-timeout: 5
cluster.background-self-heal-count: 20
cluster.self-heal-window-size: 2
cluster.self-heal-readdir-size: 2KB
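
For reference, these are all plain volume options applied with gluster volume set from one of the peers, e.g.:

gluster volume set gvol1 performance.md-cache-timeout 600
gluster volume set gvol1 performance.cache-size 500MB
gluster volume set gvol1 network.ping-timeout 5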

On each restart the apps delete a particular folder and rebuild it from internal packages. During one such operation, on a particular client of the volume, I get warnings logged hundreds of times, sometimes even for the same GFID:

[2018-10-09 13:40:40.579161] W [MSGID: 114061] [client-common.c:2658:client_pre_flush_v2] 0-gvol1-client-2: (7955fd7a-3147-48b3-bf6a-5306ac97e10d) remote_fd is -1. EBADFD [File descriptor in bad state]
[2018-10-09 13:40:40.579313] W [MSGID: 114061] [client-common.c:2658:client_pre_flush_v2] 0-gvol1-client-2: (0ac67ee4-a31e-4989-ba1e-e4f513c1f757) remote_fd is -1. EBADFD [File descriptor in bad state]
[2018-10-09 13:40:40.579707] W [MSGID: 114061] [client-common.c:2658:client_pre_flush_v2] 0-gvol1-client-2: (7ea6106d-29f4-4a19-8eb6-6515ffefb9d3) remote_fd is -1. EBADFD [File descriptor in bad state]
[2018-10-09 13:40:40.579911] W [MSGID: 114061] [client-common.c:2658:client_pre_flush_v2] 0-gvol1-client-2: (7ea6106d-29f4-4a19-8eb6-6515ffefb9d3) remote_fd is -1. EBADFD [File descriptor in bad state]

I assume this is probably because the client hasn't caught up with the previous delete? I control the server (itself a client of the gluster volume) on which the restart occurs, and I prevent more than one server from rebuilding the same app at the same time, which makes these logs odd.
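
One mitigation I'm considering (not in place yet, so this is only an untested sketch with made-up script, lock and folder names) is to take a per-app lock and wait for pending self-heals to drain before the delete/rebuild step, roughly:

# untested sketch; lock file, folder and rebuild script names are hypothetical
exec 9>/tmp/rebuild-myapp.lock
flock -n 9 || exit 0                     # another node is already rebuilding this app
while gluster volume heal gvol1 info | grep -q 'Number of entries: [1-9]'; do
    sleep 2                              # wait until no self-heals are pending on any brick
done
rm -rf /www/myapp/packages               # the folder the app deletes on restart (name hypothetical)
./rebuild-myapp.sh                       # rebuild from internal packages (hypothetical script)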

I've implemented the volume options above after reading most of the entries in the archive here over the last few weeks, but I'm not sure what else to tweak, since other than during the restart of the apps it is working pretty well.

Any input you may have on this particular scenario would be much appreciated.

Thanks,
P.
Vlad Kopylov
2018-10-10 04:55:40 UTC
It also matters how you mount it. From my fstab:

glusterfs defaults,_netdev,negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5 0 0
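
In full, using your host and volume names as an example (untested against your setup; backup-volfile-servers is optional but handy with a replica 3 volume), the fstab entry would look something like:

vm0:/gvol1 /www glusterfs defaults,_netdev,negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5,backup-volfile-servers=vm1:vm2 0 0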


Options Reconfigured:
performance.io-thread-count: 8
server.allow-insecure: on
cluster.shd-max-threads: 12
performance.rda-cache-limit: 128MB
cluster.readdir-optimize: on
cluster.read-hash-mode: 0
performance.strict-o-direct: on
cluster.lookup-unhashed: auto
performance.nl-cache: on
performance.nl-cache-timeout: 600
cluster.lookup-optimize: on
client.event-threads: 4
performance.client-io-threads: on
performance.md-cache-timeout: 600
server.event-threads: 4
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-invalidation: on
network.inode-lru-limit: 90000
performance.cache-refresh-timeout: 10
performance.enable-least-priority: off
performance.cache-size: 2GB
cluster.nufa: on
cluster.choose-local: on
server.outstanding-rpc-limit: 128
disperse.eager-lock: off
nfs.disable: on
transport.address-family: inet
Pedro Costa
2018-10-10 08:07:07 UTC
Hi Vlad,

Thanks so much, I will try these. At the moment I'm mounting with:

glusterfs defaults,_netdev,noauto,x-systemd.automount 0 0

in /etc/fstab on Ubuntu 16.04.5 LTS. I will try the additional mount options once the volume options are set.
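
If I'm reading you right, I expect the combined entry to end up roughly like this (untested yet, keeping my automount bits):

vm0:/gvol1 /www glusterfs defaults,_netdev,noauto,x-systemd.automount,negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5 0 0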

Thank you,
P.

