Discussion:
[Gluster-users] Performance drop from 3.8 to 3.10
Lindsay Mathieson
2017-09-21 13:01:11 UTC
Upgraded recently from 3.8.15 to 3.10.5 and have seen a fairly
substantial drop in read/write performance

env:

- 3 node, replica 3 cluster

- Private dedicated Network: 1Gx3, bond: balance-alb

- was able to down the volume for the upgrade and reboot each node

- Usage: VM Hosting (qemu)

- Sharded Volume

- Sequential read performance in VMs has dropped from 700Mbps to 300Mbps

- Sequential write has dropped from approx. 115MB/s to 110MB/s

- Write IOPS (random write throughput) have dropped from 12MB/s to 8MB/s
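For reference, figures like these are typically gathered inside the guest with something like fio (a sketch, not necessarily the exact benchmark used here; the test file path and sizes are arbitrary):

```shell
# sequential read, 1M blocks, direct I/O to bypass the guest page cache
fio --name=seqread --rw=read --bs=1M --size=4G --direct=1 --filename=/var/tmp/fio.test

# sequential write, same block size
fio --name=seqwrite --rw=write --bs=1M --size=4G --direct=1 --filename=/var/tmp/fio.test

# random 4k writes for the IOPS-style figure
fio --name=randwrite --rw=randwrite --bs=4k --size=1G --direct=1 --filename=/var/tmp/fio.test
```

Running the same commands on 3.8 and 3.10 guests makes the before/after numbers directly comparable.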

Apart from increasing the op version I made no changes to the volume
settings.

op-version is 31004
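The cluster-wide op-version can be confirmed from any node (assuming glusterd 3.10+, where `volume get all` covers global options):

```shell
# query the running cluster op-version; 31004 corresponds to 3.10.4
gluster volume get all cluster.op-version
```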

gluster v info

Volume Name: datastore4
Type: Replicate
Volume ID: 0ba131ef-311d-4bb1-be46-596e83b2f6ce
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: vnb.proxmox.softlog:/tank/vmdata/datastore4
Brick2: vng.proxmox.softlog:/tank/vmdata/datastore4
Brick3: vnh.proxmox.softlog:/tank/vmdata/datastore4
Options Reconfigured:
transport.address-family: inet
cluster.locking-scheme: granular
cluster.granular-entry-heal: yes
features.shard-block-size: 64MB
network.remote-dio: enable
cluster.eager-lock: enable
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
performance.stat-prefetch: on
performance.strict-write-ordering: off
nfs.enable-ino32: off
nfs.addr-namelookup: off
nfs.disable: on
cluster.server-quorum-type: server
cluster.quorum-type: auto
features.shard: on
cluster.data-self-heal: on
performance.readdir-ahead: on
performance.low-prio-threads: 32
user.cifs: off
performance.flush-behind: on
server.event-threads: 4
client.event-threads: 4
server.allow-insecure: on
--
Lindsay Mathieson
Krutika Dhananjay
2017-09-22 03:21:56 UTC
Could you disable cluster.eager-lock and try again?

-Krutika
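For anyone following along, the suggested toggle would look like this with the standard volume-set command (a sketch; `datastore4` is the volume name from the `gluster v info` output above):

```shell
# disable eager locking for the A/B test
gluster volume set datastore4 cluster.eager-lock off

# re-run the benchmark, then restore the original setting if it made no difference
gluster volume set datastore4 cluster.eager-lock on
```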

On Thu, Sep 21, 2017 at 6:31 PM, Lindsay Mathieson <
Post by Lindsay Mathieson
Upgraded recently from 3.8.15 to 3.10.5 and have seen a fairly substantial
drop in read/write performance
[...]
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users
Lindsay Mathieson
2017-09-22 15:07:45 UTC
Post by Krutika Dhananjay
Could you disable cluster.eager-lock and try again?
Thanks, but didn't seem to make any difference.


Can't test any more at the moment as we're down a server that hung on reboot :(
--
Lindsay Mathieson
WK
2017-09-22 21:19:16 UTC
Maybe next week we can all explore this.

I'm on 3.10.5 and I don't have any complaints. Actually, we are quite
happy with the new clusters, but these were greenfield installations
that were built out and then replaced our old 3.4 stuff.

So we are still really enjoying the sharding and arbiter improvements.

I therefore don't have any baseline stats to compare any performance diffs.
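For establishing a baseline before any future upgrade, Gluster's built-in profiler can capture per-brick latency and FOP statistics (a sketch; `datastore4` stands in for whatever the volume is actually called):

```shell
# enable io-stats collection on the volume
gluster volume profile datastore4 start

# run the usual VM workload or benchmark, then dump the cumulative stats
gluster volume profile datastore4 info

# stop collection when done to avoid the (small) profiling overhead
gluster volume profile datastore4 stop
```

Saving the `info` output before and after an upgrade gives concrete numbers to compare rather than impressions.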

I'm curious as to what changed in 3.10 that would account for any change
in performance from 3.8 and in a similar vein what changes to expect in
3.12.x as we are thinking about making that jump soon.

-wk
Post by Lindsay Mathieson
Post by Krutika Dhananjay
Could you disable cluster.eager-lock and try again?
Thanks, but didn't seem to make any difference.
Can't test anymore at the moment as down a server that hung on reboot :(