Discussion:
Gluster native mount is really slow compared to nfs
Jo Goossens
2017-07-11 09:01:52 UTC
Hello,

 
 
We tried tons of settings to get a php app running on a native gluster mount:

 
e.g.: 192.168.140.41:/www /var/www glusterfs defaults,_netdev,backup-volfile-servers=192.168.140.42:192.168.140.43,direct-io-mode=disable 0 0

 
I tried several mount variants to speed things up, without luck.

 
 
After that I tried NFS (native gluster NFS 3 and Ganesha NFS 4), and it was a crazy performance difference.

 
e.g.: 192.168.140.41:/www /var/www nfs4 defaults,_netdev 0 0

 
I tried a test like this to confirm the slowness:

 
./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 5000 --file-size 64 --record-size 64
This test finished in around 1.5 seconds with NFS and in more than 250 seconds without NFS (I can't remember the exact numbers, but I reproduced it several times for both).
With the native gluster mount the php app had loading times of over 10 seconds; with the NFS mount the app loaded in around 1 second at most, often less (reproduced several times).
I tried all kinds of performance settings and variants of these, but nothing helped; the difference stayed huge. Here are some of the settings I played with, in random order:

 
gluster volume set www features.cache-invalidation on
gluster volume set www features.cache-invalidation-timeout 600
gluster volume set www performance.stat-prefetch on
gluster volume set www performance.cache-samba-metadata on
gluster volume set www performance.cache-invalidation on
gluster volume set www performance.md-cache-timeout 600
gluster volume set www network.inode-lru-limit 250000
gluster volume set www performance.cache-refresh-timeout 60
gluster volume set www performance.read-ahead disable
gluster volume set www performance.readdir-ahead on
gluster volume set www performance.parallel-readdir on
gluster volume set www performance.write-behind-window-size 4MB
gluster volume set www performance.io-thread-count 64
gluster volume set www performance.client-io-threads on
gluster volume set www performance.cache-size 1GB
gluster volume set www performance.quick-read on
gluster volume set www performance.flush-behind on
gluster volume set www performance.write-behind on
gluster volume set www nfs.disable on
gluster volume set www client.event-threads 3
gluster volume set www server.event-threads 3
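Whichever subset of these actually ends up applied can be double-checked afterwards; the options that took effect show up under "Options Reconfigured" in, for example:

gluster volume info www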
  
 
The NFS HA adds a lot of complexity which we wouldn't need at all in our setup. Could you please explain what is going on here? Is NFS the only solution to get acceptable performance? Did I perhaps miss one crucial setting?

 
We're really desperate, thanks a lot for your help!

 
 
PS: We tried gluster 3.11 and 3.8 on Debian; both had terrible performance when not used with NFS.

 
 


Kind regards

Jo Goossens

 
 
 
Soumya Koduri
2017-07-11 09:16:44 UTC
+ Ambarish
Requesting Ambarish & Karan (cc'ed, who have been working on evaluating the performance of the various access protocols gluster supports) to look at the settings quoted below and provide inputs.

Thanks,
Soumya
Post by Jo Goossens
gluster volume set www features.cache-invalidation on
gluster volume set www features.cache-invalidation-timeout 600
gluster volume set www performance.stat-prefetch on
gluster volume set www performance.cache-samba-metadata on
gluster volume set www performance.cache-invalidation on
gluster volume set www performance.md-cache-timeout 600
gluster volume set www network.inode-lru-limit 250000
gluster volume set www performance.cache-refresh-timeout 60
gluster volume set www performance.read-ahead disable
gluster volume set www performance.readdir-ahead on
gluster volume set www performance.parallel-readdir on
gluster volume set www performance.write-behind-window-size 4MB
gluster volume set www performance.io-thread-count 64
gluster volume set www performance.client-io-threads on
gluster volume set www performance.cache-size 1GB
gluster volume set www performance.quick-read on
gluster volume set www performance.flush-behind on
gluster volume set www performance.write-behind on
gluster volume set www nfs.disable on
gluster volume set www client.event-threads 3
gluster volume set www server.event-threads 3
Jo Goossens
2017-07-11 09:26:28 UTC
Hi all,

 
 
One more thing: we have 3 app servers with gluster on them, replicated across 3 different gluster nodes (so the gluster nodes are app servers at the same time). We could actually almost work locally if we didn't need to have the same files on the 3 nodes plus redundancy :)

 
Initial cluster was created like this:

 
gluster volume create www replica 3 transport tcp 192.168.140.41:/gluster/www 192.168.140.42:/gluster/www 192.168.140.43:/gluster/www force
gluster volume set www network.ping-timeout 5
gluster volume set www performance.cache-size 1024MB
gluster volume set www nfs.disable on # No need for NFS currently
gluster volume start www
To my understanding that still wouldn't explain why NFS has such great performance compared to the native mount ...
  
Regards

Jo

 

 
Jo Goossens
2017-07-11 10:15:33 UTC
Hello,

 
 
Here is a speed test with a new setup we just made with gluster 3.10; there are no other differences except glusterfs versus nfs. NFS is about 80 times faster:

 
 
***@app1:~/smallfile-master# mount -t glusterfs -o use-readdirp=no,log-level=WARNING,log-file=/var/log/glusterxxx.log 192.168.140.41:/www /var/www
***@app1:~/smallfile-master# ./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 500 --file-size 64 --record-size 64
smallfile version 3.0
                           hosts in test : ['192.168.140.41']
                   top test directory(s) : ['/var/www/test']
                               operation : cleanup
                            files/thread : 500
                                 threads : 8
           record size (KB, 0 = maximum) : 64
                          file size (KB) : 64
                  file size distribution : fixed
                           files per dir : 100
                            dirs per dir : 10
              threads share directories? : N
                         filename prefix :
                         filename suffix :
             hash file number into dir.? : N
                     fsync after modify? : N
          pause between files (microsec) : 0
                    finish all requests? : Y
                              stonewall? : Y
                 measure response times? : N
                            verify read? : Y
                                verbose? : False
                          log to stderr? : False
                           ext.attr.size : 0
                          ext.attr.count : 0
               permute host directories? : N
                remote program directory : /root/smallfile-master
               network thread sync. dir. : /var/www/test/network_shared
starting all threads by creating starting gate file /var/www/test/network_shared/starting_gate.tmp
host = 192.168.140.41,thr = 00,elapsed = 68.845450,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 01,elapsed = 67.601088,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 02,elapsed = 58.677994,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 03,elapsed = 65.901922,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 04,elapsed = 66.971720,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 05,elapsed = 71.245102,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 06,elapsed = 67.574845,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 07,elapsed = 54.263242,files = 500,records = 0,status = ok
total threads = 8
total files = 4000
100.00% of requested files processed, minimum is  70.00
71.245102 sec elapsed time
56.144211 files/sec
umount /var/www
***@app1:~/smallfile-master# mount -t nfs -o tcp 192.168.140.41:/www /var/www
***@app1:~/smallfile-master# ./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 500 --file-size 64 --record-size 64
smallfile version 3.0
                           hosts in test : ['192.168.140.41']
                   top test directory(s) : ['/var/www/test']
                               operation : cleanup
                            files/thread : 500
                                 threads : 8
           record size (KB, 0 = maximum) : 64
                          file size (KB) : 64
                  file size distribution : fixed
                           files per dir : 100
                            dirs per dir : 10
              threads share directories? : N
                         filename prefix :
                         filename suffix :
             hash file number into dir.? : N
                     fsync after modify? : N
          pause between files (microsec) : 0
                    finish all requests? : Y
                              stonewall? : Y
                 measure response times? : N
                            verify read? : Y
                                verbose? : False
                          log to stderr? : False
                           ext.attr.size : 0
                          ext.attr.count : 0
               permute host directories? : N
                remote program directory : /root/smallfile-master
               network thread sync. dir. : /var/www/test/network_shared
starting all threads by creating starting gate file /var/www/test/network_shared/starting_gate.tmp
host = 192.168.140.41,thr = 00,elapsed = 0.962424,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 01,elapsed = 0.942673,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 02,elapsed = 0.940622,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 03,elapsed = 0.915218,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 04,elapsed = 0.934349,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 05,elapsed = 0.922466,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 06,elapsed = 0.954381,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 07,elapsed = 0.946127,files = 500,records = 0,status = ok
total threads = 8
total files = 4000
100.00% of requested files processed, minimum is  70.00
0.962424 sec elapsed time
4156.173189 files/sec
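(For reference: that is 71.245102 s vs. 0.962424 s elapsed, or 56.14 vs. 4156.17 files/sec, i.e. roughly a 74x difference on this particular run.)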
  
 
Jo Goossens
2017-07-11 14:48:19 UTC
Hello,

 
 
Here is the volume info, as requested by Soumya:

 
#gluster volume info www
 Volume Name: www
Type: Replicate
Volume ID: 5d64ee36-828a-41fa-adbf-75718b954aff
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.140.41:/gluster/www
Brick2: 192.168.140.42:/gluster/www
Brick3: 192.168.140.43:/gluster/www
Options Reconfigured:
cluster.read-hash-mode: 0
performance.quick-read: on
performance.write-behind-window-size: 4MB
server.allow-insecure: on
performance.read-ahead: disable
performance.readdir-ahead: on
performance.io-thread-count: 64
performance.io-cache: on
performance.client-io-threads: on
server.outstanding-rpc-limit: 128
server.event-threads: 3
client.event-threads: 3
performance.cache-size: 32MB
transport.address-family: inet
nfs.disable: on
nfs.addr-namelookup: off
nfs.export-volumes: on
nfs.rpc-auth-allow: 192.168.140.*
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-samba-metadata: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 100000
performance.parallel-readdir: on
performance.cache-refresh-timeout: 60
performance.rda-cache-limit: 50MB
cluster.nufa: on
network.ping-timeout: 5
cluster.lookup-optimize: on
cluster.quorum-type: auto
I started with none of them set and added/changed things while testing. But it was always slow; tuning some kernel parameters improved it slightly (just a few percent, nothing substantial).
I also tried Ceph just to compare; this is what I got with default settings and no tweaks:
  ./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 5000 --file-size 64 --record-size 64
smallfile version 3.0
                           hosts in test : ['192.168.140.41']
                   top test directory(s) : ['/var/www/test']
                               operation : cleanup
                            files/thread : 5000
                                 threads : 8
           record size (KB, 0 = maximum) : 64
                          file size (KB) : 64
                  file size distribution : fixed
                           files per dir : 100
                            dirs per dir : 10
              threads share directories? : N
                         filename prefix :
                         filename suffix :
             hash file number into dir.? : N
                     fsync after modify? : N
          pause between files (microsec) : 0
                    finish all requests? : Y
                              stonewall? : Y
                 measure response times? : N
                            verify read? : Y
                                verbose? : False
                          log to stderr? : False
                           ext.attr.size : 0
                          ext.attr.count : 0
               permute host directories? : N
                remote program directory : /root/smallfile-master
               network thread sync. dir. : /var/www/test/network_shared
starting all threads by creating starting gate file /var/www/test/network_shared/starting_gate.tmp
host = 192.168.140.41,thr = 00,elapsed = 1.339621,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 01,elapsed = 1.436776,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 02,elapsed = 1.498681,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 03,elapsed = 1.483886,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 04,elapsed = 1.454833,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 05,elapsed = 1.469340,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 06,elapsed = 1.439060,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 07,elapsed = 1.375074,files = 5000,records = 0,status = ok
total threads = 8
total files = 40000
100.00% of requested files processed, minimum is  70.00
1.498681 sec elapsed time
26690.134975 files/sec
  
Regards

Jo

 
Joe Julian
2017-07-11 15:04:24 UTC
My standard response to someone needing filesystem performance for www
traffic is generally, "you're doing it wrong".
https://joejulian.name/blog/optimizing-web-performance-with-glusterfs/

That said, you might also look at these mount options:
attribute-timeout, entry-timeout, negative-timeout (set to some large
amount of time), and fopen-keep-cache.
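For example, something along these lines on the client (the values here are only an illustration, not a recommendation):

mount -t glusterfs -o attribute-timeout=600,entry-timeout=600,negative-timeout=600,fopen-keep-cache 192.168.140.41:/www /var/www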
Jo Goossens
2017-07-11 15:14:51 UTC
Hello Joe,

 
 
I really appreciate your feedback, but I already tried the opcache stuff (set to not validate at all). It improves things, of course, but somehow not completely; it's still quite slow.
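By "the opcache stuff" I mean php.ini settings roughly along these lines (the exact values here are just an illustration):

opcache.enable=1
opcache.validate_timestamps=0
opcache.max_accelerated_files=100000
opcache.memory_consumption=256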

 
I did not try the mount options yet, but I will now!

 
 
With NFS (it doesn't matter much whether it's the built-in version 3 or Ganesha version 4) I can even host the site perfectly fast without these extreme opcache settings.

 
I still can't understand why the NFS mount is easily 80 times faster, seemingly no matter what options I set. It's almost as if something is really wrong somewhere...

 
I tried the Ceph mount now, and out of the box it's comparable to Gluster with the NFS mount.

 
 
Regards

Jo

 
BE: +32 53 599 000

NL: +31 85 888 4 555

 
https://www.hosted-power.com/

 

 
-----Original message-----
From:Joe Julian <***@julianfamily.org>
Sent:Tue 11-07-2017 17:04
Subject:Re: [Gluster-users] Gluster native mount is really slow compared to nfs
To:gluster-***@gluster.org;


My standard response to someone needing filesystem performance for www traffic is generally, "you're doing it wrong". https://joejulian.name/blog/optimizing-web-performance-with-glusterfs/

That said, you might also look at these mount options: attribute-timeout, entry-timeout, negative-timeout (set to some large amount of time), and fopen-keep-cache.



On 07/11/2017 07:48 AM, Jo Goossens wrote:


Hello,



 


 


Here is the volume info as requested by soumya:



 

#gluster volume info www

 
Volume Name: www

Type: Replicate

Volume ID: 5d64ee36-828a-41fa-adbf-75718b954aff

Status: Started

Snapshot Count: 0

Number of Bricks: 1 x 3 = 3

Transport-type: tcp

Bricks:

Brick1: 192.168.140.41:/gluster/www

Brick2: 192.168.140.42:/gluster/www

Brick3: 192.168.140.43:/gluster/www

Options Reconfigured:

cluster.read-hash-mode: 0

performance.quick-read: on

performance.write-behind-window-size: 4MB

server.allow-insecure: on

performance.read-ahead: disable

performance.readdir-ahead: on

performance.io-thread-count: 64

performance.io-cache: on

performance.client-io-threads: on

server.outstanding-rpc-limit: 128

server.event-threads: 3

client.event-threads: 3

performance.cache-size: 32MB

transport.address-family: inet

nfs.disable: on

nfs.addr-namelookup: off

nfs.export-volumes: on

nfs.rpc-auth-allow: 192.168.140.*

features.cache-invalidation: on

features.cache-invalidation-timeout: 600

performance.stat-prefetch: on

performance.cache-samba-metadata: on

performance.cache-invalidation: on

performance.md-cache-timeout: 600

network.inode-lru-limit: 100000

performance.parallel-readdir: on

performance.cache-refresh-timeout: 60

performance.rda-cache-limit: 50MB

cluster.nufa: on

network.ping-timeout: 5

cluster.lookup-optimize: on

cluster.quorum-type: auto

 
I started with none of them set and I added/changed while testing. But it was always slow, by tuning some kernel parameters it improved slightly (just a few percent, nothing reasonable)

 
I also tried ceph just to compare, I got this with default settings and no tweaks:

 
 ./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 5000 --file-size 64 --record-size 64

smallfile version 3.0

                           hosts in test : ['192.168.140.41']

                   top test directory(s) : ['/var/www/test']

                               operation : cleanup

                            files/thread : 5000

                                 threads : 8

           record size (KB, 0 = maximum) : 64

                          file size (KB) : 64

                  file size distribution : fixed

                           files per dir : 100

                            dirs per dir : 10

              threads share directories? : N

                         filename prefix :

                         filename suffix :

             hash file number into dir.? : N

                     fsync after modify? : N

          pause between files (microsec) : 0

                    finish all requests? : Y

                              stonewall? : Y

                 measure response times? : N

                            verify read? : Y

                                verbose? : False

                          log to stderr? : False

                           ext.attr.size : 0

                          ext.attr.count : 0

               permute host directories? : N

                remote program directory : /root/smallfile-master

               network thread sync. dir. : /var/www/test/network_shared

starting all threads by creating starting gate file /var/www/test/network_shared/starting_gate.tmp

host = 192.168.140.41,thr = 00,elapsed = 1.339621,files = 5000,records = 0,status = ok

host = 192.168.140.41,thr = 01,elapsed = 1.436776,files = 5000,records = 0,status = ok

host = 192.168.140.41,thr = 02,elapsed = 1.498681,files = 5000,records = 0,status = ok

host = 192.168.140.41,thr = 03,elapsed = 1.483886,files = 5000,records = 0,status = ok

host = 192.168.140.41,thr = 04,elapsed = 1.454833,files = 5000,records = 0,status = ok

host = 192.168.140.41,thr = 05,elapsed = 1.469340,files = 5000,records = 0,status = ok

host = 192.168.140.41,thr = 06,elapsed = 1.439060,files = 5000,records = 0,status = ok

host = 192.168.140.41,thr = 07,elapsed = 1.375074,files = 5000,records = 0,status = ok

total threads = 8

total files = 40000

100.00% of requested files processed, minimum is  70.00

1.498681 sec elapsed time

26690.134975 files/sec

 
 


Regards



Jo



 
-----Original message-----
From: Jo Goossens <***@hosted-power.com> <mailto:***@hosted-power.com>
Sent: Tue 11-07-2017 12:15
Subject: Re: [Gluster-users] Gluster native mount is really slow compared to nfs
To: Soumya Koduri <***@redhat.com> <mailto:***@redhat.com> ; gluster-***@gluster.org <mailto:gluster-***@gluster.org> ;
CC: Ambarish Soman <***@redhat.com> <mailto:***@redhat.com> ;


Hello,



 


 


Here is some speedtest with a new setup we just made with gluster 3.10, there are no other differences, except glusterfs versus nfs. The nfs is about 80 times faster:



 


 

***@app1:~/smallfile-master# mount -t glusterfs -o use-readdirp=no,log-level=WARNING,log-file=/var/log/glusterxxx.log 192.168.140.41:/www /var/www

***@app1:~/smallfile-master# ./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 500 --file-size 64 --record-size 64

smallfile version 3.0

                           hosts in test : ['192.168.140.41']

                   top test directory(s) : ['/var/www/test']

                               operation : cleanup

                            files/thread : 500

                                 threads : 8

           record size (KB, 0 = maximum) : 64

                          file size (KB) : 64

                  file size distribution : fixed

                           files per dir : 100

                            dirs per dir : 10

              threads share directories? : N

                         filename prefix :

                         filename suffix :

             hash file number into dir.? : N

                     fsync after modify? : N

          pause between files (microsec) : 0

                    finish all requests? : Y

                              stonewall? : Y

                 measure response times? : N

                            verify read? : Y

                                verbose? : False

                          log to stderr? : False

                           ext.attr.size : 0

                          ext.attr.count : 0

               permute host directories? : N

                remote program directory : /root/smallfile-master

               network thread sync. dir. : /var/www/test/network_shared

starting all threads by creating starting gate file /var/www/test/network_shared/starting_gate.tmp

host = 192.168.140.41,thr = 00,elapsed = 68.845450,files = 500,records = 0,status = ok

host = 192.168.140.41,thr = 01,elapsed = 67.601088,files = 500,records = 0,status = ok

host = 192.168.140.41,thr = 02,elapsed = 58.677994,files = 500,records = 0,status = ok

host = 192.168.140.41,thr = 03,elapsed = 65.901922,files = 500,records = 0,status = ok

host = 192.168.140.41,thr = 04,elapsed = 66.971720,files = 500,records = 0,status = ok

host = 192.168.140.41,thr = 05,elapsed = 71.245102,files = 500,records = 0,status = ok

host = 192.168.140.41,thr = 06,elapsed = 67.574845,files = 500,records = 0,status = ok

host = 192.168.140.41,thr = 07,elapsed = 54.263242,files = 500,records = 0,status = ok

total threads = 8

total files = 4000

100.00% of requested files processed, minimum is  70.00

71.245102 sec elapsed time

56.144211 files/sec

 
umount /var/www

 
***@app1:~/smallfile-master# mount -t nfs -o tcp 192.168.140.41:/www /var/www

***@app1:~/smallfile-master# ./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 500 --file-size 64 --record-size 64

smallfile version 3.0

                           hosts in test : ['192.168.140.41']

                   top test directory(s) : ['/var/www/test']

                               operation : cleanup

                            files/thread : 500

                                 threads : 8

           record size (KB, 0 = maximum) : 64

                          file size (KB) : 64

                  file size distribution : fixed

                           files per dir : 100

                            dirs per dir : 10

              threads share directories? : N

                         filename prefix :

                         filename suffix :

             hash file number into dir.? : N

                     fsync after modify? : N

          pause between files (microsec) : 0

                    finish all requests? : Y

                              stonewall? : Y

                 measure response times? : N

                            verify read? : Y

                                verbose? : False

                          log to stderr? : False

                           ext.attr.size : 0

                          ext.attr.count : 0

               permute host directories? : N

                remote program directory : /root/smallfile-master

               network thread sync. dir. : /var/www/test/network_shared

starting all threads by creating starting gate file /var/www/test/network_shared/starting_gate.tmp

host = 192.168.140.41,thr = 00,elapsed = 0.962424,files = 500,records = 0,status = ok

host = 192.168.140.41,thr = 01,elapsed = 0.942673,files = 500,records = 0,status = ok

host = 192.168.140.41,thr = 02,elapsed = 0.940622,files = 500,records = 0,status = ok

host = 192.168.140.41,thr = 03,elapsed = 0.915218,files = 500,records = 0,status = ok

host = 192.168.140.41,thr = 04,elapsed = 0.934349,files = 500,records = 0,status = ok

host = 192.168.140.41,thr = 05,elapsed = 0.922466,files = 500,records = 0,status = ok

host = 192.168.140.41,thr = 06,elapsed = 0.954381,files = 500,records = 0,status = ok

host = 192.168.140.41,thr = 07,elapsed = 0.946127,files = 500,records = 0,status = ok

total threads = 8

total files = 4000

100.00% of requested files processed, minimum is  70.00

0.962424 sec elapsed time

4156.173189 files/sec

 

 


 
-----Original message-----
From: Jo Goossens <***@hosted-power.com> <mailto:***@hosted-power.com>
Sent: Tue 11-07-2017 11:26
Subject: Re: [Gluster-users] Gluster native mount is really slow compared to nfs
To: gluster-***@gluster.org <mailto:gluster-***@gluster.org> ; Soumya Koduri <***@redhat.com> <mailto:***@redhat.com> ;
CC: Ambarish Soman <***@redhat.com> <mailto:***@redhat.com> ;


Hi all,



 


 


One more thing, we have 3 apps servers with the gluster on it, replicated on 3 different gluster nodes. (So the gluster nodes are app servers at the same time). We could actually almost work locally if we wouldn't need to have the same files on the 3 nodes and redundancy :)



 


Initial cluster was created like this:



 

gluster volume create www replica 3 transport tcp 192.168.140.41:/gluster/www 192.168.140.42:/gluster/www 192.168.140.43:/gluster/www force

gluster volume set www network.ping-timeout 5

gluster volume set www performance.cache-size 1024MB

gluster volume set www nfs.disable on # No need for NFS currently

gluster volume start www

 
To my understanding it still wouldn't explain why nfs has such great performance compared to native ...

 

 


Regards



Jo



 



 

_______________________________________________
Gluster-users mailing list
Gluster-***@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
Joe Julian
2017-07-11 15:16:41 UTC
Permalink
Raw Message
RE: [Gluster-users] Gluster native mount is really slow compared to nfs
Post by Jo Goossens
Hello Joe,
I really appreciate your feedback, but I already tried the opcache
stuff (set to not validate at all). It improves things then, of course, but not
completely somehow. Still quite slow.
I did not try the mount options yet, but I will now!
With nfs (it doesn't matter much whether built-in version 3 or ganesha version 4)
I can even host the site perfectly fast without these extreme opcache
settings.
I still can't understand why the nfs mount is easily 80 times faster,
no matter what options I set. It's almost as if something is
really wrong somehow...
Because the linux nfs client doesn't touch the network for many of those
operations; the results are instead held in the kernel's FSCache.
Post by Jo Goossens
I tried the ceph mount now, and out of the box it's comparable with gluster via an nfs mount.
Regards
Jo
BE: +32 53 599 000
NL: +31 85 888 4 555
https://www.hosted-power.com/
-----Original message-----
*Sent:* Tue 11-07-2017 17:04
*Subject:* Re: [Gluster-users] Gluster native mount is really slow
compared to nfs
My standard response to someone needing filesystem performance for
www traffic is generally, "you're doing it wrong".
https://joejulian.name/blog/optimizing-web-performance-with-glusterfs/
That said, you might also look at these mount options: attribute-timeout,
entry-timeout, negative-timeout (set to some large amount of time), and
fopen-keep-cache.
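
Applied to this setup, that would presumably look something like the following (a sketch only; same volume and mount point as in the tests above, with 600-second timeouts picked as an example of a "large amount of time"):

mount -t glusterfs -o attribute-timeout=600,entry-timeout=600,negative-timeout=600,fopen-keep-cache 192.168.140.41:/www /var/www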
Jo Goossens
2017-07-11 15:39:55 UTC
Permalink
Raw Message
Hello Joe,

 
 
I just did a mount like this (the added options being attribute-timeout, entry-timeout, negative-timeout and fopen-keep-cache):

 
mount -t glusterfs -o attribute-timeout=600,entry-timeout=600,negative-timeout=600,fopen-keep-cache,use-readdirp=no,log-level=WARNING,log-file=/var/log/glusterxxx.log 192.168.140.41:/www /var/www
 Results:

 
***@app1:~/smallfile-master# ./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 5000 --file-size 64 --record-size 64
smallfile version 3.0
                           hosts in test : ['192.168.140.41']
                   top test directory(s) : ['/var/www/test']
                               operation : cleanup
                            files/thread : 5000
                                 threads : 8
           record size (KB, 0 = maximum) : 64
                          file size (KB) : 64
                  file size distribution : fixed
                           files per dir : 100
                            dirs per dir : 10
              threads share directories? : N
                         filename prefix :
                         filename suffix :
             hash file number into dir.? : N
                     fsync after modify? : N
          pause between files (microsec) : 0
                    finish all requests? : Y
                              stonewall? : Y
                 measure response times? : N
                            verify read? : Y
                                verbose? : False
                          log to stderr? : False
                           ext.attr.size : 0
                          ext.attr.count : 0
               permute host directories? : N
                remote program directory : /root/smallfile-master
               network thread sync. dir. : /var/www/test/network_shared
starting all threads by creating starting gate file /var/www/test/network_shared/starting_gate.tmp
host = 192.168.140.41,thr = 00,elapsed = 1.232004,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 01,elapsed = 1.148738,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 02,elapsed = 1.130913,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 03,elapsed = 1.183088,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 04,elapsed = 1.220752,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 05,elapsed = 1.228039,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 06,elapsed = 1.216787,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 07,elapsed = 1.229036,files = 5000,records = 0,status = ok
total threads = 8
total files = 40000
100.00% of requested files processed, minimum is  70.00
1.232004 sec elapsed time
32467.428972 files/sec
  
***@app1:~/smallfile-master# ./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 50000 --file-size 64 --record-size 64
smallfile version 3.0
                           hosts in test : ['192.168.140.41']
                   top test directory(s) : ['/var/www/test']
                               operation : cleanup
                            files/thread : 50000
                                 threads : 8
           record size (KB, 0 = maximum) : 64
                          file size (KB) : 64
                  file size distribution : fixed
                           files per dir : 100
                            dirs per dir : 10
              threads share directories? : N
                         filename prefix :
                         filename suffix :
             hash file number into dir.? : N
                     fsync after modify? : N
          pause between files (microsec) : 0
                    finish all requests? : Y
                              stonewall? : Y
                 measure response times? : N
                            verify read? : Y
                                verbose? : False
                          log to stderr? : False
                           ext.attr.size : 0
                          ext.attr.count : 0
               permute host directories? : N
                remote program directory : /root/smallfile-master
               network thread sync. dir. : /var/www/test/network_shared
starting all threads by creating starting gate file /var/www/test/network_shared/starting_gate.tmp
host = 192.168.140.41,thr = 00,elapsed = 4.242312,files = 50000,records = 0,status = ok
host = 192.168.140.41,thr = 01,elapsed = 4.250831,files = 50000,records = 0,status = ok
host = 192.168.140.41,thr = 02,elapsed = 3.771269,files = 50000,records = 0,status = ok
host = 192.168.140.41,thr = 03,elapsed = 4.060653,files = 50000,records = 0,status = ok
host = 192.168.140.41,thr = 04,elapsed = 3.880653,files = 50000,records = 0,status = ok
host = 192.168.140.41,thr = 05,elapsed = 3.847107,files = 50000,records = 0,status = ok
host = 192.168.140.41,thr = 06,elapsed = 3.895537,files = 50000,records = 0,status = ok
host = 192.168.140.41,thr = 07,elapsed = 3.966394,files = 50000,records = 0,status = ok
total threads = 8
total files = 400000
100.00% of requested files processed, minimum is  70.00
4.250831 sec elapsed time
94099.245073 files/sec
***@app1:~/smallfile-master#
  
As you can see it's now crazy fast, I think close to or faster than nfs !! What the hell!??!

 
I'm so excited I'm already posting. Any suggestions for those parameters? I will do additional testing over here, because this is ridiculous. That would mean the defaults are no good at all...

 

Regards

Jo

 
On 07/11/2017 07:48 AM, Jo Goossens wrote:


Hello,



 


 


Here is the volume info as requested by Soumya:



 

#gluster volume info www

 
Volume Name: www

Type: Replicate

Volume ID: 5d64ee36-828a-41fa-adbf-75718b954aff

Status: Started

Snapshot Count: 0

Number of Bricks: 1 x 3 = 3

Transport-type: tcp

Bricks:

Brick1: 192.168.140.41:/gluster/www

Brick2: 192.168.140.42:/gluster/www

Brick3: 192.168.140.43:/gluster/www

Options Reconfigured:

cluster.read-hash-mode: 0

performance.quick-read: on

performance.write-behind-window-size: 4MB

server.allow-insecure: on

performance.read-ahead: disable

performance.readdir-ahead: on

performance.io-thread-count: 64

performance.io-cache: on

performance.client-io-threads: on

server.outstanding-rpc-limit: 128

server.event-threads: 3

client.event-threads: 3

performance.cache-size: 32MB

transport.address-family: inet

nfs.disable: on

nfs.addr-namelookup: off

nfs.export-volumes: on

nfs.rpc-auth-allow: 192.168.140.*

features.cache-invalidation: on

features.cache-invalidation-timeout: 600

performance.stat-prefetch: on

performance.cache-samba-metadata: on

performance.cache-invalidation: on

performance.md-cache-timeout: 600

network.inode-lru-limit: 100000

performance.parallel-readdir: on

performance.cache-refresh-timeout: 60

performance.rda-cache-limit: 50MB

cluster.nufa: on

network.ping-timeout: 5

cluster.lookup-optimize: on

cluster.quorum-type: auto

 
I started with none of them set and added/changed options while testing. But it was always slow; tuning some kernel parameters improved things slightly (just a few percent, nothing substantial).

 
I also tried ceph just to compare, I got this with default settings and no tweaks:

 
 ./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 5000 --file-size 64 --record-size 64

smallfile version 3.0

                           hosts in test : ['192.168.140.41']

                   top test directory(s) : ['/var/www/test']

                               operation : cleanup

                            files/thread : 5000

                                 threads : 8

           record size (KB, 0 = maximum) : 64

                          file size (KB) : 64

                  file size distribution : fixed

                           files per dir : 100

                            dirs per dir : 10

              threads share directories? : N

                         filename prefix :

                         filename suffix :

             hash file number into dir.? : N

                     fsync after modify? : N

          pause between files (microsec) : 0

                    finish all requests? : Y

                              stonewall? : Y

                 measure response times? : N

                            verify read? : Y

                                verbose? : False

                          log to stderr? : False

                           ext.attr.size : 0

                          ext.attr.count : 0

               permute host directories? : N

                remote program directory : /root/smallfile-master

               network thread sync. dir. : /var/www/test/network_shared

starting all threads by creating starting gate file /var/www/test/network_shared/starting_gate.tmp

host = 192.168.140.41,thr = 00,elapsed = 1.339621,files = 5000,records = 0,status = ok

host = 192.168.140.41,thr = 01,elapsed = 1.436776,files = 5000,records = 0,status = ok

host = 192.168.140.41,thr = 02,elapsed = 1.498681,files = 5000,records = 0,status = ok

host = 192.168.140.41,thr = 03,elapsed = 1.483886,files = 5000,records = 0,status = ok

host = 192.168.140.41,thr = 04,elapsed = 1.454833,files = 5000,records = 0,status = ok

host = 192.168.140.41,thr = 05,elapsed = 1.469340,files = 5000,records = 0,status = ok

host = 192.168.140.41,thr = 06,elapsed = 1.439060,files = 5000,records = 0,status = ok

host = 192.168.140.41,thr = 07,elapsed = 1.375074,files = 5000,records = 0,status = ok

total threads = 8

total files = 40000

100.00% of requested files processed, minimum is  70.00

1.498681 sec elapsed time

26690.134975 files/sec

 
 


Regards



Jo



 
Vijay Bellur
2017-07-11 16:16:38 UTC
Permalink
Raw Message
Would it be possible to profile the client [1] with defaults and the set of
options used now? That could help in understanding the performance delta
better.

Thanks,
Vijay

[1]
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Performance%20Testing/#client-side-profiling
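
For what it's worth, the server-side counterpart of that profiling only takes a few commands (a rough sketch, assuming the volume is still named www; the client-side io-stats procedure itself is described in the guide linked above):

gluster volume profile www start
# ... run the smallfile test or the php app against the mount ...
gluster volume profile www info
gluster volume profile www stop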
Jo Goossens
2017-07-11 16:23:33 UTC
Permalink
Raw Message
Hello Vijay,

 
 
What do you mean exactly? What info is missing?

 
PS: I already found out that for this particular test all the difference is made by negative-timeout=600; when I remove it, it's much, much slower again.

 
 
Regards

Jo
 
 
Jo Goossens
2017-07-11 16:48:34 UTC
Permalink
Raw Message
PS: I just tested the difference between these 2 mounts:

mount -t glusterfs -o negative-timeout=1,use-readdirp=no,log-level=WARNING,log-file=/var/log/glusterxxx.log 192.168.140.41:/www /var/www

mount -t glusterfs -o use-readdirp=no,log-level=WARNING,log-file=/var/log/glusterxxx.log 192.168.140.41:/www /var/www

So that is only a 1 second negative timeout...

In this particular test: ./smallfile_cli.py --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 50000 --file-size 64 --record-size 64

The result is about 4 seconds with the negative timeout of 1 second defined, and many, many minutes without the negative timeout (I quit after 15 minutes of waiting).

I will move on to some real-world tests now to see how it performs there.
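
For anyone wanting to make this permanent: the same options should be expressible as an /etc/fstab entry, roughly like the line below (untested sketch; these are simply the option names and values from my earlier mount commands):

192.168.140.41:/www /var/www glusterfs defaults,_netdev,negative-timeout=600,attribute-timeout=600,entry-timeout=600,fopen-keep-cache 0 0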
  Regards
Jo

 
 

 
-----Original message-----
From:Jo Goossens <***@hosted-power.com>
Sent:Tue 11-07-2017 18:23
Subject:Re: [Gluster-users] Gluster native mount is really slow compared to nfs
To:Vijay Bellur <***@redhat.com>;
CC:gluster-***@gluster.org;


Hello Vijay,

 
 
What do you mean exactly? What info is missing?

 
PS: I already found out that for this particular test all the difference is made by : negative-timeout=600 , when removing it, it's much much slower again.

 
 
Regards

Jo
 
-----Original message-----
From:Vijay Bellur <***@redhat.com>
Sent:Tue 11-07-2017 18:16
Subject:Re: [Gluster-users] Gluster native mount is really slow compared to nfs
To:Jo Goossens <***@hosted-power.com>;
CC:gluster-***@gluster.org; Joe Julian <***@julianfamily.org>;



On Tue, Jul 11, 2017 at 11:39 AM, Jo Goossens <***@hosted-power.com <mailto:***@hosted-power.com> > wrote:


Hello Joe,

 
 
I just did a mount like this (added the bold):

 
mount -t glusterfs -o attribute-timeout=600,entry-timeout=600,negative-timeout=600,fopen-keep-cache,use-readdirp=no,log-level=WARNING,log-file=/var/log/glusterxxx.log 192.168.140.41:/www /var/www
 Results:

 
***@app1:~/smallfile-master# ./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 5000 --file-size 64 --record-size 64
smallfile version 3.0
                           hosts in test : ['192.168.140.41']
                   top test directory(s) : ['/var/www/test']
                               operation : cleanup
                            files/thread : 5000
                                 threads : 8
           record size (KB, 0 = maximum) : 64
                          file size (KB) : 64
                  file size distribution : fixed
                           files per dir : 100
                            dirs per dir : 10
              threads share directories? : N
                         filename prefix :
                         filename suffix :
             hash file number into dir.? : N
                     fsync after modify? : N
          pause between files (microsec) : 0
                    finish all requests? : Y
                              stonewall? : Y
                 measure response times? : N
                            verify read? : Y
                                verbose? : False
                          log to stderr? : False
                           ext.attr.size : 0
                          ext.attr.count : 0
               permute host directories? : N
                remote program directory : /root/smallfile-master
               network thread sync. dir. : /var/www/test/network_shared
starting all threads by creating starting gate file /var/www/test/network_shared/starting_gate.tmp
host = 192.168.140.41,thr = 00,elapsed = 1.232004,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 01,elapsed = 1.148738,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 02,elapsed = 1.130913,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 03,elapsed = 1.183088,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 04,elapsed = 1.220752,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 05,elapsed = 1.228039,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 06,elapsed = 1.216787,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 07,elapsed = 1.229036,files = 5000,records = 0,status = ok
total threads = 8
total files = 40000
100.00% of requested files processed, minimum is  70.00
1.232004 sec elapsed time
32467.428972 files/sec
  
***@app1:~/smallfile-master# ./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 50000 --file-size 64 --record-size 64
smallfile version 3.0
                           hosts in test : ['192.168.140.41']
                   top test directory(s) : ['/var/www/test']
                               operation : cleanup
                            files/thread : 50000
                                 threads : 8
           record size (KB, 0 = maximum) : 64
                          file size (KB) : 64
                  file size distribution : fixed
                           files per dir : 100
                            dirs per dir : 10
              threads share directories? : N
                         filename prefix :
                         filename suffix :
             hash file number into dir.? : N
                     fsync after modify? : N
          pause between files (microsec) : 0
                    finish all requests? : Y
                              stonewall? : Y
                 measure response times? : N
                            verify read? : Y
                                verbose? : False
                          log to stderr? : False
                           ext.attr.size : 0
                          ext.attr.count : 0
               permute host directories? : N
                remote program directory : /root/smallfile-master
               network thread sync. dir. : /var/www/test/network_shared
starting all threads by creating starting gate file /var/www/test/network_shared/starting_gate.tmp
host = 192.168.140.41,thr = 00,elapsed = 4.242312,files = 50000,records = 0,status = ok
host = 192.168.140.41,thr = 01,elapsed = 4.250831,files = 50000,records = 0,status = ok
host = 192.168.140.41,thr = 02,elapsed = 3.771269,files = 50000,records = 0,status = ok
host = 192.168.140.41,thr = 03,elapsed = 4.060653,files = 50000,records = 0,status = ok
host = 192.168.140.41,thr = 04,elapsed = 3.880653,files = 50000,records = 0,status = ok
host = 192.168.140.41,thr = 05,elapsed = 3.847107,files = 50000,records = 0,status = ok
host = 192.168.140.41,thr = 06,elapsed = 3.895537,files = 50000,records = 0,status = ok
host = 192.168.140.41,thr = 07,elapsed = 3.966394,files = 50000,records = 0,status = ok
total threads = 8
total files = 400000
100.00% of requested files processed, minimum is  70.00
4.250831 sec elapsed time
94099.245073 files/sec
***@app1:~/smallfile-master#
  
As you can see it's now crazy fast, I think close to or even faster than NFS! What the hell?!

 
I'm so excited I'm already posting. Any suggestions for those parameters? I will do additional testing over here, because this is ridiculous. That would mean the defaults are no good at all...

 
 Would it be possible to profile the client [1] with defaults and the set of options used now? That could help in understanding the performance delta better.
 Thanks,
Vijay
 [1] https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Performance%20Testing/#client-side-profiling 
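 For reference, a rough sketch of what that client-side profiling involves (based on the guide in [1]; please double-check the exact commands against your gluster version, and the dump file path below is only an example):

# on any server: start collecting per-brick statistics for the www volume
gluster volume profile www start

# on the client: writing this xattr asks the FUSE mount's io-stats
# translator to dump its counters to the named file
setfattr -n trusted.io-stats-dump -v /tmp/gluster_client_profile.txt /var/www

# run the smallfile test, trigger a second dump to another file, and compare
# the two dumps to see per-fop counts and latency as seen by the client

# on any server: show the collected per-brick statistics
gluster volume profile www info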
 



_______________________________________________
Gluster-users mailing list
Gluster-***@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
l***@ulrar.net
2017-07-11 15:02:53 UTC
Permalink
Raw Message
Hi,

We've been doing that for some clients; basically it works fine if you
configure your OPcache very aggressively. Increase the available RAM for it,
disable any form of OPcache revalidation from disk, and it'll work great,
because your app won't touch gluster.
Then whenever you make a change to the PHP code, just restart PHP to force
it to reload the sources from gluster.
For example:

zend_extension = opcache.so

[opcache]
opcache.enable = 1
opcache.enable_cli = 1
opcache.memory_consumption = 1024
opcache.max_accelerated_files = 80000
opcache.revalidate_freq = 300
opcache.validate_timestamps = 1
opcache.interned_strings_buffer = 32
opcache.fast_shutdown = 1

With that config, it works well. It needs some getting used to, though, since
you'll need to restart PHP to see any change to the sources applied.
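To be clear, "restart PHP" here just means restarting whatever runs it; the
exact service name depends on your distro and PHP version, for example
(adjust to your setup):

systemctl restart php7.0-fpm    # when using php-fpm
systemctl restart apache2       # when PHP runs as mod_php inside Apache

Both clear the OPcache, so the sources get read from gluster once on the
next requests.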

If you use something with an on-disk cache (Prestashop, Magento, TYPO3, ...),
do think of storing that cache in Redis or similar, never on gluster; that would
kill performance. I've seen a gain of ~10 seconds just by moving the cache
from gluster to Redis for Magento, for example.
Post by Jo Goossens
Hello,
 
 
 
e.g.: 192.168.140.41:/www /var/www glusterfs defaults,_netdev,backup-volfile-servers=192.168.140.42:192.168.140.43,direct-io-mode=disable 0 0
 
I tried some mount variants in order to speed up things without luck.
 
 
After that I tried nfs (native gluster nfs 3 and ganesha nfs 4), it was a crazy performance difference.
 
e.g.: 192.168.140.41:/www /var/www nfs4 defaults,_netdev 0 0
 
 
./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 5000 --file-size 64 --record-size 64
 This test finished in around 1.5 seconds with NFS and in more than 250 seconds without nfs (can't remember exact numbers, but I reproduced it several times for both).
 With the native gluster mount the php app had loading times of over 10 seconds, with the nfs mount the php app loaded around 1 second maximum and even less. (reproduced several times)
 
gluster volume set www features.cache-invalidation on
gluster volume set www features.cache-invalidation-timeout 600
gluster volume set www performance.stat-prefetch on
gluster volume set www performance.cache-samba-metadata on
gluster volume set www performance.cache-invalidation on
gluster volume set www performance.md-cache-timeout 600
gluster volume set www network.inode-lru-limit 250000
 gluster volume set www performance.cache-refresh-timeout 60
gluster volume set www performance.read-ahead disable
gluster volume set www performance.readdir-ahead on
gluster volume set www performance.parallel-readdir on
gluster volume set www performance.write-behind-window-size 4MB
gluster volume set www performance.io-thread-count 64
 gluster volume set www performance.client-io-threads on
 gluster volume set www performance.cache-size 1GB
gluster volume set www performance.quick-read on
gluster volume set www performance.flush-behind on
gluster volume set www performance.write-behind on
gluster volume set www nfs.disable on
 gluster volume set www client.event-threads 3
gluster volume set www server.event-threads 3
  
 
The NFS ha adds a lot of complexity which we wouldn't need at all in our setup, could you please explain what is going on here? Is NFS the only solution to get acceptable performance? Did I miss one crucial settting perhaps?
 
We're really desperate, thanks a lot for your help!
 
 
PS: We tried with gluster 3.11 and 3.8 on Debian, both had terrible performance when not used with nfs.
 
 
Kind regards
Jo Goossens
 
 
 
_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users
Jo Goossens
2017-07-12 12:20:30 UTC
Permalink
Raw Message
Hello,

 
 
While there are probably other interesting parameters and options in gluster itself, for us the largest difference in this speed test, and also for our website (real-world performance), was the negative-timeout value during mount. A value of only 1 second already seems to solve so many problems; can anyone knowledgeable explain why this is the case?

 
This would be a better default, I suppose...

 
I'm still wondering if there is a big underlying issue in gluster causing the difference to be so gigantic.
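In case it helps anyone else, this is roughly how we would persist that option in fstab (same layout as our original mount line, just with the negative timeout added; untested exactly as written here):

192.168.140.41:/www /var/www glusterfs defaults,_netdev,backup-volfile-servers=192.168.140.42:192.168.140.43,negative-timeout=1 0 0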



Regards

Jo

 

 
-----Original message-----
From:Jo Goossens <***@hosted-power.com>
Sent:Tue 11-07-2017 18:48
Subject:RE: [Gluster-users] Gluster native mount is really slow compared to nfs
CC:gluster-***@gluster.org;
To:Vijay Bellur <***@redhat.com>;


PS: I just tested between these 2:

 
mount -t glusterfs -o negative-timeout=1,use-readdirp=no,log-level=WARNING,log-file=/var/log/glusterxxx.log 192.168.140.41:/www /var/www
 mount -t glusterfs -o use-readdirp=no,log-level=WARNING,log-file=/var/log/glusterxxx.log 192.168.140.41:/www /var/www
 So it means only 1 second negative timeout...
 In this particular test: ./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 50000 --file-size 64 --record-size 64

  The result is about 4 seconds with the negative timeout of 1 second defined and many many minutes without the negative timeout (I quit after 15 minutes of waiting)
 I will go over to some real world tests now to see how it performs there.
  Regards
Jo

 
 

 
_______________________________________________
Gluster-users mailing list
Gluster-***@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
