Raghavendra Gowdappa
2018-02-09 08:21:34 UTC
+gluster-users
Another guideline we can provide is to disable all performance xlators for workloads requiring strict metadata consistency (even for non-gluster-block use cases like a native FUSE mount). Note that we might still be able to keep a few perf xlators turned on, but that will require some experimentation. The safest and easiest approach is to turn off the following xlators (a sample command sequence follows the list):
* performance.read-ahead
* performance.write-behind
* performance.readdir-ahead and performance.parallel-readdir
* performance.quick-read
* performance.stat-prefetch
* performance.io-cache
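For example, assuming a volume named "testvol" (a placeholder; substitute your own volume name), the options can be disabled from any node in the trusted storage pool with gluster volume set:

  # Turn off the client-side performance xlators listed above.
  gluster volume set testvol performance.read-ahead off
  gluster volume set testvol performance.write-behind off
  gluster volume set testvol performance.readdir-ahead off
  gluster volume set testvol performance.parallel-readdir off
  gluster volume set testvol performance.quick-read off
  gluster volume set testvol performance.stat-prefetch off
  gluster volume set testvol performance.io-cache off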
performance.open-behind can be left turned on if the application doesn't rely on a file remaining accessible through an fd opened on one mount point after the file is deleted from a different mount point. As far as metadata inconsistencies go, I am not aware of any issues with performance.open-behind.
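To illustrate that semantic, here is a rough two-client sketch (the mount paths are made up for the example). POSIX semantics keep the file readable through the already-open fd after the unlink, but with open-behind the brick-side open is deferred and can fail instead:

  # Client A: open an fd on the file and hold it.
  exec 3< /mnt/glusterA/data.txt
  # Client B (a different mount of the same volume): delete the file.
  rm /mnt/glusterB/data.txt
  # Client A: with open-behind off this still returns the old contents;
  # with open-behind on, the deferred open may fail with ENOENT.
  cat <&3
  exec 3<&-   # close the fd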
Please note that, as has been pointed out by different mails in this thread, the perf xlators are one part (albeit a larger one) of the bigger problem of metadata inconsistency.
regards,
Raghavendra
----- Original Message -----
Sent: Friday, February 9, 2018 1:34:25 PM
Subject: Re: [Gluster-devel] Glusterfs and Structured data
As far as I remember, the content around structured data in the Install Guide is from a FAQ that was being circulated in Gluster, Inc. indicating the startup's market positioning. Most of that was based on not wanting to get into performance-based comparisons of storage systems that are frequently seen in the structured data space.
There are challenges that distributed storage systems face when exposed to applications that were written for a local filesystem interface. We have encountered problems with applications like tar [4] that are not in the realm of "Structured data". If we look at the common theme across all these problems, it is related to metadata & read-after-write consistency issues with the default translator stack that gets exposed on the client side. While the default stack is optimal for other scenarios, it does seem that a category of applications needing strict metadata consistency is not well served by it. We have observed that disabling a few performance translators and tuning cache timeouts for VFS/FUSE have helped to overcome some of them. The WIP effort on timestamp consistency across the translator stack, the patches that have been merged as a result of the bugs that you mention & other fixes for outstanding issues should certainly help in catering to these workloads better with the file interface.
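As a minimal sketch of the FUSE-side tuning mentioned above (server and volume names are placeholders), the kernel attribute and entry cache timeouts can be set to zero at mount time so that metadata is revalidated on every lookup:

  mount -t glusterfs -o attribute-timeout=0,entry-timeout=0 \
      server1:/testvol /mnt/glusterfs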
There are deployments that I have come across where glusterfs is used for storing structured data. gluster-block & qemu-libgfapi overcome the metadata consistency problem by exposing a file as a block device & by disabling most of the performance translators in the default stack. Workloads that have been deemed problematic with the file interface for the reasons alluded to above function well with the block interface.
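As an aside, recent glusterfs releases ship option group profiles; assuming the gluster-block profile file is present under /var/lib/glusterd/groups/, those block-oriented defaults (including turning off most perf xlators) can be applied in one step, with the volume name again a placeholder:

  gluster volume set testvol group gluster-block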
I agree that gluster-block, due to its usage of a subset of glusterfs fops (mostly reads/writes, I guess), runs into fewer consistency issues. However, as you've mentioned, we seem to have disabled the perf xlator stack in our tests/use-cases till now. Note that the perf xlator stack is one of the worst offenders as far as metadata consistency is concerned (with relatively fewer scenarios of data inconsistency). So, I wonder:
* what would be the scenario if we enable the perf xlator stack for gluster-block?
tcmu-runner opens block devices with O_DIRECT. So enabling perf xlators for gluster-block would not make a difference, as translators like io-cache & read-ahead do not enable caching for an open() with O_DIRECT. In addition, since the bulk of the operations happen to be reads & writes on large files with gluster-block, md-cache & quick-read are not appropriate for the stack that tcmu-runner operates on.
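A quick way to observe that pass-through from a mount point (the path is hypothetical, and the kernel/FUSE combination must support O_DIRECT on the mount):

  # Direct I/O requests bypass io-cache/read-ahead rather than being cached.
  dd if=/dev/zero of=/mnt/glusterfs/blockfile bs=1M count=16 oflag=direct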
* Is performance on gluster-block satisfactory so that we don't need these xlators?
  - Or is it that these xlators are not useful for the workload usually run on gluster-block (for a random read/write workload, read/write caching xlators offer little or no advantage)?
  - Or should the workload theoretically benefit from perf xlators, but we don't see that benefit in our results (there are open bugs to this effect)?
Owing to the reasons mentioned above, most performance xlators do not seem very useful for gluster-block workloads.
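One way to answer the performance question empirically is gluster's built-in profiling, which reports per-fop latency numbers that can be compared with the perf xlators toggled on and off (the volume name is a placeholder):

  gluster volume profile testvol start   # begin collecting per-brick fop stats
  # ... run the gluster-block workload ...
  gluster volume profile testvol info    # show per-fop latency/count numbers
  gluster volume profile testvol stop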
Regards,
Vijay
Subject: Re: [Gluster-devel] Glusterfs and Structured data
All,
One of our users pointed us to the documentation saying that glusterfs is not good for storing "Structured data" [1], while discussing an issue [2].
Does any of you have more context on the feasibility of storing "structured data" on Glusterfs? Is one of the reasons for such a suggestion the "staleness of metadata" as encountered in bugs like [3]?