Hi,
Apologies for the late reply. My email filters are messed up, so I missed
this.
Answers to the questions around the shard algorithm are inline ...
Post by Ashayam Gupta
Hi Pranith,
Thanks for your reply. It would be helpful if you could help us with the
following questions with respect to sharding.
The gluster version we are using is *glusterfs 4.1.4* on Ubuntu 18.04.1 LTS.
- *Shards-Creation Algo*: We were interested in understanding the way
in which shards are distributed across bricks and nodes. Is it round-robin
or some other algorithm, and can we change this mechanism using some
config file? E.g. if we have 2 nodes, each node having 2 bricks, for a
total of 4 (2*2) bricks, how will the shards be distributed? Will the
distribution always be even? (The volume type in this case is plain.)
- *Sharding+Distributed-Volume*: Currently we are using a plain volume
with sharding enabled and we do not see an even distribution of shards
across bricks. Can we use sharding with a distributed volume to achieve a
better, more even distribution of shards? It would be helpful if you could
suggest the most efficient way of using sharding. Our goal is to have an
evenly distributed file system (we have large files, hence sharding), and
we are not concerned with replication as of now.
I think Raghavendra already answered the two questions above.
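For completeness, a minimal sketch of what enabling sharding on a plain
distributed volume looks like (the volume name, hostnames and brick paths
below are placeholders):

    gluster volume create testvol server1:/bricks/b1 server1:/bricks/b2 \
        server2:/bricks/b1 server2:/bricks/b2
    gluster volume set testvol features.shard on
    gluster volume start testvol

Each shard lives as a separate file under the hidden .shard directory, so
shards get placed across bricks by DHT's name hashing just like regular
files.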
- *Shard-Block-Size*: In case we change the
*features.shard-block-size* value from X -> Y after lots of data has
been populated, how does this affect the existing shards? Are they
auto-corrected as per the new size, do we need to run some commands to get
this done, or is this change even recommended?
Existing files will retain their shard-block-size. shard-block-size is a
property of a file that is set at the time of the file's creation (in the
form of an extended attribute "trusted.glusterfs.shard.block-size") and
remains the same through the lifetime of the file.
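For example, you can inspect that attribute directly on the base file on
one of the bricks (the brick path below is a placeholder; trusted.* xattrs
are visible on the brick backend, not through the client mount):

    # run against a brick backend path, not the client mount
    getfattr -n trusted.glusterfs.shard.block-size -e hex \
        /bricks/b1/path/to/file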
If you want the shard-block-size to be changed across these files, you'll
need to perform one of the two steps below (a sketch of option 2 follows
the list):
1. move the existing files from your glusterfs volume to a local fs and
then move them back into the volume.
2. copy the existing files to temporary filenames on the same volume
and rename them back to their original names.
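A rough sketch of option 2 from the client mount (the mount point and
file names are placeholders); the copy rewrites the file, so it picks up
the shard-block-size in effect on the volume at that time:

    # on the client mount; the copy is created with the volume's
    # current shard-block-size, the rename restores the original name
    cp /mnt/glustervol/bigfile /mnt/glustervol/bigfile.tmp
    mv /mnt/glustervol/bigfile.tmp /mnt/glustervol/bigfile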
In our tests with the VM store workload, we've found a 64MB
shard-block-size to be a good fit for both IO and self-heal performance.
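That would be set along these lines (the volume name is a placeholder;
the new value only applies to files created after the change):

    gluster volume set testvol features.shard-block-size 64MB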
Post by Ashayam Gupta
- *Rebalance-Shard*: As per the docs, whenever we add a new server/node
to the existing cluster we need to run the rebalance command. We would
like to know if there are any known issues with rebalancing when sharding
is enabled.
We did find some shard-DHT interop issues in rebalance in the past, again
in the supported VM storage use case. The good news is that the problems
known to us have been fixed, but their validation is still pending.
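For reference, the usual expansion sequence is add-brick followed by a
rebalance (hostnames, brick paths and the volume name are placeholders):

    gluster volume add-brick testvol server3:/bricks/b1 server3:/bricks/b2
    gluster volume rebalance testvol start
    gluster volume rebalance testvol status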
Post by Ashayam Gupta
We would highly appreciate it if you could point us to the latest sharding
docs; we tried to search but could not find anything better than this:
https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Features/shard/
The doc is still valid (except for minor changes in the To-Do list at the
bottom). But I agree, the answers to all of the questions you asked above
are well worth documenting. I'll fix this. Thanks for the feedback.
Let us know if you have any more questions or if you run into any problems.
Happy to help.
Also, since yours is a non-VM storage use case, I'd suggest that you try
sharding on a test cluster first before putting it into production. :)
-Krutika
Post by Ashayam Gupta
Thanks
Ashayam
On Thu, Sep 20, 2018 at 7:47 PM Pranith Kumar Karampuri <
On Wed, Sep 19, 2018 at 11:37 AM Ashayam Gupta <
Post by Ashayam Gupta
* Only 1 write-mount point as of now
* Read-Mount: Since we auto-scale our machines, this can be as many as
300-400 machines during peak times
* > "multiple concurrent reads means that reads will not happen until
the file is completely written to" Yes, in our current scenario we can
ensure that this is indeed the case.
But when you say it only supports a single-writer workload, we would like
to understand the following scenarios with respect to multiple writers and
the current behaviour of glusterfs with sharding:
- Multiple writers write to different files
When I say multiple writers, I mean multiple mounts. Since you were
saying earlier there is only one mount which does all writes, everything
should work as expected.
Post by Ashayam Gupta
- Multiple writers write to the same file
- they write to the same file but to different shards of the same file
- they write to the same file (no guarantee that they write to different
shards)
As long as the above happens from same mount, things should be fine.
Otherwise there could be problems.
Post by Ashayam Gupta
There might be some more cases which are known to you; it would be helpful
if you could describe those scenarios to us as well, or point us to the
relevant documents.
Also it would be helpful if you could suggest the most stable version of
glusterfs with the sharding feature, since we would like to use this in
production.
It has been stable for a while, so use any of the latest maintained
releases like 3.12.x or 4.1.x.
As I was mentioning already, sharding is mainly tested with
VM/gluster-block workloads, so there could be some corner cases with the
single-writer workload which we never ran into for the VM/block workloads
we test. But you may run into them. Do let us know if you find something
out of the ordinary and we can take a look. What I would suggest is to
use one of the maintained releases and run the workloads you have for some
time to test things out; once you feel confident, you can put it in
production.
HTH
Post by Ashayam Gupta
Thanks
Ashayam Gupta
On Tue, Sep 18, 2018 at 11:00 AM Pranith Kumar Karampuri <
On Mon, Sep 17, 2018 at 4:14 AM Ashayam Gupta <
Post by Ashayam Gupta
Hi All,
We are currently using glusterfs for storing large files with
write-once and multiple concurrent reads, and were interested in
understanding one of the features of glusterfs called sharding for our use
case.
So far, from the talk given by the developer
[http://youtu.be/aAlLy9k65Gw] and the git issue
[https://github.com/gluster/glusterfs/issues/290], we know that it was
developed with large VM images as the use case. The second link does talk
about more general-purpose usage, but we are not clear whether there are
issues if it is used for large non-VM files [which is the use case for
us].
Therefore it would be helpful if we could have some pointers or more
information about the more general use-case scenario for sharding, and any
shortcomings in case we use it for our scenario, which is large non-VM
files with write-once and multiple concurrent reads. Also, it would be
very helpful if you could suggest the best approach/settings for our use
case.
Sharding is developed for big-file use cases and at the moment only
supports a single-writer workload. I also added the maintainers for
sharding to the thread. Maybe giving a bit of detail about the access
pattern w.r.t. the number of mounts that are used for writing/reading
would be helpful. I am assuming write-once and multiple concurrent reads
means that reads will not happen until the file is completely written to.
Could you explain a bit more about the workload?
Post by Ashayam Gupta
Thanks
Ashayam Gupta
_______________________________________________
Gluster-users mailing list
https://lists.gluster.org/mailman/listinfo/gluster-users
--
Pranith