[Gluster-users] Can't heal a volume: "Please check if all brick processes are running."

Discussion:

Anatoliy Dmytriyev

2018-03-12 14:58:10 UTC

Hello,

We have a very fresh gluster 3.10.10 installation.
Our volume was created as a distributed volume: 9 bricks, 96TB in total
(87TB after the 10% gluster disk space reservation).

For some reason I can't "heal" the volume:
# gluster volume heal gv0
Launching heal operation to perform index self heal on volume gv0 has
been unsuccessful on bricks that are down. Please check if all brick
processes are running.

Which processes need to be running on every brick for the heal operation?

# gluster volume status
Status of volume: gv0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick cn01-ib:/gfs/gv0/brick1/brick         0         49152      Y       70850
Brick cn02-ib:/gfs/gv0/brick1/brick         0         49152      Y       102951
Brick cn03-ib:/gfs/gv0/brick1/brick         0         49152      Y       57535
Brick cn04-ib:/gfs/gv0/brick1/brick         0         49152      Y       56676
Brick cn05-ib:/gfs/gv0/brick1/brick         0         49152      Y       56880
Brick cn06-ib:/gfs/gv0/brick1/brick         0         49152      Y       56889
Brick cn07-ib:/gfs/gv0/brick1/brick         0         49152      Y       56902
Brick cn08-ib:/gfs/gv0/brick1/brick         0         49152      Y       94920
Brick cn09-ib:/gfs/gv0/brick1/brick         0         49152      Y       56542

Task Status of Volume gv0
------------------------------------------------------------------------------
There are no active volume tasks

# gluster volume info gv0
Volume Name: gv0
Type: Distribute
Volume ID: 8becaf78-cf2d-4991-93bf-f2446688154f
Status: Started
Snapshot Count: 0
Number of Bricks: 9
Transport-type: rdma
Bricks:
Brick1: cn01-ib:/gfs/gv0/brick1/brick
Brick2: cn02-ib:/gfs/gv0/brick1/brick
Brick3: cn03-ib:/gfs/gv0/brick1/brick
Brick4: cn04-ib:/gfs/gv0/brick1/brick
Brick5: cn05-ib:/gfs/gv0/brick1/brick
Brick6: cn06-ib:/gfs/gv0/brick1/brick
Brick7: cn07-ib:/gfs/gv0/brick1/brick
Brick8: cn08-ib:/gfs/gv0/brick1/brick
Brick9: cn09-ib:/gfs/gv0/brick1/brick
Options Reconfigured:
client.event-threads: 8
performance.parallel-readdir: on
performance.readdir-ahead: on
cluster.nufa: on
nfs.disable: on

--
Best regards,
Anatoliy

Anatoliy Dmytriyev

2018-03-13 14:23:25 UTC

Hi,

Maybe someone can point me to documentation or explain this? I can't
find it myself.
Are there any other useful resources besides doc.gluster.org? As far as I
can see, many gluster options are not described there, or there is no
explanation of what they do.

--
Best regards,
Anatoliy

Karthik Subrahmanya

2018-03-13 15:46:19 UTC

Hi Anatoliy,

The heal command is used to heal mismatching contents between replica
copies of files.
For the command "gluster volume heal <volname>" to succeed, the
self-heal-daemon must be running,
which is the case only if your volume is of type replicate/disperse.
In your case you have a plain distribute volume, which does not store
replicas of any files.
So the volume heal command returns this error.
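
To make the volume-type point concrete, here is a minimal Python sketch that reads the text printed by "gluster volume info" and reports whether a heal would apply. It assumes that the "Type:" value of replicate/disperse volumes contains the word "Replicate" or "Disperse"; the helper names and the trimmed sample text are illustrative only and are not part of gluster.

```python
# Sketch: decide from `gluster volume info` text whether `heal` can apply.
# Assumption: healable volume types contain "replicate" or "disperse"
# in their "Type:" value. Names below are illustrative, not gluster APIs.
HEALABLE_KEYWORDS = ("replicate", "disperse")


def volume_type(info_text):
    """Return the value of the 'Type:' line, or '' if there is none."""
    for line in info_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Type":
            return value.strip()
    return ""


def heal_applies(info_text):
    """True if the volume type suggests the self-heal daemon is in use."""
    vtype = volume_type(info_text).lower()
    return any(word in vtype for word in HEALABLE_KEYWORDS)


# Trimmed copy of the output posted in this thread.
SAMPLE_INFO = """Volume Name: gv0
Type: Distribute
Status: Started
Number of Bricks: 9
"""

if __name__ == "__main__":
    print(volume_type(SAMPLE_INFO), heal_applies(SAMPLE_INFO))
```

For the gv0 volume in this thread the type is "Distribute", so the check reports that heal does not apply.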

Regards,
Karthik

_______________________________________________
Gluster-users mailing list
http://lists.gluster.org/mailman/listinfo/gluster-users

Laura Bailey

2018-03-13 23:03:42 UTC

Can we add a smarter error message for this situation by checking volume
type first?

Cheers,
Laura B

--
Laura Bailey
Senior Technical Writer
Customer Content Services BNE

Karthik Subrahmanya

2018-03-14 04:47:17 UTC

Post by Laura Bailey
Can we add a smarter error message for this situation by checking volume
type first?

Yes we can. I will do that.

Thanks,
Karthik

Anatoliy Dmytriyev

2018-03-14 10:06:15 UTC

Hi Karthik,

Thanks a lot for the explanation.

Does this mean that the health of a distributed volume can be checked only
with the "gluster volume status" command?

And one more question: cluster.min-free-disk is 10% by default. What
kind of "side effects" can we face if this option is reduced to,
for example, 5%? Could you point to any best practice document(s)?

Regards,

Anatoliy

Karthik Subrahmanya

2018-03-14 12:12:17 UTC

Post by Anatoliy Dmytriyev
Hi Karthik,
Thanks a lot for the explanation.
Does this mean that the health of a distributed volume can be checked only
with the "gluster volume status" command?

Yes. I am not aware of any other command that reports the status of a plain
distribute volume in the way the heal info command does for
replicate/disperse volumes.
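
As a rough illustration of checking health through "gluster volume status", the Python sketch below scans the status text for bricks whose Online column is not "Y". It assumes the one-row-per-brick layout posted in this thread (Brick, TCP port, RDMA port, Online, Pid); the function name is hypothetical and the cn02 row is altered to show an offline brick, it is not real output from this cluster.

```python
from typing import List


def offline_bricks(status_text: str) -> List[str]:
    """Return the host:path of every brick whose Online column is not 'Y'."""
    offline = []
    for line in status_text.splitlines():
        fields = line.split()
        # Expected row: Brick <host:path> <tcp port> <rdma port> <online> <pid>
        if len(fields) == 6 and fields[0] == "Brick":
            if fields[4] != "Y":
                offline.append(fields[1])
    return offline


# First row is from this thread; the second row is invented for illustration.
SAMPLE_STATUS = """Status of volume: gv0
Gluster process                             TCP Port  RDMA Port  Online  Pid
Brick cn01-ib:/gfs/gv0/brick1/brick         0         49152      Y       70850
Brick cn02-ib:/gfs/gv0/brick1/brick         N/A       N/A        N       N/A
"""

if __name__ == "__main__":
    for brick in offline_bricks(SAMPLE_STATUS):
        print("offline:", brick)
```

Rows wrapped across two lines, as some mail clients produce, would need to be rejoined before this parser sees them.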

Post by Anatoliy Dmytriyev
And one more question: cluster.min-free-disk is 10% by default. What kind
of "side effects" can we face if this option is reduced to, for
example, 5%? Could you point to any best practice document(s)?

Yes, you can decrease it to any value. There won't be any side effects.

Regards,
Karthik

Karthik Subrahmanya

2018-03-14 12:50:56 UTC

Post by Karthik Subrahmanya
Yes, you can decrease it to any value. There won't be any side effects.

Small correction here: min-free-disk should ideally be set larger than
the largest file likely to be written. Decreasing it beyond a point
raises the likelihood of a brick getting full, which is a very bad state
to be in.
I will update you if I find a document that explains this. Sorry for
the previous statement.
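
The rule of thumb above can be checked with simple arithmetic. The Python sketch below assumes the 96TB reported in this thread is split evenly across the 9 bricks and that min-free-disk is given as a percentage of each brick; the 0.8TB "largest file" is a made-up figure for illustration, and the function names are hypothetical.

```python
def reserve_per_brick_tb(total_tb, bricks, min_free_percent):
    """Space kept free on each brick, assuming equally sized bricks."""
    return total_tb / bricks * min_free_percent / 100.0


def reserve_covers_file(total_tb, bricks, min_free_percent, largest_file_tb):
    """True if the per-brick reserve is larger than the largest expected file."""
    reserve = reserve_per_brick_tb(total_tb, bricks, min_free_percent)
    return reserve > largest_file_tb


if __name__ == "__main__":
    # 96 TB over 9 bricks, from the first message; 0.8 TB is hypothetical.
    for percent in (10, 5):
        reserve = reserve_per_brick_tb(96, 9, percent)
        ok = reserve_covers_file(96, 9, percent, 0.8)
        print(f"min-free-disk {percent}%: {reserve:.2f} TB per brick, covers 0.8 TB file: {ok}")
```

Under these assumptions the reserve is a little over 1TB per brick at 10% and roughly half that at 5%, so whether 5% is safe depends on the largest files you expect to write.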

Anatoliy Dmytriyev

2018-03-14 13:22:44 UTC

Thanks
