Discussion:
[Gluster-users] KVM lockups on Gluster 4.1.1
Dmitry Melekhov
2018-08-28 05:54:52 UTC
Hello!


Yesterday we hit something like this on 4.1.2

Centos 7.5.


Volume is replicated - two bricks and one arbiter.


We rebooted the arbiter, waited for the heal to finish, and tried to live migrate a VM
to another node (we run VMs on the Gluster nodes):


[2018-08-27 09:56:22.085411] I [MSGID: 115029]
[server-handshake.c:763:server_setvolume] 0-pool-server: accepted client
from
CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-
client-6-RECON_NO:-0 (version: 4.1.2)
[2018-08-27 09:56:22.107609] I [MSGID: 115036]
[server.c:483:server_rpc_notify] 0-pool-server: disconnecting connection
from
CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-
client-6-RECON_NO:-0
[2018-08-27 09:56:22.107747] I [MSGID: 101055]
[client_t.c:444:gf_client_unref] 0-pool-server: Shutting down connection
CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-clien
t-6-RECON_NO:-0
[2018-08-27 09:58:37.905829] I [MSGID: 115036]
[server.c:483:server_rpc_notify] 0-pool-server: disconnecting connection
from
CTX_ID:c3eb6cfc-2ef9-470a-89d1-a87170d00da5-GRAPH_ID:0-PID:30292-HOST:father-PC_NAME:p
ool-client-6-RECON_NO:-0
[2018-08-27 09:58:37.905926] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28c831d8bc550000}
[2018-08-27 09:58:37.905959] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2870a7d6bc550000}
[2018-08-27 09:58:37.905979] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2880a7d6bc550000}
[2018-08-27 09:58:37.905997] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28f031d8bc550000}
[2018-08-27 09:58:37.906016] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28b07dd5bc550000}
[2018-08-27 09:58:37.906034] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28e0a7d6bc550000}
[2018-08-27 09:58:37.906056] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28b845d8bc550000}
[2018-08-27 09:58:37.906079] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2858a7d8bc550000}
[2018-08-27 09:58:37.906098] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2868a8d7bc550000}
[2018-08-27 09:58:37.906121] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28f80bd7bc550000}
...

[2018-08-27 09:58:37.907375] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28a8cdd6bc550000}
[2018-08-27 09:58:37.907393] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2880cdd6bc550000}
[2018-08-27 09:58:37.907476] I [socket.c:3837:socket_submit_reply]
0-tcp.pool-server: not connected (priv->connected = -1)
[2018-08-27 09:58:37.907520] E [rpcsvc.c:1378:rpcsvc_submit_generic]
0-rpc-service: failed to submit message (XID: 0xcb88cb, Program:
GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport
(tcp.pool-server)
[2018-08-27 09:58:37.910727] E [server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.910814] E [rpcsvc.c:1378:rpcsvc_submit_generic]
0-rpc-service: failed to submit message (XID: 0xcb88ce, Program:
GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport
(tcp.pool-server)
[2018-08-27 09:58:37.910861] E [server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.910904] E [rpcsvc.c:1378:rpcsvc_submit_generic]
0-rpc-service: failed to submit message (XID: 0xcb88cf, Program:
GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport
(tcp.pool-server)
[2018-08-27 09:58:37.910940] E [server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.910979] E [rpcsvc.c:1378:rpcsvc_submit_generic]
0-rpc-service: failed to submit message (XID: 0xcb88d1, Program:
GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport
(tcp.pool-server)
[2018-08-27 09:58:37.911012] E [server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.911050] E [rpcsvc.c:1378:rpcsvc_submit_generic]
0-rpc-service: failed to submit message (XID: 0xcb88d8, Program:
GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport
(tcp.pool-server)
[2018-08-27 09:58:37.911083] E [server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.916217] E [server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.916520] I [MSGID: 115013]
[server-helpers.c:286:do_fd_cleanup] 0-pool-server: fd cleanup on
/balamak.img


After this, I/O on /balamak.img was blocked.
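In case it helps others triage similar logs: a small throwaway parser (assuming the brick log format shown above; note the mail client wrapped some of those lines, so feed it unwrapped log lines) to tally the released locks per file and client pid:

```python
import re
from collections import Counter

# Matches the pl_inodelk_log_cleanup warnings above, e.g.:
#   ... releasing lock on <gfid> held by {client=0x..., pid=30292 lk-owner=28c8...}
LOCK_RE = re.compile(
    r"releasing lock on (?P<gfid>[0-9a-f-]+) held by "
    r"\{client=(?P<client>0x[0-9a-f]+), pid=(?P<pid>\d+) lk-owner=(?P<owner>[0-9a-f]+)\}"
)

def tally_released_locks(log_lines):
    """Count released locks per (gfid, pid) to see which client held them."""
    counts = Counter()
    for line in log_lines:
        m = LOCK_RE.search(line)
        if m:
            counts[(m.group("gfid"), m.group("pid"))] += 1
    return counts
```

In our case every one of those warnings points at the same gfid and pid 30292, i.e. the qemu process on the source node of the migration.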


The only solution we found was to reboot all 3 nodes.


Is there a bug report in Bugzilla we can add our logs to?

Is it possible to turn off these locks?

Thank you!
Amar Tumballi
2018-08-28 06:43:08 UTC
Post by Dmitry Melekhov
...
Is there a bug report in Bugzilla we can add our logs to?
Not aware of such bugs!
Post by Dmitry Melekhov
Is it possible to turn off these locks?
Not sure, will get back on this one!
Post by Dmitry Melekhov
Thank you!
_______________________________________________
Gluster-users mailing list
https://lists.gluster.org/mailman/listinfo/gluster-users
--
Amar Tumballi (amarts)
Dmitry Melekhov
2018-08-29 07:13:01 UTC
Post by Dmitry Melekhov
...
Not aware of such bugs!
Is it possible to turn off these locks?
Not sure, will get back on this one!
BTW, found this link:
https://docs.gluster.org/en/v3/Troubleshooting/troubleshooting-filelocks/

Tried it on another (test) cluster:

[***@marduk ~]# gluster volume statedump pool
Segmentation fault (core dumped)


4.1.2 too...

Something is wrong here.
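For what it's worth, the procedure in that doc boils down to something like this (a sketch only; I haven't verified the exact options on 4.1, and it obviously assumes statedump doesn't segfault):

```shell
# Sketch of the statedump/clear-locks flow from the troubleshooting doc
# (volume "pool" and /balamak.img are just this thread's examples).

# 1. Trigger a statedump; dump files land in /var/run/gluster by default.
gluster volume statedump pool

# 2. Look for blocked inode locks in the newest brick dump.
grep -B5 'BLOCKED' /var/run/gluster/*.dump.*

# 3. Clear the granted lock on the affected file
#    (the "0,0-0" range is taken from the doc's example).
gluster volume clear-locks pool /balamak.img kind granted inode 0,0-0
```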
Danny Lee
2018-10-01 19:09:59 UTC
Ran into this issue too on 4.1.5 with an arbiter setup. Also could not
run a statedump due to a segmentation fault.

Tried with 3.12.13 and had issues with locked files as well. We were able
to do a statedump and found that some of our files were "BLOCKED"
(xlator.features.locks.vol-locks.inode). Part of the statedump is attached.

Also tried clearing the locks using clear-locks, which did remove the lock,
but as soon as I tried to cat the file, it got locked again and the cat
process hung.
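For anyone else digging through dumps, here is a rough sketch of pulling the blocked entries out of a statedump (the exact layout varies between versions; the section and line shapes below follow the 3.12-style dumps we saw, so treat them as an assumption):

```python
def blocked_locks(dump_text):
    """Return (section, lock_line) pairs for lock entries marked BLOCKED.

    Assumed statedump shape (3.12-style):
        [xlator.features.locks.<vol>-locks.inode]
        path=/some/file
        inodelk.inodelk[0](BLOCKED)=type=WRITE, ..., pid = 123, owner=..., ...
    """
    section = None
    blocked = []
    for line in dump_text.splitlines():
        line = line.strip()
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]          # remember the current dump section
        elif "(BLOCKED)" in line and section and ".locks." in section:
            blocked.append((section, line))
    return blocked
```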
Post by Amar Tumballi
...
Dmitry Melekhov
2018-10-02 05:09:39 UTC
Ran into this issue too with 4.1.5 with an arbiter setup.  Also could
not run a statedump due to "Segmentation fault".
Tried with 3.12.13 and had issues with locked files as well.  We were
able to do a statedump and found that some of our files were "BLOCKED"
(xlator.features.locks.vol-locks.inode).  Attached part of statedump.
Also tried clearing the locks using clear-locks, which did remove the
lock, but as soon as I tried to cat the file, it got locked again and
the cat process hung.
I created an issue in Bugzilla, but I can't find it now :-(
Looks like there has been no activity since I sent all the logs...
Amar Tumballi
2018-10-02 08:59:46 UTC
Permalink
Recently, in one situation, we found that locks were not freed up
because the TCP timeout never triggered..

Can you try the option like below and let us know?

`gluster volume set $volname tcp-user-timeout 42`

(ref: https://review.gluster.org/21170/ )

Regards,
Amar
Dmitry Melekhov
2018-10-03 04:30:20 UTC
Permalink
Post by Amar Tumballi
Recently, in one situation, we found that locks were not freed up
because the TCP timeout never triggered..
Can you try the option like below and let us know?
`gluster volume set $volname tcp-user-timeout 42`
(ref: https://review.gluster.org/21170/ )
Regards,
Amar
Thank you, we'll try this.
Dmitry Melekhov
2018-10-03 06:01:55 UTC
Permalink
It doesn't work for some reason:

 gluster volume set pool tcp-user-timeout 42
volume set: failed: option : tcp-user-timeout does not exist
Did you mean tcp-user-timeout?


4.1.5.
Post by Dmitry Melekhov
Post by Amar Tumballi
Recently, in one of the situation, we found that locks were not freed
up due to not getting TCP timeout..
Can you try the option like below and let us know?
`gluster volume set $volname tcp-user-timeout 42`
(ref: https://review.gluster.org/21170/ )
Regards,
Amar
Thank you, we'll try this.
Post by Amar Tumballi
Ran into this issue too with 4.1.5 with an arbiter setup.  Also
could not run a statedump due to "Segmentation fault".
Tried with 3.12.13 and had issues with locked files as well.  We
were able to do a statedump and found that some of our files
were "BLOCKED" (xlator.features.locks.vol-locks.inode). Attached
part of statedump.
Also tried clearing the locks using clear-locks, which did
remove the lock, but as soon as I tried to cat the file, it got
locked again and the cat process hung.
I created issue in bugzilla, can't find it though :-(
Looks like there is no activity after I sent all logs...
On Tue, Aug 28, 2018 at 11:24 AM, Dmitry Melekhov
Hello!
Yesterday we hit something like this on 4.1.2
Centos 7.5.
Volume is replicated - two bricks and one arbiter.
We rebooted arbiter, waited for heal end,  and tried to
live migrate VM to another node ( we run VMs on gluster
[2018-08-27 09:56:22.085411] I [MSGID: 115029]
[server-handshake.c:763:server_setvolume]
0-pool-server: accepted client from
CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-
client-6-RECON_NO:-0 (version: 4.1.2)
[2018-08-27 09:56:22.107609] I [MSGID: 115036]
disconnecting connection from
CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-
client-6-RECON_NO:-0
[2018-08-27 09:56:22.107747] I [MSGID: 101055]
Shutting down connection
CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-clien
t-6-RECON_NO:-0
[2018-08-27 09:58:37.905829] I [MSGID: 115036]
disconnecting connection from
CTX_ID:c3eb6cfc-2ef9-470a-89d1-a87170d00da5-GRAPH_ID:0-PID:30292-HOST:father-PC_NAME:p
ool-client-6-RECON_NO:-0
[2018-08-27 09:58:37.905926] W
releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292
lk-owner=28c831d8bc550000}
[2018-08-27 09:58:37.905959] W
releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292
lk-owner=2870a7d6bc550000}
[2018-08-27 09:58:37.905979] W
releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292
lk-owner=2880a7d6bc550000}
[2018-08-27 09:58:37.905997] W
releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292
lk-owner=28f031d8bc550000}
[2018-08-27 09:58:37.906016] W
releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292
lk-owner=28b07dd5bc550000}
[2018-08-27 09:58:37.906034] W
releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292
lk-owner=28e0a7d6bc550000}
[2018-08-27 09:58:37.906056] W
releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292
lk-owner=28b845d8bc550000}
[2018-08-27 09:58:37.906079] W
releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292
lk-owner=2858a7d8bc550000}
[2018-08-27 09:58:37.906098] W
releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292
lk-owner=2868a8d7bc550000}
[2018-08-27 09:58:37.906121] W
releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292
lk-owner=28f80bd7bc550000}
...
[2018-08-27 09:58:37.907375] W
releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292
lk-owner=28a8cdd6bc550000}
[2018-08-27 09:58:37.907393] W
releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318
held by {client=0x7ffb58035bc0, pid=30292
lk-owner=2880cdd6bc550000}
[2018-08-27 09:58:37.907476] I
not connected (priv->connected = -1)
[2018-08-27 09:58:37.907520] E
GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to
rpc-transport (tcp.pool-server)
[2018-08-27 09:58:37.910727] E
[server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.910814] E
GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to
rpc-transport (tcp.pool-server)
[2018-08-27 09:58:37.910861] E
[server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.910904] E
GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to
rpc-transport (tcp.pool-server)
[2018-08-27 09:58:37.910940] E
[server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.910979] E
GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to
rpc-transport (tcp.pool-server)
[2018-08-27 09:58:37.911012] E
[server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.911050] E
GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to
rpc-transport (tcp.pool-server)
[2018-08-27 09:58:37.911083] E
[server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.916217] E
[server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.916520] I [MSGID: 115013]
[server-helpers.c:286:do_fd_cleanup] 0-pool-server: fd
cleanup on /balamak.img
after this I/O on  /balamak.img was blocked.
Only solution we found was to reboot all 3 nodes.
Is there any bug report in bugzilla we can add logs?
Not aware of such bugs!
Is it possible to turn of these locks?
Not sure, will get back on this one!
btw, found this link
https://docs.gluster.org/en/v3/Troubleshooting/troubleshooting-filelocks/
Segmentation fault (core dumped)
4.1.2 too...
something is wrong here.
Thank you!
_______________________________________________
Gluster-users mailing list
https://lists.gluster.org/mailman/listinfo/gluster-users
--
Amar Tumballi (amarts)
_______________________________________________
Gluster-users mailing list
https://lists.gluster.org/mailman/listinfo/gluster-users
_______________________________________________
Gluster-users mailing list
https://lists.gluster.org/mailman/listinfo/gluster-users
--
Amar Tumballi (amarts)
Amar Tumballi
2018-10-03 06:10:23 UTC
Permalink
Sorry! I should have been more specific. I overlooked the option:

---
[***@localhost ~]# gluster volume set demo1 tcp-user-timeout 42
volume set: failed: option : tcp-user-timeout does not exist
Did you mean tcp-user-timeout?
[***@localhost ~]# gluster volume set demo1 client.tcp-user-timeout 42
volume set: success
[***@localhost ~]# gluster volume set demo1 server.tcp-user-timeout 42
volume set: success
----
Looks like you need to set the option specifically on client and server.
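Putting the exchange together, the working sequence is the following (a sketch; "myvol" is a placeholder volume name, and `gluster volume get` is shown only as one way to confirm the values):

```shell
# The bare option name is ambiguous, so set it on both sides explicitly:
gluster volume set myvol client.tcp-user-timeout 42
gluster volume set myvol server.tcp-user-timeout 42

# Confirm the options took effect:
gluster volume get myvol client.tcp-user-timeout
gluster volume get myvol server.tcp-user-timeout
```

The idea is that with TCP_USER_TIMEOUT set, a dead peer's connection is torn down after ~42 seconds, letting the bricks release the stale inode locks instead of holding them indefinitely.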

Regards,
Post by Dmitry Melekhov
gluster volume set pool tcp-user-timeout 42
volume set: failed: option : tcp-user-timeout does not exist
Did you mean tcp-user-timeout?
4.1.5.
Recently, in one of the situation, we found that locks were not freed up
due to not getting TCP timeout..
Can you try the option like below and let us know?
`gluster volume set $volname tcp-user-timeout 42`
(ref: https://review.gluster.org/21170/ )
Regards,
Amar
Thank you, we'll try this.
Post by Danny Lee
Ran into this issue too on 4.1.5 with an arbiter setup. Also could not
run a statedump due to a "Segmentation fault".
Tried with 3.12.13 and had issues with locked files as well. We were
able to do a statedump and found that some of our files were "BLOCKED"
(xlator.features.locks.vol-locks.inode). Attached part of statedump.
Also tried clearing the locks using clear-locks, which did remove the
lock, but as soon as I tried to cat the file, it got locked again and the
cat process hung.
I created an issue in Bugzilla, but I can't find it now :-(
Looks like there is no activity after I sent all logs...
Post by Amar Tumballi
Post by Dmitry Melekhov
Hello!
Yesterday we hit something like this on 4.1.2
Centos 7.5.
Volume is replicated - two bricks and one arbiter.
We rebooted the arbiter, waited for the heal to finish, and tried to live migrate a VM
[2018-08-27 09:56:22.085411] I [MSGID: 115029]
[server-handshake.c:763:server_setvolume] 0-pool-server: accepted client
from
CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-
client-6-RECON_NO:-0 (version: 4.1.2)
[2018-08-27 09:56:22.107609] I [MSGID: 115036]
[server.c:483:server_rpc_notify] 0-pool-server: disconnecting connection
from
CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-
client-6-RECON_NO:-0
[2018-08-27 09:56:22.107747] I [MSGID: 101055]
[client_t.c:444:gf_client_unref] 0-pool-server: Shutting down connection
CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-clien
t-6-RECON_NO:-0
[2018-08-27 09:58:37.905829] I [MSGID: 115036]
[server.c:483:server_rpc_notify] 0-pool-server: disconnecting connection
from
CTX_ID:c3eb6cfc-2ef9-470a-89d1-a87170d00da5-GRAPH_ID:0-PID:30292-HOST:father-PC_NAME:p
ool-client-6-RECON_NO:-0
[2018-08-27 09:58:37.905926] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held
by {client=0x7ffb58035bc0, pid=30292 lk-owner=28c831d8bc550000}
[2018-08-27 09:58:37.905959] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held
by {client=0x7ffb58035bc0, pid=30292 lk-owner=2870a7d6bc550000}
[2018-08-27 09:58:37.905979] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held
by {client=0x7ffb58035bc0, pid=30292 lk-owner=2880a7d6bc550000}
[2018-08-27 09:58:37.905997] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held
by {client=0x7ffb58035bc0, pid=30292 lk-owner=28f031d8bc550000}
[2018-08-27 09:58:37.906016] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held
by {client=0x7ffb58035bc0, pid=30292 lk-owner=28b07dd5bc550000}
[2018-08-27 09:58:37.906034] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held
by {client=0x7ffb58035bc0, pid=30292 lk-owner=28e0a7d6bc550000}
[2018-08-27 09:58:37.906056] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held
by {client=0x7ffb58035bc0, pid=30292 lk-owner=28b845d8bc550000}
[2018-08-27 09:58:37.906079] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held
by {client=0x7ffb58035bc0, pid=30292 lk-owner=2858a7d8bc550000}
[2018-08-27 09:58:37.906098] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held
by {client=0x7ffb58035bc0, pid=30292 lk-owner=2868a8d7bc550000}
[2018-08-27 09:58:37.906121] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held
by {client=0x7ffb58035bc0, pid=30292 lk-owner=28f80bd7bc550000}
...
[2018-08-27 09:58:37.907375] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held
by {client=0x7ffb58035bc0, pid=30292 lk-owner=28a8cdd6bc550000}
[2018-08-27 09:58:37.907393] W [inodelk.c:610:pl_inodelk_log_cleanup]
0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held
by {client=0x7ffb58035bc0, pid=30292 lk-owner=2880cdd6bc550000}
[2018-08-27 09:58:37.907476] I [socket.c:3837:socket_submit_reply]
0-tcp.pool-server: not connected (priv->connected = -1)
[2018-08-27 09:58:37.907520] E [rpcsvc.c:1378:rpcsvc_submit_generic]
0-rpc-service: failed to submit message (XID: 0xcb88cb, Program: GlusterFS
4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
[2018-08-27 09:58:37.910727] E [server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.910814] E [rpcsvc.c:1378:rpcsvc_submit_generic]
0-rpc-service: failed to submit message (XID: 0xcb88ce, Program: GlusterFS
4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
[2018-08-27 09:58:37.910861] E [server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.910904] E [rpcsvc.c:1378:rpcsvc_submit_generic]
0-rpc-service: failed to submit message (XID: 0xcb88cf, Program: GlusterFS
4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
[2018-08-27 09:58:37.910940] E [server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.910979] E [rpcsvc.c:1378:rpcsvc_submit_generic]
0-rpc-service: failed to submit message (XID: 0xcb88d1, Program: GlusterFS
4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
[2018-08-27 09:58:37.911012] E [server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.911050] E [rpcsvc.c:1378:rpcsvc_submit_generic]
0-rpc-service: failed to submit message (XID: 0xcb88d8, Program: GlusterFS
4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
[2018-08-27 09:58:37.911083] E [server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.916217] E [server.c:137:server_submit_reply]
(-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084)
[0x7ffb64379084]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605
ba) [0x7ffb5fddf5ba]
-->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce)
[0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.916520] I [MSGID: 115013]
[server-helpers.c:286:do_fd_cleanup] 0-pool-server: fd cleanup on
/balamak.img
After this, I/O on /balamak.img was blocked.
The only solution we found was to reboot all 3 nodes.
Is there any bug report in Bugzilla where we can add logs?
Not aware of such bugs!
Post by Dmitry Melekhov
Is it possible to turn off these locks?
Not sure, will get back on this one!
btw, found this link
https://docs.gluster.org/en/v3/Troubleshooting/troubleshooting-filelocks/
Segmentation fault (core dumped)
4.1.2 too...
something is wrong here.
Post by Dmitry Melekhov
Thank you!
--
Amar Tumballi (amarts)
Dmitry Melekhov
2018-10-03 06:14:11 UTC
Permalink
Thank you very much!
We set this option, hope it will solve our problem.
Ravishankar N
2018-10-10 09:58:41 UTC
Permalink
Hi,
Sorry for the delay, should have gotten to this earlier. We uncovered
the issue in our internal QE testing and it is a regression.
Details and the patch are available in BZ 1637802. I'll backport it to
the release branches once it gets merged. Shout out to Pranith for
helping with the RCA!
Regards,
Ravi
Dmitry Melekhov
2018-10-10 10:02:28 UTC
Permalink
Thank you!
Hope the fix will be released soon.
Danny Lee
2018-10-10 11:50:03 UTC
Permalink
Great news! Awesome job, Pranith!

Do you have a link to the patch? I tried looking it up but had no luck.
Ravishankar N
2018-10-10 11:52:57 UTC
Permalink
https://bugzilla.redhat.com/show_bug.cgi?id=1637802

https://review.gluster.org/#/c/glusterfs/+/21380/
Danny Lee
2018-10-10 12:17:46 UTC
Permalink
Thank you, Ravishankar. We should probably mention it in the bugzilla
ticket Dmitry created.

https://bugzilla.redhat.com/show_bug.cgi?id=1622814