Discussion:
In a replica 2 server, file-updates on one server missing on the other server
A Ghoshal
2015-01-20 12:04:49 UTC
Permalink
Hello,

I am using the following replicated volume:

***@serv0:~> gluster v info replicated_vol

Volume Name: replicated_vol
Type: Replicate
Volume ID: 26d111e3-7e4c-479e-9355-91635ab7f1c2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: serv0:/mnt/bricks/replicated_vol/brick
Brick2: serv1:/mnt/bricks/replicated_vol/brick
Options Reconfigured:
diagnostics.client-log-level: INFO
network.ping-timeout: 10
nfs.enable-ino32: on
cluster.self-heal-daemon: on
nfs.disable: off
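(For the traces shown further down, the client log level was temporarily
raised from INFO with something along the lines of

gluster volume set replicated_vol diagnostics.client-log-level TRACE

and set back afterwards.)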

replicated_vol is mounted at /mnt/replicated_vol on both serv0 and serv1.
If I do the following on serv0:

***@serv0:~>echo "cranberries" > /mnt/replicated_vol/testfile
***@serv0:~>echo "tangerines" >> /mnt/replicated_vol/testfile

When I then check the state of the replicas in the bricks, I find that

***@serv0:~>cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
***@serv0:~>

***@serv1:~>cat /mnt/bricks/replicated_vol/brick/testfile
***@serv1:~>

As may be seen, the replica on serv1 is blank when I write into testfile
from serv0 (even though the file is created on both bricks).
Interestingly, if I write something to the file from serv1, the two
replicas become identical.

***@serv1:~>echo "artichokes" >> /mnt/replicated_vol/testfile

***@serv1:~>cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
artichokes
***@serv1:~>

***@serv0:~>cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
artichokes
***@serv0:~>
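
For completeness, the AFR pending changelog can also be read straight off
the bricks with getfattr - something like this on each server (the xattr
values below only illustrate the layout, they are not copied from my setup):

getfattr -d -m . -e hex /mnt/bricks/replicated_vol/brick/testfile

which should list, among others, entries of the form

trusted.gfid=0x...
trusted.afr.replicated_vol-client-0=0x000000000000000000000000
trusted.afr.replicated_vol-client-1=0x000000030000000000000000

A non-zero trusted.afr.*-client-1 value on the serv0 brick would mean serv0
is holding changes it still considers pending for the serv1 brick.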

So I dug into the logs a little after raising the diagnostic level, and
this is what I saw:

When I write on serv0 (bad case):

[2015-01-20 09:21:52.197704] T [fuse-bridge.c:546:fuse_lookup_resume]
0-glusterfs-fuse: 53027: LOOKUP
/testfl(f0a76987-8a42-47a2-b027-a823254b736b)
[2015-01-20 09:21:52.197959] D
[afr-common.c:131:afr_lookup_xattr_req_prepare]
0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
[2015-01-20 09:21:52.198006] T [rpc-clnt.c:1302:rpc_clnt_record]
0-replicated_vol-client-0: Auth Info: pid: 28151, uid: 0, gid: 0, owner:
0000000000000000
[2015-01-20 09:21:52.198024] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
456, payload: 360, rpc hdr: 96
[2015-01-20 09:21:52.198108] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x78163x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:21:52.198565] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-0: received rpc message (RPC XID: 0x78163x
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-0)
[2015-01-20 09:21:52.198640] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
[2015-01-20 09:21:52.198669] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
[2015-01-20 09:21:52.198681] D
[afr-self-heal-common.c:887:afr_mark_sources]
0-replicated_vol-replicate-0: Number of sources: 1
[2015-01-20 09:21:52.198694] D
[afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type]
0-replicated_vol-replicate-0: returning read_child: 0
[2015-01-20 09:21:52.198705] D
[afr-common.c:1380:afr_lookup_select_read_child]
0-replicated_vol-replicate-0: Source selected as 0 for /testfl
[2015-01-20 09:21:52.198720] D
[afr-common.c:1117:afr_lookup_build_response_params]
0-replicated_vol-replicate-0: Building lookup response from 0
[2015-01-20 09:21:52.198732] D
[afr-common.c:1732:afr_lookup_perform_self_heal]
0-replicated_vol-replicate-0: Only 1 child up - do not attempt to detect
self heal

When I write on serv1 (good case):

[2015-01-20 09:37:49.151506] T [fuse-bridge.c:546:fuse_lookup_resume]
0-glusterfs-fuse: 31212: LOOKUP
/testfl(f0a76987-8a42-47a2-b027-a823254b736b)
[2015-01-20 09:37:49.151683] D
[afr-common.c:131:afr_lookup_xattr_req_prepare]
0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
[2015-01-20 09:37:49.151726] T [rpc-clnt.c:1302:rpc_clnt_record]
0-replicated_vol-client-0: Auth Info: pid: 7599, uid: 0, gid: 0, owner:
0000000000000000
[2015-01-20 09:37:49.151744] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x39620x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record]
0-replicated_vol-client-1: Auth Info: pid: 7599, uid: 0, gid: 0, owner:
0000000000000000
[2015-01-20 09:37:49.151824] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x39563x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-1)
[2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-0)
[2015-01-20 09:37:49.152582] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
[2015-01-20 09:37:49.152596] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
[2015-01-20 09:37:49.152621] D
[afr-self-heal-common.c:887:afr_mark_sources]
0-replicated_vol-replicate-0: Number of sources: 1
[2015-01-20 09:37:49.152633] D
[afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type]
0-replicated_vol-replicate-0: returning read_child: 0
[2015-01-20 09:37:49.152644] D
[afr-common.c:1380:afr_lookup_select_read_child]
0-replicated_vol-replicate-0: Source selected as 0 for /testfl
[2015-01-20 09:37:49.152657] D
[afr-common.c:1117:afr_lookup_build_response_params]
0-replicated_vol-replicate-0: Building lookup response from 0

We see that when we write on serv1, the RPC request is sent to both
replicated_vol-client-0 and replicated_vol-client-1, whereas when we write
on serv0 the request is sent only to replicated_vol-client-0; in the latter
case the FUSE client appears to be unaware of the existence of client-1.
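
One way to cross-check this outside the logs might be to ask the bricks
which clients they have connected, and to look at the mount process's TCP
connections, e.g. (output format varies with the version):

gluster volume status replicated_vol clients
netstat -tnp | grep glusterfs

On serv0 I would expect the mount process to show an established connection
to the serv0 brick but none to the serv1 brick.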

I checked the logs a bit more. With trace logging turned on, I found many
instances of these messages on serv0 but NOT on serv1:

[2015-01-20 09:21:15.520784] T [fuse-bridge.c:681:fuse_attr_cbk]
0-glusterfs-fuse: 53011: LOOKUP() / => 1
[2015-01-20 09:21:17.683088] T [rpc-clnt.c:422:rpc_clnt_reconnect]
0-replicated_vol-client-1: attempting reconnect
[2015-01-20 09:21:17.683159] D [name.c:155:client_fill_address_family]
0-replicated_vol-client-1: address-family not specified, guessing it to be
inet from (remote-host: serv1)
[2015-01-20 09:21:17.683178] T
[name.c:225:af_inet_client_get_remote_sockaddr] 0-replicated_vol-client-1:
option remote-port missing in volume replicated_vol-client-1. Defaulting
to 24007
[2015-01-20 09:21:17.683191] T [common-utils.c:188:gf_resolve_ip6]
0-resolver: flushing DNS cache
[2015-01-20 09:21:17.683202] T [common-utils.c:195:gf_resolve_ip6]
0-resolver: DNS cache not present, freshly probing hostname: serv1
[2015-01-20 09:21:17.683814] D [common-utils.c:237:gf_resolve_ip6]
0-resolver: returning ip-192.168.24.81 (port-24007) for hostname: serv1
and port: 24007
[2015-01-20 09:21:17.684139] D [common-utils.c:257:gf_resolve_ip6]
0-resolver: next DNS query will return: ip-192.168.24.81 port-24007
[2015-01-20 09:21:17.684164] T [socket.c:731:__socket_nodelay]
0-replicated_vol-client-1: NODELAY enabled for socket 10
[2015-01-20 09:21:17.684177] T [socket.c:790:__socket_keepalive]
0-replicated_vol-client-1: Keep-alive enabled for socket 10, interval 2,
idle: 20
[2015-01-20 09:21:17.684236] W [common-utils.c:2247:gf_get_reserved_ports]
0-glusterfs: could not open the file
/proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info
(No such file or directory)
[2015-01-20 09:21:17.684253] W
[common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able to
get reserved ports, hence there is a possibility that glusterfs may
consume reserved port
[2015-01-20 09:21:17.684660] D [socket.c:605:__socket_shutdown]
0-replicated_vol-client-1: shutdown() returned -1. Transport endpoint is
not connected
[2015-01-20 09:21:17.684699] T
[rpc-clnt.c:519:rpc_clnt_connection_cleanup] 0-replicated_vol-client-1:
cleaning up state in transport object 0x68a630
[2015-01-20 09:21:17.684731] D [socket.c:486:__socket_rwv]
0-replicated_vol-client-1: EOF on socket
[2015-01-20 09:21:17.684750] W [socket.c:514:__socket_rwv]
0-replicated_vol-client-1: readv failed (No data available)
[2015-01-20 09:21:17.684766] D
[socket.c:1962:__socket_proto_state_machine] 0-replicated_vol-client-1:
reading from socket failed. Error (No data available), peer
(192.168.24.81:49198)

I could not find a 'remote-port' option in /var/lib/glusterd on either
peer. Could somebody tell me where this configuration is looked up from?
Also, sometime later, I rebooted serv0 and that seemed to solve the
problem. However, stop+start of replicated_vol and restart of
/etc/init.d/glusterd did NOT solve the problem.

Any help on this matter will be greatly appreciated as I need to provide
robustness assurances for our setup.

Thanks a lot,
Anirban

P.s. Additional details:
glusterfs version: 3.4.2
Linux kernel version: 2.6.34
Lindsay Mathieson
2015-01-20 12:16:23 UTC
Permalink
Post by A Ghoshal
replicated_vol is mounted at /mnt/replicated_vol on both serv0 and serv1.
The mounts - these are the base disk mounts?

To access the replicated file system you need to mount the gluster file
system itself:

mount -t glusterfs HOSTNAME-OR-IPADDRESS:/VOLNAME MOUNTDIR

e.g.

mount -t glusterfs server1:/test-volume /mnt/glusterfs
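
If you also want it mounted at boot, the usual approach is an /etc/fstab
entry along these lines:

server1:/test-volume /mnt/glusterfs glusterfs defaults,_netdev 0 0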

Good blog post for Ubuntu here:

http://www.jamescoyle.net/how-to/439-mount-a-glusterfs-volume
A Ghoshal
2015-01-20 12:36:40 UTC
Permalink
No, this is the fuse mount for the glusterfs volume:

***@serv0:~> df -TH /mnt/replicated_vol/
Filesystem Type Size Used Avail Use% Mounted on
serv0:replicated_vol
fuse.glusterfs 138M 15M 124M 11% /replicated_vol
***@serv0:~>

Thanks,
Anirban




Pranith Kumar Karampuri
2015-01-21 18:39:04 UTC
Permalink
hi,
Responses inline.

PS: You are chalkogen_oxygen?

Pranith
Post by A Ghoshal
We see that when you write on serv1, the RPC request is sent to both
replicated_vol-client-0 and replicated_vol-client-1, while when we
write on serv0, the request is sent only to replicated_vol-client-0,
and the FUse client is unaware of the presence of client-1 in the
latter case.
I checked a bit more in the logs. When I turn on my trace, I found
[2015-01-20 09:21:15.520784] T [fuse-bridge.c:681:fuse_attr_cbk]
0-glusterfs-fuse: 53011: LOOKUP() / => 1
[2015-01-20 09:21:17.683088] T [rpc-clnt.c:422:rpc_clnt_reconnect]
0-replicated_vol-client-1: attempting reconnect
[2015-01-20 09:21:17.683159] D [name.c:155:client_fill_address_family]
0-replicated_vol-client-1: address-family not specified, guessing it
to be inet from (remote-host: serv1)
[2015-01-20 09:21:17.683178] T
[name.c:225:af_inet_client_get_remote_sockaddr]
0-replicated_vol-client-1: option remote-port missing in volume
replicated_vol-client-1. Defaulting to 24007
[2015-01-20 09:21:17.683191] T [common-utils.c:188:gf_resolve_ip6]
0-resolver: flushing DNS cache
[2015-01-20 09:21:17.683202] T [common-utils.c:195:gf_resolve_ip6]
0-resolver: DNS cache not present, freshly probing hostname: serv1
[2015-01-20 09:21:17.683814] D [common-utils.c:237:gf_resolve_ip6]
serv1 and port: 24007
[2015-01-20 09:21:17.684139] D [common-utils.c:257:gf_resolve_ip6]
0-resolver: next DNS query will return: ip-192.168.24.81 port-24007
[2015-01-20 09:21:17.684164] T [socket.c:731:__socket_nodelay]
0-replicated_vol-client-1: NODELAY enabled for socket 10
[2015-01-20 09:21:17.684177] T [socket.c:790:__socket_keepalive]
0-replicated_vol-client-1: Keep-alive enabled for socket 10, interval
2, idle: 20
[2015-01-20 09:21:17.684236] W
[common-utils.c:2247:gf_get_reserved_ports] 0-glusterfs: could not
open the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting
reserved ports info (No such file or directory)
[2015-01-20 09:21:17.684253] W
[common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able
to get reserved ports, hence there is a possibility that glusterfs may
consume reserved port
The logs above suggest that the mount process couldn't assign a reserved
port because it couldn't find the file
/proc/sys/net/ipv4/ip_local_reserved_ports.

I guess the reboot of the machine fixed it. I wonder why it was not found
in the first place.
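
On kernels that do have it (2.6.35 and later, if I remember right), the
reserved list can be inspected and set via sysctl, for example:

cat /proc/sys/net/ipv4/ip_local_reserved_ports
sysctl -w net.ipv4.ip_local_reserved_ports=24007-24008

but by default it is simply empty.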

Pranith.
Post by A Ghoshal
[2015-01-20 09:21:17.684660] D [socket.c:605:__socket_shutdown]
0-replicated_vol-client-1: shutdown() returned -1. Transport endpoint
is not connected
[2015-01-20 09:21:17.684699] T
[rpc-clnt.c:519:rpc_clnt_connection_cleanup]
0-replicated_vol-client-1: cleaning up state in transport object 0x68a630
[2015-01-20 09:21:17.684731] D [socket.c:486:__socket_rwv]
0-replicated_vol-client-1: EOF on socket
[2015-01-20 09:21:17.684750] W [socket.c:514:__socket_rwv]
0-replicated_vol-client-1: readv failed (No data available)
[2015-01-20 09:21:17.684766] D
[socket.c:1962:__socket_proto_state_machine]
0-replicated_vol-client-1: reading from socket failed. Error (No data
available), peer (192.168.24.81:49198)
I could not find a 'remote-port' option in /var/lib/glusterd on either
peer. Could somebody tell me where this configuration is looked up
from? Also, sometime later, I rebooted serv0 and that seemed to solve
the problem. However, stop+start of replicated_vol and restart of
/etc/init.d/glusterd did NOT solve the problem.
Ignore that log. If no port is given in that volfile, it picks 24007 as
the port, which is the default port where glusterd 'listens'.
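
The volfile in question is the client volfile glusterd generates under
/var/lib/glusterd/vols/replicated_vol/ (e.g. replicated_vol-fuse.vol). Its
protocol/client subvolumes normally look roughly like this, with no
remote-port line in them:

volume replicated_vol-client-1
    type protocol/client
    option remote-host serv1
    option remote-subvolume /mnt/bricks/replicated_vol/brick
    option transport-type tcp
end-volume

The actual brick port is obtained from glusterd's portmapper on 24007 when
the client connects, so a missing remote-port there is expected.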
A Ghoshal
2015-01-22 08:37:44 UTC
Permalink
Hi Pranith,

Yes, the very same (***@yahoo.com). Justin Clift sent me a
mail a while back telling me that it is better if we all use our business
email addresses, so I made myself a new profile.

Glusterfs complains about /proc/sys/net/ipv4/ip_local_reserved_ports
because we use a really old Linux kernel (2.6.34) in which this feature is
not present. We plan to upgrade our Linux every so often, but each time we
are dissuaded from it by some compatibility issue or other. So we get this
log every time - on both good volumes and bad ones. What bothered me was
this (on serv1):

[2015-01-20 09:37:49.151744] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x39620x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record]
0-replicated_vol-client-1: Auth Info: pid: 7599, uid: 0, gid: 0, owner:
0000000000000000
[2015-01-20 09:37:49.151824] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x39563x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-1)
[2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-0)

When I write on the good server (serv1), we see that an RPC request is
sent to both client-0 and client-1, while when I write on the bad server
(serv0) the RPC request is sent only to client-0, so it is no wonder that
the writes are not synced over to serv1. Somehow I could not make the
daemon on serv0 understand that there are two up children and not just
one.
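
If it helps, next time this happens I can also take a statedump of the
mount process on serv0, which should show the per-child connection state -
something like

kill -USR1 <pid-of-the-glusterfs-mount-process>

and then look for the protocol/client sections (connected=0/1) in the dump
file, which I believe lands under /var/run/gluster/ by default.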

One additional detail - since we are using a kernel that's too old, we do
not have the (Anand Avati's?) FUSE readdirplus patches, either. I've
noticed that the fixes in the readdirplus version of glusterfs aren't
always guaranteed to be present in the non-readdirplus version of the
patches. I'd filed a bug around one such anomaly a while back, but never
got around to writing a patch for it (sorry!). Here it is:
https://bugzilla.redhat.com/show_bug.cgi?id=1062287

Maybe something on similar lines here?

Thanks,
Anirban

P.s. Please ignore the #Personal# in the subject line - we need to do that
to push mails to the public domain past the email filter safely.



Pranith Kumar Karampuri
2015-01-23 07:58:25 UTC
Permalink
Post by A Ghoshal
Hi Pranith,
Yes, the very same. Justin Clift sent me a mail a while back telling me
that it is better if we all use our business email addresses, so I made
myself a new profile.
Glusterfs complains about /proc/sys/net/ipv4/ip_local_reserved_ports
because we use a really old Linux kernel (2.6.34) wherein this feature
is not present. We plan to upgrade our Linux so often but each time we
are dissuaded from it by some compatibility issue or the other. So, we
get this log every time - on both good volumes and bad ones. What
Basically, to make connections to the servers (i.e. the bricks), clients
need to choose secure ports, i.e. ports less than 1024. Since this file is
not present, it is not binding to any port, as per the code I just checked.
There is an option called client-bind-insecure which bypasses this check. I
feel that is one way (probably the only way) to get around this. You have
to set the "server.allow-insecure on" volume option and the bind-insecure
option.
CC'ing ndevos, who seems to have helped someone set the bind-insecure
option correctly here (http://irclog.perlgeek.de/gluster/2014-04-09/text).
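
Roughly, the pieces involved would be (the client-side part is what that
IRC log walks through):

gluster volume set replicated_vol server.allow-insecure on

plus, in /etc/glusterfs/glusterd.vol on the servers,

    option rpc-auth-allow-insecure on

followed by a restart of glusterd, and then the client-bind-insecure option
for the mount itself.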

Pranith
Post by A Ghoshal
When I write on the good server (serv1), we see that an RPC request is
sent to both client-0 and client-1. While, when I write on the bad
server (serv0), the RPC request is sent only to client-0, which is why
it is no wonder that the writes are not synced over to serv1. Somehow
I could not make the daemon on serv0 understand that there are two
up-children and not just one.
One additional detail - since we are using a kernel that's too old, we
do not have the (Anand Avati's?) FUse readdirplus patches, either.
I've noticed that the fixes in the readdirplus version of glusterfs
aren't always guaranteed to be present on the non-readdirplus version
of the patches. I'd filed a bug around one such anomaly back, but
https://bugzilla.redhat.com/show_bug.cgi?id=1062287
I don't think this has anything to do with readdirplus.
A Ghoshal
2015-01-23 08:24:18 UTC
Permalink
Thanks a lot, Pranith.

We'll set this option on our test servers and keep the setup under
observation.

Thanks,
Anirban



[afr-self-heal-common.c:887:afr_mark_sources]
0-replicated_vol-replicate-0: Number of sources: 1
[2015-01-20 09:37:49.152633] D
[afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type]
0-replicated_vol-replicate-0: returning read_child: 0
[2015-01-20 09:37:49.152644] D
[afr-common.c:1380:afr_lookup_select_read_child]
0-replicated_vol-replicate-0: Source selected as 0 for /testfl
[2015-01-20 09:37:49.152657] D
[afr-common.c:1117:afr_lookup_build_response_params]
0-replicated_vol-replicate-0: Building lookup response from 0

We see that when you write on serv1, the RPC request is sent to both
replicated_vol-client-0 and replicated_vol-client-1, while when we write
on serv0, the request is sent only to replicated_vol-client-0, and the
FUse client is unaware of the presence of client-1 in the latter case.

I checked a bit more in the logs. When I turn on my trace, I found many
instances of these logs on serv0 but NOT on serv1:

[2015-01-20 09:21:15.520784] T [fuse-bridge.c:681:fuse_attr_cbk]
0-glusterfs-fuse: 53011: LOOKUP() / => 1
[2015-01-20 09:21:17.683088] T [rpc-clnt.c:422:rpc_clnt_reconnect]
0-replicated_vol-client-1: attempting reconnect
[2015-01-20 09:21:17.683159] D [name.c:155:client_fill_address_family]
0-replicated_vol-client-1: address-family not specified, guessing it to be
inet from (remote-host: serv1)
[2015-01-20 09:21:17.683178] T
[name.c:225:af_inet_client_get_remote_sockaddr] 0-replicated_vol-client-1:
option remote-port missing in volume replicated_vol-client-1. Defaulting
to 24007
[2015-01-20 09:21:17.683191] T [common-utils.c:188:gf_resolve_ip6]
0-resolver: flushing DNS cache
[2015-01-20 09:21:17.683202] T [common-utils.c:195:gf_resolve_ip6]
0-resolver: DNS cache not present, freshly probing hostname: serv1
[2015-01-20 09:21:17.683814] D [common-utils.c:237:gf_resolve_ip6]
0-resolver: returning ip-192.168.24.81 (port-24007) for hostname: serv1
and port: 24007
[2015-01-20 09:21:17.684139] D [common-utils.c:257:gf_resolve_ip6]
0-resolver: next DNS query will return: ip-192.168.24.81 port-24007
[2015-01-20 09:21:17.684164] T [socket.c:731:__socket_nodelay]
0-replicated_vol-client-1: NODELAY enabled for socket 10
[2015-01-20 09:21:17.684177] T [socket.c:790:__socket_keepalive]
0-replicated_vol-client-1: Keep-alive enabled for socket 10, interval 2,
idle: 20
[2015-01-20 09:21:17.684236] W [common-utils.c:2247:gf_get_reserved_ports]
0-glusterfs: could not open the file
/proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info
(No such file or directory)
[2015-01-20 09:21:17.684253] W
[common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able to
get reserved ports, hence there is a possibility that glusterfs may
consume reserved port
Logs above suggest that mount process couldn't assign a reserved port
because it couldn't find the file
/proc/sys/net/ipv4/ip_local_reserved_ports

I guess reboot of the machine fixed it. Wonder why it was not found in the
first place.

Pranith.
[2015-01-20 09:21:17.684660] D [socket.c:605:__socket_shutdown]
0-replicated_vol-client-1: shutdown() returned -1. Transport endpoint is
not connected
[2015-01-20 09:21:17.684699] T
[rpc-clnt.c:519:rpc_clnt_connection_cleanup] 0-replicated_vol-client-1:
cleaning up state in transport object 0x68a630
[2015-01-20 09:21:17.684731] D [socket.c:486:__socket_rwv]
0-replicated_vol-client-1: EOF on socket
[2015-01-20 09:21:17.684750] W [socket.c:514:__socket_rwv]
0-replicated_vol-client-1: readv failed (No data available)
[2015-01-20 09:21:17.684766] D
[socket.c:1962:__socket_proto_state_machine] 0-replicated_vol-client-1:
reading from socket failed. Error (No data available), peer
(192.168.24.81:49198)

I could not find a 'remote-port' option in /var/lib/glusterd on either
peer. Could somebody tell me where this configuration is looked up from?
Also, sometime later, I rebooted serv0 and that seemed to solve the
problem. However, stop+start of replicated_vol and restart of
/etc/init.d/glusterd did NOT solve the problem.
Ignore that log. If no port is given in that volfile, it picks 24007 as
the port, which is the default port where glusterd 'listens'
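
(For reference, the volfile the mount uses is generated by glusterd and a
copy normally lands under /var/lib/glusterd/vols/<volname>/ - the file
name below follows the usual convention but may differ by version. The
protocol/client sections in it carry the remote-host, and a remote-port
only if one was explicitly set.)

grep -A4 'protocol/client' \
    /var/lib/glusterd/vols/replicated_vol/replicated_vol-fuse.vol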


Any help on this matter will be greatly appreciated as I need to provide
robustness assurances for our setup.

Thanks a lot,
Anirban

P.s. Additional details:
glusterfs version: 3.4.2
Linux kernel version: 2.6.34
Pranith Kumar Karampuri
2015-01-23 08:27:41 UTC
Permalink
Post by A Ghoshal
Thanks a lot, Pranith.
We'll set this option on our test servers and keep the setup under
observation.
How did you get the bind-insecure option working?
I guess I will send a patch to make it 'volume set option'

Pranith
A Ghoshal
2015-01-23 09:17:45 UTC
Permalink
Oh, I didn't. I only read a fragment of the IRC log and assumed
--xlator-option would be enough. Apparently it's a lot more work...

I do have a query, though. These connections from one of our setups - are
these on secure ports? Or maybe I didn't get it the first time...

***@serv0:/root> ps -ef | grep replicated_vol
root 8851 25307 0 10:03 pts/2 00:00:00 grep replicated_vol
root 29751 1 4 Jan21 ? 01:47:20 /usr/sbin/glusterfsd -s
serv0 --volfile-id replicated_vol.serv0.mnt-bricks-replicated_vol-brick -p
/var/lib/glusterd/vols/_replicated_vol/run/serv0-mnt-bricks-replicated_vol-brick.pid
-S /var/run/dff9fa3c93e82f20103f2a3d91adc4a8.socket --brick-name
/mnt/bricks/replicated_vol/brick -l
/var/log/glusterfs/bricks/mnt-bricks-replicated_vol-brick.log
--xlator-option *-posix.glusterd-uuid=1a1d1ebc-4b92-428f-b66b-9c5efa49574d
--brick-port 49185 --xlator-option replicated_vol-server.listen-port=49185
root 30399 1 0 Jan21 ? 00:19:06 /usr/sbin/glusterfs
--volfile-id=replicated_vol --volfile-server=serv0 /mnt/replicated_vol

***@serv0:/root> netstat -p | grep 30399
tcp 0 0 serv0:969 serv0:49185 ESTABLISHED
30399/glusterfs
tcp 0 0 serv0:999 serv1:49159 ESTABLISHED
30399/glusterfs
tcp 0 0 serv0:1023 serv0:24007 ESTABLISHED
30399/glusterfs
***@serv0:/root>

Thanks again,
Anirban



From: Pranith Kumar Karampuri <***@redhat.com>
To: A Ghoshal <***@tcs.com>
Cc: gluster-***@gluster.org, Niels de Vos <***@redhat.com>
Date: 01/23/2015 01:58 PM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one server missing on the other server #Personal#




On 01/23/2015 01:54 PM, A Ghoshal wrote:
Thanks a lot, Pranith.

We'll set this option on our test servers and keep the setup under
observation.
How did you get the bind-insecure option working?
I guess I will send a patch to make it 'volume set option'

Pranith

Thanks,
Anirban



From: Pranith Kumar Karampuri <***@redhat.com>
To: A Ghoshal <***@tcs.com>
Cc: gluster-***@gluster.org, Niels de Vos <***@redhat.com>
Date: 01/23/2015 01:28 PM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one server missing on the other server #Personal#




On 01/22/2015 02:07 PM, A Ghoshal wrote:
Hi Pranith,

Yes, the very same (***@yahoo.com). Justin Clift sent me a
mail a while back telling me that it is better if we all use our business
email addresses so I made me a new profile.

Glusterfs complains about /proc/sys/net/ipv4/ip_local_reserved_ports
because we use a really old Linux kernel (2.6.34) wherein this feature is
not present. We plan to upgrade our Linux so often but each time we are
dissuaded from it by some compatibility issue or the other. So, we get
this log every time - on both good volumes and bad ones. What bothered me
was this (on serv1):
Basically to make the connections to servers i.e. bricks clients need to
choose secure ports i.e. port less than 1024. Since this file is not
present, it is not binding to any port as per the code I just checked.
There is an option called client-bind-insecure which bypasses this check.
I feel that is one (probably only way) to get around this.
You have to "volume set server.allow-insecure on" option and bind-insecure
option.
CC ndevos who seemed to have helped someone set bind-insecure option
correctly here (http://irclog.perlgeek.de/gluster/2014-04-09/text)

Pranith
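
(Roughly, the pieces involved would look like the sketch below. The volume
set command and the glusterd.vol option are the standard ones; how exactly
to turn on the client-side bind-insecure on 3.4.x is precisely what is
still being worked out in this thread, so that part is left out.)

# check whether the running kernel exposes the reserved-ports sysctl at all
ls /proc/sys/net/ipv4/ip_local_reserved_ports || echo "not supported by this kernel"

# let the bricks accept connections coming from ports >= 1024
gluster volume set replicated_vol server.allow-insecure on

# in /etc/glusterfs/glusterd.vol, inside the management volume block, add
#     option rpc-auth-allow-insecure on
# and restart glusterd afterwards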

Pranith Kumar Karampuri
2015-01-27 01:34:44 UTC
Permalink
Post by A Ghoshal
Oh, I didn't. I only read a fragment of the IRC log and assumed
--xlator-option would be enough. Apparently it's a lot more work...
I do have a query, though. These connections from one of our setups - are
these on secure ports? Or maybe I didn't get it the first time...
tcp 0 0 serv0:969 serv0:49185 ESTABLISHED
30399/glusterfs
tcp 0 0 serv0:999 serv1:49159 ESTABLISHED
30399/glusterfs
tcp 0 0 serv0:1023 serv0:24007 ESTABLISHED
30399/glusterfs
Seems to be. The ports are: 969, 999, 1023 all of which are < 1024.

Pranith
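
(A one-liner along these lines can double-check which local ports the
gluster client processes are bound to - anything below 1024 is a
privileged port. The awk fields assume the usual netstat -tnp output
layout.)

netstat -tnp | awk '/glusterfs/ { n = split($4, a, ":"); if (a[n] + 0 < 1024) print $4, $7 }'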
A Ghoshal
2015-02-02 23:54:14 UTC
Permalink
Hi Pranith,

I finally understood what you meant about the secure ports, because the
issue occurred in one of our setups once more. It seems one of the clients
on serv1 could not open a connection to the glusterfsd running on serv0.
I'd actually started a mail thread about it (believing it might be
something else, initially) here:
http://www.gluster.org/pipermail/gluster-users/2015-February/020465.html

I think I can write a rudimentary patch altering
af_inet_bind_to_port_lt_ceiling() so that it calls bind() with port 0,
rather than specifying a port explicitly, when client.bind-insecure is
set ...

Then I'd need a way to set server.allow-insecure using the CLI (or, if you
have already sent around the patch to do that like you said in the earlier
mail, do let me know). I'll keep you posted about it around here or on
[gluster-devel] if I can get it to work.

Thanks a lot,

Anirban



From: A Ghoshal/MUM/TCS
To: Pranith Kumar Karampuri <***@redhat.com>
Cc: gluster-***@gluster.org, Niels de Vos <***@redhat.com>
Date: 01/23/2015 02:45 PM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one server missing on the other server #Personal#
Sent by: A Ghoshal


Oh, I didn't I only read a fragment of the IRC log and assumed
--xlator-option would be enough. Apparently it's a lot more work....

I do have a query, though. These connections, from one of our setups, are
these on secure ports? Or, maybe I didn't get it the first time.....

***@serv0:/root> ps -ef | grep replicated_vol
root 8851 25307 0 10:03 pts/2 00:00:00 grep replicated_vol
root 29751 1 4 Jan21 ? 01:47:20 /usr/sbin/glusterfsd -s
serv0 --volfile-id replicated_vol.serv0.mnt-bricks-replicated_vol-brick -p
/var/lib/glusterd/vols/_replicated_vol/run/serv0-mnt-bricks-replicated_vol-brick.pid
-S /var/run/dff9fa3c93e82f20103f2a3d91adc4a8.socket --brick-name
/mnt/bricks/replicated_vol/brick -l
/var/log/glusterfs/bricks/mnt-bricks-replicated_vol-brick.log
--xlator-option *-posix.glusterd-uuid=1a1d1ebc-4b92-428f-b66b-9c5efa49574d
--brick-port 49185 --xlator-option replicated_vol-server.listen-port=49185
root 30399 1 0 Jan21 ? 00:19:06 /usr/sbin/glusterfs
--volfile-id=replicated_vol --volfile-server=serv0 /mnt/replicated_vol

***@serv0:/root> netstat -p | grep 30399
tcp 0 0 serv0:969 serv0:49185 ESTABLISHED
30399/glusterfs
tcp 0 0 serv0:999 serv1:49159 ESTABLISHED
30399/glusterfs
tcp 0 0 serv0:1023 serv0:24007 ESTABLISHED
30399/glusterfs
***@serv0:/root>

Thanks again,
Anirban




From: Pranith Kumar Karampuri <***@redhat.com>
To: A Ghoshal <***@tcs.com>
Cc: gluster-***@gluster.org, Niels de Vos <***@redhat.com>
Date: 01/23/2015 01:58 PM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one server missing on the other server #Personal#




On 01/23/2015 01:54 PM, A Ghoshal wrote:
Thanks a lot, Pranith.

We'll set this option on our test servers and keep the setup under
observation.
How did you get the bind-insecure option working?
I guess I will send a patch to make it a 'volume set' option.

Pranith

Thanks,
Anirban



From: Pranith Kumar Karampuri <***@redhat.com>
To: A Ghoshal <***@tcs.com>
Cc: gluster-***@gluster.org, Niels de Vos <***@redhat.com>
Date: 01/23/2015 01:28 PM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one server missing on the other server #Personal#




On 01/22/2015 02:07 PM, A Ghoshal wrote:
Hi Pranith,

Yes, the very same (***@yahoo.com). Justin Clift sent me a
mail a while back telling me that it is better if we all use our business
email addresses, so I made myself a new profile.

Glusterfs complains about /proc/sys/net/ipv4/ip_local_reserved_ports
because we use a really old Linux kernel (2.6.34) in which this feature is
not present. We keep planning to upgrade our Linux, but each time we are
dissuaded by some compatibility issue or other. So we get this log every
time - on both good volumes and bad ones. What bothered me was this (on
serv1):
Basically, to make connections to the servers (i.e. the bricks), clients need
to choose secure ports, i.e. ports less than 1024. Since this file is not
present, it is not binding to any port, as per the code I just checked.
There is an option called client-bind-insecure which bypasses this check.
I feel that is one way (probably the only way) to get around this.
You have to set the "server.allow-insecure on" volume option and the
bind-insecure option.
CC'ing ndevos, who seems to have helped someone set the bind-insecure option
correctly here (http://irclog.perlgeek.de/gluster/2014-04-09/text)

Pranith
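As a concrete sketch of the two options named above (CLI syntax assumed for
3.4.x - please verify against your version):

# allow the bricks to accept client connections from non-reserved (>1023) ports
gluster volume set replicated_vol server.allow-insecure on

# glusterd needs the equivalent line in /etc/glusterfs/glusterd.vol, then a glusterd restart:
#     option rpc-auth-allow-insecure on

client-bind-insecure has no 'volume set' form at this point (per the mails
above), so it would have to be added by hand to the protocol/client sections
of the client volfile.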

[2015-01-20 09:37:49.151744] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x39620x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record]
0-replicated_vol-client-1: Auth Info: pid: 7599, uid: 0, gid: 0, owner:
0000000000000000
[2015-01-20 09:37:49.151824] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x39563x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-1)
[2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-0)

When I write on the good server (serv1), we see that an RPC request is
sent to both client-0 and client-1, while when I write on the bad server
(serv0), the RPC request is sent only to client-0 - so it is no wonder
that the writes are not synced over to serv1. Somehow I could not make the
daemon on serv0 understand that there are two up children, and not just
one.

One additional detail - since we are using a kernel that's too old, we do
not have the (Anand Avati's?) FUSE readdirplus patches, either. I've
noticed that the fixes in the readdirplus version of glusterfs aren't
always guaranteed to be present in the non-readdirplus version of the
patches. I'd filed a bug around one such anomaly a while back, but never got
around to writing a patch for it (sorry!). Here it is:
https://bugzilla.redhat.com/show_bug.cgi?id=1062287
I don't think this has anything to do with readdirplus.

Maybe something along similar lines here?

Thanks,
Anirban

P.s. Please ignore the #Personal# in the subject line - we need to do that
to push mails to the public domain past the email filter safely.



From: Pranith Kumar Karampuri <***@redhat.com>
To: A Ghoshal <***@tcs.com>, gluster-***@gluster.org
Date: 01/22/2015 12:09 AM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one server missing on the other server



hi,
Responses inline.

PS: You are chalkogen_oxygen?

Pranith
On 01/20/2015 05:34 PM, A Ghoshal wrote:
Hello,

I am using the following replicated volume:

***@serv0:~> gluster v info replicated_vol

Volume Name: replicated_vol
Type: Replicate
Volume ID: 26d111e3-7e4c-479e-9355-91635ab7f1c2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: serv0:/mnt/bricks/replicated_vol/brick
Brick2: serv1:/mnt/bricks/replicated_vol/brick
Options Reconfigured:
diagnostics.client-log-level: INFO
network.ping-timeout: 10
nfs.enable-ino32: on
cluster.self-heal-daemon: on
nfs.disable: off

replicated_vol is mounted at /mnt/replicated_vol on both serv0 and serv1.
If I do the following on serv0:

***@serv0:~>echo "cranberries" > /mnt/replicated_vol/testfile
***@serv0:~>echo "tangerines" >> /mnt/replicated_vol/testfile

And then I check for the state of the replicas in the bricks, then I find
that

***@serv0:~>cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
***@serv0:~>

***@serv1:~>cat /mnt/bricks/replicated_vol/brick/testfile
***@serv1:~>

As may be seen, the replica on serv1 is blank, when I write into testfile
from serv0 (even though the file is created on both bricks).
Interestingly, if I write something to the file at serv1, then the two
replicas become identical.

***@serv1:~>echo "artichokes" >> /mnt/replicated_vol/testfile

***@serv1:~>cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
artichokes
***@serv1:~>

***@serv0:~>cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
artichokes
***@serv0:~>

So, I dabbled into the logs a little bit, after upping the diagnostic
level, and this is what I saw:

When I write on serv0 (bad case):

[2015-01-20 09:21:52.197704] T [fuse-bridge.c:546:fuse_lookup_resume]
0-glusterfs-fuse: 53027: LOOKUP
/testfl(f0a76987-8a42-47a2-b027-a823254b736b)
[2015-01-20 09:21:52.197959] D
[afr-common.c:131:afr_lookup_xattr_req_prepare]
0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
[2015-01-20 09:21:52.198006] T [rpc-clnt.c:1302:rpc_clnt_record]
0-replicated_vol-client-0: Auth Info: pid: 28151, uid: 0, gid: 0, owner:
0000000000000000
[2015-01-20 09:21:52.198024] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
456, payload: 360, rpc hdr: 96
[2015-01-20 09:21:52.198108] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x78163x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:21:52.198565] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-0: received rpc message (RPC XID: 0x78163x
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-0)
[2015-01-20 09:21:52.198640] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
[2015-01-20 09:21:52.198669] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
[2015-01-20 09:21:52.198681] D
[afr-self-heal-common.c:887:afr_mark_sources]
0-replicated_vol-replicate-0: Number of sources: 1
[2015-01-20 09:21:52.198694] D
[afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type]
0-replicated_vol-replicate-0: returning read_child: 0
[2015-01-20 09:21:52.198705] D
[afr-common.c:1380:afr_lookup_select_read_child]
0-replicated_vol-replicate-0: Source selected as 0 for /testfl
[2015-01-20 09:21:52.198720] D
[afr-common.c:1117:afr_lookup_build_response_params]
0-replicated_vol-replicate-0: Building lookup response from 0
[2015-01-20 09:21:52.198732] D
[afr-common.c:1732:afr_lookup_perform_self_heal]
0-replicated_vol-replicate-0: Only 1 child up - do not attempt to detect
self heal

When I write on serv1 (good case):

[2015-01-20 09:37:49.151506] T [fuse-bridge.c:546:fuse_lookup_resume]
0-glusterfs-fuse: 31212: LOOKUP
/testfl(f0a76987-8a42-47a2-b027-a823254b736b)
[2015-01-20 09:37:49.151683] D
[afr-common.c:131:afr_lookup_xattr_req_prepare]
0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
[2015-01-20 09:37:49.151726] T [rpc-clnt.c:1302:rpc_clnt_record]
0-replicated_vol-client-0: Auth Info: pid: 7599, uid: 0, gid: 0, owner:
0000000000000000
[2015-01-20 09:37:49.151744] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x39620x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record]
0-replicated_vol-client-1: Auth Info: pid: 7599, uid: 0, gid: 0, owner:
0000000000000000
[2015-01-20 09:37:49.151824] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x39563x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-1)
[2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-0)
[2015-01-20 09:37:49.152582] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
[2015-01-20 09:37:49.152596] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
[2015-01-20 09:37:49.152621] D
[afr-self-heal-common.c:887:afr_mark_sources]
0-replicated_vol-replicate-0: Number of sources: 1
[2015-01-20 09:37:49.152633] D
[afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type]
0-replicated_vol-replicate-0: returning read_child: 0
[2015-01-20 09:37:49.152644] D
[afr-common.c:1380:afr_lookup_select_read_child]
0-replicated_vol-replicate-0: Source selected as 0 for /testfl
[2015-01-20 09:37:49.152657] D
[afr-common.c:1117:afr_lookup_build_response_params]
0-replicated_vol-replicate-0: Building lookup response from 0

We see that when you write on serv1, the RPC request is sent to both
replicated_vol-client-0 and replicated_vol-client-1, while when we write
on serv0, the request is sent only to replicated_vol-client-0, and the
FUse client is unaware of the presence of client-1 in the latter case.

I checked a bit more in the logs. When I turn on my trace, I found many
instances of these logs on serv0 but NOT on serv1:

[2015-01-20 09:21:15.520784] T [fuse-bridge.c:681:fuse_attr_cbk]
0-glusterfs-fuse: 53011: LOOKUP() / => 1
[2015-01-20 09:21:17.683088] T [rpc-clnt.c:422:rpc_clnt_reconnect]
0-replicated_vol-client-1: attempting reconnect
[2015-01-20 09:21:17.683159] D [name.c:155:client_fill_address_family]
0-replicated_vol-client-1: address-family not specified, guessing it to be
inet from (remote-host: serv1)
[2015-01-20 09:21:17.683178] T
[name.c:225:af_inet_client_get_remote_sockaddr] 0-replicated_vol-client-1:
option remote-port missing in volume replicated_vol-client-1. Defaulting
to 24007
[2015-01-20 09:21:17.683191] T [common-utils.c:188:gf_resolve_ip6]
0-resolver: flushing DNS cache
[2015-01-20 09:21:17.683202] T [common-utils.c:195:gf_resolve_ip6]
0-resolver: DNS cache not present, freshly probing hostname: serv1
[2015-01-20 09:21:17.683814] D [common-utils.c:237:gf_resolve_ip6]
0-resolver: returning ip-192.168.24.81 (port-24007) for hostname: serv1
and port: 24007
[2015-01-20 09:21:17.684139] D [common-utils.c:257:gf_resolve_ip6]
0-resolver: next DNS query will return: ip-192.168.24.81 port-24007
[2015-01-20 09:21:17.684164] T [socket.c:731:__socket_nodelay]
0-replicated_vol-client-1: NODELAY enabled for socket 10
[2015-01-20 09:21:17.684177] T [socket.c:790:__socket_keepalive]
0-replicated_vol-client-1: Keep-alive enabled for socket 10, interval 2,
idle: 20
[2015-01-20 09:21:17.684236] W [common-utils.c:2247:gf_get_reserved_ports]
0-glusterfs: could not open the file
/proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info
(No such file or directory)
[2015-01-20 09:21:17.684253] W
[common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able to
get reserved ports, hence there is a possibility that glusterfs may
consume reserved port
Logs above suggest that mount process couldn't assign a reserved port
because it couldn't find the file
/proc/sys/net/ipv4/ip_local_reserved_ports

I guess reboot of the machine fixed it. Wonder why it was not found in the
first place.

Pranith.
[2015-01-20 09:21:17.684660] D [socket.c:605:__socket_shutdown]
0-replicated_vol-client-1: shutdown() returned -1. Transport endpoint is
not connected
[2015-01-20 09:21:17.684699] T
[rpc-clnt.c:519:rpc_clnt_connection_cleanup] 0-replicated_vol-client-1:
cleaning up state in transport object 0x68a630
[2015-01-20 09:21:17.684731] D [socket.c:486:__socket_rwv]
0-replicated_vol-client-1: EOF on socket
[2015-01-20 09:21:17.684750] W [socket.c:514:__socket_rwv]
0-replicated_vol-client-1: readv failed (No data available)
[2015-01-20 09:21:17.684766] D
[socket.c:1962:__socket_proto_state_machine] 0-replicated_vol-client-1:
reading from socket failed. Error (No data available), peer
(192.168.24.81:49198)

I could not find a 'remote-port' option in /var/lib/glusterd on either
peer. Could somebody tell me where this configuration is looked up from?
Also, sometime later, I rebooted serv0 and that seemed to solve the
problem. However, stop+start of replicated_vol and restart of
/etc/init.d/glusterd did NOT solve the problem.
Ignore that log. If no port is given in that volfile, it picks 24007 as
the port, which is the default port where glusterd 'listens'


Any help on this matter will be greatly appreciated as I need to provide
robustness assurances for our setup.

Thanks a lot,
Anirban

P.s. Additional details:
glusterfs version: 3.4.2
Linux kernel version: 2.6.34
A Ghoshal
2015-02-04 20:33:19 UTC
Permalink
Ok, more updates here:

I turned on trace, and it seems the bind to a secure port on the mount client
for the connection to the remote brick succeeds - afterwards, the connect()
fails to complete. I saw these logs:

[2015-02-03 12:11:33.832615] T [rpc-clnt.c:422:rpc_clnt_reconnect]
0-replicated_vol-1: attempting reconnect
[2015-02-03 12:11:33.832666] D [name.c:155:client_fill_address_family]
0-replicated_vol-1: address-family not specified, guessing it to be inet
from (remote-host: serv1)
[2015-02-03 12:11:33.832683] T
[name.c:225:af_inet_client_get_remote_sockaddr] 0-replicated_vol-1: option
remote-port missing in volume replicated_vol-1. Defaulting to 24007
[2015-02-03 12:11:33.833083] D [common-utils.c:237:gf_resolve_ip6]
0-resolver: returning ip-192.168.24.81 (port-24007) for hostname: serv1
and port: 24007
[2015-02-03 12:11:33.833113] T [socket.c:731:__socket_nodelay]
0-replicated_vol-1: NODELAY enabled for socket 10
[2015-02-03 12:11:33.833128] T [socket.c:790:__socket_keepalive]
0-replicated_vol-1: Keep-alive enabled for socket 10, interval 2, idle: 20
[2015-02-03 12:11:33.833188] W [common-utils.c:2247:gf_get_reserved_ports]
0-glusterfs: could not open the file
/proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info
(No such file or directory)
[2015-02-03 12:11:33.833204] W
[common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able to
get reserved ports, hence there is a possibility that glusterfs may
consume reserved port
[2015-02-03 12:11:33.833560] D [socket.c:605:__socket_shutdown]
0-replicated_vol-1: shutdown() returned -1. Transport endpoint is not
connected
[2015-02-03 12:11:33.833594] T
[rpc-clnt.c:519:rpc_clnt_connection_cleanup] 0-replicated_vol-1: cleaning
up state in transport object 0x6887a0
[2015-02-03 12:11:33.833622] D [socket.c:486:__socket_rwv]
0-replicated_vol-1: EOF on socket
[2015-02-03 12:11:33.833635] W [socket.c:514:__socket_rwv]
0-replicated_vol-1: readv failed (No data available)
[2015-02-03 12:11:33.833647] D
[socket.c:1962:__socket_proto_state_machine] 0-replicated_vol-1: reading
from socket failed. Error (No data available), peer (192.168.24.81:49175)
[2015-02-03 12:11:33.833659] D [socket.c:2236:socket_event_handler]
0-transport: disconnecting now
[2015-02-03 12:11:33.833684] T
[rpc-clnt.c:519:rpc_clnt_connection_cleanup] 0-replicated_vol-1: cleaning
up state in transport object 0x6887a0
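(A cross-check that might help here, assuming the standard CLI and the
netstat used earlier in this thread: compare the port the client keeps
dialling with the port the brick actually listens on.

# run on serv1
gluster volume status replicated_vol
netstat -ntlp | grep glusterfsd

If the client keeps dialling a port that does not appear in that output, it
is reusing a stale brick port.)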

After that, I attached gdb to the mount process and set a breakpoint at
__socket_shutdown(). It gave me the following backtrace:

Breakpoint 1, __socket_shutdown (this=0x68a7f0) at socket.c:594
594 socket.c: No such file or directory.
in socket.c
(gdb) bt
#0 __socket_shutdown (this=0x68a7f0) at socket.c:594
#1 0x00007f3391906553 in __socket_disconnect (this=0x68a7f0) at
socket.c:623
#2 0x00007f339190856a in socket_connect_finish (this=0x68a7f0) at
socket.c:2160
#3 0x00007f339190dc04 in socket_event_handler (fd=<value optimized out>,
idx=<value optimized out>, data=0x68a7f0,
poll_in=1, poll_out=4, poll_err=<value optimized out>) at
socket.c:2223
#4 0x00007f3395aa219f in event_dispatch_epoll_handler
(event_pool=0x62d8b0) at event-epoll.c:384
#5 event_dispatch_epoll (event_pool=0x62d8b0) at event-epoll.c:445
#6 0x0000000000406b06 in main (argc=4, argv=0x7fff71108c38) at
glusterfsd.c:1934
(gdb)

In case somebody could give me a clue or two on my next move (I'm looking
at the code in the meantime), I'd be very grateful.

Thanks,
Anirban



A Ghoshal
2015-02-04 21:36:02 UTC
Permalink
Sorry for spamming you guys, but this is kind of important for me to
debug, so if you have seen anything like this before, do let me know. Here's
an update:

It seems the mount client is attempting a connection to an invalid port
number: 49175 is NOT the port number of glusterfsd on serv1
(192.168.24.81).

I took an strace:

[pid 31026] open("/proc/sys/net/ipv4/ip_local_reserved_ports", O_RDONLY) =
-1 ENOENT (No such file or directory)
[pid 31026] write(4, "[2015-02-04 20:39:02.793154] W ["..., 215) = 215
[pid 31026] write(4, "[2015-02-04 20:39:02.793289] W ["..., 194) = 194
[pid 31026] bind(10, {sa_family=AF_INET, sin_port=htons(1023),
sin_addr=inet_addr("192.168.24.80")}, 16) = 0
[pid 31026] fcntl(10, F_GETFL) = 0x2 (flags O_RDWR)
[pid 31026] fcntl(10, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 31026] connect(10, {sa_family=AF_INET, sin_port=htons(49175),
sin_addr=inet_addr("192.168.24.81")}, 16) = -1 EINPROGRESS (Operation now
in progress)
[pid 31026] fcntl(10, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
[pid 31026] fcntl(10, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 31026] epoll_ctl(3, EPOLL_CTL_ADD, 10, {EPOLLIN|EPOLLPRI|EPOLLOUT,
{u32=10, u64=8589934602}}) = 0
[pid 31026] nanosleep({1, 0}, <unfinished ...>
[pid 31021] <... epoll_wait resumed> {{EPOLLIN|EPOLLOUT|EPOLLERR|EPOLLHUP,
{u32=10, u64=8589934602}}}, 257, 4294967295) = 1
[pid 31021] getsockopt(10, SOL_SOCKET, SO_ERROR, [29422518842425455], [4])
= 0
[pid 31021] shutdown(10, 2 /* send and receive */) = -1 ENOTCONN
(Transport endpoint is not connected)

This basically told me that the connection is attempted via a non-blocking
socket to port 49175. The errno from the failure is ECONNREFUSED, which
is what is expected.
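(For completeness, one way to confirm from serv1 that nothing is listening on
that port - netstat is assumed to be available there, as in the earlier
output:

netstat -ntlp | grep 49175

An empty result would line up with the ECONNREFUSED seen above.)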

807 in socket.c
(gdb) bt
#0 __socket_connect_finish (this=0x6887a0) at socket.c:807
#1 socket_connect_finish (this=0x6887a0) at socket.c:2147
#2 0x00007fc863de4c04 in socket_event_handler (fd=<value optimized out>,
idx=<value optimized out>, data=0x6887a0,
poll_in=1, poll_out=4, poll_err=<value optimized out>) at
socket.c:2223
#3 0x00007fc867f7919f in event_dispatch_epoll_handler
(event_pool=0x62db70) at event-epoll.c:384
#4 event_dispatch_epoll (event_pool=0x62db70) at event-epoll.c:445
#5 0x0000000000406b06 in main (argc=4, argv=0x7fff25302c38) at
glusterfsd.c:1934
(gdb) print *optval
Cannot access memory at address 0x6f
(gdb) print optval
$1 = 111

Note that this agrees with the following debug log:

[2015-02-03 12:11:33.833647] D
[socket.c:1962:__socket_proto_state_machine] 0-replicated_vol-1: reading
from socket failed. Error (No data available), peer (192.168.24.81:49175)

There is, of course, no service running on port 49175. In fact, the listen
port of the corresponding glusterfsd on serv1 is 49206. Where does the mount
client pick this port number from? I know that if I kill and restart the
mount client on serv0 from the command line, the problem disappears. So it's
not something that is wrong with the processes on serv1...
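(For the record, the kill-and-remount workaround mentioned above would look
roughly like this - mount point and volfile server taken from the ps output
earlier in the thread, exact steps assumed:

umount /mnt/replicated_vol
/usr/sbin/glusterfs --volfile-id=replicated_vol --volfile-server=serv0 /mnt/replicated_vol

Presumably the fresh mount fetches a new volfile and port map, which is why
the stale 49175 goes away.)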

Thanks,
Anirban


From: A Ghoshal/MUM/TCS
To: A Ghoshal <***@tcs.com>
Cc: gluster-***@gluster.org, gluster-users-***@gluster.org,
Pranith Kumar Karampuri <***@redhat.com>
Date: 02/05/2015 02:03 AM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one server missing on the other server #Personal#
Sent by: A Ghoshal


Ok, more updates here:

I turned on trace and it seems bind to secure port on the mount client
with the remote brick is successful - afterwards the connect() fails to
complete. I saw these logs:

[2015-02-03 12:11:33.832615] T [rpc-clnt.c:422:rpc_clnt_reconnect]
0-replicated_vol-1: attempting reconnect
[2015-02-03 12:11:33.832666] D [name.c:155:client_fill_address_family]
0-replicated_vol-1: address-family not specified, guessing it to be inet
from (remote-host: serv1)
[2015-02-03 12:11:33.832683] T
[name.c:225:af_inet_client_get_remote_sockaddr] 0-replicated_vol-1: option
remote-port missing in volume replicated_vol-1. Defaulting to 24007
[2015-02-03 12:11:33.833083] D [common-utils.c:237:gf_resolve_ip6]
0-resolver: returning ip-192.168.24.81 (port-24007) for hostname: serv1
and port: 24007
[2015-02-03 12:11:33.833113] T [socket.c:731:__socket_nodelay]
0-replicated_vol-1: NODELAY enabled for socket 10
[2015-02-03 12:11:33.833128] T [socket.c:790:__socket_keepalive]
0-replicated_vol-1: Keep-alive enabled for socket 10, interval 2, idle: 20
[2015-02-03 12:11:33.833188] W [common-utils.c:2247:gf_get_reserved_ports]
0-glusterfs: could not open the file
/proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info
(No such file or directory)
[2015-02-03 12:11:33.833204] W
[common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able to
get reserved ports, hence there is a possibility that glusterfs may
consume reserved port
[2015-02-03 12:11:33.833560] D [socket.c:605:__socket_shutdown]
0-replicated_vol-1: shutdown() returned -1. Transport endpoint is not
connected
[2015-02-03 12:11:33.833594] T
[rpc-clnt.c:519:rpc_clnt_connection_cleanup] 0-replicated_vol-1: cleaning
up state in transport object 0x6887a0
[2015-02-03 12:11:33.833622] D [socket.c:486:__socket_rwv]
0-replicated_vol-1: EOF on socket
[2015-02-03 12:11:33.833635] W [socket.c:514:__socket_rwv]
0-replicated_vol-1: readv failed (No data available)
[2015-02-03 12:11:33.833647] D
[socket.c:1962:__socket_proto_state_machine] 0-replicated_vol-1: reading
from socket failed. Error (No data available), peer (192.168.24.81:49175)
[2015-02-03 12:11:33.833659] D [socket.c:2236:socket_event_handler]
0-transport: disconnecting now
[2015-02-03 12:11:33.833684] T
[rpc-clnt.c:519:rpc_clnt_connection_cleanup] 0-replicated_vol-1: cleaning
up state in transport object 0x6887a0

After that, I attached the mount process to my gdb session and attached a
breakpoint at __socket_shutdown(). It gave me the following traceback

Breakpoint 1, __socket_shutdown (this=0x68a7f0) at socket.c:594
594 socket.c: No such file or directory.
in socket.c
(gdb) bt
#0 __socket_shutdown (this=0x68a7f0) at socket.c:594
#1 0x00007f3391906553 in __socket_disconnect (this=0x68a7f0) at
socket.c:623
#2 0x00007f339190856a in socket_connect_finish (this=0x68a7f0) at
socket.c:2160
#3 0x00007f339190dc04 in socket_event_handler (fd=<value optimized out>,
idx=<value optimized out>, data=0x68a7f0,
poll_in=1, poll_out=4, poll_err=<value optimized out>) at
socket.c:2223
#4 0x00007f3395aa219f in event_dispatch_epoll_handler
(event_pool=0x62d8b0) at event-epoll.c:384
#5 event_dispatch_epoll (event_pool=0x62d8b0) at event-epoll.c:445
#6 0x0000000000406b06 in main (argc=4, argv=0x7fff71108c38) at
glusterfsd.c:1934
(gdb)

In case somebody could give me a clue or two on my next move (I'm looking
at the code in the meantime), I'd be very grateful.

Thanks,
Anirban




From: A Ghoshal <***@tcs.com>
To: Pranith Kumar Karampuri <***@redhat.com>
Cc: gluster-***@gluster.org
Date: 02/03/2015 05:24 AM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one server missing on the other server #Personal#
Sent by: gluster-users-***@gluster.org



Hi Pranith,

I finally understood what you meant the secure ports, because the issue
occurred in one of our setups once more. It seems one of the clients on
serv1 could not open a connection to the glusterfsd running on serv0. I'd
actually started a mail trail about it (believing it might be something
else, initially) here:
http://www.gluster.org/pipermail/gluster-users/2015-February/020465.html

I think I can write me a rudimentary kind of patch altering
af_inet_bind_to_port_lt_ceiling() to get it to call bind with port 0,
rather than specify a port explicitly when the client.bind-insecure is
specified ...

Then I'd need to create a way to set server.allow-insecure using the cli
(or if you already sent around the patch to do that like you said in the
earlier mail, do let me know). I'll keep you posted about it round here or
@ [gluster-devel] if I can get it to work.

Thanks a lot,

Anirban



From: A Ghoshal/MUM/TCS
To: Pranith Kumar Karampuri <***@redhat.com>
Cc: gluster-***@gluster.org, Niels de Vos <***@redhat.com>
Date: 01/23/2015 02:45 PM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one server missing on the other server #Personal#
Sent by: A Ghoshal


Oh, I didn't I only read a fragment of the IRC log and assumed
--xlator-option would be enough. Apparently it's a lot more work....

I do have a query, though. These connections, from one of our setups, are
these on secure ports? Or, maybe I didn't get it the first time.....

***@serv0:/root> ps -ef | grep replicated_vol
root 8851 25307 0 10:03 pts/2 00:00:00 grep replicated_vol
root 29751 1 4 Jan21 ? 01:47:20 /usr/sbin/glusterfsd -s
serv0 --volfile-id replicated_vol.serv0.mnt-bricks-replicated_vol-brick -p
/var/lib/glusterd/vols/_replicated_vol/run/serv0-mnt-bricks-replicated_vol-brick.pid
-S /var/run/dff9fa3c93e82f20103f2a3d91adc4a8.socket --brick-name
/mnt/bricks/replicated_vol/brick -l
/var/log/glusterfs/bricks/mnt-bricks-replicated_vol-brick.log
--xlator-option *-posix.glusterd-uuid=1a1d1ebc-4b92-428f-b66b-9c5efa49574d
--brick-port 49185 --xlator-option replicated_vol-server.listen-port=49185

root 30399 1 0 Jan21 ? 00:19:06 /usr/sbin/glusterfs
--volfile-id=replicated_vol --volfile-server=serv0 /mnt/replicated_vol

***@serv0:/root> netstat -p | grep 30399
tcp 0 0 serv0:969 serv0:49185 ESTABLISHED
30399/glusterfs
tcp 0 0 serv0:999 serv1:49159 ESTABLISHED
30399/glusterfs
tcp 0 0 serv0:1023 serv0:24007 ESTABLISHED
30399/glusterfs
***@serv0:/root>

Thanks again,
Anirban




From: Pranith Kumar Karampuri <***@redhat.com>
To: A Ghoshal <***@tcs.com>
Cc: gluster-***@gluster.org, Niels de Vos <***@redhat.com>
Date: 01/23/2015 01:58 PM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one server missing on the other server #Personal#




On 01/23/2015 01:54 PM, A Ghoshal wrote:
Thanks a lot, Pranith.

We'll set this option on our test servers and keep the setup under
observation.
How did you get the bind-insecure option working?
I guess I will send a patch to make it 'volume set option'

Pranith

Thanks,
Anirban



From: Pranith Kumar Karampuri <***@redhat.com>
To: A Ghoshal <***@tcs.com>
Cc: gluster-***@gluster.org, Niels de Vos <***@redhat.com>
Date: 01/23/2015 01:28 PM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one server missing on the other server #Personal#




On 01/22/2015 02:07 PM, A Ghoshal wrote:
Hi Pranith,

Yes, the very same (***@yahoo.com). Justin Clift sent me a
mail a while back telling me that it is better if we all use our business
email addresses so I made me a new profile.

Glusterfs complains about /proc/sys/net/ipv4/ip_local_reserved_ports
because we use a really old Linux kernel (2.6.34) wherein this feature is
not present. We plan to upgrade our Linux so often but each time we are
dissuaded from it by some compatibility issue or the other. So, we get
this log every time - on both good volumes and bad ones. What bothered me
was this (on serv1):
Basically to make the connections to servers i.e. bricks clients need to
choose secure ports i.e. port less than 1024. Since this file is not
present, it is not binding to any port as per the code I just checked.
There is an option called client-bind-insecure which bypasses this check.
I feel that is one (probably only way) to get around this.
You have to "volume set server.allow-insecure on" option and bind-insecure
option.
CC ndevos who seemed to have helped someone set bind-insecure option
correctly here (http://irclog.perlgeek.de/gluster/2014-04-09/text)

Pranith

[2015-01-20 09:37:49.151744] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x39620x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record]
0-replicated_vol-client-1: Auth Info: pid: 7599, uid: 0, gid: 0, owner:
0000000000000000
[2015-01-20 09:37:49.151824] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x39563x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-1)
[2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-0)

When I write on the good server (serv1), we see that an RPC request is
sent to both client-0 and client-1. While, when I write on the bad server
(serv0), the RPC request is sent only to client-0, which is why it is no
wonder that the writes are not synced over to serv1. Somehow I could not
make the daemon on serv0 understand that there are two up-children and not
just one.

One additional detail - since we are using a kernel that's too old, we do
not have the (Anand Avati's?) FUse readdirplus patches, either. I've
noticed that the fixes in the readdirplus version of glusterfs aren't
always guaranteed to be present on the non-readdirplus version of the
patches. I'd filed a bug around one such anomaly back, but never got
around to writing a patch for it (sorry!) Here it is:
https://bugzilla.redhat.com/show_bug.cgi?id=1062287
I don't this has anything to do with readdirplus.

Maybe something on similar lines here?

Thanks,
Anirban

P.s. Please ignore the #Personal# in the subject line - we need to do that
to push mails to the public domain past the email filter safely.



From: Pranith Kumar Karampuri <***@redhat.com>
To: A Ghoshal <***@tcs.com>, gluster-***@gluster.org
Date: 01/22/2015 12:09 AM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one server missing on the other server



hi,
Responses inline.

PS: You are chalkogen_oxygen?

Pranith
On 01/20/2015 05:34 PM, A Ghoshal wrote:
Hello,

I am using the following replicated volume:

***@serv0:~> gluster v info replicated_vol

Volume Name: replicated_vol
Type: Replicate
Volume ID: 26d111e3-7e4c-479e-9355-91635ab7f1c2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: serv0:/mnt/bricks/replicated_vol/brick
Brick2: serv1:/mnt/bricks/replicated_vol/brick
Options Reconfigured:
diagnostics.client-log-level: INFO
network.ping-timeout: 10
nfs.enable-ino32: on
cluster.self-heal-daemon: on
nfs.disable: off

replicated_vol is mounted at /mnt/replicated_vol on both serv0 and serv1.
If I do the following on serv0:

***@serv0:~>echo "cranberries" > /mnt/replicated_vol/testfile
***@serv0:~>echo "tangerines" >> /mnt/replicated_vol/testfile

And then I check for the state of the replicas in the bricks, then I find
that

***@serv0:~>cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
***@serv0:~>

***@serv1:~>cat /mnt/bricks/replicated_vol/brick/testfile
***@serv1:~>

As may be seen, the replica on serv1 is blank, when I write into testfile
from serv0 (even though the file is created on both bricks).
Interestingly, if I write something to the file at serv1, then the two
replicas become identical.

***@serv1:~>echo "artichokes" >> /mnt/replicated_vol/testfile

***@serv1:~>cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
artichokes
***@serv1:~>

***@serv0:~>cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
artichokes
***@serv0:~>

So, I dabbled into the logs a little bit, after upping the diagnostic
level, and this is what I saw:

When I write on serv0 (bad case):

[2015-01-20 09:21:52.197704] T [fuse-bridge.c:546:fuse_lookup_resume]
0-glusterfs-fuse: 53027: LOOKUP
/testfl(f0a76987-8a42-47a2-b027-a823254b736b)
[2015-01-20 09:21:52.197959] D
[afr-common.c:131:afr_lookup_xattr_req_prepare]
0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
[2015-01-20 09:21:52.198006] T [rpc-clnt.c:1302:rpc_clnt_record]
0-replicated_vol-client-0: Auth Info: pid: 28151, uid: 0, gid: 0, owner:
0000000000000000
[2015-01-20 09:21:52.198024] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
456, payload: 360, rpc hdr: 96
[2015-01-20 09:21:52.198108] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x78163x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:21:52.198565] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-0: received rpc message (RPC XID: 0x78163x
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-0)
[2015-01-20 09:21:52.198640] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
[2015-01-20 09:21:52.198669] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
[2015-01-20 09:21:52.198681] D
[afr-self-heal-common.c:887:afr_mark_sources]
0-replicated_vol-replicate-0: Number of sources: 1
[2015-01-20 09:21:52.198694] D
[afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type]
0-replicated_vol-replicate-0: returning read_child: 0
[2015-01-20 09:21:52.198705] D
[afr-common.c:1380:afr_lookup_select_read_child]
0-replicated_vol-replicate-0: Source selected as 0 for /testfl
[2015-01-20 09:21:52.198720] D
[afr-common.c:1117:afr_lookup_build_response_params]
0-replicated_vol-replicate-0: Building lookup response from 0
[2015-01-20 09:21:52.198732] D
[afr-common.c:1732:afr_lookup_perform_self_heal]
0-replicated_vol-replicate-0: Only 1 child up - do not attempt to detect
self heal

When I write on serv1 (good case):

[2015-01-20 09:37:49.151506] T [fuse-bridge.c:546:fuse_lookup_resume]
0-glusterfs-fuse: 31212: LOOKUP
/testfl(f0a76987-8a42-47a2-b027-a823254b736b)
[2015-01-20 09:37:49.151683] D
[afr-common.c:131:afr_lookup_xattr_req_prepare]
0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
[2015-01-20 09:37:49.151726] T [rpc-clnt.c:1302:rpc_clnt_record]
0-replicated_vol-client-0: Auth Info: pid: 7599, uid: 0, gid: 0, owner:
0000000000000000
[2015-01-20 09:37:49.151744] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x39620x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record]
0-replicated_vol-client-1: Auth Info: pid: 7599, uid: 0, gid: 0, owner:
0000000000000000
[2015-01-20 09:37:49.151824] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x39563x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-1)
[2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-0)
[2015-01-20 09:37:49.152582] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
[2015-01-20 09:37:49.152596] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
[2015-01-20 09:37:49.152621] D
[afr-self-heal-common.c:887:afr_mark_sources]
0-replicated_vol-replicate-0: Number of sources: 1
[2015-01-20 09:37:49.152633] D
[afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type]
0-replicated_vol-replicate-0: returning read_child: 0
[2015-01-20 09:37:49.152644] D
[afr-common.c:1380:afr_lookup_select_read_child]
0-replicated_vol-replicate-0: Source selected as 0 for /testfl
[2015-01-20 09:37:49.152657] D
[afr-common.c:1117:afr_lookup_build_response_params]
0-replicated_vol-replicate-0: Building lookup response from 0

We see that when we write on serv1, the RPC request is sent to both
replicated_vol-client-0 and replicated_vol-client-1, whereas when we write
on serv0, the request is sent only to replicated_vol-client-0; in the
latter case the FUSE client appears to be unaware of the presence of client-1.

I checked a bit more in the logs. When I turned on trace, I found many
instances of these logs on serv0 but NOT on serv1:

[2015-01-20 09:21:15.520784] T [fuse-bridge.c:681:fuse_attr_cbk]
0-glusterfs-fuse: 53011: LOOKUP() / => 1
[2015-01-20 09:21:17.683088] T [rpc-clnt.c:422:rpc_clnt_reconnect]
0-replicated_vol-client-1: attempting reconnect
[2015-01-20 09:21:17.683159] D [name.c:155:client_fill_address_family]
0-replicated_vol-client-1: address-family not specified, guessing it to be
inet from (remote-host: serv1)
[2015-01-20 09:21:17.683178] T
[name.c:225:af_inet_client_get_remote_sockaddr] 0-replicated_vol-client-1:
option remote-port missing in volume replicated_vol-client-1. Defaulting
to 24007
[2015-01-20 09:21:17.683191] T [common-utils.c:188:gf_resolve_ip6]
0-resolver: flushing DNS cache
[2015-01-20 09:21:17.683202] T [common-utils.c:195:gf_resolve_ip6]
0-resolver: DNS cache not present, freshly probing hostname: serv1
[2015-01-20 09:21:17.683814] D [common-utils.c:237:gf_resolve_ip6]
0-resolver: returning ip-192.168.24.81 (port-24007) for hostname: serv1
and port: 24007
[2015-01-20 09:21:17.684139] D [common-utils.c:257:gf_resolve_ip6]
0-resolver: next DNS query will return: ip-192.168.24.81 port-24007
[2015-01-20 09:21:17.684164] T [socket.c:731:__socket_nodelay]
0-replicated_vol-client-1: NODELAY enabled for socket 10
[2015-01-20 09:21:17.684177] T [socket.c:790:__socket_keepalive]
0-replicated_vol-client-1: Keep-alive enabled for socket 10, interval 2,
idle: 20
[2015-01-20 09:21:17.684236] W [common-utils.c:2247:gf_get_reserved_ports]
0-glusterfs: could not open the file
/proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info
(No such file or directory)
[2015-01-20 09:21:17.684253] W
[common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able to
get reserved ports, hence there is a possibility that glusterfs may
consume reserved port
Logs above suggest that the mount process couldn't assign a reserved port
because it couldn't find the file
/proc/sys/net/ipv4/ip_local_reserved_ports.

I guess a reboot of the machine fixed it. I wonder why the file was not found
in the first place.

Pranith.
[2015-01-20 09:21:17.684660] D [socket.c:605:__socket_shutdown]
0-replicated_vol-client-1: shutdown() returned -1. Transport endpoint is
not connected
[2015-01-20 09:21:17.684699] T
[rpc-clnt.c:519:rpc_clnt_connection_cleanup] 0-replicated_vol-client-1:
cleaning up state in transport object 0x68a630
[2015-01-20 09:21:17.684731] D [socket.c:486:__socket_rwv]
0-replicated_vol-client-1: EOF on socket
[2015-01-20 09:21:17.684750] W [socket.c:514:__socket_rwv]
0-replicated_vol-client-1: readv failed (No data available)
[2015-01-20 09:21:17.684766] D
[socket.c:1962:__socket_proto_state_machine] 0-replicated_vol-client-1:
reading from socket failed. Error (No data available), peer
(192.168.24.81:49198)

I could not find a 'remote-port' option in /var/lib/glusterd on either
peer. Could somebody tell me where this configuration is looked up from?
Also, some time later I rebooted serv0 and that seemed to solve the
problem. However, stopping and starting replicated_vol and restarting
/etc/init.d/glusterd did NOT solve the problem.
Ignore that log. If no port is given in that volfile, it picks 24007 as
the port, which is the default port on which glusterd listens.
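
(For what it's worth, on this version the client-side connection options come
from the FUSE client volfile that glusterd generates, not from a standalone
config file - so a missing remote-port there is normal. A small sketch of where
to look; the exact volfile filename is an assumption based on the usual
/var/lib/glusterd layout:)

# List the generated volfiles for the volume and inspect the client xlator
ls /var/lib/glusterd/vols/replicated_vol/
grep -A 6 'replicated_vol-client-1' /var/lib/glusterd/vols/replicated_vol/replicated_vol-fuse.vol
# If no "option remote-port" line is present, the client defaults to 24007
# (glusterd) and asks glusterd's portmapper for the brick's actual port.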


Any help on this matter will be greatly appreciated as I need to provide
robustness assurances for our setup.

Thanks a lot,
Anirban

P.s. Additional details:
glusterfs version: 3.4.2
Linux kernel version: 2.6.34
=====-----=====-----=====
Notice: The information contained in this e-mail
message and/or attachments to it may contain
confidential or privileged information. If you are
not the intended recipient, any dissemination, use,
review, distribution, printing or copying of the
information contained in this e-mail message
and/or attachments to it are strictly prohibited. If
you have received this communication in error,
please notify us by reply e-mail or telephone and
immediately and permanently delete the message
and any attachments. Thank you
A Ghoshal
2015-02-04 23:20:36 UTC
Permalink
CC gluster-users.

No, there aren't any firewall rules on our server. As I wrote in one of my
earlier emails, if I kill the mount client and remount the volume, the
problem disappears. That is to say, this causes the client to refresh the
remote port data, and from there everything's fine. Also, we don't use
gfapi - and bind() always succeeds.



From: Ben England <***@redhat.com>
To: A Ghoshal <***@tcs.com>
Date: 02/05/2015 04:40 AM
Subject: Re: [Gluster-devel] [Gluster-users] In a replica 2 server,
file-updates on one server missing on the other server #Personal#



Could it be a problem with iptables blocking connections? Do iptables
--list and make sure gluster ports are allowed through, at both ends.
Also, if you are using libgfapi, be sure you use rpc-auth-allow-insecure
if you have a lot of gfapi instances, or else you'll run into problems.

----- Original Message -----
Sent: Wednesday, February 4, 2015 6:07:10 PM
Subject: Re: [Gluster-devel] [Gluster-users] In a replica 2 server,
file-updates on one server missing on the other
server #Personal#
Thanks, Ben, same here :/ I actually get port numbers for glusterfsd from
the following (a quick sketch of all three checks is included below):
1. gluster volume status <VOLNAME>
2. the command line of glusterfsd on the target server.
3. if you're really paranoid, get the glusterfsd PID and use netstat.
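
(Roughly, the three checks above look like this - a sketch using replicated_vol
as the example; the grep patterns are only illustrative:)

# 1. Ask glusterd - the Port column lists each brick's glusterfsd port
gluster volume status replicated_vol

# 2. On the target server, the brick process command line carries the port
#    (--brick-port / listen-port arguments)
ps -ef | grep '[g]lusterfsd' | grep replicated_vol

# 3. The paranoid check: take the glusterfsd PID and see what it listens on
netstat -tlnp | grep "$(pgrep -f 'glusterfsd.*replicated_vol')/"
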
Looking at the code, it seems to me that the whole thing operates on a
statd-notify paradigm. Your local mount client registers for notification on
all remote glusterfsd's. When a remote brick goes down and comes back up, you
are notified, and then it calls portmap to obtain the remote glusterfsd port.
I see here that both glusterd processes are up. But somehow the port number
of the remote glusterfsd held by the mount client is now stale - not sure how
that happens. Now, the client keeps trying to connect on the stale port every
3 seconds. It gets the return errno of -111 (ECONNREFUSED), which clearly
indicates that there is no listener on the remote host's IP at this port.
Design-wise, could it indicate to the mount client that the port number
information needs to be refreshed? Would you say this is a bug of sorts?
Date: 02/05/2015 03:59 AM
Subject: Re: [Gluster-devel] [Gluster-users] In a replica 2
server,
file-updates on one server missing on the other server #Personal#
I thought Gluster was based on ONC RPC, which means there are no fixed
port numbers except for glusterd (24007). The client connects to
glusterd, reads the volfile, and gets the port numbers of the registered
glusterfsd processes at that time, then it connects to glusterfsd. Make
sense? What you need to know is whether glusterfsd is running or not, and
whether glusterd is finding out the current state of glusterfsd.
/var/log/glusterfs/bricks/*log has log files for each glusterfsd process;
you might be able to see from those what the glusterfsd port number is.
/var/log/glusterfs/etc*log is glusterd's log file; it might say whether
glusterd knows about glusterfsd. I'm not as good at troubleshooting as
some of the other people are, so don't take my word for it.
-ben
----- Original Message -----
Sent: Wednesday, February 4, 2015 4:36:02 PM
Subject: Re: [Gluster-devel] [Gluster-users] In a replica 2 server,
file-updates on one server missing on the other
server #Personal#
Sorry for spamming you guys, but this is kind of important for me to debug,
so if you saw anything like this before, do let me know. Here's an
It seems the mount client is attempting a connection with an invalid port
number. 49175 is NOT the port number of glusterfsd on serv1
(192.168.24.81).
[pid 31026] open("/proc/sys/net/ipv4/ip_local_reserved_ports",
O_RDONLY)
= -1
ENOENT (No such file or directory)
[pid 31026] write(4, "[2015-02-04 20:39:02.793154] W ["..., 215) = 215
[pid 31026] write(4, "[2015-02-04 20:39:02.793289] W ["..., 194) = 194
[pid 31026] bind(10, {sa_family=AF_INET, sin_port=htons(1023),
sin_addr=inet_addr("192.168.24.80")}, 16) = 0
[pid 31026] fcntl(10, F_GETFL) = 0x2 (flags O_RDWR)
[pid 31026] fcntl(10, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 31026] connect(10, {sa_family=AF_INET, sin_port=htons(49175),
sin_addr=inet_addr("192.168.24.81")}, 16) = -1 EINPROGRESS (Operation
now in
progress)
[pid 31026] fcntl(10, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
[pid 31026] fcntl(10, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 31026] epoll_ctl(3, EPOLL_CTL_ADD, 10,
{EPOLLIN|EPOLLPRI|EPOLLOUT,
{u32=10, u64=8589934602}}) = 0
[pid 31026] nanosleep({1, 0}, <unfinished ...>
[pid 31021] <... epoll_wait resumed>
{{EPOLLIN|EPOLLOUT|EPOLLERR|EPOLLHUP,
{u32=10, u64=8589934602}}}, 257, 4294967295) = 1
[pid 31021] getsockopt(10, SOL_SOCKET, SO_ERROR, [29422518842425455],
[4]) =
0
[pid 31021] shutdown(10, 2 /* send and receive */) = -1 ENOTCONN
(Transport
endpoint is not connected)
Which basically told me that the connection is attempted via a non-blocking
socket to port 49175. The errno from the failure is ECONNREFUSED, which is
what is expected.
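
(For anyone who wants to reproduce this kind of capture: the output above came
from strace'ing the mount client. A minimal sketch, with <glusterfs_mount_pid>
standing in for the PID of the glusterfs mount process:)

# Follow the mount client and its threads; a connect() towards a port with no
# listener shows up as EINPROGRESS followed by ENOTCONN/ECONNREFUSED on the
# subsequent getsockopt()/shutdown()
strace -f -p <glusterfs_mount_pid> -e trace=network -o /tmp/mount-client.strace
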
807 in socket.c
(gdb) bt
#0 __socket_connect_finish (this=0x6887a0) at socket.c:807
#1 socket_connect_finish (this=0x6887a0) at socket.c:2147
#2 0x00007fc863de4c04 in socket_event_handler (fd=<value optimized
out>,
idx=<value optimized out>, data=0x6887a0,
poll_in=1, poll_out=4, poll_err=<value optimized out>) at
socket.c:2223
#3 0x00007fc867f7919f in event_dispatch_epoll_handler
(event_pool=0x62db70)
at event-epoll.c:384
#4 event_dispatch_epoll (event_pool=0x62db70) at event-epoll.c:445
#5 0x0000000000406b06 in main (argc=4, argv=0x7fff25302c38) at
glusterfsd.c:1934
(gdb) print *optval
Cannot access memory at address 0x6f
(gdb) print optval
$1 = 111
[2015-02-03 12:11:33.833647] D
[socket.c:1962:__socket_proto_state_machine]
0-replicated_vol-1: reading from socket failed. Error (No data
available),
peer (192.168.24.81:49175)
There is, of course, no service running on port 49175. In fact, the listen
port for the corresponding glusterfsd on serv1 is 49206. Where does the mount
client pick this port number from? I know that if I kill and restart the
mount client on serv0 from the command line, the problem will disappear. So,
it's not something that is wrong with the processes on serv1...
Thanks,
Anirban
From: A Ghoshal/MUM/TCS
Pranith
Date: 02/05/2015 02:03 AM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one
server missing on the other server #Personal#
Sent by: A Ghoshal
I turned on trace and it seems bind to secure port on the mount client
with
the remote brick is successful - afterwards the connect() fails to
complete.
[2015-02-03 12:11:33.832615] T [rpc-clnt.c:422:rpc_clnt_reconnect]
0-replicated_vol-1: attempting reconnect
[2015-02-03 12:11:33.832666] D [name.c:155:client_fill_address_family]
0-replicated_vol-1: address-family not specified, guessing it to be inet
from (remote-host: serv1)
[2015-02-03 12:11:33.832683] T
option
remote-port missing in volume replicated_vol-1. Defaulting to 24007
[2015-02-03 12:11:33.833083] D [common-utils.c:237:gf_resolve_ip6]
0-resolver: returning ip-192.168.24.81 (port-24007) for hostname: serv1
and
port: 24007
[2015-02-03 12:11:33.833113] T [socket.c:731:__socket_nodelay]
0-replicated_vol-1: NODELAY enabled for socket 10
[2015-02-03 12:11:33.833128] T [socket.c:790:__socket_keepalive]
0-replicated_vol-1: Keep-alive enabled for socket 10, interval 2,
20
[2015-02-03 12:11:33.833188] W
[common-utils.c:2247:gf_get_reserved_ports]
0-glusterfs: could not open the file
/proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports
info
(No such file or directory)
[2015-02-03 12:11:33.833204] W
[common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able to
get
reserved ports, hence there is a possibility that glusterfs may
consume
reserved port
[2015-02-03 12:11:33.833560] D [socket.c:605:__socket_shutdown]
0-replicated_vol-1: shutdown() returned -1. Transport endpoint is not
connected
[2015-02-03 12:11:33.833594] T
[rpc-clnt.c:519:rpc_clnt_connection_cleanup]
0-replicated_vol-1: cleaning up state in transport object 0x6887a0
[2015-02-03 12:11:33.833622] D [socket.c:486:__socket_rwv]
0-replicated_vol-1: EOF on socket
[2015-02-03 12:11:33.833635] W [socket.c:514:__socket_rwv]
0-replicated_vol-1: readv failed (No data available)
[2015-02-03 12:11:33.833647] D
[socket.c:1962:__socket_proto_state_machine]
0-replicated_vol-1: reading from socket failed. Error (No data
available),
peer (192.168.24.81:49175)
[2015-02-03 12:11:33.833659] D [socket.c:2236:socket_event_handler]
0-transport: disconnecting now
[2015-02-03 12:11:33.833684] T
[rpc-clnt.c:519:rpc_clnt_connection_cleanup]
0-replicated_vol-1: cleaning up state in transport object 0x6887a0
After that, I attached gdb to the mount process and set a breakpoint at
__socket_shutdown(). It gave me the following traceback:
Breakpoint 1, __socket_shutdown (this=0x68a7f0) at socket.c:594
594 socket.c: No such file or directory.
in socket.c
(gdb) bt
#0 __socket_shutdown (this=0x68a7f0) at socket.c:594
#1 0x00007f3391906553 in __socket_disconnect (this=0x68a7f0) at
socket.c:623
#2 0x00007f339190856a in socket_connect_finish (this=0x68a7f0) at
socket.c:2160
#3 0x00007f339190dc04 in socket_event_handler (fd=<value optimized
out>,
idx=<value optimized out>, data=0x68a7f0,
poll_in=1, poll_out=4, poll_err=<value optimized out>) at
socket.c:2223
#4 0x00007f3395aa219f in event_dispatch_epoll_handler
(event_pool=0x62d8b0)
at event-epoll.c:384
#5 event_dispatch_epoll (event_pool=0x62d8b0) at event-epoll.c:445
#6 0x0000000000406b06 in main (argc=4, argv=0x7fff71108c38) at
glusterfsd.c:1934
(gdb)
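
(For completeness, the gdb session above boils down to roughly the following,
with <pid> standing in for the glusterfs mount process:)

gdb -p <pid>
(gdb) break __socket_shutdown
(gdb) continue
# ... wait for the next reconnect attempt to trip the breakpoint ...
(gdb) bt
(gdb) detach
(gdb) quit
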
In case somebody could give me a clue or two on my next move (I'm
looking at
the code in the meantime), I'd be very grateful.
Thanks,
Anirban
Date: 02/03/2015 05:24 AM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one
server missing on the other server #Personal#
Hi Pranith,
I finally understood what you meant about the secure ports, because the issue
occurred in one of our setups once more. It seems one of the clients on
serv1 could not open a connection to the glusterfsd running on serv0. I'd
actually started a mail trail about it (believing it might be something else):
http://www.gluster.org/pipermail/gluster-users/2015-February/020465.html
I think I can write a rudimentary kind of patch altering
af_inet_bind_to_port_lt_ceiling() to get it to call bind with port 0, rather
than specify a port explicitly, when client.bind-insecure is specified
...
Then I'd need to create a way to set server.allow-insecure using the cli
(or use the patch to do that, if you already sent it around like you said in
the earlier [gluster-devel] thread), if I can get it to work.
Thanks a lot,
Anirban
From: A Ghoshal/MUM/TCS
Date: 01/23/2015 02:45 PM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one
server missing on the other server #Personal#
Sent by: A Ghoshal
Oh, I didn't. I only read a fragment of the IRC log and assumed
--xlator-option would be enough. Apparently it's a lot more work....
I do have a query, though. These connections, from one of our setups - are
these on secure ports? Or maybe I didn't get it the first time.....
root 8851 25307 0 10:03 pts/2 00:00:00 grep replicated_vol
root 29751 1 4 Jan21 ? 01:47:20 /usr/sbin/glusterfsd -s serv0
--volfile-id
replicated_vol.serv0.mnt-bricks-replicated_vol-brick -p
/var/lib/glusterd/vols/_replicated_vol/run/serv0-mnt-bricks-replicated_vol-brick.pid
-S /var/run/dff9fa3c93e82f20103f2a3d91adc4a8.socket --brick-name
/mnt/bricks/replicated_vol/brick -l
/var/log/glusterfs/bricks/mnt-bricks-replicated_vol-brick.log
--xlator-option
*-posix.glusterd-uuid=1a1d1ebc-4b92-428f-b66b-9c5efa49574d
--brick-port 49185 --xlator-option
replicated_vol-server.listen-port=49185
root 30399 1 0 Jan21 ? 00:19:06 /usr/sbin/glusterfs
--volfile-id=replicated_vol --volfile-server=serv0 /mnt/replicated_vol
tcp 0 0 serv0:969 serv0:49185 ESTABLISHED 30399/glusterfs
tcp 0 0 serv0:999 serv1:49159 ESTABLISHED 30399/glusterfs
tcp 0 0 serv0:1023 serv0:24007 ESTABLISHED 30399/glusterfs
Thanks again,
Anirban
Date: 01/23/2015 01:58 PM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one
server missing on the other server #Personal#
Thanks a lot, Pranith.
We'll set this option on our test servers and keep the setup under
observation.
How did you get the bind-insecure option working?
I guess I will send a patch to make it 'volume set option'
Pranith
Thanks,
Anirban
Date: 01/23/2015 01:28 PM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one
server missing on the other server #Personal#
Hi Pranith,
sent me a mail a while back telling me that it is better if we all use our
business email addresses, so I made myself a new profile.
Glusterfs complains about /proc/sys/net/ipv4/ip_local_reserved_ports because
we use a really old Linux kernel (2.6.34) wherein this feature is not
present. We plan to upgrade our Linux every so often, but each time we are
dissuaded from it by some compatibility issue or the other. So, we get this
log every time - on both good volumes and bad ones. What bothered me was
Basically, to make the connections to servers (i.e. bricks), clients need to
choose secure ports, i.e. ports less than 1024. Since this file is not
present, it is not binding to any port, as per the code I just checked. There
is an option called client-bind-insecure which bypasses this check. I feel
that is one way (probably the only way) to get around this.
You have to use the "volume set server.allow-insecure on" option and the
bind-insecure option.
CC ndevos who seemed to have helped someone set bind-insecure option
correctly here ( http://irclog.perlgeek.de/gluster/2014-04-09/text )
Pranith
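
(To spell out the server-side half of that, a sketch - server.allow-insecure is
a per-volume CLI option, while the client-side bind-insecure is the part that,
as discussed above, still needs extra work on this version:)

# Let bricks of replicated_vol accept connections from non-reserved (>=1024) ports
gluster volume set replicated_vol server.allow-insecure on

# glusterd itself has a matching knob in /etc/glusterfs/glusterd.vol:
#   option rpc-auth-allow-insecure on
# (glusterd needs a restart after editing that file)
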
[2015-01-20 09:37:49.151744] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header]
0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit]
330,
Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record]
0000000000000000
[2015-01-20 09:37:49.151824] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header]
0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit]
330,
Proc: 27) to rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x
GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-1)
[2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x
GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-0)
When I write on the good server (serv1), we see that an RPC request is sent
to both client-0 and client-1. When I write on the bad server (serv0),
however, the RPC request is sent only to client-0, so it is no wonder that
the writes are not synced over to serv1. Somehow I could not make the daemon
on serv0 understand that there are two up children and not just one.
One additional detail - since we are using a kernel that's too old, we do not
have the (Anand Avati's?) FUSE readdirplus patches, either. I've noticed
that the fixes in the readdirplus version of glusterfs aren't always
guaranteed to be present in the non-readdirplus version of the patches. I'd
filed a bug around one such anomaly a while back, but never got around to
writing a
https://bugzilla.redhat.com/show_bug.cgi?id=1062287
I don't think this has anything to do with readdirplus.
Maybe something on similar lines here?
Thanks,
Anirban
P.s. Please ignore the #Personal# in the subject line - we need to do
that to
push mails to the public domain past the email filter safely.
Date: 01/22/2015 12:09 AM
Subject: Re: [Gluster-users] In a replica 2 server, file-updates on
one
server missing on the other server
hi,
Responses inline.
PS: You are chalkogen_oxygen?
Pranith
Hello,
Volume Name: replicated_vol
Type: Replicate
Volume ID: 26d111e3-7e4c-479e-9355-91635ab7f1c2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Brick1: serv0:/mnt/bricks/replicated_vol/brick
Brick2: serv1:/mnt/bricks/replicated_vol/brick
diagnostics.client-log-level: INFO
network.ping-timeout: 10
nfs.enable-ino32: on
cluster.self-heal-daemon: on
nfs.disable: off
replicated_vol is mounted at /mnt/replicated_vol on both serv0 and
serv1. If
And then I check for the state of the replicas in the bricks, then I
find
that
cranberries
tangerines
As may be seen, the replica on serv1 is blank, when I write into
testfile
from serv0 (even though the file is created on both bricks).
Interestingly,
if I write something to the file at serv1, then the two replicas
become
identical.
cranberries
tangerines
artichokes
cranberries
tangerines
artichokes
So, I dabbled into the logs a little bit, after upping the diagnostic
level,
[2015-01-20 09:21:52.197704] T [fuse-bridge.c:546:fuse_lookup_resume]
0-glusterfs-fuse: 53027: LOOKUP
/testfl(f0a76987-8a42-47a2-b027-a823254b736b)
[2015-01-20 09:21:52.197959] D
[afr-common.c:131:afr_lookup_xattr_req_prepare]
0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
[2015-01-20 09:21:52.198006] T [rpc-clnt.c:1302:rpc_clnt_record]
0000000000000000
[2015-01-20 09:21:52.198024] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header]
0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:21:52.198108] T [rpc-clnt.c:1499:rpc_clnt_submit]
330,
Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:21:52.198565] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-0: received rpc message (RPC XID: 0x78163x
GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-0)
[2015-01-20 09:21:52.198640] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
[2015-01-20 09:21:52.198669] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
[2015-01-20 09:21:52.198681] D
[afr-self-heal-common.c:887:afr_mark_sources]
0-replicated_vol-replicate-0: Number of sources: 1
[2015-01-20 09:21:52.198694] D
[afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type]
0-replicated_vol-replicate-0: returning read_child: 0
[2015-01-20 09:21:52.198705] D
[afr-common.c:1380:afr_lookup_select_read_child]
0-replicated_vol-replicate-0: Source selected as 0 for /testfl
[2015-01-20 09:21:52.198720] D
[afr-common.c:1117:afr_lookup_build_response_params]
0-replicated_vol-replicate-0: Building lookup response from 0
[2015-01-20 09:21:52.198732] D
[afr-common.c:1732:afr_lookup_perform_self_heal]
0-replicated_vol-replicate-0: Only 1 child up - do not attempt to detect
self heal
[2015-01-20 09:37:49.151506] T [fuse-bridge.c:546:fuse_lookup_resume]
0-glusterfs-fuse: 31212: LOOKUP
/testfl(f0a76987-8a42-47a2-b027-a823254b736b)
[2015-01-20 09:37:49.151683] D
[afr-common.c:131:afr_lookup_xattr_req_prepare]
0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
[2015-01-20 09:37:49.151726] T [rpc-clnt.c:1302:rpc_clnt_record]
0000000000000000
[2015-01-20 09:37:49.151744] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header]
0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit]
330,
Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record]
0000000000000000
[2015-01-20 09:37:49.151824] T
[rpc-clnt.c:1182:rpc_clnt_record_build_header]
0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit]
330,
Proc: 27) to rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x
GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-1)
[2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init]
0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x
GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport
(replicated_vol-client-0)
[2015-01-20 09:37:49.152582] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
[2015-01-20 09:37:49.152596] D
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix]
0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
[2015-01-20 09:37:49.152621] D
[afr-self-heal-common.c:887:afr_mark_sources]
0-replicated_vol-replicate-0: Number of sources: 1
[2015-01-20 09:37:49.152633] D
[afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type]
0-replicated_vol-replicate-0: returning read_child: 0
[2015-01-20 09:37:49.152644] D
[afr-common.c:1380:afr_lookup_select_read_child]
0-replicated_vol-replicate-0: Source selected as 0 for /testfl
[2015-01-20 09:37:49.152657] D
[afr-common.c:1117:afr_lookup_build_response_params]
0-replicated_vol-replicate-0: Building lookup response from 0
We see that when you write on serv1, the RPC request is sent to both
replicated_vol-client-0 and replicated_vol-client-1, while when we write
on
serv0, the request is sent only to replicated_vol-client-0, and the
FUse
client is unaware of the presence of client-1 in the latter case.
I checked a bit more in the logs. When I turn on my trace, I found many
[2015-01-20 09:21:15.520784] T [fuse-bridge.c:681:fuse_attr_cbk]
0-glusterfs-fuse: 53011: LOOKUP() / => 1
[2015-01-20 09:21:17.683088] T [rpc-clnt.c:422:rpc_clnt_reconnect]
0-replicated_vol-client-1: attempting reconnect
[2015-01-20 09:21:17.683159] D [name.c:155:client_fill_address_family]
0-replicated_vol-client-1: address-family not specified, guessing it
to
be
inet from (remote-host: serv1)
[2015-01-20 09:21:17.683178] T
[name.c:225:af_inet_client_get_remote_sockaddr]
option remote-port missing in volume replicated_vol-client-1. Defaulting
to
24007
[2015-01-20 09:21:17.683191] T [common-utils.c:188:gf_resolve_ip6]
0-resolver: flushing DNS cache
[2015-01-20 09:21:17.683202] T [common-utils.c:195:gf_resolve_ip6]
0-resolver: DNS cache not present, freshly probing hostname: serv1
[2015-01-20 09:21:17.683814] D [common-utils.c:237:gf_resolve_ip6]
0-resolver: returning ip-192.168.24.81 (port-24007) for hostname: serv1
and
port: 24007
[2015-01-20 09:21:17.684139] D [common-utils.c:257:gf_resolve_ip6]
0-resolver: next DNS query will return: ip-192.168.24.81 port-24007
[2015-01-20 09:21:17.684164] T [socket.c:731:__socket_nodelay]
0-replicated_vol-client-1: NODELAY enabled for socket 10
[2015-01-20 09:21:17.684177] T [socket.c:790:__socket_keepalive]
0-replicated_vol-client-1: Keep-alive enabled for socket 10, interval 2,
idle: 20
[2015-01-20 09:21:17.684236] W
[common-utils.c:2247:gf_get_reserved_ports]
0-glusterfs: could not open the file
/proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports
info
(No such file or directory)
[2015-01-20 09:21:17.684253] W
[common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able to
get
reserved ports, hence there is a possibility that glusterfs may
consume
reserved port
Logs above suggest that mount process couldn't assign a reserved port
because
it couldn't find the file /proc/sys/net/ipv4/ip_local_reserved_ports
I guess reboot of the machine fixed it. Wonder why it was not found in
the
first place.
Pranith.
[2015-01-20 09:21:17.684660] D [socket.c:605:__socket_shutdown]
0-replicated_vol-client-1: shutdown() returned -1. Transport endpoint is
not
connected
[2015-01-20 09:21:17.684699] T
[rpc-clnt.c:519:rpc_clnt_connection_cleanup]
0-replicated_vol-client-1: cleaning up state in transport object
0x68a630
[2015-01-20 09:21:17.684731] D [socket.c:486:__socket_rwv]
0-replicated_vol-client-1: EOF on socket
[2015-01-20 09:21:17.684750] W [socket.c:514:__socket_rwv]
0-replicated_vol-client-1: readv failed (No data available)
[2015-01-20 09:21:17.684766] D
[socket.c:1962:__socket_proto_state_machine]
0-replicated_vol-client-1: reading from socket failed. Error (No data
available), peer (192.168.24.81:49198)
I could not find a 'remote-port' option in /var/lib/glusterd on either
peer.
Could somebody tell me where this configuration is looked up from?
Also,
sometime later, I rebooted serv0 and that seemed to solve the problem.
However, stop+start of replicated_vol and restart of
/etc/init.d/glusterd
did NOT solve the problem.
Ignore that log. If no port is given in that volfile, it picks 24007 as
the
port, which is the default port where glusterd 'listens'
Any help on this matter will be greatly appreciated as I need to provide
robustness assurances for our setup.
Thanks a lot,
Anirban
glusterfs version: 3.4.2
Linux kernel version: 2.6.34
_______________________________________________
Gluster-users mailing list
http://www.gluster.org/mailman/listinfo/gluster-users
_______________________________________________
Gluster-users mailing list
http://www.gluster.org/mailman/listinfo/gluster-users
_______________________________________________
Gluster-devel mailing list
http://www.gluster.org/mailman/listinfo/gluster-devel
A Ghoshal
2015-02-20 14:00:43 UTC
Permalink
I found out the reason this happens a few days back. Just to let you
know..

It seems it has partly to do with the way we handle reboots on our setup.
When we take down one of our replica servers (for testing/maintenance), to
ensure that the bricks are unmounted correctly, we kill off the glusterfsd
processes (short of stopping the volume and causing service disruption to
the mount clients). Let us assume that serv1 is being rebooted. When we
kill off glusterfsd,

For file-systems that are normally not accessed:

1. The ping between the mount client on serv0 and the brick's glusterfsd on
serv1 times out. In our system, this ping timeout is configured at 10 seconds.

2. At this point, the mount client on serv0 destroys the now defunct TCP
connection and queries the remote glusterd process for the port of the
remote brick.

3. But, since by this time serv1 is already down, no response arrives, and
the local mount client retries the query till serv1 is up once more, upon
which the glusterd on serv1 responds with the newly allocated port number
for the brick, and a new connection is thus established.

For frequently accessed file-systems:

1. It is one of the file operations (read/write) that times out. This
happens much earlier than 10 seconds.
This results in the connection being destroyed and the mount client on
serv0 querying the remote glusterd for the remote brick's port number.

2. Because this happens so quickly, glusterd on serv1 is not yet down, and is
also unaware that the local brick is not alive anymore. So, it returns the
port number of the dead process.

3. For the mount client on serv0, since the query succeeded, it does not
attempt another port query, but instead tries to connect to the stale port
number ad infinitum.

Our solution to this problem is simple - before we kill glusterfsd and
unmount the bricks, we stop glusterd:

/etc/init.d/glusterd stop

This ensures that the portmap queries by the mount client on serv0 are
never honored.
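
(In other words, the shutdown sequence we now follow on the server being taken
down - serv1 in the example above - looks roughly like this; the pkill/umount
specifics are only illustrative:)

# 1. Stop glusterd first, so it can no longer answer portmap queries with a
#    brick port that is about to go stale
/etc/init.d/glusterd stop

# 2. Kill the brick processes (the volume itself stays started, so the mount
#    clients on the surviving server are not disrupted)
pkill glusterfsd

# 3. Unmount the brick filesystems cleanly
umount /mnt/bricks/replicated_vol    # or wherever the brick filesystem is mounted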

Thanks,
Anirban


