[Gluster-users] Gluster 3.12.11 geo-replication connection to peer is broken
Pablo J Rebollo Sosa
2018-07-23 20:17:44 UTC
Hi,

I’m having a problem with Gluster 3.12.11 geo-replication on CentOS 7.5. Geo-replication starts, but after a few minutes the log shows “connection to peer is broken”.

The “status detail” output looks OK, but no files are replicated.

[***@gluster1 vol_replicated]# gluster volume geo-replication vol_replicated ***@10.20.220.12::georep_1 status detail | sort

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
MASTER NODE    MASTER VOL        MASTER BRICK                     SLAVE USER     SLAVE                         SLAVE NODE      STATUS     CRAWL STATUS    LAST_SYNCED    ENTRY    DATA    META    FAILURES    CHECKPOINT TIME    CHECKPOINT COMPLETED    CHECKPOINT COMPLETION TIME
gluster1       vol_replicated    /export/brick1/vol_replicated    geoaccount1    ***@10.20.220.12::georep_1    10.20.220.12    Active     Hybrid Crawl    N/A            8191     6550    0       0           N/A                N/A                     N/A
gluster2       vol_replicated    /export/brick1/vol_replicated    geoaccount1    ***@10.20.220.12::georep_1    10.20.220.13    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                N/A                     N/A
gluster3       vol_replicated    /export/brick1/vol_replicated    geoaccount1    ***@10.20.220.12::georep_1    10.20.220.12    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                N/A                     N/A
gluster4       vol_replicated    /export/brick1/vol_replicated    geoaccount1    ***@10.20.220.12::georep_1    10.20.220.13    Active     Hybrid Crawl    N/A            8191     6532    0       0           N/A                N/A                     N/A

These are the messages on the log file.

[2018-07-23 19:35:50.18026] I [gsyncdstatus(/export/brick1/vol_replicated):276:set_active] GeorepStatus: Worker Status Change status=Active
[2018-07-23 19:35:50.19126] I [gsyncdstatus(/export/brick1/vol_replicated):248:set_worker_crawl_status] GeorepStatus: Crawl Status Change status=History Crawl
[2018-07-23 19:35:50.19480] I [master(/export/brick1/vol_replicated):1432:crawl] _GMaster: starting history crawl turns=1 stime=(0, 0) entry_stime=None etime=1532374550
[2018-07-23 19:35:50.20056] E [repce(/export/brick1/vol_replicated):117:worker] <top>: call failed:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line 54, in history
    num_parallel)
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 103, in cl_history_changelog
    raise ChangelogHistoryNotAvailable()
ChangelogHistoryNotAvailable
[2018-07-23 19:35:50.20999] E [repce(/export/brick1/vol_replicated):209:__call__] RepceClient: call failed on peer call=39755:140602890745664:1532374550.02 method=history error=ChangelogHistoryNotAvailable
[2018-07-23 19:35:50.21156] I [resource(/export/brick1/vol_replicated):1675:service_loop] GLUSTER: Changelog history not available, using xsync
[2018-07-23 19:35:50.28688] I [master(/export/brick1/vol_replicated):1543:crawl] _GMaster: starting hybrid crawl stime=(0, 0)
[2018-07-23 19:35:50.30505] I [gsyncdstatus(/export/brick1/vol_replicated):248:set_worker_crawl_status] GeorepStatus: Crawl Status Change status=Hybrid Crawl
[2018-07-23 19:35:54.35396] I [master(/export/brick1/vol_replicated):1554:crawl] _GMaster: processing xsync changelog path=/var/lib/misc/glusterfsd/vol_replicated/ssh%3A%2F%2Fgeoaccount1%4010.20.220.12%3Agluster%3A%2F%2F127.0.0.1%3Ageorep_1/a68ebfef8cdf86c3c6e9a0d85969cd3f/xsync/XSYNC-CHANGELOG.1532374550
[2018-07-23 19:36:11.590595] E [syncdutils(/export/brick1/vol_replicated):304:log_raise_exception] <top>: connection to peer is broken
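
As a side check on the log above: the etime=1532374550 value and the XSYNC-CHANGELOG.1532374550 suffix look like Unix epoch seconds. A minimal Python sketch, assuming that interpretation (the helper name epoch_to_utc is only illustrative), converts the value to UTC so it can be compared with the bracketed log timestamps:

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Render Unix epoch seconds as an ISO 8601 UTC timestamp."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()


# Value taken from "etime=" and the XSYNC-CHANGELOG file suffix in the log.
print(epoch_to_utc(1532374550))
# 2018-07-23T19:35:50+00:00
```

The result lines up with the 19:35:50 timestamps of the surrounding log lines, which suggests the log times are in UTC and that the xsync changelog was created when the worker started.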

Does anyone have a clue as to what might be wrong?

Best regards,

Pablo J. Rebollo-Sosa
Kotresh Hiremath Ravishankar
2018-07-24 04:44:48 UTC
Hi Pablo,

The geo-rep status should go to Faulty if the connection to peer is broken.
Are the node log files failing with the same error? Are these logs repeating?
Does stopping and starting geo-rep give the same error?

Thanks,
Kotresh HR
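
One way to answer the "are these logs repeating?" question is to count the error-level lines in the geo-replication log. The sketch below is a rough illustration, not part of Gluster: the regular expression is inferred only from the log lines quoted in this thread, and the names LOG_LINE and count_errors are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from lines such as:
# [2018-07-23 19:36:11.590595] E [syncdutils(/export/brick1/vol_replicated):304:log_raise_exception] <top>: connection to peer is broken
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<module>\w+)\((?P<brick>[^)]*)\):(?P<line>\d+):(?P<func>\w+)\] "
    r"(?P<component>[^:]+): (?P<message>.*)$"
)


def count_errors(lines):
    """Count E-level messages, keyed by (module, message)."""
    counts = Counter()
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        # Traceback lines and other continuation lines do not match and are skipped.
        if match and match.group("level") == "E":
            counts[(match.group("module"), match.group("message"))] += 1
    return counts


sample = [
    "[2018-07-23 19:35:50.21156] I [resource(/export/brick1/vol_replicated):1675:service_loop] GLUSTER: Changelog history not available, using xsync",
    "[2018-07-23 19:36:11.590595] E [syncdutils(/export/brick1/vol_replicated):304:log_raise_exception] <top>: connection to peer is broken",
]
for (module, message), n in count_errors(sample).items():
    print(n, module, message)
```

To run it against a real log, pass an open file object instead of the sample list. Messages that embed a per-call identifier (such as the RepceClient "call=" field) will not group together under this simple key.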
Pablo J Rebollo Sosa
2018-07-24 05:50:45 UTC
Dear Kotresh,
Post by Kotresh Hiremath Ravishankar
> The geo-rep status should go to Faulty if the connection to peer is broken.

The geo-rep status does not go to “Faulty” after “connection to peer is broken” appears in the log.

Post by Kotresh Hiremath Ravishankar
> Are the node log files failing with the same error? Are these logs repeating?

The “connection to peer is broken” error is in the following log file. No new events are logged on the master after “connection to peer is broken”.

/var/log/glusterfs/geo-replication/vol_replicated/ssh%3A%2F%2Fgeoaccount1%4010.20.220.12%3Agluster%3A%2F%2F127.0.0.1%3Ageorep_1.log
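
The log file name above is percent-encoded. As a small illustration (standard library only, nothing Gluster-specific), decoding it in Python shows the session URL it stands for:

```python
from urllib.parse import unquote

# File name copied from the log path above.
log_name = "ssh%3A%2F%2Fgeoaccount1%4010.20.220.12%3Agluster%3A%2F%2F127.0.0.1%3Ageorep_1.log"

decoded = unquote(log_name)
print(decoded)
# ssh://geoaccount1@10.20.220.12:gluster://127.0.0.1:georep_1.log
```

The decoded form makes it easier to see which slave user, host and volume a given log file belongs to.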
Post by Kotresh Hiremath Ravishankar
> Does stopping and starting geo-rep give the same error?

I restarted the geo-rep process and it keeps giving the same error.

Another user reported the same problem last month.

https://bugzilla.redhat.com/show_bug.cgi?id=1595916