[Openais] aisexec unable to get in sync

Mark Haverkamp markh at osdl.org
Mon Sep 27 10:27:37 PDT 2004


On Mon, 2004-09-27 at 09:22, Mark Haverkamp wrote:
> Steve,
> 
> On Friday afternoon I started a test on my four nodes publishing
> events.  After they ran for a while, I checked top and saw that they
> were using about 8meg of memory and running OK.  When I got in this
> morning I saw this sort of thing on all nodes.  The programs were still
> using around 8 meg so I don't think that there was some sort of
> out-of-memory problem.  
> 
> EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
> EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
> EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
> EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
> 
> I checked the saved console logs and saw that this happened on all four
> nodes about the same time (4AM Saturday).  Any ideas on a good way to
> debug this?
> 
> Thanks,
> Mark.

Actually, the four nodes had some differences in what they printed out. 
Here is output from each.

cl008:
L(3): Token being retransmitted.
L(1): Received message has invalid digest... ignoring.
L(3): Token being retransmitted.
L(1): Received message has invalid digest... ignoring.
L(4): Got attempt join from 192.168.1.18
L(4): entering GATHER state.
L(4): SENDING attempt join because this node is ring rep.
L(3): Token loss in GATHER or COMMIT.
L(4): Got attempt join from 192.168.1.19
L(4): CONSENSUS reached!
Got membership form token
conf_desc_list 1
highest seq 0 51420627
setting barrier seq to 51420627
setting barrier seq to 51420628
Got membership form token
FORM CONF ENTRIES 1
EVS STATE group arut 51420627 gmi arut 51420627 highest 51420627 barrier 51420628 starting group arut 51420627
EVS STATE group arut 51420628 gmi arut 51420628 highest 51420627 barrier 51420628 starting group arut 51420628
L(4): EVS recovery of messages complete, transitioning to operational.
CONFCHG ENTRIES 4
calling recovery
L(4): CLM CONFIGURATION CHANGE
L(4): New Configuration:
L(4):   192.168.1.8
L(4):   192.168.1.18
L(4):   192.168.1.19
L(4): Members Left:
L(4):   192.168.1.17
L(4): Members Joined:
L(4): CLM CONFIGURATION CHANGE
L(4): New Configuration:
L(4):   192.168.1.8
L(4):   192.168.1.18
L(4):   192.168.1.19
L(4): Members Left:
L(4): Members Joined:
L(4): All services unplugged, unplugging processor
L(4): All processors unplugged, allowing messages to be transmitted.
L(4): Got attempt join from 192.168.1.17
L(4): entering GATHER state.
L(4): SENDING attempt join because this node is ring rep.
L(4): CONSENSUS reached!
L(4): swallowing ORF token 16310570.
Got membership form token
conf_desc_list 2
highest seq 0 51420646
setting barrier seq to 51420646
highest seq 1 51420646
setting barrier seq to 51420647
Got membership form token
FORM CONF ENTRIES 2
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4

cl017:
already received token 65190900 65190900
L(4): Got attempt join from 192.168.1.18
L(4): rep is 192.168.1.8, not handling attempt join.
L(4): Got attempt join from 192.168.1.19
L(4): rep is 192.168.1.8, not handling attempt join.
L(4): Got attempt join from 192.168.1.8
L(4): rep is 192.168.1.8, not handling attempt join.
already received token 65239212 65239212
L(3): Token being retransmitted.
L(3): Token loss in OPERATIONAL.
L(4): entering GATHER state.
L(4): SENDING attempt join because this node is ring rep.
L(4): Got attempt join from 192.168.1.8
L(4): CONSENSUS reached!
Got membership form token
Got membership form token
conf_desc_list 2
highest seq 0 51420646
setting barrier seq to 51420646
highest seq 1 51420646
setting barrier seq to 51420647
EVS STATE group arut 4 gmi arut 51420646 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 51420647 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 51420647 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 51420647 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 51420647 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 51420647 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 51420647 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 51420647 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 51420647 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 51420647 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 51420647 highest 51420646 barrier 51420647 starting group arut 4

cl018:
L(3): Token being retransmitted.
L(3): Token being retransmitted.
L(3): Token loss in OPERATIONAL.
L(4): entering GATHER state.
L(4): SENDING attempt join because this node is ring rep.
L(4): Got attempt join from 192.168.1.19
L(4): Got attempt join from 192.168.1.8
L(4): CONSENSUS reached!
Got membership form token
Got membership form token
conf_desc_list 1
highest seq 0 51420627
setting barrier seq to 51420627
setting barrier seq to 51420628
EVS STATE group arut 51420627 gmi arut 51420627 highest 51420627 barrier 51420628 starting group arut 51420627
EVS STATE group arut 51420628 gmi arut 51420628 highest 51420627 barrier 51420628 starting group arut 51420628
L(4): EVS recovery of messages complete, transitioning to operational.
CONFCHG ENTRIES 4
calling recovery
L(4): CLM CONFIGURATION CHANGE
L(4): New Configuration:
L(4):   192.168.1.8
L(4):   192.168.1.18
L(4):   192.168.1.19
L(4): Members Left:
L(4):   192.168.1.17
L(4): Members Joined:
L(4): CLM CONFIGURATION CHANGE
L(4): New Configuration:
L(4):   192.168.1.8
L(4):   192.168.1.18
L(4):   192.168.1.19
L(4): Members Left:
L(4): Members Joined:
L(4): All services unplugged, unplugging processor
already received token 15 65239213
L(4): All processors unplugged, allowing messages to be transmitted.
L(1): Received message has invalid digest... ignoring.
L(4): Got attempt join from 192.168.1.17
L(4): rep is 192.168.1.8, not handling attempt join.
L(4): Got attempt join from 192.168.1.8
L(4): rep is 192.168.1.8, not handling attempt join.
Got membership form token
Got membership form token
conf_desc_list 2
highest seq 0 51420646
setting barrier seq to 51420646
highest seq 1 51420646
setting barrier seq to 51420647
EVS STATE group arut 51420646 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 51420646
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4


cl019:
already received token 65190898 65190898
L(3): Token being retransmitted.
already received token 65239210 65239210
L(3): Token being retransmitted.
L(4): Got attempt join from 192.168.1.18
L(4): rep is 192.168.1.8, not handling attempt join.
L(3): Token loss in OPERATIONAL.
L(4): entering GATHER state.
L(4): SENDING attempt join because this node is ring rep.
L(4): Got attempt join from 192.168.1.8
L(4): CONSENSUS reached!
Got membership form token
Got membership form token
conf_desc_list 1
highest seq 0 51420627
setting barrier seq to 51420627
setting barrier seq to 51420628
EVS STATE group arut 51420627 gmi arut 51420627 highest 51420627 barrier 51420628 starting group arut 51420627
EVS STATE group arut 51420628 gmi arut 51420628 highest 51420627 barrier 51420628 starting group arut 51420628
L(4): EVS recovery of messages complete, transitioning to operational.
CONFCHG ENTRIES 4
calling recovery
L(4): CLM CONFIGURATION CHANGE
L(4): New Configuration:
L(4):   192.168.1.8
L(4):   192.168.1.18
L(4):   192.168.1.19
L(4): Members Left:
L(4):   192.168.1.17
L(4): Members Joined:
L(4): CLM CONFIGURATION CHANGE
L(4): New Configuration:
L(4):   192.168.1.8
L(4):   192.168.1.18
L(4):   192.168.1.19
L(4): Members Left:
L(4): Members Joined:
L(4): All services unplugged, unplugging processor
L(4): All processors unplugged, allowing messages to be transmitted.
L(4): Got attempt join from 192.168.1.17
L(4): rep is 192.168.1.8, not handling attempt join.
L(4): Got attempt join from 192.168.1.8
L(4): rep is 192.168.1.8, not handling attempt join.
Got membership form token
Got membership form token
conf_desc_list 2
highest seq 0 51420646
setting barrier seq to 51420646
highest seq 1 51420646
setting barrier seq to 51420647
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4

-- 
Mark Haverkamp <markh at osdl.org>




More information about the Openais mailing list