[Openais] aisexec unable to get in sync

Mark Haverkamp markh at osdl.org
Mon Sep 27 09:22:48 PDT 2004


Steve,

On Friday afternoon I started a test on my four nodes publishing
events.  After they had run for a while, I checked top and saw that they
were using about 8 MB of memory and running OK.  When I got in this
morning I saw the following line repeating on all four nodes.  The
programs were still using around 8 MB, so I don't think this was an
out-of-memory problem.

EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4

I checked the saved console logs and saw that this started on all four
nodes at about the same time (4 AM Saturday).  Any ideas on a good way
to debug this?

Thanks,
Mark.


L(3): Token being retransmitted.
L(1): Received message has invalid digest... ignoring.
L(3): Token being retransmitted.
L(1): Received message has invalid digest... ignoring.
L(4): Got attempt join from 192.168.1.18
L(4): entering GATHER state.
L(4): SENDING attempt join because this node is ring rep.
L(3): Token loss in GATHER or COMMIT.
L(4): Got attempt join from 192.168.1.19
L(4): CONSENSUS reached!
Got membership form token
conf_desc_list 1
highest seq 0 51420627
setting barrier seq to 51420627
setting barrier seq to 51420628
Got membership form token
FORM CONF ENTRIES 1
EVS STATE group arut 51420627 gmi arut 51420627 highest 51420627 barrier 51420628 starting group arut 51420627
EVS STATE group arut 51420628 gmi arut 51420628 highest 51420627 barrier 51420628 starting group arut 51420628
L(4): EVS recovery of messages complete, transitioning to operational.
CONFCHG ENTRIES 4
calling recovery
L(4): CLM CONFIGURATION CHANGE
L(4): New Configuration:
L(4):   192.168.1.8
L(4):   192.168.1.18
L(4):   192.168.1.19
L(4): Members Left:
L(4):   192.168.1.17
L(4): Members Joined:
L(4): CLM CONFIGURATION CHANGE
L(4): New Configuration:
L(4):   192.168.1.8
L(4):   192.168.1.18
L(4):   192.168.1.19
L(4): Members Left:
L(4): Members Joined:
L(4): All services unplugged, unplugging processor
L(4): All processors unplugged, allowing messages to be transmitted.
L(4): Got attempt join from 192.168.1.17
L(4): entering GATHER state.
L(4): SENDING attempt join because this node is ring rep.
L(4): CONSENSUS reached!
L(4): swallowing ORF token 16310570.
Got membership form token
conf_desc_list 2
highest seq 0 51420646
setting barrier seq to 51420646
highest seq 1 51420646
setting barrier seq to 51420647
Got membership form token
FORM CONF ENTRIES 2
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4
EVS STATE group arut 4 gmi arut 4 highest 51420646 barrier 51420647 starting group arut 4

And so on.....
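
One possible way to dig through the saved console logs is to collapse
the repeated lines and pull out the counters, so the point where each
node starts looping is easy to compare across nodes.  The following is
only a minimal sketch: it assumes the logs are plain text with one
message per line in the format shown above, and the function names
(summarize, stuck_states) and the threshold are made up for
illustration.  The reached_barrier field rests on an assumption drawn
from the excerpt: in the earlier recovery the "group arut" value
advanced to the barrier value, while in the repeating line it stays
at 4.  Whether that is actually the cause is not established here.

```python
import re
from itertools import groupby

# Matches the EVS STATE lines as they appear in the console log above.
EVS_RE = re.compile(
    r"EVS STATE group arut (\d+) gmi arut (\d+) "
    r"highest (\d+) barrier (\d+) starting group arut (\d+)"
)


def summarize(lines):
    """Collapse runs of consecutive identical lines into (line, count) pairs."""
    stripped = (line.rstrip("\n") for line in lines)
    return [(line, sum(1 for _ in run)) for line, run in groupby(stripped)]


def stuck_states(lines, threshold=3):
    """Return parsed EVS STATE lines that repeat at least `threshold` times.

    `threshold` is an arbitrary cutoff chosen for this sketch.
    """
    found = []
    for line, count in summarize(lines):
        match = EVS_RE.fullmatch(line.strip())
        if match is None or count < threshold:
            continue
        group, gmi, highest, barrier, starting = map(int, match.groups())
        found.append({
            "count": count,
            "group": group,
            "gmi": gmi,
            "highest": highest,
            "barrier": barrier,
            "starting": starting,
            # Assumption: a healthy recovery ends with group == barrier.
            "reached_barrier": group >= barrier,
        })
    return found
```

Running stuck_states(open(path)) on each node's saved log (path being
wherever the console output was captured) would give one entry per
repeating EVS STATE line, which can then be compared between nodes.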

-- 
Mark Haverkamp <markh at osdl.org>



