defect 1170 - assert in memb_state_recover_enter (was) Re:
[Openais] looks like the synchronization code is still broke
Fabien THOMAS
fabien.thomas at netasq.com
Wed Apr 5 09:37:26 PDT 2006
a lot of time now when i kill one node aisexec exit with that log:
Apr 05 15:55:53 [NOTICE ] [MAIN ] AIS Executive Service: Copyright
(C) 2002-2006 MontaVista Software, Inc. and contributors.
Apr 05 15:55:53 [WARNING ] [MAIN ] Could not lock memory of service
to avoid page faults
Apr 05 15:55:53 [NOTICE ] [TOTEM] Token Timeout (1000 ms) retransmit
timeout (238 ms)
Apr 05 15:55:53 [NOTICE ] [TOTEM] token hold (180 ms) retransmits
before loss (4 retrans)
Apr 05 15:55:53 [NOTICE ] [TOTEM] join (100 ms) consensus (200 ms)
merge (200 ms)
Apr 05 15:55:53 [NOTICE ] [TOTEM] downcheck (1000 ms) fail to recv
const (50 msgs)
Apr 05 15:55:53 [NOTICE ] [TOTEM] seqno unchanged const (30
rotations) Maximum network MTU 1500
Apr 05 15:55:53 [NOTICE ] [TOTEM] send threads (0 threads)
Apr 05 15:55:53 [NOTICE ] [TOTEM] heartbeat_failures_allowed (3)
Apr 05 15:55:53 [NOTICE ] [TOTEM] max_network_delay (50 ms)
Apr 05 15:55:53 [NOTICE ] [TOTEM] total heartbeat_timeout (764 ms)
Apr 05 15:55:53 [NOTICE ] [TOTEM] Receive multicast socket recv
buffer size (144000 bytes).
Apr 05 15:55:53 [NOTICE ] [TOTEM] Transmit multicast socket send
buffer size (144000 bytes).
Apr 05 15:55:53 [NOTICE ] [TOTEM] The network interface [10.2.1.7]
is now up.
Apr 05 15:55:53 [NOTICE ] [TOTEM] Created or loaded sequence id
67856.10.2.1.7 for this ring.
Apr 05 15:55:53 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:53 [NOTICE ] [SERV ] Initialising service handler
'openais cluster membership service B.01.01'
Apr 05 15:55:53 [NOTICE ] [SERV ] Initialising service handler
'openais availability management framework B.01.01'
Apr 05 15:55:53 [NOTICE ] [SERV ] Initialising service handler
'openais checkpoint service B.01.01'
Apr 05 15:55:53 [NOTICE ] [SERV ] Initialising service handler
'openais event service B.01.01'
Apr 05 15:55:53 [NOTICE ] [SERV ] Initialising service handler
'openais distributed locking service B.01.01'
Apr 05 15:55:53 [NOTICE ] [SERV ] Initialising service handler
'openais message service B.01.01'
Apr 05 15:55:53 [NOTICE ] [SERV ] Initialising service handler
'openais configuration service'
Apr 05 15:55:53 [NOTICE ] [SERV ] Initialising service handler
'openais cluster closed process group service v1.01'
Apr 05 15:55:53 [NOTICE ] [MAIN ] AIS Executive Service: started and
ready to receive connections.
Apr 05 15:55:53 [NOTICE ] [TOTEM] Creating commit token because I am
the rep.
Apr 05 15:55:53 [NOTICE ] [TOTEM] Saving state aru 0 high seq
received 0
Apr 05 15:55:53 [NOTICE ] [TOTEM] Storing new sequence id for ring
67860
Apr 05 15:55:53 [NOTICE ] [TOTEM] entering COMMIT state.
Apr 05 15:55:53 [NOTICE ] [TOTEM] entering RECOVERY state.
Apr 05 15:55:53 [NOTICE ] [TOTEM] position [0] member 10.2.1.7:
Apr 05 15:55:53 [NOTICE ] [TOTEM] previous ring seq 67856 rep 10.2.1.7
Apr 05 15:55:53 [NOTICE ] [TOTEM] aru 0 high delivered 0 received
flag 0
Apr 05 15:55:53 [NOTICE ] [TOTEM] Did not need to originate any
messages in recovery.
Apr 05 15:55:53 [NOTICE ] [TOTEM] Sending initial ORF token
Apr 05 15:55:53 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Apr 05 15:55:53 [NOTICE ] [CLM ] New Configuration:
Apr 05 15:55:53 [NOTICE ] [CLM ] Members Left:
Apr 05 15:55:53 [NOTICE ] [CLM ] Members Joined:
Apr 05 15:55:53 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Apr 05 15:55:53 [NOTICE ] [CLM ] New Configuration:
Apr 05 15:55:53 [NOTICE ] [CLM ] 10.2.1.7
Apr 05 15:55:53 [NOTICE ] [CLM ] Members Left:
Apr 05 15:55:53 [NOTICE ] [CLM ] Members Joined:
Apr 05 15:55:53 [NOTICE ] [CLM ] 10.2.1.7
Apr 05 15:55:53 [NOTICE ] [SYNC ] This node is within the non-
primary component and will NOT provide any services.
Apr 05 15:55:53 [NOTICE ] [TOTEM] entering OPERATIONAL state.
Apr 05 15:55:53 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:53 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:54 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:54 [NOTICE ] [TOTEM] Creating commit token because I am
the rep.
Apr 05 15:55:54 [NOTICE ] [TOTEM] Saving state aru 23 high seq
received 23
Apr 05 15:55:54 [NOTICE ] [TOTEM] Storing new sequence id for ring
67868
Apr 05 15:55:54 [NOTICE ] [TOTEM] entering COMMIT state.
Apr 05 15:55:54 [NOTICE ] [TOTEM] entering RECOVERY state.
Apr 05 15:55:54 [NOTICE ] [TOTEM] position [0] member 10.2.1.7:
Apr 05 15:55:54 [NOTICE ] [TOTEM] previous ring seq 67860 rep 10.2.1.7
Apr 05 15:55:54 [NOTICE ] [TOTEM] aru 23 high delivered 0 received
flag 0
Apr 05 15:55:54 [NOTICE ] [TOTEM] copying all old ring messages from
24-23.
Apr 05 15:55:54 [NOTICE ] [TOTEM] Originated 0 messages in RECOVERY.
Apr 05 15:55:54 [NOTICE ] [TOTEM] Originated for recovery:
Apr 05 15:55:54 [NOTICE ] [TOTEM] Not Originated for recovery:
Apr 05 15:55:54 [NOTICE ] [TOTEM] Sending initial ORF token
Apr 05 15:55:54 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Apr 05 15:55:54 [NOTICE ] [CLM ] New Configuration:
Apr 05 15:55:54 [NOTICE ] [CLM ] 10.2.1.7
Apr 05 15:55:54 [NOTICE ] [CLM ] Members Left:
Apr 05 15:55:54 [NOTICE ] [CLM ] Members Joined:
Apr 05 15:55:54 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Apr 05 15:55:54 [NOTICE ] [CLM ] New Configuration:
Apr 05 15:55:54 [NOTICE ] [CLM ] 10.2.1.7
Apr 05 15:55:54 [NOTICE ] [CLM ] Members Left:
Apr 05 15:55:54 [NOTICE ] [CLM ] Members Joined:
Apr 05 15:55:54 [NOTICE ] [SYNC ] This node is within the non-
primary component and will NOT provide any services.
Apr 05 15:55:54 [NOTICE ] [TOTEM] entering OPERATIONAL state.
Apr 05 15:55:54 [NOTICE ] [YKD ] This processor is within the
primary component.
Apr 05 15:55:54 [NOTICE ] [SYNC ] This node is within the primary
component and will provide service.
Apr 05 15:55:54 [NOTICE ] [SYNC ] Synchronization barrier completed
Apr 05 15:55:54 [NOTICE ] [SYNC ] Synchronization actions starting
for (openais cluster membership service B.01.01)
Apr 05 15:55:54 [NOTICE ] [YKD ] This processor is within the
primary component.
Apr 05 15:55:54 [NOTICE ] [SYNC ] This node is within the primary
component and will provide service.
Apr 05 15:55:54 [NOTICE ] [SYNC ] Synchronization barrier completed
Apr 05 15:55:54 [NOTICE ] [SYNC ] Synchronization actions starting
for (openais cluster membership service B.01.01)
Apr 05 15:55:54 [NOTICE ] [SYNC ] Synchronization actions done for
(openais cluster membership service B.01.01)
Apr 05 15:55:54 [NOTICE ] [CLM ] got nodejoin message 10.2.1.7
Apr 05 15:55:54 [NOTICE ] [SYNC ] Synchronization barrier completed
Apr 05 15:55:54 [NOTICE ] [SYNC ] Synchronization actions starting
for (openais checkpoint service B.01.01)
Apr 05 15:55:54 [NOTICE ] [SYNC ] Synchronization actions done for
(openais checkpoint service B.01.01)
Apr 05 15:55:54 [NOTICE ] [SYNC ] Synchronization barrier completed
Apr 05 15:55:54 [NOTICE ] [SYNC ] Synchronization actions starting
for (openais event service B.01.01)
Apr 05 15:55:54 [NOTICE ] [SYNC ] Synchronization actions done for
(openais event service B.01.01)
Apr 05 15:55:54 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:54 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:54 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:54 [NOTICE ] [TOTEM] Creating commit token because I am
the rep.
Apr 05 15:55:54 [NOTICE ] [TOTEM] Saving state aru 2a high seq
received 2a
Apr 05 15:55:54 [NOTICE ] [TOTEM] Storing new sequence id for ring
67872
Apr 05 15:55:54 [NOTICE ] [TOTEM] entering COMMIT state.
Apr 05 15:55:54 [NOTICE ] [TOTEM] entering RECOVERY state.
Apr 05 15:55:54 [NOTICE ] [TOTEM] position [0] member 10.2.1.7:
Apr 05 15:55:54 [NOTICE ] [TOTEM] previous ring seq 67868 rep 10.2.1.7
Apr 05 15:55:54 [NOTICE ] [TOTEM] aru 2a high delivered 2a received
flag 0
Apr 05 15:55:54 [NOTICE ] [TOTEM] position [1] member 10.2.11.5:
Apr 05 15:55:54 [NOTICE ] [TOTEM] previous ring seq 67868 rep 10.2.11.5
Apr 05 15:55:54 [NOTICE ] [TOTEM] aru 3a high delivered 3a received
flag 0
Apr 05 15:55:54 [NOTICE ] [TOTEM] copying all old ring messages from
2b-2a.
Apr 05 15:55:54 [NOTICE ] [TOTEM] Originated 0 messages in RECOVERY.
Apr 05 15:55:54 [NOTICE ] [TOTEM] Originated for recovery:
Apr 05 15:55:54 [NOTICE ] [TOTEM] Not Originated for recovery:
Apr 05 15:55:54 [NOTICE ] [TOTEM] Sending initial ORF token
Apr 05 15:55:54 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Apr 05 15:55:54 [NOTICE ] [CLM ] New Configuration:
Apr 05 15:55:54 [NOTICE ] [CLM ] 10.2.1.7
Apr 05 15:55:54 [NOTICE ] [CLM ] Members Left:
Apr 05 15:55:54 [NOTICE ] [CLM ] Members Joined:
Apr 05 15:55:54 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Apr 05 15:55:54 [NOTICE ] [CLM ] New Configuration:
Apr 05 15:55:54 [NOTICE ] [CLM ] 10.2.1.7
Apr 05 15:55:54 [NOTICE ] [CLM ] 10.2.11.5
Apr 05 15:55:54 [NOTICE ] [CLM ] Members Left:
Apr 05 15:55:54 [NOTICE ] [CLM ] Members Joined:
Apr 05 15:55:54 [NOTICE ] [CLM ] 10.2.11.5
Apr 05 15:55:54 [NOTICE ] [SYNC ] This node is within the non-
primary component and will NOT provide any services.
Apr 05 15:55:54 [NOTICE ] [TOTEM] entering OPERATIONAL state.
Apr 05 15:55:54 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:55 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:55 [NOTICE ] [TOTEM] Creating commit token because I am
the rep.
Apr 05 15:55:55 [NOTICE ] [TOTEM] Saving state aru 47 high seq
received 47
Apr 05 15:55:55 [NOTICE ] [TOTEM] Storing new sequence id for ring
67876
Apr 05 15:55:55 [NOTICE ] [TOTEM] entering COMMIT state.
Apr 05 15:55:55 [NOTICE ] [TOTEM] entering RECOVERY state.
Apr 05 15:55:55 [NOTICE ] [TOTEM] position [0] member 10.2.1.7:
Apr 05 15:55:55 [NOTICE ] [TOTEM] previous ring seq 67872 rep 10.2.1.7
Apr 05 15:55:55 [NOTICE ] [TOTEM] aru 47 high delivered 0 received
flag 0
Apr 05 15:55:55 [NOTICE ] [TOTEM] position [1] member 10.2.11.5:
Apr 05 15:55:55 [NOTICE ] [TOTEM] previous ring seq 67872 rep 10.2.1.7
Apr 05 15:55:55 [NOTICE ] [TOTEM] aru 47 high delivered f received
flag 0
Apr 05 15:55:55 [NOTICE ] [TOTEM] copying all old ring messages from
48-47.
Apr 05 15:55:55 [NOTICE ] [TOTEM] Originated 0 messages in RECOVERY.
Apr 05 15:55:55 [NOTICE ] [TOTEM] Originated for recovery:
Apr 05 15:55:55 [NOTICE ] [TOTEM] Not Originated for recovery:
Apr 05 15:55:55 [NOTICE ] [TOTEM] Sending initial ORF token
Apr 05 15:55:55 [ERROR ] [CKPT ] CKPT: ckpt_checkpoint_find_global
returned 0 Calling error_exit.
Apr 05 15:55:55 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Apr 05 15:55:55 [NOTICE ] [CLM ] New Configuration:
Apr 05 15:55:55 [NOTICE ] [CLM ] 10.2.1.7
Apr 05 15:55:55 [NOTICE ] [CLM ] 10.2.11.5
Apr 05 15:55:55 [NOTICE ] [CLM ] Members Left:
Apr 05 15:55:55 [NOTICE ] [CLM ] Members Joined:
Apr 05 15:55:55 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Apr 05 15:55:55 [NOTICE ] [CLM ] New Configuration:
Apr 05 15:55:55 [NOTICE ] [CLM ] 10.2.1.7
Apr 05 15:55:55 [NOTICE ] [CLM ] 10.2.11.5
Apr 05 15:55:55 [NOTICE ] [CLM ] Members Left:
Apr 05 15:55:55 [NOTICE ] [CLM ] Members Joined:
Apr 05 15:55:55 [NOTICE ] [SYNC ] This node is within the non-
primary component and will NOT provide any services.
Apr 05 15:55:55 [NOTICE ] [TOTEM] entering OPERATIONAL state.
Apr 05 15:55:55 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:55 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:55 [NOTICE ] [TOTEM] Creating commit token because I am
the rep.
Apr 05 15:55:55 [NOTICE ] [TOTEM] Saving state aru 46 high seq
received 46
Apr 05 15:55:55 [NOTICE ] [TOTEM] Storing new sequence id for ring
67880
Apr 05 15:55:55 [NOTICE ] [TOTEM] entering COMMIT state.
Apr 05 15:55:55 [NOTICE ] [TOTEM] entering RECOVERY state.
Apr 05 15:55:55 [NOTICE ] [TOTEM] position [0] member 10.2.1.7:
Apr 05 15:55:55 [NOTICE ] [TOTEM] previous ring seq 67876 rep 10.2.1.7
Apr 05 15:55:55 [NOTICE ] [TOTEM] aru 46 high delivered 0 received
flag 0
Apr 05 15:55:55 [NOTICE ] [TOTEM] position [1] member 10.2.11.5:
Apr 05 15:55:55 [NOTICE ] [TOTEM] previous ring seq 67876 rep 10.2.1.7
Apr 05 15:55:55 [NOTICE ] [TOTEM] aru 46 high delivered f received
flag 0
Apr 05 15:55:55 [NOTICE ] [TOTEM] copying all old ring messages from
47-46.
Apr 05 15:55:55 [NOTICE ] [TOTEM] Originated 0 messages in RECOVERY.
Apr 05 15:55:55 [NOTICE ] [TOTEM] Originated for recovery:
Apr 05 15:55:55 [NOTICE ] [TOTEM] Not Originated for recovery:
Apr 05 15:55:55 [NOTICE ] [TOTEM] Sending initial ORF token
Apr 05 15:55:55 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Apr 05 15:55:55 [NOTICE ] [CLM ] New Configuration:
Apr 05 15:55:55 [NOTICE ] [CLM ] 10.2.1.7
Apr 05 15:55:55 [NOTICE ] [CLM ] 10.2.11.5
Apr 05 15:55:55 [NOTICE ] [CLM ] Members Left:
Apr 05 15:55:55 [NOTICE ] [CLM ] Members Joined:
Apr 05 15:55:55 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Apr 05 15:55:55 [NOTICE ] [CLM ] New Configuration:
Apr 05 15:55:55 [NOTICE ] [CLM ] 10.2.1.7
Apr 05 15:55:55 [NOTICE ] [CLM ] 10.2.11.5
Apr 05 15:55:55 [NOTICE ] [CLM ] Members Left:
Apr 05 15:55:55 [NOTICE ] [CLM ] Members Joined:
Apr 05 15:55:55 [NOTICE ] [SYNC ] This node is within the non-
primary component and will NOT provide any services.
Apr 05 15:55:55 [NOTICE ] [TOTEM] entering OPERATIONAL state.
Apr 05 15:55:55 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:55 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] Creating commit token because I am
the rep.
Apr 05 15:55:56 [NOTICE ] [TOTEM] Saving state aru 46 high seq
received 46
Apr 05 15:55:56 [NOTICE ] [TOTEM] Storing new sequence id for ring
67884
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering COMMIT state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering RECOVERY state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] position [0] member 10.2.1.7:
Apr 05 15:55:56 [NOTICE ] [TOTEM] previous ring seq 67880 rep 10.2.1.7
Apr 05 15:55:56 [NOTICE ] [TOTEM] aru 46 high delivered 1e received
flag 0
Apr 05 15:55:56 [NOTICE ] [TOTEM] position [1] member 10.2.11.5:
Apr 05 15:55:56 [NOTICE ] [TOTEM] previous ring seq 67880 rep 10.2.1.7
Apr 05 15:55:56 [NOTICE ] [TOTEM] aru 46 high delivered 2d received
flag 0
Apr 05 15:55:56 [NOTICE ] [TOTEM] position [2] member 10.2.20.254:
Apr 05 15:55:56 [NOTICE ] [TOTEM] previous ring seq 67864 rep
10.2.20.254
Apr 05 15:55:56 [NOTICE ] [TOTEM] aru 23 high delivered 0 received
flag 0
Apr 05 15:55:56 [NOTICE ] [TOTEM] copying all old ring messages from
47-46.
Apr 05 15:55:56 [NOTICE ] [TOTEM] Originated 0 messages in RECOVERY.
Apr 05 15:55:56 [NOTICE ] [TOTEM] Originated for recovery:
Apr 05 15:55:56 [NOTICE ] [TOTEM] Not Originated for recovery:
Apr 05 15:55:56 [NOTICE ] [TOTEM] Sending initial ORF token
Apr 05 15:55:56 [NOTICE ] [TOTEM] Restoring instance->my_aru 46 my
high seq received 46
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] Creating commit token because I am
the rep.
Apr 05 15:55:56 [NOTICE ] [TOTEM] Storing new sequence id for ring
67888
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering COMMIT state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering RECOVERY state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] position [0] member 10.2.1.7:
Apr 05 15:55:56 [NOTICE ] [TOTEM] previous ring seq 67880 rep 10.2.1.7
Apr 05 15:55:56 [NOTICE ] [TOTEM] aru 46 high delivered 1e received
flag 1
Apr 05 15:55:56 [NOTICE ] [TOTEM] position [1] member 10.2.11.5:
Apr 05 15:55:56 [NOTICE ] [TOTEM] previous ring seq 67884 rep 10.2.1.7
Apr 05 15:55:56 [NOTICE ] [TOTEM] aru 0 high delivered 0 received
flag 0
Apr 05 15:55:56 [NOTICE ] [TOTEM] position [2] member 10.2.20.254:
Apr 05 15:55:56 [NOTICE ] [TOTEM] previous ring seq 67884 rep 10.2.1.7
Apr 05 15:55:56 [NOTICE ] [TOTEM] aru 0 high delivered 0 received
flag 0
Apr 05 15:55:56 [NOTICE ] [TOTEM] copying all old ring messages from
47-46.
Apr 05 15:55:56 [NOTICE ] [TOTEM] Originated 0 messages in RECOVERY.
Apr 05 15:55:56 [NOTICE ] [TOTEM] Originated for recovery:
Apr 05 15:55:56 [NOTICE ] [TOTEM] Not Originated for recovery:
Apr 05 15:55:56 [NOTICE ] [TOTEM] Sending initial ORF token
Apr 05 15:55:56 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Apr 05 15:55:56 [NOTICE ] [CLM ] New Configuration:
Apr 05 15:55:56 [NOTICE ] [CLM ] 10.2.1.7
Apr 05 15:55:56 [NOTICE ] [CLM ] 10.2.11.5
Apr 05 15:55:56 [NOTICE ] [CLM ] Members Left:
Apr 05 15:55:56 [NOTICE ] [CLM ] Members Joined:
Apr 05 15:55:56 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Apr 05 15:55:56 [NOTICE ] [CLM ] New Configuration:
Apr 05 15:55:56 [NOTICE ] [CLM ] 10.2.1.7
Apr 05 15:55:56 [NOTICE ] [CLM ] 10.2.11.5
Apr 05 15:55:56 [NOTICE ] [CLM ] 10.2.20.254
Apr 05 15:55:56 [NOTICE ] [CLM ] Members Left:
Apr 05 15:55:56 [NOTICE ] [CLM ] Members Joined:
Apr 05 15:55:56 [NOTICE ] [CLM ] 10.2.20.254
Apr 05 15:55:56 [NOTICE ] [SYNC ] This node is within the non-
primary component and will NOT provide any services.
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering OPERATIONAL state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] Creating commit token because I am
the rep.
Apr 05 15:55:56 [NOTICE ] [TOTEM] Saving state aru 1e high seq
received 1e
Apr 05 15:55:56 [NOTICE ] [TOTEM] Storing new sequence id for ring
67892
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering COMMIT state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering RECOVERY state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] position [0] member 10.2.1.7:
Apr 05 15:55:56 [NOTICE ] [TOTEM] previous ring seq 67888 rep 10.2.1.7
Apr 05 15:55:56 [NOTICE ] [TOTEM] aru 1e high delivered f received
flag 0
Apr 05 15:55:56 [NOTICE ] [TOTEM] position [1] member 10.2.11.5:
Apr 05 15:55:56 [NOTICE ] [TOTEM] previous ring seq 67884 rep 10.2.1.7
Apr 05 15:55:56 [NOTICE ] [TOTEM] aru 0 high delivered 0 received
flag 0
Apr 05 15:55:56 [NOTICE ] [TOTEM] position [2] member 10.2.20.254:
Apr 05 15:55:56 [NOTICE ] [TOTEM] previous ring seq 67884 rep 10.2.1.7
Apr 05 15:55:56 [NOTICE ] [TOTEM] aru 0 high delivered 0 received
flag 0
Apr 05 15:55:56 [NOTICE ] [TOTEM] copying all old ring messages from
1f-1e.
Apr 05 15:55:56 [NOTICE ] [TOTEM] Originated 0 messages in RECOVERY.
Apr 05 15:55:56 [NOTICE ] [TOTEM] Originated for recovery:
Apr 05 15:55:56 [NOTICE ] [TOTEM] Not Originated for recovery:
Apr 05 15:55:56 [NOTICE ] [TOTEM] Sending initial ORF token
Apr 05 15:55:56 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Apr 05 15:55:56 [NOTICE ] [CLM ] New Configuration:
Apr 05 15:55:56 [NOTICE ] [CLM ] 10.2.1.7
Apr 05 15:55:56 [NOTICE ] [CLM ] 10.2.11.5
Apr 05 15:55:56 [NOTICE ] [CLM ] 10.2.20.254
Apr 05 15:55:56 [NOTICE ] [CLM ] Members Left:
Apr 05 15:55:56 [NOTICE ] [CLM ] Members Joined:
Apr 05 15:55:56 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
Apr 05 15:55:56 [NOTICE ] [CLM ] New Configuration:
Apr 05 15:55:56 [NOTICE ] [CLM ] 10.2.1.7
Apr 05 15:55:56 [NOTICE ] [CLM ] 10.2.11.5
Apr 05 15:55:56 [NOTICE ] [CLM ] 10.2.20.254
Apr 05 15:55:56 [NOTICE ] [CLM ] Members Left:
Apr 05 15:55:56 [NOTICE ] [CLM ] Members Joined:
Apr 05 15:55:56 [NOTICE ] [SYNC ] This node is within the non-
primary component and will NOT provide any services.
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering OPERATIONAL state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering GATHER state.
Apr 05 15:55:56 [NOTICE ] [TOTEM] Saving state aru f high seq
received f
Apr 05 15:55:56 [NOTICE ] [TOTEM] Storing new sequence id for ring
67896
Apr 05 15:55:56 [NOTICE ] [TOTEM] entering COMMIT state.
Apr 05 15:55:57 [NOTICE ] [TOTEM] entering RECOVERY state.
Apr 05 15:55:57 [NOTICE ] [TOTEM] position [0] member 10.2.1.6:
Apr 05 15:55:57 [NOTICE ] [TOTEM] previous ring seq 67892 rep 10.2.1.6
Apr 05 15:55:57 [NOTICE ] [TOTEM] aru 1e high delivered 4 received
flag 0
Apr 05 15:55:57 [NOTICE ] [TOTEM] position [1] member 10.2.1.7:
Apr 05 15:55:57 [NOTICE ] [TOTEM] previous ring seq 67892 rep 10.2.1.7
Apr 05 15:55:57 [NOTICE ] [TOTEM] aru f high delivered 0 received
flag 0
Apr 05 15:55:57 [NOTICE ] [TOTEM] position [2] member 10.2.11.5:
Apr 05 15:55:57 [NOTICE ] [TOTEM] previous ring seq 67892 rep 10.2.1.7
Apr 05 15:55:57 [NOTICE ] [TOTEM] aru 11 high delivered 0 received
flag 0
Apr 05 15:55:57 [NOTICE ] [TOTEM] position [3] member 10.2.20.254:
Apr 05 15:55:57 [NOTICE ] [TOTEM] previous ring seq 67892 rep 10.2.1.7
Apr 05 15:55:57 [NOTICE ] [TOTEM] aru 1e high delivered 0 received
flag 0
Apr 05 15:55:57 [NOTICE ] [TOTEM] position [4] member 10.2.25.254:
Apr 05 15:55:57 [NOTICE ] [TOTEM] previous ring seq 67892 rep 10.2.1.6
Apr 05 15:55:57 [NOTICE ] [TOTEM] aru 2d high delivered f received
flag 0
Apr 05 15:55:57 [NOTICE ] [TOTEM] copying all old ring messages from
10-f.
Apr 05 15:55:57 [NOTICE ] [TOTEM] Originated 0 messages in RECOVERY.
Apr 05 15:55:57 [NOTICE ] [TOTEM] Originated for recovery:
Apr 05 15:55:57 [NOTICE ] [TOTEM] Not Originated for recovery:
Apr 05 15:55:57 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:57 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:57 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:57 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:57 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:57 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:57 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:57 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:58 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:58 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:58 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:58 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:58 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:58 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:58 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:58 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:58 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:59 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:59 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:59 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:59 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:59 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:59 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:59 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:55:59 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:00 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:00 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:00 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:00 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:00 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:00 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:01 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:01 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:01 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:02 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:02 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:02 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:02 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:03 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:03 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:03 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:03 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:04 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:04 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:04 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:05 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:05 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:05 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:05 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:06 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:06 [NOTICE ] [TOTEM] Retransmit List: 1 2 3 4 5 6 7 8 9
a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e
Apr 05 15:56:06 [ERROR ] [TOTEM] FAILED TO RECEIVE
Le 4 avr. 06 à 10:22, Steven Dake a écrit :
> Fabien,
>
> Please try this patch and see if it solves your problem. It should
> but
> I am not able to duplicate the assert. There was a programming error
> (few) in the protocol around handling of the recovery state.
>
> Regards
> -steve
>
> On Wed, 2006-03-29 at 09:51 +0200, Fabien THOMAS wrote:
>> First launch with 4 nodes: one node crashed.
>>
>> I think it is not related to your patch because i've seen this crash
>> before. :(
>>
>> (gdb) bt
>> #0 0x28187723 in kill () from /lib/libc.so.6
>> #1 0x280b61da in raise () from /usr/lib/libpthread.so.2
>> #2 0x281863d4 in abort () from /lib/libc.so.6
>> #3 0x28164358 in __assert () from /lib/libc.so.6
>> #4 0x0805179b in memb_state_recovery_enter (instance=0x83c6000,
>> commit_token=0x83e0650)
>> at totemsrp.c:1616
>> #5 0x08056944 in message_handler_memb_commit_token
>> (instance=0x83c6000, system_from=0x3fbfea00,
>> msg=0x83e0650, msg_len=2102, endian_conversion_needed=0) at
>> totemsrp.c:3576
>> #6 0x08056af1 in main_deliver_fn (context=0x83c6000,
>> system_from=0x3fbfea00, msg=0x83e0650,
>> msg_len=2102) at totemsrp.c:3635
>> #7 0x0804df5e in active_mcast_recv (instance=0x83b4680,
>> context=0x83c6000,
>> system_from=0x3fbfea00, msg=0x83e0650, msg_len=2102) at
>> totemrrp.c:393
>> #8 0x0804e30a in rrp_deliver_fn (context=0x83b55f0,
>> system_from=0x3fbfea00, msg=0x83e0650,
>> msg_len=2102) at totemrrp.c:549
>> #9 0x0804c402 in net_deliver_fn (handle=0, fd=6, revents=1,
>> data=0x83e0000, prio=0x83b4894)
>> at totemnet.c:687
>> #10 0x0804abc2 in poll_run (handle=0) at aispoll.c:424
>> #11 0x0805fa37 in main (argc=1, argv=0x3fbfecfc) at main.c:1313
>> (gdb) frame 4
>> #4 0x0805179b in memb_state_recovery_enter (instance=0x83c6000,
>> commit_token=0x83e0650)
>> at totemsrp.c:1616
>> 1616 totemsrp.c: No such file or directory.
>> in totemsrp.c
>> (gdb) print range
>> $1 = 4294967285
>> (gdb) print *instance
>> $2 = {first_run = 1, fcc_remcast_last = 0, fcc_mcast_last = 0,
>> fcc_mcast_current = 0,
>> fcc_remcast_current = 0, consensus_list = {{addr = {nodeid =
>> 117506570, family = 2,
>> addr = "\n\002\001\a\000\000\000\000\000\000\r;*D\030?"},
>> set = 1}, {addr = {
>> nodeid = 4263051786, family = 2, addr = "\n\002\031?\000\000
>> \000\000\000\000\r;*D\030?"},
>> set = 1}, {addr = {nodeid = 100729354, family = 2,
>> addr = "\n\002\001\006??W\237\005\bL1;\b 1"}, set = 1},
>> {addr = {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>}, set = 0} <repeats 29
>> times>}, consensus_list_entries = 2,
>> my_proc_list = {{nodeid = 117506570, family = 2,
>> addr = "\n\002\001\a", '\0' <repeats 11 times>}, {nodeid =
>> 4263051786, family = 2,
>> addr = "\n\002\031?", '\0' <repeats 11 times>}, {nodeid =
>> 100729354, family = 2,
>> addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =
>> 0, family = 0,
>> addr = '\0' <repeats 15 times>} <repeats 29 times>},
>> my_failed_list = {{nodeid = 100729354,
>> family = 2, addr = "\n\002\001\006", '\0' <repeats 11 times>},
>> {nodeid = 0, family = 0,
>> addr = '\0' <repeats 15 times>} <repeats 31 times>},
>> my_new_memb_list = {{
>> nodeid = 100729354, family = 2, addr = "\n\002\001\006", '\0'
>> <repeats 11 times>}, {
>> nodeid = 117506570, family = 2, addr = "\n\002\001\a", '\0'
>> <repeats 11 times>}, {
>> nodeid = 4263051786, family = 2, addr = "\n\002\031?", '\0'
>> <repeats 11 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}
>> <repeats 29 times>},
>> my_trans_memb_list = {{nodeid = 117506570, family = 2,
>> addr = "\n\002\001\a", '\0' <repeats 11 times>}, {nodeid =
>> 4263051786, family = 2,
>> addr = "\n\002\031?", '\0' <repeats 11 times>}, {nodeid = 0,
>> family = 0,
>> addr = '\0' <repeats 15 times>} <repeats 30 times>},
>> my_memb_list = {{nodeid = 117506570,
>> family = 2, addr = "\n\002\001\a", '\0' <repeats 11 times>},
>> {nodeid = 4263051786,
>> family = 2, addr = "\n\002\031?", '\0' <repeats 11 times>},
>> {nodeid = 0, family = 0,
>> ---Type <return> to continue, or q <return> to quit---
>> addr = '\0' <repeats 15 times>} <repeats 30 times>},
>> my_deliver_memb_list = {{
>> nodeid = 117506570, family = 2, addr = "\n\002\001\a", '\0'
>> <repeats 11 times>}, {
>> nodeid = 4263051786, family = 2, addr = "\n\002\031?", '\0'
>> <repeats 11 times>}, {
>> nodeid = 0, family = 0, addr = '\0' <repeats 15 times>}
>> <repeats 30 times>},
>> my_nodeid_lookup_list = {{nodeid = 117506570, family = 2,
>> addr = "\n\002\001\a", '\0' <repeats 11 times>}, {nodeid =
>> 4263051786, family = 2,
>> addr = "\n\002\031?", '\0' <repeats 11 times>}, {nodeid =
>> 100729354, family = 2,
>> addr = "\n\002\001\006", '\0' <repeats 11 times>}, {nodeid =
>> 0, family = 0,
>> addr = '\0' <repeats 15 times>} <repeats 29 times>},
>> my_proc_list_entries = 3,
>> my_failed_list_entries = 0, my_new_memb_entries = 3,
>> my_trans_memb_entries = 2,
>> my_memb_entries = 2, my_deliver_memb_entries = 2,
>> my_nodeid_lookup_entries = 3, my_ring_id = {
>> rep = {nodeid = 100729354, family = 2, addr = "\n\002\001\006",
>> '\0' <repeats 11 times>},
>> seq = 33764}, my_old_ring_id = {rep = {nodeid = 117506570,
>> family = 2,
>> addr = "\n\002\001\a", '\0' <repeats 11 times>}, seq = 33756},
>> my_aru_count = 0,
>> my_merge_detect_timeout_outstanding = 0, my_last_aru = 0,
>> my_seq_unchanged = 0,
>> my_received_flg = 1, my_high_seq_received = 4, my_install_seq = 0,
>> my_rotation_counter = 0,
>> my_set_retrans_flg = 0, my_retrans_flg_count = 0,
>> my_high_ring_delivered = 0,
>> heartbeat_timeout = 0, new_message_queue = {head = 78, tail = 41,
>> used = 36, usedhw = 36,
>> size = 181, items = 0x83e5000, size_per_item = 48, iterator =
>> 0}, retrans_message_queue = {
>> head = 0, tail = 499, used = 0, usedhw = 0, size = 500, items =
>> 0x83ce000, size_per_item = 48,
>> iterator = 0}, regular_sort_queue = {head = 0, size = 256, items
>> = 0x83c3000,
>> items_inuse = 0x83c0c00, size_per_item = 44, head_seqid = 0,
>> item_count = 256, pos_max = 4},
>> recovery_sort_queue = {head = 0, size = 256, items = 0x83d4000,
>> items_inuse = 0x83d7000,
>> size_per_item = 44, head_seqid = 0, item_count = 256, pos_max =
>> 0}, my_aru = 0,
>> ---Type <return> to continue, or q <return> to quit---
>> my_high_delivered = 0, token_callback_received_listhead = {next =
>> 0x83b3420, prev = 0x83b3420},
>> token_callback_sent_listhead = {next = 0x83c77f0, prev =
>> 0x83c77f0},
>> orf_token_retransmit = 0x83ca000 "", orf_token_retransmit_size =
>> 82, my_token_seq = 4294967295,
>> timer_orf_token_timeout = 0x83b3480,
>> timer_orf_token_retransmit_timeout = 0x83b34a0,
>> timer_orf_token_hold_retransmit_timeout = 0x0,
>> timer_merge_detect_timeout = 0x0,
>> memb_timer_state_gather_join_timeout = 0x0,
>> memb_timer_state_gather_consensus_timeout = 0x0,
>> memb_timer_state_commit_timeout = 0x0, timer_heartbeat_timeout
>> = 0x0,
>> totemsrp_log_level_security = 65538, totemsrp_log_level_error =
>> 131074,
>> totemsrp_log_level_warning = 196610, totemsrp_log_level_notice =
>> 262146,
>> totemsrp_log_level_debug = 327682, totemsrp_log_printf = 0x805fbe8
>> <internal_log_printf>,
>> memb_state = MEMB_STATE_COMMIT, my_id = {nodeid = 117506570,
>> family = 2,
>> addr = "\n\002\001\a", '\0' <repeats 11 times>}, next_memb =
>> {nodeid = 4263051786, family = 2,
>> addr = "\n\002\031?", '\0' <repeats 11 times>}, iov_buffer =
>> '\0' <repeats 8999 times>,
>> totemsrp_iov_recv = {iov_base = 0x0, iov_len = 0},
>> totemsrp_poll_handle = 0, totemsrp_recv = 0,
>> mcast_address = {nodeid = 0, family = 2, addr = "?^\001\001", '\0'
>> <repeats 11 times>},
>> totemsrp_deliver_fn = 0x8056c08 <totemmrp_deliver_fn>,
>> totemsrp_confchg_fn = 0x8056c3c <totemmrp_confchg_fn>,
>> global_seqno = 223, my_token_held = 0,
>> token_ring_id_seq = 33764, last_released = 0, set_aru =
>> 4294967295, old_ring_state_saved = 1,
>> old_ring_state_aru = 0, old_ring_state_high_seq_received = 4,
>> ring_saved = 1, my_last_seq = 15,
>> tv_old = {tv_sec = 0, tv_usec = 0}, totemrrp_handle = 0,
>> totem_config = 0x3fbfeb84,
>> use_heartbeat = 0}
>>
>>
>> Mar 29 7:45:09 [NOTICE ] [MAIN ] AIS Executive Service: Copyright
>> (C) 2002-2006 MontaVista Software, Inc. and contributors.
>> Mar 29 7:45:09 [WARNING ] [MAIN ] Could not lock memory of service
>> to avoid page faults
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] Token Timeout (1000 ms) retransmit
>> timeout (238 ms)
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] token hold (180 ms) retransmits
>> before loss (4 retrans)
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] join (100 ms) consensus (200 ms)
>> merge (200 ms)
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] downcheck (1000 ms) fail to recv
>> const (50 msgs)
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] seqno unchanged const (30
>> rotations) Maximum network MTU 1500
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] send threads (0 threads)
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] heartbeat_failures_allowed (0)
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] max_network_delay (50 ms)
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] HeartBeat is Disabled. To enable
>> set heartbeat_failures_allowed > 0
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] Receive multicast socket recv
>> buffer size (144000 bytes).
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] Transmit multicast socket send
>> buffer size (144000 bytes).
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] The network interface [10.2.1.7]
>> is now up.
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] Created or loaded sequence id
>> 33740.10.2.1.7 for this ring.
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] entering GATHER state.
>> Mar 29 7:45:09 [NOTICE ] [SERV ] openais component openais_cpg
>> loaded.
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Registering service handler
>> 'openais cluster closed process group service v1.01'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Initializing service handler
>> 'openais cluster closed process group service v1.01'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] openais component openais_cfg
>> loaded.
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Registering service handler
>> 'openais configuration service'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Initializing service handler
>> 'openais configuration service'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] openais component openais_msg
>> loaded.
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Registering service handler
>> 'openais message service B.01.01'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Initializing service handler
>> 'openais message service B.01.01'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] openais component openais_lck
>> loaded.
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Registering service handler
>> 'openais distributed locking service B.01.01'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Initializing service handler
>> 'openais distributed locking service B.01.01'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] openais component openais_evt
>> loaded.
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Registering service handler
>> 'openais event service B.01.01'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Initializing service handler
>> 'openais event service B.01.01'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] openais component openais_ckpt
>> loaded.
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Registering service handler
>> 'openais checkpoint service B.01.01'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Initializing service handler
>> 'openais checkpoint service B.01.01'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] openais component openais_amf
>> loaded.
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Registering service handler
>> 'openais availability management framework B.01.01'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Initializing service handler
>> 'openais availability management framework B.01.01'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] openais component openais_clm
>> loaded.
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Registering service handler
>> 'openais cluster membership service B.01.01'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Initializing service handler
>> 'openais cluster membership service B.01.01'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] openais component openais_evs
>> loaded.
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Registering service handler
>> 'openais extended virtual synchrony service'
>> Mar 29 7:45:09 [NOTICE ] [SERV ] Initializing service handler
>> 'openais extended virtual synchrony service'
>> Mar 29 7:45:09 [NOTICE ] [MAIN ] AIS Executive Service: started and
>> ready to receive connections.
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] Creating commit token because I am
>> the rep.
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] Saving state aru 0 high seq
>> received 0
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] Storing new sequence id for ring
>> 33744
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] entering COMMIT state.
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] position [0] member 10.2.1.7:
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] previous ring seq 33740 rep
>> 10.2.1.7
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] aru 0 high delivered 0 received
>> flag 1
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] copying all old ring messages from
>> 1-0.
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] Originated 0 messages in RECOVERY.
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] Originated for recovery:
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] Not Originated for recovery:
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] Sending initial ORF token
>> Mar 29 7:45:09 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
>> Mar 29 7:45:09 [NOTICE ] [CLM ] New Configuration:
>> Mar 29 7:45:09 [NOTICE ] [CLM ] Members Left:
>> Mar 29 7:45:09 [NOTICE ] [CLM ] Members Joined:
>> Mar 29 7:45:09 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
>> Mar 29 7:45:09 [NOTICE ] [CLM ] New Configuration:
>> Mar 29 7:45:09 [NOTICE ] [CLM ] 10.2.1.7
>> Mar 29 7:45:09 [NOTICE ] [CLM ] Members Left:
>> Mar 29 7:45:09 [NOTICE ] [CLM ] Members Joined:
>> Mar 29 7:45:09 [NOTICE ] [CLM ] 10.2.1.7
>> Mar 29 7:45:09 [NOTICE ] [SYNC ] This node is within the non-
>> primary component and will NOT provide any services.
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] entering OPERATIONAL state.
>> Mar 29 7:45:09 [NOTICE ] [YKD ] This processor is within the
>> primary component.
>> Mar 29 7:45:09 [NOTICE ] [SYNC ] This node is within the primary
>> component and will provide service.
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] entering GATHER state.
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] Creating commit token because I am
>> the rep.
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] Saving state aru 25 high seq
>> received 25
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] Storing new sequence id for ring
>> 33748
>> Mar 29 7:45:09 [NOTICE ] [TOTEM] entering COMMIT state.
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] The token was lost in state 3 from
>> timer 83c6000
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] entering GATHER state.
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] Creating commit token because I am
>> the rep.
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] Storing new sequence id for ring
>> 33752
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] entering COMMIT state.
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] position [0] member 10.2.1.7:
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] previous ring seq 33744 rep
>> 10.2.1.7
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] aru 25 high delivered 24 received
>> flag 1
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] position [1] member 10.2.25.254:
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] previous ring seq 33748 rep
>> 10.2.25.254
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] aru 2b high delivered 2b received
>> flag 1
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] copying all old ring messages from
>> 26-25.
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] Originated 0 messages in RECOVERY.
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] Originated for recovery:
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] Not Originated for recovery:
>> Mar 29 7:45:10 [NOTICE ] [TOTEM] Sending initial ORF token
>> Mar 29 7:45:11 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
>> Mar 29 7:45:11 [NOTICE ] [CLM ] New Configuration:
>> Mar 29 7:45:11 [NOTICE ] [CLM ] 10.2.1.7
>> Mar 29 7:45:11 [NOTICE ] [CLM ] Members Left:
>> Mar 29 7:45:11 [NOTICE ] [CLM ] Members Joined:
>> Mar 29 7:45:11 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
>> Mar 29 7:45:11 [NOTICE ] [CLM ] New Configuration:
>> Mar 29 7:45:11 [NOTICE ] [CLM ] 10.2.1.7
>> Mar 29 7:45:11 [NOTICE ] [CLM ] 10.2.25.254
>> Mar 29 7:45:11 [NOTICE ] [CLM ] Members Left:
>> Mar 29 7:45:11 [NOTICE ] [CLM ] Members Joined:
>> Mar 29 7:45:11 [NOTICE ] [CLM ] 10.2.25.254
>> Mar 29 7:45:11 [NOTICE ] [SYNC ] This node is within the non-
>> primary component and will NOT provide any services.
>> Mar 29 7:45:11 [NOTICE ] [TOTEM] entering OPERATIONAL state.
>> Mar 29 7:45:11 [NOTICE ] [YKD ] This processor is within the
>> primary component.
>> Mar 29 7:45:11 [NOTICE ] [SYNC ] This node is within the primary
>> component and will provide service.
>> Mar 29 7:45:11 [NOTICE ] [SYNC ] Synchronization barrier completed
>> Mar 29 7:45:11 [NOTICE ] [SYNC ] Synchronization actions starting
>> for (openais cluster membership service B.01.01)
>> Mar 29 7:45:11 [NOTICE ] [SYNC ] Synchronization actions done for
>> (openais cluster membership service B.01.01)
>> Mar 29 7:45:11 [NOTICE ] [CLM ] got nodejoin message 10.2.1.7
>> Mar 29 7:45:11 [NOTICE ] [CLM ] got nodejoin message 10.2.25.254
>> Mar 29 7:45:11 [NOTICE ] [SYNC ] Synchronization barrier completed
>> Mar 29 7:45:11 [NOTICE ] [SYNC ] Synchronization actions starting
>> for (openais checkpoint service B.01.01)
>> Mar 29 7:45:11 [NOTICE ] [SYNC ] Synchronization actions done for
>> (openais checkpoint service B.01.01)
>> Mar 29 7:45:11 [NOTICE ] [SYNC ] Synchronization barrier completed
>> Mar 29 7:45:11 [NOTICE ] [SYNC ] Synchronization actions starting
>> for (openais event service B.01.01)
>> Mar 29 7:45:11 [NOTICE ] [SYNC ] Synchronization actions done for
>> (openais event service B.01.01)
>> Mar 29 7:45:15 [NOTICE ] [TOTEM] entering GATHER state.
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] The token was lost in state 2 from
>> timer 83c6000
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] entering GATHER state.
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] entering GATHER state.
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] Creating commit token because I am
>> the rep.
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] Saving state aru e5 high seq
>> received e5
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] Storing new sequence id for ring
>> 33756
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] entering COMMIT state.
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] position [0] member 10.2.1.7:
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] previous ring seq 33752 rep
>> 10.2.1.7
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] aru e5 high delivered e5 received
>> flag 1
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] position [1] member 10.2.25.254:
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] previous ring seq 33752 rep
>> 10.2.1.7
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] aru e5 high delivered e5 received
>> flag 1
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] copying all old ring messages from
>> e6-e5.
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] Originated 0 messages in RECOVERY.
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] Originated for recovery:
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] Not Originated for recovery:
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] Sending initial ORF token
>> Mar 29 7:45:16 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
>> Mar 29 7:45:16 [NOTICE ] [CLM ] New Configuration:
>> Mar 29 7:45:16 [NOTICE ] [CLM ] 10.2.1.7
>> Mar 29 7:45:16 [NOTICE ] [CLM ] 10.2.25.254
>> Mar 29 7:45:16 [NOTICE ] [CLM ] Members Left:
>> Mar 29 7:45:16 [NOTICE ] [CLM ] Members Joined:
>> Mar 29 7:45:16 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
>> Mar 29 7:45:16 [NOTICE ] [CLM ] New Configuration:
>> Mar 29 7:45:16 [NOTICE ] [CLM ] 10.2.1.7
>> Mar 29 7:45:16 [NOTICE ] [CLM ] 10.2.25.254
>> Mar 29 7:45:16 [NOTICE ] [CLM ] Members Left:
>> Mar 29 7:45:16 [NOTICE ] [CLM ] Members Joined:
>> Mar 29 7:45:16 [NOTICE ] [SYNC ] This node is within the non-
>> primary component and will NOT provide any services.
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] entering OPERATIONAL state.
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] entering GATHER state.
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] Saving state aru 0 high seq
>> received 4
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] Storing new sequence id for ring
>> 33760
>> Mar 29 7:45:16 [NOTICE ] [TOTEM] entering COMMIT state.
>> Mar 29 7:45:17 [NOTICE ] [TOTEM] The token was lost in state 3 from
>> timer 83c6000
>> Mar 29 7:45:17 [NOTICE ] [TOTEM] entering GATHER state.
>> Mar 29 7:45:17 [NOTICE ] [TOTEM] Storing new sequence id for ring
>> 33764
>> Mar 29 7:45:17 [NOTICE ] [TOTEM] entering COMMIT state.
>> Mar 29 7:45:17 [NOTICE ] [TOTEM] position [0] member 10.2.1.6:
>> Mar 29 7:45:17 [NOTICE ] [TOTEM] previous ring seq 33748 rep
>> 10.2.1.6
>> Mar 29 7:45:17 [NOTICE ] [TOTEM] aru 23 high delivered 0 received
>> flag 1
>> Mar 29 7:45:17 [NOTICE ] [TOTEM] position [1] member 10.2.1.7:
>> Mar 29 7:45:17 [NOTICE ] [TOTEM] previous ring seq 33756 rep
>> 10.2.1.7
>> Mar 29 7:45:17 [NOTICE ] [TOTEM] aru 0 high delivered 0 received
>> flag 1
>> Mar 29 7:45:17 [NOTICE ] [TOTEM] position [2] member 10.2.25.254:
>> Mar 29 7:45:17 [NOTICE ] [TOTEM] previous ring seq 33756 rep
>> 10.2.1.7
>> Mar 29 7:45:17 [NOTICE ] [TOTEM] aru f high delivered 0 received
>> flag 1
>> Mar 29 7:45:17 [NOTICE ] [TOTEM] copying all old ring messages from
>> 10-4.
>> Assertion failed: (range < 1024), function memb_state_recovery_enter,
>> file totemsrp.c, line 1616.
>>
>> Le 28 mars 06 à 22:15, Steven Dake a écrit :
>>
>>> Find attached a patch which I think will fix one of the problem(s).
>>>
>>> I think I see the problem here at least with this debug log output.
>>>
>>> A synchronization is taking place and then the other node starts
>>> interrupting the synchronization. But there are still
>>> synchronization
>>> messages that are taking place.
>>>
>>> The sync service should ignore sync messages if the ring id under
>>> which
>>> they were originated is not the same ring id delivered in the last
>>> configuration change message. Remember it is possible for those
>>> recovery messages to sit queued. This is yet another reason why we
>>> need
>>> flushed totem.
>>>
>>> Regards
>>> -steve
>>>
>>>
>>> On Tue, 2006-03-28 at 11:29 -0700, Steven Dake wrote:
>>>> mark this event error only comes when the sync code is broken
>>>> right?
>>>>
>>>> Regards
>>>> -steve
>>>> email message attachment, "Forwarded message - [Bug 1153] Ramdom
>>>> crash
>>>> when 2nd instance of aisexec is launched"
>>>> On Tue, 2006-03-28 at 11:29 -0700, Steven Dake wrote:
>>>>> http://www.osdl.org/developer_bugzilla/show_bug.cgi?id=1153
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ------- Additional Comments From fabien.thomas at netasq.com
>>>>> 2006-03-28 05:06 -------
>>>>> i have more information, it seems that we have a race condition
>>>>> somewhere:
>>>>> i've two device one VIA eden at 400MHZ and one VIA eden at 800HMZ
>>>>> when the slow device is launched first the fast device crash very
>>>>> often
>>>>> when the fast device is launched first the slow device can
>>>>> connect to the cluster without problems.
>>>>>
>>>>> maybe it can help to understand the problem...
>>>>>
>>>>> here another trace smaller than the previous:
>>>>> Mar 28 13:04:52 [NOTICE ] [MAIN ] AIS Executive Service:
>>>>> Copyright (C) 2002-2006 MontaVista
>>>>> Software, Inc. and contributors.
>>>>> Mar 28 13:04:52 [WARNING ] [MAIN ] Could not lock memory of
>>>>> service to avoid page faults
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Token Timeout (1000 ms)
>>>>> retransmit timeout (238 ms)
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] token hold (180 ms)
>>>>> retransmits before loss (4 retrans)
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] join (100 ms) consensus (200
>>>>> ms) merge (200 ms)
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] downcheck (1000 ms) fail to
>>>>> recv const (50 msgs)
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] seqno unchanged const (30
>>>>> rotations) Maximum network MTU
>>>>> 1500
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] send threads (0 threads)
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] heartbeat_failures_allowed (0)
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] max_network_delay (50 ms)
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] HeartBeat is Disabled. To
>>>>> enable set heartbeat_failures_allowed >
>>>>> 0
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Receive multicast socket recv
>>>>> buffer size (144000 bytes).
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Transmit multicast socket send
>>>>> buffer size (144000 bytes).
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] The network interface
>>>>> [10.2.1.7] is now up.
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Created or loaded sequence id
>>>>> 5444.10.2.1.7 for this ring.
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] entering GATHER state.
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] openais component openais_cpg
>>>>> loaded.
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Registering service handler
>>>>> 'openais cluster closed process group
>>>>> service v1.01'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Initializing service handler
>>>>> 'openais cluster closed process group
>>>>> service v1.01'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] openais component openais_cfg
>>>>> loaded.
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Registering service handler
>>>>> 'openais configuration service'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Initializing service handler
>>>>> 'openais configuration service'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] openais component openais_msg
>>>>> loaded.
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Registering service handler
>>>>> 'openais message service B.01.01'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Initializing service handler
>>>>> 'openais message service B.01.01'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] openais component openais_lck
>>>>> loaded.
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Registering service handler
>>>>> 'openais distributed locking service B.
>>>>> 01.01'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Initializing service handler
>>>>> 'openais distributed locking service B.
>>>>> 01.01'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] openais component openais_evt
>>>>> loaded.
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Registering service handler
>>>>> 'openais event service B.01.01'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Initializing service handler
>>>>> 'openais event service B.01.01'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] openais component openais_ckpt
>>>>> loaded.
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Registering service handler
>>>>> 'openais checkpoint service B.01.01'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Initializing service handler
>>>>> 'openais checkpoint service B.01.01'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] openais component openais_amf
>>>>> loaded.
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Registering service handler
>>>>> 'openais availability management
>>>>> framework B.01.01'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Initializing service handler
>>>>> 'openais availability management
>>>>> framework B.01.01'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] openais component openais_clm
>>>>> loaded.
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Registering service handler
>>>>> 'openais cluster membership service B.
>>>>> 01.01'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Initializing service handler
>>>>> 'openais cluster membership service B.
>>>>> 01.01'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] openais component openais_evs
>>>>> loaded.
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Registering service handler
>>>>> 'openais extended virtual synchrony
>>>>> service'
>>>>> Mar 28 13:04:52 [NOTICE ] [SERV ] Initializing service handler
>>>>> 'openais extended virtual synchrony
>>>>> service'
>>>>> Mar 28 13:04:52 [NOTICE ] [MAIN ] AIS Executive Service: started
>>>>> and ready to receive connections.
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Creating commit token because
>>>>> I am the rep.
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Saving state aru 0 high seq
>>>>> received 0
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Storing new sequence id for
>>>>> ring 5448
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] entering COMMIT state.
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] position [0] member 10.2.1.7:
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] previous ring seq 5444 rep
>>>>> 10.2.1.7
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] aru 0 high delivered 0
>>>>> received flag 1
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] copying all old ring messages
>>>>> from 1-0.
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Originated 0 messages in
>>>>> RECOVERY.
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Originated for recovery:
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Not Originated for recovery:
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Sending initial ORF token
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] New Configuration:
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] Members Left:
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] Members Joined:
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] New Configuration:
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] 10.2.1.7
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] Members Left:
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] Members Joined:
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] 10.2.1.7
>>>>> Mar 28 13:04:52 [NOTICE ] [SYNC ] This node is within the non-
>>>>> primary component and will NOT
>>>>> provide any services.
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] entering OPERATIONAL state.
>>>>> Mar 28 13:04:52 [NOTICE ] [YKD ] This processor is within the
>>>>> primary component.
>>>>> Mar 28 13:04:52 [NOTICE ] [SYNC ] This node is within the
>>>>> primary component and will provide service.
>>>>> Mar 28 13:04:52 [NOTICE ] [SYNC ] Synchronization barrier
>>>>> completed
>>>>> Mar 28 13:04:52 [NOTICE ] [SYNC ] Synchronization actions
>>>>> starting for (openais cluster membership
>>>>> service B.01.01)
>>>>> Mar 28 13:04:52 [NOTICE ] [SYNC ] Synchronization actions done
>>>>> for (openais cluster membership
>>>>> service B.01.01)
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] got nodejoin message 10.2.1.7
>>>>> Mar 28 13:04:52 [NOTICE ] [SYNC ] Synchronization barrier
>>>>> completed
>>>>> Mar 28 13:04:52 [NOTICE ] [SYNC ] Synchronization actions
>>>>> starting for (openais checkpoint service B.
>>>>> 01.01)
>>>>> Mar 28 13:04:52 [NOTICE ] [SYNC ] Synchronization actions done
>>>>> for (openais checkpoint service B.
>>>>> 01.01)
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] entering GATHER state.
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Saving state aru 28 high seq
>>>>> received 28
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Storing new sequence id for
>>>>> ring 5452
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] entering COMMIT state.
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] position [0] member 10.2.1.6:
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] previous ring seq 5448 rep
>>>>> 10.2.1.6
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] aru 2b high delivered 2b
>>>>> received flag 1
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] position [1] member 10.2.1.7:
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] previous ring seq 5448 rep
>>>>> 10.2.1.7
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] aru 28 high delivered 27
>>>>> received flag 1
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] copying all old ring messages
>>>>> from 29-28.
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Originated 0 messages in
>>>>> RECOVERY.
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Originated for recovery:
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] Not Originated for recovery:
>>>>> Mar 28 13:04:52 [NOTICE ] [SYNC ] Synchronization barrier
>>>>> completed
>>>>> Mar 28 13:04:52 [NOTICE ] [SYNC ] Synchronization actions
>>>>> starting for (openais event service B.01.01)
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] New Configuration:
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] 10.2.1.7
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] Members Left:
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] Members Joined:
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] CLM CONFIGURATION CHANGE
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] New Configuration:
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] 10.2.1.6
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] 10.2.1.7
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] Members Left:
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] Members Joined:
>>>>> Mar 28 13:04:52 [NOTICE ] [CLM ] 10.2.1.6
>>>>> Mar 28 13:04:52 [NOTICE ] [SYNC ] This node is within the non-
>>>>> primary component and will NOT
>>>>> provide any services.
>>>>> Mar 28 13:04:52 [NOTICE ] [TOTEM] entering OPERATIONAL state.
>>>>> Mar 28 13:04:52 [ERROR ] [EVT ] recovery error node: (null)
>>>>> not found
>>>>> Assertion failed: (0), function evt_sync_process, file evt.c,
>>>>> line 4056.
>>>>>
>>>>>
>>>>>
>>>>> ------- You are receiving this mail because: -------
>>>>> You are on the CC list for the bug, or are watching someone who
>>>>> is.
>>>> _______________________________________________
>>>> Openais mailing list
>>>> Openais at lists.osdl.org
>>>> https://lists.osdl.org/mailman/listinfo/openais
>>>> <defect-1153-1.patch>
>>> _______________________________________________
>>> Openais mailing list
>>> Openais at lists.osdl.org
>>> https://lists.osdl.org/mailman/listinfo/openais
>>
>> <defect-1170.patch>
More information about the Openais
mailing list