[Openais] Logs during reconfiguration (node lost)

Kristen Smith kjsmith at nortel.com
Mon Feb 21 09:19:03 PST 2005


Hi Steve,

We had some traffic running this weekend (5+1) and one of the nodes died
with the assertion failure that has already been reported:

aisexec: ../include/sq.h:152: sq_item_get: Assertion `sq_position >= 0' failed.

Looking through the logs from when this happened, I am confused about
something, and maybe you can clear it up for me.

We had 6 nodes (47.104.22.82 - 47.104.22.87) - the failure occurred on .84.
The reconfig looks the same on 4 of the remaining nodes and different on
the fifth. The logs are shown below.

My questions are:

1) Why do all nodes except .86 think that both .84 AND .86 went away? .84
died, so that makes sense, but why .86 as well?
2) Why does .86 think all the other nodes went away and it is all by itself?
3) Both .82 and .86 think they are the rep and create new commit tokens. I
guess this is because .86 thinks it is in a cluster by itself and .82 was
the original rep. Is that right?

Also, this is only the first reconfiguration triggered by the failure - all
nodes go through several more reconfigurations after this one. I can send
all the logs along later if you want. Eventually (within a second or so
after this initial reconfig), all the nodes wind up seeing each other and
the ring is reformed in a 5+0 scenario.

Thanks,
Kristen

Here are the logs from when the failure occurred:

.82:
Feb 19  2:24:23 [NOTICE  ] [GMI  ] Creating commit token because I am the rep.
Feb 19  2:24:23 [NOTICE  ] [GMI  ] Storing new sequence id for ring 4228
Feb 19  2:24:23 [NOTICE  ] [GMI  ] entering COMMIT state.
Feb 19  2:24:23 [NOTICE  ] [GMI  ] entering GATHER state.
Feb 19  2:24:23 [NOTICE  ] [GMI  ] entering GATHER state.
Feb 19  2:24:24 [NOTICE  ] [GMI  ] Creating commit token because I am the rep.
Feb 19  2:24:24 [NOTICE  ] [GMI  ] Storing new sequence id for ring 4232
Feb 19  2:24:24 [NOTICE  ] [GMI  ] entering COMMIT state.
Feb 19  2:24:24 [NOTICE  ] [GMI  ] entering RECOVERY state.
Feb 19  2:24:24 [NOTICE  ] [GMI  ] Sending initial ORF token
Feb 19  2:24:24 [NOTICE  ] [CLM  ] CLM CONFIGURATION CHANGE
Feb 19  2:24:24 [NOTICE  ] [CLM  ] New Configuration:
Feb 19  2:24:24 [NOTICE  ] [CLM  ]      47.104.22.82
Feb 19  2:24:24 [NOTICE  ] [CLM  ]      47.104.22.83
Feb 19  2:24:24 [NOTICE  ] [CLM  ]      47.104.22.85
Feb 19  2:24:24 [NOTICE  ] [CLM  ]      47.104.22.87
Feb 19  2:24:24 [NOTICE  ] [CLM  ] Members Left:
Feb 19  2:24:24 [NOTICE  ] [CLM  ]      47.104.22.84
Feb 19  2:24:24 [NOTICE  ] [CLM  ]      47.104.22.86
Feb 19  2:24:24 [NOTICE  ] [CLM  ] Members Joined:
Feb 19  2:24:24 [NOTICE  ] [CLM  ] CLM CONFIGURATION CHANGE
Feb 19  2:24:24 [NOTICE  ] [CLM  ] New Configuration:
Feb 19  2:24:24 [NOTICE  ] [CLM  ]      47.104.22.82
Feb 19  2:24:24 [NOTICE  ] [CLM  ]      47.104.22.83
Feb 19  2:24:24 [NOTICE  ] [CLM  ]      47.104.22.85
Feb 19  2:24:24 [NOTICE  ] [CLM  ]      47.104.22.87
Feb 19  2:24:24 [NOTICE  ] [CLM  ] Members Left:
Feb 19  2:24:24 [NOTICE  ] [CLM  ] Members Joined:
Feb 19  2:24:24 [NOTICE  ] [GMI  ] entering OPERATIONAL state.

.83, .85, .87:
Feb 19  2:14:56 [NOTICE  ] [GMI  ] entering GATHER state.
Feb 19  2:14:56 [NOTICE  ] [GMI  ] entering GATHER state.
Feb 19  2:14:57 [NOTICE  ] [GMI  ] Storing new sequence id for ring 4232
Feb 19  2:14:57 [NOTICE  ] [GMI  ] entering COMMIT state.
Feb 19  2:14:57 [NOTICE  ] [GMI  ] entering RECOVERY state.
Feb 19  2:14:57 [NOTICE  ] [CLM  ] CLM CONFIGURATION CHANGE
Feb 19  2:14:57 [NOTICE  ] [CLM  ] New Configuration:
Feb 19  2:14:57 [NOTICE  ] [CLM  ]      47.104.22.82
Feb 19  2:14:57 [NOTICE  ] [CLM  ]      47.104.22.83
Feb 19  2:14:57 [NOTICE  ] [CLM  ]      47.104.22.85
Feb 19  2:14:57 [NOTICE  ] [CLM  ]      47.104.22.87
Feb 19  2:14:57 [NOTICE  ] [CLM  ] Members Left:
Feb 19  2:14:57 [NOTICE  ] [CLM  ]      47.104.22.84
Feb 19  2:14:57 [NOTICE  ] [CLM  ]      47.104.22.86
Feb 19  2:14:57 [NOTICE  ] [CLM  ] Members Joined:
Feb 19  2:14:57 [NOTICE  ] [CLM  ] CLM CONFIGURATION CHANGE
Feb 19  2:14:57 [NOTICE  ] [CLM  ] New Configuration:
Feb 19  2:14:57 [NOTICE  ] [CLM  ]      47.104.22.82
Feb 19  2:14:57 [NOTICE  ] [CLM  ]      47.104.22.83
Feb 19  2:14:57 [NOTICE  ] [CLM  ]      47.104.22.85
Feb 19  2:14:57 [NOTICE  ] [CLM  ]      47.104.22.87
Feb 19  2:14:57 [NOTICE  ] [CLM  ] Members Left:
Feb 19  2:14:57 [NOTICE  ] [CLM  ] Members Joined:
Feb 19  2:14:57 [NOTICE  ] [GMI  ] entering OPERATIONAL state.

.86:
Feb 19  2:20:31 [NOTICE  ] [GMI  ] The token was lost in state 1 from timer 270f
Feb 19  2:20:31 [NOTICE  ] [GMI  ] entering GATHER state.
Feb 19  2:20:31 [NOTICE  ] [GMI  ] entering GATHER state.
Feb 19  2:20:31 [NOTICE  ] [GMI  ] entering GATHER state.
Feb 19  2:20:31 [NOTICE  ] [GMI  ] entering GATHER state.
Feb 19  2:20:31 [NOTICE  ] [GMI  ] entering GATHER state.
Feb 19  2:20:31 [NOTICE  ] [GMI  ] entering GATHER state.
Feb 19  2:20:31 [NOTICE  ] [GMI  ] Creating commit token because I am the rep.
Feb 19  2:20:31 [NOTICE  ] [GMI  ] Storing new sequence id for ring 4236
Feb 19  2:20:31 [NOTICE  ] [GMI  ] entering COMMIT state.
Feb 19  2:20:31 [NOTICE  ] [GMI  ] entering RECOVERY state.
Feb 19  2:20:31 [NOTICE  ] [GMI  ] Sending initial ORF token
Feb 19  2:20:31 [NOTICE  ] [CLM  ] CLM CONFIGURATION CHANGE
Feb 19  2:20:31 [NOTICE  ] [CLM  ] New Configuration:
Feb 19  2:20:31 [NOTICE  ] [CLM  ]      47.104.22.86
Feb 19  2:20:31 [NOTICE  ] [CLM  ] Members Left:
Feb 19  2:20:31 [NOTICE  ] [CLM  ]      47.104.22.82
Feb 19  2:20:31 [NOTICE  ] [CLM  ]      47.104.22.83
Feb 19  2:20:31 [NOTICE  ] [CLM  ]      47.104.22.84
Feb 19  2:20:31 [NOTICE  ] [CLM  ]      47.104.22.85
Feb 19  2:20:31 [NOTICE  ] [CLM  ]      47.104.22.87
Feb 19  2:20:31 [NOTICE  ] [CLM  ] Members Joined:
Feb 19  2:20:31 [NOTICE  ] [CLM  ] CLM CONFIGURATION CHANGE
Feb 19  2:20:31 [NOTICE  ] [CLM  ] New Configuration:
Feb 19  2:20:31 [NOTICE  ] [CLM  ]      47.104.22.86
Feb 19  2:20:31 [NOTICE  ] [CLM  ] Members Left:
Feb 19  2:20:31 [NOTICE  ] [CLM  ] Members Joined:
Feb 19  2:20:31 [NOTICE  ] [GMI  ] entering OPERATIONAL state.
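
To compare the membership views above across nodes, the CLM blocks can be
pulled out of each log mechanically. The following is a minimal sketch in
Python, assuming the line layout seen in these logs (timestamp, level,
subsystem tag, message). The function name parse_clm_changes and the regular
expressions are illustrative only and are not part of openais.

```python
import re

# Assumed layout, taken from the log excerpts above:
#   "Feb 19  2:24:24 [NOTICE  ] [CLM  ]      47.104.22.82"
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+\[\w+\s*\]\s+\[(\w+)\s*\]\s+(.*)$"
)
IP_RE = re.compile(r"^\d+\.\d+\.\d+\.\d+$")
SECTIONS = ("New Configuration", "Members Left", "Members Joined")


def parse_clm_changes(log_text):
    """Return one dict per 'CLM CONFIGURATION CHANGE' block, in log order.

    Each dict maps a section name to the list of addresses listed under it.
    Lines from other subsystems (e.g. GMI) and unparseable lines are skipped.
    """
    changes = []
    current = None   # dict for the block being filled, if any
    section = None   # section header most recently seen in that block
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw)
        if not m or m.group(1) != "CLM":
            continue
        msg = m.group(2).strip()
        if msg == "CLM CONFIGURATION CHANGE":
            current = {name: [] for name in SECTIONS}
            changes.append(current)
            section = None
        elif msg.rstrip(":") in SECTIONS:
            section = msg.rstrip(":")
        elif IP_RE.match(msg) and current is not None and section is not None:
            current[section].append(msg)
    return changes
```

Running this over each node's log and comparing the "Members Left" list of
the first block shows the disagreement directly: .82, .83, .85 and .87 list
.84 and .86 as gone, while .86 lists the other five nodes.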