[Openais] Node loss detection taking a long time
Andrew Beekhof
abeekhof at suse.de
Wed Feb 4 08:14:26 PST 2009
On Feb 4, 2009, at 5:08 PM, Steven Dake wrote:
> 10 seconds. (10000 msec).
That was my impression too (based on the 'token' setting, right?).
Have you any thoughts on what could have caused it to take 3 times that?
What information would you need to comment further?
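
For reference, a rough sketch of the arithmetic behind "3 times that", using the two timestamps from my original mail quoted below. It assumes the 22:52:08 crmd entry on s390vm14 roughly marks the moment the node went away (that line is truncated, so this is only an assumption) and allows for the 3s clock difference between the hosts. The names in it are just for illustration.

```python
from datetime import datetime

TOKEN_MS = 10000   # 'token' value from the quoted totem section
CLOCK_SKEW_S = 3   # hosts' clocks are within 3s of each other, per the report


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two 'HH:MM:SS' syslog stamps taken on the same day."""
    fmt = "%H:%M:%S"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


# Assumption: 22:52:08 (crmd on s390vm14) approximates when the node went down;
# 22:52:35 is when s390vm16 logged "The token was lost in the OPERATIONAL state".
observed = elapsed_seconds("22:52:08", "22:52:35")
low, high = observed - CLOCK_SKEW_S, observed + CLOCK_SKEW_S
expected = TOKEN_MS / 1000

print(f"observed: {observed:.0f}s (between {low:.0f}s and {high:.0f}s allowing for skew)")
print(f"expected: {expected:.0f}s, ratio: {observed / expected:.1f}x")
```

Even at the low end of that range the delay is well over twice the configured token timeout, so clock skew alone does not seem to explain it.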
>
>
> Regards
> -steve
>
> On Wed, 2009-02-04 at 16:21 +0100, Andrew Beekhof wrote:
>> Given the following totem section in openais.conf, how long would you
>> expect whitetank to take to notice that a node was down?
>>
>> totem {
>>     token: 10000
>>     token_retransmits_before_loss_const: 20
>>     join: 60
>>     consensus: 4800
>>     vsftype: none
>>     max_messages: 20
>>
>>     nodeid: 16
>>     threads: 0
>>     secauth: on
>>     version: 2
>>     interface {
>>         ringnumber: 0
>>         bindnetaddr: 192.168.1.0
>>         mcastport: 5405
>>         mcastaddr: 226.94.1.1
>>     }
>>     rrp_mode: passive
>>     interface {
>>         ringnumber: 1
>>         bindnetaddr: 10.10.0.0
>>         mcastport: 5406
>>         mcastaddr: 226.94.1.10
>>     }
>> }
>>
>> It seems to have taken 30s or so (the times on vm14 and 16 are within
>> 3s of each other).
>>
>> Feb 3 22:52:08 s390vm14 crmd: [28359]: debug: ...
>> Feb 3 22:52:35 s390vm16 openais[17354]: [TOTEM] The token was lost in the OPERATIONAL state.
>>
>> And because it's a VM, it was up again before openais calculated a new
>> membership:
>>
>> Feb 3 22:52:39 s390vm16 openais[17354]: [TOTEM] Did not need to originate any messages in recovery.
>> Feb 3 22:52:39 s390vm16 openais[17354]: [CLM ] CLM CONFIGURATION CHANGE
>> Feb 3 22:52:39 s390vm16 openais[17354]: [CLM ] New Configuration:
>> Feb 3 22:52:39 s390vm16 openais[17354]: [CLM ] r(0) ip(192.168.1.13) r(1) ip(10.10.220.109)
>> Feb 3 22:52:39 s390vm16 openais[17354]: [CLM ] r(0) ip(192.168.1.14) r(1) ip(10.10.220.110)
>> Feb 3 22:52:39 s390vm16 openais[17354]: [CLM ] r(0) ip(192.168.1.16) r(1) ip(10.10.220.112)
>> Feb 3 22:52:39 s390vm16 openais[17354]: [CLM ] Members Left:
>> Feb 3 22:52:39 s390vm16 openais[17354]: [CLM ] Members Joined:
>> Feb 3 22:52:39 s390vm16 openais[17354]: [crm ] notice: global_confchg_fn: Stable membership event on ring 4320: memb=3, new=0, lost=0
>> Feb 3 22:52:39 s390vm16 openais[17354]: [crm ] info: global_confchg_fn: MEMB: s390vm13 13
>> Feb 3 22:52:39 s390vm16 openais[17354]: [crm ] info: global_confchg_fn: MEMB: s390vm14 14
>> Feb 3 22:52:39 s390vm16 openais[17354]: [crm ] info: global_confchg_fn: MEMB: s390vm16 16
>>
>> This confuses the rest of the cluster a little, because it's as if
>> the node never left.