[Openais] Node loss detection taking a long time

Steven Dake sdake at redhat.com
Wed Feb 4 08:08:34 PST 2009


10 seconds (10000 msec), i.e. the token value in your totem section.
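
A rough sketch of the arithmetic, assuming the token option is a timeout in milliseconds and comparing it with the two log timestamps quoted below (22:52:08 on s390vm14, 22:52:35 on s390vm16). The helper names are made up for illustration and are not part of openais, and the two timestamps come from different hosts whose clocks are only said to agree to within 3s, so the observed figure is approximate.

```python
import re

# Trimmed copy of the totem section quoted in this thread.
CONF = """
totem {
	token:          10000
	token_retransmits_before_loss_const: 20
	join:           60
	consensus:      4800
}
"""


def totem_option_msec(conf_text, name):
    """Hypothetical helper: return the integer value of `name:` in the config text."""
    pattern = r"^\s*" + re.escape(name) + r"\s*:\s*(\d+)\s*$"
    match = re.search(pattern, conf_text, re.MULTILINE)
    if match is None:
        raise KeyError(name)
    return int(match.group(1))


def hms_to_seconds(stamp):
    """Hypothetical helper: convert an HH:MM:SS syslog time to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Expected detection time: the token timeout, converted from msec to seconds.
token_seconds = totem_option_msec(CONF, "token") / 1000.0

# Observed gap between the last crmd line on vm14 and the token-loss line on vm16.
observed_seconds = hms_to_seconds("22:52:35") - hms_to_seconds("22:52:08")

print("expected:", token_seconds, "s; observed: about", observed_seconds, "s")
```

That gives 10 seconds expected against roughly 27 seconds observed, which is the discrepancy being asked about.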

Regards
-steve

On Wed, 2009-02-04 at 16:21 +0100, Andrew Beekhof wrote:
> Given the following totem section in openais.conf, how long would you
> expect whitetank to take to notice that the node was down?
> 
> totem {
> 	token:          10000
> 	token_retransmits_before_loss_const: 20
> 	join:           60
> 	consensus:      4800
> 	vsftype:        none
> 	max_messages:   20
> 
> 	nodeid: 16
> 	threads: 0
> 	secauth: on
> 	version: 2
> 	interface {
> 		ringnumber: 0
> 		bindnetaddr: 192.168.1.0
> 		mcastport: 5405
> 		mcastaddr: 226.94.1.1
> 	}
> 	rrp_mode: passive
> 	interface {
> 		ringnumber: 1
> 		bindnetaddr: 10.10.0.0
> 		mcastport: 5406
> 		mcastaddr: 226.94.1.10
> 	}
> }
> 
> It seems to have taken 30s or so (the times on vm14 and 16 are within  
> 3s of each other).
> 
> Feb  3 22:52:08 s390vm14 crmd: [28359]: debug: ...
> Feb  3 22:52:35 s390vm16 openais[17354]: [TOTEM] The token was lost in the OPERATIONAL state.
> 
> And because it's a VM, it was up again before openais calculated a new
> membership:
> 
> Feb  3 22:52:39 s390vm16 openais[17354]: [TOTEM] Did not need to originate any messages in recovery.
> Feb  3 22:52:39 s390vm16 openais[17354]: [CLM  ] CLM CONFIGURATION CHANGE
> Feb  3 22:52:39 s390vm16 openais[17354]: [CLM  ] New Configuration:
> Feb  3 22:52:39 s390vm16 openais[17354]: [CLM  ] 	r(0) ip(192.168.1.13) r(1) ip(10.10.220.109)
> Feb  3 22:52:39 s390vm16 openais[17354]: [CLM  ] 	r(0) ip(192.168.1.14) r(1) ip(10.10.220.110)
> Feb  3 22:52:39 s390vm16 openais[17354]: [CLM  ] 	r(0) ip(192.168.1.16) r(1) ip(10.10.220.112)
> Feb  3 22:52:39 s390vm16 openais[17354]: [CLM  ] Members Left:
> Feb  3 22:52:39 s390vm16 openais[17354]: [CLM  ] Members Joined:
> Feb  3 22:52:39 s390vm16 openais[17354]: [crm  ] notice: global_confchg_fn: Stable membership event on ring 4320: memb=3, new=0, lost=0
> Feb  3 22:52:39 s390vm16 openais[17354]: [crm  ] info: global_confchg_fn: MEMB: s390vm13 13
> Feb  3 22:52:39 s390vm16 openais[17354]: [crm  ] info: global_confchg_fn: MEMB: s390vm14 14
> Feb  3 22:52:39 s390vm16 openais[17354]: [crm  ] info: global_confchg_fn: MEMB: s390vm16 16
> 
> This confuses the rest of the cluster a little, because it's as if
> the node never left.