[Openais] corosync - CPG model_init + callback with totem ringid and members

Jan Friesse jfriesse at redhat.com
Thu Apr 8 07:57:22 PDT 2010


Attached is a patch that solves the 2nd problem.
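
To show what the 2nd problem looks like from the application side, here is a small standalone sketch of one way an initial callback could be requested at join time. It does not use the corosync headers. Every name in it (struct ring_id, FLAG_DELIVER_INITIAL_RING, group_join, app_totem_cb) is made up for illustration and is not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; these are NOT the corosync types. */
struct ring_id {
    uint32_t nodeid;   /* representative node of the ring */
    uint64_t seq;      /* ring sequence number */
};

typedef void (*totem_cb_fn)(void *ctx, struct ring_id ring);

#define FLAG_DELIVER_INITIAL_RING 0x1u   /* hypothetical flag name */

struct group_handle {
    unsigned int flags;
    totem_cb_fn totem_cb;
    void *ctx;
    struct ring_id current;   /* ring ID the library currently knows about */
};

/*
 * Join the group. If the application asked for it, report the current
 * ring ID right away instead of waiting for the next membership change.
 * Returns the number of totem callbacks fired by the join (0 or 1).
 */
int group_join(struct group_handle *h)
{
    assert(h != NULL);
    if ((h->flags & FLAG_DELIVER_INITIAL_RING) && h->totem_cb != NULL) {
        h->totem_cb(h->ctx, h->current);
        return 1;
    }
    return 0;
}

/* Application side: remember the most recent ring ID. */
struct app_state {
    int have_ring;
    struct ring_id ring;
};

void app_totem_cb(void *ctx, struct ring_id ring)
{
    struct app_state *app = ctx;

    app->ring = ring;
    app->have_ring = 1;
    printf("ring id %u.%llu\n",
           (unsigned int)ring.nodeid, (unsigned long long)ring.seq);
}
```

With the flag set, the application knows the ring ID as soon as the join returns; without it, have_ring stays 0 until the next real membership change.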

On the first problem, I agree with Chrissie, and I really have no idea
how to make the regular confchg precede totem_confchg.

Christine Caulfield wrote:
> On 07/04/10 20:32, David Teigland wrote:
>> On Tue, Apr 06, 2010 at 02:05:00PM +0200, Jan Friesse wrote:
>>> Same patch but rebased on top of Steve's change (today's trunk).
>>
>> Thanks, this is mostly working well, but I've found one problem, and one
>> additional thing I need (mentioned on irc already):
>>
>> 1. When a node joins, I get the totem callback before the corresponding
>> confchg callback.  When a node leaves I get them in the expected order:
>> confchg followed by totem callback.
> 
> 
> That *is* the expected order, as far as CPG is concerned anyway. The
> process is not deemed to be a member of the group until all nodes have
> seen its join message. It also makes more logical sense because the node
> has to join the cluster before the process joins the group.
> 
> 
>> 2. When my app starts up it needs to be able to get the current ring id,
>> so we need to be able to get/force an initial totem callback after a
>> cpg_join that indicates the current ring id.
>>
>>
>> I've also had a problem getting the current sequence number through
>> libcman/cman_get_cluster()/ci_generation ---
>>
>> On node 2 I see:
>>
>> in cman_dispatch statechange callback:
>>    call cman_get_cluster(), get generation 2124
>>    call cman_get_nodes(), see node 1 removed
>>
>> in cman_dispatch statechange callback:
>>    call cman_get_cluster(), get generation 2128
>>    call cman_get_nodes(), see node 1 added
>>
>> in cman_dispatch statechange callback:
>>    call cman_get_cluster(), get generation 2128 (expect 2132)
>>    call cman_get_nodes(), see node 1 removed
>>
>> in cman_dispatch statechange callback:
>>    call cman_get_cluster(), get generation 2136
>>    call cman_get_nodes(), see node 1 added
>>
>> The second time node 1 is removed I get the previous generation when
>> node 1 was added instead of generation 2132 which the callback is for.
>>
>> On node 4 I do get generation 2132 in that callback as expected.  So it
>> seems like it could be a race; I've only gone through this test once.
>>
> 
> There is almost certainly a race there. The ring IDs need to be
> delivered at the same time as the change notifications.
> 

Chrissie,
is that problem in cman or in my patch?

> Chrissie
> 

Regards,
  Honza
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 2010-04-08-cpg_model+totem_cb.patch
Url: http://lists.linux-foundation.org/pipermail/openais/attachments/20100408/9d97e963/attachment-0001.ksh 

