[Openais] PATCH: aisexec leak & corruption (whitetank & trunk)(retry)

Fabien THOMAS fabien.thomas at netasq.com
Tue Aug 29 08:05:38 PDT 2006


> The problem is very simple to reproduce:
>
> On the first node, run aisexec.
> On the same node, run a client that does:
> init
> open
> iterate
> close
> finalize
>
Sorry, I forgot to mention that another thread creates a checkpoint with
two sections and updates them in a loop.

> On another node, run aisexec:
> 	The recovery process is launched on the first node (which has all the
> information locally).
> 	=> The internal structure is corrupted if recovery happens during
> the iteration.
>
> What seems really strange to me is why we need to create a new
> checkpoint on a node that is the origin of the information.
> Why is it not possible to link the new checkpoint structure to the
> current iterators by name?
>
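To make the by-name idea above concrete, here is a minimal sketch. It assumes nothing about the real aisexec code: struct section, struct checkpoint, struct iteration_entry, section_find() and checkpoint_recover() are all hypothetical, simplified stand-ins. The iteration entry stores a section name instead of a pointer and resolves it against the current section array on each access, so a recovery that reallocates the array does not leave it pointing at freed memory.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the real aisexec structures. */
struct section {
    char name[32];
    int value;
};

struct checkpoint {
    struct section *sections;   /* replaced by a new array on recovery */
    size_t count;
};

/* The iteration entry remembers a section name, not a section address. */
struct iteration_entry {
    char section_name[32];
};

/* Look the section up by name in whatever array the checkpoint has now. */
struct section *section_find(struct checkpoint *ckpt, const char *name)
{
    assert(ckpt != NULL && name != NULL);
    for (size_t i = 0; i < ckpt->count; i++) {
        if (strcmp(ckpt->sections[i].name, name) == 0)
            return &ckpt->sections[i];
    }
    return NULL;
}

/* Simulated recovery: copy the sections into a new array, free the old one. */
int checkpoint_recover(struct checkpoint *ckpt)
{
    struct section *fresh = malloc(ckpt->count * sizeof(*fresh));
    if (fresh == NULL)
        return -1;
    memcpy(fresh, ckpt->sections, ckpt->count * sizeof(*fresh));
    free(ckpt->sections);
    ckpt->sections = fresh;
    return 0;
}

/* Returns the value read through the entry after recovery, or -1 on error. */
int demo_lookup_by_name(void)
{
    struct checkpoint ckpt;
    struct iteration_entry entry;
    struct section *sec;
    int value = -1;

    ckpt.count = 2;
    ckpt.sections = calloc(ckpt.count, sizeof(*ckpt.sections));
    if (ckpt.sections == NULL)
        return -1;
    snprintf(ckpt.sections[0].name, sizeof(ckpt.sections[0].name), "section-a");
    ckpt.sections[0].value = 1;
    snprintf(ckpt.sections[1].name, sizeof(ckpt.sections[1].name), "section-b");
    ckpt.sections[1].value = 2;

    /* The iteration is positioned on "section-b" and keeps only its name. */
    snprintf(entry.section_name, sizeof(entry.section_name), "section-b");

    /* Recovery swaps the section array while the iteration is still open. */
    if (checkpoint_recover(&ckpt) == 0) {
        sec = section_find(&ckpt, entry.section_name);
        if (sec != NULL)
            value = sec->value;
    }
    free(ckpt.sections);
    return value;
}
```

The cost is a lookup per access instead of a pointer dereference, and a section deleted during recovery shows up as a NULL result that the caller has to handle.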
> Sorry if it's a dumb question, but I'm not aware of all the operations
> done in aisexec.
>
> On 29 August 06 at 16:53, Muni Bajpai wrote:
>
>> So iteration at this point cannot survive a recovery process
>> because the iterator has references to the sections which are not
>> valid once recovery happens. Now recovery has no clue of these
>> references and hence cannot update them.
>>
>> What I propose is that we re-initialize the iterator after a
>> recovery to update the references. What this means is that any
>> current iteration would have to be restarted. I might be able to
>> get around that but that is the worst case scenario.
>>
>> - Muni
>>
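The re-initialization proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual patch: the generation counter, struct iterator, iterator_init(), iterator_next() and checkpoint_recover() are hypothetical names. Recovery bumps a counter on the checkpoint, and the iterator compares it with the value it saw when it was initialized; on a mismatch it starts over rather than using references that may no longer be valid.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real aisexec structures. */
struct checkpoint {
    int section_values[4];
    size_t section_count;
    unsigned int generation;    /* bumped each time recovery rebuilds sections */
};

struct iterator {
    const struct checkpoint *ckpt;
    size_t next_index;
    unsigned int generation;    /* checkpoint generation seen at (re)init */
};

void iterator_init(struct iterator *it, const struct checkpoint *ckpt)
{
    assert(it != NULL && ckpt != NULL);
    it->ckpt = ckpt;
    it->next_index = 0;
    it->generation = ckpt->generation;
}

/* Simulated recovery: the section list is treated as rebuilt. */
void checkpoint_recover(struct checkpoint *ckpt)
{
    ckpt->generation++;
}

/*
 * Returns 1 and stores a value when a section is available, 0 at the end.
 * Sets *restarted to 1 if a recovery forced the iteration to start over.
 */
int iterator_next(struct iterator *it, int *value, int *restarted)
{
    *restarted = 0;
    if (it->generation != it->ckpt->generation) {
        iterator_init(it, it->ckpt);
        *restarted = 1;
    }
    if (it->next_index >= it->ckpt->section_count)
        return 0;
    *value = it->ckpt->section_values[it->next_index++];
    return 1;
}

/* Returns the value read right after the forced restart, or -1 otherwise. */
int demo_restart_after_recovery(void)
{
    struct checkpoint ckpt = { {10, 20}, 2, 0 };
    struct iterator it;
    int value = 0, restarted = 0;

    iterator_init(&it, &ckpt);
    iterator_next(&it, &value, &restarted);   /* reads the first section */
    checkpoint_recover(&ckpt);                /* recovery in mid-iteration */
    iterator_next(&it, &value, &restarted);   /* restarts from the first section */
    return restarted ? value : -1;
}
```

As noted above, the drawback is that a client in the middle of an iteration sees sections it has already visited a second time.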
>> ----- Original Message -----
>> From: "Fabien THOMAS" <fabien.thomas at netasq.com>
>> To: <openais at lists.osdl.org>
>> Cc: "Muni Bajpai" <muniba at nortel.com>
>> Sent: Tuesday, August 29, 2006 5:18 AM
>> Subject: Re: [Openais] PATCH: aisexec leak & corruption (whitetank
>> & trunk)(retry)
>>
>>
>>> There is one remaining problem, but I cannot find the cause:
>>> during checkpoint recovery it seems that the structure is corrupted
>>> (the full log is attached to my previous post).
>>>
>>
>> Maybe I have an idea here:
>>
>> iteration_entry contains a pointer to a checkpoint section, and it seems
>> that the recovery process does not update the pointer to the new
>> section list address.
>> Can you confirm?
>>
>>
>> fabien
>> _______________________________________________
>> Openais mailing list
>> Openais at lists.osdl.org
>> https://lists.osdl.org/mailman/listinfo/openais
>>




