[Openais] Re: evt update for retained event recovery on config change

Steven Dake sdake at mvista.com
Fri Sep 24 14:43:51 PDT 2004


Mark & Daniel,

Clearly we are getting to the hard part (minus the low level
communication of course:-) of implementing the AIS specification...  I
think this is an excellent first shot at merge recovery.  I have a few
comments:

name_match is already implemented.  Use SaNameTisNameT.  Feel free to
rename it to non-spazmodic style if you desire.  When I started the ais
code, I thought the executive should follow the ais coding style, but I
have changed my mind to something more like the kernel coding style.  I
want to retain it for the lib directory to help debuggability though.  I
think parse.c is the correct place for this function today.  We probably
want a util.c file to store our time related stuff and comparison
operations.

I have a feeling we could genericize the hashing of the node data
structure and add it somehow to exec/clm.c.  This is not a high priority
at the moment; perhaps it is something we can address next year.

Your approach to distributing events is clever (selecting the oldest
boot time).
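
For readers following along, the selection rule can be sketched roughly
as below.  This is only an illustration, not the code from the patch:
struct member, its fields, and pick_oldest() are hypothetical names, and
the tie-break on node id is an assumption added so that every node would
arrive at the same answer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node record; names are illustrative, not from evt.c. */
struct member {
	uint32_t node_id;
	uint64_t boot_time;	/* e.g. seconds since the epoch at node boot */
};

/*
 * Return the index of the member with the oldest (smallest) boot time.
 * Ties are broken by the lowest node_id (an assumption) so that every
 * node makes the same choice.  Returns -1 if the list is empty.
 */
static int pick_oldest(const struct member *m, size_t count)
{
	size_t i;
	size_t best = 0;

	if (count == 0)
		return -1;

	for (i = 1; i < count; i++) {
		if (m[i].boot_time < m[best].boot_time ||
		    (m[i].boot_time == m[best].boot_time &&
		     m[i].node_id < m[best].node_id))
			best = i;
	}
	return (int)best;
}
```

Because the rule depends only on data every member already holds, no
extra message exchange is needed to agree on the sender.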

Would it help to know the previous configuration of every new member in
the configuration change?  This way, each configuration could select one
member from the old synchronized set to synchronize its events to the
new configuration.  I had kicked this idea around for checkpointing
sync, but haven't got to it yet.
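
A rough sketch of what I mean, assuming each member of the new
configuration advertises an identifier for the configuration it came
from.  Everything here is hypothetical (struct new_member,
prev_config_id, is_sync_source(), and the lowest-node-id rule); nothing
like it exists in the tree yet.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one member of the new configuration. */
struct new_member {
	uint32_t node_id;
	uint32_t prev_config_id;	/* configuration the node was in before */
};

/*
 * Return 1 if member 'self' should synchronize its old configuration's
 * events to the new one, 0 otherwise.  The representative of each old
 * configuration is assumed to be the member with the lowest node_id
 * among those that share the same prev_config_id.
 */
static int is_sync_source(const struct new_member *m, size_t count,
			  size_t self)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (m[i].prev_config_id == m[self].prev_config_id &&
		    m[i].node_id < m[self].node_id)
			return 0;	/* someone else represents this set */
	}
	return 1;
}
```

With this rule, a merge of two partitions would have exactly two
senders, one per old configuration.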

What happens if a retained event expires during a configuration
change?  I know this window is small, but next_retained at line 805 or so
may point to a deleted event and cause a segfault or some other
undefined behavior.  I need to come up with a solution for checkpointing
too, so an exchange would be helpful on this subject.
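
One possible way to close the window, sketched under assumptions: the
list layout, struct names, and expire_event() below are hypothetical and
do not reflect the actual evt.c structures.  The idea is simply that
whoever frees an event first moves the recovery cursor off it.

```c
#include <stdlib.h>

/* Hypothetical retained event on a singly linked list. */
struct event {
	int id;
	struct event *next;
};

/* Hypothetical list head plus the recovery cursor. */
struct retained_list {
	struct event *head;
	struct event *next_retained;	/* next event to send during recovery */
};

/*
 * Unlink and free 'victim'.  If the recovery cursor points at the
 * victim, advance the cursor first so that it never dangles.
 * Does nothing if 'victim' is not on the list.
 */
static void expire_event(struct retained_list *l, struct event *victim)
{
	struct event **pp;

	for (pp = &l->head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == victim) {
			if (l->next_retained == victim)
				l->next_retained = victim->next;
			*pp = victim->next;
			free(victim);
			return;
		}
	}
}
```

The same pattern should carry over to checkpoint sync, where a section
could be deleted while it is the next item to be transmitted.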

At line 2169, do you still see this happening?  I think this can still
happen but I'm not sure.  The changes to ensure we avoid it are not very
elegant...

The rest looks good.

I'll commit the token callback changes (minus the test code) for the
foundation.  The rest of your patch should be committed once the
duplicated function from the first comment above is removed.  I'm not
sure you have fully implemented expiration of retained events yet, so
perhaps the work relating to recovery of expired events can wait for
another commit.

We should talk more about the previous configuration idea if you're
interested.

Regards
-steve

On Fri, 2004-09-24 at 10:44, Mark Haverkamp wrote:
> Steve,
> 
> Here is an update to the evt code.  It contains configuration change
> code to handle distributing retained events.  It is using the token
> callback code.  It seems to handle config changes at this time. It
> doesn't really handle merging of partitions that have operated
> independently for a while, but Daniel and I have some ideas for this to
> be added later.
> It also has the start of the code to handle channel opens via exec
> handlers. (Channel close, freeing data, etc. still TBD).
> 
> I changed the token callback handle to be an unsigned long since it is
> really a pointer.  I don't think that an int is guaranteed to be pointer
> size but an unsigned long is.
> 
> Mark.
