[Openais] Re: evt update for retained event recovery on config change
Steven Dake
sdake at mvista.com
Fri Sep 24 15:30:05 PDT 2004
On Fri, 2004-09-24 at 15:17, Mark Haverkamp wrote:
> On Fri, 2004-09-24 at 14:43, Steven Dake wrote:
> > Mark & Daniel,
> >
> > Clearly we are getting to the hard part (minus the low level
> > communication of course:-) of implementing the AIS specification... I
> > think this is an excellent first shot at merge recovery. I have a few
> > comments:
> >
> > name_match is already implemented. Use SaNameTisNameT. Feel free to
> > rename it to non-spasmodic style if you desire. When I started the AIS
> > code, I thought the executive should follow the AIS coding style, but I
> > have changed my mind to something more like the kernel coding style. I
> > want to retain it for the lib directory to help debuggability though.
> > I think parse.c is the correct place for this function today. We probably
> > want a util.c file to store our time related stuff and comparison
> > operations.
>
> Do you want a new comparison function in util.c or move the existing
> one? If you like, I can create a util.c and start with the match and
> time functions.
>
Sounds good, Mark.
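
As a rough illustration of the kind of comparison helper that could live in
util.c, here is a minimal sketch. It assumes a name is a length plus a byte
buffer, as SaNameT is in the AIS headers; struct example_name,
EXAMPLE_MAX_NAME_LENGTH and name_set are hypothetical stand-ins so the
snippet compiles on its own, and this is not the existing SaNameTisNameT
code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MAX_NAME_LENGTH 256

/* Hypothetical stand-in for SaNameT: a length plus a byte buffer. */
struct example_name {
	uint16_t length;
	uint8_t value[EXAMPLE_MAX_NAME_LENGTH];
};

/* Returns 1 when both names have the same length and bytes, else 0. */
int name_match(const struct example_name *a, const struct example_name *b)
{
	assert(a != NULL && b != NULL);

	if (a->length != b->length) {
		return 0;
	}
	if (a->length > EXAMPLE_MAX_NAME_LENGTH) {
		/* Corrupt length: refuse to read past the buffer. */
		return 0;
	}
	return memcmp(a->value, b->value, a->length) == 0;
}

/* Hypothetical helper: fill a name from a C string, truncating if needed. */
void name_set(struct example_name *n, const char *s)
{
	size_t len = strlen(s);

	if (len > EXAMPLE_MAX_NAME_LENGTH) {
		len = EXAMPLE_MAX_NAME_LENGTH;
	}
	memset(n, 0, sizeof(*n));
	n->length = (uint16_t)len;
	memcpy(n->value, s, len);
}
```

Comparing the length first avoids treating a name as equal to a longer name
that merely starts with the same bytes.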
> >
> > I have a feeling we could genericize the hashing of the node data
> > structure and add it somehow to exec/clm.c. This is not a high priority
> > at the moment, perhaps something we can address next year.
> >
> > Your approach to distributing events is clever (selecting the oldest
> > boot time).
> >
> > Would it help to know the previous configuration of every new member in
> > the configuration change? This way, each configuration could select one
> > member from the old synchronized set to synchronize its events to the
> > new configuration. I had kicked this idea around for checkpointing
> > sync, but haven't got to it yet.
>
> This is what Daniel and I have been kicking around. We thought that
> keeping track of a previous config would allow oldest nodes from each
> partition to distribute retained events from their partitions and get a
> more accurate distribution of retained events.
>
Cool, since we are both in agreement it can be useful, we can add
something like this. Since it's Friday, I'll have a look at this on
Monday, or if one of you wants to take it on, let me know.
I need to get to the checkpointing merge recovery soon so we can get to
a release...
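
To make the previous configuration idea concrete, here is a minimal sketch
of how each old partition could pick a single distributor. It assumes every
member of the new configuration reports an identifier for the configuration
it came from plus its boot time; struct member, prev_config_id and
is_distributor are hypothetical names, and breaking ties on the lowest node
id is an assumption, not something we have agreed on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node record exchanged during a configuration change. */
struct member {
	uint32_t node_id;        /* assumed unique within the new configuration */
	uint32_t prev_config_id; /* hypothetical id of the node's old configuration */
	uint64_t boot_time;      /* smaller value means the node booted earlier */
};

/*
 * Returns 1 if members[idx] should distribute retained events on behalf of
 * its previous configuration: it has the oldest boot time among the nodes
 * that share its prev_config_id, with ties broken by the lowest node_id.
 */
int is_distributor(const struct member *members, size_t count, size_t idx)
{
	const struct member *me;
	size_t i;

	assert(members != NULL && idx < count);
	me = &members[idx];

	for (i = 0; i < count; i++) {
		const struct member *other = &members[i];

		if (i == idx || other->prev_config_id != me->prev_config_id) {
			continue;
		}
		if (other->boot_time < me->boot_time) {
			return 0;
		}
		if (other->boot_time == me->boot_time &&
		    other->node_id < me->node_id) {
			return 0;
		}
	}
	return 1;
}
```

Every node can evaluate this locally on the same membership data and reach
the same answer, so no extra election message is needed in this sketch.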
> >
> > What happens if a retained event is expired during a configuration
> > change? I know this window is small, but next_retained at line 805 or so
> > may point to a deleted event and cause a segfault or some other
> > undefined behavior. I need to come up with a solution for checkpointing
> > too, so an exchange would be helpful on this subject.
> >
> > At line 2169, do you still see this happening? I think this can still
> > happen but I'm not sure. The changes to ensure we avoid it are not very
> > elegant...
>
> It can definitely happen, but I take that into account in the retained
> events expire code. If I'm deleting the next_retained event, I fix the
> next_retained pointer at that time.
>
Ok I understand the mechanism now.
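
For the record, a minimal sketch of that mechanism as I understand it: when
the event being expired is the one the recovery cursor points at, move the
cursor forward before freeing. The singly linked list and the names
retained_list, retained_event_add and retained_event_expire are hypothetical
and only for illustration; the real evt code keeps its own list structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct retained_event {
	struct retained_event *next;
	int id;
};

struct retained_list {
	struct retained_event *head;
	struct retained_event *next_retained; /* next event to send in recovery */
};

/* Hypothetical helper: push a new event at the head of the list. */
struct retained_event *retained_event_add(struct retained_list *list, int id)
{
	struct retained_event *ev = malloc(sizeof(*ev));

	if (ev == NULL) {
		return NULL;
	}
	ev->id = id;
	ev->next = list->head;
	list->head = ev;
	return ev;
}

/* Unlink and free ev, fixing the recovery cursor first if it points at ev. */
void retained_event_expire(struct retained_list *list,
			   struct retained_event *ev)
{
	struct retained_event **link;

	assert(list != NULL && ev != NULL);

	if (list->next_retained == ev) {
		list->next_retained = ev->next;
	}
	for (link = &list->head; *link != NULL; link = &(*link)->next) {
		if (*link == ev) {
			*link = ev->next;
			free(ev);
			return;
		}
	}
}
```

With the cursor fixed at expiry time, recovery never dereferences a freed
event, even if the expiry lands in the middle of a configuration change.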
> >
> > The rest looks good.
> >
> > I'll commit the token callback changes (minus the test code) for the
> > foundation. The rest of your patch should be committed with removal of
> > the duplicated function in the first comment above... I'm not sure you
> > have fully implemented expiration of retained events yet, so perhaps
> > the work relating to recovery of expired events can wait for another
> > commit.
> >
> > We should talk more about the previous configuration idea if you're
> > interested.
>
> Yup.
>
> Let me know what you want in the util.c file and I'll update my code for
> the new match and time functions that I'll place there and send the
> patch out one more time.
>
Sounds good.
> Thanks,
> Mark.