[Openais] flow control and merge recovery

Steven Dake sdake at mvista.com
Thu Sep 23 16:32:31 PDT 2004


On Thu, 2004-09-23 at 16:10, Mark Haverkamp wrote:
> On Thu, 2004-09-23 at 15:53, Steven Dake wrote:
> > On Thu, 2004-09-23 at 15:29, Mark Haverkamp wrote:
> > > On Wed, 2004-09-22 at 14:04, Steven Dake wrote:
> > > > On Wed, 2004-09-22 at 13:47, Mark Haverkamp wrote:
> > > 
> > > > > 
> > > > > So, would I call gmi_send_ok()/gmi_mcast() until I could send no more?
> > > > > Couldn't I still do that without the token callback?
> > > > > 
> > > > 
> > > > Yup it could be done without the token callback.  The advantage of the
> > > > token callback is that it will specify when it makes sense to do
> > > > gmi_send_ok/gmi_mcast again to finish the recovery.
> > > 
> > > Are you working on this callback code now?  Also, looking at gmi_mcast,
> > > it seems to either return success (0) or assert.  Would it be reasonable
> > > to have gmi_mcast call gmi_send_ok, and return status if it can't queue
> > > the message?
> > > 
> > 
> > I have not started the callback code..  It should be pretty simple to
> > add..  If you want to use this mechanism, I'll add it in.  Ideas on the
> > interface?  Something like:
> > 
> > gmi_fc_open_create (void *handle, int (*callback_fn) (void *data), void *data);
> > gmi_fc_open_destroy (void *handle);
> > (flow control opened register/unregister)
> 
> This sounds useful.  One thing that I have been concerned about with my
> current approach is that if for some reason I can't do a gmi_mcast, I
> won't have a way to try again later, since I use the receipt of the
> message that I send to trigger the next one.  (I haven't seen this yet
> though.)
> 
> How does the function calling my callback code know that it can handle
> the mcast that I want to do?  Do I need to call gmi_send_ok first and
> just return 0 if I can't send my message?
> 

We can have gmi_mcast return -1 if it couldn't queue the message (because
the buffer was full or for some other reason) and add the matching
asserts to the rest of the gmi_mcast callers.
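
To make the proposed semantics concrete, here is a minimal sketch of a
multicast call that reports failure instead of asserting.  It is not the
openais code: struct msg_queue, sketch_send_ok, sketch_mcast, service_send
and the queue sizes are all hypothetical stand-ins for the real gmi queue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the gmi outgoing queue; sizes are illustrative. */
#define QUEUE_SLOTS 4
#define MSG_MAX 64

struct msg_queue {
    char slots[QUEUE_SLOTS][MSG_MAX];
    size_t used;
};

/* Returns 1 if one more message of this length can be queued, 0 otherwise. */
static int sketch_send_ok(const struct msg_queue *q, size_t len)
{
    return q->used < QUEUE_SLOTS && len <= MSG_MAX;
}

/* Returns 0 on success, -1 if the message could not be queued. */
static int sketch_mcast(struct msg_queue *q, const void *msg, size_t len)
{
    if (!sketch_send_ok(q, len)) {
        return -1;
    }
    memcpy(q->slots[q->used], msg, len);
    q->used++;
    return 0;
}

/* Service path: a failure to queue is a bug here, so the assert stays. */
static void service_send(struct msg_queue *q, const void *msg, size_t len)
{
    int res = sketch_mcast(q, msg, len);
    assert(res == 0);
    (void)res; /* avoids an unused-variable warning when NDEBUG is set */
}
```

A recovery path would call sketch_mcast directly, stop at the first -1,
and try again later instead of asserting.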

> 
> > 
> > if callback_fn returns -1, no more callbacks will be called.  This would
> > indicate the outgoing queues are full.  If callback_fn returns 0, more
> > callbacks would be called until all have been called or -1 is returned.
> > 
> > If a new configuration change comes while fc_open_destroy is pending,
> > call destroy to start the recovery over (with a new data element).
> 
> I'm not sure that I understand this.  Does this mean that if a new
> configuration change happens while I have an active callback, that I
> destroy the current one and create a new one?
> 
> 

It depends on how you want to do it.  I've been thinking of this for
checkpointing, and I think I want to destroy whatever context I pass in
data and start fresh with a new context.  But this may not be an issue,
depending on how it's implemented.
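
As a rough illustration of the callback semantics quoted above (callbacks
run in turn until one returns -1), here is a sketch of a registry and a
dispatch routine.  Everything in it is an assumption made for the example:
the names fc_open_create, fc_open_destroy and fc_open_dispatch, the integer
handle, the fixed-size table, and the recovery_ctx structure are not part
of openais.

```c
#include <stddef.h>

#define FC_MAX_CALLBACKS 8

typedef int (*fc_callback_fn)(void *data);

struct fc_entry {
    fc_callback_fn fn;
    void *data;
    int in_use;
};

static struct fc_entry fc_table[FC_MAX_CALLBACKS];

/* Registers a callback; stores its handle and returns 0, or -1 if full. */
static int fc_open_create(int *handle, fc_callback_fn fn, void *data)
{
    for (int i = 0; i < FC_MAX_CALLBACKS; i++) {
        if (!fc_table[i].in_use) {
            fc_table[i].fn = fn;
            fc_table[i].data = data;
            fc_table[i].in_use = 1;
            *handle = i;
            return 0;
        }
    }
    return -1;
}

/* Unregisters a callback; the caller still owns and frees its context. */
static void fc_open_destroy(int handle)
{
    if (handle >= 0 && handle < FC_MAX_CALLBACKS) {
        fc_table[handle].in_use = 0;
    }
}

/*
 * Run when flow control opens.  Calls registered callbacks in order and
 * stops after the first one that returns -1 (outgoing queues full).
 * Returns the number of callbacks that were invoked.
 */
static int fc_open_dispatch(void)
{
    int called = 0;
    for (int i = 0; i < FC_MAX_CALLBACKS; i++) {
        if (!fc_table[i].in_use) {
            continue;
        }
        called++;
        if (fc_table[i].fn(fc_table[i].data) == -1) {
            break;
        }
    }
    return called;
}

/* Example context: messages left to send, and how many fit per call. */
struct recovery_ctx {
    int remaining;
    int budget;
};

/* Sends up to budget messages; -1 means "queue full, call me again". */
static int recovery_callback(void *data)
{
    struct recovery_ctx *ctx = data;
    int sent = 0;
    while (ctx->remaining > 0 && sent < ctx->budget) {
        ctx->remaining--;
        sent++;
    }
    return ctx->remaining > 0 ? -1 : 0;
}
```

On a new configuration change, a service could call fc_open_destroy on its
handle and then fc_open_create again with a fresh recovery_ctx, which
matches the "destroy and start fresh" approach described above.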

> > 
> > We can make gmi_mcast not assert, but we have to be careful.  The assert
> > is in there to catch bugs..  If a caller calls gmi_mcast, and the
> > message can't be queued, then in every case in the current openais that
> > is a serious bug.  This shouldn't happen today with the flow control
> > code (which is why there is an assert there).
> 
> As long as the mcast is done from the library interface side.
> 

right

> > 
> > If we change the semantics of gmi_mcast, by allowing it to fail to queue
> > without asserting, we should be careful to either handle return values
> > (in the case of recovery) or assert where a -1 return value shouldn't
> > happen.
> > 
> > So most of the gmi_mcast calls would be something like:
> > 
> > res = gmi_mcast (...)
> > assert (res == 0);
> > 
> > for all of the services, except in the case where a res of -1 can be
> > handled (such as merge recovery).
> 
> Yes.
> 
> Mark.

I'll work on the token rotation callback.  Can you work up the patch for
the gmi_send_ok/gmi_mcast/assert changes?

Thanks
-steve



