[Openais] Re: Configuration change question

Daniel McNeil daniel at osdl.org
Thu Oct 14 13:26:46 PDT 2004


On Thu, 2004-10-14 at 13:04, Steven Dake wrote:
> On Thu, 2004-10-14 at 12:26, Mark Haverkamp wrote:
> > Steve,
> > 
> > If I remember correctly, the code to deliver messages from the previous
> > configuration that happens in the transitional configuration isn't there
> > yet.  This may explain what I am seeing during the event service
> > recovery.  I now track open channels on all nodes and keep track by gmi
> > messages for opens and closes.  At reconfig time, each node sends its
> > open count for each channel via gmi to update any nodes that may be new.
> > What I am seeing is that sometimes the open count that a node receives
> > is different than its notion of opens for that node.  I think that maybe
> > an open or close was partially distributed then the config change
> > happened and some nodes didn't get the open/close.  Is it possible for
> 
> No, this is not possible even with the current code (unless there is a
> bug).  All messages will be recovered from the old configuration before
> any configuration change is delivered.  If all messages are not
> recovered, you will see repeating EVS %d %d %d lines, as I'm sure you
> have seen in the past.
> 
> If a message is sent after a configuration change, it will not be
> delivered until the new configuration is formed.
> 
> The idea of VS is that we can ensure that the messages and configuration
> changes occur in the same order on every processor that is a member of
> the old and new configuration.  This probably solves the problem you're
> having (if it works right).

Steve,

Can you clarify what you mean by "probably solves the problem
you're having"?

Is the current code recovering and delivering all old
configuration messages before the regular configuration change
function gets called?

What messages are sent in the transitional configuration?

In Mark's code he is assuming that all outstanding messages
from the previous configuration have been delivered; he then
sends the current 'open count' to all nodes using messages
with recovery priority, then unplugs and continues.

So the current code should be delivering all messages to 
all nodes in the same order even through configuration
changes, right?
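
To make the scenario concrete, here is a minimal sketch of the open-count
reconciliation as I understand it, assuming each node keeps a per-channel,
per-node open count and overwrites it with whatever the sender reports at
reconfig time. This is not the event service code: the Node class and the
deliver_open, deliver_close, own_counts and deliver_count_update names are
all hypothetical, and no gmi calls are modeled. It only shows how a
partially distributed open would surface as a mismatch.

```python
from collections import defaultdict


class Node:
    """Hypothetical model of one node's view of channel opens."""

    def __init__(self, name):
        self.name = name
        # channel -> {node name -> open count}, as this node believes it
        self.opens = defaultdict(lambda: defaultdict(int))

    def deliver_open(self, sender, channel):
        # An "open" message from `sender` was delivered to this node.
        self.opens[channel][sender] += 1

    def deliver_close(self, sender, channel):
        # A "close" message from `sender` was delivered to this node.
        self.opens[channel][sender] -= 1

    def own_counts(self):
        # What this node would send at reconfig time: its own open
        # count for every channel it currently has open.
        return {
            ch: per_node.get(self.name, 0)
            for ch, per_node in self.opens.items()
            if per_node.get(self.name, 0) > 0
        }

    def deliver_count_update(self, sender, counts):
        # Compare the sender's reported counts with this node's notion
        # of them, record any disagreement, then adopt the report.
        channels = set(counts) | {
            ch for ch, per_node in self.opens.items()
            if per_node.get(sender, 0)
        }
        mismatches = []
        for ch in sorted(channels):
            known = self.opens[ch].get(sender, 0)
            reported = counts.get(ch, 0)
            if known != reported:
                mismatches.append((ch, known, reported))
            self.opens[ch][sender] = reported
        return mismatches


a, b = Node("a"), Node("b")
for node in (a, b):
    node.deliver_open("a", "chan1")   # delivered everywhere
a.deliver_open("a", "chan1")          # delivered to "a" only

update = a.own_counts()
print(b.deliver_count_update("a", update))
print(a.deliver_count_update("a", update))
```

If messages really are delivered to all nodes in the same order before
the configuration change, the list of mismatches should always be empty,
so a non-empty list on any node would point at a delivery problem rather
than at the counting logic.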

Thanks,

Daniel






