[Openais] Re: Configuration change question

Steven Dake sdake at mvista.com
Thu Oct 14 13:37:59 PDT 2004

On Thu, 2004-10-14 at 13:26, Daniel McNeil wrote:
> On Thu, 2004-10-14 at 13:04, Steven Dake wrote:
> > On Thu, 2004-10-14 at 12:26, Mark Haverkamp wrote:
> > > Steve,
> > > 
> > > If I remember correctly, the code to deliver messages from the previous
> > > configuration that happens in the transitional configuration isn't there
> > > yet.  This may explain what I am seeing during the event service
> > > recovery.  I now track open channels on all nodes and keep track by gmi
> > > messages for opens and closes.  At reconfig time, each node sends its
> > > open count for each channel via gmi to update any nodes that may be new.
> > > What I am seeing is that sometimes the open count that a node receives
> > > is different than its notion of opens for that node.  I think that maybe
> > > an open or close was partially distributed then the config change
> > > happened and some nodes didn't get the open/close.  Is it possible for
> > 
> > No this is not possible even with the current code (unless there is a
> > bug).  All messages will be recovered from the old configuration before
> > any configuration change is delivered.  If all messages are not
> > recovered, you will see repeating "EVS %d %d %d" lines, as I'm sure you
> > have seen in the past.
> > 
> > If a message is sent after a configuration change, it will not be
> > delivered until the new configuration is formed.
> > 
> > The idea of VS is that we can ensure that the messages and configuration
> > changes occur in the same order on every processor that is a member of
> > the old and new configuration.  This probably solves the problem you're
> > having (if it works right).
> Steve,
> Can you clarify what you mean by "probably solves the problem
> you're having"?

Sure.  I mean that the code should always ensure that messages
arrive in the same order on every processor.

> Is the current code recovering and delivering all old
> configuration messages before the regular configuration change
> function gets called?

It doesn't recover and deliver all old "configuration messages", but it
does recover and deliver all regular messages from the old configuration
(I think this is what you meant).

> What messages are sent in the transitional configuration?

None are sent yet.  This remains unimplemented.  If there is a hole
at the end of the configuration, then a transitional configuration
should be delivered, followed by any messages that were received after
the hole.  This is to indicate to the services that "hey,
you may be missing an important message relating to your operation, so
count all further messages as suspect".  The service may then ignore
them, or try to do some recovery in the next configuration.
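
A rough sketch of that ordering, for illustration only.  This is not
openais code: the function name delivery_order, the event tuples, and
the use of Python are all made up for the example, and it assumes that
sequence numbers start at 1 and that the new regular configuration is
delivered last.

```python
def delivery_order(received_seqs, last_seq):
    """Return the events a service would see at a configuration change.

    received_seqs: sequence numbers recovered from the old configuration.
    last_seq: highest sequence number known to have been sent in it.
    (Hypothetical model, not the gmi implementation.)
    """
    received = set(received_seqs)
    events = []
    seq = 1
    # Deliver in order until the first missing sequence number (the hole).
    while seq <= last_seq and seq in received:
        events.append(("msg", seq))
        seq += 1
    if seq <= last_seq:
        # A hole was found: warn the service first, then deliver whatever
        # arrived after the hole so the service can treat it as suspect.
        events.append(("transitional_config", None))
        for s in range(seq + 1, last_seq + 1):
            if s in received:
                events.append(("suspect_msg", s))
    # Assumption: the new regular configuration always comes last.
    events.append(("regular_config", None))
    return events
```

With messages 1, 2, 4 and 5 recovered out of 5, the service would see
1 and 2, then the transitional configuration, then 4 and 5 marked as
suspect, then the new configuration.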

> In Mark's code he is assuming that all outstanding messages
> have been delivered from the previous configuration, then he
> sends to all nodes the current 'open count' using messages
> with recovery priority, then unplugs and continues.
This seems correct, and given the way the gmi code works, it should work
every time (unless there is a hole, in which case you would know you
had that problem because openais would continually print out "EVS state"
with a bunch of numbers over and over).
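
To narrow down where the counts disagree, a check along these lines
could be run when a node's open counts arrive.  It is only a sketch:
the function name mismatched_channels and the dict layout (channel
name mapped to open count) are assumptions, not the event service's
actual data structures.

```python
def mismatched_channels(local_view, reported):
    """Compare the open counts a node reported with what we tracked for it.

    local_view: {channel: opens} as tracked locally for that node.
    reported:   {channel: opens} as received from that node at reconfig.
    Returns {channel: (local_count, reported_count)} for every mismatch.
    (Hypothetical helper, not part of openais.)
    """
    channels = set(local_view) | set(reported)
    return {
        ch: (local_view.get(ch, 0), reported.get(ch, 0))
        for ch in sorted(channels)
        if local_view.get(ch, 0) != reported.get(ch, 0)
    }
```

An empty result means the two views agree.  Any entry names a channel
where an open or close was seen on one side and not the other.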

> So the current code should be delivering all messages to 
> all nodes in the same order even through configuration
> changes, right?

You got it.  That's how it's supposed to work.  I really believe it works
correctly now, except for the hole case, which is related to transitional
configurations.  If you can show it not working, then we have a pretty
serious bug.
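
One way to show it, sketched here under the assumption that each node
can log what it delivers (messages and configuration changes) as a
list of strings or tuples: compare the logs from nodes that were in
both configurations and report the first position where they differ.
The name first_divergence and the log format are made up for this
example.

```python
from itertools import zip_longest


def first_divergence(logs):
    """Find the first position where the nodes' delivery logs disagree.

    logs: {node_name: [event, ...]} where each event is hashable.
    Returns (index, {node_name: event}) or None if all logs match.
    A shorter log shows up as None at the positions it is missing.
    """
    nodes = sorted(logs)
    for i, events in enumerate(zip_longest(*(logs[n] for n in nodes))):
        if len(set(events)) > 1:
            return i, dict(zip(nodes, events))
    return None
```

A non-None result on nodes that belong to both the old and the new
configuration would be the evidence of an ordering bug.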


> Thanks,
> Daniel
