[Openais] Redundant ring not recovering after issuing the command corosync-cfgtool -r

Darren Thompson darrent at akurit.com.au
Thu Apr 15 15:20:36 PDT 2010


Anyone?

On Wed, 2010-04-14 at 08:06 +0930, Darren Thompson wrote:

> Question.
> Is it better to use card bonding and a single ring, or unbonded
> interfaces and dual rings?
> 
> On 14/04/2010, at 4:04 AM, Steven Dake <sdake at redhat.com> wrote:
> 
> > On Tue, 2010-04-13 at 19:31 +0100, Tom Pride wrote:
> >> Just to clarify, when I ifdown eth1 corosync does detect a failure
> >> and it does mark the ring as faulty.  Are you saying that when I use ifup
> >> corosync can't work out that the interface is back up and
> >> communications can resume when I run corosync-cfgtool -r ?  Would I
> >> therefore get a different result if I introduced the failure by
> >> physically unplugging the cat5 from the server and then physically
> >> reconnecting the cat5?  What about if I shut down the port on the
> >> switch it is connected to?
> >>
> >
> > Yes, this is correct.  You should see proper operation if the network
> > link is lost normally (i.e. the NIC fails, the link fails, the switch
> > port fails, or the switch fails).
> >
> > When an interface is ifdowned, a special event is delivered to
> > corosync, which captures it and triggers special behavior (rebinding
> > to 127.0.0.1).  Pulling a network cable doesn't cause this same event
> > to occur.  This rebind behavior is incompatible with redundant ring.
> >
> > Regards
> > -steve
> >
> >> On Tue, Apr 13, 2010 at 6:33 PM, Steven Dake <sdake at redhat.com> wrote:
> >> On Tue, 2010-04-13 at 17:04 +0100, Tom Pride wrote:
> >>> Hi Steve,
> >>>
> >>> Thanks for the suggestion but that didn't work.  I'm not sure if you
> >>> read my entire post or not, but the two redundant rings that I have
> >>> configured, both work without a problem until I introduce a fault by
> >>> shutting down eth1 on one of the nodes.  This then causes the cluster
> >>> to mark ringid 0 as FAULTY.  When I then reactivate eth1 and both
> >>> nodes can once again ping each other over the network, I then run
> _______________________________________________
> Openais mailing list
> Openais at lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/openais