[Openais] Redundant ring not recovering after issuing the command corosync-cfgtool -r
Darren Thompson
darrent at akurit.com.au
Thu Apr 15 15:20:36 PDT 2010
Anyone?
On Wed, 2010-04-14 at 08:06 +0930, Darren Thompson wrote:
> Question.
> Is it better to use card bonding and a single ring or unbonded
> interfaces and dual rings?
>
> On 14/04/2010, at 4:04 AM, Steven Dake <sdake at redhat.com> wrote:
>
> > On Tue, 2010-04-13 at 19:31 +0100, Tom Pride wrote:
> >> Just to clarify, when I ifdown eth1 corosync does detect a failure
> >> and it does mark the ring as faulty. Are you saying that when I use
> >> ifup, corosync can't work out that the interface is back up and
> >> communications can resume when I run corosync-cfgtool -r? Would I
> >> therefore get a different result if I introduced the failure by
> >> physically unplugging the cat5 from the server and then physically
> >> reconnecting it? What about if I shut down the port on the switch
> >> it is connected to?
> >>
> >
> > Yes, this is correct. You should see proper operation if the network
> > link is lost normally (i.e. the NIC fails, the link fails, the switch
> > port fails, the switch fails).
> >
> > When an interface is taken down with ifdown, corosync receives a
> > special event and responds with special behavior: it rebinds to
> > 127.0.0.1. Pulling a network cable doesn't cause this event to
> > occur. This rebind behavior is incompatible with redundant ring.
> >
> > Regards
> > -steve
> >
> >> On Tue, Apr 13, 2010 at 6:33 PM, Steven Dake <sdake at redhat.com> wrote:
> >> On Tue, 2010-04-13 at 17:04 +0100, Tom Pride wrote:
> >>> Hi Steve,
> >>>
> >>> Thanks for the suggestion but that didn't work. I'm not sure if you
> >>> read my entire post or not, but the two redundant rings that I have
> >>> configured both work without a problem until I introduce a fault by
> >>> shutting down eth1 on one of the nodes. This then causes the cluster
> >>> to mark ringid 0 as FAULTY. When I then reactivate eth1 and both
> >>> nodes can once again ping each other over the network, I then run
> _______________________________________________
> Openais mailing list
> Openais at lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/openais
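
To tell the two failure cases described above apart (a ring that is merely FAULTY versus one that has rebound to 127.0.0.1 after an ifdown), a rough Python sketch that parses the output of corosync-cfgtool -s might help. The sample text, the field layout it assumes ("RING ID n" followed by "id = ..." and "status = ..." lines), and the helper names parse_ring_status, diagnose and read_status are all assumptions made for illustration; check them against what your corosync version actually prints.

```python
import re
import subprocess

# Hypothetical sample of `corosync-cfgtool -s` output, for illustration only.
# The exact wording of the status line may differ between corosync versions.
SAMPLE = (
    "Printing ring status.\n"
    "Local node ID 1\n"
    "RING ID 0\n"
    "\tid\t= 127.0.0.1\n"
    "\tstatus\t= Marking ringid 0 interface 127.0.0.1 FAULTY\n"
    "RING ID 1\n"
    "\tid\t= 10.0.0.1\n"
    "\tstatus\t= ring 1 active with no faults\n"
)


def parse_ring_status(text):
    """Return one dict per ring with keys 'ring', 'id' and 'status'."""
    rings = []
    current = None
    for line in text.splitlines():
        header = re.match(r"\s*RING ID (\d+)", line)
        if header:
            current = {"ring": int(header.group(1)), "id": None, "status": ""}
            rings.append(current)
            continue
        field = re.match(r"\s*(id|status)\s*=\s*(.*)", line)
        if field and current is not None:
            current[field.group(1)] = field.group(2).strip()
    return rings


def diagnose(rings):
    """Return (ring number, problem) pairs for rings that need attention.

    'rebound-to-loopback' means the ring address is 127.x.x.x, the state
    described in the thread after an ifdown. 'faulty' means the status
    text contains FAULTY while the ring is still bound to a real address.
    """
    problems = []
    for r in rings:
        if r["id"] is not None and r["id"].startswith("127."):
            problems.append((r["ring"], "rebound-to-loopback"))
        elif "FAULTY" in r["status"]:
            problems.append((r["ring"], "faulty"))
    return problems


def read_status():
    """Run corosync-cfgtool -s on the local node and return its stdout."""
    result = subprocess.run(
        ["corosync-cfgtool", "-s"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On a cluster node, diagnose(parse_ring_status(read_status())) would list the rings that need attention. Going by the explanation quoted above, a ring reported as "faulty" is the case where corosync-cfgtool -r should be able to re-enable it once the link is back, while "rebound-to-loopback" is the ifdown case that is incompatible with redundant ring.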