[Openais] corosync ring marked FAULTY - administrative intervention required

Vadym Chepkov chepkov at yahoo.com
Mon Apr 12 05:50:16 PDT 2010


--- On Fri, 4/9/10, Steven Dake <sdake at redhat.com> wrote:

> 
> Broadcast and redundant ring probably don't work too well together.  If
> you really want to use broadcast, take care to ensure port numbers are
> separated by 2.  In your config, you're using port 5405 for one ring and
> 5406 for another.  Internally totem will use 5405+5404 for one ring, and
> 5405+5406 for another.  With multicast this isn't a problem since you
> could use different multicast addresses.  With broadcast, this is not the
> case.
> 
> Try fixing that and report back if it helps.  If not we can further
> investigate.
> 
> Regards
> -steve
> 

I have changed the ports and it did help, thank you. The reason I was using broadcast is that my second ring is a crossover cable, and I wasn't sure whether multicast makes any sense on such an interface. I also didn't know whether I could have one redundant ring with multicast and another with broadcast. I would really like to know how an expert would configure corosync for my setup: two nodes, two Ethernet cards each, one connected to a common switch and the other to a crossover link between the nodes.
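
For anyone hitting the same problem, the port overlap Steven describes can be checked with a few lines of Python. This is only a sketch: it assumes each ring uses its mcastport and mcastport - 1, as described above, and the helper names (totem_ports, overlapping_ports) are made up for illustration. The value 5407 for ring 1 is likewise just an example of a port separated by 2, not necessarily the one used here.

```python
def totem_ports(mcastport):
    """Return the UDP ports one ring uses: mcastport and mcastport - 1."""
    return {mcastport, mcastport - 1}


def overlapping_ports(ring_ports):
    """Map each UDP port claimed by more than one ring to the rings claiming it.

    ring_ports maps ringnumber -> mcastport (hypothetical input format).
    """
    claimed = {}
    for ring, mcastport in ring_ports.items():
        for port in totem_ports(mcastport):
            claimed.setdefault(port, []).append(ring)
    return {port: sorted(rings) for port, rings in claimed.items() if len(rings) > 1}


original = {0: 5405, 1: 5406}   # ports from the config quoted below
separated = {0: 5405, 1: 5407}  # assumed example of ports separated by 2

print(overlapping_ports(original))   # {5405: [0, 1]}
print(overlapping_ports(separated))  # {}
```

With broadcast there is no per-ring multicast address to keep the traffic apart, so any port reported by the check is shared by both rings.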

Thank you,
Vadym

> 
> > corosync-1.2.1-1.el5
> > 
> > Here is my config:
> > 
> > compatibility: none
> > 
> > aisexec {
> >         user:   root
> >         group:  root
> > }
> > 
> > service {
> >         name: pacemaker
> >         ver:  0
> > }
> > 
> > totem {
> >         version: 2
> >         token: 5000
> >         token_retransmits_before_loss_const: 20
> >         join: 1000
> >         consensus: 7500
> >         vsftype: none
> >         max_messages: 20
> >         secauth: off
> >         threads: 0
> >         clear_node_high_bit: yes
> >         rrp_mode: passive
> >         interface {
> >                 ringnumber: 0
> >                 broadcast: yes
> >                 bindnetaddr: 10.0.0.0
> >                 mcastport: 5405
> >         }
> >         interface {
> >                 ringnumber: 1
> >                 broadcast: yes
> >                 bindnetaddr: 207.207.163.0
> >                 mcastport: 5406
> >         }
> > }
> > 
> > logging {
> >         fileline: off
> >         to_stderr: no
> >         to_syslog: yes
> >         debug: on
> >         timestamp: on
> > }
> > 
> > amf {
> >         mode: disabled
> > }
> > 
> > [root at xen-11 ~]# ifconfig 
> > eth0      Link encap:Ethernet  HWaddr 00:30:48:62:4E:DC
> >           inet addr:207.207.163.11  Bcast:207.207.163.255  Mask:255.255.255.0
> >           inet6 addr: fe80::230:48ff:fe62:4edc/64 Scope:Link
> >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >           RX packets:2009418 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:799835 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:0
> >           RX bytes:1428434820 (1.3 GiB)  TX bytes:664164837 (633.3 MiB)
> > 
> > eth1      Link encap:Ethernet  HWaddr 00:30:48:62:4E:DD
> >           inet addr:10.0.0.1  Bcast:10.0.0.3  Mask:255.255.255.252
> >           inet6 addr: fe80::230:48ff:fe62:4edd/64 Scope:Link
> >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >           RX packets:4233811 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:14118095 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:1000
> >           RX bytes:518593446 (494.5 MiB)  TX bytes:14199338528 (13.2 GiB)
> >           Memory:d8060000-d8080000
> > 
> > [root at xen-12 ~]# ifconfig 
> > eth0      Link encap:Ethernet  HWaddr 00:30:48:62:4C:CA
> >           inet addr:207.207.163.12  Bcast:207.207.163.255  Mask:255.255.255.0
> >           inet6 addr: fe80::230:48ff:fe62:4cca/64 Scope:Link
> >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >           RX packets:1210002 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:473204 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:0
> >           RX bytes:698444593 (666.0 MiB)  TX bytes:1145344594 (1.0 GiB)
> > 
> > eth1      Link encap:Ethernet  HWaddr 00:30:48:62:4C:CB
> >           inet addr:10.0.0.2  Bcast:10.0.0.3  Mask:255.255.255.252
> >           inet6 addr: fe80::230:48ff:fe62:4ccb/64 Scope:Link
> >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >           RX packets:13776771 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:4008079 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:1000
> >           RX bytes:14138136203 (13.1 GiB)  TX bytes:493569061 (470.7 MiB)
> >           Memory:d8060000-d8080000
> > 
> > Cross-over connection on eth1
> > 
> > I don't see much detail in the message log; I probably need to
> > increase the debug level.
> > 
> > [root at xen-12 ~]# corosync-cfgtool -s
> > Printing ring status.
> > Local node ID 33554442
> > RING ID 0
> >     id        = 10.0.0.2
> >     status    = ring 0 active with no faults
> > RING ID 1
> >     id        = 207.207.163.12
> >     status    = Marking seqid 6594 ringid 1 interface 207.207.163.12 FAULTY - adminisrtative intervention required.
> > 
> > 
> > I can reset it just fine
> > 
> > [root at xen-12 ~]# corosync-cfgtool -r
> > Re-enabling all failed rings.
> > [root at xen-12 ~]# corosync-cfgtool -s
> > Printing ring status.
> > Local node ID 33554442
> > RING ID 0
> >     id        = 10.0.0.2
> >     status    = ring 0 active with no faults
> > RING ID 1
> >     id        = 207.207.163.12
> >     status    = ring 1 active with no faults
> > 
> > But it goes into FAULTY mode almost right away:
> > 
> > Apr  9 11:40:56 xen-12 corosync[13835]:   [TOTEM ] Marking seqid 18340 ringid 1 interface 207.207.163.12 FAULTY - adminisrtative intervention required.
> > 
> > That's the only message from corosync in the log.
> > 
> > Thank you,
> > Vadym Chepkov
> > 
> > _______________________________________________
> > Openais mailing list
> > Openais at lists.linux-foundation.org
> > https://lists.linux-foundation.org/mailman/listinfo/openais
> 

