[Openais] OpenAIS ring marked FAULTY - administrative intervention required

Steven Dake sdake at redhat.com
Wed Apr 7 10:22:01 PDT 2010


On Wed, 2010-04-07 at 09:18 +0930, Darren Thompson wrote:
> Steven
> 
> I still do not understand why Corosync was forked off from OpenAIS.
> 

Answered in faq entry:
http://www.corosync.org/doku.php?id=faq:why

> We now have an incompatible mess of two 80% overlapping and partially
> co-dependent applications (you cannot even install OpenAIS now
> without first installing Corosync).

Corosync and OpenAIS are not co-dependent.  Corosync is a standalone
component, whereas openais depends on Corosync.  It is true there is
some overlap in API feature set.

Most people don't require SA Forum AIS APIs, in which case there is no
point in installing openais.  In all of our deployments, AIS APIs
account for less than 5%.  That means the AIS implementation in openais
is either bad or irrelevant.  I tend to believe the implementation is
pretty good...
 
> 
> What would it take to re-merge these two ugly ducklings back into a
> single cohesive product again?

The two projects have differing missions.  My philosophy is to do one
thing and do it well.

Regards
-steve

> Darren
> 
> 
> 
> On Tue, 2010-04-06 at 15:57 -0700, Steven Dake wrote: 
> > On Tue, 2010-04-06 at 15:26 +0200, Filip Sakalos wrote:
> > > Hi,
> > > 
> > > I am using openAIS and Pacemaker for clustering. I want to use two
> > > rings for communication between nodes. The problem is that one of the
> > > rings is always marked as faulty on one or both nodes:
> > > 
> > >  xen1:/home/filip # openais-cfgtool -s
> > >  Printing ring status.
> > >  RING ID 0
> > >          id      = 192.168.58.124
> > >          status  = Marking ringid 0 interface 192.168.58.124 FAULTY - adminisrtative intervention required.
> > >  RING ID 1
> > >          id      = 192.168.7.1
> > >          status  = ring 1 active with no faults
> > > 
> > > 
> > > Same on the other node:
> > > 
> > > xen2:~ # openais-cfgtool -s
> > > Printing ring status.
> > > RING ID 0
> > >         id      = 192.168.58.172
> > >         status  = Marking seqid 12298 ringid 0 interface 192.168.58.172 FAULTY - adminisrtative intervention required.
> > > RING ID 1
> > >         id      = 192.168.7.2
> > >         status  = ring 1 active with no faults
> > > 
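
The ring status check above can also be scripted. What follows is a minimal Python sketch, not part of openais, that parses text in the format of the quoted `openais-cfgtool -s` output and lists the rings whose status contains FAULTY. The function name `faulty_rings` and the `SAMPLE` constant are hypothetical, and the parser assumes the output keeps the "RING ID n" / "id =" / "status =" layout quoted here, including a status line wrapped onto a second line.

```python
import re

# Sample text copied from the xen1 output quoted in this thread
# (the "adminisrtative" spelling is what the tool printed).
SAMPLE = """\
Printing ring status.
RING ID 0
        id      = 192.168.58.124
        status  = Marking ringid 0 interface 192.168.58.124 FAULTY -
adminisrtative intervention required.
RING ID 1
        id      = 192.168.7.1
        status  = ring 1 active with no faults
"""


def faulty_rings(status_text):
    """Return {ring number: {"id": ..., "status": ...}} for FAULTY rings."""
    rings = {}
    current = None   # ring number currently being read
    last_key = None  # last field seen, so wrapped lines can be appended

    for raw in status_text.splitlines():
        line = raw.strip()
        header = re.match(r"RING ID (\d+)$", line)
        if header:
            current = int(header.group(1))
            rings[current] = {"id": "", "status": ""}
            last_key = None
            continue
        if current is None or not line:
            continue  # skip the banner line and blank lines
        field = re.match(r"(id|status)\s*=\s*(.*)$", line)
        if field:
            last_key = field.group(1)
            rings[current][last_key] = field.group(2)
        elif last_key:
            # Assume anything else is the tail of a wrapped field value.
            rings[current][last_key] += " " + line

    return {n: r for n, r in rings.items() if "FAULTY" in r["status"]}
```

On a live node, the captured output of the command could be passed in place of `SAMPLE`.
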
> > > This is my configuration file (/etc/ais/openais.conf):
> > > 
> > > # Please read the openais.conf.5 manual page
> > > 
> > > aisexec {
> > >     # Run as root - this is necessary to be able to manage resources with Pacemaker
> > >     user:    root
> > >     group:    root
> > > }
> > > 
> > > service {
> > >     # Load the Pacemaker Cluster Resource Manager
> > >     ver:       0
> > >     name:      pacemaker
> > >     use_mgmtd: 1
> > > }
> > > 
> > > totem {
> > >     version: 2
> > > 
> > >     # How long before declaring a token lost (ms)
> > >     token:          1000
> > > 
> > >     # How many token retransmits before forming a new configuration
> > >     token_retransmits_before_loss_const: 10
> > > 
> > >     # How long to wait for join messages in the membership protocol (ms)
> > >     join:           60
> > > 
> > >     # How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
> > >     consensus:      1500
> > > 
> > >     # Turn off the virtual synchrony filter
> > >     vsftype:        none
> > > 
> > >     # Number of messages that may be sent by one processor on receipt of the token
> > >     max_messages:   20
> > > 
> > >     # Stagger sending the node join messages by 1..send_join ms
> > >     send_join: 45
> > > 
> > >     # Limit generated nodeids to 31-bits (positive signed integers)
> > >     clear_node_high_bit: yes
> > > 
> > >     # Enable authentication and encryption of cluster traffic
> > >     secauth:    on
> > > 
> > >     # How many threads to use for encryption/decryption
> > >     threads:       0
> > > 
> > >     # Optionally assign a fixed node id (integer)
> > >     # nodeid:         1234
> > > 
> > >     rrp_mode: passive
> > > 
> > >     interface {
> > >         ringnumber: 0
> > >         # The following values need to be set based on your environment
> > >         bindnetaddr: 192.168.58.0
> > >         mcastaddr: 226.94.1.1
> > >         mcastport: 5405
> > >     }
> > > 
> > >     interface {
> > > 
> > >         ringnumber: 1
> > >         bindnetaddr: 192.168.7.0
> > >         mcastaddr: 226.94.1.2
> > >         mcastport: 5405
> > >     }
> > > }
> > > 
> > > #logging {
> > > #    debug: off
> > > #    fileline: off
> > > #    to_syslog: yes
> > > #    to_stderr: off
> > > #    syslog_facility: daemon
> > > #    timestamp: on
> > > #}
> > > 
> > > logging {
> > >     debug: on
> > >     to_file: yes
> > >     logfile: /var/log/openais.log
> > >     to_syslog: yes
> > >     syslog_facility: daemon
> > >     timestamp: on
> > > }
> > > 
> > > amf {
> > >     mode: disabled
> > > }
> > > 
> > > #eof
> > > 
> > > I can ping the other node without problem, ssh works too. Can anyone help?
> > > 
> > > 
> > 
> > I recommend using corosync instead of openais.  Corosync is much more
> > suitable for running pacemaker yet is nearly the same from a user
> > perspective (similar configuration, etc).
> > 
> > Please provide the syslog output from both nodes.
> > 
> > Also run ifconfig on both nodes and paste the output.
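
One thing the ifconfig output makes it possible to check is whether each node's address on a ring falls inside the network named by that ring's bindnetaddr. A rough Python sketch of that check follows, using the addresses from the quoted configuration and ring status. The /24 prefix length is an assumption (the netmasks are not shown in this thread), and the names `mismatched_rings`, `BINDNETADDRS` and `NODE_ADDRS` are made up for illustration. It is not a diagnosis of the fault reported here.

```python
import ipaddress

# bindnetaddr per ring, from the quoted openais.conf
BINDNETADDRS = {0: "192.168.58.0", 1: "192.168.7.0"}

# Per-node ring addresses, from the quoted openais-cfgtool output
NODE_ADDRS = {
    "xen1": {0: "192.168.58.124", 1: "192.168.7.1"},
    "xen2": {0: "192.168.58.172", 1: "192.168.7.2"},
}

ASSUMED_PREFIX = 24  # assumption: the real netmask must come from ifconfig


def mismatched_rings(bindnetaddrs, node_addrs, prefix):
    """Return (ring, node, message) tuples for every inconsistency found."""
    problems = []
    for ring, bind in sorted(bindnetaddrs.items()):
        # strict=False lets us build the network even if bind has host bits set
        net = ipaddress.ip_network(f"{bind}/{prefix}", strict=False)
        if str(net.network_address) != bind:
            problems.append(
                (ring, None, f"bindnetaddr {bind} is not the network address of {net}")
            )
        for node, addrs in sorted(node_addrs.items()):
            addr = ipaddress.ip_address(addrs[ring])
            if addr not in net:
                problems.append((ring, node, f"{addr} is outside {net}"))
    return problems


if __name__ == "__main__":
    for problem in mismatched_rings(BINDNETADDRS, NODE_ADDRS, ASSUMED_PREFIX):
        print(problem)
```

With the assumed /24 prefix the list comes back empty. With a narrower mask such as /25, xen2's ring 0 address would fall outside 192.168.58.0/25, which is why the real netmask from ifconfig matters.
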
> > 
> > Regards
> > -steve
> > 
> > > 
> > > Sincerely,
> > > Filip Sakalos
> > > _______________________________________________
> > > Openais mailing list
> > > Openais at lists.linux-foundation.org
> > > https://lists.linux-foundation.org/mailman/listinfo/openais
> > 
