[Openais] corosync 1.2.5 still doesn't shutdown properly

Steven Dake sdake at redhat.com
Tue Jun 22 10:49:58 PDT 2010


On 06/22/2010 03:56 AM, Vadym Chepkov wrote:
> Hi,
>
> I decided to check if I can start using corosync again on several of
> my clusters (have to use heartbeat there at the moment).
> I don't even have any services defined in corosync.conf, commented
> pacemaker out, just plain corosync and it never goes down:
>
> # ps axf|grep corosync
> 26294 pts/0    S+     0:00  |               \_ /bin/sh /sbin/service corosync restart
> 26299 pts/0    S+     0:01  |                   \_ /bin/bash /etc/init.d/corosync restart
> 29249 pts/1    S+     0:00                  \_ grep corosync
> 25959 ?        Ssl    0:00 corosync
>
>
> I attached to the process and this is where it hangs:
>
> (gdb) where
> #0  0x0fe14134 in poll () from /lib/libc.so.6
> #1  0x0ffbc530 in poll_run (handle=150346236434579456) at coropoll.c:413
> #2  0x10006e50 in main (argc=<value optimized out>, argv=<value optimized out>) at main.c:1576
>
> How can I help to debug this problem?
> It is 100% reproducible.
>
> Thank you,
> Vadym

Vadym,

Thanks for the feedback.  I do test this scenario and it works for me:

[root@cast flatiron]# service corosync start
Starting Corosync Cluster Engine (corosync):               [  OK  ]
[root@cast flatiron]# service corosync restart
Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ]
Waiting for corosync services to unload:.                  [  OK  ]
Starting Corosync Cluster Engine (corosync):               [  OK  ]
[root@cast flatiron]# service corosync stop
Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ]
Waiting for corosync services to unload:.                  [  OK  ]
[root@cast flatiron]# service corosync start
Starting Corosync Cluster Engine (corosync):               [  OK  ]
[root@cast flatiron]# /etc/init.d/corosync restart
Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ]
Waiting for corosync services to unload:.                  [  OK  ]
Starting Corosync Cluster Engine (corosync):               [  OK  ]


One thing that would stop corosync from shutting down is if it could not
enter the operational state.  This often happens because a firewall is
blocking the ports corosync uses to communicate.
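
To check the firewall theory, it helps to know which UDP ports to look
for.  The following is only a rough sketch: corosync_ports is a
hypothetical helper (not part of corosync) that reads mcastport from a
corosync.conf and prints it together with mcastport - 1, which corosync
also uses.  The default path /etc/corosync/corosync.conf and the
fallback port 5405 are assumptions; adjust them to your setup.

```shell
# Hypothetical helper, not shipped with corosync.
# Prints "<mcastport - 1> <mcastport>" for the given corosync.conf.
# Assumption: falls back to 5405 when no mcastport line is found.
corosync_ports() {
    conf="${1:-/etc/corosync/corosync.conf}"
    port=""
    if [ -r "$conf" ]; then
        # Take the first uncommented "mcastport: N" line.
        port=$(sed -n 's/^[[:space:]]*mcastport:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$conf" | head -n 1)
    fi
    port="${port:-5405}"
    echo "$((port - 1)) $port"
}

# Example use (as root), listing INPUT rules that mention either port:
#   for p in $(corosync_ports); do iptables -L INPUT -n | grep -w "dpt:$p"; done
```

If a DROP or REJECT rule shows up for either port, or the INPUT policy
drops traffic that no ACCEPT rule covers, that would be consistent with
corosync never reaching operational state.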

The system logs would be helpful (with debug: on).
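
For reference, debug logging is switched on in the logging section of
corosync.conf.  A minimal fragment, assuming you log to syslog (your
existing logging section may already contain other options, which can
stay as they are):

```
logging {
        to_syslog: yes
        debug: on
}
```

Restart corosync after the change so the setting takes effect.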

Regards
-steve