[Openais] Another issue with corosync shutdown

Andreas Mock Andreas.Mock at web.de
Wed Apr 28 10:01:15 PDT 2010


Hi all,

after Jan helped me get around a stupid maintenance bug,
I started again with CTS, which worked very well for the first
couple of tests. I got further than I ever had before. Thank
you to all who helped me with that.

But suddenly another /etc/init.d/corosync stop hung
"endlessly" (more than 2 hours before I killed corosync) with
a CPU usage of 200%. It kept two CPU cores completely busy, as
'top' showed.

I could also see that the script /etc/init.d/corosync was stuck
in its endless loop, which is its regular behaviour at the moment
when the corosync process does not terminate.
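
For illustration only, a stop routine can avoid waiting forever by bounding the wait. The following is a minimal sketch, not the code of the real /etc/init.d/corosync script; the function name wait_for_exit and the timeout value are hypothetical, and it assumes the caller (e.g. root) is allowed to signal the process:

```shell
#!/bin/sh
# Hypothetical helper, NOT taken from the real /etc/init.d/corosync script.
# Waits until process $1 is gone, giving up after $2 seconds.
wait_for_exit() {
    pid=$1
    timeout=$2
    waited=0
    # kill -0 sends no signal; it only checks whether the PID can be signalled.
    # Assumption: we have permission to signal the process, otherwise
    # kill -0 fails even though the process is still alive.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # still running after the timeout
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0            # process has exited
}
```

A caller could then report the failure instead of hanging, for example: wait_for_exit "$(pidof corosync)" 60 || echo "corosync did not stop within 60s" >&2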

An strace of the corosync process gave the following output:
---------------------------8<-----------------------------
db03:~ # ps ax | grep corosy
 3228 pts/1    R+     0:00 grep corosy
15982 ?        Ssl  166:29 corosync
20496 ?        Ss     0:02 /bin/bash /etc/init.d/corosync stop
db03:~ # strace -f -tt -p 15982
Process 15984 attached with 3 threads - interrupt to quit
[pid 15982] 18:41:38.715931 futex(0x7f133980e9e0, FUTEX_WAIT, 15983, NULL
---------------------------8<-----------------------------

There was no more output than this single line. The process seemed to be stuck somehow.

Any ideas? Is some other debugging required?
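
A thread that spins in user space makes no system calls, so strace stays silent for it. That would be consistent with the single futex line above while 'top' reports 200%, though it is not a confirmed diagnosis. As a rough sketch (the helper name thread_states is made up here, and it assumes a Linux /proc filesystem), the scheduler state of each thread can be listed to see which ones are actually running:

```shell
#!/bin/sh
# Hypothetical helper: print "TID STATE" for every thread of process $1.
# Assumes Linux, where each thread has a directory /proc/PID/task/TID.
thread_states() {
    pid=$1
    for task in /proc/"$pid"/task/*; do
        [ -d "$task" ] || continue
        tid=${task##*/}
        # The "State:" line of the status file reads e.g. "S (sleeping)"
        # or "R (running)".
        state=$(sed -n 's/^State:[[:space:]]*//p' "$task/status")
        printf '%s %s\n' "$tid" "$state"
    done
}
```

For the process above this would be called as thread_states 15982. Threads reported in state R are candidates for the busy loop; a backtrace of all threads, for example with gdb -p 15982 -batch -ex 'thread apply all bt', may then show where they are spinning.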

Best regards
Andreas Mock

