[Openais] Another issue with corosync shutdown

Steven Dake sdake at redhat.com
Wed Apr 28 15:12:43 PDT 2010


Andreas,

Thanks for the logs.

Please install the corosync-debuginfo package, then execute the following:

gdb
attach (the pid of corosync)
thread apply all bt

This prints a backtrace for every thread, showing the state each one is in,
and is critical to resolving this issue.
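
If an interactive session is inconvenient, the same backtraces can probably
be captured in one step. The following is only a sketch: it assumes gdb is on
the PATH and that the pid you pass belongs to corosync. The function name
dump_corosync_threads and the default output path are made up for
illustration.

```shell
#!/bin/sh
# Sketch: dump backtraces of all threads of a running process to a file.
# dump_corosync_threads and /tmp/corosync-bt.txt are hypothetical names.
dump_corosync_threads() {
    pid="$1"                          # pid of the corosync process
    out="${2:-/tmp/corosync-bt.txt}"  # where to write the backtraces

    # -p PID  attach to the running process
    # -batch  exit (and detach) once the commands have run
    # -ex CMD run a single gdb command
    gdb -p "$pid" -batch -ex "thread apply all bt" > "$out" 2>&1
}
```

It could be called as dump_corosync_threads "$(pidof corosync)", and the
resulting file attached to a reply.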

Thanks
-steve


On Thu, 2010-04-29 at 00:05 +0200, Andreas Mock wrote:
> -----Original Message-----
> From: Steven Dake <sdake at redhat.com>
> Sent: 28.04.2010 20:40:32
> To: Andreas Mock <Andreas.Mock at web.de>
> Subject: Re: [Openais] Another issue with corosync shutdown
> 
> >
> >It would be helpful to know the state of the network (gather,
> >operational, commit, recovery) when the shutdown was initiated.  This
> >can be gathered by turning on debug.
> 
> 
> Hi Steven, hi all,
> 
> attached you will find the log files for the two nodes db03 and db04, and
> the log of CTS.
> 
> I'm pretty sure that all the trouble starts with the first error logged by CTS.
> Between that first error and the situation where corosync is stuck,
> the CPU usage of corosync on node db04 suddenly went to 100%.
> 
> In the latest situation it has a CPU usage of 200%.
> IMHO there is something building up.
> In the condition with 100% CPU load, corosync was up and CTS wouldn't
> stop it. Child processes were 'defunct'. strace produced a lot of output.
> 
> I'm pretty sure that in the first run the test 'SpecialTest1' also caused an
> error. This is probably the root of the subsequent problems.
> 
> If you're interested in the current condition, please hurry. I have to kill
> corosync soon.
> 
> Best regards
> Andreas Mock


