[Openais] corosync 1.2.5 still doesn't shutdown properly

Vadym Chepkov vchepkov at gmail.com
Tue Jun 22 11:07:05 PDT 2010


On Tue, Jun 22, 2010 at 1:49 PM, Steven Dake <sdake at redhat.com> wrote:
> On 06/22/2010 03:56 AM, Vadym Chepkov wrote:
>>
>> Hi,
>>
>> I decided to check if I can start using corosync again on several of
>> my clusters (have to use heartbeat there at the moment).
>> I don't even have any services defined in corosync.conf, commented
>> pacemaker out, just plain corosync and it never goes down:
>>
>> # ps axf|grep corosync
>> 26294 pts/0    S+     0:00  |               \_ /bin/sh /sbin/service corosync restart
>> 26299 pts/0    S+     0:01  |                   \_ /bin/bash /etc/init.d/corosync restart
>> 29249 pts/1    S+     0:00                  \_ grep corosync
>> 25959 ?        Ssl    0:00 corosync
>>
>>
>> I attached to the process and this is where it hangs:
>>
>> (gdb) where
>> #0  0x0fe14134 in poll () from /lib/libc.so.6
>> #1  0x0ffbc530 in poll_run (handle=150346236434579456) at coropoll.c:413
>> #2  0x10006e50 in main (argc=<value optimized out>, argv=<value optimized out>) at main.c:1576
>>
>> How can I help to debug this problem?
>> It is 100% reproducible.
>>
>> Thank you,
>> Vadym
>> ________
>
> Vadym,
>
> Thanks for the feedback.  I do test this scenario and it works for me:
>
> [root@cast flatiron]# service corosync start
> Starting Corosync Cluster Engine (corosync):               [  OK  ]
> [root@cast flatiron]# service corosync restart
> Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ]
> Waiting for corosync services to unload:.                  [  OK  ]
> Starting Corosync Cluster Engine (corosync):               [  OK  ]
> [root@cast flatiron]# service corosync stop
> Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ]
> Waiting for corosync services to unload:.                  [  OK  ]
> [root@cast flatiron]# service corosync start
> Starting Corosync Cluster Engine (corosync):               [  OK  ]
> [root@cast flatiron]# /etc/init.d/corosync restart
> Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ]
> Waiting for corosync services to unload:.                  [  OK  ]
> Starting Corosync Cluster Engine (corosync):               [  OK  ]
>
>
> One thing that would stop corosync from shutting down is if it couldn't
> enter operational state.  This often happens because of a firewall enabled
> on the ports corosync uses to communicate.
>
> The system logs would be helpful (with debug: on).
>
> Regards
> -steve


It works fine on Intel-based servers, but on a Red Hat PPC-based
server it doesn't shut down.

I have attached the config and the log file.
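
For anyone reading without the attachments: debug logging is turned on
in the logging section of corosync.conf. The fragment below is only a
sketch based on the directives described in corosync.conf(5), not a copy
of the attached file, and the logfile path in it is an assumed location.

```
# Sketch of a corosync.conf logging section with debugging enabled.
# Not the attached config; the logfile path is an assumption.
logging {
        to_syslog: yes
        to_logfile: yes
        logfile: /var/log/corosync.log
        timestamp: on
        debug: on
}
```

Once corosync is started with a section like this in place, the debug
messages should appear in the configured log file and in syslog.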

Thanks,
Vadym
-------------- next part --------------
A non-text attachment was scrubbed...
Name: corosync.log
Type: application/octet-stream
Size: 17539 bytes
Desc: not available
Url : http://lists.linux-foundation.org/pipermail/openais/attachments/20100622/fc1713ee/attachment-0002.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: corosync.conf
Type: application/octet-stream
Size: 503 bytes
Desc: not available
Url : http://lists.linux-foundation.org/pipermail/openais/attachments/20100622/fc1713ee/attachment-0003.obj 

