[Openais] recover from corosync daemon restart and cpg_finalize timing

Andrew Beekhof andrew at beekhof.net
Wed Jun 23 23:35:28 PDT 2010


On Thu, Jun 24, 2010 at 1:50 AM, dan clark <2clarkd at gmail.com> wrote:
> Dear Gentle Reader....
>
> Attached is a small test program to stress initializing and finalizing
> communication between a corosync cpg client and the corosync daemon.
> The test was run under version 1.2.4.  Initial testing was with a
> single node, subsequent testing occurred on a system consisting of 3
> nodes.
>
> 1) If the program is run so that it loops on
> initialize/mcast_joined/dispatch/finalize AND the corosync daemon is
> restarted while the program is looping (service corosync restart),
> the application locks up in the corosync client library in a variety
> of interesting locations.  This is easiest to reproduce on a single
> node system with a large iteration count and a usleep between joins:
> 'stress_finalize -t 500 -i 10000 -u 1000 -v'.  Sometimes it recovers
> in a few seconds (an strace showed
> futex(...FUTEX_WAIT, 0, {1, 997888000}) ..., which would account for
> multiple 2-second delays in error recovery from a lost corosync
> daemon).  Sometimes it locks up solid!  What is the proper way to
> handle the loss of the corosync daemon?  Is it possible for the cpg
> library to recover quickly when the daemon fails?
>
> sample back trace of lockup:
> #0  0x000000363c60c711 in sem_wait () from /lib64/libpthread.so.0
> #1  0x0000003000002a34 in coroipcc_msg_send_reply_receive (
>   handle=<value optimized out>, iov=<value optimized out>, iov_len=1,
>   res_msg=0x7fffaefecac0, res_len=24) at coroipcc.c:465
> #2  0x0000003000802db1 in cpg_leave (handle=1648075416440668160,
>   group=<value optimized out>) at cpg.c:458
> #3  0x0000000000400df8 in coInit (handle=0x7fffaefecdb0,
>   groupNameStr=0x7fffaefeccb0 "./stress_finalize_groupName-0", ctx=0x6e1)
>   at stress_finalize.c:101
> #4  0x000000000040138a in main (argc=8, argv=0x7fffaefecf28)
>   at stress_finalize.c:243

I've also started getting semaphore related stack traces.

#0  __new_sem_init (sem=0x7ff01f81a008, pshared=1, value=0) at sem_init.c:45
45	  isem->value = value;
Missing separate debuginfos, use: debuginfo-install
audit-libs-2.0.1-1.fc12.x86_64 libgcrypt-1.4.4-8.fc12.x86_64
libgpg-error-1.6-4.x86_64 libtasn1-2.3-1.fc12.x86_64
libuuid-2.16-10.2.fc12.x86_64
(gdb) where
#0  __new_sem_init (sem=0x7ff01f81a008, pshared=1, value=0) at sem_init.c:45
#1  0x00007ff01e601e8e in coroipcc_service_connect (socket_name=<value optimized out>, service=<value optimized out>, request_size=1048576, response_size=1048576, dispatch_size=1048576, handle=<value optimized out>) at coroipcc.c:706
#2  0x00007ff01ec1bb81 in init_ais_connection_once (dispatch=0x40e798 <cib_ais_dispatch>, destroy=0x40e8f2 <cib_ais_destroy>, our_uuid=0x0, our_uname=0x6182c0, nodeid=0x0) at ais.c:622
#3  0x00007ff01ec1ba22 in init_ais_connection (dispatch=0x40e798 <cib_ais_dispatch>, destroy=0x40e8f2 <cib_ais_destroy>, our_uuid=0x0, our_uname=0x6182c0, nodeid=0x0) at ais.c:585
#4  0x00007ff01ec16b90 in crm_cluster_connect (our_uname=0x6182c0, our_uuid=0x0, dispatch=0x40e798, destroy=0x40e8f2, hb_conn=0x6182b0) at cluster.c:56
#5  0x000000000040e9fb in cib_init () at main.c:424
#6  0x000000000040df78 in main (argc=1, argv=0x7ffff194aaf8) at main.c:218
(gdb) print *isem
Cannot access memory at address 0x7ff01f81a008

sigh
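
Coming back to the first question (how to handle the daemon going away), here
is a minimal sketch of one possible recovery loop.  It is not taken from
stress_finalize; the helper name, the retry interval, and the assumption that
a lost IPC connection shows up as a non-TRY_AGAIN error from cpg_dispatch are
all illustrative.  Skipping cpg_leave once dispatch has failed also avoids
blocking in coroipcc_msg_send_reply_receive, which is where the quoted trace
is stuck.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/uio.h>
#include <corosync/corotypes.h>
#include <corosync/cpg.h>

static cpg_callbacks_t callbacks;   /* no deliver/confchg handlers needed for this sketch */

/* One initialize/join/mcast/dispatch/leave/finalize pass.  Returns 0 on
 * success, -1 if the daemon appears to be gone; the caller simply retries. */
static int run_one_iteration(const char *group_str)
{
    cpg_handle_t handle;
    struct cpg_name group;
    struct iovec iov;
    cs_error_t rc;

    if (cpg_initialize(&handle, &callbacks) != CS_OK)
        return -1;                              /* daemon not reachable yet */

    memset(&group, 0, sizeof(group));
    group.length = snprintf(group.value, sizeof(group.value), "%s", group_str);

    do {                                        /* joins may report CS_ERR_TRY_AGAIN briefly */
        rc = cpg_join(handle, &group);
        if (rc == CS_ERR_TRY_AGAIN)
            usleep(100000);
    } while (rc == CS_ERR_TRY_AGAIN);

    if (rc == CS_OK) {
        iov.iov_base = "ping";
        iov.iov_len = 4;
        cpg_mcast_joined(handle, CPG_TYPE_AGREED, &iov, 1);

        rc = cpg_dispatch(handle, CS_DISPATCH_ALL);
        if (rc != CS_OK && rc != CS_ERR_TRY_AGAIN)
            fprintf(stderr, "dispatch failed (%d), assuming daemon restart\n", rc);
        else
            cpg_leave(handle, &group);          /* only worth attempting while still connected */
    }

    cpg_finalize(handle);
    return (rc == CS_OK) ? 0 : -1;
}
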

>
> 2) If the test program is run with an iteration count greater than
> about 10, group joins for the specified group name tend to start
> failing with CS_ERR_TRY_AGAIN and never recover (trying again doesn't
> help :).  This test was run on a single node of a 3-node system (but
> similar problems may be reproducible with fewer nodes).
> ' ./stress_finalize -i 10 -j 1 junk'
>
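
If CS_ERR_TRY_AGAIN really never clears, retrying forever on the same handle
won't help; a sketch of bounding the retries and rebuilding the connection
instead (join_with_reconnect, the retry count and the sleep interval are made
up for illustration):

#include <unistd.h>
#include <corosync/corotypes.h>
#include <corosync/cpg.h>

/* Bound the CS_ERR_TRY_AGAIN retries, then tear the handle down and
 * reconnect rather than spinning on a handle that never recovers. */
static cs_error_t join_with_reconnect(cpg_handle_t *handle,
                                      const struct cpg_name *group,
                                      cpg_callbacks_t *callbacks)
{
    cs_error_t rc = CS_ERR_TRY_AGAIN;
    int attempt;

    for (attempt = 0; attempt < 50 && rc == CS_ERR_TRY_AGAIN; attempt++) {
        rc = cpg_join(*handle, group);
        if (rc == CS_ERR_TRY_AGAIN)
            usleep(200000);                 /* 0.2s between attempts */
    }

    if (rc == CS_ERR_TRY_AGAIN) {
        cpg_finalize(*handle);              /* give up on the wedged handle */
        if (cpg_initialize(handle, callbacks) == CS_OK)
            rc = cpg_join(*handle, group);
    }
    return rc;
}
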
> 3) An unrelated observation: if the corosync daemon is set up on two
> nodes that participate in multicast through a tunnel, the corosync
> daemon runs in a tight loop at a very high priority level,
> effectively halting the machine.  Is this because the basic daemon
> communication relies on message reflection by the underlying
> transport, which happens with Ethernet multicast but not over a
> tunnel?
>
> An example setup for an ip tunnel might be something along the following lines:
> modprobe ip_gre
> echo 1 > /proc/sys/net/ipv4/ip_forward
> ip tunnel add gre1 mode gre remote 10.x.y.z local 20.z.y.x ttl 127
> ip addr add 192.168.100.33/24 peer 192.168.100.11/24 dev gre1
> ip link set gre1 up multicast on
>
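
If multicast over the tunnel is the problem, one way to sidestep it is
corosync's UDP unicast transport (udpu), which (if I remember right) only
arrived in corosync 1.3 and so is not an option for the 1.2.4 build tested
here.  A corosync.conf sketch of the totem section, reusing the tunnel
addresses from the example above:

totem {
    version: 2
    transport: udpu
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.100.0
        mcastport: 5405
        member {
            memberaddr: 192.168.100.33
        }
        member {
            memberaddr: 192.168.100.11
        }
    }
}
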
> Thank you for taking the time to consider these tests.  Perhaps future
> versions of the software package could include a similar set of tests
> illustrating proper behavior?
>
> dan
>
> _______________________________________________
> Openais mailing list
> Openais at lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/openais
>

