[Openais] aisexec core dump during traffic
kjsmith at nortel.com
Tue Feb 15 06:57:34 PST 2005
Bug 256 has been filed.
Also - the value of the memb_state global:
(gdb) p memb_state
$1 = MEMB_STATE_OPERATIONAL
From: Steven Dake [mailto:sdake at mvista.com]
Sent: Monday, February 14, 2005 2:06 PM
To: Smith, Kristen [NGC:B675:EXCH]
Cc: 'openais at lists.osdl.org'; Bajpai, Muni [NGC:B670:EXCH]
Subject: Re: [Openais] aisexec core dump during traffic
OK, I think I have duplicated this in the past but don't have an immediate
solution. Basically, during recovery the token is lost, which causes a
transition back to the gather state. Then, in gather, the processor may
multicast messages, which queues new messages in place of the old ones. This
results in the fault you see.
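To make the failure mode concrete, here is a minimal sketch of a sequence-indexed queue whose add operation refuses to overwrite an occupied slot. It is not the actual sq.h code: the names sq_sketch, sq_sketch_add, sq_sketch_release and SQ_SIZE are hypothetical, and only the items_inuse check is modelled on the assertion text in the report. Where this sketch returns -1, the real code asserts and aborts.

```c
#include <assert.h>
#include <string.h>

#define SQ_SIZE 8 /* hypothetical window size, not taken from sq.h */

/* Simplified stand-in for a sort queue indexed by sequence number. */
struct sq_sketch {
    unsigned char items_inuse[SQ_SIZE]; /* 1 if the slot holds a message */
    int items[SQ_SIZE];                 /* stand-in for message payloads */
};

static void sq_sketch_init(struct sq_sketch *sq)
{
    assert(sq != NULL);
    memset(sq, 0, sizeof(*sq));
}

/*
 * Store an item at the slot derived from its sequence number.
 * Returns 0 on success and -1 if the slot is still occupied, which is
 * the condition that sq_item_add reports through its assertion.
 */
static int sq_sketch_add(struct sq_sketch *sq, unsigned int seqid, int item)
{
    unsigned int sq_position = seqid % SQ_SIZE;

    if (sq->items_inuse[sq_position] != 0) {
        return -1; /* an old message was never released from this slot */
    }
    sq->items[sq_position] = item;
    sq->items_inuse[sq_position] = 1;
    return 0;
}

/* Mark a slot free once its message has been delivered. */
static void sq_sketch_release(struct sq_sketch *sq, unsigned int seqid)
{
    sq->items_inuse[seqid % SQ_SIZE] = 0;
}
```

In this model, adding sequence number 5 twice without a release in between fails on the second call and leaves the first item in place. That corresponds to the suspected case above, where new messages are queued in gather while the old ones are still held.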
You could have another error; it's difficult to say. Could you print the
memb_state global variable in gdb?
Please file a defect on this one so we can track it.
On Sun, 2005-02-13 at 08:25, Kristen Smith wrote:
> Running traffic this weekend (in a 3+1 configuration, with each of the
> active nodes writing out ~6 ckpts/second). Ran for about 20 hours
> and then got the following from aisexec (on one of the active nodes):
> aisexec: ../include/sq.h:102: sq_item_add: Assertion
> `sq->items_inuse[sq_position] == 0' failed.
> and a trace:
> #0 0x00bebcdf in raise () from /lib/tls/libc.so.6
> #1 0x00bed4e5 in abort () from /lib/tls/libc.so.6
> #2 0x00be5609 in __assert_fail () from /lib/tls/libc.so.6
> #3 0x0805add1 in orf_token_mcast (token=0xbfffce00,
> fcc_mcasts_allowed=29, system_from=0xbfffd420)
> at totemsrp.c:1990
> #4 0x080587e6 in message_handler_orf_token (system_from=0xbfffd420,
> iovec=0xbfffce00, iov_len=1,
> bytes_received=78, endian_conversion_needed=0) at totemsrp.c:2702
> #5 0x0805a3d9 in recv_handler (handle=0, fd=7, revents=1, data=0x0,
> prio=0x0) at totemsrp.c:3351
> #6 0x08056e62 in poll_run (handle=0) at aispoll.c:386
> #7 0x080499ac in main (argc=1, argv=0xbfffd634) at main.c:1003
> This is the bitkeeper code from last Monday.
> Here are the #defines I have changed, if that matters at all:
> #define TIMEOUT_STATE_GATHER_JOIN 40
> #define TIMEOUT_STATE_GATHER_CONSENSUS 80
> #define TIMEOUT_TOKEN 180
> #define TIMEOUT_TOKEN_RETRANSMIT 30
> Any other information I can provide for you?