[Openais] aisexec core dump during traffic

Kristen Smith kjsmith at nortel.com
Sun Feb 13 07:25:30 PST 2005


Steve,

I was running traffic this weekend in a 3+1 configuration, with each of the
active nodes writing out ~6 ckpts/second. It ran for about 20 hours, and then
I got the following from aisexec on one of the active nodes:

aisexec: ../include/sq.h:102: sq_item_add: Assertion `sq->items_inuse[sq_position] == 0' failed.

and a trace:

#0  0x00bebcdf in raise () from /lib/tls/libc.so.6
#1  0x00bed4e5 in abort () from /lib/tls/libc.so.6
#2  0x00be5609 in __assert_fail () from /lib/tls/libc.so.6
#3  0x0805add1 in orf_token_mcast (token=0xbfffce00, fcc_mcasts_allowed=29, system_from=0xbfffd420) at totemsrp.c:1990
#4  0x080587e6 in message_handler_orf_token (system_from=0xbfffd420, iovec=0xbfffce00, iov_len=1, bytes_received=78, endian_conversion_needed=0) at totemsrp.c:2702
#5  0x0805a3d9 in recv_handler (handle=0, fd=7, revents=1, data=0x0, prio=0x0) at totemsrp.c:3351
#6  0x08056e62 in poll_run (handle=0) at aispoll.c:386
#7  0x080499ac in main (argc=1, argv=0xbfffd634) at main.c:1003
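
For readers unfamiliar with this kind of check: the assertion above fires when
an item is added to a queue slot that is still marked in use. The following is
only a minimal sketch of that pattern, not the actual sq.h code. The names
demo_sq, demo_sq_item_add, demo_sq_item_release and DEMO_SQ_SIZE are
hypothetical, and the sketch assumes the slot is chosen as the sequence id
modulo a fixed queue size, which may differ from how sq.h computes
sq_position.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_SQ_SIZE 8 /* hypothetical queue size, not taken from sq.h */

/* Hypothetical fixed-size queue indexed by sequence id. */
struct demo_sq {
    int items[DEMO_SQ_SIZE];
    unsigned char items_inuse[DEMO_SQ_SIZE]; /* 1 = slot occupied */
};

/* Mark every slot as free. */
void demo_sq_init(struct demo_sq *sq)
{
    assert(sq != NULL);
    memset(sq, 0, sizeof(*sq));
}

/*
 * Store item in the slot for seqid.
 * Returns 0 on success, or -1 if the slot is still occupied. That is the
 * condition the assertion in the report guards against; this sketch reports
 * it instead of aborting.
 */
int demo_sq_item_add(struct demo_sq *sq, int item, unsigned int seqid)
{
    unsigned int pos = seqid % DEMO_SQ_SIZE; /* assumed slot mapping */

    if (sq->items_inuse[pos] != 0) {
        return -1;
    }
    sq->items[pos] = item;
    sq->items_inuse[pos] = 1;
    return 0;
}

/* Free the slot for seqid so that a later sequence id can reuse it. */
void demo_sq_item_release(struct demo_sq *sq, unsigned int seqid)
{
    sq->items_inuse[seqid % DEMO_SQ_SIZE] = 0;
}
```

In this sketch, adding sequence id 3 and then sequence id 3 + DEMO_SQ_SIZE
without releasing the first one hits the occupied slot. A sender running more
than one queue length ahead of the oldest unreleased item would be one way to
reach that state, though the sketch does not show that this is what happened
here.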

This is the code from BitKeeper as of last Monday.

Here are the #defines I have changed, if that matters at all:

#define TIMEOUT_STATE_GATHER_JOIN       40
#define TIMEOUT_STATE_GATHER_CONSENSUS  80
#define TIMEOUT_TOKEN                   180
#define TIMEOUT_TOKEN_RETRANSMIT        30

Any other information I can provide for you?

Thanks,
Kristen
