[Openais] aisexec core dump during traffic
Kristen Smith
kjsmith at nortel.com
Sun Feb 13 07:25:30 PST 2005
Steve,
I was running traffic this weekend in a 3+1 configuration, with each of the
active nodes writing out ~6 ckpts/second. It ran for about 20 hours and then I
got the following from aisexec on one of the active nodes:
aisexec: ../include/sq.h:102: sq_item_add: Assertion `sq->items_inuse[sq_position] == 0' failed.
and a trace:
#0 0x00bebcdf in raise () from /lib/tls/libc.so.6
#1 0x00bed4e5 in abort () from /lib/tls/libc.so.6
#2 0x00be5609 in __assert_fail () from /lib/tls/libc.so.6
#3 0x0805add1 in orf_token_mcast (token=0xbfffce00, fcc_mcasts_allowed=29, system_from=0xbfffd420) at totemsrp.c:1990
#4 0x080587e6 in message_handler_orf_token (system_from=0xbfffd420, iovec=0xbfffce00, iov_len=1, bytes_received=78, endian_conversion_needed=0) at totemsrp.c:2702
#5 0x0805a3d9 in recv_handler (handle=0, fd=7, revents=1, data=0x0, prio=0x0) at totemsrp.c:3351
#6 0x08056e62 in poll_run (handle=0) at aispoll.c:386
#7 0x080499ac in main (argc=1, argv=0xbfffd634) at main.c:1003
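For anyone reading the assertion without sq.h at hand, here is a rough sketch of the kind of check that appears to be failing. This is not the openais source: the struct, the function names (sq_sketch_init, sq_sketch_add, sq_sketch_release), the window size, and the modulo mapping from sequence number to slot are all assumptions made for illustration. The idea is only that a slot in a fixed-size sequence window is marked in use when an item is added, and adding a second item that maps to the same slot before the first is released is the condition that an assert like the one above would catch.

```c
#include <assert.h>
#include <string.h>

#define SQ_SIZE 8 /* hypothetical window size, not the openais value */

/* Hypothetical sequence-indexed window; field names only loosely
 * mirror the assertion text and are not taken from sq.h. */
struct sq_sketch {
    int items_inuse[SQ_SIZE]; /* 1 if the slot currently holds an item */
    int items[SQ_SIZE];       /* payload stored per slot */
};

static void sq_sketch_init(struct sq_sketch *sq)
{
    memset(sq, 0, sizeof(*sq));
}

/* Returns 0 on success, or -1 if the slot is already occupied.
 * The -1 case stands in for the condition the real code asserts on. */
static int sq_sketch_add(struct sq_sketch *sq, int item, unsigned int seqid)
{
    unsigned int sq_position = seqid % SQ_SIZE; /* assumed mapping */

    if (sq->items_inuse[sq_position] != 0) {
        return -1; /* slot reused before it was released */
    }
    sq->items[sq_position] = item;
    sq->items_inuse[sq_position] = 1;
    return 0;
}

/* Marks the slot for seqid as free again. */
static void sq_sketch_release(struct sq_sketch *sq, unsigned int seqid)
{
    sq->items_inuse[seqid % SQ_SIZE] = 0;
}
```

In this sketch, adding sequence numbers 3 and 3 + SQ_SIZE without a release in between hits the occupied-slot case; whether the real failure comes from wraparound, a duplicate sequence number, or something else is not something the trace alone shows.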
This is the bitkeeper code from last Monday.
Here are the #defines I have changed, if that matters at all:
#define TIMEOUT_STATE_GATHER_JOIN 40
#define TIMEOUT_STATE_GATHER_CONSENSUS 80
#define TIMEOUT_TOKEN 180
#define TIMEOUT_TOKEN_RETRANSMIT 30
Any other information I can provide for you?
Thanks,
Kristen