<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<META NAME="Generator" CONTENT="MS Exchange Server version 5.5.2658.2">
<TITLE>RE: [Openais] aisexec core dump during traffic</TITLE>
</HEAD>
<BODY>
<P><FONT SIZE=2>Bug 256 has been filed.</FONT>
</P>
<P><FONT SIZE=2>Also - the value of the memb_state global:</FONT>
</P>
<P><FONT SIZE=2>(gdb) p memb_state</FONT>
<BR><FONT SIZE=2>$1 = MEMB_STATE_OPERATIONAL</FONT>
</P>
<P><FONT SIZE=2>-----Original Message-----</FONT>
<BR><FONT SIZE=2>From: Steven Dake [<A HREF="mailto:sdake@mvista.com">mailto:sdake@mvista.com</A>] </FONT>
<BR><FONT SIZE=2>Sent: Monday, February 14, 2005 2:06 PM</FONT>
<BR><FONT SIZE=2>To: Smith, Kristen [NGC:B675:EXCH]</FONT>
<BR><FONT SIZE=2>Cc: 'openais@lists.osdl.org'; Bajpai, Muni [NGC:B670:EXCH]</FONT>
<BR><FONT SIZE=2>Subject: Re: [Openais] aisexec core dump during traffic</FONT>
</P>
<BR>
<P><FONT SIZE=2>Kristen,</FONT>
</P>
<P><FONT SIZE=2>OK, I think I have reproduced this in the past but don't have an immediate solution. Basically, during recovery the token is lost, which causes a transition back to the gather state. Then, in gather, the processor may multicast messages, which queues new messages in place of the old ones. This results in the fault you see.</FONT></P>
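<P><FONT SIZE=2>To illustrate the kind of failure described above, here is a minimal sketch of a sequence-indexed queue with per-slot in-use flags. It is not the real include/sq.h code: the names demo_sq, demo_sq_init, demo_sq_item_add and DEMO_SQ_SIZE are hypothetical, and the sketch returns an error where aisexec asserts. It assumes the fault comes from a message being queued into a slot that an older message still occupies.</FONT></P>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified sequence queue. NOT the real include/sq.h. */
#define DEMO_SQ_SIZE 8

struct demo_sq {
    unsigned int head_seqid;                 /* sequence number that maps to slot 0 */
    int items[DEMO_SQ_SIZE];                 /* message payloads (ints for brevity) */
    unsigned char items_inuse[DEMO_SQ_SIZE]; /* 1 while a slot holds a live message */
};

/* Empty the queue and set the sequence number that maps to slot 0. */
void demo_sq_init(struct demo_sq *sq, unsigned int head_seqid)
{
    assert(sq != NULL);
    memset(sq, 0, sizeof(*sq));
    sq->head_seqid = head_seqid;
}

/*
 * Store an item under its sequence number.
 * Returns 0 on success, or -1 if the slot still holds an older message.
 * The -1 case stands in for the condition the assertion guards against.
 */
int demo_sq_item_add(struct demo_sq *sq, int item, unsigned int seqid)
{
    size_t pos;

    assert(sq != NULL);
    pos = (seqid - sq->head_seqid) % DEMO_SQ_SIZE;
    if (sq->items_inuse[pos] != 0) {
        return -1; /* slot occupied: the old message was never released */
    }
    sq->items[pos] = item;
    sq->items_inuse[pos] = 1;
    return 0;
}
```

<P><FONT SIZE=2>In this sketch, adding sequence number 100 twice without releasing the slot in between makes the second add return -1 and leaves the first item in place. In aisexec the corresponding check in sq_item_add is an assertion, so the process aborts instead.</FONT></P>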
<P><FONT SIZE=2>You could have another error; it's difficult to say. Could you print the memb_state global variable in gdb?</FONT>
</P>
<P><FONT SIZE=2>Please file a defect on this one so we can track it.</FONT>
</P>
<P><FONT SIZE=2>Thanks</FONT>
<BR><FONT SIZE=2>-steve</FONT>
</P>
<P><FONT SIZE=2>On Sun, 2005-02-13 at 08:25, Kristen Smith wrote:</FONT>
<BR><FONT SIZE=2>> Steve,</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> Running traffic this weekend (in a 3+1 configuration - each of the </FONT>
<BR><FONT SIZE=2>> active nodes was writing out ~6 ckpts/second). Ran for about 20 hours </FONT>
<BR><FONT SIZE=2>> and then got the following from aisexec (on one of the active nodes):</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> aisexec: ../include/sq.h:102: sq_item_add: Assertion </FONT>
<BR><FONT SIZE=2>> `sq->items_inuse[sq_position] == 0' failed.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> and a trace:</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> #0 0x00bebcdf in raise () from /lib/tls/libc.so.6</FONT>
<BR><FONT SIZE=2>> #1 0x00bed4e5 in abort () from /lib/tls/libc.so.6</FONT>
<BR><FONT SIZE=2>> #2 0x00be5609 in __assert_fail () from /lib/tls/libc.so.6</FONT>
<BR><FONT SIZE=2>> #3 0x0805add1 in orf_token_mcast (token=0xbfffce00, </FONT>
<BR><FONT SIZE=2>> fcc_mcasts_allowed=29, system_from=0xbfffd420)</FONT>
<BR><FONT SIZE=2>> at totemsrp.c:1990</FONT>
<BR><FONT SIZE=2>> #4 0x080587e6 in message_handler_orf_token (system_from=0xbfffd420, </FONT>
<BR><FONT SIZE=2>> iovec=0xbfffce00, iov_len=1,</FONT>
<BR><FONT SIZE=2>> bytes_received=78, endian_conversion_needed=0) at totemsrp.c:2702 </FONT>
<BR><FONT SIZE=2>> #5 0x0805a3d9 in recv_handler (handle=0, fd=7, revents=1, data=0x0,</FONT>
<BR><FONT SIZE=2>> prio=0x0) at totemsrp.c:3351</FONT>
<BR><FONT SIZE=2>> #6 0x08056e62 in poll_run (handle=0) at aispoll.c:386</FONT>
<BR><FONT SIZE=2>> #7 0x080499ac in main (argc=1, argv=0xbfffd634) at main.c:1003</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> This is the bitkeeper code from last Monday.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> Here are the #defines I have changed, if that matters at all:</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> #define TIMEOUT_STATE_GATHER_JOIN 40</FONT>
<BR><FONT SIZE=2>> #define TIMEOUT_STATE_GATHER_CONSENSUS 80</FONT>
<BR><FONT SIZE=2>> #define TIMEOUT_TOKEN 180</FONT>
<BR><FONT SIZE=2>> #define TIMEOUT_TOKEN_RETRANSMIT 30</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> Any other information I can provide for you?</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> Thanks,</FONT>
<BR><FONT SIZE=2>> Kristen</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> ______________________________________________________________________</FONT>
<BR><FONT SIZE=2>> _______________________________________________</FONT>
<BR><FONT SIZE=2>> Openais mailing list</FONT>
<BR><FONT SIZE=2>> Openais@lists.osdl.org <A HREF="http://lists.osdl.org/mailman/listinfo/openais" TARGET="_blank">http://lists.osdl.org/mailman/listinfo/openais</A></FONT>
</P>
<BR>
</BODY>
</HTML>