[Openais] aisexec core dump during traffic

Steven Dake sdake at mvista.com
Mon Feb 14 12:05:38 PST 2005


Kristen,

Ok, I think I have duplicated this in the past, but I don't have an
immediate solution.  Basically, what happens is that during recovery the
token is lost, which transitions the processor back to the gather state.
Then, in gather, the processor may multicast messages, which queues new
messages in place of the old ones.  This results in the fault you see.
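
To illustrate the failure mode, here is a minimal sketch of a
sequence-indexed queue with a per-slot in-use flag.  It is a hypothetical
model, not the actual sq.h code: the demo_sq names, the window size, and
the return-code behaviour are all assumptions made for illustration, and
the real sq_item_add asserts rather than returning an error.  It only
shows why queuing a message at a position that still holds an old one
trips a check like the one in the report below.

```c
#include <assert.h>
#include <string.h>

#define DEMO_SQ_SIZE 8 /* hypothetical window size, not the openais value */

/* Hypothetical stand-in for a sequence-indexed queue; not the real sq.h. */
struct demo_sq {
    unsigned int head_seqid;                 /* sequence number mapped to slot 0 */
    int items[DEMO_SQ_SIZE];                 /* payloads, indexed by offset */
    unsigned char items_inuse[DEMO_SQ_SIZE]; /* 1 if the slot holds an item */
};

/* Empty the queue and set the sequence number that maps to slot 0. */
void demo_sq_init(struct demo_sq *sq, unsigned int head_seqid)
{
    assert(sq != NULL);
    memset(sq, 0, sizeof(*sq));
    sq->head_seqid = head_seqid;
}

/*
 * Store an item at the slot derived from its sequence number.
 * Returns 0 on success, or -1 if the slot is still occupied.  The -1 case
 * corresponds to the condition the reported assertion guards against.
 */
int demo_sq_item_add(struct demo_sq *sq, int item, unsigned int seqid)
{
    unsigned int pos = (seqid - sq->head_seqid) % DEMO_SQ_SIZE;

    if (sq->items_inuse[pos] != 0) {
        return -1; /* an old message is still queued at this position */
    }
    sq->items[pos] = item;
    sq->items_inuse[pos] = 1;
    return 0;
}
```

In this model, adding items with sequence numbers 100 and 101 succeeds,
and adding a second item with sequence number 100 fails until the queue
is reinitialized with demo_sq_init.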

You could have another error; it's difficult to say.  Could you print the
memb_state global variable in gdb?

Please file a defect on this one so we can track it.

Thanks
-steve

On Sun, 2005-02-13 at 08:25, Kristen Smith wrote:
> Steve,
> 
> Running traffic this weekend (in a 3+1 configuration - each of the
> active nodes was writing out ~6 ckpts/second). Ran for about 20 hours
> and then got the following from aisexec (on one of the active nodes):
> 
> aisexec: ../include/sq.h:102: sq_item_add: Assertion
> `sq->items_inuse[sq_position] == 0' failed.
> 
> and a trace:
> 
> #0  0x00bebcdf in raise () from /lib/tls/libc.so.6
> #1  0x00bed4e5 in abort () from /lib/tls/libc.so.6
> #2  0x00be5609 in __assert_fail () from /lib/tls/libc.so.6
> #3  0x0805add1 in orf_token_mcast (token=0xbfffce00,
> fcc_mcasts_allowed=29, system_from=0xbfffd420)
>     at totemsrp.c:1990
> #4  0x080587e6 in message_handler_orf_token (system_from=0xbfffd420,
> iovec=0xbfffce00, iov_len=1,
>     bytes_received=78, endian_conversion_needed=0) at totemsrp.c:2702
> #5  0x0805a3d9 in recv_handler (handle=0, fd=7, revents=1, data=0x0,
> prio=0x0) at totemsrp.c:3351
> #6  0x08056e62 in poll_run (handle=0) at aispoll.c:386
> #7  0x080499ac in main (argc=1, argv=0xbfffd634) at main.c:1003
> 
> This is the bitkeeper code from last Monday.
> 
> Here are the #defines I have changed, if that matters at all:
> 
> #define TIMEOUT_STATE_GATHER_JOIN               40
> #define TIMEOUT_STATE_GATHER_CONSENSUS  80
> #define TIMEOUT_TOKEN                                      180
> #define TIMEOUT_TOKEN_RETRANSMIT                30
> 
> Any other information I can provide for you?
> 
> Thanks,
> Kristen
> 
> 
> 
> ______________________________________________________________________
> _______________________________________________
> Openais mailing list
> Openais at lists.osdl.org
> http://lists.osdl.org/mailman/listinfo/openais



