[Openais] Re: segfaults and asserts

Steven Dake sdake at mvista.com
Wed Feb 23 12:03:15 PST 2005


The cl019 assert path is a new one that I don't think has been reported.
If you still have the windows open, can you print out the sq data?  Also,
could you go up to update_aru and print out my_aru, i, and my_high_seq?
There must be some case I have missed.
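
For reference, the condition that fired on cl019 can be restated as a small
standalone check.  This is only a sketch: struct sq_sketch and sq_in_window
are hypothetical names, and only the two fields named in the assertion
message (head_seqid and size) are modeled.  The real definitions in sq.h may
differ.

```c
/* Hypothetical stand-in for the sequence queue.  Only the two fields
 * that appear in the assertion text are modeled here. */
struct sq_sketch {
    unsigned int head_seqid;  /* first sequence id held by the queue */
    unsigned int size;        /* number of slots in the queue */
};

/* Returns 1 when seq_id satisfies the asserted condition
 *     seq_id < (sq->head_seqid + sq->size)
 * and 0 when it does not.  Wraparound of the sum is not handled. */
static int sq_in_window(const struct sq_sketch *sq, unsigned int seq_id)
{
    return seq_id < sq->head_seqid + sq->size;
}
```

With the printed head_seqid and size plugged in, this shows how far
seq_id=2038 was past the end of the window when update_aru asked for it.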

I think we have seen the other segfault, but I'm not sure.  I don't think
I've seen source_addr set to the address 0x8 before.  Were you able to
debug the segfault?  I need to know assembly->index, datasize, and
iovec[0].iov_len (the arguments to the memcpy).  It might also be
interesting to see all the iovec metadata if it has an iovlen of more than 1.
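
To show why those three values matter, here is a rough sketch of a guarded
copy from one iovec element into a reassembly buffer.  The names
assembly_sketch and assembly_append, the buffer size, and the layout are all
assumptions made for illustration.  This is not the code at totempg.c:287.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical reassembly buffer.  The field name "index" echoes the
 * value asked for above; the size and layout are assumed. */
struct assembly_sketch {
    unsigned char data[1024];
    size_t index;             /* next free offset in data */
};

/* Append one iovec element to the buffer only if it fits.
 * Returns 0 on success, -1 if a pointer is NULL or the copy would
 * run past the end of the buffer. */
static int assembly_append(struct assembly_sketch *a, const struct iovec *iov)
{
    if (a == NULL || iov == NULL || iov->iov_base == NULL)
        return -1;
    if (a->index > sizeof(a->data) ||
        iov->iov_len > sizeof(a->data) - a->index)
        return -1;
    memcpy(a->data + a->index, iov->iov_base, iov->iov_len);
    a->index += iov->iov_len;
    return 0;
}
```

If any of the index, the length, or the base pointer is bad, the memcpy
faults, which is why the values from the failing frame are useful.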

Thanks
-steve

On Wed, 2005-02-23 at 08:46, Mark Haverkamp wrote:
> I have been running my publish/subscribe test since last Friday
> afternoon.  It has had a few config cycles (no lost or gained nodes) but
> has been working OK.  This morning when I got in I had three nodes with
> a segfault and one with an assert.  (Node cl008's clock is slow by 4
> minutes even though I've got ntpd running).
> 
> 
> cl008:
> Program received signal SIGSEGV, Segmentation fault.
> 0x4207c46c in memcpy () from /lib/i686/libc.so.6
> (gdb) bt
> #0  0x4207c46c in memcpy () from /lib/i686/libc.so.6
> #1  0x08062c56 in totempg_deliver_fn (source_addr=Cannot access memory at address 0x8
> ) at totempg.c:287
> 
> clProgram received signal SIGSEGV, Segmentation fault.
> 0x4207c46c in memcpy () from /lib/i686/libc.so.6
> (gdb) bt
> #0  0x4207c46c in memcpy () from /lib/i686/libc.so.6
> #1  0x08062c56 in totempg_deliver_fn (source_addr=Cannot access memory at address 0x8
> ) at totempg.c:287
> 
> cl018:
> Program received signal SIGSEGV, Segmentation fault.
> 0x4207c46c in memcpy () from /lib/i686/libc.so.6
> (gdb) bt
> #0  0x4207c46c in memcpy () from /lib/i686/libc.so.6
> #1  0x08062c56 in totempg_deliver_fn (source_addr=Cannot access memory at address 0x8
> ) at totempg.c:287
> 
> cl019:
> aisexec: ../include/sq.h:149: sq_item_get: Assertion `seq_id < (sq->head_seqid + sq->size)' failed.
> 
> Program received signal SIGABRT, Aborted.
> 0x42028cc1 in kill () from /lib/i686/libc.so.6
> (gdb) bt
> #0  0x42028cc1 in kill () from /lib/i686/libc.so.6
> #1  0x42028ac8 in raise () from /lib/i686/libc.so.6
> #2  0x4202a019 in abort () from /lib/i686/libc.so.6
> #3  0x42021cd6 in __assert_fail () from /lib/i686/libc.so.6
> #4  0x0805d83f in sq_item_get (sq=0x80ba430, seq_id=2038,
>     sq_item_out=0xbfffeb54) at sq.h:149
> #5  0x0805f1c9 in update_aru () at totemsrp.c:1862
> #6  0x0805f447 in orf_token_mcast (token=0xbffff220, fcc_mcasts_allowed=29,
>     system_from=0xbffff840) at totemsrp.c:1981
> #7  0x08060a37 in message_handler_orf_token (system_from=0xbffff840,
>     iovec=0x806d120, iov_len=1, bytes_received=78, endian_conversion_needed=0)
>     at totemsrp.c:2698
> #8  0x08062936 in recv_handler (handle=0, fd=8, revents=1, data=0x0,
>     prio=0x811bb74) at totemsrp.c:3347
> #9  0x0805c7e8 in poll_run (handle=0) at aispoll.c:398
> #10 0x0804bf47 in main (argc=1, argv=0xbffffa24) at main.c:1128
> #11 0x420158d4 in __libc_start_main () from /lib/i686/libc.so.6
> 
> 



