[Openais] recent segfault

Mark Haverkamp markh at osdl.org
Tue Feb 1 11:24:52 PST 2005


Steve,

I got a segfault yesterday in the evt function that makes a local event
out of the received message.  It looks like the message was partially
corrupt.  I did some looking around and found, for instance, that the
req_header says the size is 712 (seems reasonable), but the iovec
passed to the delivery function says that the iov_len is 1306.
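
A mismatch like this could in principle be caught before the handler is
dispatched.  The following is only a minimal sketch: it assumes a header
with 32-bit size, id and error fields (guessed from the led_head shown by
gdb further down, not copied from the openais source), and it assumes
the header size and the iov_len are meant to agree for a single
delivered message.  The names sketch_req_header and sketch_msg_len_ok
are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical stand-in for the request header.  The field widths are an
 * assumption based on the gdb dump, not the real openais definition. */
struct sketch_req_header {
    uint32_t size;
    uint32_t id;
    uint32_t error;
};

/* Returns 1 when the buffer can hold a header and the size recorded in
 * that header equals the iovec length; returns 0 otherwise. */
static int sketch_msg_len_ok(const struct iovec *iov)
{
    struct sketch_req_header hdr;

    if (iov == NULL || iov->iov_base == NULL)
        return 0;
    if (iov->iov_len < sizeof(hdr))
        return 0;

    /* Copy the header out so an unaligned buffer is not a problem. */
    memcpy(&hdr, iov->iov_base, sizeof(hdr));

    return (size_t)hdr.size == iov->iov_len;
}
```

With the values from this crash (header size 712, iov_len 1306) the
check would return 0, so the message could be logged and dropped instead
of being handed to make_local_event.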

#0  0x4207c46c in memcpy () from /lib/i686/libc.so.6
#1  0x08055f03 in make_local_event (p=0xb7f0a00c, eci=0x1201a8c0) at evt.c:1784
#2  0x080573c1 in evt_remote_evt (msg=0xb7f0a00c, source_addr=
      {s_addr = 302098624}, endian_conversion_required=0) at evt.c:2742
#3  0x0804b432 in deliver_fn (source_addr={s_addr = 302098624},
    iovec=0x8075b18, iov_len=1, endian_conversion_required=0) at main.c:702
#4  0x08061807 in totempg_deliver_fn (source_addr={s_addr = 302098624},
    iovec=0x80ebd58, iov_len=1, endian_conversion_required=0) at totempg.c:314
#5  0x0805f86b in messages_deliver_to_app (skip=0, start_point=0x806d2c8,
    end_point=32) at totemsrp.c:2847
#6  0x0805fb4e in message_handler_mcast (system_from=0xbffff850,
    iovec=0x806c5e0, iov_len=1, bytes_received=1472,
    endian_conversion_needed=0) at totemsrp.c:2970
#7  0x080611f6 in recv_handler (handle=0, fd=7, revents=1, data=0x0,
    prio=0x80bc628) at totemsrp.c:3315
#8  0x0805ae5e in poll_run (handle=0) at aispoll.c:386
#9  0x0804bb67 in main (argc=1, argv=0xbffffa34) at main.c:1005
#10 0x420158d4 in __libc_start_main () from /lib/i686/libc.so.6

(gdb) p /x *p
$41 = {led_head = {size = 0x2c8, id = 0x13, error = 0x0}, led_in_addr = {
    s_addr = 0x1201a8c0}, led_receive_time = 0xf5da2a3970817c8,
  led_svr_channel_handle = 0x0, led_lib_channel_handle = 0x4e206275,
  led_chan_name = {length = 0x12, value = {0x45, 0x56, 0x45, 0x4e, 0x54, 0x5f,
      0x54, 0x45, 0x53, 0x54, 0x5f, 0x43, 0x48, 0x41, 0x4e, 0x4e, 0x45, 0x4c,
      0x0, 0x0, 0x0, 0x0, 0x9c, 0x11, 0xfa, 0xb7, 0xff, 0xff, 0xff, 0x1f,
      0xff, 0xff, 0xff, 0xff, 0x7, 0x0, 0x0, 0x0, 0x7, 0x0, 0x0, 0x0, 0x0,
      0x0, 0x0, 0x0, 0x80, 0x8, 0x0, 0xb8, 0x0, 0x0, 0x0, 0x0, 0xe0, 0xf8,
      0xff, 0xbf, 0xff, 0xff, 0xff, 0x1f, 0x8, 0x0 <repeats 11 times>, 0xc0,
      0x2b, 0x7a, 0x0, 0x70, 0xf9, 0xff, 0xbf, 0xe0, 0x6, 0x0, 0xb8, 0x8, 0x0,
      0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x58, 0x56, 0x0, 0x42, 0x0, 0x10, 0xfa,
      0xb7, 0x0, 0x0, 0x0, 0x0, 0xdc, 0x61, 0x12, 0x42, 0x7, 0x0, 0x0, 0x0,
      0x0, 0x0, 0x0, 0x0, 0x78, 0xf9, 0xff, 0xbf, 0xd9, 0xe3, 0x2, 0x42, 0x0,
      0x0, 0x1, 0x0 <repeats 125 times>, 0xe8, 0xf9}},
  led_event_id = 0x1201a8c00029579f, led_sub_id = 0x4e206275,
  led_publisher_node_id = 0x1201a8c0, led_publisher_name = {length = 0xd,
    value = {0x54, 0x65, 0x73, 0x74, 0x20, 0x50, 0x75, 0x62, 0x20, 0x4e, 0x61,
      0x6d, 0x65, 0x0 <repeats 243 times>}}, led_retention_time = 0x0,
  led_publish_time = 0xf5da2a3580bf670, led_priority = 0x2,
  led_user_data_offset = 0x6c, led_user_data_size = 0x2c80000,
  led_patterns_number = 0x130000, msg_id = 0x0, led_body = 0xb7f0a268}

led_user_data_size should be zero, and the patterns number should be 4.
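
One pattern in the dump may be worth noting: 0x2c80000 is 0x2c8 shifted
left by 16 bits, and 0x130000 is 0x13 shifted left by 16 bits, and 0x2c8
(712) and 0x13 are exactly the size and id in led_head.  On a
little-endian machine that would be consistent with a copy of the header
landing two bytes out of alignment on top of these fields, though that
is only a guess from the numbers.  A small sketch of the arithmetic,
with a hypothetical helper name:

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when `field` is exactly `hdr_value`
 * shifted left by 16 bits (low 16 bits zero), 0 otherwise. */
static int sketch_is_shifted16(uint32_t field, uint32_t hdr_value)
{
    return (field >> 16) == hdr_value && (field & 0xffffu) == 0;
}
```

Both sketch_is_shifted16(0x2c80000, 0x2c8) and
sketch_is_shifted16(0x130000, 0x13) return 1 for the values printed
above.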

Looking up the stack a little:
#3  0x0804b432 in deliver_fn (source_addr={s_addr = 302098624},
    iovec=0x8075b18, iov_len=1, endian_conversion_required=0) at main.c:702
702             res = aisexec_handler_fns[header->id](header, source_addr, endian_conversion_required);
(gdb) p *iovec
$42 = {iov_base = 0xb7f0a00c, iov_len = 1306}

This happened just after a config change.  No one left or came into
membership.  It seems like the assembly areas should have been OK, but
the 1306 looks a little funny.

Mark.


-- 
Mark Haverkamp <markh at osdl.org>