[Openais] Re: recent segfault

Steven Dake sdake at mvista.com
Thu Feb 3 14:12:21 PST 2005


Mark

Cool I'll start testing with it.

I have noticed there is definately still some bugs; I think related to
recovery in totemsrp.  I've got a handle on these cases, but I'm not
quite sure how to fix them yet.  I did notice a possible bug in your
patch when I looked at it earlier.

In the previous patch you sent I noticed a problem and fixed it.  I
think this code has the same problem.  Here is the code:

    for  (i = 0; i < msg_count; i++) {
        /*
         * If the first packed message is a continuation
         * of a previous message, but the assembly buffer
         * is empty, then we need to discard it since we can't
         * assemble a complete message.
         */
        if (mcast->continuation && (assembly->index == 0)) {
            assembly->index += msg_lens[i];
            mcast->continuation = 0;
            continue;
        }

        assembly->index += msg_lens[i];
        app_deliver_fn (source_addr, &iov_delv, 1,
            endian_conversion_required);
        iov_delv.iov_base = &assembly->data[assembly->index];
        iov_delv.iov_len = msg_lens[i + 1];
    }

in the case of continuation being set, assembly->index is increased. 
Shouldn't iov_delv.iov_base and iov_delv.iov_len should also be moved to
the next message?

Something like:

        if (mcast->continuation && (assembly->index == 0)) {
            assembly->index += msg_lens[0];
            mcast->continuation = 0;
	iov_delv.iov_base = &assembly->data[msg_lens[0]];
	iov_delv.iov_len = msg_lens[1];
            continue;
        }

Also it isn't valid to change mcast->continuation.  This value must be
cached if you intend to change it.  The reason is that it may be
delivered to one processor, but then another processor requests it be
retransmitted.  If that happened, the retransmit list would contain the
modified header which would .

All data in the delivered message must be read only for this reason.

Good job
-steve

On Thu, 2005-02-03 at 14:46, Mark Haverkamp wrote:
> On Wed, 2005-02-02 at 12:09 -0700, Steven Dake wrote:
> [...]
> > 
> > I'll give it a run with my patch to fix unrecovered messages during
> > recovery.  Please feel free to rewrite/improve the assembly code.  I'm
> > sure it can be improved.  Only thing to keep in mind is that a message
> > from a local processor is different from a remote processor.
> 
> Steve,
> Here is an addition to my previous patch.  It compresses the two
> delivery loops into one.  I've been running it for a short while and
> have generated some config changes using it.
> 
> Mark.
> 




More information about the Openais mailing list