[Openais] [PATCH 2/2] Avoid copying buffers where possible in the Infiniband driver

Zane Bitter zane.bitter at gmail.com
Mon Mar 28 21:06:55 PDT 2011

On 2011/03/28, at 02:31, Steven Dake wrote:

> Yes, the receive buffers handling was changed by your patch.  What
> happens if a recv buffer is delivered to totemsrp?  That recv buffer is
> put back in the "send buffers free list".  Prior to this change, the
> recv buffer would be posted back to the recv queue via ibv_post_recv.
> totemsrp doesn't know anything about recv buffer vs send buffer, and I'd
> prefer to keep the complexities of rdma out of totemsrp.c.
> Regards
> -steve

OK, that's true: I am assuming that anything passed to iba_mcast_flush_send() or iba_mcast_noflush_send() is actually a send_buf, which is wrong. I went through the code to trace where each buffer passed to either of these functions is allocated. Full details are below, but here is the executive summary:

* buffers passed to totemrrp_token_send() are allocated on the stack or in the totemsrp instance.
  - I think we already decided to keep doing the memcpy() in the iba driver for these
* buffers passed to totemrrp_mcast_flush_send() are allocated on the stack.
  - This makes my patch wrong - we should probably continue to do a memcpy() in the iba driver for these too
* buffers passed to totemrrp_mcast_noflush_send() are ultimately allocated by totemsrp_buffer_alloc()
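
To make the summary concrete, here is a minimal sketch in plain C (no libibverbs) of the two send paths it implies: a copying path for buffers that may live on the caller's stack, and a zero-copy path for buffers that came from the driver's allocator. Every name in it (sketch_send_buf, sketch_send_copy(), sketch_send_nocopy(), sketch_post_send()) is hypothetical and is not taken from the patch or from the iba driver; sketch_post_send() only records the pointer that a real driver would hand to ibv_post_send().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUF_SIZE 1500 /* hypothetical maximum frame size */

/* Hypothetical driver-owned send buffer; in a real driver this would be
 * registered memory handed out by the driver's allocator. */
struct sketch_send_buf {
	unsigned char data[SKETCH_BUF_SIZE];
	size_t len;
};

/* Stand-in for posting a send work request: records what would be sent. */
static const void *sketch_last_posted;
static size_t sketch_last_posted_len;

static void sketch_post_send(const void *data, size_t len)
{
	sketch_last_posted = data;
	sketch_last_posted_len = len;
}

/* Copying path (token and flush sends): msg may live on the caller's
 * stack, so it is copied into a driver-owned buffer before posting. */
static int sketch_send_copy(struct sketch_send_buf *sb, const void *msg,
	size_t len)
{
	assert(sb != NULL && msg != NULL);
	if (len > SKETCH_BUF_SIZE) {
		return -1;
	}
	memcpy(sb->data, msg, len);
	sb->len = len;
	sketch_post_send(sb->data, sb->len);
	return 0;
}

/* Zero-copy path (noflush sends): only valid when the message was built
 * in place in a buffer obtained from the driver's allocator. */
static int sketch_send_nocopy(struct sketch_send_buf *sb, size_t len)
{
	assert(sb != NULL);
	if (len > SKETCH_BUF_SIZE) {
		return -1;
	}
	sb->len = len;
	sketch_post_send(sb->data, sb->len);
	return 0;
}
```

The point of the split is that the zero-copy path is only safe if the caller can guarantee where the buffer came from, which is exactly the assumption in question here.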

I definitely agree that passing a receive buffer to iba_mcast_noflush_send() would be bad, but if it's happening I can't find where. Intuitively, it seems plausible to me that it never happens: if we receive a packet we are not going to just transmit it straight away; we need to make a copy in case we need to retransmit it later, and the net driver will want its buffer back as soon as the deliver call up to totemsrp is finished (in the iba case it gets reposted to the receive queue immediately once iba_deliver_fn() returns).
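
The ownership rule described above can be sketched as follows. This is an illustration in plain C, not the driver's code: sketch_recv_buf, sketch_recv_complete(), sketch_protocol and sketch_deliver() are all hypothetical names, and the "repost" is modelled by a flag where a real driver would call ibv_post_recv().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_RECV_SIZE 1500 /* hypothetical receive buffer size */

/* Hypothetical receive buffer owned by the net driver. */
struct sketch_recv_buf {
	unsigned char data[SKETCH_RECV_SIZE];
	int posted; /* 1 while the driver's receive queue owns the buffer */
};

typedef void (*sketch_deliver_fn)(void *context, const void *msg, size_t len);

/* Driver side: lend the buffer to the upper layer only for the duration
 * of the deliver callback, then take it back straight away. */
static void sketch_recv_complete(struct sketch_recv_buf *rb, size_t len,
	sketch_deliver_fn deliver, void *context)
{
	assert(len <= SKETCH_RECV_SIZE);
	rb->posted = 0;
	deliver(context, rb->data, len);
	rb->posted = 1; /* a real driver would repost the buffer here */
}

/* Hypothetical upper-layer state holding a private copy of the message. */
struct sketch_protocol {
	unsigned char *saved;
	size_t saved_len;
};

/* Upper layer: anything needed after the callback returns (for example
 * for a later retransmit) must be copied out of the receive buffer. */
static void sketch_deliver(void *context, const void *msg, size_t len)
{
	struct sketch_protocol *proto = context;
	unsigned char *copy = malloc(len);

	if (copy == NULL) {
		return; /* drop the message in this sketch */
	}
	memcpy(copy, msg, len);
	free(proto->saved);
	proto->saved = copy;
	proto->saved_len = len;
}
```

Under this rule the upper layer never holds a pointer into a receive buffer after the callback returns, so such a pointer cannot reach a send function later on.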

Tracing (somewhat less exhaustively) in the other direction, through the deliver callbacks, I see the following things happening to the receive buffers:
* message_handler_orf_token(): copied to the stack, sometimes retransmitted from there
* message_handler_mcast(): copied to a buffer allocated with totemsrp_buffer_alloc(), and that buffer is added to a sort queue
* message_handler_memb_merge_detect(): copied to the stack
* message_handler_memb_join(): passed to memb_join_process(), no transmit
* message_handler_memb_commit_token(): copied to the stack, sometimes retransmitted from there
* message_handler_token_hold_cancel(): nothing

But if I am still missing something, I would be _very_ happy to find out where it is ;)


- totemsrp.c:2585 instance->orf_token_retransmit
- totemsrp.c:2669 passed to token_send()
- totemsrp.c:2848 passed to memb_state_commit_token_send_recovery()
- totemsrp.c:2876 instance->commit_token (== instance->commit_token_storage)

- totemsrp.c:2741 <stack>
- totemsrp.c:3619 <stack>

- totemsrp.c:1992 passed to memb_state_recovery_enter()

- totemsrp.c:4309 alloca()

- totemsrp.c:2701 <stack>
- totemsrp.c:3011 <stack>
- totemsrp.c:3081 <stack>
- totemsrp.c:3101 <stack>

- totemsrp.c:2273 instance->regular_sort_queue
- totemsrp.c:2273 instance->recovery_sort_queue
- totemsrp.c:2424 instance->retrans_message_queue
- totemsrp.c:2424 instance->new_message_queue

- totemsrp.c:1691 instance->recovery_sort_queue
- totemsrp.c:1790 instance->recovery_sort_queue
- totemsrp.c:2422 instance->new_message_queue
- totemsrp.c:3869 totemsrp_buffer_alloc()

- totemsrp.c:2422 instance->retrans_message_queue
- totemsrp.c:3869 totemsrp_buffer_alloc()

- totemsrp.c:2127 totemsrp_buffer_alloc()

- totemsrp.c:2214 totemsrp_buffer_alloc()
