[Openais] [PATCH 2/2] Avoid copying buffers where possible in the Infiniband driver

Steven Dake sdake at redhat.com
Tue Mar 29 14:18:29 PDT 2011


On 03/28/2011 09:06 PM, Zane Bitter wrote:
> 
> On 2011/03/28, at 02:31, Steven Dake wrote:
> 
>>
>> Yes, the receive buffers handling was changed by your patch.  What
>> happens if a recv buffer is delivered to totemsrp?  That recv buffer is
>> put back in the "send buffers free list".  Prior to this change, the
>> recv buffer would be posted back to the recv queue via ibv_post_recv.
>> totemsrp doesn't know anything about recv buffer vs send buffer, and I'd
>> prefer to keep the complexities of rdma out of totemsrp.c.
>>
>> Regards
>> -steve
> 
> OK, that's true, I am assuming that anything passed to iba_mcast_flush_send() or iba_mcast_noflush_send() is actually a send_buf (which is wrong!). I did a hunt through the code to trace back where each buffer that is sent to either of these functions is allocated. Full details are below, but the executive summary:
> 
> * buffers passed to totemrrp_token_send() are allocated on the stack or in the totemsrp instance.
>   - I think we already decided to keep doing the memcpy() in the iba driver for these
> * buffers passed to totemrrp_mcast_flush_send() are allocated on the stack.
>   - This makes my patch wrong - we should probably continue to do a memcpy() in the iba driver for these too
> * buffers passed to totemrrp_mcast_noflush_send() are ultimately allocated by totemsrp_buffer_alloc()
> 
> I definitely agree that passing a receive buffer to iba_mcast_noflush_send() would be bad, but if it's happening I can't find where. Intuitively, this seems plausible to me: if we receive a packet we are not going to just transmit it straight away; we need to make a copy in case we need to retransmit it again later, and the net driver will want its buffer back as soon as the deliver call up to totemsrp is finished (in the iba case it gets reposted to the receive queue immediately once iba_deliver_fn() returns).
> 

When a message is received via totem, it triggers:
static int mcast_cq_recv_event_fn (hdb_handle_t poll_handle,  int
events,  int suck,  void *context)

This causes the recv buffer to be posted back to the hardware recv
buffer list:
mcast_recv_buf_post in the same funtion

In your patches, when a message is received via totem, it is delivered,
and then freed via totemsrp.c:messages_free() (which adds it to a list
of free messages for the send queue) but never posts it back to the recv
buffer.

Regards
-steve

> Tracing (somewhat less exhaustively) in the other direction, through the deliver callbacks, I see the following things happening to the receive buffers:
> message_handler_orf_token() copied to the stack, sometimes retransmitted from there
> message_handler_mcast() copied to a buffer allocated with totemsrp_buffer_alloc() and that buffer added to a sort queue
> message_handler_memb_merge_detect() copied to the stack
> message_handler_memb_join() passed to memb_join_process(), no transmit
> message_handler_memb_commit_token() copied to the stack, sometimes retransmitted from there
> message_handler_token_hold_cancel() nothing
> 
> But if I am still missing something, I would be _very_ happy to find out where it is ;)
> 
> thanks!
> Zane.
> 
> 
> 
> totemrrp_token_send()
> - totemsrp.c:2585 instance->orf_token_retransmit
> - totemsrp.c:2669 passed to token_send()
> - totemsrp.c:2848 passed to memb_state_commit_token_send_recovery()
> - totemsrp.c:2876 instance->commit_token (== instance->commit_token_storage)
> 
> token_send()
> - totemsrp.c:2741 <stack>
> - totemsrp.c:3619 <stack>
> 
> memb_state_commit_token_send_recovery()
> - totemsrp.c:1992 passed to memb_state_recovery_enter()
> 
> memb_state_recovery_enter()
> - totemsrp.c:4309 alloca()
> 
> 
> totemrrp_mcast_flush_send()
> - totemsrp.c:2701 <stack>
> - totemsrp.c:3011 <stack>
> - totemsrp.c:3081 <stack>
> - totemsrp.c:3101 <stack>
> 
> 
> totemrrp_mcast_noflush_send()
> - totemsrp.c:2273 instance->regular_sort_queue
> - totemsrp.c:2273 instance->recovery_sort_queue
> - totemsrp.c:2424 instance->retrans_message_queue
> - totemsrp.c:2424 instance->new_message_queue
> 
> instance->regular_sort_queue
> - totemsrp.c:1691 instance->recovery_sort_queue
> - totemsrp.c:1790 instance->recovery_sort_queue
> - totemsrp.c:2422 instance->new_message_queue
> - totemsrp.c:3869 totemsrp_buffer_alloc()
> 
> instance->recovery_sort_queue
> - totemsrp.c:2422 instance->retrans_message_queue
> - totemsrp.c:3869 totemsrp_buffer_alloc()
> 
> instance->retrans_message_queue
> - totemsrp.c:2127 totemsrp_buffer_alloc()
> 
> instance->new_message_queue
> - totemsrp.c:2214 totemsrp_buffer_alloc()
> 



More information about the Openais mailing list