[Openais] Re: latest svn tree

Steven Dake sdake at redhat.com
Thu Apr 27 15:05:18 PDT 2006


Mark

Thanks for the stack backtrace.

I understand the problem.

A mutex was locked, but it was released with pthread_mutex_unlock only
in the prioritized thread, and only when fd==1 and ufd.revents was set.
If these conditions weren't met, the loop would go around again, try to
lock the same mutex while still holding it, and deadlock.

Patch attached to fix (against trunk).
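
The attached patch is not reproduced in this archive, so the following is
only a minimal sketch of the pattern described above, not the actual
openais code or the actual fix. The function names (run_loop_buggy,
run_loop_fixed, init_errorcheck_mutex) and the ready[] array, which stands
in for "poll reported an event", are hypothetical. It also assumes an
error-checking mutex (PTHREAD_MUTEX_ERRORCHECK), so that locking the mutex
a second time returns EDEADLK instead of hanging the way a default mutex
would.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Create an error-checking mutex: a second lock by the owning thread
 * returns EDEADLK instead of blocking forever. */
static int init_errorcheck_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Buggy shape: the mutex is unlocked only on the "event ready" path.
 * Returns 0 if every lock succeeded, or EDEADLK on a self-relock. */
int run_loop_buggy(const bool *ready, int iterations)
{
    pthread_mutex_t m;
    bool held = false;
    int rc = init_errorcheck_mutex(&m);
    assert(rc == 0);

    for (int i = 0; i < iterations; i++) {
        rc = pthread_mutex_lock(&m);
        if (rc != 0)
            break;              /* still held from an earlier iteration */
        held = true;
        if (ready[i]) {
            pthread_mutex_unlock(&m);
            held = false;
        }
        /* else: loop again with the mutex still locked */
    }

    if (held)
        pthread_mutex_unlock(&m);   /* release before destroying */
    pthread_mutex_destroy(&m);
    return rc;
}

/* One possible repair: unlock on every path through the loop body. */
int run_loop_fixed(const bool *ready, int iterations)
{
    pthread_mutex_t m;
    int rc = init_errorcheck_mutex(&m);
    assert(rc == 0);

    for (int i = 0; i < iterations; i++) {
        rc = pthread_mutex_lock(&m);
        if (rc != 0)
            break;
        if (ready[i]) {
            /* handle the event while holding the mutex */
        }
        pthread_mutex_unlock(&m);   /* unconditional release */
    }

    pthread_mutex_destroy(&m);
    return rc;
}
```

Calling run_loop_buggy with a ready[] array that contains a false entry
followed by at least one more iteration makes the second
pthread_mutex_lock fail with EDEADLK, which is the point where a default
mutex would block forever. run_loop_fixed with the same input completes
every iteration.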

On Thu, 2006-04-27 at 14:33 -0700, Mark Haverkamp wrote:
> On Thu, 2006-04-27 at 14:02 -0700, Steven Dake wrote:
> > Mark
> > next time run with debug build if possible.
> > 
> > the frame pointers are omitted in the release build and this backtrace
> > isn't helpful at identifying the mutex...
> 
> It just happened again.  Two machines stuck, two OK.
> 
> 1st stuck machine:
> 
> (gdb) thread 1
> [Switching to thread 1 (Thread -151132480 (LWP 28073))]#0  0x006567a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> (gdb) where
> #0  0x006567a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1  0x008ac21e in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
> #2  0x008a8dcf in _L_mutex_lock_32 () from /lib/tls/libpthread.so.0
> #3  0xfef5a838 in ?? ()
> #4  0x0072e6b6 in poll () from /lib/tls/libc.so.6
> #5  0x0804c3b6 in poll_run (handle=0) at aispoll.c:398
> #6  0x08060863 in main () at main.c:546
> (gdb) thread 2
> [Switching to thread 2 (Thread -155717040 (LWP 28305))]#0  0x006567a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> (gdb) where
> #0  0x006567a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1  0x008ac21e in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
> #2  0x008a8dcf in _L_mutex_lock_32 () from /lib/tls/libpthread.so.0
> #3  0xf6b7eab8 in ?? ()
> #4  0x0072e6b6 in poll () from /lib/tls/libc.so.6
> #5  0x080629c9 in prioritized_poll_thread (conn=0xf6907968) at ipc.c:371
> #6  0x008a71d5 in start_thread () from /lib/tls/libpthread.so.0
> #7  0x007382da in clone () from /lib/tls/libc.so.6
> (gdb) thread 3
> [Switching to thread 3 (Thread -156077488 (LWP 28304))]#0  0x006567a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> (gdb) where
> #0  0x006567a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1  0x0072e6a4 in poll () from /lib/tls/libc.so.6
> #2  0x080629ab in prioritized_poll_thread (conn=0xf69078b8) at ipc.c:366
> #3  0x008a71d5 in start_thread () from /lib/tls/libpthread.so.0
> #4  0x007382da in clone () from /lib/tls/libc.so.6
> (gdb) thread 4
> [Switching to thread 4 (Thread -156437936 (LWP 28085))]#0  0x006567a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> (gdb) where
> #0  0x006567a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1  0x0072e6a4 in poll () from /lib/tls/libc.so.6
> #2  0x080629ab in prioritized_poll_thread (conn=0xf6903588) at ipc.c:366
> #3  0x008a71d5 in start_thread () from /lib/tls/libpthread.so.0
> #4  0x007382da in clone () from /lib/tls/libc.so.6
> (gdb) thread 5
> [Switching to thread 5 (Thread -156798384 (LWP 28084))]#0  0x006567a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> (gdb) where
> #0  0x006567a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1  0x008ac21e in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
> #2  0x008a8dcf in _L_mutex_lock_32 () from /lib/tls/libpthread.so.0
> #3  0xf6a76ab8 in ?? ()
> #4  0x0072e6b6 in poll () from /lib/tls/libc.so.6
> #5  0x080629c9 in prioritized_poll_thread (conn=0xf6902cd0) at ipc.c:371
> #6  0x008a71d5 in start_thread () from /lib/tls/libpthread.so.0
> #7  0x007382da in clone () from /lib/tls/libc.so.6
> 
> 2nd stuck machine:
> (gdb) thread 1
> [Switching to thread 1 (Thread -151132480 (LWP 6822))]#0  0x0092e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> (gdb) where
> #0  0x0092e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1  0x00b8421e in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
> #2  0x00b80dcf in _L_mutex_lock_32 () from /lib/tls/libpthread.so.0
> #3  0xfef095f8 in ?? ()
> #4  0x00a066b6 in poll () from /lib/tls/libc.so.6
> #5  0x0804c3b6 in poll_run (handle=0) at aispoll.c:398
> #6  0x08060863 in main () at main.c:546
> (gdb) thread 2
> [Switching to thread 2 (Thread -155303344 (LWP 7133))]#0  0x0092e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> (gdb) where
> #0  0x0092e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1  0x00b8421e in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
> #2  0x00b80dcf in _L_mutex_lock_32 () from /lib/tls/libpthread.so.0
> #3  0xf6be3ab8 in ?? ()
> #4  0x00a066b6 in poll () from /lib/tls/libc.so.6
> #5  0x080629c9 in prioritized_poll_thread (conn=0x87ff2a8) at ipc.c:371
> #6  0x00b7f1d5 in start_thread () from /lib/tls/libpthread.so.0
> #7  0x00a102da in clone () from /lib/tls/libc.so.6
> (gdb) thread 3
> [Switching to thread 3 (Thread -154684848 (LWP 7132))]#0  0x0092e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> (gdb) where
> #0  0x0092e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1  0x00a066a4 in poll () from /lib/tls/libc.so.6
> #2  0x080629ab in prioritized_poll_thread (conn=0x8814390) at ipc.c:366
> #3  0x00b7f1d5 in start_thread () from /lib/tls/libpthread.so.0
> #4  0x00a102da in clone () from /lib/tls/libc.so.6
> (gdb) thread 4
> [Switching to thread 4 (Thread -156798384 (LWP 6838))]#0  0x0092e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> (gdb) where
> #0  0x0092e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1  0x00a066a4 in poll () from /lib/tls/libc.so.6
> #2  0x080629ab in prioritized_poll_thread (conn=0x8769988) at ipc.c:366
> #3  0x00b7f1d5 in start_thread () from /lib/tls/libpthread.so.0
> #4  0x00a102da in clone () from /lib/tls/libc.so.6
> (gdb) thread 5
> [Switching to thread 5 (Thread -156437936 (LWP 6837))]#0  0x0092e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> (gdb) where
> #0  0x0092e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1  0x00b8421e in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
> #2  0x00b80dcf in _L_mutex_lock_32 () from /lib/tls/libpthread.so.0
> #3  0xf6aceab8 in ?? ()
> #4  0x00a066b6 in poll () from /lib/tls/libc.so.6
> #5  0x080629c9 in prioritized_poll_thread (conn=0x8782550) at ipc.c:371
> #6  0x00b7f1d5 in start_thread () from /lib/tls/libpthread.so.0
> #7  0x00a102da in clone () from /lib/tls/libc.so.6
> 
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mutex-fix-2.patch
Type: text/x-patch
Size: 3003 bytes
Desc: not available
Url : http://lists.linux-foundation.org/pipermail/openais/attachments/20060427/1be8db4b/mutex-fix-2-0001.bin
