[Ksummit-2012-discuss] [ATTEND] kernel debugging, tracing, backtracing

Jason Wessel jason.wessel at windriver.com
Mon Jun 18 05:26:09 UTC 2012

While no one has mentioned it yet, I am certainly up for discussing the
future of debugging and tracing.

Over the last few years it appears there has been a lot less need for
a stop mode debugger, so long as you have a good backtrace, printk, or
kdump.  I am certainly curious how others feel about this.  I am also
curious whether we have some kind of mechanism to get a poll out to
folks such as our core community.  Obviously, if no one cares to even
respond to a poll, that tells us something as well. :-)

There are a number of orphaned debug tools that, with some focus, could
potentially make it into the mainline, but there does not seem to be
high demand for them.  One of the main reasons these have become orphans is
that each requires some invasive change to a particular subsystem.  I
am not entirely sure that the various subsystem maintainers would even
allow the kinds of changes required, or if we truly have a need for
these tools.

The orphan list:
   * KDB shell with USB keyboard
        - Requires invasive poll changes to USB stack
   * kgdboe (KGDB over Ethernet)
        - Requires invasive changes to scheduling and NET POLL / net core
          to create a "safe" state to use the ethernet HW
   * kgdbou (KGDB over USB serial)
        - Requires invasive poll changes to the USB stack and change
          to the existing CONSOLE_POLL API

More recently I have been looking at lowering the barrier to entry for
reading kernel oops messages, at the expense of a bit of memory for
kernel modules and the core kernel, by annotating each frame with its
source file and line number.  See the trace below.

Call Trace:
 [<ffffffff815f3003>] panic+0xbd/0x14 panic.c:111
 [<ffffffff815f31f4>] ? printk+0x68/0xd printk.c:765
 [<ffffffffa0000175>] panic_write+0x25/0x30 [test_panic] test_panic.c:189
 [<ffffffff8118aa96>] proc_file_write+0x76/0x21 generic.c:226
 [<ffffffff8118aa20>] ? __proc_create+0x130/0x21 generic.c:211
 [<ffffffff81185678>] proc_reg_write+0x88/0x21 inode.c:218
 [<ffffffff81125718>] vfs_write+0xc8/0x20 read_write.c:435
 [<ffffffff811258d1>] sys_write+0x51/0x19 read_write.c:457
 [<ffffffff815f84d9>] ia32_do_call+0x13/0xc ia32entry.S:427

The above was posted in April - http://lkml.org/lkml/2012/4/20/427
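
To make the annotated format concrete, here is a minimal user-space
sketch in Python that splits one frame line of the trace above into
its parts.  The regular expression and the parse_frame() name are
illustrative assumptions derived only from the sample lines shown
here; they are not part of the RFC patch set.

```python
import re

# Assumed layout, inferred from the sample trace only:
#   [<address>] [? ]symbol+offset/size [[module]] [file:line]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[\w-]+)\])?"
    r"(?:\s+(?P<file>\S+):(?P<line>\d+))?"
)


def parse_frame(text):
    """Return a dict describing one call-trace line, or None if no match."""
    m = FRAME_RE.search(text)
    if m is None:
        return None
    return {
        "addr": int(m.group("addr"), 16),
        # A leading "?" marks an entry the unwinder could not verify.
        "reliable": m.group("unreliable") is None,
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),   # None for the core kernel
        "file": m.group("file"),       # None if the line is not annotated
        "line": int(m.group("line")) if m.group("line") else None,
    }


if __name__ == "__main__":
    sample = (" [<ffffffffa0000175>] panic_write+0x25/0x30 "
              "[test_panic] test_panic.c:189")
    print(parse_frame(sample))
```

The file and line fields are optional in the pattern, so the same
sketch also accepts traces from a kernel without the annotation.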

This raises the question of whether we have a reliable mechanism where
we could pull this data in on demand, since it is kind of large.  The
RFC patch set eats the memory at the time you activate the feature.  Of
course there are a number of other possible consumers of a service
such as this beyond oops (kprobes, perf, kdb, ftrace backtrace).
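
As a rough model of the "pull it in on demand" idea, the Python
sketch below defers loading an address-to-line table until the first
lookup and lets the caller drop it again afterwards.  Everything here
(the LazyLineTable class, the loader callback, the tuple layout) is a
hypothetical illustration of the concept, not how the RFC patch set
works and not a kernel interface.

```python
import bisect


class LazyLineTable:
    """Address -> (file, line) lookup that loads its data on first use."""

    def __init__(self, loader):
        # loader is a hypothetical callback returning an iterable of
        # (start_address, filename, line_number) tuples.
        self._loader = loader
        self._starts = None
        self._entries = None

    @property
    def loaded(self):
        return self._entries is not None

    def lookup(self, addr):
        if self._entries is None:
            # Pay the memory cost only when a consumer actually asks.
            self._entries = sorted(self._loader())
            self._starts = [entry[0] for entry in self._entries]
        # Find the last entry whose start address is <= addr.
        i = bisect.bisect_right(self._starts, addr) - 1
        if i < 0:
            return None
        _, filename, line = self._entries[i]
        return filename, line

    def release(self):
        """Drop the table so the memory can be reclaimed."""
        self._starts = None
        self._entries = None
```

Any of the consumers listed above could share one such table, with
the cost paid only by whoever triggers the first lookup.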

Another point of discussion I would like to have in the hallway track
is around sharing internal software breakpoints.  A few years back when
perf decided to claim the HW breakpoints and HW counters, we solved the
sharing problem between kprobes, perf, kgdb, ptrace and others.  It
seems we need to take another look and consider an API around
software breakpoint exceptions as well.  With more and more tracing
in use, the lack of one can make the kernel debugger and tracing
mutually exclusive, and I don't know that this really needs to be
the case.
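
One possible shape for such sharing, modeled in user-space Python
purely for discussion: a registry that plants a breakpoint
instruction once per address, keeps the original byte, notifies every
registered owner when the breakpoint fires, and restores the byte
only when the last owner leaves.  The class and method names are
hypothetical and do not correspond to any existing kernel API; 0xCC
is the x86 int3 opcode and other architectures would differ.

```python
class SoftBreakpointRegistry:
    """Toy model of several consumers sharing software breakpoints."""

    BREAK_INSN = 0xCC  # x86 int3; architecture specific

    def __init__(self, memory):
        # memory is a bytearray standing in for kernel text.
        self.memory = memory
        self._slots = {}  # addr -> {"saved": byte, "handlers": {owner: fn}}

    def register(self, addr, owner, handler):
        slot = self._slots.get(addr)
        if slot is None:
            # First owner at this address: save the original byte and
            # plant the breakpoint instruction.
            slot = {"saved": self.memory[addr], "handlers": {}}
            self._slots[addr] = slot
            self.memory[addr] = self.BREAK_INSN
        slot["handlers"][owner] = handler

    def unregister(self, addr, owner):
        slot = self._slots[addr]
        del slot["handlers"][owner]
        if not slot["handlers"]:
            # Last owner gone: restore the original instruction byte.
            self.memory[addr] = slot["saved"]
            del self._slots[addr]

    def dispatch(self, addr):
        """Call every handler for addr; return the owners notified."""
        slot = self._slots.get(addr)
        if slot is None:
            return []
        notified = []
        for owner, handler in list(slot["handlers"].items()):
            handler(addr)
            notified.append(owner)
        return notified
```

With this kind of arbitration a tracer and the debugger could both
hold a breakpoint at the same address without either one clobbering
the other's saved instruction.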

Qualifications for myself:
  * kgdb/kdb maintainer
  * 12 years creating multi-architecture embedded Linux systems
  * Chief Linux product architect for Wind River Linux
