[Ksummit-2011-discuss] Discussion Topic proposal: structured data device error logging instead of stone-age free-text printk() fiddling

Thu Jul 21 11:23:28 PDT 2011

On Mon, 11 Jul 2011 14:58:09 +0200 Hannes Reinecke wrote:

> (I've just now subscribed to the list, so I apologize for the 
> incorrect citation)
> 
>  > It's a very old problem, and people continue to fix symptoms
>  > without working on a solution to fix the underlying problem.
>  > Latest mail thread is here:
>  >   https://lkml.org/lkml/2011/7/8/338
>  >
>  > The basic idea is to push out a dictionary of key/values from
>  > the kernel to userspace. It will carry the current free-text
>  > printk, but also additional values to classify the message, or
>  > carry binary data.

[Sorry, I can't find Kay's original posting of this subject, although I'm
sure that I saw it.]

I don't know if this comment goes with Kay's idea or if it's independent,
but I would really like to see an easy way to find out if the system
has suffered an Oops or BUG or other fatal or near-fatal error (but is
still running or limping along).

Currently I grep dmesg output for a series of strings.  That is fragile:
I could easily miss one of the needed search strings, or next week's kernel
could require an update to my search string list, and I wouldn't know about
that until the error happened.  So I would like to see ONE marker in the
kernel log that is used for all fatalities.  Where should I go for this?
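
For illustration, the grep approach described above might look roughly like
the sketch below.  The pattern list and the helper name find_fatal_lines()
are my own assumptions for the example, not a complete or authoritative set
of kernel messages, which is exactly the weakness being described.

```python
import re

# Illustrative patterns only (assumed for this sketch); a real list would
# have to track whatever strings each kernel version happens to print.
FATAL_PATTERNS = [
    r"\bOops\b",
    r"\bBUG:",
    r"kernel BUG at",
    r"Kernel panic",
    r"general protection fault",
]

FATAL_RE = re.compile("|".join(FATAL_PATTERNS))


def find_fatal_lines(log_text):
    """Return the log lines that match any of the known patterns."""
    return [line for line in log_text.splitlines() if FATAL_RE.search(line)]


if __name__ == "__main__":
    # Made-up sample text standing in for dmesg output.
    sample = (
        "usb 1-1: new high-speed USB device\n"
        "BUG: unable to handle kernel NULL pointer dereference\n"
        "Oops: 0002 [#1] SMP\n"
        "some future fatal message with new wording\n"
    )
    for line in find_fatal_lines(sample):
        print(line)
```

The last sample line is not reported because nothing in the list matches
it.  A single agreed marker would reduce the whole list to one pattern.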

Or maybe it should be one log entry in /proc/sys/kernel/bug that won't be
overwritten by subsequent fatalities...
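
The "won't be overwritten" behavior I have in mind is a first-write-wins
slot.  A minimal userspace sketch of just those semantics follows; the class
name FirstFatalLatch is hypothetical, this is not kernel code, and no such
/proc file exists today as far as this mail is concerned.

```python
class FirstFatalLatch:
    """Keep the first fatal report and ignore every later one."""

    def __init__(self):
        self._first = None

    def record(self, message):
        """Store message only if nothing is stored yet.

        Returns True if this call stored the message, False otherwise.
        """
        if self._first is None:
            self._first = message
            return True
        return False

    def read(self):
        """Return the first recorded message, or None if there is none."""
        return self._first


if __name__ == "__main__":
    latch = FirstFatalLatch()
    latch.record("first fatal event")
    latch.record("second fatal event")  # ignored, slot already taken
    print(latch.read())
```

The point of the design is that the original failure, usually the most
useful one for debugging, survives any follow-on errors it causes.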

Kay will implement it?  :)
or just send patches?

thanks,
---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***