[Ksummit-2012-discuss] [ATTEND] kernel core dump and "dying breath"

Cong Wang xiyou.wangcong at gmail.com
Thu Jun 21 14:45:41 UTC 2012


On 06/21/2012 06:57 PM, Jason Wessel wrote:
> On 06/21/2012 05:37 AM, Konrad Rzeszutek Wilk wrote:
>> On Thu, Jun 21, 2012 at 4:05 AM, Cong Wang<xiyou.wangcong at gmail.com>  wrote:
>>> Hi, all,
>>>
>>> I would like to bring the kernel "dying breath" topic to the
>>> Kernel Summit this year. It would cover recent technologies
>>> such as pstore and ramconsole and, of course, kdump and
>>> netconsole as well.
>
> If we are going to make netconsole actually work reliably in all
> contexts, kgdboe can be revived.  Currently there are places
> netconsole simply doesn't work and that printk you were looking
> for... It is never going to be delivered.


Yeah, netconsole requires both the UDP network stack and the network
interface to be functioning well. If the panic/oops happens in the
network transmit path itself, it certainly doesn't work.
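
To make the dependency concrete, here is a minimal sketch of the receiving
side, assuming netconsole is sending to its default target UDP port 6666.
The function names (open_listener, receive_lines) are made up for
illustration and are not part of any existing tool:

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket for incoming netconsole datagrams.

    Port 6666 is assumed here as the netconsole default target port;
    use whatever port the sending kernel was actually configured with.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_lines(sock, count, timeout=5.0):
    """Collect up to `count` datagrams, stopping early on a timeout.

    A timeout only means nothing arrived: it cannot tell a healthy,
    quiet machine from one that died inside the transmit path.
    """
    sock.settimeout(timeout)
    lines = []
    for _ in range(count):
        try:
            data, _addr = sock.recvfrom(65535)
        except socket.timeout:
            break
        lines.append(data.decode("utf-8", errors="replace").rstrip("\n"))
    return lines
```

Everything this listener sees has already passed through the crashing
kernel's UDP stack and NIC driver, so an empty result says nothing about
whether the printk was ever issued.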

>
>>
>> Are there any future projects in the pipeline? Most of these deal with
>> depositing somewhere "why it crashed" information, but are there any
>> that try to omit the cause on the next boot?
>>
>
> Self healing eh?  Short of using a different kernel to run some
> scripts to look at the crash itself and take action such as booting a
> new kernel, OR booting the existing kernel and looking at the previous
> crash information to attempt to take corrective action I am not aware
> of anything.  Things like fsck take this sort of action today, I
> believe you are asking about something of an entirely different level.

A self-healing kernel is a dream, but KS is a place to make such dreams
come true, right? :)

>
> The one thing that does come to mind is that if you did save
> information about the prior crash that you could probably get more
> information on the next crash by automatically inserting a kprobe
> at the crash address that could collect more information automatically
> into the ftrace buffer or "something" depending on the original crash.


It would be very interesting if kprobes could help to "fix" the previous
crash. I think we could go further than just saving the core.

>
> For the really tricky sorts of problems like memory corruption
> however, the addresses tend to move around so this is not likely to
> help much.  I tend to fall back to kdb, and the "kdb death script" (a
> toy of mine that is not in the mainline), where you can assign an
> action to output all the commands you would have otherwise typed, and
> then reboot.


I know next to nothing about kdb, so it is good to know that kdb could
help with this topic as well!

Thanks.

