[Ksummit-2012-discuss] [ATTEND] kernel core dump and "dying breath"

Cong Wang xiyou.wangcong at gmail.com
Thu Jun 21 14:29:39 UTC 2012


On 06/21/2012 06:37 PM, Konrad Rzeszutek Wilk wrote:
> On Thu, Jun 21, 2012 at 4:05 AM, Cong Wang<xiyou.wangcong at gmail.com>  wrote:
>> Hi, all,
>>
>> I would like to bring up the kernel "dying breath" topic
>> at the Kernel Summit this year. It will contain some recent
>> technologies like pstore, ramconsole etc., and of course
>> should also cover kdump and netconsole too.
>
> Are there any future projects in the pipeline? Most of these deal with
> depositing somewhere "why it crashed" information, but are there any
> that try to omit the cause on the next boot?

I think you mean the next reboot in kdump. Yeah, this is a very good 
question.

The main reason why we have to reboot in the kdump kernel is that 
there is currently no way to escape from the memory reserved for 
crashkernel into normal memory. I remember IBM had some ppc-specific way 
to release the vmcore after the core dump is saved and continue to boot 
the kernel without a reboot. It would certainly be nicer to make this 
generic; IOW, we need to find a way to release the vmcore in memory.

The other reason could be that some device drivers don't work well in 
the second kernel, so even if we could boot the kernel normally, those 
devices might not function properly.

What's more, there are also some problems with bringing up all CPUs in 
the second kernel, IIRC. Currently we only bring up one CPU in the 
second kernel, with either nr_cpus=1 or maxcpus=1.
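
To illustrate the single-CPU restriction, here is a rough sketch of how a 
user-space tool might make sure the second kernel's command line carries 
one of those limits. The function name kdump_cmdline and the choice of 
nr_cpus=1 as the default are my own assumptions for illustration, not 
what any existing kdump tool does.

```python
def kdump_cmdline(cmdline: str, param: str = "nr_cpus=1") -> str:
    """Return cmdline with a single-CPU limit appended unless one is set.

    Hypothetical helper: only nr_cpus= and maxcpus= are real kernel
    parameters here; everything else is illustrative.
    """
    tokens = cmdline.split()
    # Leave the command line alone if either limit is already present.
    has_limit = any(t.startswith(("nr_cpus=", "maxcpus=")) for t in tokens)
    if not has_limit:
        tokens.append(param)
    return " ".join(tokens)
```

For example, kdump_cmdline("ro root=/dev/sda1") would add nr_cpus=1, while 
a command line that already has maxcpus=1 is returned unchanged.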

>
> Perhaps a better question is - are there more things in this field
> that can be done?
>

Yeah, there are a few more detailed problems related to kdump:

1. Is it possible to reserve the crashkernel memory at run time? 
Currently we reserve that memory at boot. This memory can be 
released/shrunk at run time, but it can't be grown or newly reserved.

2. Loading the kernel and initrd into a higher memory location. Currently 
the limit for the initrd is 896M. As memory grows to TBs today, there is 
much more memory sitting above 896M, and the first 896M is usually 
heavily fragmented, so it is hard to find a large contiguous memory 
block, say 512M, below 896M. In theory, we could load the kernel up to 
4G or an even higher location.

3. A distro-independent user-space kdump mechanism is needed. Red Hat 
has shipped a Red Hat specific mkdumprd for a long time; currently we 
are moving it to dracut, which is supposed to be distro-independent, but 
there are still problems in making mkdumprd really 
distro-independent.
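
To make point 1 concrete, here is a minimal sketch of the shrink-only 
behaviour as seen from user space. It assumes the reservation size is 
exposed in bytes through /sys/kernel/kexec_crash_size; the helper names 
read_crash_size and plan_resize are hypothetical, and the sketch only 
validates a request rather than writing anything.

```python
from pathlib import Path

# sysfs file exposing the crashkernel reservation size in bytes
# (assumed to be present; it depends on kernel version and config).
CRASH_SIZE = Path("/sys/kernel/kexec_crash_size")


def read_crash_size(path: Path = CRASH_SIZE) -> int:
    """Return the current crashkernel reservation in bytes."""
    return int(path.read_text().strip())


def plan_resize(current: int, requested: int) -> int:
    """Validate a resize request: shrinking is allowed, growing is not."""
    if requested < 0:
        raise ValueError("size must be non-negative")
    if requested > current:
        # This is the limitation described in point 1.
        raise ValueError("cannot grow the crashkernel reservation at run time")
    return requested
```

A tool would call read_crash_size() and pass the result to plan_resize() 
before writing the new size back as root. As far as I recall, the write 
is only accepted while no crash kernel is loaded.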

I'd like to discuss all these things at Kernel Summit, if you are 
interested.

Thanks!