[Openais] [RFC] simple blackbox

Mon Oct 13 05:45:55 PDT 2008

On 2008-10-09T08:51:58, Steven Dake <sdake at redhat.com> wrote:

Hi Steven,

> > The goal was to have a blackbox which we cannot just retroactively dump,
> > but also easily recover from a core or kernel crashdump. 
> 
> the array is global and can easily be obtained from a core file with a
> simple script.

I got that, but I would at least mention that we are looking at crash
scenarios. Not exactly scenarios which are well known for well-behaved
pointers.

> > I would maintain that I'd prefer a ring buffer of messages logged (at +n
> > log levels above the actual log level configured), in their printable
> > char[] form.
> too much overhead - see below.  Also then trace data is lost during
> normal runtime operations.  IMO we want to maintain this trace
> information at all times during every run.

Uhm, but for how long? It seems obvious that the blackbox eventually
overflows and wraps around.

> > That would also ensure that we are looking at exactly the same strings
> > as were eventually logged externally, which could be very much useful to
> > correlate the logs from the blackbox with those in the system logs.
> this is indeed a problem in the current design.

A rather key issue I think.

> Let me explain the real world problem and design goals I had when I
> started so you understand where the implementation comes from.
> 
> I have a real world problem in the checkpoint service where under
> certain circumstances with 10-12 hrs of test runs during recovery
> checkpoint segfaults.  I have instrumented it heavily with logsys and
> the error goes away entirely (because it dramatically changes the timing
> characteristics.  So there is the problem - instrumentation overhead
> causes errors that were once there to disappear.

I understand this. Logging instrumentation of whatever kind will always
subtly change the timing.

However, the idea of instrumenting the general logging call, for example
such as:

- loglevel_general >= notice
- loglevel_blackbox >= debug

Would actually ensure that the debug messages were generated at all
times (just not normally written out). This would thus not affect timing
between test and production runs, as the logging would be identical.

Of course, this means that even at debug log level, the code should not
log so much that it slows down so that it becomes unusable for
production. Some additional internal debug levels might be used for
actual debugging runs during development.

Even with your proposal, I don't expect the tracing would record
_EVERYTHING_ at all times, as that would still be too much data. But
depending on your format, you might be able to squeeze in more data,
sure - as long as we don't overcompensate by logging too large
structures ;-)

> How will this be used?  All of those printfs will be turned into log_rec
> messages.  Since log_rec is super high performance, I am hopeful it
> doesn't impact the reproduction of the error.  Also those log_rec
> commands will be _left in the code_ permanently. 

The same would also be true for instrumenting the general logging code.

> Anytime there is a field failure, instead of providing a log file,
> users would provide the flight data which could be analyzed and
> debugged.  I could look at the trace data and say "wow that event
> looks out of place..".  Complete data structures can be logged instead
> of logging individual pieces of the sprintf functionality.

That is indeed an interesting property. Maybe one worth accepting the
complexity for, even though I'm not sure about that. As long as the logs
can be easily reconciled with the externally observed logs - preferably,
those would use the same infrastructure, just that then the logging
thread would chose to convert and output them as text immediately?

> The goals are:
> 1) super high performance
> 2) tracing left in permanently
> 3) uses circular buffer properly and efficiently
> 4) easy to recover trace data from core files
> 5) allows variable length parameters to be logged which are binary in
> format (such as data structures)
> 6) records all information necessary for the person supporting the
> failure to understand exactly what happened during execution.

I definitely don't disagree with the goals; I hadn't thought of 5), and
am not perfectly sure whether the implementation achieves 4), but maybe
this is a good trade-off to make.

Regards,
    Lars

-- 
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde