[Ksummit-2012-discuss] [ATTEND] writeback and kernel testing

Fengguang Wu fengguang.wu at intel.com
Sat Jun 16 14:50:41 UTC 2012

On Sat, Jun 16, 2012 at 07:07:25AM -0700, Guenter Roeck wrote:
> On Sat, Jun 16, 2012 at 08:44:18PM +0800, Fengguang Wu wrote:
> > Greetings,
> > 
> > I'd like to attend this year's kernel summit.
> > 
> > I may talk about the technical challenges and trade-offs on writeback and memcg
> > and IO controllers with anyone interested, perhaps in some breakout session.
> > 
> > And I would like a chance to talk about doing kernel tests in a timely fashion:
> > whenever one pushes new commits to git.kernel.org, build/boot/stress tests will
> > be kicked off and any errors reported back to the author within hours.
> > 
> > This fast develop-test feedback cycle is enabled by running a test
> > backend that is able to build 25000 kernels and runtime test 3000
> > kernels (assuming 10m boot+testing time for each kernel) each day.
> > Just capable enough to outrace our patch creation rate ;-)
> > 
> > On an average day, 1-2 build errors are caught in the 160 monitored kernel trees.
> > 
> > The runtime tests are still in active development and I'd like to ask for your
> > input on best practices and test methodology for every possible subsystem. The
> > target is to catch _common_ bugs early, so that a) fewer people are impacted and
> > b) bisecting a specific bug is not confused by unrelated but common bugs.
> > 
> > I'll need your help and feedback to run this system well. In return, you'll be
> > able to take better advantage of it once you have a basic understanding of how
> > it runs for you. Hopefully, someday, these diligent machines may bring a little
> > happiness to our stressed lives. Here is the secret of happiness for us kernel
> > developers: if a bug is caught and fixed in my own tree, Cheers!  Otherwise, if
> > the bug has been merged into the upstream tree, OMG, it's out of control and may
> > well impact 1000 commits after it... regret, sadness, guilt, embarrassment,
> > bad commits with my Signed-off-by carved in stone, forever ...
> > 
> I am running a nightly sequence of builds on my tree, for as many targets as
> possible. allyesconfig, allmodconfig, and a large number of randconfigs. That
> helps me find most of the problems I had early on, which only show up in some
> configurations. That combined with a personal rule to only push code upstream
> which has been in my local tree for at least one test cycle helps a lot to avoid
> the embarrassment of breaking linux-next or Linus' tree.

That definitely helps. By doing so you establish yourself as a good
citizen in the kernel arena :)
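
For anyone who wants to try the same, a nightly sequence like the one
described above could be sketched roughly as follows. This is only an
illustration: the architecture list, the number of randconfig runs, the
output directory layout and the -j8 job count are all made-up
assumptions, not taken from Guenter's setup. The script only prints the
make commands; it does not run them.

```python
import shlex

# Hypothetical values for illustration; the thread does not name the
# architectures or the number of randconfig runs. Non-native arches
# would normally also need a CROSS_COMPILE= prefix, omitted here.
ARCHES = ["x86_64", "arm"]
RANDCONFIG_RUNS = 3


def nightly_plan(arches, randconfig_runs):
    """Return (config_cmd, build_cmd) pairs, one per arch/config combination."""
    plan = []
    for arch in arches:
        targets = ["allyesconfig", "allmodconfig"] + ["randconfig"] * randconfig_runs
        for i, target in enumerate(targets):
            out = f"build/{arch}-{target}-{i}"  # separate output dir per build
            base = ["make", f"ARCH={arch}", f"O={out}"]
            # First command generates the .config, second one builds it.
            plan.append((base + [target], base + ["-j8"]))
    return plan


if __name__ == "__main__":
    for config_cmd, build_cmd in nightly_plan(ARCHES, RANDCONFIG_RUNS):
        print(shlex.join(config_cmd), "&&", shlex.join(build_cmd))
```

Keeping one output directory per build lets a failing configuration be
inspected afterwards without being overwritten by the next run.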

Actually, this 0day kernel testing project was started exactly to
spare our kernel team such embarrassments. While doing so, I thought
it could be done in a general and easy-to-use form, so that the same
backend can serve internal as well as public kernel git trees, and
run for as many trees as the machine capacity permits.

I hope it won't become an excuse for developers to skip proper testing
on their own. And all commits should definitely go through linux-next.
We need various testing efforts at each stage to achieve good quality.

> I would love to be able to expand that to runtime tests, specifically for
> hardware monitoring functionality, but that is a bit more than I can afford
> and/or maintain on my own.

Understandable. It is a practical limit that not all developers have
the resources or can afford the time to do heavy tests, especially
commit-by-commit compile/boot bisectability tests. That was exactly
my own situation for many years.

I'm collecting test cases (especially the fast ones) for my testing
backend and you are welcome to send some to me!

> Anyway, someone suggested something similar to me a couple of weeks ago, with a
> small twist: The ability to submit patches into such a test system before they
> are integrated into an upstream kernel. I thought that wasn't feasible, but
> maybe it will be at some point, if you can build 25k kernels per day already.

Yeah, that should be a pretty common impression. Both the hardware
capability and the payoff of catching one bug per day turned out to
be much higher than I expected :-)
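
To put the throughput figures quoted in this thread in perspective,
here is a back-of-envelope calculation. Only the three constants come
from the thread (25000 builds per day, 3000 runtime tests per day, 10
minutes per runtime test). The notion of a "test slot" (one machine or
VM running one kernel at a time) is an assumption made for the
arithmetic and says nothing about how the backend is actually laid out.

```python
# Back-of-envelope check of the figures quoted in this thread.
# Only the three constants below come from the thread; everything
# else is derived, and "test slot" is an assumed unit of parallelism.

BUILDS_PER_DAY = 25_000          # kernels built per day
RUNTIME_TESTS_PER_DAY = 3_000    # kernels boot+tested per day
MINUTES_PER_RUNTIME_TEST = 10    # assumed boot+testing time per kernel

SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_DAY = 24 * 60


def seconds_per_build(builds_per_day: int) -> float:
    """Average wall-clock interval between two finished builds."""
    return SECONDS_PER_DAY / builds_per_day


def concurrent_test_slots(tests_per_day: int, minutes_per_test: int) -> float:
    """Test slots that must stay busy around the clock to sustain the rate."""
    return tests_per_day * minutes_per_test / MINUTES_PER_DAY


if __name__ == "__main__":
    build_gap = seconds_per_build(BUILDS_PER_DAY)
    slots = concurrent_test_slots(RUNTIME_TESTS_PER_DAY, MINUTES_PER_RUNTIME_TEST)
    print(f"one build finishes every {build_gap:.3f} s on average")
    print(f"about {slots:.1f} test slots busy 24 hours a day")
```

So on average a build completes roughly every 3.5 seconds, and the
runtime tests keep about 21 slots busy around the clock.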

