[Ksummit-2012-discuss] [ATTEND] Wither the baseline; EFI breakout session
H. Peter Anvin
hpa at zytor.com
Sat Jun 16 22:25:19 UTC 2012
Hello all,
First, the mandatory self-promotion blurb: I am part of the tip team
maintaining arch/x86 (with tglx and mingo); I tend to be the one focusing
on the x86-specific low-level code.
I would really like to have a discussion about setting the baseline.
The Linux community justifiably prides itself on supporting all kinds
of hardware, including very old hardware. Furthermore, we include all
kinds of odd things like paravirtualization, virtual SMP, and a huge
cross-section of toolchains (mostly gcc and binutils, but at some point we
have supported icc on some architectures, and there is a push toward
building with LLVM and so on). There are people who want to build
current kernels with toolchains that are seven years old, there are
platforms on which current toolchains don't even work, and there are
others who want the kernel to be self-hosting on machines with very
few binaries installed.
All of this imposes a tax on the core development community, and it is
to a large degree borne by the development community rather than the
ones who are pushing the niche usages -- the niche users tend to bear
the upfront cost, but a lot of the ongoing cost is in the form of
extra constraints that are imposed on the development community. What
I am interested in is not so much deciding where the baseline is
*today*, but how to set the baseline on an ongoing basis.
Areas that impose these kinds of constraints:
1. Old or Exotic Toolchains
Users of old toolchains who still demand to build current kernels,
many years later. Historically the cost imposed by this was
largely having to carry bug workarounds around for a very long
time, which could be hard enough (the number of times the macro
bugs in gas 2.16 have forced us to do crazy things...) but as we
are -- finally -- getting a more cooperative working relationship
with the gcc crowd we are starting to have quite a few things
that need a feature in a recent gcc or binutils, and then have
fallback code which will sit, largely untested by the developers,
and bitrot.
arch code is less affected than most -- it tends to be up to the
arch maintainers to decide what toolchain is the minimum on a
particular platform -- but the global baseline as described in
Documentation/Changes is set extremely low (and almost certainly
wrong); furthermore, most of the dependencies are not even covered.
Picking on myself for a bit: the arch-specific requirements really
need to be documented somewhere, too, at least to the extent they
are known.
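The fallback pattern described in item 1 can be sketched as follows. This is a minimal illustration, not kernel code: the names popcount32, popcount32_fallback and HAVE_BUILTIN_POPCOUNT are hypothetical, and the version threshold in the check is an assumption chosen for the example. The point is that the fallback is always compiled and can be compared against the fast path, so it does not sit untested.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Portable fallback (hypothetical name). It is compiled unconditionally
 * so that a test can exercise it even on a new toolchain.
 */
static unsigned int popcount32_fallback(uint32_t x)
{
    unsigned int n = 0;

    while (x) {
        x &= x - 1;     /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Illustrative feature check. The 3.4 threshold is an assumption made
 * for this sketch; a real project would document where it comes from.
 */
#if defined(__GNUC__) && (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4))
#define HAVE_BUILTIN_POPCOUNT 1
#else
#define HAVE_BUILTIN_POPCOUNT 0
#endif

/* Use the compiler builtin where available, the fallback otherwise. */
static unsigned int popcount32(uint32_t x)
{
#if HAVE_BUILTIN_POPCOUNT
    return (unsigned int)__builtin_popcount(x);
#else
    return popcount32_fallback(x);
#endif
}
```

Because both paths are visible to the compiler, a test can compare popcount32() with popcount32_fallback() on the same input, which is one way to keep the rarely used path from bitrotting.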
2. Tools Requirements
There has been a lot of discussion about what tools are reasonable
to require to build the kernel. "Less is better" is a noble
sentiment, but what is the cost?
3. Hooks and Notifiers
Hooks and notifiers are a form of "COME FROM" programming, and they
make it very hard to reason about the code. The only way this
can be reasonably mitigated is by having the exact semantics of a
hook or notifier -- the preconditions, postconditions, and other
invariants -- carefully documented. Experience has shown that in
practice that happens somewhere between rarely and never.
Hooks that terminate into hypercalls or otherwise are empty in the
"normal" flow are particularly problematic, as it is trivial for a
mainstream developer to break them.
Furthermore, these things tend to be named based on where they fit
into a particular flow, for example:
    x86_init.paging.pagetable_setup_start(swapper_pg_dir);
    paging_init();
    x86_init.paging.pagetable_setup_done(swapper_pg_dir);
This may not make any sense if paging_init() is moved to another
part of the architecture initialization flow, and the preconditions
and postconditions change as a result.
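One way to make a hook's contract explicit is sketched below. It is only an illustration under stated assumptions: struct boot_state, struct paging_hooks, hook_noop, call_before, call_after and init_paging are all hypothetical names, not the kernel's x86_init interface. Each hook is named after the condition it depends on rather than its position in the flow, and the documented preconditions and postconditions are checked at the call site.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical boot state that the hook contracts refer to. */
struct boot_state {
    bool page_tables_ready;   /* set once paging setup has finished */
    int hook_calls;           /* counts hook invocations, for illustration */
};

/*
 * Hypothetical hook table.
 *
 * before_page_tables:
 *   Precondition:  page tables are NOT yet set up.
 *   Postcondition: page_tables_ready is unchanged.
 *
 * after_page_tables:
 *   Precondition:  page tables ARE set up.
 *   Postcondition: page_tables_ready is unchanged.
 */
struct paging_hooks {
    void (*before_page_tables)(struct boot_state *state);
    void (*after_page_tables)(struct boot_state *state);
};

/* Default hook that does no real work, as in the "normal" flow. */
static void hook_noop(struct boot_state *state)
{
    state->hook_calls++;
}

static void call_before(const struct paging_hooks *h, struct boot_state *s)
{
    assert(!s->page_tables_ready);    /* documented precondition */
    if (h->before_page_tables)
        h->before_page_tables(s);
    assert(!s->page_tables_ready);    /* documented postcondition */
}

static void call_after(const struct paging_hooks *h, struct boot_state *s)
{
    assert(s->page_tables_ready);     /* documented precondition */
    if (h->after_page_tables)
        h->after_page_tables(s);
    assert(s->page_tables_ready);     /* documented postcondition */
}

/* Stand-in for the initialization flow around paging_init(). */
static void init_paging(const struct paging_hooks *h, struct boot_state *s)
{
    call_before(h, s);
    s->page_tables_ready = true;      /* stands in for the real setup work */
    call_after(h, s);
}
```

With checks like these, moving the setup step elsewhere in the flow trips an assertion immediately instead of silently breaking a hook that is empty in the common configuration.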
4. Archaic Hardware
There are cases in the Linux kernel where the triggering event for
code removal has been that the last known specimen of the hardware
in question has been confirmed destroyed. Other cases have been
that the code has not compiled for several kernel cycles.
Those are the easy cases.
Another reasonably easy case is where a known-to-be-very-rare piece
of hardware is directly in the way of a major restructuring -- at
that point it becomes a matter of just moving ahead and seeing if
anyone is willing to do the work.
However, there are much more subtle cases. We still at least in
theory support i386, although it seems that more often than not the
i386-specific workarounds are broken in various ways. There are
people out there who still occasionally test i386 and send bug
reports, but there is no evidence that anyone is actually *using*
i386 -- quite the contrary, it appears that these testers happen
to have a hardware specimen and happen to go "wonder what would
happen." Should we still bother developing workarounds for i386?
5. Don't Break Userspace
Linus has, rightfully, said "don't ever break userspace". The one
problem with that is that user space also includes malware,
and malware authors are notorious for seeking out the least tested,
most exotic nooks and crannies of the system and exploiting them.
For example, there is support in the kernel for running i386 a.out
binaries on top of an x86-64 kernel. It isn't clear if this
support has *ever* been used by anyone for real work, especially in
this day and age of virtualization, but we definitely have had both
breakage and security holes in that path. Distributions uniformly
turn it off. Should we keep it, with the overhead of keeping it
functional and secure, under the auspices of "don't break
userspace", or should we turn it off and stop spending the effort
maintaining an option that nearly all users will turn off anyway?
There are almost certainly many more areas than these.
I would also like to do a (U)EFI breakout session; it isn't clear to
me if this is a KS or Plumbers topic or even if it matters in any
way, but what is clear is that we will actually have the bulk of the
people working on this in one place in San Diego, and so this is
something that we really should do.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.