[Ksummit-2012-discuss] [ATTEND] Wither the baseline; EFI breakout session

H. Peter Anvin hpa at zytor.com
Sat Jun 16 22:25:19 UTC 2012


Hello all,

First, the mandatory self-promotion blurb: I am part of the tip team
maintaining arch/x86 (with tglx and mingo); I tend to be the one
focusing on the x86-specific low-level code.


I would really like to have a discussion about setting the baseline.

The Linux community justifiably prides itself on supporting all kinds
of hardware, including very old hardware.  Furthermore, we include
all kinds of odd things like paravirtualization, virtual SMP, and a
huge cross section of toolchains (mostly gcc and binutils, but at
some point we have supported icc on some architectures, and there is
a push toward building with LLVM, and so on).  There are people who
want to build current kernels with toolchains that are seven years
old, there are platforms on which current toolchains don't even work,
and there are others who want the kernel to be self-hosting on
machines with very few binaries installed.

All of this imposes a tax on the core development community, and it
is to a large degree borne by the development community rather than
by the ones pushing the niche usages -- the niche users tend to bear
the upfront cost, but a lot of the ongoing cost takes the form of
extra constraints imposed on the development community.  What I am
interested in is not so much deciding where the baseline is *today*,
but how to set the baseline on an ongoing basis.

Areas that impose these kinds of constraints:

1. Old or Exotic Toolchains

   Users of old toolchains still demand to build current kernels, many
   years later.  Historically the cost imposed by this was largely
   having to carry bug workarounds for a very long time, which could
   be hard enough (the number of times the macro bugs in gas 2.16 have
   forced us to do crazy things...), but as we are -- finally --
   getting a more cooperative working relationship with the gcc crowd,
   we are starting to have quite a few things which need a feature in
   a recent gcc or binutils, and then have fallback code which will
   sit, largely untested by the developers, and bitrot.
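
   To illustrate the pattern, here is a sketch modeled loosely (from
   memory, not quoted verbatim) on how the kernel gates its use of
   __builtin_unreachable(), which gcc only grew in 4.5:

        #if GCC_VERSION >= 40500
        /* Preferred path: tell the compiler this point is not reached. */
        # define unreachable() __builtin_unreachable()
        #else
        /*
         * Fallback for old toolchains: this is the branch that sits
         * largely untested by developers and bitrots.
         */
        # define unreachable() do { } while (1)
        #endif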

   arch code is less affected than most -- it tends to be up to the
   arch maintainers to decide what the minimum toolchain is on a
   particular platform -- but the global baseline as described in
   Documentation/Changes is set extremely low (and is almost certainly
   wrong); furthermore, most of the dependencies are not even covered.

   Picking on myself for a bit: the arch-specific requirements really
   need to be documented somewhere, too, at least to the extent they
   are known.

2. Tools Requirements

   There has been a lot of discussion about what tools are reasonable
   to require to build the kernel.  "Less is better" is a noble
   sentiment, but what is the cost?

3. Hooks and Notifiers

   Hooks and notifiers are a form of "COME FROM" programming, and they
   make it very hard to reason about the code.  The only way that can
   reasonably be mitigated is by having the exact semantics of a hook
   or notifier -- the preconditions, postconditions, and other
   invariants -- carefully documented.  Experience has shown that in
   practice this happens somewhere between rarely and never.

   Hooks that terminate in hypercalls, or that are otherwise empty in
   the "normal" flow, are particularly problematic, as it is trivial
   for a mainstream developer to break them.

   Furthermore, these things tend to be named based on where they fit
   into a particular flow, for example:

        x86_init.paging.pagetable_setup_start(swapper_pg_dir);
        paging_init();
        x86_init.paging.pagetable_setup_done(swapper_pg_dir);

   This may not make any sense if paging_init() is moved to another
   part of the architecture initialization flow, and the preconditions
   and postconditions change as a result.
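
   For contrast, here is a minimal sketch of what actually documenting
   such a hook's contract might look like (the declarations and the
   stated invariants below are illustrative, not the real x86_init
   definitions):

        struct x86_init_paging {
                /*
                 * pagetable_setup_start -- called immediately before
                 * paging_init() populates the kernel page tables.
                 *
                 * Preconditions: early fixmaps are usable, the
                 * memblock allocator is available, @base is not yet
                 * active.
                 * Postconditions: none; must not itself switch page
                 * tables.
                 */
                void (*pagetable_setup_start)(pgd_t *base);
                void (*pagetable_setup_done)(pgd_t *base);
        };

        /* Default native implementation: an explicit, documented no-op. */
        static void native_pagetable_setup_start(pgd_t *base) { }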

4. Archaic Hardware

   There are cases in the Linux kernel where the triggering event for
   code removal has been that the last known specimen of the hardware
   in question has been confirmed destroyed.  Other cases have been
   that the code has not compiled for several kernel cycles.

   Those are the easy cases.

   Another reasonably easy case is where a known-to-be-very-rare piece
   of hardware is directly in the way of a major restructuring -- at
   that point it becomes a matter of just moving ahead and seeing if
   anyone is willing to do the work.

   However, there are much more subtle cases.  We still, at least in
   theory, support i386, although it seems that more often than not
   the i386-specific workarounds are broken in various ways.  There
   are people out there who still occasionally test i386 and send bug
   reports, but there is no evidence that anyone is actually *using*
   i386 -- quite the contrary, it appears that these testers happen
   to have a hardware specimen and happen to go "wonder what would
   happen."  Should we still bother developing workarounds for i386?

5. Don't Break Userspace

   Linus has, rightfully, said "don't ever break userspace".  The sole
   problem with that is that user space also includes malware,
   and malware authors are notorious for seeking out the least tested,
   most exotic nooks and crannies of the system and exploiting them.
   For example, there is support in the kernel for running i386 a.out
   binaries on top of an x86-64 kernel.  It isn't clear if this
   support has *ever* been used by anyone for real work, especially in
   this day and age of virtualization, but we definitely have had both
   breakage and security holes in that path.  Distributions uniformly
   turn it off.  Should we keep it, with the overhead of keeping it
   functional and secure, under the auspices of "don't break
   userspace", or should we turn it off and stop spending the effort
   maintaining an option that nearly all users will turn off anyway?

There are almost certainly many more areas like these.


I would also like to do a (U)EFI breakout session; it isn't clear to
me if this is a KS or Plumbers topic, or even if the distinction
matters in any way, but what is clear is that we will actually have
the bulk of the people working on this in one place in San Diego, and
so this is something we really should do.


	  -hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


