[fhs-discuss] user-specific directories in /run

Roger Leigh rleigh at codelibre.net
Tue May 24 12:34:23 PDT 2011


On Tue, May 24, 2011 at 07:29:26PM +0200, Lennart Poettering wrote:
> On Tue, 24.05.11 16:36, Roger Leigh (rleigh at codelibre.net) wrote:
> 
> > > > From the POV of the system being guaranteed to be able to create
> > > > files in /run e.g. when starting a service, or to be able to continue
> > > > to append to datafiles (samba), you do not want the user to be able
> > > > to fill up that filesystem.  It's especially important here because
> > > > there's no 5% "reserved" blocks like on ext, so any user could, either
> > > > accidentally or deliberately, break the entire system.  I really don't
> > > > want that, and this does need to be considered lest our systems
> > > > become vulnerable to simple local DoS.
> > > 
> > > This is a general problem by which /dev/shm, /tmp and /run/user
> > > suffer. It's on the todo list to fix this in the kernel by providing
> > > something quota-like (for example an rlimit) on tmpfs.
> > 
> > While having this deficiency fixed would be ideal, we will still need
> > to cope with older kernels even after it is fixed.
> 
> UH? Why? There's really no point in running old kernels on new
> distributions. The other way round might make more sense...

The issue has yet to be fixed in the kernel, so we need to deal with
the deficiency until it is solved, and then there will be a
transitional period during which the minimum supported kernel version
will still be an unfixed one.  We do need to bear this in mind--any
fix is still at some indefinite point in the future.

While this might not be important to you personally, having a clean
and well tested upgrade path is important.

> > > > - /run/user won't be cleaned by default either.
> > > 
> > > Please read the XDG_RUNTIME_DIR spec. It's a *MUST* that this directory
> > > is removed after a complete logout.
> > 
> > I have read it.  In the context of the FHS, I don't think the text of
> > the spec means much.  It's fine to say "must", but if a session is
> > killed unexpectedly it can still fail to clean up despite what the
> > spec states, and this needs to be considered.
> 
> No. In systemd (git) this cannot happen.

That's great, but the FHS does need to consider a rather wider scope
than just how well systemd can handle things.

> > > > However, consider that on a busy or long running system that you'll
> > > > end up with stray session directories under /run/user as well. 
> > > 
> > > That would be a weakness in your implementation. On a systemd system
> > > this doesn't happen.
> > 
> > Perhaps.  But in the context of the FHS, how a particular init system
> > may or may not handle cleanup is not that useful--the FHS needs to be
> > uniform across all init systems and platforms, and this isn't a
> > requirement that can be enforced nor guaranteed.
> 
> Oh, hell, it can be enforced, just write your software properly. You
> shouldn't build into the spec provisions for "I want to use broken
> software". If you really hate systemd so much, then reimplement that
> feature, but don't weaken the specification for that.

This isn't a matter of "writing software properly".  What about
situations where the process responsible for the cleanup receives
SIGKILL or otherwise ends prematurely?  In this sort of case, stray
session data will be left lying around, and that's completely out of
the programmer's hands.
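
To make the SIGKILL case concrete, here is a minimal sketch (not taken
from any init system or session manager).  A hypothetical child process
creates a session directory and registers an exit handler to remove it;
the parent then sends it SIGKILL.  The "session-" prefix and the helper
name are assumptions made up for this illustration, and it assumes a
POSIX system.

```python
import os
import subprocess
import sys
import textwrap

# Hypothetical child: creates a session directory and registers cleanup
# that only runs on a normal interpreter exit.
CHILD = textwrap.dedent("""
    import atexit, shutil, tempfile, time
    d = tempfile.mkdtemp(prefix="session-")
    atexit.register(shutil.rmtree, d, True)
    print(d, flush=True)
    time.sleep(60)
""")


def stray_dir_after_sigkill():
    """Start the child, SIGKILL it, and return the directory it created."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE, text=True)
    path = proc.stdout.readline().strip()
    proc.kill()          # SIGKILL on POSIX: no handler of any kind can run
    proc.wait()
    proc.stdout.close()
    return path


if __name__ == "__main__":
    left = stray_dir_after_sigkill()
    print(left, "still exists:", os.path.isdir(left))
    os.rmdir(left)       # tidy up after the demonstration
```

The exit handler never runs, so the directory stays behind until
something else (another process, a reboot, a tmpfs being unmounted)
removes it.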

This isn't something that the FHS can sensibly standardise, though
systemd can certainly implement stricter guarantees in addition.

> > > > At that point you'll be in the same situation as /tmp, and you'll
> > > > need to clean them *both*...  In consequence, I don't think that /run
> > > > is a better choice than /tmp.
> > > 
> > > I think PulseAudio is the only application that ever got it right
> > > placing a socket in /tmp. And the code for that is massive. I am not
> > > sure you understand the complexity of this. i.e. You need to create a
> > > random directory in /tmp, and then add a symlink in $HOME to that dir,
> > > so that you can access it under a well-defined name. The complexity now
> > > is added on top of that in that this symlink needs to include a machine
> > > id of some kind to not break NFS, and you need to ensure that the
> > > directory in /tmp is really yours in case /tmp was cleaned up since you
> > > created the dir in it. That is just crazy. In fact the entirety of /tmp
> > > is just a gigantic fuckup.
> > > 
> > > The advantages of /run/user/$USER is the guaranteed cleanup and private
> > > namespace, which makes it very easy to write safe code that doesn't
> > > pollute the file system over time.
> > 
> > I think the issues you highlight above could be summarised as follows:
> > "cleanup of /tmp causes massive problems for applications using /tmp".
> 
> No, you misunderstood this completely.
> 
> The problem is the shared namespace and the fact that things might end
> up lurking around forever.

The sticky bit is set on /tmp.  What's so hard about securely creating
a session directory and setting XDG_RUNTIME_DIR to point to that?  Once
created, it will remain there, and accessible only to that user.  So
long as automated cleanup of /tmp doesn't take out the directory
(which would be utterly broken), I don't see what the problem is here
unless there's part of the picture I'm missing.
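
As a rough sketch of what I mean, assuming nothing cleans /tmp behind
our back: create a randomly named directory with mode 0700 and point
XDG_RUNTIME_DIR at it.  The function names and the "session-" prefix
are made up for this illustration; tempfile.mkdtemp() is the standard
Python call and creates the directory accessible only to the creating
user.  This does not address the well-known-name issue raised earlier
in the thread.

```python
import os
import stat
import tempfile


def create_session_dir(base="/tmp", prefix="session-"):
    """Create a private, randomly named session directory under base."""
    # mkdtemp() creates the directory atomically with mode 0700.
    return tempfile.mkdtemp(prefix=prefix, dir=base)


def is_private_dir(path):
    """Check that path is a real directory, owned by us, with mode 0700."""
    st = os.lstat(path)  # lstat: do not follow a symlink planted by others
    return (stat.S_ISDIR(st.st_mode)
            and st.st_uid == os.getuid()
            and stat.S_IMODE(st.st_mode) == 0o700)


if __name__ == "__main__":
    session_dir = create_session_dir()
    os.environ["XDG_RUNTIME_DIR"] = session_dir
    print(session_dir, "private:", is_private_dir(session_dir))
    os.rmdir(session_dir)  # a real session would remove it at logout
```

The ownership check is only needed by code that finds the directory
again later by name; the process that created it already knows it is
its own.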

> > Why should applications need to have massively complex code to
> > workaround cleanup of /tmp, when simply /not/ cleaning /tmp is the
> > simple solution?
> 
> > Please note that the FHS makes no mention of cleaning /tmp after
> > startup; it's a site-specific issue left to the admin.  As I said
> > before: if you choose to enable automated cleanup of /tmp, you do so in
> > the full knowledge that it will cause breakage.  Expecting that
> > applications should explicitly cater for such is unrealistic.  While in
> > theory this could happen to any program at any point in time on a
> > multiuser, multitasking system as another process makes changes to the
> > filesystem, most programs would be expected to notice as system calls
> > fail and bail out appropriately as would be the case for file
> > operations in any other part of the filesystem hierarchy.  Complex and
> > fragile workarounds are not a good idea, especially given that cleanup
> > can introduce "interesting" security issues which are entirely
> > absent sans cleanup.
> 
> I think you are seeing problems where there aren't any problems, and not
> seeing the actual problems that exist.
> 
> Note that all systemd-based systems clean up /tmp by default. This is
> the only sane thing to do.

While we should probably agree to disagree about the relative merits of
cleaning /tmp or not, this is not at all applicable to the FHS: it's
local policy.

> > As I said above, we need to cater for past and current kernels
> > irrespective of newer kernels fixing things.  And we could well be
> > running on non-Linux kernels without tmpfs support as well.  While
> > the various BSDs chose not to participate in the FHS, Debian does
> > also run on the FreeBSD and Hurd kernels, and I am taking their
> > needs into account as well when considering what Debian wants from
> > the FHS.
> 
> Oh god. I really hope that legacy support for toy OSes nobody uses and
> legacy kernels won't pollute the FHS. What you are doing is not bringing
> Linux forward, but holding it back.

I doubt this.  Consider that by tightly specifying that things must be
done in a specific way, such as that /run *must* be a tmpfs mount, you
restrict the flexibility of implementations, and this may cause problems
down the line should it become desirable to do things differently.
There's nothing intrinsic to /run that *requires* this, it's just
useful for having it created in the initramfs before the rootfs is
mounted.  Not all systems use an initramfs or have sufficient memory
to store /run, and in these situations having it on a regular
filesystem is equally acceptable.
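
For software that does care, a sketch of checking what /run actually
is rather than assuming a tmpfs.  It parses text in the format of the
Linux-specific /proc/mounts (device, mount point, filesystem type, ...);
the function name is made up for this illustration, and other kernels
would need a different source for the mount table.

```python
def fstype_of(mount_point, mounts_text):
    """Return the filesystem type mounted at mount_point, or None.

    mounts_text is the content of a /proc/mounts style table.  None
    means the path is not a separate mount in that table.
    """
    fstype = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            fstype = fields[2]  # a later entry overrides an earlier one
    return fstype


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or /proc is not mounted
    kind = fstype_of("/run", text)
    print("/run:", kind or "not a separate mount (or no /proc/mounts)")
```

A result of None is a perfectly valid configuration under a "should"
wording: /run is then simply a directory on whatever filesystem
contains it.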

By considering the needs of other systems and by keeping the specs
relatively generalised (e.g. /run *should* be a tmpfs rather than *must*
be one), we avoid tightly limiting the options of implementors--any
writable filesystem is possible--and while in this particular case I
referred to alternative kernels, this is equally applicable when
considering Linux alone.

> > As mentioned in other mails: /run /should/ be on a tmpfs, but this
> > is *not* guaranteed.  It could be just a regular filesystem.  In
> > consequence, we can not provide the guarantees you desire, even
> > on plain Linux kernels--consider embedded systems with low memory
> > and no swap where a tmpfs is too expensive, for example.
> 
> On systemd /run is guaranteed to be tmpfs. I think the tmpfs being too
> expensive claim is a myth you are simply using to retroactively justify
> Debian's non-sensical configurability choices. Such a problem does not
> really exist.

What systemd does or does not do is irrelevant to the FHS; it's just
one init system out of many.  The fact that systemd always mounts it
as a tmpfs does not mean that this is what the FHS should mandate.
In Debian, the sysvinit initscripts also mount /run as a tmpfs.
But that's just an implementation detail--I'm simply suggesting that
this should be a *recommendation*, not a *requirement*.  In other
situations it may make more sense not to use a tmpfs.  I'm not claiming
I know what all those situations might be--low memory embedded systems
are just one possibility--but I don't want to hamstring what
implementors can do now and in the future with an unnecessarily
restrictive requirement just because that's what's done *today*.

> > > If services are per-session then they should simply include a session
> > > identifier in the files/sockets they create in /run/user.
> > 
> > The FHS shouldn't really be standardising usage based upon current
> > limitations in the GNOME desktop session and login handling.  And the
> > same applies to ridiculous limitations in applications like firefox
> > which refuse to permit multiple instances.  Other desktops and
> > applications do not have these curious limitations.  You might well
> > think that it's "pointless" to fix such limitations.  However, the FHS
> > has a wider scope than this; a "session" might not even necessarily
> > relate to an "X session".  If set up by PAM, it might be tied to any
> > other PAM service, including console and SSH logins.  The various
> > "-agent" processes are usually tied to a single login session, and you
> > might well want per-session instances (the system I'm writing this
> > mail on has ssh-agents tied to separate screen sessions).
> > 
> > While the XDG might mandate a certain usage for /run/user for
> > XDG-compliant applications, the location as standardised in the FHS may
> > well have users which are FHS compliant yet not XDG compliant.  What
> > was already concluded as being acceptable to the XDG may not be
> > acceptable to the FHS--please do bear in mind that it has a rather
> > wider scope.
> 
> Oh my, I see. XDG = bad, FHS = good. 

I'm not saying anything of the sort.  I'm simply saying that the FHS
has a wider scope than the XDG and what is standardised by the FHS
need not be as restrictive as the XDG spec, because programs using
the FHS are not necessarily XDG compliant.  I'm simply saying that
they are separate standards, and what has been standardised by XDG
may not make sense in the context of inclusion in the FHS.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux             http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?       http://gutenprint.sourceforge.net/
   `-    GPG Public Key: 0x25BFB848   Please GPG sign your mail.