[PATCH 09/10] Enable multiple instances of devpts

sukadev at linux.vnet.ibm.com
Mon Sep 29 08:18:28 PDT 2008


Serge E. Hallyn [serue at us.ibm.com] wrote:
| Quoting sukadev at linux.vnet.ibm.com (sukadev at linux.vnet.ibm.com):
| > | > @@ -232,6 +246,8 @@ static int devpts_show_options(struct seq_file *seq, struct vfsmount *vfs)
| > | >  	seq_printf(seq, ",mode=%03o", opts->mode);
| > | >  #ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES
| > | >  	seq_printf(seq, ",ptmxmode=%03o", opts->ptmxmode);
| > | > +	if (opts->newinstance)
| > | > +		seq_printf(seq, ",newinstance");
| > | 
| > | Is that actually something we want to show?  It doesn't seem
| > | informative.
| > 
| > Without this, users have no easy way of knowing whether they have a
| > private mount, especially if they mounted from the command line.

You mean in a nested container?  I agree that it does not help in that case.

| 
| If they were in a container to begin with, then they still don't know.
| 
| Now if you were to keep a unique per-instance id and have show_options
| list 'instance=%x', that would be helpful.  Either that or just
| dropping the info altogether makes sense.  This 'newinstance' listing
| is meaningless.
| 

Another way to look at it: it is a mount option that was specified, and
we just report it back. It may not always be useful, but it might help in
some cases. But I am fine either way.
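
If we do end up keeping something here, your 'instance=%x' idea could be
a small change. A rough sketch, assuming we stash a (hypothetical)
per-sb 'instance' id in pts_fs_info when the superblock is created:

	struct pts_fs_info *fsi = DEVPTS_SB(vfs->mnt_sb);

	/* sketch only: 'instance' would be assigned at sb creation */
	seq_printf(seq, ",instance=%x", fsi->instance);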

<snip>

| > | > +
| > | > +		err = mknod_ptmx(mnt->mnt_sb);
| > | > +		if (err) {
| > | > +			dput(mnt->mnt_sb->s_root);
| > | > +			deactivate_super(mnt->mnt_sb);
| > | > +		} else
| > | > +			devpts_mnt = mnt;
| > | > +
| > | > +		return err;
| > | 
| > | There is no locking here, so in early-userspace two competing processes
| > | could both try to set devpts_mnt, right?
| > 
| > Hmm. I was thinking there would be only one thread calling
| > vfs_kern_mount() in init_devpts_fs().
| 
| But what if init happens to (perhaps mistakenly) lead to two racing ones?
| 
| Sure it's just a small memory leak, but why not just prevent it.

Ok.
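
Will serialize the initial-mount path with a mutex; a rough, untested
sketch:

	static DEFINE_MUTEX(devpts_mnt_mutex);

	mutex_lock(&devpts_mnt_mutex);
	if (!devpts_mnt) {
		/* vfs_kern_mount(), mknod_ptmx(), set devpts_mnt as before */
	}
	mutex_unlock(&devpts_mnt_mutex);

so a second racing mount in early-userspace just reuses the sb instead
of leaking one.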

| 
| > | 
| > | > +	}
| > | > +
| > | > +	return get_sb_ref(devpts_mnt->mnt_sb, flags, data, mnt);
| > | > +}
| > | > +
| > | >  static int devpts_get_sb(struct file_system_type *fs_type,
| > | >  	int flags, const char *dev_name, void *data, struct vfsmount *mnt)
| > | >  {
| > | > +	int new;
| > | > +
| > | > +	new = is_new_instance_mount(data);
| > | > +	if (new < 0)
| > | > +		return new;
| > | > +
| > | > +	if (new)
| > | > +		return new_pts_mount(fs_type, flags, data, mnt);
| > | > +
| > | > +	return init_pts_mount(fs_type, flags, data, mnt);
| > | 
| > | Wait a sec - so if a container does
| > | 
| > | 	mount -t devpts -o newinstance none /dev/pts
| > | 	and then later on just does
| > | 	mount -t devpts none /dev/pts
| > | 
| > | it'll get the init_pts_ns, not the one it had created?
| > 
| > Yes.  Should we treat the latter as a remount of the private instance?
| > If so, the user could add '-o remount'?
| > 
| > The logic seems simple: with newinstance, create a private namespace;
| > without newinstance, bind to the initial ns.
| 
| But if I'm in a container in a new mount ns and somehow managed
| to umount -l /dev/pts, shouldn't I be able to remount my container's
| devpts by just doing 'mount -t devpts devpts /dev/pts'?

Wouldn't that require us to associate the devpts mount with some notion
of a container (a namespace object in the nsproxy of the container-init,
like we do with /proc)?

Yes, after 'umount -l' we have lost _that_ devpts instance, and we may
have to 'redo' the relevant container-init steps
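(i.e., inside the container, something like running
'mount -t devpts -o newinstance devpts /dev/pts' again, and then
re-bind-mounting /dev/pts/ptmx over /dev/ptmx).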

