[RFC][v7][PATCH 0/9] Implement clone2() system call

Oren Laadan orenl at librato.com
Thu Sep 24 10:44:05 PDT 2009



Sukadev Bhattiprolu wrote:
> === NEW CLONE() SYSTEM CALL:
> 
> To support application checkpoint/restart, a task must have the same pid it
> had when it was checkpointed.  When containers are nested, the tasks within
> the containers exist in multiple pid namespaces and hence have multiple pids
> to specify during restart.
> 
> This patchset implements a new system call, clone2(), that lets a process
> specify the pids of the child process.
> 
> Patches 1 through 6 are helper patches, needed for choosing a pid for the
> child process.
> 
> Patch 8 defines a prototype of the new system call. Patch 9 adds some
> documentation on the new system call, some/all of which will eventually
> go into a man page.
> 

[...]

> 
> Based on these requirements and constraints, we explored a couple of system
> call interfaces (in earlier versions of this patchset) and currently define
> the system call as:
> 
> 	struct clone_struct {
> 		u64 flags;
> 		u64 child_stack;
> 		u32 nr_pids;
> 		u32 parent_tid;
> 		u32 child_tid;

@parent_tid and @child_tid are pointers to userspace memory, so they
need to be 'u64' to hold a pointer on 64-bit architectures (and it won't
hurt to make @reserved1 a 'u64' as well).

> 		u32 reserved1;
> 		u64 reserved2;
> 	};
> 

Also, for forward/backward compatibility, explicitly state in the
documentation, and enforce in the kernel, that flags which are not
defined must not be set, and that reserved{1,2} must remain 0.
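
To make both comments concrete, here is a rough userspace sketch of the
widened layout and the check I have in mind. It is not taken from the
patchset: the explicit 'pad' field, the CLONE2_KNOWN_FLAGS mask and the
clone_struct_check() helper are all hypothetical names, and uint64_t /
uint32_t stand in for the kernel's u64 / u32.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Placeholder mask of defined flag bits; an assumption for illustration. */
#define CLONE2_KNOWN_FLAGS 0x00000000ffffffffULL

/* Sketch of the suggested layout: pointer-carrying fields are 64 bits. */
struct clone_struct {
	uint64_t flags;
	uint64_t child_stack;
	uint32_t nr_pids;
	uint32_t pad;		/* hypothetical: makes the padding explicit */
	uint64_t parent_tid;	/* userspace pointer, hence 64 bits */
	uint64_t child_tid;	/* userspace pointer, hence 64 bits */
	uint64_t reserved1;
	uint64_t reserved2;
};

/* Same size on 32-bit and 64-bit ABIs, with no implicit holes. */
static_assert(sizeof(struct clone_struct) == 56, "unexpected layout");

/* Reject undefined flag bits and non-zero reserved fields. */
static int clone_struct_check(const struct clone_struct *cs)
{
	if (cs->flags & ~CLONE2_KNOWN_FLAGS)
		return -EINVAL;
	if (cs->pad || cs->reserved1 || cs->reserved2)
		return -EINVAL;
	return 0;
}
```

Failing with -EINVAL now is what leaves room to give those bits and
fields a meaning later without breaking existing callers.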

> 	sys_clone2(struct clone_struct __user *cs, pid_t __user *pids)
> 
> Signed-off-by: Sukadev Bhattiprolu <sukadev at linux.vnet.ibm.com>

Otherwise, looks great.

Oren.


