[PATCH] c/r: fix race of prepare_descendant() with an ongoing fork()

Oren Laadan orenl at librato.com
Tue Sep 22 16:43:03 PDT 2009



Serge E. Hallyn wrote:
> Quoting Serge E. Hallyn (serue at us.ibm.com):
>> Quoting Oren Laadan (orenl at librato.com):
>>> From: Oren Laadan <orenl at librato.edu>
>>>
>>> If prepare_descendants() is walking a tree and one of the tasks is
>>> forking, one of two bads can happen. If the child doesn't inherit the
>>> ->ctx, it breaks the assumption that the entire subtree is prepared.
>>> If the child inherits the ->ctx, it will have one without having taken
>>> a reference.
>>>
>>> This patch closed this race by explicitly getting and referencing the
>>> ->ctx for a child process should the parent have one, atomically under
>>> the tasklist_lock.
>>>
>>> Signed-off-by: Oren Laadan <orenl at cs.columbia.edu>
>>> ---
>>>  kernel/fork.c |   11 ++++++++---
>>>  1 files changed, 8 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/kernel/fork.c b/kernel/fork.c
>>> index 9f13d7b..57118e4 100644
>>> --- a/kernel/fork.c
>>> +++ b/kernel/fork.c
>>> @@ -62,6 +62,7 @@
>>>  #include <linux/fs_struct.h>
>>>  #include <linux/magic.h>
>>>  #include <linux/perf_counter.h>
>>> +#include <linux/checkpoint.h>
>>>
>>>  #include <asm/pgtable.h>
>>>  #include <asm/pgalloc.h>
>>> @@ -1148,9 +1149,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>>>  	INIT_LIST_HEAD(&p->pi_state_list);
>>>  	p->pi_state_cache = NULL;
>>>  #endif
>>> -#ifdef CONFIG_CHECKPOINT
>>> -	p->checkpoint_ctx = NULL;
>>> -#endif
>>>  	/*
>>>  	 * sigaltstack should be cleared when sharing the same VM
>>>  	 */
>>> @@ -1188,6 +1186,13 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>>>  	/* Need tasklist lock for parent etc handling! */
>>>  	write_lock_irq(&tasklist_lock);
>>>
>>> +#ifdef CONFIG_CHECKPOINT
>>> +	/* If parent is restarting, child should be too */
>>> +	if (unlikely(current->checkpoint_ctx)) {
>>> +		p->checkpoint_ctx = current->checkpoint_ctx;
>> Won't break anything, but technically p->checkpoint_ctx will
>> already be copied from current->checkpoint_ctx, so only the
>> ckpt_ctx_get() is necessary, so this could really read
>>
>> 	if (p->checkpoint_ctx)
>> 		ckpt_ctx_get(p->checkpoint_ctx);
>>
>> Right?
> 
> BTW since all I'm doing is nit-picking, I obviously agree with
> the patch on the whole :)
> 
> There is no way for the task to be forked into a traced state
> (without the parent being traced) right?  And, is the fact that

Not that I'm aware of.

> ctx->nr_total may end up less than the total number of active
> tasks a problem at all?

That's ok. A restart will fail if there are not enough tasks,
and will not-succeed (hang until interrupted) if there are
"too many" tasks.

The above is a protection against malicious use of sys_restart(),
not a fix to a correct use through (bug-less!) restart.c  :p

Oren.

> 
> thanks,
> -serge


More information about the Containers mailing list