sched: Fix KVM and ia64 boot crash due to sched_groups circular linked list assumption
Igor Mammedov [Wed, 9 May 2012 10:38:28 +0000 (12:38 +0200)]
If we have one cpu that failed to boot and boot cpu gave up on
waiting for it and then another cpu is being booted, kernel
might crash with following OOPS:

   BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
   IP: [<ffffffff812c3630>] __bitmap_weight+0x30/0x80
   Call Trace:
       [<ffffffff8108b9b6>] build_sched_domains+0x7b6/0xa50

The crash happens in init_sched_groups_power() that expects
sched_groups to be circular linked list. However it is not
always true, since sched_groups preallocated in __sdt_alloc are
initialized in build_sched_groups and it may exit early

        if (cpu != cpumask_first(sched_domain_span(sd)))
                return 0;

without initializing sd->groups->next field.

Fix bug by initializing next field right after sched_group was
allocated.

Also-Reported-by: Jiang Liu <liuj97@gmail.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Cc: a.p.zijlstra@chello.nl
Cc: pjt@google.com
Cc: seto.hidetoshi@jp.fujitsu.com
Link: http://lkml.kernel.org/r/1336559908-32533-1-git-send-email-imammedo@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

kernel/sched/core.c

index 0533a68..e5212ae 100644 (file)
@@ -6382,6 +6382,8 @@ static int __sdt_alloc(const struct cpumask *cpu_map)
                        if (!sg)
                                return -ENOMEM;
 
+                       sg->next = sg;
+
                        *per_cpu_ptr(sdd->sg, j) = sg;
 
                        sgp = kzalloc_node(sizeof(struct sched_group_power),