sched: Move blk_schedule_flush_plug() out of __schedule()
Thomas Gleixner [Wed, 22 Jun 2011 17:47:01 +0000 (19:47 +0200)]
There is no real reason to run blk_schedule_flush_plug() with
interrupts and preemption disabled.

Move it into schedule() and call it when the task is voluntarily going
to sleep. There might be false positives when the task is woken
between that call and the actual scheduling, but that is not really
different from being woken immediately after switching away.

This fixes a deadlock in the scheduler where the
blk_schedule_flush_plug() callchain enables interrupts and thereby
allows a wakeup of the task that is going to sleep.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@kernel.org # 2.6.39+
Link: http://lkml.kernel.org/n/tip-dwfxtra7yg1b5r65m32ywtct@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>

kernel/sched.c

index ec15e81..511732c 100644
@@ -4322,16 +4322,6 @@ need_resched:
                                if (to_wakeup)
                                        try_to_wake_up_local(to_wakeup);
                        }
-
-                       /*
-                        * If we are going to sleep and we have plugged IO
-                        * queued, make sure to submit it to avoid deadlocks.
-                        */
-                       if (blk_needs_flush_plug(prev)) {
-                               raw_spin_unlock(&rq->lock);
-                               blk_schedule_flush_plug(prev);
-                               raw_spin_lock(&rq->lock);
-                       }
                }
                switch_count = &prev->nvcsw;
        }
@@ -4370,8 +4360,23 @@ need_resched:
                goto need_resched;
 }
 
+static inline void sched_submit_work(struct task_struct *tsk)
+{
+       if (!tsk->state)
+               return;
+       /*
+        * If we are going to sleep and we have plugged IO queued,
+        * make sure to submit it to avoid deadlocks.
+        */
+       if (blk_needs_flush_plug(tsk))
+               blk_schedule_flush_plug(tsk);
+}
+
 asmlinkage void schedule(void)
 {
+       struct task_struct *tsk = current;
+
+       sched_submit_work(tsk);
        __schedule();
 }
 EXPORT_SYMBOL(schedule);