aio: wake up waiters when freeing unused kiocbs
Jeff Moyer [Mon, 5 Mar 2012 22:59:12 +0000 (14:59 -0800)]
Bart Van Assche reported a hung fio process when either hot-removing
storage or when interrupting the fio process itself.  The (pruned) call
trace for the latter looks like so:

  fio             D 0000000000000001     0  6849   6848 0x00000004
   ffff880092541b88 0000000000000046 ffff880000000000 ffff88012fa11dc0
   ffff88012404be70 ffff880092541fd8 ffff880092541fd8 ffff880092541fd8
   ffff880128b894d0 ffff88012404be70 ffff880092541b88 000000018106f24d
  Call Trace:
    schedule+0x3f/0x60
    io_schedule+0x8f/0xd0
    wait_for_all_aios+0xc0/0x100
    exit_aio+0x55/0xc0
    mmput+0x2d/0x110
    exit_mm+0x10d/0x130
    do_exit+0x671/0x860
    do_group_exit+0x44/0xb0
    get_signal_to_deliver+0x218/0x5a0
    do_signal+0x65/0x700
    do_notify_resume+0x65/0x80
    int_signal+0x12/0x17

The problem lies with the allocation batching code.  It will
opportunistically allocate kiocbs, and then trim back the list of iocbs
when there is not enough room in the completion ring to hold all of the
events.

In the case above, what happens is that the pruning back of events ends
up freeing up the last active request and the context is marked as dead,
so it is thus responsible for waking up waiters.  Unfortunately, the
code does not check for this condition, so we end up with a hung task.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Reported-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Bart Van Assche <bvanassche@acm.org>
Cc: <stable@kernel.org> [3.2.x only]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/aio.c

index 969beb0..67e4b90 100644 (file)
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -490,6 +490,8 @@ static void kiocb_batch_free(struct kioctx *ctx, struct kiocb_batch *batch)
                kmem_cache_free(kiocb_cachep, req);
                ctx->reqs_active--;
        }
+       if (unlikely(!ctx->reqs_active && ctx->dead))
+               wake_up_all(&ctx->wait);
        spin_unlock_irq(&ctx->ctx_lock);
 }