[PATCH] fix Data Acess error in dup_fd
Sharyathi Nagesh [Fri, 10 Nov 2006 20:27:54 +0000 (12:27 -0800)]
On running the Stress Test on machine for more than 72 hours following
error message was observed.

0:mon> e
cpu 0x0: Vector: 300 (Data Access) at [c00000007ce2f7f0]
    pc: c000000000060d90: .dup_fd+0x240/0x39c
    lr: c000000000060d6c: .dup_fd+0x21c/0x39c
    sp: c00000007ce2fa70
   msr: 800000000000b032
   dar: ffffffff00000028
 dsisr: 40000000
  current = 0xc000000074950980
  paca    = 0xc000000000454500
    pid   = 27330, comm = bash

0:mon> t
[c00000007ce2fa70] c000000000060d28 .dup_fd+0x1d8/0x39c (unreliable)
[c00000007ce2fb30] c000000000060f48 .copy_files+0x5c/0x88
[c00000007ce2fbd0] c000000000061f5c .copy_process+0x574/0x1520
[c00000007ce2fcd0] c000000000062f88 .do_fork+0x80/0x1c4
[c00000007ce2fdc0] c000000000011790 .sys_clone+0x5c/0x74
[c00000007ce2fe30] c000000000008950 .ppc_clone+0x8/0xc

The problem is because of race window.  When if(expand) block is executed in
dup_fd unlocking of oldf->file_lock give a window for fdtable in oldf to be
modified.  So actual open_files in oldf may not match with open_files
variable.

Cc: Vadim Lobanov <vlobanov@speakeasy.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

kernel/fork.c

index 3da978e..4b4eab2 100644 (file)
@@ -687,6 +687,7 @@ static struct files_struct *dup_fd(struct files_struct *oldf, int *errorp)
                 * the latest pointer.
                 */
                spin_lock(&oldf->file_lock);
+               open_files = count_open_files(old_fdt);
                old_fdt = files_fdtable(oldf);
        }