fs: make sure data stored into inode is properly seen before unlocking new inode
Jan Kara [Tue, 22 Sep 2009 00:01:06 +0000 (17:01 -0700)]
In theory it could happen that on one CPU we initialize a new inode but
clearing of I_NEW | I_LOCK gets reordered before some of the
initialization.  Thus on another CPU we return not fully uptodate inode
from iget_locked().

This seems to fix a corruption issue on ext3 mounted over NFS.

[akpm@linux-foundation.org: add some commentary]
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/inode.c

index b2ba83d..798052f 100644 (file)
@@ -695,13 +695,15 @@ void unlock_new_inode(struct inode *inode)
        }
 #endif
        /*
-        * This is special!  We do not need the spinlock
-        * when clearing I_LOCK, because we're guaranteed
-        * that nobody else tries to do anything about the
-        * state of the inode when it is locked, as we
-        * just created it (so there can be no old holders
-        * that haven't tested I_LOCK).
+        * This is special!  We do not need the spinlock when clearing I_LOCK,
+        * because we're guaranteed that nobody else tries to do anything about
+        * the state of the inode when it is locked, as we just created it (so
+        * there can be no old holders that haven't tested I_LOCK).
+        * However we must emit the memory barrier so that other CPUs reliably
+        * see the clearing of I_LOCK after the other inode initialisation has
+        * completed.
         */
+       smp_mb();
        WARN_ON((inode->i_state & (I_LOCK|I_NEW)) != (I_LOCK|I_NEW));
        inode->i_state &= ~(I_LOCK|I_NEW);
        wake_up_inode(inode);