radix_tree_tag_get() is not as safe as the docs make out [ver #2]
author David Howells <dhowells@redhat.com>
Tue, 6 Apr 2010 21:36:20 +0000 (22:36 +0100)
committer Linus Torvalds <torvalds@linux-foundation.org>
Fri, 9 Apr 2010 17:12:03 +0000 (10:12 -0700)
commit ce82653d6cfcc95ba88c25908664878459fb1b8d
tree ab80dd0055bcb4b9296c28c241f1d1fba229be1f
parent d3e06e2b15590b70ea73733fc4612e4741ff46e0

radix_tree_tag_get() is not safe to use concurrently with radix_tree_tag_set()
or radix_tree_tag_clear().  The problem is that the double tag_get() in
radix_tree_tag_get():

	if (!tag_get(node, tag, offset))
		saw_unset_tag = 1;
	if (height == 1) {
		int ret = tag_get(node, tag, offset);

may see the value change between the two reads due to a concurrent set/clear.
RCU offers no protection against this: no pointers are changed and no nodes
are replaced according to a COW protocol; set/clear alter the node in place.

The documentation in linux/radix-tree.h, however, says that
radix_tree_tag_get() is an exception to the rule that "any function modifying
the tree or tags (...) must exclude other modifications, and exclude any
functions reading the tree".

This matters because the next statement in radix_tree_tag_get() asserts that
the tag has not changed between the two reads:

	BUG_ON(ret && saw_unset_tag);

This has been seen happening in FS-Cache:

https://www.redhat.com/archives/linux-cachefs/2010-April/msg00013.html
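
The race described above can be reproduced deterministically in a small
userspace model.  This is only a sketch, not kernel code: struct model_node,
the model_*() helpers and the racer hook are hypothetical names made up for
illustration, and the hook merely stands in for a concurrent
radix_tree_tag_set() landing between the two reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the tag bitmap of one radix tree node. */
struct model_node {
	unsigned long tags;
};

/* Hypothetical hook that plays the part of a concurrent writer. */
typedef void (*racer_fn)(struct model_node *node, unsigned int offset);

static int model_tag_get(const struct model_node *node, unsigned int offset)
{
	assert(offset < sizeof(node->tags) * 8);
	return (node->tags >> offset) & 1UL;
}

static void model_tag_set(struct model_node *node, unsigned int offset)
{
	assert(offset < sizeof(node->tags) * 8);
	node->tags |= 1UL << offset;	/* alters the node in place */
}

/*
 * Same shape as the double read quoted above.  Returns true when the
 * condition tested by BUG_ON(ret && saw_unset_tag) would hold.
 */
static bool model_double_read_would_bug(struct model_node *node,
					unsigned int offset, racer_fn racer)
{
	int saw_unset_tag = 0;
	int ret;

	if (!model_tag_get(node, offset))
		saw_unset_tag = 1;

	if (racer)
		racer(node, offset);	/* writer slips in between the reads */

	ret = model_tag_get(node, offset);
	return ret && saw_unset_tag;
}
```

Calling model_double_read_would_bug(&node, 3, model_tag_set) on a node whose
tags are all clear returns true, whereas passing NULL for the racer returns
false: the assertion can only fire when the tag is modified mid-lookup.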

To fix this, remove the BUG_ON() from radix_tree_tag_get() and note in various
comments that the value of the tag may change whilst the RCU read lock is held,
and thus that the return value of radix_tree_tag_get() may not be relied upon
unless radix_tree_tag_set/clear() and radix_tree_delete() are excluded from
running concurrently with it.
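
For callers that do need a reliable answer, one way to provide that exclusion
is to have readers and writers of the tag take the same lock.  The following
is a minimal userspace sketch under that assumption; struct locked_node and
the locked_tag_*() helpers are hypothetical, and a pthread mutex stands in for
whatever lock the caller already uses.  The commit itself does not add any
locking.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical node whose tag bitmap is guarded by a caller-owned lock. */
struct locked_node {
	pthread_mutex_t lock;
	unsigned long tags;
};

static void locked_tag_set(struct locked_node *n, unsigned int offset)
{
	assert(offset < sizeof(n->tags) * 8);
	pthread_mutex_lock(&n->lock);
	n->tags |= 1UL << offset;
	pthread_mutex_unlock(&n->lock);
}

static void locked_tag_clear(struct locked_node *n, unsigned int offset)
{
	assert(offset < sizeof(n->tags) * 8);
	pthread_mutex_lock(&n->lock);
	n->tags &= ~(1UL << offset);
	pthread_mutex_unlock(&n->lock);
}

/* Set/clear cannot run while the lock is held, so the read is stable. */
static int locked_tag_get(struct locked_node *n, unsigned int offset)
{
	int ret;

	assert(offset < sizeof(n->tags) * 8);
	pthread_mutex_lock(&n->lock);
	ret = (n->tags >> offset) & 1UL;
	pthread_mutex_unlock(&n->lock);
	return ret;
}
```

The value returned is only guaranteed current while the lock is held; a caller
that acts on it after unlocking is again racing with set/clear.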

Reported-by: Romain DEGEZ <romain.degez@smartjog.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
include/linux/radix-tree.h
lib/radix-tree.c