Btrfs: Mixed back reference (FORWARD ROLLING FORMAT CHANGE)
Yan Zheng [Wed, 10 Jun 2009 14:45:14 +0000 (10:45 -0400)]
This commit introduces a new kind of back reference for btrfs metadata.
Once a filesystem has been mounted with this commit, IT WILL NO LONGER
BE MOUNTABLE BY OLDER KERNELS.

When a tree block in a subvolume tree is cow'd, the reference counts of all
extents it points to are increased by one.  At transaction commit time,
the old root of the subvolume is recorded in a "dead root" data structure,
and the btree it points to is later walked, dropping reference counts
and freeing any blocks where the reference count goes to 0.

The increments done during cow and decrements done after commit cancel out,
and the walk is a very expensive way to go about freeing the blocks that
are no longer referenced by the new btree root.  This commit reduces the
transaction overhead by avoiding the need for dead root records.

When a non-shared tree block is cow'd, we free the old block at once, and the
new block inherits the old block's references. When a tree block with reference
count > 1 is cow'd, we increase the reference counts of all extents
the new block points to by one, and decrease the old block's reference count by
one.
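
The two cases above can be modelled in a few lines of C. This is only a toy
sketch: toy_extent, toy_block and toy_cow_block are hypothetical names made up
for illustration, not btrfs structures, and the model ignores back refs,
locking and delayed refs. It only shows which reference counts change.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PTRS 4

/* Hypothetical stand-in for an extent with a reference count. */
struct toy_extent {
	int refs;
};

/* Hypothetical stand-in for a tree block and the extents it points to. */
struct toy_block {
	struct toy_extent self;                 /* refcount of the block itself */
	struct toy_extent *ptrs[TOY_MAX_PTRS];  /* extents this block points to */
	size_t nr_ptrs;
};

/*
 * Copy 'old' into 'cow' following the rules in the commit message.
 * Returns 1 if the old block was freed (non-shared case), 0 if it is
 * still referenced by someone else (shared case).
 */
int toy_cow_block(struct toy_block *old, struct toy_block *cow)
{
	size_t i;

	assert(old->self.refs >= 1);

	cow->self.refs = 1;
	cow->nr_ptrs = old->nr_ptrs;
	for (i = 0; i < old->nr_ptrs; i++)
		cow->ptrs[i] = old->ptrs[i];

	if (old->self.refs == 1) {
		/*
		 * Non-shared: free the old block at once. The new block
		 * inherits its references, so child counts are untouched.
		 */
		old->self.refs = 0;
		return 1;
	}

	/* Shared: every child gains one reference from the new block. */
	for (i = 0; i < cow->nr_ptrs; i++)
		cow->ptrs[i]->refs++;
	old->self.refs--;
	return 0;
}
```

In the non-shared case no child reference count is touched, which is the
saving this commit is after.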

This dead tree avoidance code removes the need to modify the reference
counts of lower level extents when a non-shared tree block is cow'd.
But we still need to update the back refs for all pointers in the block,
because the location of the block is recorded in each back ref item.

We can solve this by introducing a new type of back ref. The new
back ref records the pointer's key, its level, and the tree in which
the pointer lives. This information allows us to find the pointer
by searching the tree. The shortcoming of the new back ref is that it
only works for pointers in tree blocks referenced by their owner trees.

This is mostly a problem for snapshots, where resolving one of these
fuzzy back references would be O(number_of_snapshots) and quite slow.
The solution used here is to use the fuzzy back references in the common
case where a given tree block is only referenced by one root,
and use the full back references when multiple roots have a reference
on a given block.
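
As a rough illustration of that choice, the sketch below picks a back ref
kind from the number of roots that reference a block. All names here
(toy_backref, toy_make_backref and the enum values) are hypothetical and do
not correspond to the on-disk item types added by this commit. The real
decision is made in update_ref_for_cow() and depends on more state than a
root count.

```c
#include <assert.h>

enum toy_backref_kind {
	TOY_BACKREF_FUZZY,	/* names the owning tree; pointer is found by search */
	TOY_BACKREF_FULL	/* records the location of the referencing block */
};

/* Hypothetical in-memory form of a back ref, for illustration only. */
struct toy_backref {
	enum toy_backref_kind kind;
	unsigned long long root_objectid;	/* used by fuzzy back refs */
	unsigned long long parent_bytenr;	/* used by full back refs */
};

/*
 * Use a fuzzy back ref in the common case of a single referencing root,
 * and fall back to a full back ref once several roots share the block.
 */
struct toy_backref toy_make_backref(unsigned int nr_roots,
				    unsigned long long owner_root,
				    unsigned long long parent_bytenr)
{
	struct toy_backref ref = { TOY_BACKREF_FUZZY, 0, 0 };

	assert(nr_roots >= 1);

	if (nr_roots == 1) {
		ref.kind = TOY_BACKREF_FUZZY;
		ref.root_objectid = owner_root;
	} else {
		ref.kind = TOY_BACKREF_FULL;
		ref.parent_bytenr = parent_bytenr;
	}
	return ref;
}
```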

This commit adds a per-subvolume red-black tree to keep track of cached
inodes. The red-black tree helps the balancing code find cached
inodes whose inode numbers fall within a given range.
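
A range lookup of this kind can be sketched as follows. The example uses a
plain binary search tree as a stand-in for the kernel's red-black tree, since
the search itself is the same. toy_inode_node and toy_find_first_in_range are
hypothetical names, not functions from this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tree node keyed by inode number. */
struct toy_inode_node {
	unsigned long long ino;
	struct toy_inode_node *left;
	struct toy_inode_node *right;
};

/*
 * Return the node with the smallest inode number in [min_ino, max_ino],
 * or NULL if no cached inode falls inside the range.
 */
struct toy_inode_node *toy_find_first_in_range(struct toy_inode_node *root,
					       unsigned long long min_ino,
					       unsigned long long max_ino)
{
	struct toy_inode_node *best = NULL;

	assert(min_ino <= max_ino);

	while (root) {
		if (root->ino < min_ino) {
			/* Too small: anything useful is to the right. */
			root = root->right;
		} else {
			/* Candidate: remember it, then look for a smaller one. */
			best = root;
			root = root->left;
		}
	}

	if (best && best->ino > max_ino)
		return NULL;
	return best;
}
```

A caller could walk the whole range by repeating the lookup with min_ino set
to one past the inode number just returned.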

This commit improves the balancing code by introducing several data
structures to keep the state of balancing. The most important one
is the back ref cache. It caches how the upper level tree blocks are
referenced. This greatly reduces the overhead of checking back refs.

The improved balancing code scales significantly better with a large
number of snapshots.

This is a very large commit and was written in a number of
pieces.  The pieces depend heavily on the disk format change, so they were
squashed together to make sure git bisect didn't end up in a
bad state wrt space balancing or the format change.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

20 files changed:
fs/btrfs/Makefile
fs/btrfs/btrfs_inode.h
fs/btrfs/ctree.c
fs/btrfs/ctree.h
fs/btrfs/delayed-ref.c
fs/btrfs/delayed-ref.h
fs/btrfs/disk-io.c
fs/btrfs/export.c
fs/btrfs/extent-tree.c
fs/btrfs/file.c
fs/btrfs/inode.c
fs/btrfs/ioctl.c
fs/btrfs/print-tree.c
fs/btrfs/relocation.c [new file with mode: 0644]
fs/btrfs/root-tree.c
fs/btrfs/super.c
fs/btrfs/transaction.c
fs/btrfs/transaction.h
fs/btrfs/tree-log.c
fs/btrfs/volumes.c

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 9421284..a35eb36 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -6,5 +6,5 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
           transaction.o inode.o file.o tree-defrag.o \
           extent_map.o sysfs.o struct-funcs.o xattr.o ordered-data.o \
           extent_io.o volumes.o async-thread.o ioctl.o locking.o orphan.o \
-          ref-cache.o export.o tree-log.o acl.o free-space-cache.o zlib.o \
-          compression.o delayed-ref.o
+          export.o tree-log.o acl.o free-space-cache.o zlib.o \
+          compression.o delayed-ref.o relocation.o
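diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index b30986f..ecf5f7d 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h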
@@ -72,6 +72,9 @@ struct btrfs_inode {
         */
        struct list_head ordered_operations;
 
+       /* node for the red-black tree that links inodes in subvolume root */
+       struct rb_node rb_node;
+
        /* the space_info for where this inode's data allocations are done */
        struct btrfs_space_info *space_info;
 
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index fedf8b9..2b96027 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -197,14 +197,7 @@ int btrfs_copy_root(struct btrfs_trans_handle *trans,
        u32 nritems;
        int ret = 0;
        int level;
-       struct btrfs_root *new_root;
-
-       new_root = kmalloc(sizeof(*new_root), GFP_NOFS);
-       if (!new_root)
-               return -ENOMEM;
-
-       memcpy(new_root, root, sizeof(*new_root));
-       new_root->root_key.objectid = new_root_objectid;
+       struct btrfs_disk_key disk_key;
 
        WARN_ON(root->ref_cows && trans->transid !=
                root->fs_info->running_transaction->transid);
@@ -212,28 +205,37 @@ int btrfs_copy_root(struct btrfs_trans_handle *trans,
 
        level = btrfs_header_level(buf);
        nritems = btrfs_header_nritems(buf);
+       if (level == 0)
+               btrfs_item_key(buf, &disk_key, 0);
+       else
+               btrfs_node_key(buf, &disk_key, 0);
 
-       cow = btrfs_alloc_free_block(trans, new_root, buf->len, 0,
-                                    new_root_objectid, trans->transid,
-                                    level, buf->start, 0);
-       if (IS_ERR(cow)) {
-               kfree(new_root);
+       cow = btrfs_alloc_free_block(trans, root, buf->len, 0,
+                                    new_root_objectid, &disk_key, level,
+                                    buf->start, 0);
+       if (IS_ERR(cow))
                return PTR_ERR(cow);
-       }
 
        copy_extent_buffer(cow, buf, 0, 0, cow->len);
        btrfs_set_header_bytenr(cow, cow->start);
        btrfs_set_header_generation(cow, trans->transid);
-       btrfs_set_header_owner(cow, new_root_objectid);
-       btrfs_clear_header_flag(cow, BTRFS_HEADER_FLAG_WRITTEN);
+       btrfs_set_header_backref_rev(cow, BTRFS_MIXED_BACKREF_REV);
+       btrfs_clear_header_flag(cow, BTRFS_HEADER_FLAG_WRITTEN |
+                                    BTRFS_HEADER_FLAG_RELOC);
+       if (new_root_objectid == BTRFS_TREE_RELOC_OBJECTID)
+               btrfs_set_header_flag(cow, BTRFS_HEADER_FLAG_RELOC);
+       else
+               btrfs_set_header_owner(cow, new_root_objectid);
 
        write_extent_buffer(cow, root->fs_info->fsid,
                            (unsigned long)btrfs_header_fsid(cow),
                            BTRFS_FSID_SIZE);
 
        WARN_ON(btrfs_header_generation(buf) > trans->transid);
-       ret = btrfs_inc_ref(trans, new_root, buf, cow, NULL);
-       kfree(new_root);
+       if (new_root_objectid == BTRFS_TREE_RELOC_OBJECTID)
+               ret = btrfs_inc_ref(trans, root, cow, 1);
+       else
+               ret = btrfs_inc_ref(trans, root, cow, 0);
 
        if (ret)
                return ret;
@@ -244,6 +246,125 @@ int btrfs_copy_root(struct btrfs_trans_handle *trans,
 }
 
 /*
+ * check if the tree block can be shared by multiple trees
+ */
+int btrfs_block_can_be_shared(struct btrfs_root *root,
+                             struct extent_buffer *buf)
+{
+       /*
+        * Tree blocks not in reference counted trees and tree roots
+        * are never shared. If a block was allocated after the last
+        * snapshot and the block was not allocated by tree relocation,
+        * we know the block is not shared.
+        */
+       if (root->ref_cows &&
+           buf != root->node && buf != root->commit_root &&
+           (btrfs_header_generation(buf) <=
+            btrfs_root_last_snapshot(&root->root_item) ||
+            btrfs_header_flag(buf, BTRFS_HEADER_FLAG_RELOC)))
+               return 1;
+#ifdef BTRFS_COMPAT_EXTENT_TREE_V0
+       if (root->ref_cows &&
+           btrfs_header_backref_rev(buf) < BTRFS_MIXED_BACKREF_REV)
+               return 1;
+#endif
+       return 0;
+}
+
+static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans,
+                                      struct btrfs_root *root,
+                                      struct extent_buffer *buf,
+                                      struct extent_buffer *cow)
+{
+       u64 refs;
+       u64 owner;
+       u64 flags;
+       u64 new_flags = 0;
+       int ret;
+
+       /*
+        * Backrefs update rules:
+        *
+        * Always use full backrefs for extent pointers in tree block
+        * allocated by tree relocation.
+        *
+        * If a shared tree block is no longer referenced by its owner
+        * tree (btrfs_header_owner(buf) == root->root_key.objectid),
+        * use full backrefs for extent pointers in tree block.
+        *
+        * If a tree block is being relocated
+        * (root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID),
+        * use full backrefs for extent pointers in tree block.
+        * The reason for this is some operations (such as drop tree)
+        * are only allowed for blocks that use full backrefs.
+        */
+
+       if (btrfs_block_can_be_shared(root, buf)) {
+               ret = btrfs_lookup_extent_info(trans, root, buf->start,
+                                              buf->len, &refs, &flags);
+               BUG_ON(ret);
+               BUG_ON(refs == 0);
+       } else {
+               refs = 1;
+               if (root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID ||
+                   btrfs_header_backref_rev(buf) < BTRFS_MIXED_BACKREF_REV)
+                       flags = BTRFS_BLOCK_FLAG_FULL_BACKREF;
+               else
+                       flags = 0;
+       }
+
+       owner = btrfs_header_owner(buf);
+       BUG_ON(owner == BTRFS_TREE_RELOC_OBJECTID &&
+              !(flags & BTRFS_BLOCK_FLAG_FULL_BACKREF));
+
+       if (refs > 1) {
+               if ((owner == root->root_key.objectid ||
+                    root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID) &&
+                   !(flags & BTRFS_BLOCK_FLAG_FULL_BACKREF)) {
+                       ret = btrfs_inc_ref(trans, root, buf, 1);
+                       BUG_ON(ret);
+
+                       if (root->root_key.objectid ==
+                           BTRFS_TREE_RELOC_OBJECTID) {
+                               ret = btrfs_dec_ref(trans, root, buf, 0);
+                               BUG_ON(ret);
+                               ret = btrfs_inc_ref(trans, root, cow, 1);
+                               BUG_ON(ret);
+                       }
+                       new_flags |= BTRFS_BLOCK_FLAG_FULL_BACKREF;
+               } else {
+
+                       if (root->root_key.objectid ==
+                           BTRFS_TREE_RELOC_OBJECTID)
+                               ret = btrfs_inc_ref(trans, root, cow, 1);
+                       else
+                               ret = btrfs_inc_ref(trans, root, cow, 0);
+                       BUG_ON(ret);
+               }
+               if (new_flags != 0) {
+                       ret = btrfs_set_disk_extent_flags(trans, root,
+                                                         buf->start,
+                                                         buf->len,
+                                                         new_flags, 0);
+                       BUG_ON(ret);
+               }
+       } else {
+               if (flags & BTRFS_BLOCK_FLAG_FULL_BACKREF) {
+                       if (root->root_key.objectid ==
+                           BTRFS_TREE_RELOC_OBJECTID)
+                               ret = btrfs_inc_ref(trans, root, cow, 1);
+                       else
+                               ret = btrfs_inc_ref(trans, root, cow, 0);
+                       BUG_ON(ret);
+                       ret = btrfs_dec_ref(trans, root, buf, 1);
+                       BUG_ON(ret);
+               }
+               clean_tree_block(trans, root, buf);
+       }
+       return 0;
+}
+
+/*
  * does the dirty work in cow of a single block.  The parent block (if
  * supplied) is updated to point to the new cow copy.  The new buffer is marked
  * dirty and returned locked.  If you modify the block it needs to be marked
@@ -262,34 +383,39 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans,
                             struct extent_buffer **cow_ret,
                             u64 search_start, u64 empty_size)
 {
-       u64 parent_start;
+       struct btrfs_disk_key disk_key;
        struct extent_buffer *cow;
-       u32 nritems;
-       int ret = 0;
        int level;
        int unlock_orig = 0;
+       u64 parent_start;
 
        if (*cow_ret == buf)
                unlock_orig = 1;
 
        btrfs_assert_tree_locked(buf);
 
-       if (parent)
-               parent_start = parent->start;
-       else
-               parent_start = 0;
-
        WARN_ON(root->ref_cows && trans->transid !=
                root->fs_info->running_transaction->transid);
        WARN_ON(root->ref_cows && trans->transid != root->last_trans);
 
        level = btrfs_header_level(buf);
-       nritems = btrfs_header_nritems(buf);
 
-       cow = btrfs_alloc_free_block(trans, root, buf->len,
-                                    parent_start, root->root_key.objectid,
-                                    trans->transid, level,
-                                    search_start, empty_size);
+       if (level == 0)
+               btrfs_item_key(buf, &disk_key, 0);
+       else
+               btrfs_node_key(buf, &disk_key, 0);
+
+       if (root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID) {
+               if (parent)
+                       parent_start = parent->start;
+               else
+                       parent_start = 0;
+       } else
+               parent_start = 0;
+
+       cow = btrfs_alloc_free_block(trans, root, buf->len, parent_start,
+                                    root->root_key.objectid, &disk_key,
+                                    level, search_start, empty_size);
        if (IS_ERR(cow))
                return PTR_ERR(cow);
 
@@ -298,83 +424,53 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans,
        copy_extent_buffer(cow, buf, 0, 0, cow->len);
        btrfs_set_header_bytenr(cow, cow->start);
        btrfs_set_header_generation(cow, trans->transid);
-       btrfs_set_header_owner(cow, root->root_key.objectid);
-       btrfs_clear_header_flag(cow, BTRFS_HEADER_FLAG_WRITTEN);
+       btrfs_set_header_backref_rev(cow, BTRFS_MIXED_BACKREF_REV);
+       btrfs_clear_header_flag(cow, BTRFS_HEADER_FLAG_WRITTEN |
+                                    BTRFS_HEADER_FLAG_RELOC);
+       if (root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID)
+               btrfs_set_header_flag(cow, BTRFS_HEADER_FLAG_RELOC);
+       else
+               btrfs_set_header_owner(cow, root->root_key.objectid);
 
        write_extent_buffer(cow, root->fs_info->fsid,
                            (unsigned long)btrfs_header_fsid(cow),
                            BTRFS_FSID_SIZE);
 
-       WARN_ON(btrfs_header_generation(buf) > trans->transid);
-       if (btrfs_header_generation(buf) != trans->transid) {
-               u32 nr_extents;
-               ret = btrfs_inc_ref(trans, root, buf, cow, &nr_extents);
-               if (ret)
-                       return ret;
-
-               ret = btrfs_cache_ref(trans, root, buf, nr_extents);
-               WARN_ON(ret);
-       } else if (btrfs_header_owner(buf) == BTRFS_TREE_RELOC_OBJECTID) {
-               /*
-                * There are only two places that can drop reference to
-                * tree blocks owned by living reloc trees, one is here,
-                * the other place is btrfs_drop_subtree. In both places,
-                * we check reference count while tree block is locked.
-                * Furthermore, if reference count is one, it won't get
-                * increased by someone else.
-                */
-               u32 refs;
-               ret = btrfs_lookup_extent_ref(trans, root, buf->start,
-                                             buf->len, &refs);
-               BUG_ON(ret);
-               if (refs == 1) {
-                       ret = btrfs_update_ref(trans, root, buf, cow,
-                                              0, nritems);
-                       clean_tree_block(trans, root, buf);
-               } else {
-                       ret = btrfs_inc_ref(trans, root, buf, cow, NULL);
-               }
-               BUG_ON(ret);
-       } else {
-               ret = btrfs_update_ref(trans, root, buf, cow, 0, nritems);
-               if (ret)
-                       return ret;
-               clean_tree_block(trans, root, buf);
-       }
-
-       if (root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID) {
-               ret = btrfs_reloc_tree_cache_ref(trans, root, cow, buf->start);
-               WARN_ON(ret);
-       }
+       update_ref_for_cow(trans, root, buf, cow);
 
        if (buf == root->node) {
                WARN_ON(parent && parent != buf);
+               if (root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID ||
+                   btrfs_header_backref_rev(buf) < BTRFS_MIXED_BACKREF_REV)
+                       parent_start = buf->start;
+               else
+                       parent_start = 0;
 
                spin_lock(&root->node_lock);
                root->node = cow;
                extent_buffer_get(cow);
                spin_unlock(&root->node_lock);
 
-               if (buf != root->commit_root) {
-                       btrfs_free_extent(trans, root, buf->start,
-                                         buf->len, buf->start,
-                                         root->root_key.objectid,
-                                         btrfs_header_generation(buf),
-                                         level, 1);
-               }
+               btrfs_free_extent(trans, root, buf->start, buf->len,
+                                 parent_start, root->root_key.objectid,
+                                 level, 0);
                free_extent_buffer(buf);
                add_root_to_dirty_list(root);
        } else {
+               if (root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID)
+                       parent_start = parent->start;
+               else
+                       parent_start = 0;
+
+               WARN_ON(trans->transid != btrfs_header_generation(parent));
                btrfs_set_node_blockptr(parent, parent_slot,
                                        cow->start);
-               WARN_ON(trans->transid == 0);
                btrfs_set_node_ptr_generation(parent, parent_slot,
                                              trans->transid);
                btrfs_mark_buffer_dirty(parent);
-               WARN_ON(btrfs_header_generation(parent) != trans->transid);
                btrfs_free_extent(trans, root, buf->start, buf->len,
-                                 parent_start, btrfs_header_owner(parent),
-                                 btrfs_header_generation(parent), level, 1);
+                                 parent_start, root->root_key.objectid,
+                                 level, 0);
        }
        if (unlock_orig)
                btrfs_tree_unlock(buf);
@@ -384,6 +480,18 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans,
        return 0;
 }
 
+static inline int should_cow_block(struct btrfs_trans_handle *trans,
+                                  struct btrfs_root *root,
+                                  struct extent_buffer *buf)
+{
+       if (btrfs_header_generation(buf) == trans->transid &&
+           !btrfs_header_flag(buf, BTRFS_HEADER_FLAG_WRITTEN) &&
+           !(root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID &&
+             btrfs_header_flag(buf, BTRFS_HEADER_FLAG_RELOC)))
+               return 0;
+       return 1;
+}
+
 /*
  * cows a single block, see __btrfs_cow_block for the real work.
  * This version of it has extra checks so that a block isn't cow'd more than
@@ -411,9 +519,7 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans,
                WARN_ON(1);
        }
 
-       if (btrfs_header_generation(buf) == trans->transid &&
-           btrfs_header_owner(buf) == root->root_key.objectid &&
-           !btrfs_header_flag(buf, BTRFS_HEADER_FLAG_WRITTEN)) {
+       if (!should_cow_block(trans, root, buf)) {
                *cow_ret = buf;
                return 0;
        }
@@ -469,7 +575,7 @@ static int comp_keys(struct btrfs_disk_key *disk, struct btrfs_key *k2)
 /*
  * same as comp_keys only with two btrfs_key's
  */
-static int comp_cpu_keys(struct btrfs_key *k1, struct btrfs_key *k2)
+int btrfs_comp_cpu_keys(struct btrfs_key *k1, struct btrfs_key *k2)
 {
        if (k1->objectid > k2->objectid)
                return 1;
@@ -845,6 +951,12 @@ static int bin_search(struct extent_buffer *eb, struct btrfs_key *key,
        return -1;
 }
 
+int btrfs_bin_search(struct extent_buffer *eb, struct btrfs_key *key,
+                    int level, int *slot)
+{
+       return bin_search(eb, key, level, slot);
+}
+
 /* given a node and slot number, this reads the blocks it points to.  The
  * extent buffer is returned with a reference taken (but unlocked).
  * NULL is returned on error.
@@ -921,13 +1033,6 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
                root->node = child;
                spin_unlock(&root->node_lock);
 
-               ret = btrfs_update_extent_ref(trans, root, child->start,
-                                             child->len,
-                                             mid->start, child->start,
-                                             root->root_key.objectid,
-                                             trans->transid, level - 1);
-               BUG_ON(ret);
-
                add_root_to_dirty_list(root);
                btrfs_tree_unlock(child);
 
@@ -938,9 +1043,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
                /* once for the path */
                free_extent_buffer(mid);
                ret = btrfs_free_extent(trans, root, mid->start, mid->len,
-                                       mid->start, root->root_key.objectid,
-                                       btrfs_header_generation(mid),
-                                       level, 1);
+                                       0, root->root_key.objectid, level, 1);
                /* once for the root ptr */
                free_extent_buffer(mid);
                return ret;
@@ -998,7 +1101,6 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
                        ret = wret;
                if (btrfs_header_nritems(right) == 0) {
                        u64 bytenr = right->start;
-                       u64 generation = btrfs_header_generation(parent);
                        u32 blocksize = right->len;
 
                        clean_tree_block(trans, root, right);
@@ -1010,9 +1112,9 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
                        if (wret)
                                ret = wret;
                        wret = btrfs_free_extent(trans, root, bytenr,
-                                                blocksize, parent->start,
-                                                btrfs_header_owner(parent),
-                                                generation, level, 1);
+                                                blocksize, 0,
+                                                root->root_key.objectid,
+                                                level, 0);
                        if (wret)
                                ret = wret;
                } else {
@@ -1047,7 +1149,6 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
        }
        if (btrfs_header_nritems(mid) == 0) {
                /* we've managed to empty the middle node, drop it */
-               u64 root_gen = btrfs_header_generation(parent);
                u64 bytenr = mid->start;
                u32 blocksize = mid->len;
 
@@ -1059,9 +1160,8 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
                if (wret)
                        ret = wret;
                wret = btrfs_free_extent(trans, root, bytenr, blocksize,
-                                        parent->start,
-                                        btrfs_header_owner(parent),
-                                        root_gen, level, 1);
+                                        0, root->root_key.objectid,
+                                        level, 0);
                if (wret)
                        ret = wret;
        } else {
@@ -1437,7 +1537,7 @@ noinline void btrfs_unlock_up_safe(struct btrfs_path *path, int level)
 {
        int i;
 
-       if (path->keep_locks || path->lowest_level)
+       if (path->keep_locks)
                return;
 
        for (i = level; i < BTRFS_MAX_LEVEL; i++) {
@@ -1614,10 +1714,17 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root
                lowest_unlock = 2;
 
 again:
-       if (p->skip_locking)
-               b = btrfs_root_node(root);
-       else
-               b = btrfs_lock_root_node(root);
+       if (p->search_commit_root) {
+               b = root->commit_root;
+               extent_buffer_get(b);
+               if (!p->skip_locking)
+                       btrfs_tree_lock(b);
+       } else {
+               if (p->skip_locking)
+                       b = btrfs_root_node(root);
+               else
+                       b = btrfs_lock_root_node(root);
+       }
 
        while (b) {
                level = btrfs_header_level(b);
@@ -1638,11 +1745,9 @@ again:
                         * then we don't want to set the path blocking,
                         * so we test it here
                         */
-                       if (btrfs_header_generation(b) == trans->transid &&
-                           btrfs_header_owner(b) == root->root_key.objectid &&
-                           !btrfs_header_flag(b, BTRFS_HEADER_FLAG_WRITTEN)) {
+                       if (!should_cow_block(trans, root, b))
                                goto cow_done;
-                       }
+
                        btrfs_set_path_blocking(p);
 
                        wret = btrfs_cow_block(trans, root, b,
@@ -1764,138 +1869,6 @@ done:
        return ret;
 }
 
-int btrfs_merge_path(struct btrfs_trans_handle *trans,
-                    struct btrfs_root *root,
-                    struct btrfs_key *node_keys,
-                    u64 *nodes, int lowest_level)
-{
-       struct extent_buffer *eb;
-       struct extent_buffer *parent;
-       struct btrfs_key key;
-       u64 bytenr;
-       u64 generation;
-       u32 blocksize;
-       int level;
-       int slot;
-       int key_match;
-       int ret;
-
-       eb = btrfs_lock_root_node(root);
-       ret = btrfs_cow_block(trans, root, eb, NULL, 0, &eb);
-       BUG_ON(ret);
-
-       btrfs_set_lock_blocking(eb);
-
-       parent = eb;
-       while (1) {
-               level = btrfs_header_level(parent);
-               if (level == 0 || level <= lowest_level)
-                       break;
-
-               ret = bin_search(parent, &node_keys[lowest_level], level,
-                                &slot);
-               if (ret && slot > 0)
-                       slot--;
-
-               bytenr = btrfs_node_blockptr(parent, slot);
-               if (nodes[level - 1] == bytenr)
-                       break;
-
-               blocksize = btrfs_level_size(root, level - 1);
-               generation = btrfs_node_ptr_generation(parent, slot);
-               btrfs_node_key_to_cpu(eb, &key, slot);
-               key_match = !memcmp(&key, &node_keys[level - 1], sizeof(key));
-
-               if (generation == trans->transid) {
-                       eb = read_tree_block(root, bytenr, blocksize,
-                                            generation);
-                       btrfs_tree_lock(eb);
-                       btrfs_set_lock_blocking(eb);
-               }
-
-               /*
-                * if node keys match and node pointer hasn't been modified
-                * in the running transaction, we can merge the path. for
-                * blocks owened by reloc trees, the node pointer check is
-                * skipped, this is because these blocks are fully controlled
-                * by the space balance code, no one else can modify them.
-                */
-               if (!nodes[level - 1] || !key_match ||
-                   (generation == trans->transid &&
-                    btrfs_header_owner(eb) != BTRFS_TREE_RELOC_OBJECTID)) {
-                       if (level == 1 || level == lowest_level + 1) {
-                               if (generation == trans->transid) {
-                                       btrfs_tree_unlock(eb);
-                                       free_extent_buffer(eb);
-                               }
-                               break;
-                       }
-
-                       if (generation != trans->transid) {
-                               eb = read_tree_block(root, bytenr, blocksize,
-                                               generation);
-                               btrfs_tree_lock(eb);
-                               btrfs_set_lock_blocking(eb);
-                       }
-
-                       ret = btrfs_cow_block(trans, root, eb, parent, slot,
-                                             &eb);
-                       BUG_ON(ret);
-
-                       if (root->root_key.objectid ==
-                           BTRFS_TREE_RELOC_OBJECTID) {
-                               if (!nodes[level - 1]) {
-                                       nodes[level - 1] = eb->start;
-                                       memcpy(&node_keys[level - 1], &key,
-                                              sizeof(node_keys[0]));
-                               } else {
-                                       WARN_ON(1);
-                               }
-                       }
-
-                       btrfs_tree_unlock(parent);
-                       free_extent_buffer(parent);
-                       parent = eb;
-                       continue;
-               }
-
-               btrfs_set_node_blockptr(parent, slot, nodes[level - 1]);
-               btrfs_set_node_ptr_generation(parent, slot, trans->transid);
-               btrfs_mark_buffer_dirty(parent);
-
-               ret = btrfs_inc_extent_ref(trans, root,
-                                       nodes[level - 1],
-                                       blocksize, parent->start,
-                                       btrfs_header_owner(parent),
-                                       btrfs_header_generation(parent),
-                                       level - 1);
-               BUG_ON(ret);
-
-               /*
-                * If the block was created in the running transaction,
-                * it's possible this is the last reference to it, so we
-                * should drop the subtree.
-                */
-               if (generation == trans->transid) {
-                       ret = btrfs_drop_subtree(trans, root, eb, parent);
-                       BUG_ON(ret);
-                       btrfs_tree_unlock(eb);
-                       free_extent_buffer(eb);
-               } else {
-                       ret = btrfs_free_extent(trans, root, bytenr,
-                                       blocksize, parent->start,
-                                       btrfs_header_owner(parent),
-                                       btrfs_header_generation(parent),
-                                       level - 1, 1);
-                       BUG_ON(ret);
-               }
-               break;
-       }
-       btrfs_tree_unlock(parent);
-       free_extent_buffer(parent);
-       return 0;
-}
-
 /*
  * adjust the pointers going up the tree, starting at level
  * making sure the right key of each node is points to 'key'.
@@ -2021,9 +1994,6 @@ static int push_node_left(struct btrfs_trans_handle *trans,
        btrfs_mark_buffer_dirty(src);
        btrfs_mark_buffer_dirty(dst);
 
-       ret = btrfs_update_ref(trans, root, src, dst, dst_nritems, push_items);
-       BUG_ON(ret);
-
        return ret;
 }
 
@@ -2083,9 +2053,6 @@ static int balance_node_right(struct btrfs_trans_handle *trans,
        btrfs_mark_buffer_dirty(src);
        btrfs_mark_buffer_dirty(dst);
 
-       ret = btrfs_update_ref(trans, root, src, dst, 0, push_items);
-       BUG_ON(ret);
-
        return ret;
 }
 
@@ -2105,7 +2072,6 @@ static noinline int insert_new_root(struct btrfs_trans_handle *trans,
        struct extent_buffer *c;
        struct extent_buffer *old;
        struct btrfs_disk_key lower_key;
-       int ret;
 
        BUG_ON(path->nodes[level]);
        BUG_ON(path->nodes[level-1] != root->node);
@@ -2117,16 +2083,17 @@ static noinline int insert_new_root(struct btrfs_trans_handle *trans,
                btrfs_node_key(lower, &lower_key, 0);
 
        c = btrfs_alloc_free_block(trans, root, root->nodesize, 0,
-                                  root->root_key.objectid, trans->transid,
+                                  root->root_key.objectid, &lower_key,
                                   level, root->node->start, 0);
        if (IS_ERR(c))
                return PTR_ERR(c);
 
-       memset_extent_buffer(c, 0, 0, root->nodesize);
+       memset_extent_buffer(c, 0, 0, sizeof(struct btrfs_header));
        btrfs_set_header_nritems(c, 1);
        btrfs_set_header_level(c, level);
        btrfs_set_header_bytenr(c, c->start);
        btrfs_set_header_generation(c, trans->transid);
+       btrfs_set_header_backref_rev(c, BTRFS_MIXED_BACKREF_REV);
        btrfs_set_header_owner(c, root->root_key.objectid);
 
        write_extent_buffer(c, root->fs_info->fsid,
@@ -2151,12 +2118,6 @@ static noinline int insert_new_root(struct btrfs_trans_handle *trans,
        root->node = c;
        spin_unlock(&root->node_lock);
 
-       ret = btrfs_update_extent_ref(trans, root, lower->start,
-                                     lower->len, lower->start, c->start,
-                                     root->root_key.objectid,
-                                     trans->transid, level - 1);
-       BUG_ON(ret);
-
        /* the super has an extra ref to root->node */
        free_extent_buffer(old);
 
@@ -2244,20 +2205,21 @@ static noinline int split_node(struct btrfs_trans_handle *trans,
        }
 
        c_nritems = btrfs_header_nritems(c);
+       mid = (c_nritems + 1) / 2;
+       btrfs_node_key(c, &disk_key, mid);
 
-       split = btrfs_alloc_free_block(trans, root, root->nodesize,
-                                       path->nodes[level + 1]->start,
+       split = btrfs_alloc_free_block(trans, root, root->nodesize, 0,
                                        root->root_key.objectid,
-                                       trans->transid, level, c->start, 0);
+                                       &disk_key, level, c->start, 0);
        if (IS_ERR(split))
                return PTR_ERR(split);
 
-       btrfs_set_header_flags(split, btrfs_header_flags(c));
+       memset_extent_buffer(split, 0, 0, sizeof(struct btrfs_header));
        btrfs_set_header_level(split, btrfs_header_level(c));
        btrfs_set_header_bytenr(split, split->start);
        btrfs_set_header_generation(split, trans->transid);
+       btrfs_set_header_backref_rev(split, BTRFS_MIXED_BACKREF_REV);
        btrfs_set_header_owner(split, root->root_key.objectid);
-       btrfs_set_header_flags(split, 0);
        write_extent_buffer(split, root->fs_info->fsid,
                            (unsigned long)btrfs_header_fsid(split),
                            BTRFS_FSID_SIZE);
@@ -2265,7 +2227,6 @@ static noinline int split_node(struct btrfs_trans_handle *trans,
                            (unsigned long)btrfs_header_chunk_tree_uuid(split),
                            BTRFS_UUID_SIZE);
 
-       mid = (c_nritems + 1) / 2;
 
        copy_extent_buffer(split, c,
                           btrfs_node_key_ptr_offset(0),
@@ -2278,16 +2239,12 @@ static noinline int split_node(struct btrfs_trans_handle *trans,
        btrfs_mark_buffer_dirty(c);
        btrfs_mark_buffer_dirty(split);
 
-       btrfs_node_key(split, &disk_key, 0);
        wret = insert_ptr(trans, root, path, &disk_key, split->start,
                          path->slots[level + 1] + 1,
                          level + 1);
        if (wret)
                ret = wret;
 
-       ret = btrfs_update_ref(trans, root, c, split, 0, c_nritems - mid);
-       BUG_ON(ret);
-
        if (path->slots[level] >= mid) {
                path->slots[level] -= mid;
                btrfs_tree_unlock(c);
@@ -2360,7 +2317,6 @@ static noinline int __push_leaf_right(struct btrfs_trans_handle *trans,
        u32 right_nritems;
        u32 data_end;
        u32 this_item_size;
-       int ret;
 
        if (empty)
                nr = 0;
@@ -2473,9 +2429,6 @@ static noinline int __push_leaf_right(struct btrfs_trans_handle *trans,
                btrfs_mark_buffer_dirty(left);
        btrfs_mark_buffer_dirty(right);
 
-       ret = btrfs_update_ref(trans, root, left, right, 0, push_items);
-       BUG_ON(ret);
-
        btrfs_item_key(right, &disk_key, 0);
        btrfs_set_node_key(upper, &disk_key, slot + 1);
        btrfs_mark_buffer_dirty(upper);
@@ -2720,10 +2673,6 @@ static noinline int __push_leaf_left(struct btrfs_trans_handle *trans,
        if (right_nritems)
                btrfs_mark_buffer_dirty(right);
 
-       ret = btrfs_update_ref(trans, root, right, left,
-                              old_left_nritems, push_items);
-       BUG_ON(ret);
-
        btrfs_item_key(right, &disk_key, 0);
        wret = fixup_low_keys(trans, root, path, &disk_key, 1);
        if (wret)
@@ -2880,9 +2829,6 @@ static noinline int copy_for_split(struct btrfs_trans_handle *trans,
        btrfs_mark_buffer_dirty(l);
        BUG_ON(path->slots[0] != slot);
 
-       ret = btrfs_update_ref(trans, root, l, right, 0, nritems);
-       BUG_ON(ret);
-
        if (mid <= slot) {
                btrfs_tree_unlock(path->nodes[0]);
                free_extent_buffer(path->nodes[0]);
@@ -2911,6 +2857,7 @@ static noinline int split_leaf(struct btrfs_trans_handle *trans,
                               struct btrfs_path *path, int data_size,
                               int extend)
 {
+       struct btrfs_disk_key disk_key;
        struct extent_buffer *l;
        u32 nritems;
        int mid;
@@ -2918,7 +2865,7 @@ static noinline int split_leaf(struct btrfs_trans_handle *trans,
        struct extent_buffer *right;
        int ret = 0;
        int wret;
-       int double_split;
+       int split;
        int num_doubles = 0;
 
        /* first try to make some room by pushing left and right */
@@ -2945,16 +2892,53 @@ static noinline int split_leaf(struct btrfs_trans_handle *trans,
                        return ret;
        }
 again:
-       double_split = 0;
+       split = 1;
        l = path->nodes[0];
        slot = path->slots[0];
        nritems = btrfs_header_nritems(l);
        mid = (nritems + 1) / 2;
 
-       right = btrfs_alloc_free_block(trans, root, root->leafsize,
-                                       path->nodes[1]->start,
+       if (mid <= slot) {
+               if (nritems == 1 ||
+                   leaf_space_used(l, mid, nritems - mid) + data_size >
+                       BTRFS_LEAF_DATA_SIZE(root)) {
+                       if (slot >= nritems) {
+                               split = 0;
+                       } else {
+                               mid = slot;
+                               if (mid != nritems &&
+                                   leaf_space_used(l, mid, nritems - mid) +
+                                   data_size > BTRFS_LEAF_DATA_SIZE(root)) {
+                                       split = 2;
+                               }
+                       }
+               }
+       } else {
+               if (leaf_space_used(l, 0, mid) + data_size >
+                       BTRFS_LEAF_DATA_SIZE(root)) {
+                       if (!extend && data_size && slot == 0) {
+                               split = 0;
+                       } else if ((extend || !data_size) && slot == 0) {
+                               mid = 1;
+                       } else {
+                               mid = slot;
+                               if (mid != nritems &&
+                                   leaf_space_used(l, mid, nritems - mid) +
+                                   data_size > BTRFS_LEAF_DATA_SIZE(root)) {
+                                       split = 2;
+                               }
+                       }
+               }
+       }
+
+       if (split == 0)
+               btrfs_cpu_key_to_disk(&disk_key, ins_key);
+       else
+               btrfs_item_key(l, &disk_key, mid);
+
+       right = btrfs_alloc_free_block(trans, root, root->leafsize, 0,
                                        root->root_key.objectid,
-                                       trans->transid, 0, l->start, 0);
+                                       &disk_key, 0, l->start, 0);
        if (IS_ERR(right)) {
                BUG_ON(1);
                return PTR_ERR(right);
@@ -2963,6 +2947,7 @@ again:
        memset_extent_buffer(right, 0, 0, sizeof(struct btrfs_header));
        btrfs_set_header_bytenr(right, right->start);
        btrfs_set_header_generation(right, trans->transid);
+       btrfs_set_header_backref_rev(right, BTRFS_MIXED_BACKREF_REV);
        btrfs_set_header_owner(right, root->root_key.objectid);
        btrfs_set_header_level(right, 0);
        write_extent_buffer(right, root->fs_info->fsid,
@@ -2973,79 +2958,47 @@ again:
                            (unsigned long)btrfs_header_chunk_tree_uuid(right),
                            BTRFS_UUID_SIZE);
 
-       if (mid <= slot) {
-               if (nritems == 1 ||
-                   leaf_space_used(l, mid, nritems - mid) + data_size >
-                       BTRFS_LEAF_DATA_SIZE(root)) {
-                       if (slot >= nritems) {
-                               struct btrfs_disk_key disk_key;
-
-                               btrfs_cpu_key_to_disk(&disk_key, ins_key);
-                               btrfs_set_header_nritems(right, 0);
-                               wret = insert_ptr(trans, root, path,
-                                                 &disk_key, right->start,
-                                                 path->slots[1] + 1, 1);
-                               if (wret)
-                                       ret = wret;
+       if (split == 0) {
+               if (mid <= slot) {
+                       btrfs_set_header_nritems(right, 0);
+                       wret = insert_ptr(trans, root, path,
+                                         &disk_key, right->start,
+                                         path->slots[1] + 1, 1);
+                       if (wret)
+                               ret = wret;
 
-                               btrfs_tree_unlock(path->nodes[0]);
-                               free_extent_buffer(path->nodes[0]);
-                               path->nodes[0] = right;
-                               path->slots[0] = 0;
-                               path->slots[1] += 1;
-                               btrfs_mark_buffer_dirty(right);
-                               return ret;
-                       }
-                       mid = slot;
-                       if (mid != nritems &&
-                           leaf_space_used(l, mid, nritems - mid) +
-                           data_size > BTRFS_LEAF_DATA_SIZE(root)) {
-                               double_split = 1;
-                       }
-               }
-       } else {
-               if (leaf_space_used(l, 0, mid) + data_size >
-                       BTRFS_LEAF_DATA_SIZE(root)) {
-                       if (!extend && data_size && slot == 0) {
-                               struct btrfs_disk_key disk_key;
-
-                               btrfs_cpu_key_to_disk(&disk_key, ins_key);
-                               btrfs_set_header_nritems(right, 0);
-                               wret = insert_ptr(trans, root, path,
-                                                 &disk_key,
-                                                 right->start,
-                                                 path->slots[1], 1);
+                       btrfs_tree_unlock(path->nodes[0]);
+                       free_extent_buffer(path->nodes[0]);
+                       path->nodes[0] = right;
+                       path->slots[0] = 0;
+                       path->slots[1] += 1;
+               } else {
+                       btrfs_set_header_nritems(right, 0);
+                       wret = insert_ptr(trans, root, path,
+                                         &disk_key,
+                                         right->start,
+                                         path->slots[1], 1);
+                       if (wret)
+                               ret = wret;
+                       btrfs_tree_unlock(path->nodes[0]);
+                       free_extent_buffer(path->nodes[0]);
+                       path->nodes[0] = right;
+                       path->slots[0] = 0;
+                       if (path->slots[1] == 0) {
+                               wret = fixup_low_keys(trans, root,
+                                               path, &disk_key, 1);
                                if (wret)
                                        ret = wret;
-                               btrfs_tree_unlock(path->nodes[0]);
-                               free_extent_buffer(path->nodes[0]);
-                               path->nodes[0] = right;
-                               path->slots[0] = 0;
-                               if (path->slots[1] == 0) {
-                                       wret = fixup_low_keys(trans, root,
-                                                     path, &disk_key, 1);
-                                       if (wret)
-                                               ret = wret;
-                               }
-                               btrfs_mark_buffer_dirty(right);
-                               return ret;
-                       } else if ((extend || !data_size) && slot == 0) {
-                               mid = 1;
-                       } else {
-                               mid = slot;
-                               if (mid != nritems &&
-                                   leaf_space_used(l, mid, nritems - mid) +
-                                   data_size > BTRFS_LEAF_DATA_SIZE(root)) {
-                                       double_split = 1;
-                               }
                        }
                }
+               btrfs_mark_buffer_dirty(right);
+               return ret;
        }
 
        ret = copy_for_split(trans, root, path, l, right, slot, mid, nritems);
        BUG_ON(ret);
 
-       if (double_split) {
+       if (split == 2) {
                BUG_ON(num_doubles != 0);
                num_doubles++;
                goto again;
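
Note on the split_leaf() hunks above: the decision about how to split is now made before the new block is allocated, so that the first key of the new leaf is known when btrfs_alloc_free_block() is called. The three outcomes (0 = hand back an empty new leaf, 1 = one split at mid, 2 = split at mid and go around again) can be exercised in isolation. The following is only a minimal sketch under stated assumptions: toy_leaf, toy_space_used(), toy_choose_split() and TOY_LEAF_DATA_SIZE are hypothetical stand-ins for the extent buffer, leaf_space_used(), the inlined decision code and BTRFS_LEAF_DATA_SIZE(root); none of them exist in the kernel.

```c
#define TOY_LEAF_DATA_SIZE 400u  /* assumed stand-in for BTRFS_LEAF_DATA_SIZE(root) */
#define TOY_MAX_ITEMS 16

/* Hypothetical simplified leaf: only per-item space usage is tracked. */
struct toy_leaf {
	int nritems;
	unsigned item_space[TOY_MAX_ITEMS]; /* bytes used by each item */
};

/* Stand-in for leaf_space_used(): bytes used by items [start, start + nr). */
static unsigned toy_space_used(const struct toy_leaf *l, int start, int nr)
{
	unsigned total = 0;
	int i;

	for (i = start; i < start + nr; i++)
		total += l->item_space[i];
	return total;
}

/*
 * Mirrors the decision logic added to split_leaf() in this patch.
 * Returns 0, 1 or 2 and stores the chosen split point in *mid.
 */
static int toy_choose_split(const struct toy_leaf *l, int slot,
			    unsigned data_size, int extend, int *mid)
{
	int nritems = l->nritems;
	int split = 1;

	*mid = (nritems + 1) / 2;
	if (*mid <= slot) {
		/* insertion point is in the right half */
		if (nritems == 1 ||
		    toy_space_used(l, *mid, nritems - *mid) + data_size >
		    TOY_LEAF_DATA_SIZE) {
			if (slot >= nritems) {
				split = 0;	/* appending: use an empty leaf */
			} else {
				*mid = slot;
				if (*mid != nritems &&
				    toy_space_used(l, *mid, nritems - *mid) +
				    data_size > TOY_LEAF_DATA_SIZE)
					split = 2;
			}
		}
	} else {
		/* insertion point is in the left half */
		if (toy_space_used(l, 0, *mid) + data_size >
		    TOY_LEAF_DATA_SIZE) {
			if (!extend && data_size && slot == 0) {
				split = 0;	/* prepending: use an empty leaf */
			} else if ((extend || !data_size) && slot == 0) {
				*mid = 1;
			} else {
				*mid = slot;
				if (*mid != nritems &&
				    toy_space_used(l, *mid, nritems - *mid) +
				    data_size > TOY_LEAF_DATA_SIZE)
					split = 2;
			}
		}
	}
	return split;
}
```

With four 100-byte items and a 300-byte insertion at slot 4, the sketch returns 0, which corresponds to the path where disk_key is taken from ins_key rather than from an existing item in the leaf.
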
@@ -3447,7 +3400,7 @@ int btrfs_insert_some_items(struct btrfs_trans_handle *trans,
                /* figure out how many keys we can insert in here */
                total_data = data_size[0];
                for (i = 1; i < nr; i++) {
-                       if (comp_cpu_keys(&found_key, cpu_key + i) <= 0)
+                       if (btrfs_comp_cpu_keys(&found_key, cpu_key + i) <= 0)
                                break;
                        total_data += data_size[i];
                }
@@ -3745,9 +3698,7 @@ static int del_ptr(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 
 /*
  * a helper function to delete the leaf pointed to by path->slots[1] and
- * path->nodes[1].  bytenr is the node block pointer, but since the callers
- * already know it, it is faster to have them pass it down than to
- * read it out of the node again.
+ * path->nodes[1].
  *
  * This deletes the pointer in path->nodes[1] and frees the leaf
  * block extent.  zero is returned if it all worked out, < 0 otherwise.
@@ -3755,15 +3706,14 @@ static int del_ptr(struct btrfs_trans_handle *trans, struct btrfs_root *root,
  * The path must have already been setup for deleting the leaf, including
  * all the proper balancing.  path->nodes[1] must be locked.
  */
-noinline int btrfs_del_leaf(struct btrfs_trans_handle *trans,
-                           struct btrfs_root *root,
-                           struct btrfs_path *path, u64 bytenr)
+static noinline int btrfs_del_leaf(struct btrfs_trans_handle *trans,
+                                  struct btrfs_root *root,
+                                  struct btrfs_path *path,
+                                  struct extent_buffer *leaf)
 {
        int ret;
-       u64 root_gen = btrfs_header_generation(path->nodes[1]);
-       u64 parent_start = path->nodes[1]->start;
-       u64 parent_owner = btrfs_header_owner(path->nodes[1]);
 
+       WARN_ON(btrfs_header_generation(leaf) != trans->transid);
        ret = del_ptr(trans, root, path, 1, path->slots[1]);
        if (ret)
                return ret;
@@ -3774,10 +3724,8 @@ noinline int btrfs_del_leaf(struct btrfs_trans_handle *trans,
         */
        btrfs_unlock_up_safe(path, 0);
 
-       ret = btrfs_free_extent(trans, root, bytenr,
-                               btrfs_level_size(root, 0),
-                               parent_start, parent_owner,
-                               root_gen, 0, 1);
+       ret = btrfs_free_extent(trans, root, leaf->start, leaf->len,
+                               0, root->root_key.objectid, 0, 0);
        return ret;
 }
 /*
@@ -3845,7 +3793,7 @@ int btrfs_del_items(struct btrfs_trans_handle *trans, struct btrfs_root *root,
                if (leaf == root->node) {
                        btrfs_set_header_level(leaf, 0);
                } else {
-                       ret = btrfs_del_leaf(trans, root, path, leaf->start);
+                       ret = btrfs_del_leaf(trans, root, path, leaf);
                        BUG_ON(ret);
                }
        } else {
@@ -3884,8 +3832,7 @@ int btrfs_del_items(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 
                        if (btrfs_header_nritems(leaf) == 0) {
                                path->slots[1] = slot;
-                               ret = btrfs_del_leaf(trans, root, path,
-                                                    leaf->start);
+                               ret = btrfs_del_leaf(trans, root, path, leaf);
                                BUG_ON(ret);
                                free_extent_buffer(leaf);
                        } else {
index 4414a5d..ce3ab4e 100644
@@ -45,6 +45,8 @@ struct btrfs_ordered_sum;
 
 #define BTRFS_MAX_LEVEL 8
 
+#define BTRFS_COMPAT_EXTENT_TREE_V0
+
 /*
  * files bigger than this get some pre-flushing when they are added
  * to the ordered operations list.  That way we limit the total
@@ -267,7 +269,18 @@ static inline unsigned long btrfs_chunk_item_size(int num_stripes)
 }
 
 #define BTRFS_FSID_SIZE 16
-#define BTRFS_HEADER_FLAG_WRITTEN (1 << 0)
+#define BTRFS_HEADER_FLAG_WRITTEN      (1ULL << 0)
+#define BTRFS_HEADER_FLAG_RELOC                (1ULL << 1)
+#define BTRFS_SUPER_FLAG_SEEDING       (1ULL << 32)
+#define BTRFS_SUPER_FLAG_METADUMP      (1ULL << 33)
+
+#define BTRFS_BACKREF_REV_MAX          256
+#define BTRFS_BACKREF_REV_SHIFT                56
+#define BTRFS_BACKREF_REV_MASK         (((u64)BTRFS_BACKREF_REV_MAX - 1) << \
+                                        BTRFS_BACKREF_REV_SHIFT)
+
+#define BTRFS_OLD_BACKREF_REV          0
+#define BTRFS_MIXED_BACKREF_REV                1
 
 /*
  * every tree block (leaf or node) starts with this header.
@@ -296,7 +309,6 @@ struct btrfs_header {
                                        sizeof(struct btrfs_item) - \
                                        sizeof(struct btrfs_file_extent_item))
 
-#define BTRFS_SUPER_FLAG_SEEDING (1ULL << 32)
 
 /*
  * this is a very generous portion of the super block, giving us
@@ -355,9 +367,12 @@ struct btrfs_super_block {
  * Compat flags that we support.  If any incompat flags are set other than the
  * ones specified below then we will fail to mount
  */
-#define BTRFS_FEATURE_COMPAT_SUPP      0x0
-#define BTRFS_FEATURE_COMPAT_RO_SUPP   0x0
-#define BTRFS_FEATURE_INCOMPAT_SUPP    0x0
+#define BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF   (1ULL << 0)
+
+#define BTRFS_FEATURE_COMPAT_SUPP              0ULL
+#define BTRFS_FEATURE_COMPAT_RO_SUPP           0ULL
+#define BTRFS_FEATURE_INCOMPAT_SUPP            \
+       BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF
 
 /*
  * A leaf is full of items. offset and size tell us where to find
@@ -421,23 +436,65 @@ struct btrfs_path {
        unsigned int keep_locks:1;
        unsigned int skip_locking:1;
        unsigned int leave_spinning:1;
+       unsigned int search_commit_root:1;
 };
 
 /*
  * items in the extent btree are used to record the objectid of the
  * owner of the block and the number of references
  */
+
 struct btrfs_extent_item {
+       __le64 refs;
+       __le64 generation;
+       __le64 flags;
+} __attribute__ ((__packed__));
+
+struct btrfs_extent_item_v0 {
        __le32 refs;
 } __attribute__ ((__packed__));
 
-struct btrfs_extent_ref {
+#define BTRFS_MAX_EXTENT_ITEM_SIZE(r) ((BTRFS_LEAF_DATA_SIZE(r) >> 4) - \
+                                       sizeof(struct btrfs_item))
+
+#define BTRFS_EXTENT_FLAG_DATA         (1ULL << 0)
+#define BTRFS_EXTENT_FLAG_TREE_BLOCK   (1ULL << 1)
+
+/* following flags only apply to tree blocks */
+
+/* use full backrefs for extent pointers in the block */
+#define BTRFS_BLOCK_FLAG_FULL_BACKREF  (1ULL << 8)
+
+struct btrfs_tree_block_info {
+       struct btrfs_disk_key key;
+       u8 level;
+} __attribute__ ((__packed__));
+
+struct btrfs_extent_data_ref {
+       __le64 root;
+       __le64 objectid;
+       __le64 offset;
+       __le32 count;
+} __attribute__ ((__packed__));
+
+struct btrfs_shared_data_ref {
+       __le32 count;
+} __attribute__ ((__packed__));
+
+struct btrfs_extent_inline_ref {
+       u8 type;
+       __le64 offset;
+} __attribute__ ((__packed__));
+
+/* old style backrefs item */
+struct btrfs_extent_ref_v0 {
        __le64 root;
        __le64 generation;
        __le64 objectid;
-       __le32 num_refs;
+       __le32 count;
 } __attribute__ ((__packed__));
 
+
 /* dev extents record free space on individual devices.  The owner
  * field points back to the chunk allocation mapping tree that allocated
  * the extent.  The chunk tree uuid field is a way to double check the owner
@@ -695,12 +752,7 @@ struct btrfs_block_group_cache {
        struct list_head cluster_list;
 };
 
-struct btrfs_leaf_ref_tree {
-       struct rb_root root;
-       struct list_head list;
-       spinlock_t lock;
-};
-
+struct reloc_control;
 struct btrfs_device;
 struct btrfs_fs_devices;
 struct btrfs_fs_info {
@@ -831,18 +883,11 @@ struct btrfs_fs_info {
        struct task_struct *cleaner_kthread;
        int thread_pool_size;
 
-       /* tree relocation relocated fields */
-       struct list_head dead_reloc_roots;
-       struct btrfs_leaf_ref_tree reloc_ref_tree;
-       struct btrfs_leaf_ref_tree shared_ref_tree;
-
        struct kobject super_kobj;
        struct completion kobj_unregister;
        int do_barriers;
        int closing;
        int log_root_recovering;
-       atomic_t throttles;
-       atomic_t throttle_gen;
 
        u64 total_pinned;
 
@@ -861,6 +906,8 @@ struct btrfs_fs_info {
         */
        struct list_head space_info;
 
+       struct reloc_control *reloc_ctl;
+
        spinlock_t delalloc_lock;
        spinlock_t new_trans_lock;
        u64 delalloc_bytes;
@@ -891,7 +938,6 @@ struct btrfs_fs_info {
  * in ram representation of the tree.  extent_root is used for all allocations
  * and for the extent tree extent_root root.
  */
-struct btrfs_dirty_root;
 struct btrfs_root {
        struct extent_buffer *node;
 
@@ -899,9 +945,6 @@ struct btrfs_root {
        spinlock_t node_lock;
 
        struct extent_buffer *commit_root;
-       struct btrfs_leaf_ref_tree *ref_tree;
-       struct btrfs_leaf_ref_tree ref_tree_struct;
-       struct btrfs_dirty_root *dirty_root;
        struct btrfs_root *log_root;
        struct btrfs_root *reloc_root;
 
@@ -952,10 +995,15 @@ struct btrfs_root {
        /* the dirty list is only used by non-reference counted roots */
        struct list_head dirty_list;
 
+       struct list_head root_list;
+
        spinlock_t list_lock;
-       struct list_head dead_list;
        struct list_head orphan_list;
 
+       spinlock_t inode_lock;
+       /* red-black tree that keeps track of in-memory inodes */
+       struct rb_root inode_tree;
+
        /*
         * right now this just gets used so that a root has its own devid
         * for stat.  It may be used for more later
@@ -1017,7 +1065,16 @@ struct btrfs_root {
  * are used, and how many references there are to each block
  */
 #define BTRFS_EXTENT_ITEM_KEY  168
-#define BTRFS_EXTENT_REF_KEY   180
+
+#define BTRFS_TREE_BLOCK_REF_KEY       176
+
+#define BTRFS_EXTENT_DATA_REF_KEY      178
+
+#define BTRFS_EXTENT_REF_V0_KEY                180
+
+#define BTRFS_SHARED_BLOCK_REF_KEY     182
+
+#define BTRFS_SHARED_DATA_REF_KEY      184
 
 /*
  * block groups give us hints into the extent allocation trees.  Which
@@ -1317,24 +1374,67 @@ static inline u8 *btrfs_dev_extent_chunk_tree_uuid(struct btrfs_dev_extent *dev)
        return (u8 *)((unsigned long)dev + ptr);
 }
 
-/* struct btrfs_extent_ref */
-BTRFS_SETGET_FUNCS(ref_root, struct btrfs_extent_ref, root, 64);
-BTRFS_SETGET_FUNCS(ref_generation, struct btrfs_extent_ref, generation, 64);
-BTRFS_SETGET_FUNCS(ref_objectid, struct btrfs_extent_ref, objectid, 64);
-BTRFS_SETGET_FUNCS(ref_num_refs, struct btrfs_extent_ref, num_refs, 32);
+BTRFS_SETGET_FUNCS(extent_refs, struct btrfs_extent_item, refs, 64);
+BTRFS_SETGET_FUNCS(extent_generation, struct btrfs_extent_item,
+                  generation, 64);
+BTRFS_SETGET_FUNCS(extent_flags, struct btrfs_extent_item, flags, 64);
 
-BTRFS_SETGET_STACK_FUNCS(stack_ref_root, struct btrfs_extent_ref, root, 64);
-BTRFS_SETGET_STACK_FUNCS(stack_ref_generation, struct btrfs_extent_ref,
-                        generation, 64);
-BTRFS_SETGET_STACK_FUNCS(stack_ref_objectid, struct btrfs_extent_ref,
-                        objectid, 64);
-BTRFS_SETGET_STACK_FUNCS(stack_ref_num_refs, struct btrfs_extent_ref,
-                        num_refs, 32);
+BTRFS_SETGET_FUNCS(extent_refs_v0, struct btrfs_extent_item_v0, refs, 32);
+
+
+BTRFS_SETGET_FUNCS(tree_block_level, struct btrfs_tree_block_info, level, 8);
+
+static inline void btrfs_tree_block_key(struct extent_buffer *eb,
+                                       struct btrfs_tree_block_info *item,
+                                       struct btrfs_disk_key *key)
+{
+       read_eb_member(eb, item, struct btrfs_tree_block_info, key, key);
+}
+
+static inline void btrfs_set_tree_block_key(struct extent_buffer *eb,
+                                           struct btrfs_tree_block_info *item,
+                                           struct btrfs_disk_key *key)
+{
+       write_eb_member(eb, item, struct btrfs_tree_block_info, key, key);
+}
 
-/* struct btrfs_extent_item */
-BTRFS_SETGET_FUNCS(extent_refs, struct btrfs_extent_item, refs, 32);
-BTRFS_SETGET_STACK_FUNCS(stack_extent_refs, struct btrfs_extent_item,
-                        refs, 32);
+BTRFS_SETGET_FUNCS(extent_data_ref_root, struct btrfs_extent_data_ref,
+                  root, 64);
+BTRFS_SETGET_FUNCS(extent_data_ref_objectid, struct btrfs_extent_data_ref,
+                  objectid, 64);
+BTRFS_SETGET_FUNCS(extent_data_ref_offset, struct btrfs_extent_data_ref,
+                  offset, 64);
+BTRFS_SETGET_FUNCS(extent_data_ref_count, struct btrfs_extent_data_ref,
+                  count, 32);
+
+BTRFS_SETGET_FUNCS(shared_data_ref_count, struct btrfs_shared_data_ref,
+                  count, 32);
+
+BTRFS_SETGET_FUNCS(extent_inline_ref_type, struct btrfs_extent_inline_ref,
+                  type, 8);
+BTRFS_SETGET_FUNCS(extent_inline_ref_offset, struct btrfs_extent_inline_ref,
+                  offset, 64);
+
+static inline u32 btrfs_extent_inline_ref_size(int type)
+{
+       if (type == BTRFS_TREE_BLOCK_REF_KEY ||
+           type == BTRFS_SHARED_BLOCK_REF_KEY)
+               return sizeof(struct btrfs_extent_inline_ref);
+       if (type == BTRFS_SHARED_DATA_REF_KEY)
+               return sizeof(struct btrfs_shared_data_ref) +
+                      sizeof(struct btrfs_extent_inline_ref);
+       if (type == BTRFS_EXTENT_DATA_REF_KEY)
+               return sizeof(struct btrfs_extent_data_ref) +
+                      offsetof(struct btrfs_extent_inline_ref, offset);
+       BUG();
+       return 0;
+}
+
+BTRFS_SETGET_FUNCS(ref_root_v0, struct btrfs_extent_ref_v0, root, 64);
+BTRFS_SETGET_FUNCS(ref_generation_v0, struct btrfs_extent_ref_v0,
+                  generation, 64);
+BTRFS_SETGET_FUNCS(ref_objectid_v0, struct btrfs_extent_ref_v0, objectid, 64);
+BTRFS_SETGET_FUNCS(ref_count_v0, struct btrfs_extent_ref_v0, count, 32);
 
 /* struct btrfs_node */
 BTRFS_SETGET_FUNCS(key_blockptr, struct btrfs_key_ptr, blockptr, 64);
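
Note on btrfs_extent_inline_ref_size() in the hunk above: the EXTENT_DATA_REF case adds offsetof(struct btrfs_extent_inline_ref, offset) rather than the full struct size, which suggests that the data ref body starts where the offset field would otherwise sit. The resulting byte counts can be checked with a small userspace sketch. This is an illustration only: the struct and macro names below are shortened local copies, uint64_t/uint32_t stand in for the kernel's __le64/__le32, the GCC/Clang packed attribute is assumed, and an unknown type returns 0 here where the kernel code calls BUG().

```c
#include <stddef.h>
#include <stdint.h>

/* Key type values as defined in this patch. */
#define TREE_BLOCK_REF_KEY	176
#define EXTENT_DATA_REF_KEY	178
#define SHARED_BLOCK_REF_KEY	182
#define SHARED_DATA_REF_KEY	184

/* Local copies of the on-disk layouts (userspace integer types assumed). */
struct extent_data_ref {
	uint64_t root;
	uint64_t objectid;
	uint64_t offset;
	uint32_t count;
} __attribute__ ((__packed__));

struct shared_data_ref {
	uint32_t count;
} __attribute__ ((__packed__));

struct extent_inline_ref {
	uint8_t type;
	uint64_t offset;
} __attribute__ ((__packed__));

/* Size in bytes of one inline back reference of the given type. */
static size_t inline_ref_size(int type)
{
	if (type == TREE_BLOCK_REF_KEY || type == SHARED_BLOCK_REF_KEY)
		return sizeof(struct extent_inline_ref);
	if (type == SHARED_DATA_REF_KEY)
		return sizeof(struct shared_data_ref) +
		       sizeof(struct extent_inline_ref);
	if (type == EXTENT_DATA_REF_KEY)
		return sizeof(struct extent_data_ref) +
		       offsetof(struct extent_inline_ref, offset);
	return 0;	/* unknown type; the kernel version calls BUG() */
}
```

Under these assumptions a tree block or shared block ref takes 9 bytes, a shared data ref 13 bytes, and an extent data ref 29 bytes (one type byte followed by the 28-byte data ref).
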
@@ -1558,6 +1658,21 @@ static inline int btrfs_clear_header_flag(struct extent_buffer *eb, u64 flag)
        return (flags & flag) == flag;
 }
 
+static inline int btrfs_header_backref_rev(struct extent_buffer *eb)
+{
+       u64 flags = btrfs_header_flags(eb);
+       return flags >> BTRFS_BACKREF_REV_SHIFT;
+}
+
+static inline void btrfs_set_header_backref_rev(struct extent_buffer *eb,
+                                               int rev)
+{
+       u64 flags = btrfs_header_flags(eb);
+       flags &= ~BTRFS_BACKREF_REV_MASK;
+       flags |= (u64)rev << BTRFS_BACKREF_REV_SHIFT;
+       btrfs_set_header_flags(eb, flags);
+}
+
 static inline u8 *btrfs_header_fsid(struct extent_buffer *eb)
 {
        unsigned long ptr = offsetof(struct btrfs_header, fsid);
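
Note on the backref revision helpers in the hunk above: the revision is stored in the top 8 bits of the 64-bit header flags word, so setting it must not disturb the low flag bits such as BTRFS_HEADER_FLAG_WRITTEN. A minimal userspace sketch of the same bit manipulation follows. It operates on a plain uint64_t instead of an extent buffer, and the names flags_backref_rev() and flags_set_backref_rev() are hypothetical; the constants are copied from this patch.

```c
#include <stdint.h>

#define BACKREF_REV_MAX		256
#define BACKREF_REV_SHIFT	56
#define BACKREF_REV_MASK	(((uint64_t)BACKREF_REV_MAX - 1) << \
				 BACKREF_REV_SHIFT)

#define OLD_BACKREF_REV		0
#define MIXED_BACKREF_REV	1

/* Extract the backref revision from the top byte of the flags word. */
static inline int flags_backref_rev(uint64_t flags)
{
	return (int)(flags >> BACKREF_REV_SHIFT);
}

/* Return flags with the revision replaced and all other bits preserved. */
static inline uint64_t flags_set_backref_rev(uint64_t flags, int rev)
{
	flags &= ~BACKREF_REV_MASK;
	flags |= (uint64_t)rev << BACKREF_REV_SHIFT;
	return flags;
}
```

For example, flags_set_backref_rev(0x3, MIXED_BACKREF_REV) keeps the two low flag bits set while flags_backref_rev() of the result reads back 1.
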
@@ -1790,39 +1905,32 @@ int btrfs_update_pinned_extents(struct btrfs_root *root,
 int btrfs_drop_leaf_ref(struct btrfs_trans_handle *trans,
                        struct btrfs_root *root, struct extent_buffer *leaf);
 int btrfs_cross_ref_exist(struct btrfs_trans_handle *trans,
-                         struct btrfs_root *root, u64 objectid, u64 bytenr);
+                         struct btrfs_root *root,
+                         u64 objectid, u64 offset, u64 bytenr);
 int btrfs_copy_pinned(struct btrfs_root *root, struct extent_io_tree *copy);
 struct btrfs_block_group_cache *btrfs_lookup_block_group(
                                                 struct btrfs_fs_info *info,
                                                 u64 bytenr);
+void btrfs_put_block_group(struct btrfs_block_group_cache *cache);
 u64 btrfs_find_block_group(struct btrfs_root *root,
                           u64 search_start, u64 search_hint, int owner);
 struct extent_buffer *btrfs_alloc_free_block(struct btrfs_trans_handle *trans,
-                                            struct btrfs_root *root,
-                                            u32 blocksize, u64 parent,
-                                            u64 root_objectid,
-                                            u64 ref_generation,
-                                            int level,
-                                            u64 hint,
-                                            u64 empty_size);
+                                       struct btrfs_root *root, u32 blocksize,
+                                       u64 parent, u64 root_objectid,
+                                       struct btrfs_disk_key *key, int level,
+                                       u64 hint, u64 empty_size);
 struct extent_buffer *btrfs_init_new_buffer(struct btrfs_trans_handle *trans,
                                            struct btrfs_root *root,
                                            u64 bytenr, u32 blocksize,
                                            int level);
-int btrfs_alloc_extent(struct btrfs_trans_handle *trans,
-                      struct btrfs_root *root,
-                      u64 num_bytes, u64 parent, u64 min_bytes,
-                      u64 root_objectid, u64 ref_generation,
-                      u64 owner, u64 empty_size, u64 hint_byte,
-                      u64 search_end, struct btrfs_key *ins, u64 data);
-int btrfs_alloc_reserved_extent(struct btrfs_trans_handle *trans,
-                               struct btrfs_root *root, u64 parent,
-                               u64 root_objectid, u64 ref_generation,
-                               u64 owner, struct btrfs_key *ins);
-int btrfs_alloc_logged_extent(struct btrfs_trans_handle *trans,
-                               struct btrfs_root *root, u64 parent,
-                               u64 root_objectid, u64 ref_generation,
-                               u64 owner, struct btrfs_key *ins);
+int btrfs_alloc_reserved_file_extent(struct btrfs_trans_handle *trans,
+                                    struct btrfs_root *root,
+                                    u64 root_objectid, u64 owner,
+                                    u64 offset, struct btrfs_key *ins);
+int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
+                                  struct btrfs_root *root,
+                                  u64 root_objectid, u64 owner, u64 offset,
+                                  struct btrfs_key *ins);
 int btrfs_reserve_extent(struct btrfs_trans_handle *trans,
                                  struct btrfs_root *root,
                                  u64 num_bytes, u64 min_alloc_size,
@@ -1830,18 +1938,18 @@ int btrfs_reserve_extent(struct btrfs_trans_handle *trans,
                                  u64 search_end, struct btrfs_key *ins,
                                  u64 data);
 int btrfs_inc_ref(struct btrfs_trans_handle *trans, struct btrfs_root *root,
-                 struct extent_buffer *orig_buf, struct extent_buffer *buf,
-                 u32 *nr_extents);
-int btrfs_cache_ref(struct btrfs_trans_handle *trans, struct btrfs_root *root,
-                   struct extent_buffer *buf, u32 nr_extents);
-int btrfs_update_ref(struct btrfs_trans_handle *trans,
-                    struct btrfs_root *root, struct extent_buffer *orig_buf,
-                    struct extent_buffer *buf, int start_slot, int nr);
+                 struct extent_buffer *buf, int full_backref);
+int btrfs_dec_ref(struct btrfs_trans_handle *trans, struct btrfs_root *root,
+                 struct extent_buffer *buf, int full_backref);
+int btrfs_set_disk_extent_flags(struct btrfs_trans_handle *trans,
+                               struct btrfs_root *root,
+                               u64 bytenr, u64 num_bytes, u64 flags,
+                               int is_data);
 int btrfs_free_extent(struct btrfs_trans_handle *trans,
                      struct btrfs_root *root,
                      u64 bytenr, u64 num_bytes, u64 parent,
-                     u64 root_objectid, u64 ref_generation,
-                     u64 owner_objectid, int pin);
+                     u64 root_objectid, u64 owner, u64 offset);
+
 int btrfs_free_reserved_extent(struct btrfs_root *root, u64 start, u64 len);
 int btrfs_finish_extent_commit(struct btrfs_trans_handle *trans,
                               struct btrfs_root *root,
@@ -1849,13 +1957,8 @@ int btrfs_finish_extent_commit(struct btrfs_trans_handle *trans,
 int btrfs_inc_extent_ref(struct btrfs_trans_handle *trans,
                         struct btrfs_root *root,
                         u64 bytenr, u64 num_bytes, u64 parent,
-                        u64 root_objectid, u64 ref_generation,
-                        u64 owner_objectid);
-int btrfs_update_extent_ref(struct btrfs_trans_handle *trans,
-                           struct btrfs_root *root, u64 bytenr, u64 num_bytes,
-                           u64 orig_parent, u64 parent,
-                           u64 root_objectid, u64 ref_generation,
-                           u64 owner_objectid);
+                        u64 root_objectid, u64 owner, u64 offset);
+
 int btrfs_write_dirty_block_groups(struct btrfs_trans_handle *trans,
                                    struct btrfs_root *root);
 int btrfs_extent_readonly(struct btrfs_root *root, u64 bytenr);
@@ -1867,16 +1970,9 @@ int btrfs_make_block_group(struct btrfs_trans_handle *trans,
                           u64 size);
 int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
                             struct btrfs_root *root, u64 group_start);
-int btrfs_relocate_block_group(struct btrfs_root *root, u64 group_start);
-int btrfs_free_reloc_root(struct btrfs_trans_handle *trans,
-                         struct btrfs_root *root);
-int btrfs_drop_dead_reloc_roots(struct btrfs_root *root);
-int btrfs_reloc_tree_cache_ref(struct btrfs_trans_handle *trans,
-                              struct btrfs_root *root,
-                              struct extent_buffer *buf, u64 orig_start);
-int btrfs_add_dead_reloc_root(struct btrfs_root *root);
-int btrfs_cleanup_reloc_trees(struct btrfs_root *root);
-int btrfs_reloc_clone_csums(struct inode *inode, u64 file_pos, u64 len);
+int btrfs_prepare_block_group_relocation(struct btrfs_root *root,
+                               struct btrfs_block_group_cache *group);
+
 u64 btrfs_reduce_alloc_profile(struct btrfs_root *root, u64 flags);
 void btrfs_set_inode_space_info(struct btrfs_root *root, struct inode *ionde);
 void btrfs_clear_space_info_full(struct btrfs_fs_info *info);
@@ -1891,13 +1987,12 @@ void btrfs_delalloc_reserve_space(struct btrfs_root *root, struct inode *inode,
 void btrfs_delalloc_free_space(struct btrfs_root *root, struct inode *inode,
                              u64 bytes);
 /* ctree.c */
+int btrfs_bin_search(struct extent_buffer *eb, struct btrfs_key *key,
+                    int level, int *slot);
+int btrfs_comp_cpu_keys(struct btrfs_key *k1, struct btrfs_key *k2);
 int btrfs_previous_item(struct btrfs_root *root,
                        struct btrfs_path *path, u64 min_objectid,
                        int type);
-int btrfs_merge_path(struct btrfs_trans_handle *trans,
-                    struct btrfs_root *root,
-                    struct btrfs_key *node_keys,
-                    u64 *nodes, int lowest_level);
 int btrfs_set_item_key_safe(struct btrfs_trans_handle *trans,
                            struct btrfs_root *root, struct btrfs_path *path,
                            struct btrfs_key *new_key);
@@ -1918,6 +2013,8 @@ int btrfs_copy_root(struct btrfs_trans_handle *trans,
                      struct btrfs_root *root,
                      struct extent_buffer *buf,
                      struct extent_buffer **cow_ret, u64 new_root_objectid);
+int btrfs_block_can_be_shared(struct btrfs_root *root,
+                             struct extent_buffer *buf);
 int btrfs_extend_item(struct btrfs_trans_handle *trans, struct btrfs_root
                      *root, struct btrfs_path *path, u32 data_size);
 int btrfs_truncate_item(struct btrfs_trans_handle *trans,
@@ -1944,9 +2041,6 @@ void btrfs_unlock_up_safe(struct btrfs_path *p, int level);
 
 int btrfs_del_items(struct btrfs_trans_handle *trans, struct btrfs_root *root,
                   struct btrfs_path *path, int slot, int nr);
-int btrfs_del_leaf(struct btrfs_trans_handle *trans,
-                           struct btrfs_root *root,
-                           struct btrfs_path *path, u64 bytenr);
 static inline int btrfs_del_item(struct btrfs_trans_handle *trans,
                                 struct btrfs_root *root,
                                 struct btrfs_path *path)
@@ -2005,8 +2099,9 @@ int btrfs_find_last_root(struct btrfs_root *root, u64 objectid, struct
                         btrfs_root_item *item, struct btrfs_key *key);
 int btrfs_search_root(struct btrfs_root *root, u64 search_start,
                      u64 *found_objectid);
-int btrfs_find_dead_roots(struct btrfs_root *root, u64 objectid,
-                         struct btrfs_root *latest_root);
+int btrfs_find_dead_roots(struct btrfs_root *root, u64 objectid);
+int btrfs_set_root_node(struct btrfs_root_item *item,
+                       struct extent_buffer *node);
 /* dir-item.c */
 int btrfs_insert_dir_item(struct btrfs_trans_handle *trans,
                          struct btrfs_root *root, const char *name,
@@ -2139,7 +2234,6 @@ int btrfs_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf);
 int btrfs_readpage(struct file *file, struct page *page);
 void btrfs_delete_inode(struct inode *inode);
 void btrfs_put_inode(struct inode *inode);
-void btrfs_read_locked_inode(struct inode *inode);
 int btrfs_write_inode(struct inode *inode, int wait);
 void btrfs_dirty_inode(struct inode *inode);
 struct inode *btrfs_alloc_inode(struct super_block *sb);
@@ -2147,12 +2241,8 @@ void btrfs_destroy_inode(struct inode *inode);
 int btrfs_init_cachep(void);
 void btrfs_destroy_cachep(void);
 long btrfs_ioctl_trans_end(struct file *file);
-struct inode *btrfs_ilookup(struct super_block *s, u64 objectid,
-                           struct btrfs_root *root, int wait);
-struct inode *btrfs_iget_locked(struct super_block *s, u64 objectid,
-                               struct btrfs_root *root);
 struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
-                        struct btrfs_root *root, int *is_new);
+                        struct btrfs_root *root);
 int btrfs_commit_write(struct file *file, struct page *page,
                       unsigned from, unsigned to);
 struct extent_map *btrfs_get_extent(struct inode *inode, struct page *page,
@@ -2209,4 +2299,12 @@ int btrfs_check_acl(struct inode *inode, int mask);
 int btrfs_init_acl(struct inode *inode, struct inode *dir);
 int btrfs_acl_chmod(struct inode *inode);
 
+/* relocation.c */
+int btrfs_relocate_block_group(struct btrfs_root *root, u64 group_start);
+int btrfs_init_reloc_root(struct btrfs_trans_handle *trans,
+                         struct btrfs_root *root);
+int btrfs_update_reloc_root(struct btrfs_trans_handle *trans,
+                           struct btrfs_root *root);
+int btrfs_recover_relocation(struct btrfs_root *root);
+int btrfs_reloc_clone_csums(struct inode *inode, u64 file_pos, u64 len);
 #endif
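
Note on the header-flag helper added to ctree.h above (btrfs_set_header_backref_rev): it packs the backref revision into the top bits of the 64-bit header flags word with a clear-then-set pattern. The standalone sketch below mirrors that pattern outside the kernel. The shift and mask values, and the names set_backref_rev and get_backref_rev, are assumptions chosen for illustration; the real constants are defined elsewhere in ctree.h and are not shown in this part of the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; see ctree.h for the real ones. */
#define BACKREF_REV_SHIFT 56
#define BACKREF_REV_MASK  (((uint64_t)0xff) << BACKREF_REV_SHIFT)

/* Clear the revision field, then store the new revision in it. */
static uint64_t set_backref_rev(uint64_t flags, int rev)
{
	flags &= ~BACKREF_REV_MASK;
	flags |= (uint64_t)rev << BACKREF_REV_SHIFT;
	return flags;
}

/* Hypothetical getter: extract the revision field from the flags word. */
static int get_backref_rev(uint64_t flags)
{
	return (int)((flags & BACKREF_REV_MASK) >> BACKREF_REV_SHIFT);
}
```

Clearing the field before OR-ing in the new value is what lets the setter lower a revision as well as raise it, and it leaves the unrelated low flag bits untouched.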
index d6c01c0..84e6781 100644 (file)
  * add extents in the middle of btrfs_search_slot, and it allows
  * us to buffer up frequently modified backrefs in an rb tree instead
  * of hammering updates on the extent allocation tree.
- *
- * Right now this code is only used for reference counted trees, but
- * the long term goal is to get rid of the similar code for delayed
- * extent tree modifications.
  */
 
 /*
- * entries in the rb tree are ordered by the byte number of the extent
- * and by the byte number of the parent block.
+ * compare two delayed tree backrefs with same bytenr and type
+ */
+static int comp_tree_refs(struct btrfs_delayed_tree_ref *ref2,
+                         struct btrfs_delayed_tree_ref *ref1)
+{
+       if (ref1->node.type == BTRFS_TREE_BLOCK_REF_KEY) {
+               if (ref1->root < ref2->root)
+                       return -1;
+               if (ref1->root > ref2->root)
+                       return 1;
+       } else {
+               if (ref1->parent < ref2->parent)
+                       return -1;
+               if (ref1->parent > ref2->parent)
+                       return 1;
+       }
+       return 0;
+}
+
+/*
+ * compare two delayed data backrefs with same bytenr and type
  */
-static int comp_entry(struct btrfs_delayed_ref_node *ref,
-                     u64 bytenr, u64 parent)
+static int comp_data_refs(struct btrfs_delayed_data_ref *ref2,
+                         struct btrfs_delayed_data_ref *ref1)
 {
-       if (bytenr < ref->bytenr)
+       if (ref1->node.type == BTRFS_EXTENT_DATA_REF_KEY) {
+               if (ref1->root < ref2->root)
+                       return -1;
+               if (ref1->root > ref2->root)
+                       return 1;
+               if (ref1->objectid < ref2->objectid)
+                       return -1;
+               if (ref1->objectid > ref2->objectid)
+                       return 1;
+               if (ref1->offset < ref2->offset)
+                       return -1;
+               if (ref1->offset > ref2->offset)
+                       return 1;
+       } else {
+               if (ref1->parent < ref2->parent)
+                       return -1;
+               if (ref1->parent > ref2->parent)
+                       return 1;
+       }
+       return 0;
+}
+
+/*
+ * entries in the rb tree are ordered by the byte number of the extent,
+ * type of the delayed backrefs and content of delayed backrefs.
+ */
+static int comp_entry(struct btrfs_delayed_ref_node *ref2,
+                     struct btrfs_delayed_ref_node *ref1)
+{
+       if (ref1->bytenr < ref2->bytenr)
                return -1;
-       if (bytenr > ref->bytenr)
+       if (ref1->bytenr > ref2->bytenr)
                return 1;
-       if (parent < ref->parent)
+       if (ref1->is_head && ref2->is_head)
+               return 0;
+       if (ref2->is_head)
                return -1;
-       if (parent > ref->parent)
+       if (ref1->is_head)
                return 1;
+       if (ref1->type < ref2->type)
+               return -1;
+       if (ref1->type > ref2->type)
+               return 1;
+       if (ref1->type == BTRFS_TREE_BLOCK_REF_KEY ||
+           ref1->type == BTRFS_SHARED_BLOCK_REF_KEY) {
+               return comp_tree_refs(btrfs_delayed_node_to_tree_ref(ref2),
+                                     btrfs_delayed_node_to_tree_ref(ref1));
+       } else if (ref1->type == BTRFS_EXTENT_DATA_REF_KEY ||
+                  ref1->type == BTRFS_SHARED_DATA_REF_KEY) {
+               return comp_data_refs(btrfs_delayed_node_to_data_ref(ref2),
+                                     btrfs_delayed_node_to_data_ref(ref1));
+       }
+       BUG();
        return 0;
 }
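
Note on the ordering that comp_entry() above produces: entries are sorted by bytenr first; for equal bytenr, every non-head ref sorts before the head node, and non-head refs are then ordered by type. The sketch below reproduces only that part of the comparison as a qsort-style comparator so the resulting order can be inspected in user space. struct ref_node, cmp_ref and sort_refs are hypothetical names, the type values are placeholders rather than the real key constants, and the per-type content comparison (comp_tree_refs / comp_data_refs) is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for struct btrfs_delayed_ref_node. */
struct ref_node {
	uint64_t bytenr;
	int is_head;
	int type;	/* placeholder value, not a real BTRFS_*_KEY */
};

/* Negative if a sorts before b; a plays the role of ref1, b of ref2. */
static int cmp_ref(const void *pa, const void *pb)
{
	const struct ref_node *a = pa;
	const struct ref_node *b = pb;

	if (a->bytenr < b->bytenr)
		return -1;
	if (a->bytenr > b->bytenr)
		return 1;
	if (a->is_head && b->is_head)
		return 0;
	if (b->is_head)		/* non-head refs sort before the head */
		return -1;
	if (a->is_head)
		return 1;
	if (a->type < b->type)
		return -1;
	if (a->type > b->type)
		return 1;
	return 0;
}

static void sort_refs(struct ref_node *v, size_t n)
{
	qsort(v, n, sizeof(*v), cmp_ref);
}
```

Because the head is the last entry for a given bytenr, a lookup that lands on the head can reach the refs queued for the same extent by stepping backwards, which appears to be how the rb_prev() call in btrfs_delayed_ref_pending() uses it.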
 
@@ -59,20 +119,21 @@ static int comp_entry(struct btrfs_delayed_ref_node *ref,
  * inserted.
  */
 static struct btrfs_delayed_ref_node *tree_insert(struct rb_root *root,
-                                                 u64 bytenr, u64 parent,
                                                  struct rb_node *node)
 {
        struct rb_node **p = &root->rb_node;
        struct rb_node *parent_node = NULL;
        struct btrfs_delayed_ref_node *entry;
+       struct btrfs_delayed_ref_node *ins;
        int cmp;
 
+       ins = rb_entry(node, struct btrfs_delayed_ref_node, rb_node);
        while (*p) {
                parent_node = *p;
                entry = rb_entry(parent_node, struct btrfs_delayed_ref_node,
                                 rb_node);
 
-               cmp = comp_entry(entry, bytenr, parent);
+               cmp = comp_entry(entry, ins);
                if (cmp < 0)
                        p = &(*p)->rb_left;
                else if (cmp > 0)
@@ -81,18 +142,17 @@ static struct btrfs_delayed_ref_node *tree_insert(struct rb_root *root,
                        return entry;
        }
 
-       entry = rb_entry(node, struct btrfs_delayed_ref_node, rb_node);
        rb_link_node(node, parent_node, p);
        rb_insert_color(node, root);
        return NULL;
 }
 
 /*
- * find an entry based on (bytenr,parent).  This returns the delayed
- * ref if it was able to find one, or NULL if nothing was in that spot
+ * find a head entry based on bytenr. This returns the delayed ref
+ * head if it was able to find one, or NULL if nothing was in that spot
  */
-static struct btrfs_delayed_ref_node *tree_search(struct rb_root *root,
-                                 u64 bytenr, u64 parent,
+static struct btrfs_delayed_ref_node *find_ref_head(struct rb_root *root,
+                                 u64 bytenr,
                                  struct btrfs_delayed_ref_node **last)
 {
        struct rb_node *n = root->rb_node;
@@ -105,7 +165,15 @@ static struct btrfs_delayed_ref_node *tree_search(struct rb_root *root,
                if (last)
                        *last = entry;
 
-               cmp = comp_entry(entry, bytenr, parent);
+               if (bytenr < entry->bytenr)
+                       cmp = -1;
+               else if (bytenr > entry->bytenr)
+                       cmp = 1;
+               else if (!btrfs_delayed_ref_is_head(entry))
+                       cmp = 1;
+               else
+                       cmp = 0;
+
                if (cmp < 0)
                        n = n->rb_left;
                else if (cmp > 0)
@@ -154,7 +222,7 @@ int btrfs_find_ref_cluster(struct btrfs_trans_handle *trans,
                node = rb_first(&delayed_refs->root);
        } else {
                ref = NULL;
-               tree_search(&delayed_refs->root, start, (u64)-1, &ref);
+               find_ref_head(&delayed_refs->root, start, &ref);
                if (ref) {
                        struct btrfs_delayed_ref_node *tmp;
 
@@ -234,7 +302,7 @@ int btrfs_delayed_ref_pending(struct btrfs_trans_handle *trans, u64 bytenr)
        delayed_refs = &trans->transaction->delayed_refs;
        spin_lock(&delayed_refs->lock);
 
-       ref = tree_search(&delayed_refs->root, bytenr, (u64)-1, NULL);
+       ref = find_ref_head(&delayed_refs->root, bytenr, NULL);
        if (ref) {
                prev_node = rb_prev(&ref->rb_node);
                if (!prev_node)
@@ -250,25 +318,28 @@ out:
 }
 
 /*
- * helper function to lookup reference count
+ * helper function to lookup reference count and flags of extent.
  *
  * the head node for delayed ref is used to store the sum of all the
- * reference count modifications queued up in the rbtree.  This way you
- * can check to see what the reference count would be if all of the
- * delayed refs are processed.
+ * reference count modifications queued up in the rbtree. the head
+ * node may also store the extent flags to set. This way you can check
+ * to see what the reference count and extent flags would be if all of
+ * the delayed refs are processed.
  */
-int btrfs_lookup_extent_ref(struct btrfs_trans_handle *trans,
-                           struct btrfs_root *root, u64 bytenr,
-                           u64 num_bytes, u32 *refs)
+int btrfs_lookup_extent_info(struct btrfs_trans_handle *trans,
+                            struct btrfs_root *root, u64 bytenr,
+                            u64 num_bytes, u64 *refs, u64 *flags)
 {
        struct btrfs_delayed_ref_node *ref;
        struct btrfs_delayed_ref_head *head;
        struct btrfs_delayed_ref_root *delayed_refs;
        struct btrfs_path *path;
-       struct extent_buffer *leaf;
        struct btrfs_extent_item *ei;
+       struct extent_buffer *leaf;
        struct btrfs_key key;
-       u32 num_refs;
+       u32 item_size;
+       u64 num_refs;
+       u64 extent_flags;
        int ret;
 
        path = btrfs_alloc_path();
@@ -287,37 +358,60 @@ again:
 
        if (ret == 0) {
                leaf = path->nodes[0];
-               ei = btrfs_item_ptr(leaf, path->slots[0],
-                                   struct btrfs_extent_item);
-               num_refs = btrfs_extent_refs(leaf, ei);
+               item_size = btrfs_item_size_nr(leaf, path->slots[0]);
+               if (item_size >= sizeof(*ei)) {
+                       ei = btrfs_item_ptr(leaf, path->slots[0],
+                                           struct btrfs_extent_item);
+                       num_refs = btrfs_extent_refs(leaf, ei);
+                       extent_flags = btrfs_extent_flags(leaf, ei);
+               } else {
+#ifdef BTRFS_COMPAT_EXTENT_TREE_V0
+                       struct btrfs_extent_item_v0 *ei0;
+                       BUG_ON(item_size != sizeof(*ei0));
+                       ei0 = btrfs_item_ptr(leaf, path->slots[0],
+                                            struct btrfs_extent_item_v0);
+                       num_refs = btrfs_extent_refs_v0(leaf, ei0);
+                       /* FIXME: this isn't correct for data */
+                       extent_flags = BTRFS_BLOCK_FLAG_FULL_BACKREF;
+#else
+                       BUG();
+#endif
+               }
+               BUG_ON(num_refs == 0);
        } else {
                num_refs = 0;
+               extent_flags = 0;
                ret = 0;
        }
 
        spin_lock(&delayed_refs->lock);
-       ref = tree_search(&delayed_refs->root, bytenr, (u64)-1, NULL);
+       ref = find_ref_head(&delayed_refs->root, bytenr, NULL);
        if (ref) {
                head = btrfs_delayed_node_to_head(ref);
-               if (mutex_trylock(&head->mutex)) {
-                       num_refs += ref->ref_mod;
-                       mutex_unlock(&head->mutex);
-                       *refs = num_refs;
-                       goto out;
-               }
+               if (!mutex_trylock(&head->mutex)) {
+                       atomic_inc(&ref->refs);
+                       spin_unlock(&delayed_refs->lock);
 
-               atomic_inc(&ref->refs);
-               spin_unlock(&delayed_refs->lock);
+                       btrfs_release_path(root->fs_info->extent_root, path);
 
-               btrfs_release_path(root->fs_info->extent_root, path);
+                       mutex_lock(&head->mutex);
+                       mutex_unlock(&head->mutex);
+                       btrfs_put_delayed_ref(ref);
+                       goto again;
+               }
+               if (head->extent_op && head->extent_op->update_flags)
+                       extent_flags |= head->extent_op->flags_to_set;
+               else
+                       BUG_ON(num_refs == 0);
 
-               mutex_lock(&head->mutex);
+               num_refs += ref->ref_mod;
                mutex_unlock(&head->mutex);
-               btrfs_put_delayed_ref(ref);
-               goto again;
-       } else {
-               *refs = num_refs;
        }
+       WARN_ON(num_refs == 0);
+       if (refs)
+               *refs = num_refs;
+       if (flags)
+               *flags = extent_flags;
 out:
        spin_unlock(&delayed_refs->lock);
        btrfs_free_path(path);
@@ -338,16 +432,7 @@ update_existing_ref(struct btrfs_trans_handle *trans,
                    struct btrfs_delayed_ref_node *existing,
                    struct btrfs_delayed_ref_node *update)
 {
-       struct btrfs_delayed_ref *existing_ref;
-       struct btrfs_delayed_ref *ref;
-
-       existing_ref = btrfs_delayed_node_to_ref(existing);
-       ref = btrfs_delayed_node_to_ref(update);
-
-       if (ref->pin)
-               existing_ref->pin = 1;
-
-       if (ref->action != existing_ref->action) {
+       if (update->action != existing->action) {
                /*
                 * this is effectively undoing either an add or a
                 * drop.  We decrement the ref_mod, and if it goes
@@ -363,20 +448,13 @@ update_existing_ref(struct btrfs_trans_handle *trans,
                        delayed_refs->num_entries--;
                        if (trans->delayed_ref_updates)
                                trans->delayed_ref_updates--;
+               } else {
+                       WARN_ON(existing->type == BTRFS_TREE_BLOCK_REF_KEY ||
+                               existing->type == BTRFS_SHARED_BLOCK_REF_KEY);
                }
        } else {
-               if (existing_ref->action == BTRFS_ADD_DELAYED_REF) {
-                       /* if we're adding refs, make sure all the
-                        * details match up.  The extent could
-                        * have been totally freed and reallocated
-                        * by a different owner before the delayed
-                        * ref entries were removed.
-                        */
-                       existing_ref->owner_objectid = ref->owner_objectid;
-                       existing_ref->generation = ref->generation;
-                       existing_ref->root = ref->root;
-                       existing->num_bytes = update->num_bytes;
-               }
+               WARN_ON(existing->type == BTRFS_TREE_BLOCK_REF_KEY ||
+                       existing->type == BTRFS_SHARED_BLOCK_REF_KEY);
                /*
                 * the action on the existing ref matches
                 * the action on the ref we're trying to add.
@@ -401,6 +479,7 @@ update_existing_head_ref(struct btrfs_delayed_ref_node *existing,
 
        existing_ref = btrfs_delayed_node_to_head(existing);
        ref = btrfs_delayed_node_to_head(update);
+       BUG_ON(existing_ref->is_data != ref->is_data);
 
        if (ref->must_insert_reserved) {
                /* if the extent was freed and then
@@ -420,6 +499,24 @@ update_existing_head_ref(struct btrfs_delayed_ref_node *existing,
 
        }
 
+       if (ref->extent_op) {
+               if (!existing_ref->extent_op) {
+                       existing_ref->extent_op = ref->extent_op;
+               } else {
+                       if (ref->extent_op->update_key) {
+                               memcpy(&existing_ref->extent_op->key,
+                                      &ref->extent_op->key,
+                                      sizeof(ref->extent_op->key));
+                               existing_ref->extent_op->update_key = 1;
+                       }
+                       if (ref->extent_op->update_flags) {
+                               existing_ref->extent_op->flags_to_set |=
+                                       ref->extent_op->flags_to_set;
+                               existing_ref->extent_op->update_flags = 1;
+                       }
+                       kfree(ref->extent_op);
+               }
+       }
        /*
         * update the reference mod on the head to reflect this new operation
         */
@@ -427,19 +524,16 @@ update_existing_head_ref(struct btrfs_delayed_ref_node *existing,
 }
 
 /*
- * helper function to actually insert a delayed ref into the rbtree.
+ * helper function to actually insert a head node into the rbtree.
  * this does all the dirty work in terms of maintaining the correct
- * overall modification count in the head node and properly dealing
- * with updating existing nodes as new modifications are queued.
+ * overall modification count.
  */
-static noinline int __btrfs_add_delayed_ref(struct btrfs_trans_handle *trans,
-                         struct btrfs_delayed_ref_node *ref,
-                         u64 bytenr, u64 num_bytes, u64 parent, u64 ref_root,
-                         u64 ref_generation, u64 owner_objectid, int action,
-                         int pin)
+static noinline int add_delayed_ref_head(struct btrfs_trans_handle *trans,
+                                       struct btrfs_delayed_ref_node *ref,
+                                       u64 bytenr, u64 num_bytes,
+                                       int action, int is_data)
 {
        struct btrfs_delayed_ref_node *existing;
-       struct btrfs_delayed_ref *full_ref;
        struct btrfs_delayed_ref_head *head_ref = NULL;
        struct btrfs_delayed_ref_root *delayed_refs;
        int count_mod = 1;
@@ -449,12 +543,10 @@ static noinline int __btrfs_add_delayed_ref(struct btrfs_trans_handle *trans,
         * the head node stores the sum of all the mods, so dropping a ref
         * should drop the sum in the head node by one.
         */
-       if (parent == (u64)-1) {
-               if (action == BTRFS_DROP_DELAYED_REF)
-                       count_mod = -1;
-               else if (action == BTRFS_UPDATE_DELAYED_HEAD)
-                       count_mod = 0;
-       }
+       if (action == BTRFS_UPDATE_DELAYED_HEAD)
+               count_mod = 0;
+       else if (action == BTRFS_DROP_DELAYED_REF)
+               count_mod = -1;
 
        /*
         * BTRFS_ADD_DELAYED_EXTENT means that we need to update
@@ -467,57 +559,148 @@ static noinline int __btrfs_add_delayed_ref(struct btrfs_trans_handle *trans,
         * Once we record must_insert_reserved, switch the action to
         * BTRFS_ADD_DELAYED_REF because other special casing is not required.
         */
-       if (action == BTRFS_ADD_DELAYED_EXTENT) {
+       if (action == BTRFS_ADD_DELAYED_EXTENT)
                must_insert_reserved = 1;
-               action = BTRFS_ADD_DELAYED_REF;
-       } else {
+       else
                must_insert_reserved = 0;
-       }
-
 
        delayed_refs = &trans->transaction->delayed_refs;
 
        /* first set the basic ref node struct up */
        atomic_set(&ref->refs, 1);
        ref->bytenr = bytenr;
-       ref->parent = parent;
+       ref->num_bytes = num_bytes;
        ref->ref_mod = count_mod;
+       ref->type  = 0;
+       ref->action  = 0;
+       ref->is_head = 1;
        ref->in_tree = 1;
+
+       head_ref = btrfs_delayed_node_to_head(ref);
+       head_ref->must_insert_reserved = must_insert_reserved;
+       head_ref->is_data = is_data;
+
+       INIT_LIST_HEAD(&head_ref->cluster);
+       mutex_init(&head_ref->mutex);
+
+       existing = tree_insert(&delayed_refs->root, &ref->rb_node);
+
+       if (existing) {
+               update_existing_head_ref(existing, ref);
+               /*
+                * we've updated the existing ref, free the newly
+                * allocated ref
+                */
+               kfree(ref);
+       } else {
+               delayed_refs->num_heads++;
+               delayed_refs->num_heads_ready++;
+               delayed_refs->num_entries++;
+               trans->delayed_ref_updates++;
+       }
+       return 0;
+}
+
+/*
+ * helper to insert a delayed tree ref into the rbtree.
+ */
+static noinline int add_delayed_tree_ref(struct btrfs_trans_handle *trans,
+                                        struct btrfs_delayed_ref_node *ref,
+                                        u64 bytenr, u64 num_bytes, u64 parent,
+                                        u64 ref_root, int level, int action)
+{
+       struct btrfs_delayed_ref_node *existing;
+       struct btrfs_delayed_tree_ref *full_ref;
+       struct btrfs_delayed_ref_root *delayed_refs;
+
+       if (action == BTRFS_ADD_DELAYED_EXTENT)
+               action = BTRFS_ADD_DELAYED_REF;
+
+       delayed_refs = &trans->transaction->delayed_refs;
+
+       /* first set the basic ref node struct up */
+       atomic_set(&ref->refs, 1);
+       ref->bytenr = bytenr;
        ref->num_bytes = num_bytes;
+       ref->ref_mod = 1;
+       ref->action = action;
+       ref->is_head = 0;
+       ref->in_tree = 1;
 
-       if (btrfs_delayed_ref_is_head(ref)) {
-               head_ref = btrfs_delayed_node_to_head(ref);
-               head_ref->must_insert_reserved = must_insert_reserved;
-               INIT_LIST_HEAD(&head_ref->cluster);
-               mutex_init(&head_ref->mutex);
+       full_ref = btrfs_delayed_node_to_tree_ref(ref);
+       if (parent) {
+               full_ref->parent = parent;
+               ref->type = BTRFS_SHARED_BLOCK_REF_KEY;
        } else {
-               full_ref = btrfs_delayed_node_to_ref(ref);
                full_ref->root = ref_root;
-               full_ref->generation = ref_generation;
-               full_ref->owner_objectid = owner_objectid;
-               full_ref->pin = pin;
-               full_ref->action = action;
+               ref->type = BTRFS_TREE_BLOCK_REF_KEY;
        }
+       full_ref->level = level;
 
-       existing = tree_insert(&delayed_refs->root, bytenr,
-                              parent, &ref->rb_node);
+       existing = tree_insert(&delayed_refs->root, &ref->rb_node);
 
        if (existing) {
-               if (btrfs_delayed_ref_is_head(ref))
-                       update_existing_head_ref(existing, ref);
-               else
-                       update_existing_ref(trans, delayed_refs, existing, ref);
+               update_existing_ref(trans, delayed_refs, existing, ref);
+               /*
+                * we've updated the existing ref, free the newly
+                * allocated ref
+                */
+               kfree(ref);
+       } else {
+               delayed_refs->num_entries++;
+               trans->delayed_ref_updates++;
+       }
+       return 0;
+}
+
+/*
+ * helper to insert a delayed data ref into the rbtree.
+ */
+static noinline int add_delayed_data_ref(struct btrfs_trans_handle *trans,
+                                        struct btrfs_delayed_ref_node *ref,
+                                        u64 bytenr, u64 num_bytes, u64 parent,
+                                        u64 ref_root, u64 owner, u64 offset,
+                                        int action)
+{
+       struct btrfs_delayed_ref_node *existing;
+       struct btrfs_delayed_data_ref *full_ref;
+       struct btrfs_delayed_ref_root *delayed_refs;
+
+       if (action == BTRFS_ADD_DELAYED_EXTENT)
+               action = BTRFS_ADD_DELAYED_REF;
+
+       delayed_refs = &trans->transaction->delayed_refs;
+
+       /* first set the basic ref node struct up */
+       atomic_set(&ref->refs, 1);
+       ref->bytenr = bytenr;
+       ref->num_bytes = num_bytes;
+       ref->ref_mod = 1;
+       ref->action = action;
+       ref->is_head = 0;
+       ref->in_tree = 1;
+
+       full_ref = btrfs_delayed_node_to_data_ref(ref);
+       if (parent) {
+               full_ref->parent = parent;
+               ref->type = BTRFS_SHARED_DATA_REF_KEY;
+       } else {
+               full_ref->root = ref_root;
+               ref->type = BTRFS_EXTENT_DATA_REF_KEY;
+       }
+       full_ref->objectid = owner;
+       full_ref->offset = offset;
 
+       existing = tree_insert(&delayed_refs->root, &ref->rb_node);
+
+       if (existing) {
+               update_existing_ref(trans, delayed_refs, existing, ref);
                /*
                 * we've updated the existing ref, free the newly
                 * allocated ref
                 */
                kfree(ref);
        } else {
-               if (btrfs_delayed_ref_is_head(ref)) {
-                       delayed_refs->num_heads++;
-                       delayed_refs->num_heads_ready++;
-               }
                delayed_refs->num_entries++;
                trans->delayed_ref_updates++;
        }
@@ -525,37 +708,78 @@ static noinline int __btrfs_add_delayed_ref(struct btrfs_trans_handle *trans,
 }
 
 /*
- * add a delayed ref to the tree.  This does all of the accounting required
+ * add a delayed tree ref.  This does all of the accounting required
  * to make sure the delayed ref is eventually processed before this
  * transaction commits.
  */
-int btrfs_add_delayed_ref(struct btrfs_trans_handle *trans,
-                         u64 bytenr, u64 num_bytes, u64 parent, u64 ref_root,
-                         u64 ref_generation, u64 owner_objectid, int action,
-                         int pin)
+int btrfs_add_delayed_tree_ref(struct btrfs_trans_handle *trans,
+                              u64 bytenr, u64 num_bytes, u64 parent,
+                              u64 ref_root,  int level, int action,
+                              struct btrfs_delayed_extent_op *extent_op)
 {
-       struct btrfs_delayed_ref *ref;
+       struct btrfs_delayed_tree_ref *ref;
        struct btrfs_delayed_ref_head *head_ref;
        struct btrfs_delayed_ref_root *delayed_refs;
        int ret;
 
+       BUG_ON(extent_op && extent_op->is_data);
        ref = kmalloc(sizeof(*ref), GFP_NOFS);
        if (!ref)
                return -ENOMEM;
 
+       head_ref = kmalloc(sizeof(*head_ref), GFP_NOFS);
+       if (!head_ref) {
+               kfree(ref);
+               return -ENOMEM;
+       }
+
+       head_ref->extent_op = extent_op;
+
+       delayed_refs = &trans->transaction->delayed_refs;
+       spin_lock(&delayed_refs->lock);
+
        /*
-        * the parent = 0 case comes from cases where we don't actually
-        * know the parent yet.  It will get updated later via a add/drop
-        * pair.
+        * insert both the head node and the new ref without dropping
+        * the spin lock
         */
-       if (parent == 0)
-               parent = bytenr;
+       ret = add_delayed_ref_head(trans, &head_ref->node, bytenr, num_bytes,
+                                  action, 0);
+       BUG_ON(ret);
+
+       ret = add_delayed_tree_ref(trans, &ref->node, bytenr, num_bytes,
+                                  parent, ref_root, level, action);
+       BUG_ON(ret);
+       spin_unlock(&delayed_refs->lock);
+       return 0;
+}
+
+/*
+ * add a delayed data ref. it's similar to btrfs_add_delayed_tree_ref.
+ */
+int btrfs_add_delayed_data_ref(struct btrfs_trans_handle *trans,
+                              u64 bytenr, u64 num_bytes,
+                              u64 parent, u64 ref_root,
+                              u64 owner, u64 offset, int action,
+                              struct btrfs_delayed_extent_op *extent_op)
+{
+       struct btrfs_delayed_data_ref *ref;
+       struct btrfs_delayed_ref_head *head_ref;
+       struct btrfs_delayed_ref_root *delayed_refs;
+       int ret;
+
+       BUG_ON(extent_op && !extent_op->is_data);
+       ref = kmalloc(sizeof(*ref), GFP_NOFS);
+       if (!ref)
+               return -ENOMEM;
 
        head_ref = kmalloc(sizeof(*head_ref), GFP_NOFS);
        if (!head_ref) {
                kfree(ref);
                return -ENOMEM;
        }
+
+       head_ref->extent_op = extent_op;
+
        delayed_refs = &trans->transaction->delayed_refs;
        spin_lock(&delayed_refs->lock);
 
@@ -563,14 +787,39 @@ int btrfs_add_delayed_ref(struct btrfs_trans_handle *trans,
         * insert both the head node and the new ref without dropping
         * the spin lock
         */
-       ret = __btrfs_add_delayed_ref(trans, &head_ref->node, bytenr, num_bytes,
-                                     (u64)-1, 0, 0, 0, action, pin);
+       ret = add_delayed_ref_head(trans, &head_ref->node, bytenr, num_bytes,
+                                  action, 1);
        BUG_ON(ret);
 
-       ret = __btrfs_add_delayed_ref(trans, &ref->node, bytenr, num_bytes,
-                                     parent, ref_root, ref_generation,
-                                     owner_objectid, action, pin);
+       ret = add_delayed_data_ref(trans, &ref->node, bytenr, num_bytes,
+                                  parent, ref_root, owner, offset, action);
+       BUG_ON(ret);
+       spin_unlock(&delayed_refs->lock);
+       return 0;
+}
+
+int btrfs_add_delayed_extent_op(struct btrfs_trans_handle *trans,
+                               u64 bytenr, u64 num_bytes,
+                               struct btrfs_delayed_extent_op *extent_op)
+{
+       struct btrfs_delayed_ref_head *head_ref;
+       struct btrfs_delayed_ref_root *delayed_refs;
+       int ret;
+
+       head_ref = kmalloc(sizeof(*head_ref), GFP_NOFS);
+       if (!head_ref)
+               return -ENOMEM;
+
+       head_ref->extent_op = extent_op;
+
+       delayed_refs = &trans->transaction->delayed_refs;
+       spin_lock(&delayed_refs->lock);
+
+       ret = add_delayed_ref_head(trans, &head_ref->node, bytenr,
+                                  num_bytes, BTRFS_UPDATE_DELAYED_HEAD,
+                                  extent_op->is_data);
        BUG_ON(ret);
+
        spin_unlock(&delayed_refs->lock);
        return 0;
 }
@@ -587,7 +836,7 @@ btrfs_find_delayed_ref_head(struct btrfs_trans_handle *trans, u64 bytenr)
        struct btrfs_delayed_ref_root *delayed_refs;
 
        delayed_refs = &trans->transaction->delayed_refs;
-       ref = tree_search(&delayed_refs->root, bytenr, (u64)-1, NULL);
+       ref = find_ref_head(&delayed_refs->root, bytenr, NULL);
        if (ref)
                return btrfs_delayed_node_to_head(ref);
        return NULL;
@@ -603,6 +852,7 @@ btrfs_find_delayed_ref_head(struct btrfs_trans_handle *trans, u64 bytenr)
  *
  * It is the same as doing a ref add and delete in two separate calls.
  */
+#if 0
 int btrfs_update_delayed_ref(struct btrfs_trans_handle *trans,
                          u64 bytenr, u64 num_bytes, u64 orig_parent,
                          u64 parent, u64 orig_ref_root, u64 ref_root,
@@ -666,3 +916,4 @@ int btrfs_update_delayed_ref(struct btrfs_trans_handle *trans,
        spin_unlock(&delayed_refs->lock);
        return 0;
 }
+#endif
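
Note on the delayed-ref.c changes above: add_delayed_tree_ref() and add_delayed_data_ref() both pick the back ref key type from whether a parent block is given. A non-zero parent yields a shared (full) back ref, and a zero parent yields an implicit back ref keyed by the owning root. The following is a minimal standalone sketch of that decision, not the kernel code. The sketch_* names and enum values are hypothetical stand-ins for the BTRFS_*_REF_KEY constants and do not match their on-disk values.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the four back ref key types used in the
 * patch. The numeric values are arbitrary and NOT the real key values.
 */
enum sketch_ref_type {
	SKETCH_TREE_BLOCK_REF,   /* implicit ref for a tree block (by root) */
	SKETCH_SHARED_BLOCK_REF, /* full ref for a tree block (by parent) */
	SKETCH_EXTENT_DATA_REF,  /* implicit ref for file data (by root) */
	SKETCH_SHARED_DATA_REF,  /* full ref for file data (by parent) */
};

/*
 * Mirror the branch in the two helpers: parent != 0 selects the shared
 * variant, parent == 0 selects the root-keyed variant.
 */
static enum sketch_ref_type sketch_pick_ref_type(uint64_t parent, int is_data)
{
	if (is_data)
		return parent ? SKETCH_SHARED_DATA_REF : SKETCH_EXTENT_DATA_REF;
	return parent ? SKETCH_SHARED_BLOCK_REF : SKETCH_TREE_BLOCK_REF;
}

/* Small self-check covering all four combinations. */
static void sketch_self_check(void)
{
	assert(sketch_pick_ref_type(0, 0) == SKETCH_TREE_BLOCK_REF);
	assert(sketch_pick_ref_type(4096, 0) == SKETCH_SHARED_BLOCK_REF);
	assert(sketch_pick_ref_type(0, 1) == SKETCH_EXTENT_DATA_REF);
	assert(sketch_pick_ref_type(4096, 1) == SKETCH_SHARED_DATA_REF);
}
```

In the patch itself the same field of the ref (a union of root and parent) stores whichever value was used to pick the type.
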
index 3bec2ff..f6fc67d 100644
@@ -30,9 +30,6 @@ struct btrfs_delayed_ref_node {
        /* the starting bytenr of the extent */
        u64 bytenr;
 
-       /* the parent our backref will point to */
-       u64 parent;
-
        /* the size of the extent */
        u64 num_bytes;
 
@@ -50,10 +47,21 @@ struct btrfs_delayed_ref_node {
         */
        int ref_mod;
 
+       unsigned int action:8;
+       unsigned int type:8;
        /* is this node still in the rbtree? */
+       unsigned int is_head:1;
        unsigned int in_tree:1;
 };
 
+struct btrfs_delayed_extent_op {
+       struct btrfs_disk_key key;
+       u64 flags_to_set;
+       unsigned int update_key:1;
+       unsigned int update_flags:1;
+       unsigned int is_data:1;
+};
+
 /*
  * the head refs are used to hold a lock on a given extent, which allows us
  * to make sure that only one process is running the delayed refs
@@ -71,6 +79,7 @@ struct btrfs_delayed_ref_head {
 
        struct list_head cluster;
 
+       struct btrfs_delayed_extent_op *extent_op;
        /*
         * when a new extent is allocated, it is just reserved in memory
         * The actual extent isn't inserted into the extent allocation tree
@@ -84,27 +93,26 @@ struct btrfs_delayed_ref_head {
         * the free has happened.
         */
        unsigned int must_insert_reserved:1;
+       unsigned int is_data:1;
 };
 
-struct btrfs_delayed_ref {
+struct btrfs_delayed_tree_ref {
        struct btrfs_delayed_ref_node node;
+       union {
+               u64 root;
+               u64 parent;
+       };
+       int level;
+};
 
-       /* the root objectid our ref will point to */
-       u64 root;
-
-       /* the generation for the backref */
-       u64 generation;
-
-       /* owner_objectid of the backref  */
-       u64 owner_objectid;
-
-       /* operation done by this entry in the rbtree */
-       u8 action;
-
-       /* if pin == 1, when the extent is freed it will be pinned until
-        * transaction commit
-        */
-       unsigned int pin:1;
+struct btrfs_delayed_data_ref {
+       struct btrfs_delayed_ref_node node;
+       union {
+               u64 root;
+               u64 parent;
+       };
+       u64 objectid;
+       u64 offset;
 };
 
 struct btrfs_delayed_ref_root {
@@ -143,17 +151,25 @@ static inline void btrfs_put_delayed_ref(struct btrfs_delayed_ref_node *ref)
        }
 }
 
-int btrfs_add_delayed_ref(struct btrfs_trans_handle *trans,
-                         u64 bytenr, u64 num_bytes, u64 parent, u64 ref_root,
-                         u64 ref_generation, u64 owner_objectid, int action,
-                         int pin);
+int btrfs_add_delayed_tree_ref(struct btrfs_trans_handle *trans,
+                              u64 bytenr, u64 num_bytes, u64 parent,
+                              u64 ref_root, int level, int action,
+                              struct btrfs_delayed_extent_op *extent_op);
+int btrfs_add_delayed_data_ref(struct btrfs_trans_handle *trans,
+                              u64 bytenr, u64 num_bytes,
+                              u64 parent, u64 ref_root,
+                              u64 owner, u64 offset, int action,
+                              struct btrfs_delayed_extent_op *extent_op);
+int btrfs_add_delayed_extent_op(struct btrfs_trans_handle *trans,
+                               u64 bytenr, u64 num_bytes,
+                               struct btrfs_delayed_extent_op *extent_op);
 
 struct btrfs_delayed_ref_head *
 btrfs_find_delayed_ref_head(struct btrfs_trans_handle *trans, u64 bytenr);
 int btrfs_delayed_ref_pending(struct btrfs_trans_handle *trans, u64 bytenr);
-int btrfs_lookup_extent_ref(struct btrfs_trans_handle *trans,
-                           struct btrfs_root *root, u64 bytenr,
-                           u64 num_bytes, u32 *refs);
+int btrfs_lookup_extent_info(struct btrfs_trans_handle *trans,
+                            struct btrfs_root *root, u64 bytenr,
+                            u64 num_bytes, u64 *refs, u64 *flags);
 int btrfs_update_delayed_ref(struct btrfs_trans_handle *trans,
                          u64 bytenr, u64 num_bytes, u64 orig_parent,
                          u64 parent, u64 orig_ref_root, u64 ref_root,
@@ -169,18 +185,24 @@ int btrfs_find_ref_cluster(struct btrfs_trans_handle *trans,
  */
 static int btrfs_delayed_ref_is_head(struct btrfs_delayed_ref_node *node)
 {
-       return node->parent == (u64)-1;
+       return node->is_head;
 }
 
 /*
  * helper functions to cast a node into its container
  */
-static inline struct btrfs_delayed_ref *
-btrfs_delayed_node_to_ref(struct btrfs_delayed_ref_node *node)
+static inline struct btrfs_delayed_tree_ref *
+btrfs_delayed_node_to_tree_ref(struct btrfs_delayed_ref_node *node)
 {
        WARN_ON(btrfs_delayed_ref_is_head(node));
-       return container_of(node, struct btrfs_delayed_ref, node);
+       return container_of(node, struct btrfs_delayed_tree_ref, node);
+}
 
+static inline struct btrfs_delayed_data_ref *
+btrfs_delayed_node_to_data_ref(struct btrfs_delayed_ref_node *node)
+{
+       WARN_ON(btrfs_delayed_ref_is_head(node));
+       return container_of(node, struct btrfs_delayed_data_ref, node);
 }
 
 static inline struct btrfs_delayed_ref_head *
@@ -188,6 +210,5 @@ btrfs_delayed_node_to_head(struct btrfs_delayed_ref_node *node)
 {
        WARN_ON(!btrfs_delayed_ref_is_head(node));
        return container_of(node, struct btrfs_delayed_ref_head, node);
-
 }
 #endif
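
Note on the delayed-ref.h changes above: the tree and data ref structs each embed a btrfs_delayed_ref_node as their first member, and the cast helpers recover the container from a node pointer with container_of(). The following is a userspace sketch of that pattern under stated assumptions. All sketch_* names are hypothetical, and the helper returns NULL for head nodes, which is stricter than the WARN_ON() used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced version of the common node: only the fields the sketch needs. */
struct sketch_node {
	uint64_t bytenr;
	unsigned int is_head:1;
};

/* Shaped like the tree ref: embedded node, root/parent union, level. */
struct sketch_tree_ref {
	struct sketch_node node;
	union {
		uint64_t root;   /* used by implicit (root-keyed) refs */
		uint64_t parent; /* used by shared (full) refs */
	};
	int level;
};

/*
 * Recover the containing tree ref from its embedded node. Returning
 * NULL for head nodes is an assumption of this sketch; the patch only
 * warns in that case.
 */
static struct sketch_tree_ref *sketch_node_to_tree_ref(struct sketch_node *node)
{
	if (node->is_head)
		return NULL;
	return sketch_container_of(node, struct sketch_tree_ref, node);
}
```

Replacing the old "parent == (u64)-1" sentinel with an explicit is_head bit is what allows parent to move out of the common node and into the per-type structs.
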
index 4b0ea0b..7f5c6e3 100644
@@ -36,7 +36,6 @@
 #include "print-tree.h"
 #include "async-thread.h"
 #include "locking.h"
-#include "ref-cache.h"
 #include "tree-log.h"
 #include "free-space-cache.h"
 
@@ -884,7 +883,6 @@ static int __setup_root(u32 nodesize, u32 leafsize, u32 sectorsize,
 {
        root->node = NULL;
        root->commit_root = NULL;
-       root->ref_tree = NULL;
        root->sectorsize = sectorsize;
        root->nodesize = nodesize;
        root->leafsize = leafsize;
@@ -899,12 +897,14 @@ static int __setup_root(u32 nodesize, u32 leafsize, u32 sectorsize,
        root->last_inode_alloc = 0;
        root->name = NULL;
        root->in_sysfs = 0;
+       root->inode_tree.rb_node = NULL;
 
        INIT_LIST_HEAD(&root->dirty_list);
        INIT_LIST_HEAD(&root->orphan_list);
-       INIT_LIST_HEAD(&root->dead_list);
+       INIT_LIST_HEAD(&root->root_list);
        spin_lock_init(&root->node_lock);
        spin_lock_init(&root->list_lock);
+       spin_lock_init(&root->inode_lock);
        mutex_init(&root->objectid_mutex);
        mutex_init(&root->log_mutex);
        init_waitqueue_head(&root->log_writer_wait);
@@ -918,9 +918,6 @@ static int __setup_root(u32 nodesize, u32 leafsize, u32 sectorsize,
        extent_io_tree_init(&root->dirty_log_pages,
                             fs_info->btree_inode->i_mapping, GFP_NOFS);
 
-       btrfs_leaf_ref_tree_init(&root->ref_tree_struct);
-       root->ref_tree = &root->ref_tree_struct;
-
        memset(&root->root_key, 0, sizeof(root->root_key));
        memset(&root->root_item, 0, sizeof(root->root_item));
        memset(&root->defrag_progress, 0, sizeof(root->defrag_progress));
@@ -959,6 +956,7 @@ static int find_and_setup_root(struct btrfs_root *tree_root,
        blocksize = btrfs_level_size(root, btrfs_root_level(&root->root_item));
        root->node = read_tree_block(root, btrfs_root_bytenr(&root->root_item),
                                     blocksize, generation);
+       root->commit_root = btrfs_root_node(root);
        BUG_ON(!root->node);
        return 0;
 }
@@ -1025,20 +1023,19 @@ static struct btrfs_root *alloc_log_tree(struct btrfs_trans_handle *trans,
         */
        root->ref_cows = 0;
 
-       leaf = btrfs_alloc_free_block(trans, root, root->leafsize,
-                                     0, BTRFS_TREE_LOG_OBJECTID,
-                                     trans->transid, 0, 0, 0);
+       leaf = btrfs_alloc_free_block(trans, root, root->leafsize, 0,
+                                     BTRFS_TREE_LOG_OBJECTID, NULL, 0, 0, 0);
        if (IS_ERR(leaf)) {
                kfree(root);
                return ERR_CAST(leaf);
        }
 
+       memset_extent_buffer(leaf, 0, 0, sizeof(struct btrfs_header));
+       btrfs_set_header_bytenr(leaf, leaf->start);
+       btrfs_set_header_generation(leaf, trans->transid);
+       btrfs_set_header_backref_rev(leaf, BTRFS_MIXED_BACKREF_REV);
+       btrfs_set_header_owner(leaf, BTRFS_TREE_LOG_OBJECTID);
        root->node = leaf;
-       btrfs_set_header_nritems(root->node, 0);
-       btrfs_set_header_level(root->node, 0);
-       btrfs_set_header_bytenr(root->node, root->node->start);
-       btrfs_set_header_generation(root->node, trans->transid);
-       btrfs_set_header_owner(root->node, BTRFS_TREE_LOG_OBJECTID);
 
        write_extent_buffer(root->node, root->fs_info->fsid,
                            (unsigned long)btrfs_header_fsid(root->node),
@@ -1081,8 +1078,7 @@ int btrfs_add_log_tree(struct btrfs_trans_handle *trans,
        inode_item->nbytes = cpu_to_le64(root->leafsize);
        inode_item->mode = cpu_to_le32(S_IFDIR | 0755);
 
-       btrfs_set_root_bytenr(&log_root->root_item, log_root->node->start);
-       btrfs_set_root_generation(&log_root->root_item, trans->transid);
+       btrfs_set_root_node(&log_root->root_item, log_root->node);
 
        WARN_ON(root->log_root);
        root->log_root = log_root;
@@ -1144,6 +1140,7 @@ out:
        blocksize = btrfs_level_size(root, btrfs_root_level(&root->root_item));
        root->node = read_tree_block(root, btrfs_root_bytenr(&root->root_item),
                                     blocksize, generation);
+       root->commit_root = btrfs_root_node(root);
        BUG_ON(!root->node);
 insert:
        if (location->objectid != BTRFS_TREE_LOG_OBJECTID) {
@@ -1210,7 +1207,7 @@ struct btrfs_root *btrfs_read_fs_root_no_name(struct btrfs_fs_info *fs_info,
        }
        if (!(fs_info->sb->s_flags & MS_RDONLY)) {
                ret = btrfs_find_dead_roots(fs_info->tree_root,
-                                           root->root_key.objectid, root);
+                                           root->root_key.objectid);
                BUG_ON(ret);
                btrfs_orphan_cleanup(root);
        }
@@ -1569,8 +1566,6 @@ struct btrfs_root *open_ctree(struct super_block *sb,
        atomic_set(&fs_info->async_delalloc_pages, 0);
        atomic_set(&fs_info->async_submit_draining, 0);
        atomic_set(&fs_info->nr_async_bios, 0);
-       atomic_set(&fs_info->throttles, 0);
-       atomic_set(&fs_info->throttle_gen, 0);
        fs_info->sb = sb;
        fs_info->max_extent = (u64)-1;
        fs_info->max_inline = 8192 * 1024;
@@ -1598,6 +1593,7 @@ struct btrfs_root *open_ctree(struct super_block *sb,
        fs_info->btree_inode->i_mapping->a_ops = &btree_aops;
        fs_info->btree_inode->i_mapping->backing_dev_info = &fs_info->bdi;
 
+       RB_CLEAR_NODE(&BTRFS_I(fs_info->btree_inode)->rb_node);
        extent_io_tree_init(&BTRFS_I(fs_info->btree_inode)->io_tree,
                             fs_info->btree_inode->i_mapping,
                             GFP_NOFS);
@@ -1613,10 +1609,6 @@ struct btrfs_root *open_ctree(struct super_block *sb,
                             fs_info->btree_inode->i_mapping, GFP_NOFS);
        fs_info->do_barriers = 1;
 
-       INIT_LIST_HEAD(&fs_info->dead_reloc_roots);
-       btrfs_leaf_ref_tree_init(&fs_info->reloc_ref_tree);
-       btrfs_leaf_ref_tree_init(&fs_info->shared_ref_tree);
-
        BTRFS_I(fs_info->btree_inode)->root = tree_root;
        memset(&BTRFS_I(fs_info->btree_inode)->location, 0,
               sizeof(struct btrfs_key));
@@ -1674,6 +1666,12 @@ struct btrfs_root *open_ctree(struct super_block *sb,
                goto fail_iput;
        }
 
+       features = btrfs_super_incompat_flags(disk_super);
+       if (!(features & BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF)) {
+               features |= BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF;
+               btrfs_set_super_incompat_flags(disk_super, features);
+       }
+
        features = btrfs_super_compat_ro_flags(disk_super) &
                ~BTRFS_FEATURE_COMPAT_RO_SUPP;
        if (!(sb->s_flags & MS_RDONLY) && features) {
@@ -1771,7 +1769,7 @@ struct btrfs_root *open_ctree(struct super_block *sb,
        if (ret) {
                printk(KERN_WARNING "btrfs: failed to read the system "
                       "array on %s\n", sb->s_id);
-               goto fail_sys_array;
+               goto fail_sb_buffer;
        }
 
        blocksize = btrfs_level_size(tree_root,
@@ -1785,6 +1783,8 @@ struct btrfs_root *open_ctree(struct super_block *sb,
                                           btrfs_super_chunk_root(disk_super),
                                           blocksize, generation);
        BUG_ON(!chunk_root->node);
+       btrfs_set_root_node(&chunk_root->root_item, chunk_root->node);
+       chunk_root->commit_root = btrfs_root_node(chunk_root);
 
        read_extent_buffer(chunk_root->node, fs_info->chunk_tree_uuid,
           (unsigned long)btrfs_header_chunk_tree_uuid(chunk_root->node),
@@ -1810,7 +1810,8 @@ struct btrfs_root *open_ctree(struct super_block *sb,
                                          blocksize, generation);
        if (!tree_root->node)
                goto fail_chunk_root;
-
+       btrfs_set_root_node(&tree_root->root_item, tree_root->node);
+       tree_root->commit_root = btrfs_root_node(tree_root);
 
        ret = find_and_setup_root(tree_root, fs_info,
                                  BTRFS_EXTENT_TREE_OBJECTID, extent_root);
@@ -1820,14 +1821,14 @@ struct btrfs_root *open_ctree(struct super_block *sb,
 
        ret = find_and_setup_root(tree_root, fs_info,
                                  BTRFS_DEV_TREE_OBJECTID, dev_root);
-       dev_root->track_dirty = 1;
        if (ret)
                goto fail_extent_root;
+       dev_root->track_dirty = 1;
 
        ret = find_and_setup_root(tree_root, fs_info,
                                  BTRFS_CSUM_TREE_OBJECTID, csum_root);
        if (ret)
-               goto fail_extent_root;
+               goto fail_dev_root;
 
        csum_root->track_dirty = 1;
 
@@ -1881,7 +1882,7 @@ struct btrfs_root *open_ctree(struct super_block *sb,
        }
 
        if (!(sb->s_flags & MS_RDONLY)) {
-               ret = btrfs_cleanup_reloc_trees(tree_root);
+               ret = btrfs_recover_relocation(tree_root);
                BUG_ON(ret);
        }
 
@@ -1908,14 +1909,19 @@ fail_cleaner:
 
 fail_csum_root:
        free_extent_buffer(csum_root->node);
+       free_extent_buffer(csum_root->commit_root);
+fail_dev_root:
+       free_extent_buffer(dev_root->node);
+       free_extent_buffer(dev_root->commit_root);
 fail_extent_root:
        free_extent_buffer(extent_root->node);
+       free_extent_buffer(extent_root->commit_root);
 fail_tree_root:
        free_extent_buffer(tree_root->node);
+       free_extent_buffer(tree_root->commit_root);
 fail_chunk_root:
        free_extent_buffer(chunk_root->node);
-fail_sys_array:
-       free_extent_buffer(dev_root->node);
+       free_extent_buffer(chunk_root->commit_root);
 fail_sb_buffer:
        btrfs_stop_workers(&fs_info->fixup_workers);
        btrfs_stop_workers(&fs_info->delalloc_workers);
@@ -2173,6 +2179,7 @@ int write_ctree_super(struct btrfs_trans_handle *trans,
 
 int btrfs_free_fs_root(struct btrfs_fs_info *fs_info, struct btrfs_root *root)
 {
+       WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree));
        radix_tree_delete(&fs_info->fs_roots_radix,
                          (unsigned long)root->root_key.objectid);
        if (root->anon_super.s_dev) {
@@ -2219,10 +2226,12 @@ int btrfs_cleanup_fs_roots(struct btrfs_fs_info *fs_info)
                                             ARRAY_SIZE(gang));
                if (!ret)
                        break;
+
+               root_objectid = gang[ret - 1]->root_key.objectid + 1;
                for (i = 0; i < ret; i++) {
                        root_objectid = gang[i]->root_key.objectid;
                        ret = btrfs_find_dead_roots(fs_info->tree_root,
-                                                   root_objectid, gang[i]);
+                                                   root_objectid);
                        BUG_ON(ret);
                        btrfs_orphan_cleanup(gang[i]);
                }
@@ -2278,20 +2287,16 @@ int close_ctree(struct btrfs_root *root)
                       (unsigned long long)fs_info->total_ref_cache_size);
        }
 
-       if (fs_info->extent_root->node)
-               free_extent_buffer(fs_info->extent_root->node);
-
-       if (fs_info->tree_root->node)
-               free_extent_buffer(fs_info->tree_root->node);
-
-       if (root->fs_info->chunk_root->node)
-               free_extent_buffer(root->fs_info->chunk_root->node);
-
-       if (root->fs_info->dev_root->node)
-               free_extent_buffer(root->fs_info->dev_root->node);
-
-       if (root->fs_info->csum_root->node)
-               free_extent_buffer(root->fs_info->csum_root->node);
+       free_extent_buffer(fs_info->extent_root->node);
+       free_extent_buffer(fs_info->extent_root->commit_root);
+       free_extent_buffer(fs_info->tree_root->node);
+       free_extent_buffer(fs_info->tree_root->commit_root);
+       free_extent_buffer(root->fs_info->chunk_root->node);
+       free_extent_buffer(root->fs_info->chunk_root->commit_root);
+       free_extent_buffer(root->fs_info->dev_root->node);
+       free_extent_buffer(root->fs_info->dev_root->commit_root);
+       free_extent_buffer(root->fs_info->csum_root->node);
+       free_extent_buffer(root->fs_info->csum_root->commit_root);
 
        btrfs_free_block_groups(root->fs_info);
 
index 85315d2..9596b40 100644
@@ -78,7 +78,7 @@ static struct dentry *btrfs_get_dentry(struct super_block *sb, u64 objectid,
        btrfs_set_key_type(&key, BTRFS_INODE_ITEM_KEY);
        key.offset = 0;
 
-       inode = btrfs_iget(sb, &key, root, NULL);
+       inode = btrfs_iget(sb, &key, root);
        if (IS_ERR(inode))
                return (void *)inode;
 
@@ -192,7 +192,7 @@ static struct dentry *btrfs_get_parent(struct dentry *child)
        btrfs_set_key_type(&key, BTRFS_INODE_ITEM_KEY);
        key.offset = 0;
 
-       return d_obtain_alias(btrfs_iget(root->fs_info->sb, &key, root, NULL));
+       return d_obtain_alias(btrfs_iget(root->fs_info->sb, &key, root));
 }
 
 const struct export_operations btrfs_export_ops = {
index 35af933..a42419c 100644
 #include "transaction.h"
 #include "volumes.h"
 #include "locking.h"
-#include "ref-cache.h"
 #include "free-space-cache.h"
 
-#define PENDING_EXTENT_INSERT 0
-#define PENDING_EXTENT_DELETE 1
-#define PENDING_BACKREF_UPDATE 2
-
-struct pending_extent_op {
-       int type;
-       u64 bytenr;
-       u64 num_bytes;
-       u64 parent;
-       u64 orig_parent;
-       u64 generation;
-       u64 orig_generation;
-       int level;
-       struct list_head list;
-       int del;
-};
-
-static int __btrfs_alloc_reserved_extent(struct btrfs_trans_handle *trans,
-                                        struct btrfs_root *root, u64 parent,
-                                        u64 root_objectid, u64 ref_generation,
-                                        u64 owner, struct btrfs_key *ins,
-                                        int ref_mod);
 static int update_reserved_extents(struct btrfs_root *root,
                                   u64 bytenr, u64 num, int reserve);
 static int update_block_group(struct btrfs_trans_handle *trans,
                              struct btrfs_root *root,
                              u64 bytenr, u64 num_bytes, int alloc,
                              int mark_free);
-static noinline int __btrfs_free_extent(struct btrfs_trans_handle *trans,
-                                       struct btrfs_root *root,
-                                       u64 bytenr, u64 num_bytes, u64 parent,
-                                       u64 root_objectid, u64 ref_generation,
-                                       u64 owner_objectid, int pin,
-                                       int ref_to_drop);
+static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
+                               struct btrfs_root *root,
+                               u64 bytenr, u64 num_bytes, u64 parent,
+                               u64 root_objectid, u64 owner_objectid,
+                               u64 owner_offset, int refs_to_drop,
+                               struct btrfs_delayed_extent_op *extra_op);
+static void __run_delayed_extent_op(struct btrfs_delayed_extent_op *extent_op,
+                                   struct extent_buffer *leaf,
+                                   struct btrfs_extent_item *ei);
+static int alloc_reserved_file_extent(struct btrfs_trans_handle *trans,
+                                     struct btrfs_root *root,
+                                     u64 parent, u64 root_objectid,
+                                     u64 flags, u64 owner, u64 offset,
+                                     struct btrfs_key *ins, int ref_mod);
+static int alloc_reserved_tree_block(struct btrfs_trans_handle *trans,
+                                    struct btrfs_root *root,
+                                    u64 parent, u64 root_objectid,
+                                    u64 flags, struct btrfs_disk_key *key,
+                                    int level, struct btrfs_key *ins);
 
 static int do_chunk_alloc(struct btrfs_trans_handle *trans,
                          struct btrfs_root *extent_root, u64 alloc_bytes,
@@ -453,196 +443,973 @@ int btrfs_lookup_extent(struct btrfs_root *root, u64 start, u64 len)
  *    maintenance.  This is actually the same as #2, but with a slightly
  *    different use case.
  *
+ * There are two kinds of back refs. Implicit back refs are optimized
+ * for pointers in non-shared tree blocks. For a given pointer in a block,
+ * back refs of this kind provide information about the block's owner tree
+ * and the pointer's key. This information allows us to find the block by
+ * b-tree searching. Full back refs are for pointers in tree blocks not
+ * referenced by their owner trees. The location of the tree block is
+ * recorded in the back refs. Full back refs are actually generic, and can
+ * be used in all cases where implicit back refs are used. Their major
+ * shortcoming is overhead. Every time a tree block gets COWed, we have
+ * to update the back ref entries for all pointers in it.
+ *
+ * For a newly allocated tree block, we use implicit back refs for
+ * pointers in it. This means most tree related operations only involve
+ * implicit back refs. For a tree block created in an old transaction, the
+ * only way to drop a reference to it is to COW it. So we can detect the
+ * event that a tree block loses its owner tree's reference and do the
+ * back refs conversion.
+ *
+ * When a tree block is COW'd through a tree, there are four cases:
+ *
+ * The reference count of the block is one and the tree is the block's
+ * owner tree. Nothing to do in this case.
+ *
+ * The reference count of the block is one and the tree is not the
+ * block's owner tree. In this case, full back refs are used for pointers
+ * in the block. Remove these full back refs, add implicit back refs for
+ * every pointer in the new block.
+ *
+ * The reference count of the block is greater than one and the tree is
+ * the block's owner tree. In this case, implicit back refs are used for
+ * pointers in the block. Add full back refs for every pointer in the
+ * block, increase lower level extents' reference counts. The original
+ * implicit back refs are inherited by the new block.
+ *
+ * The reference count of the block is greater than one and the tree is
+ * not the block's owner tree. Add implicit back refs for every pointer in
+ * the new block, increase lower level extents' reference count.
+ *
+ * Back Reference Key composing:
+ *
+ * The key objectid corresponds to the first byte in the extent.
+ * The key type is used to differentiate between types of back refs.
+ * The meaning of the key offset differs between the types of
+ * back refs.
+ *
  * File extents can be referenced by:
  *
  * - multiple snapshots, subvolumes, or different generations in one subvol
  * - different files inside a single subvolume
  * - different offsets inside a file (bookend extents in file.c)
  *
- * The extent ref structure has fields for:
+ * The extent ref structure for the implicit back refs has fields for:
  *
  * - Objectid of the subvolume root
- * - Generation number of the tree holding the reference
  * - objectid of the file holding the reference
- * - number of references holding by parent node (alway 1 for tree blocks)
- *
- * Btree leaf may hold multiple references to a file extent. In most cases,
- * these references are from same file and the corresponding offsets inside
- * the file are close together.
- *
- * When a file extent is allocated the fields are filled in:
- *     (root_key.objectid, trans->transid, inode objectid, 1)
+ * - original offset in the file
+ * - number of bookend extents
  *
- * When a leaf is cow'd new references are added for every file extent found
- * in the leaf.  It looks similar to the create case, but trans->transid will
- * be different when the block is cow'd.
+ * The key offset for the implicit back refs is a hash of the first
+ * three fields.
  *
- *     (root_key.objectid, trans->transid, inode objectid,
- *      number of references in the leaf)
+ * The extent ref structure for the full back refs has a field for:
  *
- * When a file extent is removed either during snapshot deletion or
- * file truncation, we find the corresponding back reference and check
- * the following fields:
+ * - number of pointers in the tree leaf
  *
- *     (btrfs_header_owner(leaf), btrfs_header_generation(leaf),
- *      inode objectid)
+ * The key offset for the full back refs is the first byte of
+ * the tree leaf.
  *
- * Btree extents can be referenced by:
- *
- * - Different subvolumes
- * - Different generations of the same subvolume
- *
- * When a tree block is created, back references are inserted:
+ * When a file extent is allocated, implicit back refs are used and
+ * the fields are filled in:
  *
- * (root->root_key.objectid, trans->transid, level, 1)
+ *     (root_key.objectid, inode objectid, offset in file, 1)
  *
- * When a tree block is cow'd, new back references are added for all the
- * blocks it points to. If the tree block isn't in reference counted root,
- * the old back references are removed. These new back references are of
- * the form (trans->transid will have increased since creation):
+ * When a file extent is removed by file truncation, we find the
+ * corresponding implicit back refs and check the following fields:
  *
- * (root->root_key.objectid, trans->transid, level, 1)
+ *     (btrfs_header_owner(leaf), inode objectid, offset in file)
  *
- * When a backref is in deleting, the following fields are checked:
+ * Btree extents can be referenced by:
  *
- * if backref was for a tree root:
- *     (btrfs_header_owner(itself), btrfs_header_generation(itself), level)
- * else
- *     (btrfs_header_owner(parent), btrfs_header_generation(parent), level)
+ * - Different subvolumes
  *
- * Back Reference Key composing:
+ * Both the implicit back refs and the full back refs for tree blocks
+ * consist only of a key. The key offset for the implicit back refs is
+ * the objectid of the block's owner tree. The key offset for the full
+ * back refs is the first byte of the parent block.
  *
- * The key objectid corresponds to the first byte in the extent, the key
- * type is set to BTRFS_EXTENT_REF_KEY, and the key offset is the first
- * byte of parent extent. If a extent is tree root, the key offset is set
- * to the key objectid.
+ * When implicit back refs are used, information about the lowest key and
+ * level of the tree block is required. This information is stored in
+ * the tree block info structure.
  */
 
-static noinline int lookup_extent_backref(struct btrfs_trans_handle *trans,
-                                         struct btrfs_root *root,
-                                         struct btrfs_path *path,
-                                         u64 bytenr, u64 parent,
-                                         u64 ref_root, u64 ref_generation,
-                                         u64 owner_objectid, int del)
+#ifdef BTRFS_COMPAT_EXTENT_TREE_V0
+static int convert_extent_item_v0(struct btrfs_trans_handle *trans,
+                                 struct btrfs_root *root,
+                                 struct btrfs_path *path,
+                                 u64 owner, u32 extra_size)
 {
+       struct btrfs_extent_item *item;
+       struct btrfs_extent_item_v0 *ei0;
+       struct btrfs_extent_ref_v0 *ref0;
+       struct btrfs_tree_block_info *bi;
+       struct extent_buffer *leaf;
        struct btrfs_key key;
-       struct btrfs_extent_ref *ref;
+       struct btrfs_key found_key;
+       u32 new_size = sizeof(*item);
+       u64 refs;
+       int ret;
+
+       leaf = path->nodes[0];
+       BUG_ON(btrfs_item_size_nr(leaf, path->slots[0]) != sizeof(*ei0));
+
+       btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
+       ei0 = btrfs_item_ptr(leaf, path->slots[0],
+                            struct btrfs_extent_item_v0);
+       refs = btrfs_extent_refs_v0(leaf, ei0);
+
+       if (owner == (u64)-1) {
+               while (1) {
+                       if (path->slots[0] >= btrfs_header_nritems(leaf)) {
+                               ret = btrfs_next_leaf(root, path);
+                               if (ret < 0)
+                                       return ret;
+                               BUG_ON(ret > 0);
+                               leaf = path->nodes[0];
+                       }
+                       btrfs_item_key_to_cpu(leaf, &found_key,
+                                             path->slots[0]);
+                       BUG_ON(key.objectid != found_key.objectid);
+                       if (found_key.type != BTRFS_EXTENT_REF_V0_KEY) {
+                               path->slots[0]++;
+                               continue;
+                       }
+                       ref0 = btrfs_item_ptr(leaf, path->slots[0],
+                                             struct btrfs_extent_ref_v0);
+                       owner = btrfs_ref_objectid_v0(leaf, ref0);
+                       break;
+               }
+       }
+       btrfs_release_path(root, path);
+
+       if (owner < BTRFS_FIRST_FREE_OBJECTID)
+               new_size += sizeof(*bi);
+
+       new_size -= sizeof(*ei0);
+       ret = btrfs_search_slot(trans, root, &key, path,
+                               new_size + extra_size, 1);
+       if (ret < 0)
+               return ret;
+       BUG_ON(ret);
+
+       ret = btrfs_extend_item(trans, root, path, new_size);
+       BUG_ON(ret);
+
+       leaf = path->nodes[0];
+       item = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_extent_item);
+       btrfs_set_extent_refs(leaf, item, refs);
+       /* FIXME: get real generation */
+       btrfs_set_extent_generation(leaf, item, 0);
+       if (owner < BTRFS_FIRST_FREE_OBJECTID) {
+               btrfs_set_extent_flags(leaf, item,
+                                      BTRFS_EXTENT_FLAG_TREE_BLOCK |
+                                      BTRFS_BLOCK_FLAG_FULL_BACKREF);
+               bi = (struct btrfs_tree_block_info *)(item + 1);
+               /* FIXME: get first key of the block */
+               memset_extent_buffer(leaf, 0, (unsigned long)bi, sizeof(*bi));
+               btrfs_set_tree_block_level(leaf, bi, (int)owner);
+       } else {
+               btrfs_set_extent_flags(leaf, item, BTRFS_EXTENT_FLAG_DATA);
+       }
+       btrfs_mark_buffer_dirty(leaf);
+       return 0;
+}
+#endif
+
+static u64 hash_extent_data_ref(u64 root_objectid, u64 owner, u64 offset)
+{
+       u32 high_crc = ~(u32)0;
+       u32 low_crc = ~(u32)0;
+       __le64 lenum;
+
+       lenum = cpu_to_le64(root_objectid);
+       high_crc = btrfs_crc32c(high_crc, &lenum, sizeof(lenum));
+       lenum = cpu_to_le64(owner);
+       low_crc = btrfs_crc32c(low_crc, &lenum, sizeof(lenum));
+       lenum = cpu_to_le64(offset);
+       low_crc = btrfs_crc32c(low_crc, &lenum, sizeof(lenum));
+
+       return ((u64)high_crc << 31) ^ (u64)low_crc;
+}
+
+static u64 hash_extent_data_ref_item(struct extent_buffer *leaf,
+                                    struct btrfs_extent_data_ref *ref)
+{
+       return hash_extent_data_ref(btrfs_extent_data_ref_root(leaf, ref),
+                                   btrfs_extent_data_ref_objectid(leaf, ref),
+                                   btrfs_extent_data_ref_offset(leaf, ref));
+}
+
+static int match_extent_data_ref(struct extent_buffer *leaf,
+                                struct btrfs_extent_data_ref *ref,
+                                u64 root_objectid, u64 owner, u64 offset)
+{
+       if (btrfs_extent_data_ref_root(leaf, ref) != root_objectid ||
+           btrfs_extent_data_ref_objectid(leaf, ref) != owner ||
+           btrfs_extent_data_ref_offset(leaf, ref) != offset)
+               return 0;
+       return 1;
+}
+
+static noinline int lookup_extent_data_ref(struct btrfs_trans_handle *trans,
+                                          struct btrfs_root *root,
+                                          struct btrfs_path *path,
+                                          u64 bytenr, u64 parent,
+                                          u64 root_objectid,
+                                          u64 owner, u64 offset)
+{
+       struct btrfs_key key;
+       struct btrfs_extent_data_ref *ref;
        struct extent_buffer *leaf;
-       u64 ref_objectid;
+       u32 nritems;
        int ret;
+       int recow;
+       int err = -ENOENT;
 
        key.objectid = bytenr;
-       key.type = BTRFS_EXTENT_REF_KEY;
-       key.offset = parent;
+       if (parent) {
+               key.type = BTRFS_SHARED_DATA_REF_KEY;
+               key.offset = parent;
+       } else {
+               key.type = BTRFS_EXTENT_DATA_REF_KEY;
+               key.offset = hash_extent_data_ref(root_objectid,
+                                                 owner, offset);
+       }
+again:
+       recow = 0;
+       ret = btrfs_search_slot(trans, root, &key, path, -1, 1);
+       if (ret < 0) {
+               err = ret;
+               goto fail;
+       }
 
-       ret = btrfs_search_slot(trans, root, &key, path, del ? -1 : 0, 1);
-       if (ret < 0)
-               goto out;
-       if (ret > 0) {
-               ret = -ENOENT;
-               goto out;
+       if (parent) {
+               if (!ret)
+                       return 0;
+#ifdef BTRFS_COMPAT_EXTENT_TREE_V0
+               key.type = BTRFS_EXTENT_REF_V0_KEY;
+               btrfs_release_path(root, path);
+               ret = btrfs_search_slot(trans, root, &key, path, -1, 1);
+               if (ret < 0) {
+                       err = ret;
+                       goto fail;
+               }
+               if (!ret)
+                       return 0;
+#endif
+               goto fail;
        }
 
        leaf = path->nodes[0];
-       ref = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_extent_ref);
-       ref_objectid = btrfs_ref_objectid(leaf, ref);
-       if (btrfs_ref_root(leaf, ref) != ref_root ||
-           btrfs_ref_generation(leaf, ref) != ref_generation ||
-           (ref_objectid != owner_objectid &&
-            ref_objectid != BTRFS_MULTIPLE_OBJECTIDS)) {
-               ret = -EIO;
-               WARN_ON(1);
-               goto out;
+       nritems = btrfs_header_nritems(leaf);
+       while (1) {
+               if (path->slots[0] >= nritems) {
+                       ret = btrfs_next_leaf(root, path);
+                       if (ret < 0)
+                               err = ret;
+                       if (ret)
+                               goto fail;
+
+                       leaf = path->nodes[0];
+                       nritems = btrfs_header_nritems(leaf);
+                       recow = 1;
+               }
+
+               btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
+               if (key.objectid != bytenr ||
+                   key.type != BTRFS_EXTENT_DATA_REF_KEY)
+                       goto fail;
+
+               ref = btrfs_item_ptr(leaf, path->slots[0],
+                                    struct btrfs_extent_data_ref);
+
+               if (match_extent_data_ref(leaf, ref, root_objectid,
+                                         owner, offset)) {
+                       if (recow) {
+                               btrfs_release_path(root, path);
+                               goto again;
+                       }
+                       err = 0;
+                       break;
+               }
+               path->slots[0]++;
        }
-       ret = 0;
-out:
-       return ret;
+fail:
+       return err;
 }
 
-static noinline int insert_extent_backref(struct btrfs_trans_handle *trans,
-                                         struct btrfs_root *root,
-                                         struct btrfs_path *path,
-                                         u64 bytenr, u64 parent,
-                                         u64 ref_root, u64 ref_generation,
-                                         u64 owner_objectid,
-                                         int refs_to_add)
+static noinline int insert_extent_data_ref(struct btrfs_trans_handle *trans,
+                                          struct btrfs_root *root,
+                                          struct btrfs_path *path,
+                                          u64 bytenr, u64 parent,
+                                          u64 root_objectid, u64 owner,
+                                          u64 offset, int refs_to_add)
 {
        struct btrfs_key key;
        struct extent_buffer *leaf;
-       struct btrfs_extent_ref *ref;
+       u32 size;
        u32 num_refs;
        int ret;
 
        key.objectid = bytenr;
-       key.type = BTRFS_EXTENT_REF_KEY;
-       key.offset = parent;
+       if (parent) {
+               key.type = BTRFS_SHARED_DATA_REF_KEY;
+               key.offset = parent;
+               size = sizeof(struct btrfs_shared_data_ref);
+       } else {
+               key.type = BTRFS_EXTENT_DATA_REF_KEY;
+               key.offset = hash_extent_data_ref(root_objectid,
+                                                 owner, offset);
+               size = sizeof(struct btrfs_extent_data_ref);
+       }
 
-       ret = btrfs_insert_empty_item(trans, root, path, &key, sizeof(*ref));
-       if (ret == 0) {
-               leaf = path->nodes[0];
-               ref = btrfs_item_ptr(leaf, path->slots[0],
-                                    struct btrfs_extent_ref);
-               btrfs_set_ref_root(leaf, ref, ref_root);
-               btrfs_set_ref_generation(leaf, ref, ref_generation);
-               btrfs_set_ref_objectid(leaf, ref, owner_objectid);
-               btrfs_set_ref_num_refs(leaf, ref, refs_to_add);
-       } else if (ret == -EEXIST) {
-               u64 existing_owner;
-
-               BUG_ON(owner_objectid < BTRFS_FIRST_FREE_OBJECTID);
-               leaf = path->nodes[0];
+       ret = btrfs_insert_empty_item(trans, root, path, &key, size);
+       if (ret && ret != -EEXIST)
+               goto fail;
+
+       leaf = path->nodes[0];
+       if (parent) {
+               struct btrfs_shared_data_ref *ref;
                ref = btrfs_item_ptr(leaf, path->slots[0],
-                                    struct btrfs_extent_ref);
-               if (btrfs_ref_root(leaf, ref) != ref_root ||
-                   btrfs_ref_generation(leaf, ref) != ref_generation) {
-                       ret = -EIO;
-                       WARN_ON(1);
-                       goto out;
+                                    struct btrfs_shared_data_ref);
+               if (ret == 0) {
+                       btrfs_set_shared_data_ref_count(leaf, ref, refs_to_add);
+               } else {
+                       num_refs = btrfs_shared_data_ref_count(leaf, ref);
+                       num_refs += refs_to_add;
+                       btrfs_set_shared_data_ref_count(leaf, ref, num_refs);
                }
+       } else {
+               struct btrfs_extent_data_ref *ref;
+               while (ret == -EEXIST) {
+                       ref = btrfs_item_ptr(leaf, path->slots[0],
+                                            struct btrfs_extent_data_ref);
+                       if (match_extent_data_ref(leaf, ref, root_objectid,
+                                                 owner, offset))
+                               break;
+                       btrfs_release_path(root, path);
+                       key.offset++;
+                       ret = btrfs_insert_empty_item(trans, root, path, &key,
+                                                     size);
+                       if (ret && ret != -EEXIST)
+                               goto fail;
 
-               num_refs = btrfs_ref_num_refs(leaf, ref);
-               BUG_ON(num_refs == 0);
-               btrfs_set_ref_num_refs(leaf, ref, num_refs + refs_to_add);
-
-               existing_owner = btrfs_ref_objectid(leaf, ref);
-               if (existing_owner != owner_objectid &&
-                   existing_owner != BTRFS_MULTIPLE_OBJECTIDS) {
-                       btrfs_set_ref_objectid(leaf, ref,
-                                       BTRFS_MULTIPLE_OBJECTIDS);
+                       leaf = path->nodes[0];
+               }
+               ref = btrfs_item_ptr(leaf, path->slots[0],
+                                    struct btrfs_extent_data_ref);
+               if (ret == 0) {
+                       btrfs_set_extent_data_ref_root(leaf, ref,
+                                                      root_objectid);
+                       btrfs_set_extent_data_ref_objectid(leaf, ref, owner);
+                       btrfs_set_extent_data_ref_offset(leaf, ref, offset);
+                       btrfs_set_extent_data_ref_count(leaf, ref, refs_to_add);
+               } else {
+                       num_refs = btrfs_extent_data_ref_count(leaf, ref);
+                       num_refs += refs_to_add;
+                       btrfs_set_extent_data_ref_count(leaf, ref, num_refs);
                }
-               ret = 0;
-       } else {
-               goto out;
        }
-       btrfs_unlock_up_safe(path, 1);
-       btrfs_mark_buffer_dirty(path->nodes[0]);
-out:
+       btrfs_mark_buffer_dirty(leaf);
+       ret = 0;
+fail:
        btrfs_release_path(root, path);
        return ret;
 }
 
-static noinline int remove_extent_backref(struct btrfs_trans_handle *trans,
-                                         struct btrfs_root *root,
-                                         struct btrfs_path *path,
-                                         int refs_to_drop)
+static noinline int remove_extent_data_ref(struct btrfs_trans_handle *trans,
+                                          struct btrfs_root *root,
+                                          struct btrfs_path *path,
+                                          int refs_to_drop)
 {
+       struct btrfs_key key;
+       struct btrfs_extent_data_ref *ref1 = NULL;
+       struct btrfs_shared_data_ref *ref2 = NULL;
        struct extent_buffer *leaf;
-       struct btrfs_extent_ref *ref;
-       u32 num_refs;
+       u32 num_refs = 0;
        int ret = 0;
 
        leaf = path->nodes[0];
-       ref = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_extent_ref);
-       num_refs = btrfs_ref_num_refs(leaf, ref);
+       btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
+
+       if (key.type == BTRFS_EXTENT_DATA_REF_KEY) {
+               ref1 = btrfs_item_ptr(leaf, path->slots[0],
+                                     struct btrfs_extent_data_ref);
+               num_refs = btrfs_extent_data_ref_count(leaf, ref1);
+       } else if (key.type == BTRFS_SHARED_DATA_REF_KEY) {
+               ref2 = btrfs_item_ptr(leaf, path->slots[0],
+                                     struct btrfs_shared_data_ref);
+               num_refs = btrfs_shared_data_ref_count(leaf, ref2);
+#ifdef BTRFS_COMPAT_EXTENT_TREE_V0
+       } else if (key.type == BTRFS_EXTENT_REF_V0_KEY) {
+               struct btrfs_extent_ref_v0 *ref0;
+               ref0 = btrfs_item_ptr(leaf, path->slots[0],
+                                     struct btrfs_extent_ref_v0);
+               num_refs = btrfs_ref_count_v0(leaf, ref0);
+#endif
+       } else {
+               BUG();
+       }
+
        BUG_ON(num_refs < refs_to_drop);
        num_refs -= refs_to_drop;
+
        if (num_refs == 0) {
                ret = btrfs_del_item(trans, root, path);
        } else {
-               btrfs_set_ref_num_refs(leaf, ref, num_refs);
+               if (key.type == BTRFS_EXTENT_DATA_REF_KEY)
+                       btrfs_set_extent_data_ref_count(leaf, ref1, num_refs);
+               else if (key.type == BTRFS_SHARED_DATA_REF_KEY)
+                       btrfs_set_shared_data_ref_count(leaf, ref2, num_refs);
+#ifdef BTRFS_COMPAT_EXTENT_TREE_V0
+               else {
+                       struct btrfs_extent_ref_v0 *ref0;
+                       ref0 = btrfs_item_ptr(leaf, path->slots[0],
+                                       struct btrfs_extent_ref_v0);
+                       btrfs_set_ref_count_v0(leaf, ref0, num_refs);
+               }
+#endif
                btrfs_mark_buffer_dirty(leaf);
        }
+       return ret;
+}
+
+static noinline u32 extent_data_ref_count(struct btrfs_root *root,
+                                         struct btrfs_path *path,
+                                         struct btrfs_extent_inline_ref *iref)
+{
+       struct btrfs_key key;
+       struct extent_buffer *leaf;
+       struct btrfs_extent_data_ref *ref1;
+       struct btrfs_shared_data_ref *ref2;
+       u32 num_refs = 0;
+
+       leaf = path->nodes[0];
+       btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
+       if (iref) {
+               if (btrfs_extent_inline_ref_type(leaf, iref) ==
+                   BTRFS_EXTENT_DATA_REF_KEY) {
+                       ref1 = (struct btrfs_extent_data_ref *)(&iref->offset);
+                       num_refs = btrfs_extent_data_ref_count(leaf, ref1);
+               } else {
+                       ref2 = (struct btrfs_shared_data_ref *)(iref + 1);
+                       num_refs = btrfs_shared_data_ref_count(leaf, ref2);
+               }
+       } else if (key.type == BTRFS_EXTENT_DATA_REF_KEY) {
+               ref1 = btrfs_item_ptr(leaf, path->slots[0],
+                                     struct btrfs_extent_data_ref);
+               num_refs = btrfs_extent_data_ref_count(leaf, ref1);
+       } else if (key.type == BTRFS_SHARED_DATA_REF_KEY) {
+               ref2 = btrfs_item_ptr(leaf, path->slots[0],
+                                     struct btrfs_shared_data_ref);
+               num_refs = btrfs_shared_data_ref_count(leaf, ref2);
+#ifdef BTRFS_COMPAT_EXTENT_TREE_V0
+       } else if (key.type == BTRFS_EXTENT_REF_V0_KEY) {
+               struct btrfs_extent_ref_v0 *ref0;
+               ref0 = btrfs_item_ptr(leaf, path->slots[0],
+                                     struct btrfs_extent_ref_v0);
+               num_refs = btrfs_ref_count_v0(leaf, ref0);
+#endif
+       } else {
+               WARN_ON(1);
+       }
+       return num_refs;
+}
+
+static noinline int lookup_tree_block_ref(struct btrfs_trans_handle *trans,
+                                         struct btrfs_root *root,
+                                         struct btrfs_path *path,
+                                         u64 bytenr, u64 parent,
+                                         u64 root_objectid)
+{
+       struct btrfs_key key;
+       int ret;
+
+       key.objectid = bytenr;
+       if (parent) {
+               key.type = BTRFS_SHARED_BLOCK_REF_KEY;
+               key.offset = parent;
+       } else {
+               key.type = BTRFS_TREE_BLOCK_REF_KEY;
+               key.offset = root_objectid;
+       }
+
+       ret = btrfs_search_slot(trans, root, &key, path, -1, 1);
+       if (ret > 0)
+               ret = -ENOENT;
+#ifdef BTRFS_COMPAT_EXTENT_TREE_V0
+       if (ret == -ENOENT && parent) {
+               btrfs_release_path(root, path);
+               key.type = BTRFS_EXTENT_REF_V0_KEY;
+               ret = btrfs_search_slot(trans, root, &key, path, -1, 1);
+               if (ret > 0)
+                       ret = -ENOENT;
+       }
+#endif
+       return ret;
+}
+
+static noinline int insert_tree_block_ref(struct btrfs_trans_handle *trans,
+                                         struct btrfs_root *root,
+                                         struct btrfs_path *path,
+                                         u64 bytenr, u64 parent,
+                                         u64 root_objectid)
+{
+       struct btrfs_key key;
+       int ret;
+
+       key.objectid = bytenr;
+       if (parent) {
+               key.type = BTRFS_SHARED_BLOCK_REF_KEY;
+               key.offset = parent;
+       } else {
+               key.type = BTRFS_TREE_BLOCK_REF_KEY;
+               key.offset = root_objectid;
+       }
+
+       ret = btrfs_insert_empty_item(trans, root, path, &key, 0);
+       btrfs_release_path(root, path);
+       return ret;
+}
+
+static inline int extent_ref_type(u64 parent, u64 owner)
+{
+       int type;
+       if (owner < BTRFS_FIRST_FREE_OBJECTID) {
+               if (parent > 0)
+                       type = BTRFS_SHARED_BLOCK_REF_KEY;
+               else
+                       type = BTRFS_TREE_BLOCK_REF_KEY;
+       } else {
+               if (parent > 0)
+                       type = BTRFS_SHARED_DATA_REF_KEY;
+               else
+                       type = BTRFS_EXTENT_DATA_REF_KEY;
+       }
+       return type;
+}
+
+static int find_next_key(struct btrfs_path *path, struct btrfs_key *key)
+
+{
+       int level;
+       BUG_ON(!path->keep_locks);
+       for (level = 0; level < BTRFS_MAX_LEVEL; level++) {
+               if (!path->nodes[level])
+                       break;
+               btrfs_assert_tree_locked(path->nodes[level]);
+               if (path->slots[level] + 1 >=
+                   btrfs_header_nritems(path->nodes[level]))
+                       continue;
+               if (level == 0)
+                       btrfs_item_key_to_cpu(path->nodes[level], key,
+                                             path->slots[level] + 1);
+               else
+                       btrfs_node_key_to_cpu(path->nodes[level], key,
+                                             path->slots[level] + 1);
+               return 0;
+       }
+       return 1;
+}
+
+/*
+ * look for inline back ref. if back ref is found, *ref_ret is set
+ * to the address of inline back ref, and 0 is returned.
+ *
+ * if back ref isn't found, *ref_ret is set to the address where it
+ * should be inserted, and -ENOENT is returned.
+ *
+ * if insert is true and there are too many inline back refs, the path
+ * points to the extent item, and -EAGAIN is returned.
+ *
+ * NOTE: inline back refs are ordered in the same way that back ref
+ *      items in the tree are ordered.
+ */
+static noinline_for_stack
+int lookup_inline_extent_backref(struct btrfs_trans_handle *trans,
+                                struct btrfs_root *root,
+                                struct btrfs_path *path,
+                                struct btrfs_extent_inline_ref **ref_ret,
+                                u64 bytenr, u64 num_bytes,
+                                u64 parent, u64 root_objectid,
+                                u64 owner, u64 offset, int insert)
+{
+       struct btrfs_key key;
+       struct extent_buffer *leaf;
+       struct btrfs_extent_item *ei;
+       struct btrfs_extent_inline_ref *iref;
+       u64 flags;
+       u64 item_size;
+       unsigned long ptr;
+       unsigned long end;
+       int extra_size;
+       int type;
+       int want;
+       int ret;
+       int err = 0;
+
+       key.objectid = bytenr;
+       key.type = BTRFS_EXTENT_ITEM_KEY;
+       key.offset = num_bytes;
+
+       want = extent_ref_type(parent, owner);
+       if (insert) {
+               extra_size = btrfs_extent_inline_ref_size(want);
+               if (owner >= BTRFS_FIRST_FREE_OBJECTID)
+                       path->keep_locks = 1;
+       } else
+               extra_size = -1;
+       ret = btrfs_search_slot(trans, root, &key, path, extra_size, 1);
+       if (ret < 0) {
+               err = ret;
+               goto out;
+       }
+       BUG_ON(ret);
+
+       leaf = path->nodes[0];
+       item_size = btrfs_item_size_nr(leaf, path->slots[0]);
+#ifdef BTRFS_COMPAT_EXTENT_TREE_V0
+       if (item_size < sizeof(*ei)) {
+               if (!insert) {
+                       err = -ENOENT;
+                       goto out;
+               }
+               ret = convert_extent_item_v0(trans, root, path, owner,
+                                            extra_size);
+               if (ret < 0) {
+                       err = ret;
+                       goto out;
+               }
+               leaf = path->nodes[0];
+               item_size = btrfs_item_size_nr(leaf, path->slots[0]);
+       }
+#endif
+       BUG_ON(item_size < sizeof(*ei));
+
+       if (owner < BTRFS_FIRST_FREE_OBJECTID && insert &&
+           item_size + extra_size >= BTRFS_MAX_EXTENT_ITEM_SIZE(root)) {
+               err = -EAGAIN;
+               goto out;
+       }
+
+       ei = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_extent_item);
+       flags = btrfs_extent_flags(leaf, ei);
+
+       ptr = (unsigned long)(ei + 1);
+       end = (unsigned long)ei + item_size;
+
+       if (flags & BTRFS_EXTENT_FLAG_TREE_BLOCK) {
+               ptr += sizeof(struct btrfs_tree_block_info);
+               BUG_ON(ptr > end);
+       } else {
+               BUG_ON(!(flags & BTRFS_EXTENT_FLAG_DATA));
+       }
+
+       err = -ENOENT;
+       while (1) {
+               if (ptr >= end) {
+                       WARN_ON(ptr > end);
+                       break;
+               }
+               iref = (struct btrfs_extent_inline_ref *)ptr;
+               type = btrfs_extent_inline_ref_type(leaf, iref);
+               if (want < type)
+                       break;
+               if (want > type) {
+                       ptr += btrfs_extent_inline_ref_size(type);
+                       continue;
+               }
+
+               if (type == BTRFS_EXTENT_DATA_REF_KEY) {
+                       struct btrfs_extent_data_ref *dref;
+                       dref = (struct btrfs_extent_data_ref *)(&iref->offset);
+                       if (match_extent_data_ref(leaf, dref, root_objectid,
+                                                 owner, offset)) {
+                               err = 0;
+                               break;
+                       }
+                       if (hash_extent_data_ref_item(leaf, dref) <
+                           hash_extent_data_ref(root_objectid, owner, offset))
+                               break;
+               } else {
+                       u64 ref_offset;
+                       ref_offset = btrfs_extent_inline_ref_offset(leaf, iref);
+                       if (parent > 0) {
+                               if (parent == ref_offset) {
+                                       err = 0;
+                                       break;
+                               }
+                               if (ref_offset < parent)
+                                       break;
+                       } else {
+                               if (root_objectid == ref_offset) {
+                                       err = 0;
+                                       break;
+                               }
+                               if (ref_offset < root_objectid)
+                                       break;
+                       }
+               }
+               ptr += btrfs_extent_inline_ref_size(type);
+       }
+       if (err == -ENOENT && insert) {
+               if (item_size + extra_size >=
+                   BTRFS_MAX_EXTENT_ITEM_SIZE(root)) {
+                       err = -EAGAIN;
+                       goto out;
+               }
+               /*
+                * To add a new inline back ref, we have to make sure
+                * there is no corresponding back ref item.
+                * For simplicity, we do not add a new inline back
+                * ref if there is any kind of item for this block.
+                */
+               if (owner >= BTRFS_FIRST_FREE_OBJECTID &&
+                   find_next_key(path, &key) == 0 && key.objectid == bytenr) {
+                       err = -EAGAIN;
+                       goto out;
+               }
+       }
+       *ref_ret = (struct btrfs_extent_inline_ref *)ptr;
+out:
+       if (insert && owner >= BTRFS_FIRST_FREE_OBJECTID) {
+               path->keep_locks = 0;
+               btrfs_unlock_up_safe(path, 1);
+       }
+       return err;
+}
+
+/*
+ * helper to add new inline back ref
+ */
+static noinline_for_stack
+int setup_inline_extent_backref(struct btrfs_trans_handle *trans,
+                               struct btrfs_root *root,
+                               struct btrfs_path *path,
+                               struct btrfs_extent_inline_ref *iref,
+                               u64 parent, u64 root_objectid,
+                               u64 owner, u64 offset, int refs_to_add,
+                               struct btrfs_delayed_extent_op *extent_op)
+{
+       struct extent_buffer *leaf;
+       struct btrfs_extent_item *ei;
+       unsigned long ptr;
+       unsigned long end;
+       unsigned long item_offset;
+       u64 refs;
+       int size;
+       int type;
+       int ret;
+
+       leaf = path->nodes[0];
+       ei = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_extent_item);
+       item_offset = (unsigned long)iref - (unsigned long)ei;
+
+       type = extent_ref_type(parent, owner);
+       size = btrfs_extent_inline_ref_size(type);
+
+       ret = btrfs_extend_item(trans, root, path, size);
+       BUG_ON(ret);
+
+       ei = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_extent_item);
+       refs = btrfs_extent_refs(leaf, ei);
+       refs += refs_to_add;
+       btrfs_set_extent_refs(leaf, ei, refs);
+       if (extent_op)
+               __run_delayed_extent_op(extent_op, leaf, ei);
+
+       ptr = (unsigned long)ei + item_offset;
+       end = (unsigned long)ei + btrfs_item_size_nr(leaf, path->slots[0]);
+       if (ptr < end - size)
+               memmove_extent_buffer(leaf, ptr + size, ptr,
+                                     end - size - ptr);
+
+       iref = (struct btrfs_extent_inline_ref *)ptr;
+       btrfs_set_extent_inline_ref_type(leaf, iref, type);
+       if (type == BTRFS_EXTENT_DATA_REF_KEY) {
+               struct btrfs_extent_data_ref *dref;
+               dref = (struct btrfs_extent_data_ref *)(&iref->offset);
+               btrfs_set_extent_data_ref_root(leaf, dref, root_objectid);
+               btrfs_set_extent_data_ref_objectid(leaf, dref, owner);
+               btrfs_set_extent_data_ref_offset(leaf, dref, offset);
+               btrfs_set_extent_data_ref_count(leaf, dref, refs_to_add);
+       } else if (type == BTRFS_SHARED_DATA_REF_KEY) {
+               struct btrfs_shared_data_ref *sref;
+               sref = (struct btrfs_shared_data_ref *)(iref + 1);
+               btrfs_set_shared_data_ref_count(leaf, sref, refs_to_add);
+               btrfs_set_extent_inline_ref_offset(leaf, iref, parent);
+       } else if (type == BTRFS_SHARED_BLOCK_REF_KEY) {
+               btrfs_set_extent_inline_ref_offset(leaf, iref, parent);
+       } else {
+               btrfs_set_extent_inline_ref_offset(leaf, iref, root_objectid);
+       }
+       btrfs_mark_buffer_dirty(leaf);
+       return 0;
+}
+
+static int lookup_extent_backref(struct btrfs_trans_handle *trans,
+                                struct btrfs_root *root,
+                                struct btrfs_path *path,
+                                struct btrfs_extent_inline_ref **ref_ret,
+                                u64 bytenr, u64 num_bytes, u64 parent,
+                                u64 root_objectid, u64 owner, u64 offset)
+{
+       int ret;
+
+       ret = lookup_inline_extent_backref(trans, root, path, ref_ret,
+                                          bytenr, num_bytes, parent,
+                                          root_objectid, owner, offset, 0);
+       if (ret != -ENOENT)
+               return ret;
+
        btrfs_release_path(root, path);
+       *ref_ret = NULL;
+
+       if (owner < BTRFS_FIRST_FREE_OBJECTID) {
+               ret = lookup_tree_block_ref(trans, root, path, bytenr, parent,
+                                           root_objectid);
+       } else {
+               ret = lookup_extent_data_ref(trans, root, path, bytenr, parent,
+                                            root_objectid, owner, offset);
+       }
+       return ret;
+}
+
+/*
+ * helper to update/remove inline back ref
+ */
+static noinline_for_stack
+int update_inline_extent_backref(struct btrfs_trans_handle *trans,
+                                struct btrfs_root *root,
+                                struct btrfs_path *path,
+                                struct btrfs_extent_inline_ref *iref,
+                                int refs_to_mod,
+                                struct btrfs_delayed_extent_op *extent_op)
+{
+       struct extent_buffer *leaf;
+       struct btrfs_extent_item *ei;
+       struct btrfs_extent_data_ref *dref = NULL;
+       struct btrfs_shared_data_ref *sref = NULL;
+       unsigned long ptr;
+       unsigned long end;
+       u32 item_size;
+       int size;
+       int type;
+       int ret;
+       u64 refs;
+
+       leaf = path->nodes[0];
+       ei = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_extent_item);
+       refs = btrfs_extent_refs(leaf, ei);
+       WARN_ON(refs_to_mod < 0 && refs + refs_to_mod <= 0);
+       refs += refs_to_mod;
+       btrfs_set_extent_refs(leaf, ei, refs);
+       if (extent_op)
+               __run_delayed_extent_op(extent_op, leaf, ei);
+
+       type = btrfs_extent_inline_ref_type(leaf, iref);
+
+       if (type == BTRFS_EXTENT_DATA_REF_KEY) {
+               dref = (struct btrfs_extent_data_ref *)(&iref->offset);
+               refs = btrfs_extent_data_ref_count(leaf, dref);
+       } else if (type == BTRFS_SHARED_DATA_REF_KEY) {
+               sref = (struct btrfs_shared_data_ref *)(iref + 1);
+               refs = btrfs_shared_data_ref_count(leaf, sref);
+       } else {
+               refs = 1;
+               BUG_ON(refs_to_mod != -1);
+       }
+
+       BUG_ON(refs_to_mod < 0 && refs < -refs_to_mod);
+       refs += refs_to_mod;
+
+       if (refs > 0) {
+               if (type == BTRFS_EXTENT_DATA_REF_KEY)
+                       btrfs_set_extent_data_ref_count(leaf, dref, refs);
+               else
+                       btrfs_set_shared_data_ref_count(leaf, sref, refs);
+       } else {
+               size = btrfs_extent_inline_ref_size(type);
+               item_size = btrfs_item_size_nr(leaf, path->slots[0]);
+               ptr = (unsigned long)iref;
+               end = (unsigned long)ei + item_size;
+               if (ptr + size < end)
+                       memmove_extent_buffer(leaf, ptr, ptr + size,
+                                             end - ptr - size);
+               item_size -= size;
+               ret = btrfs_truncate_item(trans, root, path, item_size, 1);
+               BUG_ON(ret);
+       }
+       btrfs_mark_buffer_dirty(leaf);
+       return 0;
+}
+
+static noinline_for_stack
+int insert_inline_extent_backref(struct btrfs_trans_handle *trans,
+                                struct btrfs_root *root,
+                                struct btrfs_path *path,
+                                u64 bytenr, u64 num_bytes, u64 parent,
+                                u64 root_objectid, u64 owner,
+                                u64 offset, int refs_to_add,
+                                struct btrfs_delayed_extent_op *extent_op)
+{
+       struct btrfs_extent_inline_ref *iref;
+       int ret;
+
+       ret = lookup_inline_extent_backref(trans, root, path, &iref,
+                                          bytenr, num_bytes, parent,
+                                          root_objectid, owner, offset, 1);
+       if (ret == 0) {
+               BUG_ON(owner < BTRFS_FIRST_FREE_OBJECTID);
+               ret = update_inline_extent_backref(trans, root, path, iref,
+                                                  refs_to_add, extent_op);
+       } else if (ret == -ENOENT) {
+               ret = setup_inline_extent_backref(trans, root, path, iref,
+                                                 parent, root_objectid,
+                                                 owner, offset, refs_to_add,
+                                                 extent_op);
+       }
+       return ret;
+}
+
+static int insert_extent_backref(struct btrfs_trans_handle *trans,
+                                struct btrfs_root *root,
+                                struct btrfs_path *path,
+                                u64 bytenr, u64 parent, u64 root_objectid,
+                                u64 owner, u64 offset, int refs_to_add)
+{
+       int ret;
+       if (owner < BTRFS_FIRST_FREE_OBJECTID) {
+               BUG_ON(refs_to_add != 1);
+               ret = insert_tree_block_ref(trans, root, path, bytenr,
+                                           parent, root_objectid);
+       } else {
+               ret = insert_extent_data_ref(trans, root, path, bytenr,
+                                            parent, root_objectid,
+                                            owner, offset, refs_to_add);
+       }
+       return ret;
+}
+
+static int remove_extent_backref(struct btrfs_trans_handle *trans,
+                                struct btrfs_root *root,
+                                struct btrfs_path *path,
+                                struct btrfs_extent_inline_ref *iref,
+                                int refs_to_drop, int is_data)
+{
+       int ret;
+
+       BUG_ON(!is_data && refs_to_drop != 1);
+       if (iref) {
+               ret = update_inline_extent_backref(trans, root, path, iref,
+                                                  -refs_to_drop, NULL);
+       } else if (is_data) {
+               ret = remove_extent_data_ref(trans, root, path, refs_to_drop);
+       } else {
+               ret = btrfs_del_item(trans, root, path);
+       }
        return ret;
 }
 
@@ -686,71 +1453,40 @@ static int btrfs_discard_extent(struct btrfs_root *root, u64 bytenr,
 #endif
 }
 
-static int __btrfs_update_extent_ref(struct btrfs_trans_handle *trans,
-                                    struct btrfs_root *root, u64 bytenr,
-                                    u64 num_bytes,
-                                    u64 orig_parent, u64 parent,
-                                    u64 orig_root, u64 ref_root,
-                                    u64 orig_generation, u64 ref_generation,
-                                    u64 owner_objectid)
+int btrfs_inc_extent_ref(struct btrfs_trans_handle *trans,
+                        struct btrfs_root *root,
+                        u64 bytenr, u64 num_bytes, u64 parent,
+                        u64 root_objectid, u64 owner, u64 offset)
 {
        int ret;
-       int pin = owner_objectid < BTRFS_FIRST_FREE_OBJECTID;
+       BUG_ON(owner < BTRFS_FIRST_FREE_OBJECTID &&
+              root_objectid == BTRFS_TREE_LOG_OBJECTID);
 
-       ret = btrfs_update_delayed_ref(trans, bytenr, num_bytes,
-                                      orig_parent, parent, orig_root,
-                                      ref_root, orig_generation,
-                                      ref_generation, owner_objectid, pin);
-       BUG_ON(ret);
+       if (owner < BTRFS_FIRST_FREE_OBJECTID) {
+               ret = btrfs_add_delayed_tree_ref(trans, bytenr, num_bytes,
+                                       parent, root_objectid, (int)owner,
+                                       BTRFS_ADD_DELAYED_REF, NULL);
+       } else {
+               ret = btrfs_add_delayed_data_ref(trans, bytenr, num_bytes,
+                                       parent, root_objectid, owner, offset,
+                                       BTRFS_ADD_DELAYED_REF, NULL);
+       }
        return ret;
 }
 
-int btrfs_update_extent_ref(struct btrfs_trans_handle *trans,
-                           struct btrfs_root *root, u64 bytenr,
-                           u64 num_bytes, u64 orig_parent, u64 parent,
-                           u64 ref_root, u64 ref_generation,
-                           u64 owner_objectid)
-{
-       int ret;
-       if (ref_root == BTRFS_TREE_LOG_OBJECTID &&
-           owner_objectid < BTRFS_FIRST_FREE_OBJECTID)
-               return 0;
-
-       ret = __btrfs_update_extent_ref(trans, root, bytenr, num_bytes,
-                                       orig_parent, parent, ref_root,
-                                       ref_root, ref_generation,
-                                       ref_generation, owner_objectid);
-       return ret;
-}
 static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans,
-                                 struct btrfs_root *root, u64 bytenr,
-                                 u64 num_bytes,
-                                 u64 orig_parent, u64 parent,
-                                 u64 orig_root, u64 ref_root,
-                                 u64 orig_generation, u64 ref_generation,
-                                 u64 owner_objectid)
-{
-       int ret;
-
-       ret = btrfs_add_delayed_ref(trans, bytenr, num_bytes, parent, ref_root,
-                                   ref_generation, owner_objectid,
-                                   BTRFS_ADD_DELAYED_REF, 0);
-       BUG_ON(ret);
-       return ret;
-}
-
-static noinline_for_stack int add_extent_ref(struct btrfs_trans_handle *trans,
-                         struct btrfs_root *root, u64 bytenr,
-                         u64 num_bytes, u64 parent, u64 ref_root,
-                         u64 ref_generation, u64 owner_objectid,
-                         int refs_to_add)
+                                 struct btrfs_root *root,
+                                 u64 bytenr, u64 num_bytes,
+                                 u64 parent, u64 root_objectid,
+                                 u64 owner, u64 offset, int refs_to_add,
+                                 struct btrfs_delayed_extent_op *extent_op)
 {
        struct btrfs_path *path;
-       int ret;
-       struct btrfs_key key;
-       struct extent_buffer *l;
+       struct extent_buffer *leaf;
        struct btrfs_extent_item *item;
-       u32 refs;
+       u64 refs;
+       int ret;
+       int err = 0;
 
        path = btrfs_alloc_path();
        if (!path)
@@ -758,43 +1494,27 @@ static noinline_for_stack int add_extent_ref(struct btrfs_trans_handle *trans,
 
        path->reada = 1;
        path->leave_spinning = 1;
-       key.objectid = bytenr;
-       key.type = BTRFS_EXTENT_ITEM_KEY;
-       key.offset = num_bytes;
-
-       /* first find the extent item and update its reference count */
-       ret = btrfs_search_slot(trans, root->fs_info->extent_root, &key,
-                               path, 0, 1);
-       if (ret < 0) {
-               btrfs_set_path_blocking(path);
-               return ret;
-       }
-
-       if (ret > 0) {
-               WARN_ON(1);
-               btrfs_free_path(path);
-               return -EIO;
-       }
-       l = path->nodes[0];
+       /* this will set up the path even if it fails to insert the back ref */
+       ret = insert_inline_extent_backref(trans, root->fs_info->extent_root,
+                                          path, bytenr, num_bytes, parent,
+                                          root_objectid, owner, offset,
+                                          refs_to_add, extent_op);
+       if (ret == 0)
+               goto out;
 
-       btrfs_item_key_to_cpu(l, &key, path->slots[0]);
-       if (key.objectid != bytenr) {
-               btrfs_print_leaf(root->fs_info->extent_root, path->nodes[0]);
-               printk(KERN_ERR "btrfs wanted %llu found %llu\n",
-                      (unsigned long long)bytenr,
-                      (unsigned long long)key.objectid);
-               BUG();
+       if (ret != -EAGAIN) {
+               err = ret;
+               goto out;
        }
-       BUG_ON(key.type != BTRFS_EXTENT_ITEM_KEY);
 
-       item = btrfs_item_ptr(l, path->slots[0], struct btrfs_extent_item);
-
-       refs = btrfs_extent_refs(l, item);
-       btrfs_set_extent_refs(l, item, refs + refs_to_add);
-       btrfs_unlock_up_safe(path, 1);
-
-       btrfs_mark_buffer_dirty(path->nodes[0]);
+       leaf = path->nodes[0];
+       item = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_extent_item);
+       refs = btrfs_extent_refs(leaf, item);
+       btrfs_set_extent_refs(leaf, item, refs + refs_to_add);
+       if (extent_op)
+               __run_delayed_extent_op(extent_op, leaf, item);
 
+       btrfs_mark_buffer_dirty(leaf);
        btrfs_release_path(root->fs_info->extent_root, path);
 
        path->reada = 1;
@@ -802,56 +1522,197 @@ static noinline_for_stack int add_extent_ref(struct btrfs_trans_handle *trans,
 
        /* now insert the actual backref */
        ret = insert_extent_backref(trans, root->fs_info->extent_root,
-                                   path, bytenr, parent,
-                                   ref_root, ref_generation,
-                                   owner_objectid, refs_to_add);
+                                   path, bytenr, parent, root_objectid,
+                                   owner, offset, refs_to_add);
        BUG_ON(ret);
+out:
        btrfs_free_path(path);
-       return 0;
+       return err;
 }
 
-int btrfs_inc_extent_ref(struct btrfs_trans_handle *trans,
-                        struct btrfs_root *root,
-                        u64 bytenr, u64 num_bytes, u64 parent,
-                        u64 ref_root, u64 ref_generation,
-                        u64 owner_objectid)
+static int run_delayed_data_ref(struct btrfs_trans_handle *trans,
+                               struct btrfs_root *root,
+                               struct btrfs_delayed_ref_node *node,
+                               struct btrfs_delayed_extent_op *extent_op,
+                               int insert_reserved)
 {
-       int ret;
-       if (ref_root == BTRFS_TREE_LOG_OBJECTID &&
-           owner_objectid < BTRFS_FIRST_FREE_OBJECTID)
-               return 0;
+       int ret = 0;
+       struct btrfs_delayed_data_ref *ref;
+       struct btrfs_key ins;
+       u64 parent = 0;
+       u64 ref_root = 0;
+       u64 flags = 0;
+
+       ins.objectid = node->bytenr;
+       ins.offset = node->num_bytes;
+       ins.type = BTRFS_EXTENT_ITEM_KEY;
+
+       ref = btrfs_delayed_node_to_data_ref(node);
+       if (node->type == BTRFS_SHARED_DATA_REF_KEY)
+               parent = ref->parent;
+       else
+               ref_root = ref->root;
 
-       ret = __btrfs_inc_extent_ref(trans, root, bytenr, num_bytes, 0, parent,
-                                    0, ref_root, 0, ref_generation,
-                                    owner_objectid);
+       if (node->action == BTRFS_ADD_DELAYED_REF && insert_reserved) {
+               if (extent_op) {
+                       BUG_ON(extent_op->update_key);
+                       flags |= extent_op->flags_to_set;
+               }
+               ret = alloc_reserved_file_extent(trans, root,
+                                                parent, ref_root, flags,
+                                                ref->objectid, ref->offset,
+                                                &ins, node->ref_mod);
+               update_reserved_extents(root, ins.objectid, ins.offset, 0);
+       } else if (node->action == BTRFS_ADD_DELAYED_REF) {
+               ret = __btrfs_inc_extent_ref(trans, root, node->bytenr,
+                                            node->num_bytes, parent,
+                                            ref_root, ref->objectid,
+                                            ref->offset, node->ref_mod,
+                                            extent_op);
+       } else if (node->action == BTRFS_DROP_DELAYED_REF) {
+               ret = __btrfs_free_extent(trans, root, node->bytenr,
+                                         node->num_bytes, parent,
+                                         ref_root, ref->objectid,
+                                         ref->offset, node->ref_mod,
+                                         extent_op);
+       } else {
+               BUG();
+       }
        return ret;
 }
 
-static int drop_delayed_ref(struct btrfs_trans_handle *trans,
-                                       struct btrfs_root *root,
-                                       struct btrfs_delayed_ref_node *node)
+static void __run_delayed_extent_op(struct btrfs_delayed_extent_op *extent_op,
+                                   struct extent_buffer *leaf,
+                                   struct btrfs_extent_item *ei)
+{
+       u64 flags = btrfs_extent_flags(leaf, ei);
+       if (extent_op->update_flags) {
+               flags |= extent_op->flags_to_set;
+               btrfs_set_extent_flags(leaf, ei, flags);
+       }
+
+       if (extent_op->update_key) {
+               struct btrfs_tree_block_info *bi;
+               BUG_ON(!(flags & BTRFS_EXTENT_FLAG_TREE_BLOCK));
+               bi = (struct btrfs_tree_block_info *)(ei + 1);
+               btrfs_set_tree_block_key(leaf, bi, &extent_op->key);
+       }
+}
+
+static int run_delayed_extent_op(struct btrfs_trans_handle *trans,
+                                struct btrfs_root *root,
+                                struct btrfs_delayed_ref_node *node,
+                                struct btrfs_delayed_extent_op *extent_op)
+{
+       struct btrfs_key key;
+       struct btrfs_path *path;
+       struct btrfs_extent_item *ei;
+       struct extent_buffer *leaf;
+       u32 item_size;
+       int ret;
+       int err = 0;
+
+       path = btrfs_alloc_path();
+       if (!path)
+               return -ENOMEM;
+
+       key.objectid = node->bytenr;
+       key.type = BTRFS_EXTENT_ITEM_KEY;
+       key.offset = node->num_bytes;
+
+       path->reada = 1;
+       path->leave_spinning = 1;
+       ret = btrfs_search_slot(trans, root->fs_info->extent_root, &key,
+                               path, 0, 1);
+       if (ret < 0) {
+               err = ret;
+               goto out;
+       }
+       if (ret > 0) {
+               err = -EIO;
+               goto out;
+       }
+
+       leaf = path->nodes[0];
+       item_size = btrfs_item_size_nr(leaf, path->slots[0]);
+#ifdef BTRFS_COMPAT_EXTENT_TREE_V0
+       if (item_size < sizeof(*ei)) {
+               ret = convert_extent_item_v0(trans, root->fs_info->extent_root,
+                                            path, (u64)-1, 0);
+               if (ret < 0) {
+                       err = ret;
+                       goto out;
+               }
+               leaf = path->nodes[0];
+               item_size = btrfs_item_size_nr(leaf, path->slots[0]);
+       }
+#endif
+       BUG_ON(item_size < sizeof(*ei));
+       ei = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_extent_item);
+       __run_delayed_extent_op(extent_op, leaf, ei);
+
+       btrfs_mark_buffer_dirty(leaf);
+out:
+       btrfs_free_path(path);
+       return err;
+}
+
+static int run_delayed_tree_ref(struct btrfs_trans_handle *trans,
+                               struct btrfs_root *root,
+                               struct btrfs_delayed_ref_node *node,
+                               struct btrfs_delayed_extent_op *extent_op,
+                               int insert_reserved)
 {
        int ret = 0;
-       struct btrfs_delayed_ref *ref = btrfs_delayed_node_to_ref(node);
+       struct btrfs_delayed_tree_ref *ref;
+       struct btrfs_key ins;
+       u64 parent = 0;
+       u64 ref_root = 0;
 
-       BUG_ON(node->ref_mod == 0);
-       ret = __btrfs_free_extent(trans, root, node->bytenr, node->num_bytes,
-                                 node->parent, ref->root, ref->generation,
-                                 ref->owner_objectid, ref->pin, node->ref_mod);
+       ins.objectid = node->bytenr;
+       ins.offset = node->num_bytes;
+       ins.type = BTRFS_EXTENT_ITEM_KEY;
 
+       ref = btrfs_delayed_node_to_tree_ref(node);
+       if (node->type == BTRFS_SHARED_BLOCK_REF_KEY)
+               parent = ref->parent;
+       else
+               ref_root = ref->root;
+
+       BUG_ON(node->ref_mod != 1);
+       if (node->action == BTRFS_ADD_DELAYED_REF && insert_reserved) {
+               BUG_ON(!extent_op || !extent_op->update_flags ||
+                      !extent_op->update_key);
+               ret = alloc_reserved_tree_block(trans, root,
+                                               parent, ref_root,
+                                               extent_op->flags_to_set,
+                                               &extent_op->key,
+                                               ref->level, &ins);
+               update_reserved_extents(root, ins.objectid, ins.offset, 0);
+       } else if (node->action == BTRFS_ADD_DELAYED_REF) {
+               ret = __btrfs_inc_extent_ref(trans, root, node->bytenr,
+                                            node->num_bytes, parent, ref_root,
+                                            ref->level, 0, 1, extent_op);
+       } else if (node->action == BTRFS_DROP_DELAYED_REF) {
+               ret = __btrfs_free_extent(trans, root, node->bytenr,
+                                         node->num_bytes, parent, ref_root,
+                                         ref->level, 0, 1, extent_op);
+       } else {
+               BUG();
+       }
        return ret;
 }
 
+
 /* helper function to actually process a single delayed ref entry */
-static noinline int run_one_delayed_ref(struct btrfs_trans_handle *trans,
-                                       struct btrfs_root *root,
-                                       struct btrfs_delayed_ref_node *node,
-                                       int insert_reserved)
+static int run_one_delayed_ref(struct btrfs_trans_handle *trans,
+                              struct btrfs_root *root,
+                              struct btrfs_delayed_ref_node *node,
+                              struct btrfs_delayed_extent_op *extent_op,
+                              int insert_reserved)
 {
        int ret;
-       struct btrfs_delayed_ref *ref;
-
-       if (node->parent == (u64)-1) {
+       if (btrfs_delayed_ref_is_head(node)) {
                struct btrfs_delayed_ref_head *head;
                /*
                 * we've hit the end of the chain and we were supposed
@@ -859,44 +1720,35 @@ static noinline int run_one_delayed_ref(struct btrfs_trans_handle *trans,
                 * deleted before we ever needed to insert it, so all
                 * we have to do is clean up the accounting
                 */
+               BUG_ON(extent_op);
+               head = btrfs_delayed_node_to_head(node);
                if (insert_reserved) {
+                       if (head->is_data) {
+                               ret = btrfs_del_csums(trans, root,
+                                                     node->bytenr,
+                                                     node->num_bytes);
+                               BUG_ON(ret);
+                       }
+                       btrfs_update_pinned_extents(root, node->bytenr,
+                                                   node->num_bytes, 1);
                        update_reserved_extents(root, node->bytenr,
                                                node->num_bytes, 0);
                }
-               head = btrfs_delayed_node_to_head(node);
                mutex_unlock(&head->mutex);
                return 0;
        }
 
-       ref = btrfs_delayed_node_to_ref(node);
-       if (ref->action == BTRFS_ADD_DELAYED_REF) {
-               if (insert_reserved) {
-                       struct btrfs_key ins;
-
-                       ins.objectid = node->bytenr;
-                       ins.offset = node->num_bytes;
-                       ins.type = BTRFS_EXTENT_ITEM_KEY;
-
-                       /* record the full extent allocation */
-                       ret = __btrfs_alloc_reserved_extent(trans, root,
-                                       node->parent, ref->root,
-                                       ref->generation, ref->owner_objectid,
-                                       &ins, node->ref_mod);
-                       update_reserved_extents(root, node->bytenr,
-                                               node->num_bytes, 0);
-               } else {
-                       /* just add one backref */
-                       ret = add_extent_ref(trans, root, node->bytenr,
-                                    node->num_bytes,
-                                    node->parent, ref->root, ref->generation,
-                                    ref->owner_objectid, node->ref_mod);
-               }
-               BUG_ON(ret);
-       } else if (ref->action == BTRFS_DROP_DELAYED_REF) {
-               WARN_ON(insert_reserved);
-               ret = drop_delayed_ref(trans, root, node);
-       }
-       return 0;
+       if (node->type == BTRFS_TREE_BLOCK_REF_KEY ||
+           node->type == BTRFS_SHARED_BLOCK_REF_KEY)
+               ret = run_delayed_tree_ref(trans, root, node, extent_op,
+                                          insert_reserved);
+       else if (node->type == BTRFS_EXTENT_DATA_REF_KEY ||
+                node->type == BTRFS_SHARED_DATA_REF_KEY)
+               ret = run_delayed_data_ref(trans, root, node, extent_op,
+                                          insert_reserved);
+       else
+               BUG();
+       return ret;
 }
 
 static noinline struct btrfs_delayed_ref_node *
@@ -919,7 +1771,7 @@ again:
                                rb_node);
                if (ref->bytenr != head->node.bytenr)
                        break;
-               if (btrfs_delayed_node_to_ref(ref)->action == action)
+               if (ref->action == action)
                        return ref;
                node = rb_prev(node);
        }
@@ -937,6 +1789,7 @@ static noinline int run_clustered_refs(struct btrfs_trans_handle *trans,
        struct btrfs_delayed_ref_root *delayed_refs;
        struct btrfs_delayed_ref_node *ref;
        struct btrfs_delayed_ref_head *locked_ref = NULL;
+       struct btrfs_delayed_extent_op *extent_op;
        int ret;
        int count = 0;
        int must_insert_reserved = 0;
@@ -975,6 +1828,9 @@ static noinline int run_clustered_refs(struct btrfs_trans_handle *trans,
                must_insert_reserved = locked_ref->must_insert_reserved;
                locked_ref->must_insert_reserved = 0;
 
+               extent_op = locked_ref->extent_op;
+               locked_ref->extent_op = NULL;
+
                /*
                 * locked_ref is the head node, so we have to go one
                 * node back for any delayed ref updates
@@ -986,6 +1842,25 @@ static noinline int run_clustered_refs(struct btrfs_trans_handle *trans,
                         * so that any accounting fixes can happen
                         */
                        ref = &locked_ref->node;
+
+                       if (extent_op && must_insert_reserved) {
+                               kfree(extent_op);
+                               extent_op = NULL;
+                       }
+
+                       if (extent_op) {
+                               spin_unlock(&delayed_refs->lock);
+
+                               ret = run_delayed_extent_op(trans, root,
+                                                           ref, extent_op);
+                               BUG_ON(ret);
+                               kfree(extent_op);
+
+                               cond_resched();
+                               spin_lock(&delayed_refs->lock);
+                               continue;
+                       }
+
                        list_del_init(&locked_ref->cluster);
                        locked_ref = NULL;
                }
@@ -993,14 +1868,17 @@ static noinline int run_clustered_refs(struct btrfs_trans_handle *trans,
                ref->in_tree = 0;
                rb_erase(&ref->rb_node, &delayed_refs->root);
                delayed_refs->num_entries--;
+
                spin_unlock(&delayed_refs->lock);
 
-               ret = run_one_delayed_ref(trans, root, ref,
+               ret = run_one_delayed_ref(trans, root, ref, extent_op,
                                          must_insert_reserved);
                BUG_ON(ret);
-               btrfs_put_delayed_ref(ref);
 
+               btrfs_put_delayed_ref(ref);
+               kfree(extent_op);
                count++;
+
                cond_resched();
                spin_lock(&delayed_refs->lock);
        }
@@ -1095,25 +1973,112 @@ out:
        return 0;
 }
 
-int btrfs_cross_ref_exist(struct btrfs_trans_handle *trans,
-                         struct btrfs_root *root, u64 objectid, u64 bytenr)
+int btrfs_set_disk_extent_flags(struct btrfs_trans_handle *trans,
+                               struct btrfs_root *root,
+                               u64 bytenr, u64 num_bytes, u64 flags,
+                               int is_data)
+{
+       struct btrfs_delayed_extent_op *extent_op;
+       int ret;
+
+       extent_op = kmalloc(sizeof(*extent_op), GFP_NOFS);
+       if (!extent_op)
+               return -ENOMEM;
+
+       extent_op->flags_to_set = flags;
+       extent_op->update_flags = 1;
+       extent_op->update_key = 0;
+       extent_op->is_data = is_data ? 1 : 0;
+
+       ret = btrfs_add_delayed_extent_op(trans, bytenr, num_bytes, extent_op);
+       if (ret)
+               kfree(extent_op);
+       return ret;
+}
+
+static noinline int check_delayed_ref(struct btrfs_trans_handle *trans,
+                                     struct btrfs_root *root,
+                                     struct btrfs_path *path,
+                                     u64 objectid, u64 offset, u64 bytenr)
+{
+       struct btrfs_delayed_ref_head *head;
+       struct btrfs_delayed_ref_node *ref;
+       struct btrfs_delayed_data_ref *data_ref;
+       struct btrfs_delayed_ref_root *delayed_refs;
+       struct rb_node *node;
+       int ret = 0;
+
+       ret = -ENOENT;
+       delayed_refs = &trans->transaction->delayed_refs;
+       spin_lock(&delayed_refs->lock);
+       head = btrfs_find_delayed_ref_head(trans, bytenr);
+       if (!head)
+               goto out;
+
+       if (!mutex_trylock(&head->mutex)) {
+               atomic_inc(&head->node.refs);
+               spin_unlock(&delayed_refs->lock);
+
+               btrfs_release_path(root->fs_info->extent_root, path);
+
+               mutex_lock(&head->mutex);
+               mutex_unlock(&head->mutex);
+               btrfs_put_delayed_ref(&head->node);
+               return -EAGAIN;
+       }
+
+       node = rb_prev(&head->node.rb_node);
+       if (!node)
+               goto out_unlock;
+
+       ref = rb_entry(node, struct btrfs_delayed_ref_node, rb_node);
+
+       if (ref->bytenr != bytenr)
+               goto out_unlock;
+
+       ret = 1;
+       if (ref->type != BTRFS_EXTENT_DATA_REF_KEY)
+               goto out_unlock;
+
+       data_ref = btrfs_delayed_node_to_data_ref(ref);
+
+       node = rb_prev(node);
+       if (node) {
+               ref = rb_entry(node, struct btrfs_delayed_ref_node, rb_node);
+               if (ref->bytenr == bytenr)
+                       goto out_unlock;
+       }
+
+       if (data_ref->root != root->root_key.objectid ||
+           data_ref->objectid != objectid || data_ref->offset != offset)
+               goto out_unlock;
+
+       ret = 0;
+out_unlock:
+       mutex_unlock(&head->mutex);
+out:
+       spin_unlock(&delayed_refs->lock);
+       return ret;
+}
+
+static noinline int check_committed_ref(struct btrfs_trans_handle *trans,
+                                       struct btrfs_root *root,
+                                       struct btrfs_path *path,
+                                       u64 objectid, u64 offset, u64 bytenr)
 {
        struct btrfs_root *extent_root = root->fs_info->extent_root;
-       struct btrfs_path *path;
        struct extent_buffer *leaf;
-       struct btrfs_extent_ref *ref_item;
+       struct btrfs_extent_data_ref *ref;
+       struct btrfs_extent_inline_ref *iref;
+       struct btrfs_extent_item *ei;
        struct btrfs_key key;
-       struct btrfs_key found_key;
-       u64 ref_root;
-       u64 last_snapshot;
-       u32 nritems;
+       u32 item_size;
        int ret;
 
        key.objectid = bytenr;
        key.offset = (u64)-1;
        key.type = BTRFS_EXTENT_ITEM_KEY;
 
-       path = btrfs_alloc_path();
        ret = btrfs_search_slot(NULL, extent_root, &key, path, 0, 0);
        if (ret < 0)
                goto out;
@@ -1125,55 +2090,83 @@ int btrfs_cross_ref_exist(struct btrfs_trans_handle *trans,
 
        path->slots[0]--;
        leaf = path->nodes[0];
-       btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
+       btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
 
-       if (found_key.objectid != bytenr ||
-           found_key.type != BTRFS_EXTENT_ITEM_KEY)
+       if (key.objectid != bytenr || key.type != BTRFS_EXTENT_ITEM_KEY)
                goto out;
 
-       last_snapshot = btrfs_root_last_snapshot(&root->root_item);
-       while (1) {
-               leaf = path->nodes[0];
-               nritems = btrfs_header_nritems(leaf);
-               if (path->slots[0] >= nritems) {
-                       ret = btrfs_next_leaf(extent_root, path);
-                       if (ret < 0)
-                               goto out;
-                       if (ret == 0)
-                               continue;
-                       break;
-               }
-               btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
-               if (found_key.objectid != bytenr)
-                       break;
+       ret = 1;
+       item_size = btrfs_item_size_nr(leaf, path->slots[0]);
+#ifdef BTRFS_COMPAT_EXTENT_TREE_V0
+       if (item_size < sizeof(*ei)) {
+               WARN_ON(item_size != sizeof(struct btrfs_extent_item_v0));
+               goto out;
+       }
+#endif
+       ei = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_extent_item);
 
-               if (found_key.type != BTRFS_EXTENT_REF_KEY) {
-                       path->slots[0]++;
-                       continue;
-               }
+       if (item_size != sizeof(*ei) +
+           btrfs_extent_inline_ref_size(BTRFS_EXTENT_DATA_REF_KEY))
+               goto out;
 
-               ref_item = btrfs_item_ptr(leaf, path->slots[0],
-                                         struct btrfs_extent_ref);
-               ref_root = btrfs_ref_root(leaf, ref_item);
-               if ((ref_root != root->root_key.objectid &&
-                    ref_root != BTRFS_TREE_LOG_OBJECTID) ||
-                    objectid != btrfs_ref_objectid(leaf, ref_item)) {
-                       ret = 1;
-                       goto out;
-               }
-               if (btrfs_ref_generation(leaf, ref_item) <= last_snapshot) {
-                       ret = 1;
+       if (btrfs_extent_generation(leaf, ei) <=
+           btrfs_root_last_snapshot(&root->root_item))
+               goto out;
+
+       iref = (struct btrfs_extent_inline_ref *)(ei + 1);
+       if (btrfs_extent_inline_ref_type(leaf, iref) !=
+           BTRFS_EXTENT_DATA_REF_KEY)
+               goto out;
+
+       ref = (struct btrfs_extent_data_ref *)(&iref->offset);
+       if (btrfs_extent_refs(leaf, ei) !=
+           btrfs_extent_data_ref_count(leaf, ref) ||
+           btrfs_extent_data_ref_root(leaf, ref) !=
+           root->root_key.objectid ||
+           btrfs_extent_data_ref_objectid(leaf, ref) != objectid ||
+           btrfs_extent_data_ref_offset(leaf, ref) != offset)
+               goto out;
+
+       ret = 0;
+out:
+       return ret;
+}
+
+int btrfs_cross_ref_exist(struct btrfs_trans_handle *trans,
+                         struct btrfs_root *root,
+                         u64 objectid, u64 offset, u64 bytenr)
+{
+       struct btrfs_path *path;
+       int ret;
+       int ret2;
+
+       path = btrfs_alloc_path();
+       if (!path)
+               return -ENOMEM;
+
+       do {
+               ret = check_committed_ref(trans, root, path, objectid,
+                                         offset, bytenr);
+               if (ret && ret != -ENOENT)
                        goto out;
-               }
 
-               path->slots[0]++;
+               ret2 = check_delayed_ref(trans, root, path, objectid,
+                                        offset, bytenr);
+       } while (ret2 == -EAGAIN);
+
+       if (ret2 && ret2 != -ENOENT) {
+               ret = ret2;
+               goto out;
        }
-       ret = 0;
+
+       if (ret != -ENOENT || ret2 != -ENOENT)
+               ret = 0;
 out:
        btrfs_free_path(path);
        return ret;
 }
 
+#if 0
 int btrfs_cache_ref(struct btrfs_trans_handle *trans, struct btrfs_root *root,
                    struct extent_buffer *buf, u32 nr_extents)
 {
@@ -1291,62 +2284,44 @@ static int refsort_cmp(const void *a_void, const void *b_void)
                return 1;
        return 0;
 }
+#endif
 
-
-noinline int btrfs_inc_ref(struct btrfs_trans_handle *trans,
+static int __btrfs_mod_ref(struct btrfs_trans_handle *trans,
                           struct btrfs_root *root,
-                          struct extent_buffer *orig_buf,
-                          struct extent_buffer *buf, u32 *nr_extents)
+                          struct extent_buffer *buf,
+                          int full_backref, int inc)
 {
        u64 bytenr;
+       u64 num_bytes;
+       u64 parent;
        u64 ref_root;
-       u64 orig_root;
-       u64 ref_generation;
-       u64 orig_generation;
-       struct refsort *sorted;
        u32 nritems;
-       u32 nr_file_extents = 0;
        struct btrfs_key key;
        struct btrfs_file_extent_item *fi;
        int i;
        int level;
        int ret = 0;
-       int faili = 0;
-       int refi = 0;
-       int slot;
        int (*process_func)(struct btrfs_trans_handle *, struct btrfs_root *,
-                           u64, u64, u64, u64, u64, u64, u64, u64, u64);
+                           u64, u64, u64, u64, u64, u64);
 
        ref_root = btrfs_header_owner(buf);
-       ref_generation = btrfs_header_generation(buf);
-       orig_root = btrfs_header_owner(orig_buf);
-       orig_generation = btrfs_header_generation(orig_buf);
-
        nritems = btrfs_header_nritems(buf);
        level = btrfs_header_level(buf);
 
-       sorted = kmalloc(sizeof(struct refsort) * nritems, GFP_NOFS);
-       BUG_ON(!sorted);
+       if (!root->ref_cows && level == 0)
+               return 0;
 
-       if (root->ref_cows) {
-               process_func = __btrfs_inc_extent_ref;
-       } else {
-               if (level == 0 &&
-                   root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID)
-                       goto out;
-               if (level != 0 &&
-                   root->root_key.objectid == BTRFS_TREE_LOG_OBJECTID)
-                       goto out;
-               process_func = __btrfs_update_extent_ref;
-       }
+       if (inc)
+               process_func = btrfs_inc_extent_ref;
+       else
+               process_func = btrfs_free_extent;
+
+       if (full_backref)
+               parent = buf->start;
+       else
+               parent = 0;
 
-       /*
-        * we make two passes through the items.  In the first pass we
-        * only record the byte number and slot.  Then we sort based on
-        * byte number and do the actual work based on the sorted results
-        */
        for (i = 0; i < nritems; i++) {
-               cond_resched();
                if (level == 0) {
                        btrfs_item_key_to_cpu(buf, &key, i);
                        if (btrfs_key_type(&key) != BTRFS_EXTENT_DATA_KEY)
@@ -1360,151 +2335,38 @@ noinline int btrfs_inc_ref(struct btrfs_trans_handle *trans,
                        if (bytenr == 0)
                                continue;
 
-                       nr_file_extents++;
-                       sorted[refi].bytenr = bytenr;
-                       sorted[refi].slot = i;
-                       refi++;
-               } else {
-                       bytenr = btrfs_node_blockptr(buf, i);
-                       sorted[refi].bytenr = bytenr;
-                       sorted[refi].slot = i;
-                       refi++;
-               }
-       }
-       /*
-        * if refi == 0, we didn't actually put anything into the sorted
-        * array and we're done
-        */
-       if (refi == 0)
-               goto out;
-
-       sort(sorted, refi, sizeof(struct refsort), refsort_cmp, NULL);
-
-       for (i = 0; i < refi; i++) {
-               cond_resched();
-               slot = sorted[i].slot;
-               bytenr = sorted[i].bytenr;
-
-               if (level == 0) {
-                       btrfs_item_key_to_cpu(buf, &key, slot);
-                       fi = btrfs_item_ptr(buf, slot,
-                                           struct btrfs_file_extent_item);
-
-                       bytenr = btrfs_file_extent_disk_bytenr(buf, fi);
-                       if (bytenr == 0)
-                               continue;
-
-                       ret = process_func(trans, root, bytenr,
-                                  btrfs_file_extent_disk_num_bytes(buf, fi),
-                                  orig_buf->start, buf->start,
-                                  orig_root, ref_root,
-                                  orig_generation, ref_generation,
-                                  key.objectid);
-
-                       if (ret) {
-                               faili = slot;
-                               WARN_ON(1);
+                       num_bytes = btrfs_file_extent_disk_num_bytes(buf, fi);
+                       key.offset -= btrfs_file_extent_offset(buf, fi);
+                       ret = process_func(trans, root, bytenr, num_bytes,
+                                          parent, ref_root, key.objectid,
+                                          key.offset);
+                       if (ret)
                                goto fail;
-                       }
                } else {
-                       ret = process_func(trans, root, bytenr, buf->len,
-                                          orig_buf->start, buf->start,
-                                          orig_root, ref_root,
-                                          orig_generation, ref_generation,
-                                          level - 1);
-                       if (ret) {
-                               faili = slot;
-                               WARN_ON(1);
+                       bytenr = btrfs_node_blockptr(buf, i);
+                       num_bytes = btrfs_level_size(root, level - 1);
+                       ret = process_func(trans, root, bytenr, num_bytes,
+                                          parent, ref_root, level - 1, 0);
+                       if (ret)
                                goto fail;
-                       }
                }
        }
-out:
-       kfree(sorted);
-       if (nr_extents) {
-               if (level == 0)
-                       *nr_extents = nr_file_extents;
-               else
-                       *nr_extents = nritems;
-       }
        return 0;
 fail:
-       kfree(sorted);
-       WARN_ON(1);
+       BUG();
        return ret;
 }
 
-int btrfs_update_ref(struct btrfs_trans_handle *trans,
-                    struct btrfs_root *root, struct extent_buffer *orig_buf,
-                    struct extent_buffer *buf, int start_slot, int nr)
-
+int btrfs_inc_ref(struct btrfs_trans_handle *trans, struct btrfs_root *root,
+                 struct extent_buffer *buf, int full_backref)
 {
-       u64 bytenr;
-       u64 ref_root;
-       u64 orig_root;
-       u64 ref_generation;
-       u64 orig_generation;
-       struct btrfs_key key;
-       struct btrfs_file_extent_item *fi;
-       int i;
-       int ret;
-       int slot;
-       int level;
-
-       BUG_ON(start_slot < 0);
-       BUG_ON(start_slot + nr > btrfs_header_nritems(buf));
-
-       ref_root = btrfs_header_owner(buf);
-       ref_generation = btrfs_header_generation(buf);
-       orig_root = btrfs_header_owner(orig_buf);
-       orig_generation = btrfs_header_generation(orig_buf);
-       level = btrfs_header_level(buf);
-
-       if (!root->ref_cows) {
-               if (level == 0 &&
-                   root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID)
-                       return 0;
-               if (level != 0 &&
-                   root->root_key.objectid == BTRFS_TREE_LOG_OBJECTID)
-                       return 0;
-       }
+       return __btrfs_mod_ref(trans, root, buf, full_backref, 1);
+}
 
-       for (i = 0, slot = start_slot; i < nr; i++, slot++) {
-               cond_resched();
-               if (level == 0) {
-                       btrfs_item_key_to_cpu(buf, &key, slot);
-                       if (btrfs_key_type(&key) != BTRFS_EXTENT_DATA_KEY)
-                               continue;
-                       fi = btrfs_item_ptr(buf, slot,
-                                           struct btrfs_file_extent_item);
-                       if (btrfs_file_extent_type(buf, fi) ==
-                           BTRFS_FILE_EXTENT_INLINE)
-                               continue;
-                       bytenr = btrfs_file_extent_disk_bytenr(buf, fi);
-                       if (bytenr == 0)
-                               continue;
-                       ret = __btrfs_update_extent_ref(trans, root, bytenr,
-                                   btrfs_file_extent_disk_num_bytes(buf, fi),
-                                   orig_buf->start, buf->start,
-                                   orig_root, ref_root, orig_generation,
-                                   ref_generation, key.objectid);
-                       if (ret)
-                               goto fail;
-               } else {
-                       bytenr = btrfs_node_blockptr(buf, slot);
-                       ret = __btrfs_update_extent_ref(trans, root, bytenr,
-                                           buf->len, orig_buf->start,
-                                           buf->start, orig_root, ref_root,
-                                           orig_generation, ref_generation,
-                                           level - 1);
-                       if (ret)
-                               goto fail;
-               }
-       }
-       return 0;
-fail:
-       WARN_ON(1);
-       return -1;
+int btrfs_dec_ref(struct btrfs_trans_handle *trans, struct btrfs_root *root,
+                 struct extent_buffer *buf, int full_backref)
+{
+       return __btrfs_mod_ref(trans, root, buf, full_backref, 0);
 }
 
 static int write_one_cache_group(struct btrfs_trans_handle *trans,
@@ -2007,6 +2869,24 @@ static int update_block_group(struct btrfs_trans_handle *trans,
        u64 old_val;
        u64 byte_in_group;
 
+       /* block accounting for super block */
+       spin_lock(&info->delalloc_lock);
+       old_val = btrfs_super_bytes_used(&info->super_copy);
+       if (alloc)
+               old_val += num_bytes;
+       else
+               old_val -= num_bytes;
+       btrfs_set_super_bytes_used(&info->super_copy, old_val);
+
+       /* block accounting for root item */
+       old_val = btrfs_root_used(&root->root_item);
+       if (alloc)
+               old_val += num_bytes;
+       else
+               old_val -= num_bytes;
+       btrfs_set_root_used(&root->root_item, old_val);
+       spin_unlock(&info->delalloc_lock);
+
        while (total) {
                cache = btrfs_lookup_block_group(info, bytenr);
                if (!cache)
@@ -2216,8 +3096,6 @@ static int pin_down_bytes(struct btrfs_trans_handle *trans,
                u64 header_owner = btrfs_header_owner(buf);
                u64 header_transid = btrfs_header_generation(buf);
                if (header_owner != BTRFS_TREE_LOG_OBJECTID &&
-                   header_owner != BTRFS_TREE_RELOC_OBJECTID &&
-                   header_owner != BTRFS_DATA_RELOC_TREE_OBJECTID &&
                    header_transid == trans->transid &&
                    !btrfs_header_flag(buf, BTRFS_HEADER_FLAG_WRITTEN)) {
                        *must_clean = buf;
@@ -2235,63 +3113,77 @@ pinit:
        return 0;
 }
 
-/*
- * remove an extent from the root, returns 0 on success
- */
-static int __free_extent(struct btrfs_trans_handle *trans,
-                        struct btrfs_root *root,
-                        u64 bytenr, u64 num_bytes, u64 parent,
-                        u64 root_objectid, u64 ref_generation,
-                        u64 owner_objectid, int pin, int mark_free,
-                        int refs_to_drop)
+
+static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
+                               struct btrfs_root *root,
+                               u64 bytenr, u64 num_bytes, u64 parent,
+                               u64 root_objectid, u64 owner_objectid,
+                               u64 owner_offset, int refs_to_drop,
+                               struct btrfs_delayed_extent_op *extent_op)
 {
-       struct btrfs_path *path;
        struct btrfs_key key;
+       struct btrfs_path *path;
        struct btrfs_fs_info *info = root->fs_info;
        struct btrfs_root *extent_root = info->extent_root;
        struct extent_buffer *leaf;
+       struct btrfs_extent_item *ei;
+       struct btrfs_extent_inline_ref *iref;
        int ret;
+       int is_data;
        int extent_slot = 0;
        int found_extent = 0;
        int num_to_del = 1;
-       struct btrfs_extent_item *ei;
-       u32 refs;
+       u32 item_size;
+       u64 refs;
 
-       key.objectid = bytenr;
-       btrfs_set_key_type(&key, BTRFS_EXTENT_ITEM_KEY);
-       key.offset = num_bytes;
        path = btrfs_alloc_path();
        if (!path)
                return -ENOMEM;
 
        path->reada = 1;
        path->leave_spinning = 1;
-       ret = lookup_extent_backref(trans, extent_root, path,
-                                   bytenr, parent, root_objectid,
-                                   ref_generation, owner_objectid, 1);
+
+       is_data = owner_objectid >= BTRFS_FIRST_FREE_OBJECTID;
+       BUG_ON(!is_data && refs_to_drop != 1);
+
+       ret = lookup_extent_backref(trans, extent_root, path, &iref,
+                                   bytenr, num_bytes, parent,
+                                   root_objectid, owner_objectid,
+                                   owner_offset);
        if (ret == 0) {
-               struct btrfs_key found_key;
                extent_slot = path->slots[0];
-               while (extent_slot > 0) {
-                       extent_slot--;
-                       btrfs_item_key_to_cpu(path->nodes[0], &found_key,
+               while (extent_slot >= 0) {
+                       btrfs_item_key_to_cpu(path->nodes[0], &key,
                                              extent_slot);
-                       if (found_key.objectid != bytenr)
+                       if (key.objectid != bytenr)
                                break;
-                       if (found_key.type == BTRFS_EXTENT_ITEM_KEY &&
-                           found_key.offset == num_bytes) {
+                       if (key.type == BTRFS_EXTENT_ITEM_KEY &&
+                           key.offset == num_bytes) {
                                found_extent = 1;
                                break;
                        }
                        if (path->slots[0] - extent_slot > 5)
                                break;
+                       extent_slot--;
                }
+#ifdef BTRFS_COMPAT_EXTENT_TREE_V0
+               item_size = btrfs_item_size_nr(path->nodes[0], extent_slot);
+               if (found_extent && item_size < sizeof(*ei))
+                       found_extent = 0;
+#endif
                if (!found_extent) {
+                       BUG_ON(iref);
                        ret = remove_extent_backref(trans, extent_root, path,
-                                                   refs_to_drop);
+                                                   NULL, refs_to_drop,
+                                                   is_data);
                        BUG_ON(ret);
                        btrfs_release_path(extent_root, path);
                        path->leave_spinning = 1;
+
+                       key.objectid = bytenr;
+                       key.type = BTRFS_EXTENT_ITEM_KEY;
+                       key.offset = num_bytes;
+
                        ret = btrfs_search_slot(trans, extent_root,
                                                &key, path, -1, 1);
                        if (ret) {
@@ -2307,82 +3199,98 @@ static int __free_extent(struct btrfs_trans_handle *trans,
                btrfs_print_leaf(extent_root, path->nodes[0]);
                WARN_ON(1);
                printk(KERN_ERR "btrfs unable to find ref byte nr %llu "
-                      "parent %llu root %llu gen %llu owner %llu\n",
+                      "parent %llu root %llu owner %llu offset %llu\n",
                       (unsigned long long)bytenr,
                       (unsigned long long)parent,
                       (unsigned long long)root_objectid,
-                      (unsigned long long)ref_generation,
-                      (unsigned long long)owner_objectid);
+                      (unsigned long long)owner_objectid,
+                      (unsigned long long)owner_offset);
        }
 
        leaf = path->nodes[0];
+       item_size = btrfs_item_size_nr(leaf, extent_slot);
+#ifdef BTRFS_COMPAT_EXTENT_TREE_V0
+       if (item_size < sizeof(*ei)) {
+               BUG_ON(found_extent || extent_slot != path->slots[0]);
+               ret = convert_extent_item_v0(trans, extent_root, path,
+                                            owner_objectid, 0);
+               BUG_ON(ret < 0);
+
+               btrfs_release_path(extent_root, path);
+               path->leave_spinning = 1;
+
+               key.objectid = bytenr;
+               key.type = BTRFS_EXTENT_ITEM_KEY;
+               key.offset = num_bytes;
+
+               ret = btrfs_search_slot(trans, extent_root, &key, path,
+                                       -1, 1);
+               if (ret) {
+                       printk(KERN_ERR "umm, got %d back from search"
+                              ", was looking for %llu\n", ret,
+                              (unsigned long long)bytenr);
+                       btrfs_print_leaf(extent_root, path->nodes[0]);
+               }
+               BUG_ON(ret);
+               extent_slot = path->slots[0];
+               leaf = path->nodes[0];
+               item_size = btrfs_item_size_nr(leaf, extent_slot);
+       }
+#endif
+       BUG_ON(item_size < sizeof(*ei));
        ei = btrfs_item_ptr(leaf, extent_slot,
                            struct btrfs_extent_item);
-       refs = btrfs_extent_refs(leaf, ei);
-
-       /*
-        * we're not allowed to delete the extent item if there
-        * are other delayed ref updates pending
-        */
+       if (owner_objectid < BTRFS_FIRST_FREE_OBJECTID) {
+               struct btrfs_tree_block_info *bi;
+               BUG_ON(item_size < sizeof(*ei) + sizeof(*bi));
+               bi = (struct btrfs_tree_block_info *)(ei + 1);
+               WARN_ON(owner_objectid != btrfs_tree_block_level(leaf, bi));
+       }
 
+       refs = btrfs_extent_refs(leaf, ei);
        BUG_ON(refs < refs_to_drop);
        refs -= refs_to_drop;
-       btrfs_set_extent_refs(leaf, ei, refs);
-       btrfs_mark_buffer_dirty(leaf);
 
-       if (refs == 0 && found_extent &&
-           path->slots[0] == extent_slot + 1) {
-               struct btrfs_extent_ref *ref;
-               ref = btrfs_item_ptr(leaf, path->slots[0],
-                                    struct btrfs_extent_ref);
-               BUG_ON(btrfs_ref_num_refs(leaf, ref) != refs_to_drop);
-               /* if the back ref and the extent are next to each other
-                * they get deleted below in one shot
+       if (refs > 0) {
+               if (extent_op)
+                       __run_delayed_extent_op(extent_op, leaf, ei);
+               /*
+                * For an inline back ref, the reference count is
+                * updated by remove_extent_backref() instead.
                 */
-               path->slots[0] = extent_slot;
-               num_to_del = 2;
-       } else if (found_extent) {
-               /* otherwise delete the extent back ref */
-               ret = remove_extent_backref(trans, extent_root, path,
-                                           refs_to_drop);
-               BUG_ON(ret);
-               /* if refs are 0, we need to setup the path for deletion */
-               if (refs == 0) {
-                       btrfs_release_path(extent_root, path);
-                       path->leave_spinning = 1;
-                       ret = btrfs_search_slot(trans, extent_root, &key, path,
-                                               -1, 1);
+               if (iref) {
+                       BUG_ON(!found_extent);
+               } else {
+                       btrfs_set_extent_refs(leaf, ei, refs);
+                       btrfs_mark_buffer_dirty(leaf);
+               }
+               if (found_extent) {
+                       ret = remove_extent_backref(trans, extent_root, path,
+                                                   iref, refs_to_drop,
+                                                   is_data);
                        BUG_ON(ret);
                }
-       }
-
-       if (refs == 0) {
-               u64 super_used;
-               u64 root_used;
+       } else {
+               int mark_free = 0;
                struct extent_buffer *must_clean = NULL;
 
-               if (pin) {
-                       ret = pin_down_bytes(trans, root, path,
-                               bytenr, num_bytes,
-                               owner_objectid >= BTRFS_FIRST_FREE_OBJECTID,
-                               &must_clean);
-                       if (ret > 0)
-                               mark_free = 1;
-                       BUG_ON(ret < 0);
+               if (found_extent) {
+                       BUG_ON(is_data && refs_to_drop !=
+                              extent_data_ref_count(root, path, iref));
+                       if (iref) {
+                               BUG_ON(path->slots[0] != extent_slot);
+                       } else {
+                               BUG_ON(path->slots[0] != extent_slot + 1);
+                               path->slots[0] = extent_slot;
+                               num_to_del = 2;
+                       }
                }
 
-               /* block accounting for super block */
-               spin_lock(&info->delalloc_lock);
-               super_used = btrfs_super_bytes_used(&info->super_copy);
-               btrfs_set_super_bytes_used(&info->super_copy,
-                                          super_used - num_bytes);
-
-               /* block accounting for root item */
-               root_used = btrfs_root_used(&root->root_item);
-               btrfs_set_root_used(&root->root_item,
-                                          root_used - num_bytes);
-               spin_unlock(&info->delalloc_lock);
-
+               ret = pin_down_bytes(trans, root, path, bytenr,
+                                    num_bytes, is_data, &must_clean);
+               if (ret > 0)
+                       mark_free = 1;
+               BUG_ON(ret < 0);
                /*
                 * it is going to be very rare for someone to be waiting
                 * on the block we're freeing.  del_items might need to
@@ -2403,7 +3311,7 @@ static int __free_extent(struct btrfs_trans_handle *trans,
                        free_extent_buffer(must_clean);
                }
 
-               if (owner_objectid >= BTRFS_FIRST_FREE_OBJECTID) {
+               if (is_data) {
                        ret = btrfs_del_csums(trans, root, bytenr, num_bytes);
                        BUG_ON(ret);
                } else {
@@ -2421,34 +3329,6 @@ static int __free_extent(struct btrfs_trans_handle *trans,
 }
 
 /*
- * remove an extent from the root, returns 0 on success
- */
-static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
-                                       struct btrfs_root *root,
-                                       u64 bytenr, u64 num_bytes, u64 parent,
-                                       u64 root_objectid, u64 ref_generation,
-                                       u64 owner_objectid, int pin,
-                                       int refs_to_drop)
-{
-       WARN_ON(num_bytes < root->sectorsize);
-
-       /*
-        * if metadata always pin
-        * if data pin when any transaction has committed this
-        */
-       if (owner_objectid < BTRFS_FIRST_FREE_OBJECTID ||
-           ref_generation != trans->transid)
-               pin = 1;
-
-       if (ref_generation != trans->transid)
-               pin = 1;
-
-       return __free_extent(trans, root, bytenr, num_bytes, parent,
-                           root_objectid, ref_generation,
-                           owner_objectid, pin, pin == 0, refs_to_drop);
-}
-
-/*
  * when we free an extent, it is possible (and likely) that we free the last
  * delayed ref for that extent as well.  This searches the delayed ref tree for
  * a given extent, and if there are no other delayed refs to be processed, it
@@ -2479,6 +3359,13 @@ static noinline int check_ref_cleanup(struct btrfs_trans_handle *trans,
        if (ref->bytenr == bytenr)
                goto out;
 
+       if (head->extent_op) {
+               if (!head->must_insert_reserved)
+                       goto out;
+               kfree(head->extent_op);
+               head->extent_op = NULL;
+       }
+
        /*
         * waiting for the lock here would deadlock.  If someone else has it
         * locked they are already in the process of dropping it anyway
@@ -2507,7 +3394,8 @@ static noinline int check_ref_cleanup(struct btrfs_trans_handle *trans,
        spin_unlock(&delayed_refs->lock);
 
        ret = run_one_delayed_ref(trans, root->fs_info->tree_root,
-                                 &head->node, head->must_insert_reserved);
+                                 &head->node, head->extent_op,
+                                 head->must_insert_reserved);
        BUG_ON(ret);
        btrfs_put_delayed_ref(&head->node);
        return 0;
@@ -2519,32 +3407,32 @@ out:
 int btrfs_free_extent(struct btrfs_trans_handle *trans,
                      struct btrfs_root *root,
                      u64 bytenr, u64 num_bytes, u64 parent,
-                     u64 root_objectid, u64 ref_generation,
-                     u64 owner_objectid, int pin)
+                     u64 root_objectid, u64 owner, u64 offset)
 {
        int ret;
 
        /*
         * tree log blocks never actually go into the extent allocation
         * tree, just update pinning info and exit early.
-        *
-        * data extents referenced by the tree log do need to have
-        * their reference counts bumped.
         */
-       if (root->root_key.objectid == BTRFS_TREE_LOG_OBJECTID &&
-           owner_objectid < BTRFS_FIRST_FREE_OBJECTID) {
+       if (root_objectid == BTRFS_TREE_LOG_OBJECTID) {
+               WARN_ON(owner >= BTRFS_FIRST_FREE_OBJECTID);
                /* unlocks the pinned mutex */
                btrfs_update_pinned_extents(root, bytenr, num_bytes, 1);
                update_reserved_extents(root, bytenr, num_bytes, 0);
                ret = 0;
-       } else {
-               ret = btrfs_add_delayed_ref(trans, bytenr, num_bytes, parent,
-                                      root_objectid, ref_generation,
-                                      owner_objectid,
-                                      BTRFS_DROP_DELAYED_REF, 1);
+       } else if (owner < BTRFS_FIRST_FREE_OBJECTID) {
+               ret = btrfs_add_delayed_tree_ref(trans, bytenr, num_bytes,
+                                       parent, root_objectid, (int)owner,
+                                       BTRFS_DROP_DELAYED_REF, NULL);
                BUG_ON(ret);
                ret = check_ref_cleanup(trans, root, bytenr);
                BUG_ON(ret);
+       } else {
+               ret = btrfs_add_delayed_data_ref(trans, bytenr, num_bytes,
+                                       parent, root_objectid, owner,
+                                       offset, BTRFS_DROP_DELAYED_REF, NULL);
+               BUG_ON(ret);
        }
        return ret;
 }
@@ -2969,99 +3857,147 @@ int btrfs_reserve_extent(struct btrfs_trans_handle *trans,
        return ret;
 }
 
-static int __btrfs_alloc_reserved_extent(struct btrfs_trans_handle *trans,
-                                        struct btrfs_root *root, u64 parent,
-                                        u64 root_objectid, u64 ref_generation,
-                                        u64 owner, struct btrfs_key *ins,
-                                        int ref_mod)
+static int alloc_reserved_file_extent(struct btrfs_trans_handle *trans,
+                                     struct btrfs_root *root,
+                                     u64 parent, u64 root_objectid,
+                                     u64 flags, u64 owner, u64 offset,
+                                     struct btrfs_key *ins, int ref_mod)
 {
        int ret;
-       u64 super_used;
-       u64 root_used;
-       u64 num_bytes = ins->offset;
-       u32 sizes[2];
-       struct btrfs_fs_info *info = root->fs_info;
-       struct btrfs_root *extent_root = info->extent_root;
+       struct btrfs_fs_info *fs_info = root->fs_info;
        struct btrfs_extent_item *extent_item;
-       struct btrfs_extent_ref *ref;
+       struct btrfs_extent_inline_ref *iref;
        struct btrfs_path *path;
-       struct btrfs_key keys[2];
-
-       if (parent == 0)
-               parent = ins->objectid;
-
-       /* block accounting for super block */
-       spin_lock(&info->delalloc_lock);
-       super_used = btrfs_super_bytes_used(&info->super_copy);
-       btrfs_set_super_bytes_used(&info->super_copy, super_used + num_bytes);
+       struct extent_buffer *leaf;
+       int type;
+       u32 size;
 
-       /* block accounting for root item */
-       root_used = btrfs_root_used(&root->root_item);
-       btrfs_set_root_used(&root->root_item, root_used + num_bytes);
-       spin_unlock(&info->delalloc_lock);
+       if (parent > 0)
+               type = BTRFS_SHARED_DATA_REF_KEY;
+       else
+               type = BTRFS_EXTENT_DATA_REF_KEY;
 
-       memcpy(&keys[0], ins, sizeof(*ins));
-       keys[1].objectid = ins->objectid;
-       keys[1].type = BTRFS_EXTENT_REF_KEY;
-       keys[1].offset = parent;
-       sizes[0] = sizeof(*extent_item);
-       sizes[1] = sizeof(*ref);
+       size = sizeof(*extent_item) + btrfs_extent_inline_ref_size(type);
 
        path = btrfs_alloc_path();
        BUG_ON(!path);
 
        path->leave_spinning = 1;
-       ret = btrfs_insert_empty_items(trans, extent_root, path, keys,
-                                      sizes, 2);
+       ret = btrfs_insert_empty_item(trans, fs_info->extent_root, path,
+                                     ins, size);
        BUG_ON(ret);
 
-       extent_item = btrfs_item_ptr(path->nodes[0], path->slots[0],
+       leaf = path->nodes[0];
+       extent_item = btrfs_item_ptr(leaf, path->slots[0],
                                     struct btrfs_extent_item);
-       btrfs_set_extent_refs(path->nodes[0], extent_item, ref_mod);
-       ref = btrfs_item_ptr(path->nodes[0], path->slots[0] + 1,
-                            struct btrfs_extent_ref);
-
-       btrfs_set_ref_root(path->nodes[0], ref, root_objectid);
-       btrfs_set_ref_generation(path->nodes[0], ref, ref_generation);
-       btrfs_set_ref_objectid(path->nodes[0], ref, owner);
-       btrfs_set_ref_num_refs(path->nodes[0], ref, ref_mod);
+       btrfs_set_extent_refs(leaf, extent_item, ref_mod);
+       btrfs_set_extent_generation(leaf, extent_item, trans->transid);
+       btrfs_set_extent_flags(leaf, extent_item,
+                              flags | BTRFS_EXTENT_FLAG_DATA);
+
+       iref = (struct btrfs_extent_inline_ref *)(extent_item + 1);
+       btrfs_set_extent_inline_ref_type(leaf, iref, type);
+       if (parent > 0) {
+               struct btrfs_shared_data_ref *ref;
+               ref = (struct btrfs_shared_data_ref *)(iref + 1);
+               btrfs_set_extent_inline_ref_offset(leaf, iref, parent);
+               btrfs_set_shared_data_ref_count(leaf, ref, ref_mod);
+       } else {
+               struct btrfs_extent_data_ref *ref;
+               ref = (struct btrfs_extent_data_ref *)(&iref->offset);
+               btrfs_set_extent_data_ref_root(leaf, ref, root_objectid);
+               btrfs_set_extent_data_ref_objectid(leaf, ref, owner);
+               btrfs_set_extent_data_ref_offset(leaf, ref, offset);
+               btrfs_set_extent_data_ref_count(leaf, ref, ref_mod);
+       }
 
        btrfs_mark_buffer_dirty(path->nodes[0]);
-
-       trans->alloc_exclude_start = 0;
-       trans->alloc_exclude_nr = 0;
        btrfs_free_path(path);
 
-       if (ret)
-               goto out;
-
-       ret = update_block_group(trans, root, ins->objectid,
-                                ins->offset, 1, 0);
+       ret = update_block_group(trans, root, ins->objectid, ins->offset,
+                                1, 0);
        if (ret) {
                printk(KERN_ERR "btrfs update block group failed for %llu "
                       "%llu\n", (unsigned long long)ins->objectid,
                       (unsigned long long)ins->offset);
                BUG();
        }
-out:
        return ret;
 }
 
-int btrfs_alloc_reserved_extent(struct btrfs_trans_handle *trans,
-                               struct btrfs_root *root, u64 parent,
-                               u64 root_objectid, u64 ref_generation,
-                               u64 owner, struct btrfs_key *ins)
+static int alloc_reserved_tree_block(struct btrfs_trans_handle *trans,
+                                    struct btrfs_root *root,
+                                    u64 parent, u64 root_objectid,
+                                    u64 flags, struct btrfs_disk_key *key,
+                                    int level, struct btrfs_key *ins)
 {
        int ret;
+       struct btrfs_fs_info *fs_info = root->fs_info;
+       struct btrfs_extent_item *extent_item;
+       struct btrfs_tree_block_info *block_info;
+       struct btrfs_extent_inline_ref *iref;
+       struct btrfs_path *path;
+       struct extent_buffer *leaf;
+       u32 size = sizeof(*extent_item) + sizeof(*block_info) + sizeof(*iref);
 
-       if (root_objectid == BTRFS_TREE_LOG_OBJECTID)
-               return 0;
+       path = btrfs_alloc_path();
+       BUG_ON(!path);
 
-       ret = btrfs_add_delayed_ref(trans, ins->objectid,
-                                   ins->offset, parent, root_objectid,
-                                   ref_generation, owner,
-                                   BTRFS_ADD_DELAYED_EXTENT, 0);
+       path->leave_spinning = 1;
+       ret = btrfs_insert_empty_item(trans, fs_info->extent_root, path,
+                                     ins, size);
        BUG_ON(ret);
+
+       leaf = path->nodes[0];
+       extent_item = btrfs_item_ptr(leaf, path->slots[0],
+                                    struct btrfs_extent_item);
+       btrfs_set_extent_refs(leaf, extent_item, 1);
+       btrfs_set_extent_generation(leaf, extent_item, trans->transid);
+       btrfs_set_extent_flags(leaf, extent_item,
+                              flags | BTRFS_EXTENT_FLAG_TREE_BLOCK);
+       block_info = (struct btrfs_tree_block_info *)(extent_item + 1);
+
+       btrfs_set_tree_block_key(leaf, block_info, key);
+       btrfs_set_tree_block_level(leaf, block_info, level);
+
+       iref = (struct btrfs_extent_inline_ref *)(block_info + 1);
+       if (parent > 0) {
+               BUG_ON(!(flags & BTRFS_BLOCK_FLAG_FULL_BACKREF));
+               btrfs_set_extent_inline_ref_type(leaf, iref,
+                                                BTRFS_SHARED_BLOCK_REF_KEY);
+               btrfs_set_extent_inline_ref_offset(leaf, iref, parent);
+       } else {
+               btrfs_set_extent_inline_ref_type(leaf, iref,
+                                                BTRFS_TREE_BLOCK_REF_KEY);
+               btrfs_set_extent_inline_ref_offset(leaf, iref, root_objectid);
+       }
+
+       btrfs_mark_buffer_dirty(leaf);
+       btrfs_free_path(path);
+
+       ret = update_block_group(trans, root, ins->objectid, ins->offset,
+                                1, 0);
+       if (ret) {
+               printk(KERN_ERR "btrfs update block group failed for %llu "
+                      "%llu\n", (unsigned long long)ins->objectid,
+                      (unsigned long long)ins->offset);
+               BUG();
+       }
+       return ret;
+}
+
+int btrfs_alloc_reserved_file_extent(struct btrfs_trans_handle *trans,
+                                    struct btrfs_root *root,
+                                    u64 root_objectid, u64 owner,
+                                    u64 offset, struct btrfs_key *ins)
+{
+       int ret;
+
+       BUG_ON(root_objectid == BTRFS_TREE_LOG_OBJECTID);
+
+       ret = btrfs_add_delayed_data_ref(trans, ins->objectid, ins->offset,
+                                        0, root_objectid, owner, offset,
+                                        BTRFS_ADD_DELAYED_EXTENT, NULL);
        return ret;
 }
 
@@ -3070,10 +4006,10 @@ int btrfs_alloc_reserved_extent(struct btrfs_trans_handle *trans,
  * an extent has been allocated and makes sure to clear the free
  * space cache bits as well
  */
-int btrfs_alloc_logged_extent(struct btrfs_trans_handle *trans,
-                               struct btrfs_root *root, u64 parent,
-                               u64 root_objectid, u64 ref_generation,
-                               u64 owner, struct btrfs_key *ins)
+int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
+                                  struct btrfs_root *root,
+                                  u64 root_objectid, u64 owner, u64 offset,
+                                  struct btrfs_key *ins)
 {
        int ret;
        struct btrfs_block_group_cache *block_group;
@@ -3087,8 +4023,8 @@ int btrfs_alloc_logged_extent(struct btrfs_trans_handle *trans,
                                      ins->offset);
        BUG_ON(ret);
        btrfs_put_block_group(block_group);
-       ret = __btrfs_alloc_reserved_extent(trans, root, parent, root_objectid,
-                                           ref_generation, owner, ins, 1);
+       ret = alloc_reserved_file_extent(trans, root, 0, root_objectid,
+                                        0, owner, offset, ins, 1);
        return ret;
 }
 
@@ -3099,26 +4035,48 @@ int btrfs_alloc_logged_extent(struct btrfs_trans_handle *trans,
  *
  * returns 0 if everything worked, non-zero otherwise.
  */
-int btrfs_alloc_extent(struct btrfs_trans_handle *trans,
-                      struct btrfs_root *root,
-                      u64 num_bytes, u64 parent, u64 min_alloc_size,
-                      u64 root_objectid, u64 ref_generation,
-                      u64 owner_objectid, u64 empty_size, u64 hint_byte,
-                      u64 search_end, struct btrfs_key *ins, u64 data)
+static int alloc_tree_block(struct btrfs_trans_handle *trans,
+                           struct btrfs_root *root,
+                           u64 num_bytes, u64 parent, u64 root_objectid,
+                           struct btrfs_disk_key *key, int level,
+                           u64 empty_size, u64 hint_byte, u64 search_end,
+                           struct btrfs_key *ins)
 {
        int ret;
-       ret = __btrfs_reserve_extent(trans, root, num_bytes,
-                                    min_alloc_size, empty_size, hint_byte,
-                                    search_end, ins, data);
+       u64 flags = 0;
+
+       ret = __btrfs_reserve_extent(trans, root, num_bytes, num_bytes,
+                                    empty_size, hint_byte, search_end,
+                                    ins, 0);
        BUG_ON(ret);
+
+       if (root_objectid == BTRFS_TREE_RELOC_OBJECTID) {
+               if (parent == 0)
+                       parent = ins->objectid;
+               flags |= BTRFS_BLOCK_FLAG_FULL_BACKREF;
+       } else
+               BUG_ON(parent > 0);
+
+       update_reserved_extents(root, ins->objectid, ins->offset, 1);
        if (root_objectid != BTRFS_TREE_LOG_OBJECTID) {
-               ret = btrfs_add_delayed_ref(trans, ins->objectid,
-                                           ins->offset, parent, root_objectid,
-                                           ref_generation, owner_objectid,
-                                           BTRFS_ADD_DELAYED_EXTENT, 0);
+               struct btrfs_delayed_extent_op *extent_op;
+               extent_op = kmalloc(sizeof(*extent_op), GFP_NOFS);
+               BUG_ON(!extent_op);
+               if (key)
+                       memcpy(&extent_op->key, key, sizeof(extent_op->key));
+               else
+                       memset(&extent_op->key, 0, sizeof(extent_op->key));
+               extent_op->flags_to_set = flags;
+               extent_op->update_key = 1;
+               extent_op->update_flags = 1;
+               extent_op->is_data = 0;
+
+               ret = btrfs_add_delayed_tree_ref(trans, ins->objectid,
+                                       ins->offset, parent, root_objectid,
+                                       level, BTRFS_ADD_DELAYED_EXTENT,
+                                       extent_op);
                BUG_ON(ret);
        }
-       update_reserved_extents(root, ins->objectid, ins->offset, 1);
        return ret;
 }
 
@@ -3157,21 +4115,17 @@ struct extent_buffer *btrfs_init_new_buffer(struct btrfs_trans_handle *trans,
  * returns the tree buffer or NULL.
  */
 struct extent_buffer *btrfs_alloc_free_block(struct btrfs_trans_handle *trans,
-                                            struct btrfs_root *root,
-                                            u32 blocksize, u64 parent,
-                                            u64 root_objectid,
-                                            u64 ref_generation,
-                                            int level,
-                                            u64 hint,
-                                            u64 empty_size)
+                                       struct btrfs_root *root, u32 blocksize,
+                                       u64 parent, u64 root_objectid,
+                                       struct btrfs_disk_key *key, int level,
+                                       u64 hint, u64 empty_size)
 {
        struct btrfs_key ins;
        int ret;
        struct extent_buffer *buf;
 
-       ret = btrfs_alloc_extent(trans, root, blocksize, parent, blocksize,
-                                root_objectid, ref_generation, level,
-                                empty_size, hint, (u64)-1, &ins, 0);
+       ret = alloc_tree_block(trans, root, blocksize, parent, root_objectid,
+                              key, level, empty_size, hint, (u64)-1, &ins);
        if (ret) {
                BUG_ON(ret > 0);
                return ERR_PTR(ret);
@@ -3185,32 +4139,19 @@ struct extent_buffer *btrfs_alloc_free_block(struct btrfs_trans_handle *trans,
 int btrfs_drop_leaf_ref(struct btrfs_trans_handle *trans,
                        struct btrfs_root *root, struct extent_buffer *leaf)
 {
-       u64 leaf_owner;
-       u64 leaf_generation;
-       struct refsort *sorted;
+       u64 disk_bytenr;
+       u64 num_bytes;
        struct btrfs_key key;
        struct btrfs_file_extent_item *fi;
+       u32 nritems;
        int i;
-       int nritems;
        int ret;
-       int refi = 0;
-       int slot;
 
        BUG_ON(!btrfs_is_leaf(leaf));
        nritems = btrfs_header_nritems(leaf);
-       leaf_owner = btrfs_header_owner(leaf);
-       leaf_generation = btrfs_header_generation(leaf);
 
-       sorted = kmalloc(sizeof(*sorted) * nritems, GFP_NOFS);
-       /* we do this loop twice.  The first time we build a list
-        * of the extents we have a reference on, then we sort the list
-        * by bytenr.  The second time around we actually do the
-        * extent freeing.
-        */
        for (i = 0; i < nritems; i++) {
-               u64 disk_bytenr;
                cond_resched();
-
                btrfs_item_key_to_cpu(leaf, &key, i);
 
                /* only extents have references, skip everything else */
@@ -3230,45 +4171,16 @@ int btrfs_drop_leaf_ref(struct btrfs_trans_handle *trans,
                if (disk_bytenr == 0)
                        continue;
 
-               sorted[refi].bytenr = disk_bytenr;
-               sorted[refi].slot = i;
-               refi++;
-       }
-
-       if (refi == 0)
-               goto out;
-
-       sort(sorted, refi, sizeof(struct refsort), refsort_cmp, NULL);
-
-       for (i = 0; i < refi; i++) {
-               u64 disk_bytenr;
-
-               disk_bytenr = sorted[i].bytenr;
-               slot = sorted[i].slot;
-
-               cond_resched();
-
-               btrfs_item_key_to_cpu(leaf, &key, slot);
-               if (btrfs_key_type(&key) != BTRFS_EXTENT_DATA_KEY)
-                       continue;
-
-               fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item);
-
-               ret = btrfs_free_extent(trans, root, disk_bytenr,
-                               btrfs_file_extent_disk_num_bytes(leaf, fi),
-                               leaf->start, leaf_owner, leaf_generation,
-                               key.objectid, 0);
+               num_bytes = btrfs_file_extent_disk_num_bytes(leaf, fi);
+               ret = btrfs_free_extent(trans, root, disk_bytenr, num_bytes,
+                                       leaf->start, 0, key.objectid, 0);
                BUG_ON(ret);
-
-               atomic_inc(&root->fs_info->throttle_gen);
-               wake_up(&root->fs_info->transaction_throttle);
-               cond_resched();
        }
-out:
-       kfree(sorted);
        return 0;
 }
 
+#if 0
+
 static noinline int cache_drop_leaf_ref(struct btrfs_trans_handle *trans,
                                        struct btrfs_root *root,
                                        struct btrfs_leaf_ref *ref)
@@ -3311,13 +4223,14 @@ static noinline int cache_drop_leaf_ref(struct btrfs_trans_handle *trans,
        return 0;
 }
 
+
 static int drop_snap_lookup_refcount(struct btrfs_trans_handle *trans,
                                     struct btrfs_root *root, u64 start,
                                     u64 len, u32 *refs)
 {
        int ret;
 
-       ret = btrfs_lookup_extent_ref(trans, root, start, len, refs);
+       ret = btrfs_lookup_extent_refs(trans, root, start, len, refs);
        BUG_ON(ret);
 
 #if 0 /* some debugging code in case we see problems here */
@@ -3352,6 +4265,7 @@ static int drop_snap_lookup_refcount(struct btrfs_trans_handle *trans,
        return ret;
 }
 
+
 /*
  * this is used while deleting old snapshots, and it drops the refs
  * on a whole subtree starting from a level 1 node.
@@ -3645,32 +4559,36 @@ out:
        cond_resched();
        return 0;
 }
+#endif
 
 /*
  * helper function for drop_subtree, this function is similar to
  * walk_down_tree. The main difference is that it checks reference
  * counts while tree blocks are locked.
  */
-static noinline int walk_down_subtree(struct btrfs_trans_handle *trans,
-                                     struct btrfs_root *root,
-                                     struct btrfs_path *path, int *level)
+static noinline int walk_down_tree(struct btrfs_trans_handle *trans,
+                                  struct btrfs_root *root,
+                                  struct btrfs_path *path, int *level)
 {
        struct extent_buffer *next;
        struct extent_buffer *cur;
        struct extent_buffer *parent;
        u64 bytenr;
        u64 ptr_gen;
+       u64 refs;
+       u64 flags;
        u32 blocksize;
-       u32 refs;
        int ret;
 
        cur = path->nodes[*level];
-       ret = btrfs_lookup_extent_ref(trans, root, cur->start, cur->len,
-                                     &refs);
+       ret = btrfs_lookup_extent_info(trans, root, cur->start, cur->len,
+                                      &refs, &flags);
        BUG_ON(ret);
        if (refs > 1)
                goto out;
 
+       BUG_ON(!(flags & BTRFS_BLOCK_FLAG_FULL_BACKREF));
+
        while (*level >= 0) {
                cur = path->nodes[*level];
                if (*level == 0) {
@@ -3692,16 +4610,15 @@ static noinline int walk_down_subtree(struct btrfs_trans_handle *trans,
                btrfs_tree_lock(next);
                btrfs_set_lock_blocking(next);
 
-               ret = btrfs_lookup_extent_ref(trans, root, bytenr, blocksize,
-                                             &refs);
+               ret = btrfs_lookup_extent_info(trans, root, bytenr, blocksize,
+                                              &refs, &flags);
                BUG_ON(ret);
                if (refs > 1) {
                        parent = path->nodes[*level];
                        ret = btrfs_free_extent(trans, root, bytenr,
-                                       blocksize, parent->start,
-                                       btrfs_header_owner(parent),
-                                       btrfs_header_generation(parent),
-                                       *level - 1, 1);
+                                               blocksize, parent->start,
+                                               btrfs_header_owner(parent),
+                                               *level - 1, 0);
                        BUG_ON(ret);
                        path->slots[*level]++;
                        btrfs_tree_unlock(next);
@@ -3709,6 +4626,8 @@ static noinline int walk_down_subtree(struct btrfs_trans_handle *trans,
                        continue;
                }
 
+               BUG_ON(!(flags & BTRFS_BLOCK_FLAG_FULL_BACKREF));
+
                *level = btrfs_header_level(next);
                path->nodes[*level] = next;
                path->slots[*level] = 0;
@@ -3716,13 +4635,15 @@ static noinline int walk_down_subtree(struct btrfs_trans_handle *trans,
                cond_resched();
        }
 out:
-       parent = path->nodes[*level + 1];
+       if (path->nodes[*level] == root->node)
+               parent = path->nodes[*level];
+       else
+               parent = path->nodes[*level + 1];
        bytenr = path->nodes[*level]->start;
        blocksize = path->nodes[*level]->len;
 
-       ret = btrfs_free_extent(trans, root, bytenr, blocksize,
-                       parent->start, btrfs_header_owner(parent),
-                       btrfs_header_generation(parent), *level, 1);
+       ret = btrfs_free_extent(trans, root, bytenr, blocksize, parent->start,
+                               btrfs_header_owner(parent), *level, 0);
        BUG_ON(ret);
 
        if (path->locks[*level]) {
@@ -3746,8 +4667,6 @@ static noinline int walk_up_tree(struct btrfs_trans_handle *trans,
                                 struct btrfs_path *path,
                                 int *level, int max_level)
 {
-       u64 root_owner;
-       u64 root_gen;
        struct btrfs_root_item *root_item = &root->root_item;
        int i;
        int slot;
@@ -3755,24 +4674,22 @@ static noinline int walk_up_tree(struct btrfs_trans_handle *trans,
 
        for (i = *level; i < max_level && path->nodes[i]; i++) {
                slot = path->slots[i];
-               if (slot < btrfs_header_nritems(path->nodes[i]) - 1) {
-                       struct extent_buffer *node;
-                       struct btrfs_disk_key disk_key;
-
+               if (slot + 1 < btrfs_header_nritems(path->nodes[i])) {
                        /*
                         * there is more work to do in this level.
                         * Update the drop_progress marker to reflect
                         * the work we've done so far, and then bump
                         * the slot number
                         */
-                       node = path->nodes[i];
                        path->slots[i]++;
-                       *level = i;
                        WARN_ON(*level == 0);
-                       btrfs_node_key(node, &disk_key, path->slots[i]);
-                       memcpy(&root_item->drop_progress,
-                              &disk_key, sizeof(disk_key));
-                       root_item->drop_level = i;
+                       if (max_level == BTRFS_MAX_LEVEL) {
+                               btrfs_node_key(path->nodes[i],
+                                              &root_item->drop_progress,
+                                              path->slots[i]);
+                               root_item->drop_level = i;
+                       }
+                       *level = i;
                        return 0;
                } else {
                        struct extent_buffer *parent;
@@ -3786,22 +4703,20 @@ static noinline int walk_up_tree(struct btrfs_trans_handle *trans,
                        else
                                parent = path->nodes[*level + 1];
 
-                       root_owner = btrfs_header_owner(parent);
-                       root_gen = btrfs_header_generation(parent);
-
-                       clean_tree_block(trans, root, path->nodes[*level]);
+                       clean_tree_block(trans, root, path->nodes[i]);
                        ret = btrfs_free_extent(trans, root,
-                                               path->nodes[*level]->start,
-                                               path->nodes[*level]->len,
-                                               parent->start, root_owner,
-                                               root_gen, *level, 1);
+                                               path->nodes[i]->start,
+                                               path->nodes[i]->len,
+                                               parent->start,
+                                               btrfs_header_owner(parent),
+                                               *level, 0);
                        BUG_ON(ret);
                        if (path->locks[*level]) {
-                               btrfs_tree_unlock(path->nodes[*level]);
-                               path->locks[*level] = 0;
+                               btrfs_tree_unlock(path->nodes[i]);
+                               path->locks[i] = 0;
                        }
-                       free_extent_buffer(path->nodes[*level]);
-                       path->nodes[*level] = NULL;
+                       free_extent_buffer(path->nodes[i]);
+                       path->nodes[i] = NULL;
                        *level = i + 1;
                }
        }
@@ -3820,21 +4735,18 @@ int btrfs_drop_snapshot(struct btrfs_trans_handle *trans, struct btrfs_root
        int wret;
        int level;
        struct btrfs_path *path;
-       int i;
-       int orig_level;
        int update_count;
        struct btrfs_root_item *root_item = &root->root_item;
 
-       WARN_ON(!mutex_is_locked(&root->fs_info->drop_mutex));
        path = btrfs_alloc_path();
        BUG_ON(!path);
 
        level = btrfs_header_level(root->node);
-       orig_level = level;
        if (btrfs_disk_key_objectid(&root_item->drop_progress) == 0) {
-               path->nodes[level] = root->node;
-               extent_buffer_get(root->node);
+               path->nodes[level] = btrfs_lock_root_node(root);
+               btrfs_set_lock_blocking(path->nodes[level]);
                path->slots[level] = 0;
+               path->locks[level] = 1;
        } else {
                struct btrfs_key key;
                struct btrfs_disk_key found_key;
@@ -3856,12 +4768,7 @@ int btrfs_drop_snapshot(struct btrfs_trans_handle *trans, struct btrfs_root
                 * unlock our path, this is safe because only this
                 * function is allowed to delete this snapshot
                 */
-               for (i = 0; i < BTRFS_MAX_LEVEL; i++) {
-                       if (path->nodes[i] && path->locks[i]) {
-                               path->locks[i] = 0;
-                               btrfs_tree_unlock(path->nodes[i]);
-                       }
-               }
+               btrfs_unlock_up_safe(path, 0);
        }
        while (1) {
                unsigned long update;
@@ -3882,8 +4789,6 @@ int btrfs_drop_snapshot(struct btrfs_trans_handle *trans, struct btrfs_root
                        ret = -EAGAIN;
                        break;
                }
-               atomic_inc(&root->fs_info->throttle_gen);
-               wake_up(&root->fs_info->transaction_throttle);
                for (update_count = 0; update_count < 16; update_count++) {
                        update = trans->delayed_ref_updates;
                        trans->delayed_ref_updates = 0;
@@ -3893,12 +4798,6 @@ int btrfs_drop_snapshot(struct btrfs_trans_handle *trans, struct btrfs_root
                                break;
                }
        }
-       for (i = 0; i <= orig_level; i++) {
-               if (path->nodes[i]) {
-                       free_extent_buffer(path->nodes[i]);
-                       path->nodes[i] = NULL;
-               }
-       }
 out:
        btrfs_free_path(path);
        return ret;
@@ -3931,7 +4830,7 @@ int btrfs_drop_subtree(struct btrfs_trans_handle *trans,
        path->slots[level] = 0;
 
        while (1) {
-               wret = walk_down_subtree(trans, root, path, &level);
+               wret = walk_down_tree(trans, root, path, &level);
                if (wret < 0)
                        ret = wret;
                if (wret != 0)
@@ -3948,6 +4847,7 @@ int btrfs_drop_subtree(struct btrfs_trans_handle *trans,
        return ret;
 }
 
+#if 0
 static unsigned long calc_ra(unsigned long start, unsigned long last,
                             unsigned long nr)
 {
@@ -5429,6 +6329,7 @@ out:
        kfree(ref_path);
        return ret;
 }
+#endif
 
 static u64 update_block_group_flags(struct btrfs_root *root, u64 flags)
 {
@@ -5477,7 +6378,8 @@ static int __alloc_chunk_for_shrink(struct btrfs_root *root,
        u64 calc;
 
        spin_lock(&shrink_block_group->lock);
-       if (btrfs_block_group_used(&shrink_block_group->item) > 0) {
+       if (btrfs_block_group_used(&shrink_block_group->item) +
+           shrink_block_group->reserved > 0) {
                spin_unlock(&shrink_block_group->lock);
 
                trans = btrfs_start_transaction(root, 1);
@@ -5502,6 +6404,17 @@ static int __alloc_chunk_for_shrink(struct btrfs_root *root,
        return 0;
 }
 
+
+int btrfs_prepare_block_group_relocation(struct btrfs_root *root,
+                                        struct btrfs_block_group_cache *group)
+
+{
+       __alloc_chunk_for_shrink(root, group, 1);
+       set_block_group_readonly(group);
+       return 0;
+}
+
+#if 0
 static int __insert_orphan_inode(struct btrfs_trans_handle *trans,
                                 struct btrfs_root *root,
                                 u64 objectid, u64 size)
@@ -5781,6 +6694,7 @@ out:
        btrfs_free_path(path);
        return ret;
 }
+#endif
 
 static int find_first_block_group(struct btrfs_root *root,
                struct btrfs_path *path, struct btrfs_key *key)
index 1d51dc3..0726a73 100644
@@ -291,16 +291,12 @@ noinline int btrfs_drop_extents(struct btrfs_trans_handle *trans,
 {
        u64 extent_end = 0;
        u64 search_start = start;
-       u64 leaf_start;
        u64 ram_bytes = 0;
-       u64 orig_parent = 0;
        u64 disk_bytenr = 0;
        u64 orig_locked_end = locked_end;
        u8 compression;
        u8 encryption;
        u16 other_encoding = 0;
-       u64 root_gen;
-       u64 root_owner;
        struct extent_buffer *leaf;
        struct btrfs_file_extent_item *extent;
        struct btrfs_path *path;
@@ -340,9 +336,6 @@ next_slot:
                bookend = 0;
                found_extent = 0;
                found_inline = 0;
-               leaf_start = 0;
-               root_gen = 0;
-               root_owner = 0;
                compression = 0;
                encryption = 0;
                extent = NULL;
@@ -417,9 +410,6 @@ next_slot:
                if (found_extent) {
                        read_extent_buffer(leaf, &old, (unsigned long)extent,
                                           sizeof(old));
-                       root_gen = btrfs_header_generation(leaf);
-                       root_owner = btrfs_header_owner(leaf);
-                       leaf_start = leaf->start;
                }
 
                if (end < extent_end && end >= key.offset) {
@@ -443,14 +433,14 @@ next_slot:
                                }
                                locked_end = extent_end;
                        }
-                       orig_parent = path->nodes[0]->start;
                        disk_bytenr = le64_to_cpu(old.disk_bytenr);
                        if (disk_bytenr != 0) {
                                ret = btrfs_inc_extent_ref(trans, root,
                                           disk_bytenr,
-                                          le64_to_cpu(old.disk_num_bytes),
-                                          orig_parent, root->root_key.objectid,
-                                          trans->transid, inode->i_ino);
+                                          le64_to_cpu(old.disk_num_bytes), 0,
+                                          root->root_key.objectid,
+                                          key.objectid, key.offset -
+                                          le64_to_cpu(old.offset));
                                BUG_ON(ret);
                        }
                }
@@ -568,17 +558,6 @@ next_slot:
                        btrfs_mark_buffer_dirty(path->nodes[0]);
                        btrfs_set_lock_blocking(path->nodes[0]);
 
-                       if (disk_bytenr != 0) {
-                               ret = btrfs_update_extent_ref(trans, root,
-                                               disk_bytenr,
-                                               le64_to_cpu(old.disk_num_bytes),
-                                               orig_parent,
-                                               leaf->start,
-                                               root->root_key.objectid,
-                                               trans->transid, ins.objectid);
-
-                               BUG_ON(ret);
-                       }
                        path->leave_spinning = 0;
                        btrfs_release_path(root, path);
                        if (disk_bytenr != 0)
@@ -594,8 +573,9 @@ next_slot:
                                ret = btrfs_free_extent(trans, root,
                                                old_disk_bytenr,
                                                le64_to_cpu(old.disk_num_bytes),
-                                               leaf_start, root_owner,
-                                               root_gen, key.objectid, 0);
+                                               0, root->root_key.objectid,
+                                               key.objectid, key.offset -
+                                               le64_to_cpu(old.offset));
                                BUG_ON(ret);
                                *hint_byte = old_disk_bytenr;
                        }
@@ -664,12 +644,11 @@ int btrfs_mark_extent_written(struct btrfs_trans_handle *trans,
        u64 bytenr;
        u64 num_bytes;
        u64 extent_end;
-       u64 extent_offset;
+       u64 orig_offset;
        u64 other_start;
        u64 other_end;
        u64 split = start;
        u64 locked_end = end;
-       u64 orig_parent;
        int extent_type;
        int split_end = 1;
        int ret;
@@ -703,7 +682,7 @@ again:
 
        bytenr = btrfs_file_extent_disk_bytenr(leaf, fi);
        num_bytes = btrfs_file_extent_disk_num_bytes(leaf, fi);
-       extent_offset = btrfs_file_extent_offset(leaf, fi);
+       orig_offset = key.offset - btrfs_file_extent_offset(leaf, fi);
 
        if (key.offset == start)
                split = end;
@@ -711,8 +690,6 @@ again:
        if (key.offset == start && extent_end == end) {
                int del_nr = 0;
                int del_slot = 0;
-               u64 leaf_owner = btrfs_header_owner(leaf);
-               u64 leaf_gen = btrfs_header_generation(leaf);
                other_start = end;
                other_end = 0;
                if (extent_mergeable(leaf, path->slots[0] + 1, inode->i_ino,
@@ -721,8 +698,8 @@ again:
                        del_slot = path->slots[0] + 1;
                        del_nr++;
                        ret = btrfs_free_extent(trans, root, bytenr, num_bytes,
-                                               leaf->start, leaf_owner,
-                                               leaf_gen, inode->i_ino, 0);
+                                               0, root->root_key.objectid,
+                                               inode->i_ino, orig_offset);
                        BUG_ON(ret);
                }
                other_start = 0;
@@ -733,8 +710,8 @@ again:
                        del_slot = path->slots[0];
                        del_nr++;
                        ret = btrfs_free_extent(trans, root, bytenr, num_bytes,
-                                               leaf->start, leaf_owner,
-                                               leaf_gen, inode->i_ino, 0);
+                                               0, root->root_key.objectid,
+                                               inode->i_ino, orig_offset);
                        BUG_ON(ret);
                }
                split_end = 0;
@@ -768,13 +745,12 @@ again:
                        locked_end = extent_end;
                }
                btrfs_set_file_extent_num_bytes(leaf, fi, split - key.offset);
-               extent_offset += split - key.offset;
        } else  {
                BUG_ON(key.offset != start);
-               btrfs_set_file_extent_offset(leaf, fi, extent_offset +
-                                            split - key.offset);
-               btrfs_set_file_extent_num_bytes(leaf, fi, extent_end - split);
                key.offset = split;
+               btrfs_set_file_extent_offset(leaf, fi, key.offset -
+                                            orig_offset);
+               btrfs_set_file_extent_num_bytes(leaf, fi, extent_end - split);
                btrfs_set_item_key_safe(trans, root, path, &key);
                extent_end = split;
        }
@@ -793,7 +769,8 @@ again:
                                            struct btrfs_file_extent_item);
                        key.offset = split;
                        btrfs_set_item_key_safe(trans, root, path, &key);
-                       btrfs_set_file_extent_offset(leaf, fi, extent_offset);
+                       btrfs_set_file_extent_offset(leaf, fi, key.offset -
+                                                    orig_offset);
                        btrfs_set_file_extent_num_bytes(leaf, fi,
                                                        other_end - split);
                        goto done;
@@ -815,10 +792,9 @@ again:
 
        btrfs_mark_buffer_dirty(leaf);
 
-       orig_parent = leaf->start;
-       ret = btrfs_inc_extent_ref(trans, root, bytenr, num_bytes,
-                                  orig_parent, root->root_key.objectid,
-                                  trans->transid, inode->i_ino);
+       ret = btrfs_inc_extent_ref(trans, root, bytenr, num_bytes, 0,
+                                  root->root_key.objectid,
+                                  inode->i_ino, orig_offset);
        BUG_ON(ret);
        btrfs_release_path(root, path);
 
@@ -833,20 +809,12 @@ again:
        btrfs_set_file_extent_type(leaf, fi, extent_type);
        btrfs_set_file_extent_disk_bytenr(leaf, fi, bytenr);
        btrfs_set_file_extent_disk_num_bytes(leaf, fi, num_bytes);
-       btrfs_set_file_extent_offset(leaf, fi, extent_offset);
+       btrfs_set_file_extent_offset(leaf, fi, key.offset - orig_offset);
        btrfs_set_file_extent_num_bytes(leaf, fi, extent_end - key.offset);
        btrfs_set_file_extent_ram_bytes(leaf, fi, num_bytes);
        btrfs_set_file_extent_compression(leaf, fi, 0);
        btrfs_set_file_extent_encryption(leaf, fi, 0);
        btrfs_set_file_extent_other_encoding(leaf, fi, 0);
-
-       if (orig_parent != leaf->start) {
-               ret = btrfs_update_extent_ref(trans, root, bytenr, num_bytes,
-                                             orig_parent, leaf->start,
-                                             root->root_key.objectid,
-                                             trans->transid, inode->i_ino);
-               BUG_ON(ret);
-       }
 done:
        btrfs_mark_buffer_dirty(leaf);
 
index 1c8b019..917bf10 100644
@@ -48,7 +48,6 @@
 #include "ordered-data.h"
 #include "xattr.h"
 #include "tree-log.h"
-#include "ref-cache.h"
 #include "compression.h"
 #include "locking.h"
 
@@ -944,6 +943,7 @@ static noinline int run_delalloc_nocow(struct inode *inode,
        u64 cow_start;
        u64 cur_offset;
        u64 extent_end;
+       u64 extent_offset;
        u64 disk_bytenr;
        u64 num_bytes;
        int extent_type;
@@ -1005,6 +1005,7 @@ next_slot:
                if (extent_type == BTRFS_FILE_EXTENT_REG ||
                    extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
                        disk_bytenr = btrfs_file_extent_disk_bytenr(leaf, fi);
+                       extent_offset = btrfs_file_extent_offset(leaf, fi);
                        extent_end = found_key.offset +
                                btrfs_file_extent_num_bytes(leaf, fi);
                        if (extent_end <= start) {
@@ -1022,9 +1023,10 @@ next_slot:
                        if (btrfs_extent_readonly(root, disk_bytenr))
                                goto out_check;
                        if (btrfs_cross_ref_exist(trans, root, inode->i_ino,
-                                                 disk_bytenr))
+                                                 found_key.offset -
+                                                 extent_offset, disk_bytenr))
                                goto out_check;
-                       disk_bytenr += btrfs_file_extent_offset(leaf, fi);
+                       disk_bytenr += extent_offset;
                        disk_bytenr += cur_offset - found_key.offset;
                        num_bytes = min(end + 1, extent_end) - cur_offset;
                        /*
@@ -1489,9 +1491,9 @@ static int insert_reserved_file_extent(struct btrfs_trans_handle *trans,
        ins.objectid = disk_bytenr;
        ins.offset = disk_num_bytes;
        ins.type = BTRFS_EXTENT_ITEM_KEY;
-       ret = btrfs_alloc_reserved_extent(trans, root, leaf->start,
-                                         root->root_key.objectid,
-                                         trans->transid, inode->i_ino, &ins);
+       ret = btrfs_alloc_reserved_file_extent(trans, root,
+                                       root->root_key.objectid,
+                                       inode->i_ino, file_pos, &ins);
        BUG_ON(ret);
        btrfs_free_path(path);