Aneesh Kumar K.V [Mon, 28 Feb 2011 11:34:07 +0000]
fs/9p: mark inode attribute invalid on rename, unlink and setattr

rename, unlink and setattr can update inode attributes,
so mark the cached copy invalid.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:34:06 +0000]
fs/9p: Add support for marking inode attribute invalid

In cached mode some file system operations update
inode attributes (ctime). Add support for marking the
inode attributes invalid in such cases so that we fetch
the updated attributes on dentry revalidation.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:34:06 +0000]
fs/9p: Initialize root inode number for dotl

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:34:05 +0000]
fs/9p: Update link count correctly on different file system operations

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:34:05 +0000]
fs/9p: Add drop_inode 9p callback

We want to immediately drop the inode in non cached mode

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:34:04 +0000]
fs/9p: Add direct IO support in cached mode

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:34:04 +0000]
fs/9p: Fix inode i_size update in file_write

Only update inode i_size when we write towards end of file.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
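
The rule described in the message above (grow i_size only when a write
ends past the current end of file) can be sketched as follows. This is a
minimal illustration, not the v9fs code: struct demo_inode and
demo_update_size() are hypothetical names introduced here.

```c
#include <stdint.h>

/* Hypothetical stand-in for the cached inode size; not the kernel's struct inode. */
struct demo_inode {
	int64_t i_size;
};

/*
 * Assumed helper: after writing `written` bytes at `offset`, extend the
 * cached size only if the write ended beyond the current end of file.
 * Writes that land entirely inside the file leave i_size untouched.
 */
static void demo_update_size(struct demo_inode *inode, int64_t offset,
			     int64_t written)
{
	int64_t end = offset + written;

	if (end > inode->i_size)
		inode->i_size = end;
}
```

An overwrite in the middle of a 100-byte file keeps the size at 100,
while a 20-byte write at offset 90 grows it to 110.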

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:34:03 +0000]
fs/9p: set default readahead pages in cached mode

We want to enable readahead in cached mode

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:34:03 +0000]
fs/9p: Move writeback fid to v9fs_inode

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:34:02 +0000]
fs/9p: Add v9fs_inode

Switch the fscache code to v9fs_inode. We will later use
v9fs_inode in cache=loose mode to track the inode cache
validity timeout, i.e. if we find an inode in cache older
than a specific jiffies range we will consider it stale.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
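
The staleness check that the message above says is planned for later
could look roughly like the sketch below. It is an assumption-laden
illustration: demo_ticks_t, demo_ticks_after() and demo_inode_is_stale()
are hypothetical names, and the comparison only mirrors the
wraparound-safe idea behind the kernel's time_after() macro.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit tick counter standing in for jiffies. */
typedef uint32_t demo_ticks_t;

/*
 * True if tick value `a` is later than `b`, even if the counter wrapped
 * in between. Relies on two's complement conversion to int32_t.
 */
static bool demo_ticks_after(demo_ticks_t a, demo_ticks_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Assumed policy: an inode cached at `cached_at` is stale once `timeout` ticks have passed. */
static bool demo_inode_is_stale(demo_ticks_t cached_at, demo_ticks_t now,
				demo_ticks_t timeout)
{
	return demo_ticks_after(now, cached_at + timeout);
}
```

The subtraction-based comparison is used instead of a plain `now >
deadline` so that the check keeps working when the tick counter wraps.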

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:34:01 +0000]
fs/9p: Don't set stat.st_blocks based on nrpages

simple_getattr sets stat.st_blocks to a value
derived from nrpages. That is not correct for 9p.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:34:01 +0000]
fs/9p: Add inode hashing

We didn't add the inode to inode hash in 9p. We need to do that
to get sync to work, otherwise __mark_inode_dirty will not
add the inode to super block's dirty list.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:34:00 +0000]
fs/9p: We need not writeback dirty pages during close

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:34:00 +0000]
fs/9p: Implement syncfs call back for 9Pfs

FIXME!! what about dotu ?

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:33:59 +0000]
net/9p: Implement syncfs 9P operation

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:33:59 +0000]
fs/9p: Mark file system with MS_SYNCHRONOUS only if it is not cached mode

We should not mark the file system synchronous if it is mounted with a cache=* option.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:33:58 +0000]
fs/9p: Clarify cached dentry delete operation

Update the comment to indicate that we don't want to cache
negative dentries.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:33:58 +0000]
fs/9p: Add buffered write support for v9fs.

We can now support writeable mmaps.
Based on the original patch from Badari Pulavarty <pbadari@us.ibm.com>

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:33:57 +0000]
fs/9p: Add fid to inode in cached mode

The fid attached to the inode is opened in O_RDWR mode and is used
for dirty page writeback only.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:33:56 +0000]
fs/9p: Add read write helper function

Add read/write helper functions here which will
be used later by the mmap patch.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:33:56 +0000]
fs/9p: [fscache] wait for page write in cached mode

We need to call fscache_wait_on_page_write in launder_page
for fscache

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:33:55 +0000]
fs/9p: increment inode->i_count in cached mode.

We need to ihold even in cached mode

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:33:55 +0000]
fs/9p: set fs cache cookie in create path also

We need to call v9fs_cache_inode_set_cookie in create
path also

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Aneesh Kumar K.V [Mon, 28 Feb 2011 11:33:54 +0000]
fs/9p: set the cached file_operations struct during inode init

With the old code we were not setting the file->f_op
with cached file operations during creat.

(format correction by jvrao@linux.vnet.ibm.com)

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Venkateswararao Jujjuri (JV) [Wed, 16 Feb 2011 20:54:22 +0000]
[net/9p] Small non-IO PDUs for zero-copy supporting transports.

If a transport prefers the payload to be sent separately from the PDU
(P9_TRANS_PREF_PAYLOAD_SEP), there is no need to allocate msize-sized
PDU buffers (struct p9_fcall).

This patch allocates buffers of at most 4k for such transports;
legacy transports are unchanged.

Hence, on top of the zero copy changes, this patch lets the user
specify a higher msize through the mount option
without hogging the kernel heap.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
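
The sizing decision described above can be sketched as below. This is
an illustration under stated assumptions, not the net/9p code:
DEMO_PREF_PAYLOAD_SEP, DEMO_SMALL_PDU_SIZE and demo_pdu_alloc_size() are
hypothetical names, and the 4096-byte cap is taken from the "4k" figure
in the message.

```c
#include <stddef.h>

/* Hypothetical flag standing in for P9_TRANS_PREF_PAYLOAD_SEP. */
#define DEMO_PREF_PAYLOAD_SEP 0x1
/* Assumed cap for PDUs that carry no inline payload (the "4k" in the message). */
#define DEMO_SMALL_PDU_SIZE 4096

/*
 * Pick the PDU buffer size: transports that take the payload separately
 * only need room for the header and small fields, so cap the buffer;
 * legacy transports keep the full negotiated msize.
 */
static size_t demo_pdu_alloc_size(int trans_pref, size_t msize)
{
	if ((trans_pref & DEMO_PREF_PAYLOAD_SEP) && msize > DEMO_SMALL_PDU_SIZE)
		return DEMO_SMALL_PDU_SIZE;
	return msize;
}
```

With a 64k msize, a payload-separating transport would allocate 4096
bytes per PDU in this sketch, while a legacy transport would still
allocate the full 65536.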

Venkateswararao Jujjuri (JV) [Wed, 2 Feb 2011 04:04:59 +0000]
[net/9p] Handle Zero Copy TREAD/RERROR case in !dotl case.

This takes care of copying out error buffers from user buffer
payloads when we are using zero copy.  This happens because the
only payload buffer the server has to respond to the request is
the user buffer given for the zero copy read.

Because we only use zerocopy when the amount of data to transfer
is greater than a certain size (currently 4K) and error strings are
limited to ERRMAX (currently 128) we don't need to worry about there
being sufficient space for the error to fit in the payload.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Venkateswararao Jujjuri (JV) [Thu, 17 Feb 2011 02:43:20 +0000]
[net/9p] readdir zerocopy changes for 9P2000.L protocol.

Modify p9_client_readdir() to check the transport preference and act accordingly.
If the preference is P9_TRANS_PREF_PAYLOAD_SEP, send the payload
separately instead of putting it directly on PDU.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Venkateswararao Jujjuri (JV) [Mon, 14 Feb 2011 00:23:59 +0000]
[net/9p] Write side zerocopy changes for 9P2000.L protocol.

Modify p9_client_write() to check the transport preference and act accordingly.
If the preference is P9_TRANS_PREF_PAYLOAD_SEP, send the payload
separately instead of putting it directly on PDU.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Venkateswararao Jujjuri (JV) [Sat, 29 Jan 2011 01:05:59 +0000]
[net/9p] Read side zerocopy changes for 9P2000.L protocol.

Modify p9_client_read() to check the transport preference and act accordingly.
If the preference is P9_TRANS_PREF_PAYLOAD_SEP, send the payload
separately instead of putting it directly on PDU.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Venkateswararao Jujjuri (JV) [Sun, 6 Feb 2011 20:08:01 +0000]
[net/9p] Add preferences to transport layer.

This patch adds a preferences field to p9_trans_module.
Through it, the transport layer can express its preference about the
payload, i.e. whether the payload needs to be part of the PDU or should
be sent separately so that the transport layer can handle it in
a better way.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Venkateswararao Jujjuri (JV) [Fri, 28 Jan 2011 23:22:36 +0000]
[net/9p] Add gup/zero_copy support to VirtIO transport layer.

Modify p9_virtio_request() and req_done() functions to support
additional payload sent down to the transport layer through
tc->pubuf and tc->pkbuf.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Venkateswararao Jujjuri (JV) [Thu, 3 Feb 2011 01:52:46 +0000]
[net/9p] Assign type of transaction to tc->pdu->id which is otherwise unused.

This will be used by the transport layer to determine the outgoing
request type. Transport layer uses this information to correctly
place the mapped pages in the PDU. Patches following this will make
use of this to achieve zero copy.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Venkateswararao Jujjuri (JV) [Fri, 28 Jan 2011 22:11:13 +0000]
[net/9p] Preparation and helper functions for zero copy

This patch prepares the p9_fcall structure for zero copy. The added
fields pass the payload buffer information to the transport layer.
In addition it adds a 'private' field for the transport layer to
store mapped/pinned page information so that it can be freed/unpinned
during req_done.

This patch also creates trans_common.[ch] to house helper functions.
It adds the following helper functions.

p9_release_req_pages - Release pages after the transaction.
p9_nr_pages - Return number of pages needed to accommodate the payload.
payload_gup - Translates user buffer into kernel pages.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
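
The page-count helper listed above answers "how many pages does this
payload touch". A minimal sketch of that arithmetic follows; it is not
the trans_common.c implementation. demo_nr_pages() and DEMO_PAGE_SIZE
are hypothetical, and a 4096-byte page is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for the illustration. */
#define DEMO_PAGE_SIZE 4096u

/*
 * Number of pages spanned by the byte range [start, start + len).
 * An unaligned start or end makes the buffer touch an extra page,
 * which is why len / DEMO_PAGE_SIZE alone would undercount.
 */
static size_t demo_nr_pages(uintptr_t start, size_t len)
{
	uintptr_t first, last;

	if (len == 0)
		return 0;
	first = start / DEMO_PAGE_SIZE;
	last = (start + len - 1) / DEMO_PAGE_SIZE;
	return (size_t)(last - first + 1);
}
```

For instance, a 2-byte buffer starting at offset 4095 straddles a page
boundary and needs two pages, while a page-aligned 4096-byte buffer
needs one.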

Venkateswararao Jujjuri (JV) [Thu, 27 Jan 2011 00:20:35 +0000]
[fs/9p] Make access=client default in 9p2000.L protocol

Current code sets access=user as default for all protocol versions.
This patch changes it to "client" only for dotl.

User can always specify particular access mode with -o access= option.
No change there.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Venkateswararao Jujjuri (JV) [Tue, 25 Jan 2011 23:40:54 +0000]
[fs/9P] Add posixacl mount option

The mount option access=client is overloaded as it assumes acl too.
Adding posixacl option to enable POSIX ACLs makes it explicit and clear.
Also it is convenient in the future to add other types of acls like richacls.

Ideally, the access mode 'client' should be just like V9FS_ACCESS_USER
except it underscores the location of access check.
Traditional 9P protocol lets the server perform access checks but with
this mode, all the access checks will be performed on the client itself.
Server just follows the client's directive.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Venkateswararao Jujjuri (JV) [Fri, 14 Jan 2011 23:24:59 +0000]
[fs/9p] Ignore acl mount option when CONFIG_9P_FS_POSIX_ACL is not defined.

If the kernel is not compiled with CONFIG_9P_FS_POSIX_ACL and a
mount option is specified to enable ACLs, the current code fails the mount.
This patch brings the behavior in line with other filesystems like ext3
by proceeding with the mount and logging a warning to syslog.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Venkateswararao Jujjuri (JV) [Fri, 14 Jan 2011 00:33:00 +0000]
[fs/9p] Initialize cached acls both in cached/uncached mode.

With create/mkdir/mknod in non-cached mode we initialize the inode using
v9fs_get_inode. v9fs_get_inode doesn't initialize the cache inode value
to NULL. This trips the BUG_ON in v9fs_get_cached_acl.
The fix is to initialize the acls to NULL rather than leave them in the
ACL_NOT_CACHED state.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Venkateswararao Jujjuri (JV) [Thu, 13 Jan 2011 23:28:39 +0000]
[fs/9p] Plug potential acl leak

In v9fs_get_acl() if __v9fs_get_acl() gets only one of the
dacl/pacl we are not releasing it.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Linus Torvalds [Tue, 15 Mar 2011 01:20:32 +0000]
Linux 2.6.38

Linus Torvalds [Mon, 14 Mar 2011 22:20:39 +0000]
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-mn10300

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-mn10300:
  MN10300: atomic_read() should ensure it emits a load
  MN10300: The SMP_ICACHE_INV_FLUSH_RANGE IPI command does not exist
  MN10300: Proper use of macros get_user() in the case of incremented pointers

Linus Torvalds [Mon, 14 Mar 2011 22:20:12 +0000]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus

* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (26 commits)
  MIPS: Alchemy: Fix reset for MTX-1 and XXS1500
  MIPS: MTX-1: Make au1000_eth probe all PHY addresses
  MIPS: Jz4740: Add HAVE_CLK
  MIPS: Move idle task creation to work queue
  MIPS, Perf-events: Use unsigned delta for right shift in event update
  MIPS, Perf-events: Work with the new callchain interface
  MIPS, Perf-events: Fix event check in validate_event()
  MIPS, Perf-events: Work with the new PMU interface
  MIPS, Perf-events: Work with irq_work
  MIPS: Fix always CONFIG_LOONGSON_UART_BASE=y
  MIPS: Loongson: Fix potentially wrong string handling
  MIPS: Fix GCC-4.6 'set but not used' warning in arch/mips/mm/init.c
  MIPS: Fix GCC-4.6 'set but not used' warning in ieee754int.h
  MIPS: Remove unused code from arch/mips/kernel/syscall.c
  MIPS: Fix GCC-4.6 'set but not used' warning in signal*.c
  MIPS: MSP: Fix MSP71xx bpci interrupt handler return value
  MIPS: Select R4K timer lib for all MSP platforms
  MIPS: Loongson: Remove ad-hoc cmdline default
  MIPS: Clear the correct flag in sysmips(MIPS_FIXADE, ...).
  MIPS: Add an unreachable return statement to satisfy buggy GCCs.
  ...

Linus Torvalds [Mon, 14 Mar 2011 22:19:09 +0000]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: ce4100: Set pci ops via callback instead of module init
  x86/mm: Fix pgd_lock deadlock
  x86/mm: Handle mm_fault_error() in kernel space
  x86: Don't check for BIOS corruption in first 64K when there's no need to

Linus Torvalds [Mon, 14 Mar 2011 22:17:07 +0000]
Revert "oom: oom_kill_process: fix the child_points logic"

This reverts the parent commit.  I hate doing that, but it's generating
some discussion ("half of it is right"), and since I am planning on
doing the 2.6.38 release later today we can punt it to stable if
required. Let's not rock the boat right now.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Oleg Nesterov [Mon, 14 Mar 2011 19:05:30 +0000]
oom: oom_kill_process: fix the child_points logic

oom_kill_process() starts with victim_points == 0.  This means that
(most likely) any child has more points and can be killed erroneously.

Also, "children has a different mm" doesn't match the reality, we should
check child->mm != t->mm.  This check is not exactly correct if t->mm ==
NULL but this doesn't really matter, oom_kill_task() will kill them
anyway.

Note: "Kill all processes sharing p->mm" in oom_kill_task() is wrong
too.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Florian Fainelli [Mon, 21 Feb 2011 13:28:02 +0000]
MIPS: Alchemy: Fix reset for MTX-1 and XXS1500

Since commit 32fd6901 (MIPS: Alchemy: get rid of common/reset.c)
Alchemy-based boards use their own reset function. For MTX-1 and XXS1500,
the reset function pokes at the BCSR.SYSTEM_RESET register, but this does
not work. According to Bruno Randolf, this was not tested when written.

Previously, the generic au1000_restart() routine called the board specific
reset function, which for MTX-1 and XXS1500 did not work, but finally made
a jump to the reset vector, which really triggers a system restart. Fix
reboot for both targets by jumping to the reset vector.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2093/
Acked-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

Florian Fainelli [Sun, 27 Feb 2011 18:53:53 +0000]
MIPS: MTX-1: Make au1000_eth probe all PHY addresses

When au1000_eth probes the MII bus for PHY address, if we do not set
au1000_eth platform data's phy_search_highest_address, the MII probing
logic will exit early and will assume a valid PHY is found at address 0.
For MTX-1, the PHY is at address 31, and without this patch, the link
detection/speed/duplex would not work correctly.

CC: stable@kernel.org
Signed-off-by: Florian Fainelli <florian@openwrt.org>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2111/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

Maurus Cuelenaere [Mon, 28 Feb 2011 23:20:01 +0000]
MIPS: Jz4740: Add HAVE_CLK

Jz4740 supports the clock framework but doesn't have HAVE_CLK defined,
so define it!

Signed-off-by: Maurus Cuelenaere <mcuelenaere@gmail.com>
To: linux-mips@linux-mips.org
To: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/2112/
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

Maksim Rayskiy [Sat, 12 Feb 2011 18:21:32 +0000]
MIPS: Move idle task creation to work queue

To avoid forking usermode thread when creating an idle task, move fork_idle
to a work queue.

If kernel starts with maxcpus= option which does not bring all available
cpus online at boot time, idle tasks for offline cpus are not created. If
later offline cpus are hotplugged through sysfs, __cpu_up is called in
the context of the user task, and fork_idle copies its non-zero mm
pointer.  This causes BUG() in per_cpu_trap_init.

This also avoids issues with resource limits of the CPU writing to sysfs,
containers, maybe others.

Signed-off-by: Maksim Rayskiy <mrayskiy@broadcom.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2070/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

Deng-Cheng Zhu [Fri, 21 Jan 2011 08:19:21 +0000]
MIPS, Perf-events: Use unsigned delta for right shift in event update

Leverage the commit for ARM by Will Deacon:

446a5a8b1eb91a6990e5c8fe29f14e7a95b69132
    ARM: 6205/1: perf: ensure counter delta is treated as unsigned

    Hardware performance counters on ARM are 32-bits wide but atomic64_t
    variables are used to represent counter data in the hw_perf_event structure.

    The armpmu_event_update function right-shifts a signed 64-bit delta variable
    and adds the result to the event count. This can lead to shifting in sign-bits
    if the MSB of the 32-bit counter value is set. This results in perf output
    such as:

     Performance counter stats for 'sleep 20':

     18446744073460670464  cycles             <-- 0xFFFFFFFFF12A6000
            7783773  instructions             #      0.000 IPC
                465  context-switches
                161  page-faults
            1172393  branches

       20.154242147  seconds time elapsed

    This patch ensures that the delta value is treated as unsigned so that the
    right shift sets the upper bits to zero.

Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: a.p.zijlstra@chello.nl
To: fweisbec@gmail.com
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: wuzhangjin@gmail.com
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Patchwork: http://patchwork.linux-mips.org/patch/2015/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
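
The sign-extension problem quoted above can be reproduced outside the
kernel. The sketch below is only an illustration: the demo_* names are
hypothetical, a 32-bit hardware counter held in a 64-bit variable is
assumed, and the shift-up/shift-down pattern is a simplification of what
an event update routine does.

```c
#include <stdint.h>

/* Assumed: a 32-bit hardware counter tracked in 64-bit variables. */
#define DEMO_COUNTER_BITS 32
#define DEMO_SHIFT (64 - DEMO_COUNTER_BITS)

/*
 * Buggy variant: the delta is signed, so the right shift may copy the
 * sign bit into the upper 32 bits. Right-shifting a negative value is
 * implementation-defined in C; common compilers shift arithmetically.
 */
static uint64_t demo_delta_signed(uint64_t prev, uint64_t now)
{
	int64_t delta = (int64_t)((now << DEMO_SHIFT) - (prev << DEMO_SHIFT));

	delta >>= DEMO_SHIFT;
	return (uint64_t)delta;
}

/* Fixed variant: an unsigned delta, so the shift fills the upper bits with zeros. */
static uint64_t demo_delta_unsigned(uint64_t prev, uint64_t now)
{
	uint64_t delta = (now << DEMO_SHIFT) - (prev << DEMO_SHIFT);

	delta >>= DEMO_SHIFT;
	return delta;
}
```

With prev = 0 and now = 0xF12A6000, the unsigned variant returns
0xF12A6000. On compilers that shift signed values arithmetically, the
signed variant would instead produce 0xFFFFFFFFF12A6000, the value
shown in the perf output above.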

Deng-Cheng Zhu [Fri, 21 Jan 2011 08:19:20 +0000]
MIPS, Perf-events: Work with the new callchain interface

This is the MIPS part of the following commits by Frederic Weisbecker:

f72c1a931e311bb7780fee19e41a89ac42cab50e
    perf: Factorize callchain context handling

    Store the kernel and user contexts from the generic layer instead
    of archs, this gathers some repetitive code.

56962b4449af34070bb1994621ef4f0265eed4d8
    perf: Generalize some arch callchain code

    - Most archs use one callchain buffer per cpu, except x86 that needs
      to deal with NMIs. Provide a default perf_callchain_buffer()
      implementation that x86 overrides.

    - Centralize all the kernel/user regs handling and invoke new arch
      handlers from there: perf_callchain_user() / perf_callchain_kernel()
      That avoid all the user_mode(), current->mm checks and so...

    - Invert some parameters in perf_callchain_*() helpers: entry to the
      left, regs to the right, following the traditional (dst, src).

70791ce9ba68a5921c9905ef05d23f62a90bc10c
    perf: Generalize callchain_store()

    callchain_store() is the same on every archs, inline it in
    perf_event.h and rename it to perf_callchain_store() to avoid
    any collision.

    This removes repetitive code.

c1a65932fd7216fdc9a0db8bbffe1d47842f862c
    perf: Drop unappropriate tests on arch callchains

    Drop the TASK_RUNNING test on user tasks for callchains as
    this check doesn't seem to make any sense.

    Also remove the tests for !current that is not supposed to
    happen and current->pid as this should be handled at the
    generic level, with exclude_idle attribute.

Reported-by: Wu Zhangjin <wuzhangjin@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: a.p.zijlstra@chello.nl
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: dengcheng.zhu@gmail.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Patchwork: http://patchwork.linux-mips.org/patch/2014/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

Deng-Cheng Zhu [Fri, 21 Jan 2011 08:19:19 +0000]
MIPS, Perf-events: Fix event check in validate_event()

Ignore events that are in off/error state or belong to a different PMU.

This patch originates from the following commit for ARM by Will Deacon:

65b4711ff513767341aa1915c822de6ec0de65cb
    ARM: 6352/1: perf: fix event validation

    The validate_event function in the ARM perf events backend has the
    following problems:

    1.) Events that are disabled count towards the cost.
    2.) Events associated with other PMUs [for example, software events or
        breakpoints] do not count towards the cost, but do fail validation,
        causing the group to fail.

    This patch changes validate_event so that it ignores events in the
    PERF_EVENT_STATE_OFF state or that are scheduled for other PMUs.

Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: a.p.zijlstra@chello.nl
To: fweisbec@gmail.com
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: wuzhangjin@gmail.com
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: dengcheng.zhu@gmail.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Cc: ddaney@caviumnetworks.com
Patchwork: http://patchwork.linux-mips.org/patch/2013/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
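
The validation rule described above (skip events that are off, in error,
or owned by another PMU when counting what a group needs) can be
sketched as follows. Everything here is hypothetical scaffolding for
illustration: struct demo_event, the DEMO_STATE_* values and both
functions are assumptions, not the kernel's perf structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event states, ordered so that "off or worse" is <= DEMO_STATE_OFF. */
enum demo_event_state {
	DEMO_STATE_ERROR = -2,
	DEMO_STATE_OFF = -1,
	DEMO_STATE_INACTIVE = 0,
	DEMO_STATE_ACTIVE = 1
};

/* Hypothetical event record: which PMU owns it and its current state. */
struct demo_event {
	int pmu_id;
	enum demo_event_state state;
};

/* An event consumes a hardware counter only if it belongs to this PMU and is not off/error. */
static bool demo_event_needs_counter(const struct demo_event *ev, int pmu_id)
{
	if (ev->pmu_id != pmu_id)
		return false;
	if (ev->state <= DEMO_STATE_OFF)
		return false;
	return true;
}

/* A group is valid if the events that need a counter fit in the available counters. */
static bool demo_validate_group(const struct demo_event *events, size_t n,
				int pmu_id, size_t num_counters)
{
	size_t needed = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (demo_event_needs_counter(&events[i], pmu_id))
			needed++;
	return needed <= num_counters;
}
```

Ignored events neither consume a counter nor fail the group, which
covers both problems listed in the quoted ARM commit.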

Deng-Cheng Zhu [Fri, 21 Jan 2011 08:19:18 +0000]
MIPS, Perf-events: Work with the new PMU interface

This is the MIPS part of the following commits by Peter Zijlstra:

a4eaf7f14675cb512d69f0c928055e73d0c6d252
    perf: Rework the PMU methods

    Replace pmu::{enable,disable,start,stop,unthrottle} with
    pmu::{add,del,start,stop}, all of which take a flags argument.

    The new interface extends the capability to stop a counter while
    keeping it scheduled on the PMU. We replace the throttled state with
    the generic stopped state.

    This also allows us to efficiently stop/start counters over certain
    code paths (like IRQ handlers).

    It also allows scheduling a counter without it starting, allowing for
    a generic frozen state (useful for rotating stopped counters).

    The stopped state is implemented in two different ways, depending on
    how the architecture implemented the throttled state:

     1) We disable the counter:
        a) the pmu has per-counter enable bits, we flip that
        b) we program a NOP event, preserving the counter state

     2) We store the counter state and ignore all read/overflow events

For MIPSXX, the stopped state is implemented as described in 1.b above.

33696fc0d141bbbcb12f75b69608ea83282e3117
    perf: Per PMU disable

    Changes perf_disable() into perf_pmu_disable().

24cd7f54a0d47e1d5b3de29e2456bfbd2d8447b7
    perf: Reduce perf_disable() usage

    Since the current perf_disable() usage is only an optimization,
    remove it for now. This eases the removal of the __weak
    hw_perf_enable() interface.

b0a873ebbf87bf38bf70b5e39a7cadc96099fa13
    perf: Register PMU implementations

    Simple registration interface for struct pmu, this provides the
    infrastructure for removing all the weak functions.

51b0fe39549a04858001922919ab355dee9bdfcf
    perf: Deconstify struct pmu

    sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"`

Reported-by: Wu Zhangjin <wuzhangjin@gmail.com>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: a.p.zijlstra@chello.nl
To: fweisbec@gmail.com
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: wuzhangjin@gmail.com
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: dengcheng.zhu@gmail.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Cc: ddaney@caviumnetworks.com
Patchwork: http://patchwork.linux-mips.org/patch/2012/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

8 years agoMIPS, Perf-events: Work with irq_work
Deng-Cheng Zhu [Fri, 21 Jan 2011 08:19:17 +0000]
MIPS, Perf-events: Work with irq_work

This is the MIPS part of the following commit by Peter Zijlstra:

e360adbe29241a0194e10e20595360dd7b98a2b3
    irq_work: Add generic hardirq context callbacks

    Provide a mechanism that allows running code in IRQ context. It is
    most useful for NMI code that needs to interact with the rest of the
    system -- like wakeup a task to drain buffers.

    Perf currently has such a mechanism, so extract that and provide it as
    a generic feature, independent of perf so that others may also
    benefit.

    The IRQ context callback is generated through self-IPIs where
    possible, or on architectures like powerpc the decrementer (the
    built-in timer facility) is set to generate an interrupt immediately.

    Architectures that don't have anything like this get to do with a
    callback from the timer tick. These architectures can call
    irq_work_run() at the tail of any IRQ handlers that might enqueue such
    work (like the perf IRQ handler) to avoid undue latencies in
    processing the work.

For MIPSXX, we need to call irq_work_run() at the tail of the perf IRQ
handler as described above.
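
The pattern of draining queued work at the tail of an IRQ handler can be
sketched in user-space C as below. All sketch_ names are hypothetical and
the fixed-size array is an assumption made for brevity; the real irq_work
code uses its own per-CPU list.

```c
#include <stddef.h>

#define SKETCH_MAX_WORK 8

typedef void (*sketch_work_fn)(void);

static sketch_work_fn sketch_pending[SKETCH_MAX_WORK];
static size_t sketch_npending;
static int sketch_wakeups;

/* Queue a callback to be run later in (simulated) IRQ context. */
static int sketch_irq_work_queue(sketch_work_fn fn)
{
	if (sketch_npending == SKETCH_MAX_WORK)
		return -1;
	sketch_pending[sketch_npending++] = fn;
	return 0;
}

/* Run and clear everything that has been queued so far. */
static void sketch_irq_work_run(void)
{
	size_t i;

	for (i = 0; i < sketch_npending; i++)
		sketch_pending[i]();
	sketch_npending = 0;
}

/* Example callback: stands in for waking a task to drain buffers. */
static void sketch_wakeup(void)
{
	sketch_wakeups++;
}

static void sketch_perf_irq_handler(void)
{
	/* Counter overflow handling would go here; it may queue work. */
	sketch_irq_work_queue(sketch_wakeup);

	/* Drain at the tail so the work does not wait for the next tick. */
	sketch_irq_work_run();
}
```

The point of the tail call is latency: without it, the queued callback
would only run from the next timer tick.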

Reported-by: Wu Zhangjin <wuzhangjin@gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: fweisbec@gmail.com
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Patchwork: http://patchwork.linux-mips.org/patch/2011/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

8 years agoMIPS: Fix always CONFIG_LOONGSON_UART_BASE=y
Yoichi Yuasa [Mon, 7 Feb 2011 02:31:36 +0000]
MIPS: Fix always CONFIG_LOONGSON_UART_BASE=y

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: linux-mips <linux-mips@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/2055/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

8 years agoMIPS: Loongson: Fix potentially wrong string handling
Stefan Weil [Sun, 30 Jan 2011 20:41:44 +0000]
MIPS: Loongson: Fix potentially wrong string handling

This error was reported by cppcheck:
arch/mips/loongson/common/machtype.c:56: error: Dangerous usage of 'str' (strncpy doesn't always 0-terminate it)

If strncpy copied MACHTYPE_LEN bytes, the destination string str
was not terminated.

The patch adds one more byte to str and makes sure that this byte is
always 0.
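
A minimal sketch of that approach, assuming a hypothetical length constant
SKETCH_MACHTYPE_LEN and helper sketch_copy_machtype() (neither is taken
from machtype.c):

```c
#include <string.h>

#define SKETCH_MACHTYPE_LEN 8	/* hypothetical length, for illustration */

/*
 * dst has one byte more than strncpy() is allowed to write, and that
 * extra byte is always set to 0, so dst is terminated even when src is
 * SKETCH_MACHTYPE_LEN bytes or longer.
 */
static void sketch_copy_machtype(char dst[SKETCH_MACHTYPE_LEN + 1],
				 const char *src)
{
	strncpy(dst, src, SKETCH_MACHTYPE_LEN);
	dst[SKETCH_MACHTYPE_LEN] = '\0';
}
```

strncpy() only writes a terminator when src is shorter than the given
length, which is why the extra byte is needed.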

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Cc: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: Arnaud Patard <apatard@mandriva.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/2053/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

8 years agoMIPS: Fix GCC-4.6 'set but not used' warning in arch/mips/mm/init.c
David Daney [Mon, 24 Jan 2011 22:51:37 +0000]
MIPS: Fix GCC-4.6 'set but not used' warning in arch/mips/mm/init.c

Under some combinations of CONFIG_*, lastpfn in page_is_ram is 'set
but not used'.  Mark it as __maybe_unused to quiet the warning/error.
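
As an illustration only, the effect of the annotation can be reproduced
outside the kernel with the GCC/Clang attribute that __maybe_unused
expands to. The function below and the SKETCH_CHECK_UPPER switch are
hypothetical and are not the code from arch/mips/mm/init.c:

```c
/* Assumes GCC or Clang; mirrors what the kernel's __maybe_unused does. */
#define sketch_maybe_unused __attribute__((__unused__))

static int sketch_page_is_ram(unsigned long pfn, unsigned long firstpfn,
			      unsigned long npages)
{
	/*
	 * lastpfn is only read when SKETCH_CHECK_UPPER is defined. In the
	 * other configuration it is "set but not used", which newer GCC
	 * versions report unless the variable carries the attribute.
	 */
	unsigned long lastpfn sketch_maybe_unused;

	lastpfn = firstpfn + npages;
#ifdef SKETCH_CHECK_UPPER
	if (pfn >= lastpfn)
		return 0;
#endif
	return pfn >= firstpfn;
}
```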

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2033/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

8 years agoMIPS: Fix GCC-4.6 'set but not used' warning in ieee754int.h
David Daney [Mon, 24 Jan 2011 22:51:36 +0000]
MIPS: Fix GCC-4.6 'set but not used' warning in ieee754int.h

GCC-4.6 can find more unused code than previous versions could.

In the case of arch/mips/math-emu/ieee754int.h, the COMPXSP and
COMPXDP macros are used in several places, but a couple of them leave
xs unused.  The easiest thing to do is mark it as __maybe_unused to
quiet the warning.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2032/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

8 years agoMIPS: Remove unused code from arch/mips/kernel/syscall.c
David Daney [Mon, 24 Jan 2011 22:51:35 +0000]
MIPS: Remove unused code from arch/mips/kernel/syscall.c

The variable arg3 in _sys_sysmips() is unused.  Remove it.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2034/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

8 years agoMIPS: Fix GCC-4.6 'set but not used' warning in signal*.c
David Daney [Mon, 24 Jan 2011 22:51:34 +0000]
MIPS: Fix GCC-4.6 'set but not used' warning in signal*.c

GCC-4.6 can find more unused code than previous versions could.

In the case of protected_restore_fp_context{,32}, the variable tmp is
really used.  Its use is tricky in that we really care about the side
effects of the __put_user() calls.  So we must mark tmp with
__maybe_unused to quiet the warning.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2035/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

8 years agoMIPS: MSP: Fix MSP71xx bpci interrupt handler return value
Anoop P A [Thu, 18 Nov 2010 10:32:50 +0000]
MIPS: MSP: Fix MSP71xx bpci interrupt handler return value

Signed-off-by: Anoop P A <anoop.pa@gmail.com>
To: Ben Hutchings <ben@decadent.org.uk>
To: linux-mips@linux-mips.org
To: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/1804/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

8 years agoMIPS: Select R4K timer lib for all MSP platforms
Anoop P A [Thu, 18 Nov 2010 08:12:28 +0000]
MIPS: Select R4K timer lib for all MSP platforms

Signed-off-by: Anoop P A <anoop.pa@gmail.com>
To: linux-mips@linux-mips.org
To: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/1803/
Tested-by: Shane McDonald <mcdonald.shane@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

8 years agoMIPS: Loongson: Remove ad-hoc cmdline default
Robert Millan [Sun, 7 Nov 2010 12:38:29 +0000]
MIPS: Loongson: Remove ad-hoc cmdline default

Loongson builds have an ad-hoc cmdline default of "console=ttyS0,115200
root=/dev/hda1". These settings come from a vendor; I remember builds
from Lemote branch requiring a "console=tty" override in order to get a
working console.

At least on Yeeloong, they're particularly useless: there's no external
serial port, and the IDE drive is now recognised as /dev/sda.

Signed-off-by: Robert Millan <rmh@gnu.org>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/1759/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

8 years agoMIPS: Clear the correct flag in sysmips(MIPS_FIXADE, ...).
Stefan Oberhumer [Mon, 17 Jan 2011 08:19:53 +0000]
MIPS: Clear the correct flag in sysmips(MIPS_FIXADE, ...).

The sysmips(MIPS_FIXADE, ...) case contains an obvious copy-and-paste
error in the handling of the TIF_LOGADE flag. Fix that.
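
The intended behaviour can be sketched as below. The bit values and the
sketch_fixade() helper are hypothetical, and the sketch assumes that the
two low bits of the argument select the two flags independently; it is
not the code from syscall.c.

```c
/* Hypothetical flag bits, for illustration only. */
#define SKETCH_TIF_FIXADE (1u << 0)
#define SKETCH_TIF_LOGADE (1u << 1)

/*
 * Each argument bit controls exactly one flag. The kind of copy-and-paste
 * error described above is an else branch that clears the other flag
 * instead of the one that was just tested.
 */
static unsigned int sketch_fixade(unsigned int flags, unsigned int arg)
{
	if (arg & 1)
		flags |= SKETCH_TIF_FIXADE;
	else
		flags &= ~SKETCH_TIF_FIXADE;

	if (arg & 2)
		flags |= SKETCH_TIF_LOGADE;
	else
		flags &= ~SKETCH_TIF_LOGADE;	/* clear LOGADE, not FIXADE */

	return flags;
}
```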

Patchwork: https://patchwork.linux-mips.org/patch/1997/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

8 years agoMIPS: Add an unreachable return statement to satisfy buggy GCCs.
David Daney [Wed, 19 Jan 2011 23:24:42 +0000]
MIPS: Add an unreachable return statement to satisfy buggy GCCs.

It was reported that GCC-4.3.3 (with CodeSourcery extensions) fails
without this.

Reported-by: Jonas Gorski <jonas.gorski@gmail.com>
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2010/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

8 years agoMIPS, Tracing: Fix set_graph_function of function graph tracer
Wu Zhangjin [Fri, 21 Jan 2011 18:01:53 +0000]
MIPS, Tracing: Fix set_graph_function of function graph tracer

trace.func should be set to the recorded ip of the mcount calling site
in the __mcount_loc section, so that the function entries configured
through the tracing/set_graph_function interface can be filtered. Until
now it was set to self_ra (the return address of mcount), which
prevented set_graph_function from working as expected.

This fixes it by calculating the right recorded ip in the __mcount_loc
section and assigning it to trace.func.

Reported-by: Zhiping Zhong <xzhong86@163.com>
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Sergei Shtylyov <sshtylyov@mvista.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2017/
Signed-off-by: Ralf Baechle <ralf@duck.linux-mips.net>

8 years agoMIPS, Tracing: Clean up ftrace_make_nop()
Wu Zhangjin [Wed, 19 Jan 2011 19:28:31 +0000]
MIPS, Tracing: Clean up ftrace_make_nop()

This moves the comments out of ftrace_make_nop() and cleans it up.  At the
same time, a macro MCOUNT_OFFSET_INSNS is defined for sharing with the
next patch.

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2008/
Signed-off-by: Ralf Baechle <ralf@duck.linux-mips.net>

8 years agoMIPS, Tracing: Clean up prepare_ftrace_return()
Wu Zhangjin [Wed, 19 Jan 2011 19:28:30 +0000]
MIPS, Tracing: Clean up prepare_ftrace_return()

The old prepare_ftrace_return() for MIPS is confusing and has introduced
some problems. This patch cleans up the names of the arguments, variables
and related functions.

For MIPS, the 2nd argument of prepare_ftrace_return() is not really the
'selfpc' described in ftrace-design.txt; it is the self return address.
This breaks compatibility with the generic interface, but it saves one
unneeded calculation: to get the current function name, the parent
return address and the self return address are enough, so there is no
need to transform the self return address into the self address.

The set_graph_function feature of the function graph tracer is an
exception: it uses the 2nd argument of prepare_ftrace_return() as
'selfpc' to match the user's configuration of function graph entries.
In reality, though, it does not need 'selfpc' but the recorded ip
address of the mcount calling site in the __mcount_loc section. So the
2nd argument of prepare_ftrace_return() is not important; the real
requirement is that the right recorded ip address be calculated and
assigned to trace.func. This will be fixed in the next patches.

Reported-by: Zhiping Zhong <xzhong86@163.com>
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2007/
Signed-off-by: Ralf Baechle <ralf@duck.linux-mips.net>

8 years agoMIPS, Tracing: Substitute in_kernel_space() for in_module()
Wu Zhangjin [Wed, 19 Jan 2011 19:28:29 +0000]
MIPS, Tracing: Substitute in_kernel_space() for in_module()

The old in_module() may not work in some situations (e.g. when module &
kernel are in the same address space with CONFIG_MAPPED_KERNEL=y).
in_kernel_space() is more generic and is also easy to implement by
cloning the existing core_kernel_text(), so replace in_module() with
in_kernel_space().

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2005/
Signed-off-by: Ralf Baechle <ralf@duck.linux-mips.net>

8 years agoMIPS, Tracing: Speed up function graph tracer
Wu Zhangjin [Wed, 19 Jan 2011 19:28:27 +0000]
MIPS, Tracing: Speed up function graph tracer

This simply moves the "ip-=4" statement down to the end of the do { ...
} while (...); loop, which saves one unneeded subtraction and the
subsequent memory load and comparison.

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2006/
Signed-off-by: Ralf Baechle <ralf@duck.linux-mips.net>

8 years agoMIPS: Replace deprecated spinlock initialization
Thomas Gleixner [Sun, 23 Jan 2011 15:17:00 +0000]
MIPS: Replace deprecated spinlock initialization

SPIN_LOCK_UNLOCKED is deprecated. Use the lockdep capable variant instead.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2025/
Signed-off-by: Ralf Baechle <ralf@duck.linux-mips.net>

8 years agoMerge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
Linus Torvalds [Mon, 14 Mar 2011 18:19:50 +0000]
Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6

* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: NFSROOT should default to "proto=udp"
  nfs4: remove duplicated #include
  NFSv4: nfs4_state_mark_reclaim_nograce() should be static
  NFSv4: Fix the setlk error handler
  NFSv4.1: Fix the handling of the SEQUENCE status bits
  NFSv4/4.1: Fix nfs4_schedule_state_recovery abuses
  NFSv4.1 reclaim complete must wait for completion
  NFSv4: remove duplicate clientid in struct nfs_client
  NFSv4.1: Retry CREATE_SESSION on NFS4ERR_DELAY
  sunrpc: Propagate errors from xs_bind() through xs_create_sock()
  (try3-resend) Fix nfs_compat_user_ino64 so it doesn't cause problems if bit 31 or 63 are set in fileid
  nfs: fix compilation warning
  nfs: add kmalloc return value check in decode_and_add_ds
  SUNRPC: Remove resource leak in svc_rdma_send_error()
  nfs: close NFSv4 COMMIT vs. CLOSE race
  SUNRPC: Close a race in __rpc_wait_for_completion_task()

8 years agoMerge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied...
Linus Torvalds [Mon, 14 Mar 2011 18:17:43 +0000]
Merge branch 'drm-fixes' of git://git./linux/kernel/git/airlied/drm-2.6

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon: fix problem with changing active VRAM size. (v2)

8 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog
Linus Torvalds [Mon, 14 Mar 2011 17:15:43 +0000]
Merge git://git./linux/kernel/git/wim/linux-2.6-watchdog

* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  watchdog: hpwdt: eliminate section mismatch warning
  watchdog: w83697ug_wdt: Fix set bit 0 to activate GPIO2
  watchdog: sch311x_wdt: fix printk condition
  watchdog: sch311x_wdt: Fix LDN active check
  watchdog: cpwd: Fix buffer-overflow

8 years agoFix corrupted OSF partition table parsing
Timo Warns [Mon, 14 Mar 2011 13:59:33 +0000]
Fix corrupted OSF partition table parsing

The kernel automatically evaluates partition tables of storage devices.
The code for evaluating OSF partitions contains a bug that leaks data
from kernel heap memory to userspace for certain corrupted OSF
partitions.

In more detail:

  for (i = 0 ; i < le16_to_cpu(label->d_npartitions); i++, partition++) {

iterates from 0 to d_npartitions - 1, where d_npartitions is read from
the partition table without validation and partition is a pointer to an
array of at most 8 d_partitions.

Add the proper and obvious validation.
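
A sketch of the kind of bounds check meant here, using a hypothetical,
simplified label structure (the real OSF disklabel has more fields, and
d_npartitions is assumed to be already converted to CPU byte order):

```c
#include <stdint.h>

#define SKETCH_MAX_OSF_PARTITIONS 8

/* Hypothetical, simplified label; not the on-disk layout. */
struct sketch_osf_label {
	uint16_t d_npartitions;	/* untrusted: comes from the disk */
	uint32_t p_size[SKETCH_MAX_OSF_PARTITIONS];
};

/* Returns the number of non-empty partitions, or 0 for a corrupt label. */
static unsigned int sketch_count_partitions(const struct sketch_osf_label *label)
{
	unsigned int npartitions = label->d_npartitions;
	unsigned int i, found = 0;

	/* Validate before the count is used as a loop bound. */
	if (npartitions > SKETCH_MAX_OSF_PARTITIONS)
		return 0;

	for (i = 0; i < npartitions; i++)
		if (label->p_size[i])
			found++;

	return found;
}
```

Reading the count once into a local variable and naming the limit with a
constant matches the two changes mentioned in the bracketed note below.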

Signed-off-by: Timo Warns <warns@pre-sense.de>
Cc: stable@kernel.org
[ Changed the patch trivially to not repeat the whole le16_to_cpu()
  thing, and to use an explicit constant for the magic value '8' ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8 years agothp+memcg-numa: fix BUG at include/linux/mm.h:370!
Hugh Dickins [Mon, 14 Mar 2011 08:08:47 +0000]
thp+memcg-numa: fix BUG at include/linux/mm.h:370!

THP's collapse_huge_page() has an understandable but ugly difference
in when its huge page is allocated: inside if NUMA but outside if not.
It's hardly surprising that the memcg failure path forgot that, freeing
the page in the non-NUMA case, then hitting a VM_BUG_ON in get_page()
(or even worse, using the freed page).

Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8 years agoMN10300: atomic_read() should ensure it emits a load
David Howells [Mon, 14 Mar 2011 14:49:44 +0000]
MN10300: atomic_read() should ensure it emits a load

atomic_read() needs to ensure that it emits a load (which it can do by using
ACCESS_ONCE()).
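
For illustration, a user-space sketch of the idea. The SKETCH_ACCESS_ONCE
macro follows the volatile-cast idiom and requires GCC or Clang for
__typeof__; the sketch_ names are hypothetical and this is not the
MN10300 header:

```c
typedef struct {
	int counter;
} sketch_atomic_t;

/* The volatile access forces the compiler to emit a real load. */
#define SKETCH_ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

static int sketch_atomic_read(const sketch_atomic_t *v)
{
	return SKETCH_ACCESS_ONCE(v->counter);
}
```

Without the volatile access, the compiler is free to reuse a value it
loaded earlier, for example when atomic_read() is called in a loop.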

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>

8 years agoMN10300: The SMP_ICACHE_INV_FLUSH_RANGE IPI command does not exist
David Howells [Mon, 14 Mar 2011 14:45:29 +0000]
MN10300: The SMP_ICACHE_INV_FLUSH_RANGE IPI command does not exist

The invalidate-only versions of flush_icache_*range() are trying to send the
SMP_ICACHE_INV_FLUSH_RANGE IPI command in SMP kernels when they should be
sending SMP_ICACHE_INV_RANGE, as the former does not exist.

Signed-off-by: David Howells <dhowells@redhat.com>

8 years agoMN10300: Proper use of macros get_user() in the case of incremented pointers
Tkhai Kirill [Mon, 14 Mar 2011 13:27:46 +0000]
MN10300: Proper use of macros get_user() in the case of incremented pointers

Using __get_user_check(x, ptr++, size) leads to a double increment of the
pointer. The get_user() macro expands directly to this macro, and get_user()
itself is used in this way (get_user(x, ptr++)) in some functions of the
kernel. The patch fixes the error.
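
The double increment can be reproduced with a small stand-alone sketch.
Both macros are hypothetical simplifications (a NULL check stands in for
the access check) and rely on GCC/Clang statement expressions and
__typeof__; they are not the MN10300 macros.

```c
#include <stddef.h>

/* Buggy: ptr appears twice, so an argument like p++ is evaluated twice. */
#define SKETCH_GET_BAD(x, ptr) \
	((ptr) != NULL ? ((x) = *(ptr), 0) : -1)

/* Fixed: ptr is evaluated once into a local, which is then reused. */
#define SKETCH_GET_ONCE(x, ptr)					\
	({							\
		const __typeof__(*(ptr)) *sketch_p = (ptr);	\
		sketch_p != NULL ? ((x) = *sketch_p, 0) : -1;	\
	})
```

With int arr[3] = { 10, 20, 30 } and p = arr, SKETCH_GET_BAD(x, p++)
leaves p at arr + 2 and reads arr[1], while SKETCH_GET_ONCE(x, p++) reads
arr[0] and leaves p at arr + 1.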

Reported-by: Tkhai Kirill <tkhai@yandex.ru>
Signed-off-by: David Howells <dhowells@redhat.com>

8 years agox86: ce4100: Set pci ops via callback instead of module init
Sebastian Andrzej Siewior [Mon, 14 Mar 2011 09:33:40 +0000]
x86: ce4100: Set pci ops via callback instead of module init

Setting the pci ops on subsys initcall unconditionally will break
multi platform kernels on anything except ce4100.

Use x86_init.pci.init ops to call this only on real ce4100 platforms.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: sodaville@linutronix.de
LKML-Reference: <20110314093340.GA21026@www.tglx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

8 years agowatchdog: hpwdt: eliminate section mismatch warning
Axel Lin [Wed, 2 Mar 2011 03:49:44 +0000]
watchdog: hpwdt: eliminate section mismatch warning

hpwdt_exit_nmi_decoding() is called in the hpwdt_init_one() error handling,
so remove the __devexit annotation of hpwdt_exit_nmi_decoding().

This patch fixes the warning below:

WARNING: drivers/watchdog/hpwdt.o(.devinit.text+0x36f): Section mismatch in reference from the function hpwdt_init_one() to the function .devexit.text:hpwdt_exit_nmi_decoding()
The function __devinit hpwdt_init_one() references
a function __devexit hpwdt_exit_nmi_decoding().
This is often seen when error handling in the init function
uses functionality in the exit path.
The fix is often to remove the __devexit annotation of
hpwdt_exit_nmi_decoding() so it may be used outside an exit section.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Thomas Mingarelli <Thomas.Mingarelli@hp.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

8 years agowatchdog: w83697ug_wdt: Fix set bit 0 to activate GPIO2
Wim Van Sebroeck [Mon, 21 Feb 2011 19:28:58 +0000]
watchdog: w83697ug_wdt: Fix set bit 0 to activate GPIO2

outb_p(c || 0x01, WDT_EFDR); -> || should be |
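
The difference between the two operators, shown on a plain value (the
helper names are hypothetical; no I/O port is touched):

```c
/* Logical OR: the result is only ever 0 or 1, the other bits are lost. */
static unsigned char sketch_set_bit0_buggy(unsigned char c)
{
	return c || 0x01;
}

/* Bitwise OR: sets bit 0 and keeps all other bits of c. */
static unsigned char sketch_set_bit0(unsigned char c)
{
	return c | 0x01;
}
```

For c = 0xF0 the buggy variant yields 0x01, whereas the fixed one yields
0xF1.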

Reported-By: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

8 years agowatchdog: sch311x_wdt: fix printk condition
Dan Carpenter [Wed, 23 Feb 2011 20:26:01 +0000]
watchdog: sch311x_wdt: fix printk condition

"==" has higher precedence than "&".  The condition is therefore parsed as
if (sch311x_sio_inb(sio_config_port, 0x30) & (0x01 == 0)), which is always
false, so the message is never printed.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

8 years agowatchdog: sch311x_wdt: Fix LDN active check
Wim Van Sebroeck [Mon, 21 Feb 2011 19:09:40 +0000]
watchdog: sch311x_wdt: Fix LDN active check

if (sch311x_sio_inb(sio_config_port, 0x30) && 0x01 == 0) -> && should be &
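
The three forms of this check that appear in the sch311x_wdt fixes can be
compared side by side. The helpers below take the register value as a
parameter and are hypothetical; they only show how each expression is
evaluated.

```c
/* "&&" form: parsed as reg && (0x01 == 0), which is always 0. */
static int sketch_check_logical(unsigned char reg)
{
	return reg && 0x01 == 0;
}

/* "&" without parentheses: parsed as reg & (0x01 == 0), always 0. */
static int sketch_check_unparenthesized(unsigned char reg)
{
	return reg & 0x01 == 0;
}

/* Intended test: nonzero exactly when bit 0 of reg is clear. */
static int sketch_check_fixed(unsigned char reg)
{
	return (reg & 0x01) == 0;
}
```

Only the last variant depends on the register value at all, which is why
both earlier forms never triggered.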

Reported-By: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

8 years agowatchdog: cpwd: Fix buffer-overflow
Wim Van Sebroeck [Mon, 21 Feb 2011 10:52:43 +0000]
watchdog: cpwd: Fix buffer-overflow

cppcheck-1.47 reports:
[drivers/watchdog/cpwd.c:650]: (error) Buffer access out-of-bounds: p.devs

The source code is
for (i = 0; i < 4; i++) {
misc_deregister(&p->devs[i].misc);

where devs is defined as WD_NUMDEVS big and WD_NUMDEVS is equal to 3.
So the 4 should be a 3 or WD_NUMDEVS.
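
Using the array-size constant as the loop bound keeps the two in sync. A
stand-alone sketch with a hypothetical structure (an int flag replaces the
misc device, and sketch_deregister_all() is not the driver function):

```c
#define SKETCH_WD_NUMDEVS 3

struct sketch_cpwd {
	int registered[SKETCH_WD_NUMDEVS];
};

/* Visits exactly SKETCH_WD_NUMDEVS entries; returns how many were active. */
static int sketch_deregister_all(struct sketch_cpwd *p)
{
	int i, n = 0;

	for (i = 0; i < SKETCH_WD_NUMDEVS; i++) {
		if (p->registered[i]) {
			p->registered[i] = 0;
			n++;
		}
	}
	return n;
}
```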

Reported-By: David Binderman
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

8 years agodrm/radeon: fix problem with changing active VRAM size. (v2)
Dave Airlie [Sun, 13 Mar 2011 23:47:24 +0000]
drm/radeon: fix problem with changing active VRAM size. (v2)

So we used to use lpfn directly to restrict VRAM when we couldn't
access the unmappable area, however this was removed in
93225b0d7bc030f4a93165347a65893685822d70 as it also restricted
the gtt placements. However it was only later noticed that this
broke on some hw.

This removes the active_vram_size, and just explicitly sets it
when it changes, TTM/drm_mm will always use the real_vram_size,
and the active vram size will change the TTM size used for lpfn
setting.

We should re-work the fpfn/lpfn to per-placement at some point
I suspect, but that is too late for this kernel.

Hopefully this addresses:
https://bugs.freedesktop.org/show_bug.cgi?id=35254

v2: fix reported useful VRAM size to userspace to be correct.

Signed-off-by: Dave Airlie <airlied@redhat.com>

8 years agocompat breakage in preadv() and pwritev()
Al Viro [Sun, 13 Mar 2011 23:24:46 +0000]
compat breakage in preadv() and pwritev()

Fix for a dumb preadv()/pwritev() compat bug - unlike the native
variants, the compat_...  ones forget to check FMODE_P{READ,WRITE}, so
e.g.  on pipe the native preadv() will fail with -ESPIPE and compat one
will act as readv() and succeed.

Not critical, but it's a clear bug with trivial fix, so IMO it's OK for
-final.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8 years agoMerge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groec...
Linus Torvalds [Sun, 13 Mar 2011 23:01:11 +0000]
Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/staging

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging:
  hwmon/f71882fg: Set platform drvdata to NULL later
  hwmon/f71882fg: Fix a typo in a comment

8 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
Linus Torvalds [Sun, 13 Mar 2011 23:00:49 +0000]
Merge git://git./linux/kernel/git/mason/btrfs-unstable

* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: break out of shrink_delalloc earlier
  btrfs: fix not enough reserved space
  btrfs: fix dip leak
  Btrfs: make sure not to return overlapping extents to fiemap
  Btrfs: deal with short returns from copy_from_user
  Btrfs: fix regressions in copy_from_user handling

8 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6
Linus Torvalds [Sun, 13 Mar 2011 23:00:28 +0000]
Merge git://git./linux/kernel/git/jejb/scsi-rc-fixes-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
  [SCSI] target: Fix t_transport_aborted handling in LUN_RESET + active I/O shutdown

8 years agokbuild: Fix computing srcversion for modules
Michal Marek [Fri, 11 Mar 2011 21:34:47 +0000]
kbuild: Fix computing srcversion for modules

Recent change to fixdep:

    commit b7bd182176960fdd139486cadb9962b39f8a2b50
    Author: Michal Marek <mmarek@suse.cz>
    Date:   Thu Feb 17 15:13:54 2011 +0100

    fixdep: Do not record dependency on the source file itself

changed the format of the *.cmd files without realizing that it is also
used by modpost. Put the path to the source file back into the file, in a
special variable, so that modpost sees all source files when calculating
srcversion for modules.

Reported-and-tested-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8 years agoMerge git://git.infradead.org/users/dwmw2/mtd-2.6.38
Linus Torvalds [Sun, 13 Mar 2011 22:56:22 +0000]
Merge git://git.infradead.org/users/dwmw2/mtd-2.6.38

* git://git.infradead.org/users/dwmw2/mtd-2.6.38:
  mtd: add "platform:" prefix for platform modalias
  mtd: mtd_blkdevs: fix double free on error path
  mtd: amd76xrom: fix oops at boot when resources are not available
  mtd: fix race in cfi_cmdset_0001 driver
  mtd: jedec_probe: initialise make sector erase command variable
  mtd: jedec_probe: Change variable name from cfi_p to cfi

8 years agoMerge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied...
Linus Torvalds [Sun, 13 Mar 2011 22:52:48 +0000]
Merge branch 'drm-fixes' of git://git./linux/kernel/git/airlied/drm-2.6

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon: fix page flipping hangs on r300/r400
  drm/radeon: add pageflip hooks for fusion

8 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
Linus Torvalds [Sun, 13 Mar 2011 22:50:37 +0000]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: fix mis-synchronisation in blkdev_issue_zeroout()

8 years agoMerge branch 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
Linus Torvalds [Sun, 13 Mar 2011 22:50:01 +0000]
Merge branch 'fix/asoc' of git://git./linux/kernel/git/tiwai/sound-2.6

* 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ASoC: Ensure WM8958 gets all WM8994 late revision widgets
  ASoC: Fix typo in late revision WM8994 DAC2R name
  ASoC: Use the correct DAPM context when cleaning up final widget set
  ASoC: Fix broken bitfield definitions in WM8978
  ASoC: AM3517: Update codec name after multi-component update

8 years agogpio: add MODULE_DEVICE_TABLE
Axel Lin [Fri, 11 Mar 2011 22:58:30 +0000]
gpio: add MODULE_DEVICE_TABLE

The device table is required to load modules based on modaliases.

After adding MODULE_DEVICE_TABLE, the entries below will be added to
modules.pcimap:

  pch_gpio             0x00008086 0x00008803 0xffffffff 0xffffffff 0x00000000 0x00000000 0x0
  ml_ioh_gpio          0x000010db 0x0000802e 0xffffffff 0xffffffff 0x00000000 0x00000000 0x0

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Cc: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8 years agothp: fix page_referenced to modify mapcount/vm_flags only if page is found
Andrea Arcangeli [Fri, 11 Mar 2011 22:58:29 +0000]
thp: fix page_referenced to modify mapcount/vm_flags only if page is found

When vmscan.c calls page_referenced(), if an anon page was created
before a process forked, rmap will search for it in both of the
processes, even though one of them might have since broken COW.

If the child process mlocks the vma that the COWed page belongs to,
page_referenced() running on the page mapped by the parent would lead to
*vm_flags getting VM_LOCKED set erroneously (leading to the references
on the parent page being ignored and evicting the parent page too
early).

*mapcount would also be decremented by page_referenced_one even if the
page wasn't found by page_check_address.

This also lets pmdp_clear_flush_young_notify() go ahead on a
pmd_trans_splitting() pmd.

We hold the page_table_lock so __split_huge_page_map() must wait for
pmdp_clear_flush_young_notify() to complete before it can modify the
pmd.  The pmd is also still mapped in userland so the young bit may
materialize through a tlb miss before split_huge_page_map runs.

This will provide a more accurate page_referenced() behavior during
split_huge_page().

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8 years agohwmon/f71882fg: Set platform drvdata to NULL later
Hans de Goede [Sun, 13 Mar 2011 12:50:33 +0000]
hwmon/f71882fg: Set platform drvdata to NULL later

This avoids a possible race leading to trying to dereference NULL.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>

8 years agohwmon/f71882fg: Fix a typo in a comment
Hans de Goede [Sun, 13 Mar 2011 12:50:32 +0000]
hwmon/f71882fg: Fix a typo in a comment

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>

8 years agodrm/radeon: fix page flipping hangs on r300/r400
Dave Airlie [Fri, 11 Mar 2011 11:17:41 +0000]
drm/radeon: fix page flipping hangs on r300/r400

We've been getting reports of complete system lockups with rv3xx hw on
AGP and PCIE when running gnome-shell or kwin with compositing.

It appears the hw really doesn't like setting these registers while
stuff is running. This moves the setting of the registers into the modeset
since they aren't required to be changed anywhere else.

fixes: https://bugs.freedesktop.org/show_bug.cgi?id=35183

Reported-and-tested-by: Álmos <aaalmosss@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

8 years agoBtrfs: break out of shrink_delalloc earlier
Chris Mason [Sat, 12 Mar 2011 12:08:42 +0000]
Btrfs: break out of shrink_delalloc earlier

Josef had changed shrink_delalloc to exit after three shrink
attempts, which wasn't quite enough because new writers could
race in and steal free space.

But it also fixed deadlocks and stalls as we tried to recover
delalloc reservations.  The code was tweaked to loop 1024
times, and would reset the counter any time a small amount
of progress was made.  This was too drastic, and with a
lot of writers we can end up stuck in shrink_delalloc forever.

The shrink_delalloc loop is fairly complex because the caller is looping
too, and the caller will go ahead and force a transaction commit to make
sure we reclaim space.

This reworks things to exit shrink_delalloc when we've forced some
writeback and the delalloc reservations have gone down.  This means
the writeback has not just started but has also finished at
least some of the metadata changes required to reclaim delalloc
space.

If we've got this wrong, we're returning ENOSPC too early, which
is a big improvement over the current behavior of hanging the machine.

Test 224 in xfstests hammers on this nicely, and with 1000 writers
trying to fill a 1GB drive we get our first ENOSPC at 93% full.  The
other writers are able to continue until we get 100%.

This is a worst case test for btrfs because the 1000 writers are doing
small IO, and the small FS size means we don't have a lot of room
for metadata chunks.

Signed-off-by: Chris Mason <chris.mason@oracle.com>