6 years agoperf/x86: Fix intel QPI uncore event definitions
Vince Weaver [Fri, 2 Aug 2013 14:47:34 +0000]
perf/x86: Fix intel QPI uncore event definitions

commit c9601247f8f3fdc18aed7ed7e490e8dfcd07f122 upstream.

John McCalpin reports that the "drs_data" and "ncb_data" QPI
uncore events are missing the "extra bit" and always return zero
values unless the bit is properly set.

More details from him:

 According to the Xeon E5-2600 Product Family Uncore Performance
 Monitoring Guide, Table 2-94, about 1/2 of the QPI Link Layer events
 (including the ones that "perf" calls "drs_data" and "ncb_data") require
 that the "extra bit" be set.

 This was confusing for a while -- a note at the bottom of page 94 says
 that the "extra bit" is bit 16 of the control register.
 Unfortunately, Table 2-86 clearly says that bit 16 is reserved and must
 be zero.  Looking around a bit, I found that bit 21 appears to be the
 correct "extra bit", and further investigation shows that "perf" actually
 agrees with me:
[root@c560-003.stampede]# cat /sys/bus/event_source/devices/uncore_qpi_0/format/event

 So the command
# perf -e "uncore_qpi_0/event=drs_data/"
 Is the same as
# perf -e "uncore_qpi_0/event=0x02,umask=0x08/"
 While it should be
# perf -e "uncore_qpi_0/event=0x102,umask=0x08/"

 I confirmed that this last version gives results that agree with the
 amount of data that I expected the STREAM benchmark to move across the QPI
 link in the second (cross-chip) test of the original script.

Reported-by: John McCalpin <mccalpin@tacc.utexas.edu>
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Cc: zheng.z.yan@intel.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1308021037280.26119@vincent-weaver-1.um.maine.edu
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoLinux 3.10.7
Greg Kroah-Hartman [Thu, 15 Aug 2013 05:59:42 +0000]
Linux 3.10.7

6 years agoMIPS: Expose missing pci_io{map,unmap} declarations
Markos Chandras [Mon, 17 Jun 2013 08:09:00 +0000]
MIPS: Expose missing pci_io{map,unmap} declarations

commit 78857614104a26cdada4c53eea104752042bf5a1 upstream.

The GENERIC_PCI_IOMAP does not depend on CONFIG_PCI so move
it to the CONFIG_MIPS symbol so it's always selected for MIPS.
This fixes the missing pci_iomap declaration for MIPS.
Moreover, the pci_iounmap function was not defined in the
io.h header file if the CONFIG_PCI symbol is not set,
but it should since MIPS is not using CONFIG_GENERIC_IOMAP.

This fixes the following problem on a allyesconfig:

drivers/net/ethernet/3com/3c59x.c:1031:2: error: implicit declaration of
function 'pci_iomap' [-Werror=implicit-function-declaration]
drivers/net/ethernet/3com/3c59x.c:1044:3: error: implicit declaration of
function 'pci_iounmap' [-Werror=implicit-function-declaration]

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Acked-by: Steven J. Hill <Steven.Hill@imgtec.com>
Patchwork: https://patchwork.linux-mips.org/patch/5478/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agomtd: omap2: allow bulding as a module
Arnd Bergmann [Tue, 23 Apr 2013 13:47:50 +0000]
mtd: omap2: allow bulding as a module

commit 930d800bded771b26d9944c47810829130ff7c8c upstream.

The omap2 nand device driver calls into the the elm code, which can
be a loadable module, and in that case it cannot be built-in itself.
I can see no reason why the omap2 driver cannot also be a module,
so let's make the option "tristate" in Kconfig to fix this allmodconfig
build error:

ERROR: "elm_config" [drivers/mtd/nand/omap2.ko] undefined!
ERROR: "elm_decode_bch_error_page" [drivers/mtd/nand/omap2.ko] undefined!

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Tony Lindgren <tony@atomide.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Afzal Mohammed <afzal@ti.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoSCSI: nsp32: use mdelay instead of large udelay constants
Arnd Bergmann [Thu, 14 Mar 2013 14:21:36 +0000]
SCSI: nsp32: use mdelay instead of large udelay constants

commit b497ceb964a80ebada3b9b3cea4261409039e25a upstream.

ARM cannot handle udelay for more than 2 miliseconds, so we
should use mdelay instead for those.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: GOTO Masanori <gotom@debian.or.jp>
Cc: YOKOTA Hiroshi <yokota@netlab.is.tsukuba.ac.jp>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agodrm/radeon: always program the MC on startup
Alex Deucher [Sun, 4 Aug 2013 16:13:17 +0000]
drm/radeon: always program the MC on startup

commit 6fab3febf6d949b0a12b1e4e73db38e4a177a79e upstream.

For r6xx+ asics.  This mirrors the behavior of pre-r6xx
asics.  We need to program the MC even if something
else in startup() fails.  Failure to do so results in
an unusable GPU.

Based on a fix from: Mark Kettenis <kettenis@openbsd.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[ rebased for 3.10 and dropped the drivers/gpu/drm/radeon/cik.c
  bit as it's 3.11 specific code / tmb ]
Signed-off-by: Thomas Backlund <tmb@mageia.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agodrm/radeon: only save UVD bo when we have open handles
Christian König [Mon, 5 Aug 2013 12:10:55 +0000]
drm/radeon: only save UVD bo when we have open handles

commit 4ad9c1c774c2af152283f510062094e768876f55 upstream.

Otherwise just reinitialize from scratch on resume,
and so make it more likely to succeed.

v2: rebased for 3.10-stable tree

Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agodrm/radeon: fix halting UVD
Christian König [Thu, 1 Aug 2013 15:34:07 +0000]
drm/radeon: fix halting UVD

commit 2858c00d2823c83acce2a1175dbabb2cebee8678 upstream.

Removing the clock/power or resetting the VCPU can cause
hangs if that happens in the middle of a register write.

Stall the memory and register bus before putting the VCPU
into reset. Keep it in reset when unloading the module or

v2: rebased on 3.10-stable tree

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agodrm/i915: initialize gt_lock early with other spin locks
Jani Nikula [Thu, 25 Jul 2013 09:44:34 +0000]
drm/i915: initialize gt_lock early with other spin locks

commit 14c5cec5d0cd73e7e9d4fbea2bbfeea8f3ade871 upstream.

commit 181d1b9e31c668259d3798c521672afb8edd355c
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Sun Jul 21 13:16:24 2013 +0200

    drm/i915: fix up gt init sequence fallout

moved dev_priv->gt_lock initialization after use. Do the initialization
much earlier with other spin lock initializations.

Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Zhouping Liu <zliu@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoreiserfs: fix deadlock in umount
Al Viro [Mon, 5 Aug 2013 13:37:37 +0000]
reiserfs: fix deadlock in umount

commit 672fe15d091ce76d6fb98e489962e9add7c1ba4c upstream.

Since remove_proc_entry() started to wait for IO in progress (i.e.
since 2007 or so), the locking in fs/reiserfs/proc.c became wrong;
if procfs read happens between the moment when umount() locks the
victim superblock and removal of /proc/fs/reiserfs/<device>/*,
we'll get a deadlock - read will wait for s_umount (in sget(),
called by r_start()), while umount will wait in remove_proc_entry()
for that read to finish, holding s_umount all along.

Fortunately, the same change allows a much simpler race avoidance -
all we need to do is remove the procfs entries in the very beginning
of reiserfs ->kill_sb(); that'll guarantee that pointer to superblock
will remain valid for the duration for procfs IO, so we don't need
sget() to keep the sucker alive.  As the matter of fact, we can
get rid of the home-grown iterator completely, and use single_open()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agodebugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs)
Oleg Nesterov [Fri, 26 Jul 2013 15:12:56 +0000]
debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs)

commit 776164c1faac4966ab14418bb0922e1820da1d19 upstream.

debugfs_remove_recursive() is wrong,

1. it wrongly assumes that !list_empty(d_subdirs) means that this
   dir should be removed.

   This is not that bad by itself, but:

2. if d_subdirs does not becomes empty after __debugfs_remove()
   it gives up and silently fails, it doesn't even try to remove
   other entries.

   However ->d_subdirs can be non-empty because it still has the
   already deleted !debugfs_positive() entries.

3. simple_release_fs() is called even if __debugfs_remove() fails.

Suppose we have


and someone opens dir1/dir2/file2.

Now, debugfs_remove_recursive(dir1/dir2) succeeds, and dir1/dir2 goes

But debugfs_remove_recursive(dir1) silently fails and doesn't remove
this directory. Because it tries to delete (the already deleted)
dir1/dir2/file2 again and then fails due to "Avoid infinite loop"



cd /sys/kernel/debug/tracing
echo 'p:probe/sigprocmask sigprocmask' >> kprobe_events
sleep 1000 < events/probe/sigprocmask/id &
echo -n >| kprobe_events

[ -d events/probe ] && echo "ERR!! failed to rm probe"

And after that it is not possible to create another probe entry.

With this patch debugfs_remove_recursive() skips !debugfs_positive()
files although this is not strictly needed. The most important change
is that it does not try to make ->d_subdirs empty, it simply scans
the whole list(s) recursively and removes as much as possible.

Link: http://lkml.kernel.org/r/20130726151256.GC19472@redhat.com

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agousb: core: don't try to reset_device() a port that got just disconnected
Julius Werner [Wed, 31 Jul 2013 02:51:20 +0000]
usb: core: don't try to reset_device() a port that got just disconnected

commit 481f2d4f89f87a0baa26147f323380e31cfa7c44 upstream.

The USB hub driver's event handler contains a check to catch SuperSpeed
devices that transitioned into the SS.Inactive state and tries to fix
them with a reset. It decides whether to do a plain hub port reset or
call the usb_reset_device() function based on whether there was a device
attached to the port.

However, there are device/hub combinations (found with a JetFlash
Transcend mass storage stick (8564:1000) on the root hub of an Intel
LynxPoint PCH) which can transition to the SS.Inactive state on
disconnect (and stay there long enough for the host to notice). In this
case, above-mentioned reset check will call usb_reset_device() on the
stale device data structure. The kernel will send pointless LPM control
messages to the no longer connected device address and can even cause
several 5 second khubd stalls on some (buggy?) host controllers, before
finally accepting the device's fate amongst a flurry of error messages.

This patch makes the choice of reset dependent on the port status that
has just been read from the hub in addition to the existence of an
in-kernel data structure for the device, and only proceeds with the more
extensive reset if both are valid.

Signed-off-by: Julius Werner <jwerner@chromium.org>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agozram: allow request end to coincide with disksize
Sergey Senozhatsky [Sat, 22 Jun 2013 14:21:00 +0000]
zram: allow request end to coincide with disksize

commit 75c7caf5a052ffd8db3312fa7864ee2d142890c4 upstream.

Pass valid_io_request() checks if request end coincides with disksize
(end equals bound), only fail if we attempt to read beyond the bound.

mkfs.ext2 produces numerous errors:
[ 2164.632747] quiet_error: 1 callbacks suppressed
[ 2164.633260] Buffer I/O error on device zram0, logical block 153599
[ 2164.633265] lost page write due to I/O error on zram0

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Thomas Backlund <tmb@mageia.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agocifs: don't instantiate new dentries in readdir for inodes that need to be revalidate...
Jeff Layton [Wed, 7 Aug 2013 14:29:08 +0000]
cifs: don't instantiate new dentries in readdir for inodes that need to be revalidated immediately

commit 757c4f6260febff982276818bb946df89c1105aa upstream.

David reported that commit c2b93e06 (cifs: only set ops for inodes in
I_NEW state) caused a regression with mfsymlinks. Prior to that patch,
if a mfsymlink dentry was instantiated at readdir time, the inode would
get a new set of ops when it was revalidated. After that patch, this
did not occur.

This patch addresses this by simply skipping instantiating dentries in
the readdir codepath when we know that they will need to be immediately
revalidated. The next attempt to use that dentry will cause a new lookup
to occur (which is basically what we want to happen anyway).

Reported-and-Tested-by: David McBride <dwm37@cam.ac.uk>
Cc: "Stefan (metze) Metzmacher" <metze@samba.org>
Cc: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agocifs: extend the buffer length enought for sprintf() using
Chen Gang [Fri, 19 Jul 2013 01:01:36 +0000]
cifs: extend the buffer length enought for sprintf() using

commit 057d6332b24a4497c55a761c83c823eed9e3f23b upstream.

For cifs_set_cifscreds() in "fs/cifs/connect.c", 'desc' buffer length
is 'CIFSCREDS_DESC_SIZE' (56 is less than 256), and 'ses->domainName'
length may be "255 + '\0'".

The related sprintf() may cause memory overflow, so need extend related
buffer enough to hold all things.

It is also necessary to be sure of 'ses->domainName' must be less than
256, and define the related macro instead of hard code number '256'.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Reviewed-by: Scott Lovenberg <scott.lovenberg@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoext4: flush the extent status cache during EXT4_IOC_SWAP_BOOT
Theodore Ts'o [Mon, 12 Aug 2013 13:29:30 +0000]
ext4: flush the extent status cache during EXT4_IOC_SWAP_BOOT

commit cde2d7a796f7e895e25b43471ed658079345636d upstream.

Previously we weren't swapping only some of the extent_status LRU
fields during the processing of the EXT4_IOC_SWAP_BOOT ioctl.  The
much safer thing to do is to just completely flush the extent status
tree when doing the swap.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Zheng Liu <gnehzuil.liu@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoext4: fix mount/remount error messages for incompatible mount options
Piotr Sarna [Fri, 9 Aug 2013 03:02:24 +0000]
ext4: fix mount/remount error messages for incompatible mount options

commit 6ae6514b33f941d3386da0dfbe2942766eab1577 upstream.

Commit 5688978 ("ext4: improve handling of conflicting mount options")
introduced incorrect messages shown while choosing wrong mount options.

First of all, both cases of incorrect mount options,
"data=journal,delalloc" and "data=journal,dioread_nolock" result in
the same error message.

Secondly, the problem above isn't solved for remount option: the
mismatched parameter is simply ignored.  Moreover, ext4_msg states
that remount with options "data=journal,delalloc" succeeded, which is
not true.

To fix it up, I added a simple check after parse_options() call to
ensure that data=journal and delalloc/dioread_nolock parameters are
not present at the same time.

Signed-off-by: Piotr Sarna <p.sarna@partner.samsung.com>
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoext4: allow the mount options nodelalloc and data=journal
Theodore Ts'o [Fri, 9 Aug 2013 03:01:24 +0000]
ext4: allow the mount options nodelalloc and data=journal

commit 59d9fa5c2e9086db11aa287bb4030151d0095a17 upstream.

Commit 26092bf ("ext4: use a table-driven handler for mount options")
wrongly disallows the specifying the mount options nodelalloc and
data=journal simultaneously.  This is incorrect; it should have only
disallowed the combination of delalloc and data=journal

Reported-by: Piotr Sarna <p.sarna@partner.samsung.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agodrm/radeon: stop sending invalid UVD destroy msg
Christian König [Mon, 5 Aug 2013 12:10:56 +0000]
drm/radeon: stop sending invalid UVD destroy msg

commit 641a00593f7d07eab778fbabf546fb68fff3d5ce upstream.

We also need to check the handle.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agodrm/radeon: select audio dto based on encoder id for DCE3
Alex Deucher [Mon, 29 Jul 2013 22:56:13 +0000]
drm/radeon: select audio dto based on encoder id for DCE3

commit e1accbf0543eecfdb161131208c3dfefee22d61f upstream.

There are two audio dtos on radeon asics that you can
select between.  Normally, dto0 is used for hdmi and
dto1 for DP, but it seems that the dto is somehow
tied to the encoders on DCE3 asics.


Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agodrm: Don't pass negative delta to ktime_sub_ns()
Michel Dänzer [Wed, 12 Jun 2013 09:58:44 +0000]
drm: Don't pass negative delta to ktime_sub_ns()

commit e91abf80a0998f326107874c88d549f94839f13c upstream.

It takes an unsigned value. This happens not to blow up on 64-bit
architectures, but it does on 32-bit, causing
drm_calc_vbltimestamp_from_scanoutpos() to calculate totally bogus
timestamps for vblank events. Which in turn causes e.g. gnome-shell to
hang after a DPMS off cycle with current xf86-video-ati Git.

[airlied: regression introduced in drm: use monotonic time in drm_calc_vbltimestamp_from_scanoutpos]

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59339
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59836
Tested-by: shui yangwei <yangweix.shui@intel.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agodrm/ast: invalidate page tables when pinning a BO
Dave Airlie [Wed, 7 Aug 2013 00:01:56 +0000]
drm/ast: invalidate page tables when pinning a BO

commit 3ac65259328324de323dc006b52ff7c1a5b18d19 upstream.

same fix as cirrus and mgag200.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agodrm/mgag200: Invalidate page tables when pinning a BO
Egbert Eich [Wed, 17 Jul 2013 15:40:56 +0000]
drm/mgag200: Invalidate page tables when pinning a BO

commit ecaac1c866bcda4780a963b3d18cd310d971aea3 upstream.

When a BO gets pinned the placement may get changed. If the memory is
mapped into user space and user space has already accessed the mapped
range the page tables are set up but now point to the wrong memory.
Set bo.mdev->dev_mapping in mgag200_bo_create() to make sure that
ttm_bo_unmap_virtual() called from ttm_bo_handle_move_mem() will take
care of this.

v2: Don't call ttm_bo_unmap_virtual() in mgag200_bo_pin(), fix comment.

Signed-off-by: Egbert Eich <eich@suse.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agodrm/cirrus: Invalidate page tables when pinning a BO
Michal Srb [Tue, 6 Aug 2013 13:26:50 +0000]
drm/cirrus: Invalidate page tables when pinning a BO

commit 109a51598869a39fdcec2d49672a9a39b6d89481 upstream.

This is a cirrus version of Egbert Eich's patch for mgag200.

Without bo.bdev->dev_mapping set, the ttm_bo_unmap_virtual_locked
called from ttm_bo_handle_move_mem returns with no effect. If any
application accessed the memory before it was moved, it will
access wrong memory next time. This causes crashes when changing
resolution down.

Signed-off-by: Michal Srb <msrb@suse.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agovirtio: console: return -ENODEV on all read operations after unplug
Amit Shah [Mon, 29 Jul 2013 04:53:21 +0000]
virtio: console: return -ENODEV on all read operations after unplug

commit 96f97a83910cdb9d89d127c5ee523f8fc040a804 upstream.

If a port gets unplugged while a user is blocked on read(), -ENODEV is
returned.  However, subsequent read()s returned 0, indicating there's no
host-side connection (but not indicating the device went away).

This also happened when a port was unplugged and the user didn't have
any blocking operation pending.  If the user didn't monitor the SIGIO
signal, they won't have a chance to find out if the port went away.

Fix by returning -ENODEV on all read()s after the port gets unplugged.
write() already behaves this way.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agovirtio: console: fix raising SIGIO after port unplug
Amit Shah [Mon, 29 Jul 2013 04:51:32 +0000]
virtio: console: fix raising SIGIO after port unplug

commit 92d3453815fbe74d539c86b60dab39ecdf01bb99 upstream.

SIGIO should be sent when a port gets unplugged.  It should only be sent
to prcesses that have the port opened, and have asked for SIGIO to be
delivered.  We were clearing out guest_connected before calling
send_sigio_to_port(), resulting in a sigio not getting sent to

Fix by setting guest_connected to false after invoking the sigio

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agovirtio: console: clean up port data immediately at time of unplug
Amit Shah [Mon, 29 Jul 2013 04:50:29 +0000]
virtio: console: clean up port data immediately at time of unplug

commit ea3768b4386a8d1790f4cc9a35de4f55b92d6442 upstream.

We used to keep the port's char device structs and the /sys entries
around till the last reference to the port was dropped.  This is
actually unnecessary, and resulted in buggy behaviour:

1. Open port in guest
2. Hot-unplug port
3. Hot-plug a port with the same 'name' property as the unplugged one

This resulted in hot-plug being unsuccessful, as a port with the same
name already exists (even though it was unplugged).

This behaviour resulted in a warning message like this one:

WARNING: at fs/sysfs/dir.c:512 sysfs_add_one+0xc9/0x130() (Not tainted)
Hardware name: KVM
sysfs: cannot create duplicate filename

Call Trace:
 [<ffffffff8106b607>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff8106b6f6>] ? warn_slowpath_fmt+0x46/0x50
 [<ffffffff811f2319>] ? sysfs_add_one+0xc9/0x130
 [<ffffffff811f23e8>] ? create_dir+0x68/0xb0
 [<ffffffff811f2469>] ? sysfs_create_dir+0x39/0x50
 [<ffffffff81273129>] ? kobject_add_internal+0xb9/0x260
 [<ffffffff812733d8>] ? kobject_add_varg+0x38/0x60
 [<ffffffff812734b4>] ? kobject_add+0x44/0x70
 [<ffffffff81349de4>] ? get_device_parent+0xf4/0x1d0
 [<ffffffff8134b389>] ? device_add+0xc9/0x650


Instead of relying on guest applications to release all references to
the ports, we should go ahead and unregister the port from all the core
layers.  Any open/read calls on the port will then just return errors,
and an unplug/plug operation on the host will succeed as expected.

This also caused buggy behaviour in case of the device removal (not just
a port): when the device was removed (which means all ports on that
device are removed automatically as well), the ports with active
users would clean up only when the last references were dropped -- and
it would be too late then to be referencing char device pointers,
resulting in oopses:

PID: 6162   TASK: ffff8801147ad500  CPU: 0   COMMAND: "cat"
 #0 [ffff88011b9d5a90] machine_kexec at ffffffff8103232b
 #1 [ffff88011b9d5af0] crash_kexec at ffffffff810b9322
 #2 [ffff88011b9d5bc0] oops_end at ffffffff814f4a50
 #3 [ffff88011b9d5bf0] die at ffffffff8100f26b
 #4 [ffff88011b9d5c20] do_general_protection at ffffffff814f45e2
 #5 [ffff88011b9d5c50] general_protection at ffffffff814f3db5
    [exception RIP: strlen+2]
    RIP: ffffffff81272ae2  RSP: ffff88011b9d5d00  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffff880118901c18  RCX: 0000000000000000
    RDX: ffff88011799982c  RSI: 00000000000000d0  RDI: 3a303030302f3030
    RBP: ffff88011b9d5d38   R8: 0000000000000006   R9: ffffffffa0134500
    R10: 0000000000001000  R11: 0000000000001000  R12: ffff880117a1cc10
    R13: 00000000000000d0  R14: 0000000000000017  R15: ffffffff81aff700
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #6 [ffff88011b9d5d00] kobject_get_path at ffffffff8126dc5d
 #7 [ffff88011b9d5d40] kobject_uevent_env at ffffffff8126e551
 #8 [ffff88011b9d5dd0] kobject_uevent at ffffffff8126e9eb
 #9 [ffff88011b9d5de0] device_del at ffffffff813440c7


So clean up when we have all the context, and all that's left to do when
the references to the port have dropped is to free up the port struct

Reported-by: chayang <chayang@redhat.com>
Reported-by: YOGANANTH SUBRAMANIAN <anantyog@in.ibm.com>
Reported-by: FuXiangChun <xfu@redhat.com>
Reported-by: Qunfang Zhang <qzhang@redhat.com>
Reported-by: Sibiao Luo <sluo@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agovirtio: console: fix race in port_fops_open() and port unplug
Amit Shah [Mon, 29 Jul 2013 04:47:13 +0000]
virtio: console: fix race in port_fops_open() and port unplug

commit 671bdea2b9f210566610603ecbb6584c8a201c8c upstream.

Between open() being called and processed, the port can be unplugged.
Check if this happened, and bail out.

A simple test script to reproduce this is:

while true; do for i in $(seq 1 100); do echo $i > /dev/vport0p3; done; done;

This opens and closes the port a lot of times; unplugging the port while
this is happening triggers the bug.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agovirtio: console: fix race with port unplug and open/close
Amit Shah [Mon, 29 Jul 2013 04:46:13 +0000]
virtio: console: fix race with port unplug and open/close

commit 057b82be3ca3d066478e43b162fc082930a746c9 upstream.

There's a window between find_port_by_devt() returning a port and us
taking a kref on the port, where the port could get unplugged.  Fix it
by taking the reference in find_port_by_devt() itself.

Problem reported and analyzed by Mateusz Guzik.

Reported-by: Mateusz Guzik <mguzik@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agovirtio/console: Add pipe_lock/unlock for splice_write
Yoshihiro YUNOMAE [Tue, 23 Jul 2013 02:00:49 +0000]
virtio/console: Add pipe_lock/unlock for splice_write

commit 2b4fbf029dff5a28d9bf646346dea891ec43398a upstream.

Add pipe_lock/unlock for splice_write to avoid oops by following competition:

(1) An application gets fds of a trace buffer, virtio-serial, pipe.
(2) The application does fork()
(3) The processes execute splice_read(trace buffer) and
    splice_write(virtio-serial) via same pipe.

        <parent>                   <child>
  get fds of a trace buffer,
         virtio-serial, pipe
          |                           |
      splice(read)                    |           ---+
      splice(write)                   |              +-- no competition
          |                       splice(read)       |
          |                       splice(write)   ---+
          |                           |
      splice(read)                    |
      splice(write)               splice(read)    ------ competition
          |                       splice(write)

Two processes share a pipe_inode_info structure. If the child execute
splice(read) when the parent tries to execute splice(write), the
structure can be broken. Existing virtio-serial driver does not get
lock for the structure in splice_write, so this competition will induce

<oops messages>
 BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
 IP: [<ffffffff811a6b5f>] splice_from_pipe_feed+0x6f/0x130
 PGD 7223e067 PUD 72391067 PMD 0
 Oops: 0000 [#1] SMP
 Modules linked in: lockd bnep bluetooth rfkill sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd soundcore pcspkr virtio_net virtio_balloon i2c_piix4 i2c_core microcode uinput floppy
 CPU: 0 PID: 1072 Comm: compete-test Not tainted 3.10.0ws+ #55
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
 task: ffff880071b98000 ti: ffff88007b55e000 task.ti: ffff88007b55e000
 RIP: 0010:[<ffffffff811a6b5f>]  [<ffffffff811a6b5f>] splice_from_pipe_feed+0x6f/0x130
 RSP: 0018:ffff88007b55fd78  EFLAGS: 00010287
 RAX: 0000000000000000 RBX: ffff88007b55fe20 RCX: 0000000000000000
 RDX: 0000000000001000 RSI: ffff88007a95ba30 RDI: ffff880036f9e6c0
 RBP: ffff88007b55fda8 R08: 00000000000006ec R09: ffff880077626708
 R10: 0000000000000003 R11: ffffffff8139ca59 R12: ffff88007a95ba30
 R13: 0000000000000000 R14: ffffffff8139dd00 R15: ffff880036f9e6c0
 FS:  00007f2e2e3a0740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000018 CR3: 0000000071bd1000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  ffffffff8139ca59 ffff88007b55fe20 ffff880036f9e6c0 ffffffff8139dd00
  ffff8800776266c0 ffff880077626708 ffff88007b55fde8 ffffffff811a6e8e
  ffff88007b55fde8 ffffffff8139ca59 ffff880036f9e6c0 ffff88007b55fe20
 Call Trace:
  [<ffffffff8139ca59>] ? alloc_buf.isra.13+0x39/0xb0
  [<ffffffff8139dd00>] ? virtcons_restore+0x100/0x100
  [<ffffffff811a6e8e>] __splice_from_pipe+0x7e/0x90
  [<ffffffff8139ca59>] ? alloc_buf.isra.13+0x39/0xb0
  [<ffffffff8139d739>] port_fops_splice_write+0xe9/0x140
  [<ffffffff8127a3f4>] ? selinux_file_permission+0xc4/0x120
  [<ffffffff8139d650>] ? wait_port_writable+0x1b0/0x1b0
  [<ffffffff811a6fe0>] do_splice_from+0xa0/0x110
  [<ffffffff811a951f>] SyS_splice+0x5ff/0x6b0
  [<ffffffff8161facf>] tracesys+0xdd/0xe2
 Code: 49 8b 87 80 00 00 00 4c 8d 24 d0 8b 53 04 41 8b 44 24 0c 4d 8b 6c 24 10 39 d0 89 03 76 02 89 13 49 8b 44 24 10 4c 89 e6 4c 89 ff <ff> 50 18 85 c0 0f 85 aa 00 00 00 48 89 da 4c 89 e6 4c 89 ff 41
 RIP  [<ffffffff811a6b5f>] splice_from_pipe_feed+0x6f/0x130
  RSP <ffff88007b55fd78>
 CR2: 0000000000000018
 ---[ end trace 24572beb7764de59 ]---

V2: Fix a locking problem for error
V3: Add Reviewed-by lines and stable@ line in sign-off area

Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agovirtio/console: Quit from splice_write if pipe->nrbufs is 0
Yoshihiro YUNOMAE [Tue, 23 Jul 2013 02:00:49 +0000]
virtio/console: Quit from splice_write if pipe->nrbufs is 0

commit 68c034fefe20eaf7d5569aae84584b07987ce50a upstream.

Quit from splice_write if pipe->nrbufs is 0 for avoiding oops in virtio-serial.

When an application was doing splice from a kernel buffer to virtio-serial on
a guest, the application received signal(SIGINT). This situation will normally
happen, but the kernel executed a kernel panic by oops as follows:

 BUG: unable to handle kernel paging request at ffff882071c8ef28
 IP: [<ffffffff812de48f>] sg_init_table+0x2f/0x50
 PGD 1fac067 PUD 0
 Oops: 0000 [#1] SMP
 Modules linked in: lockd sunrpc bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd microcode virtio_balloon virtio_net pcspkr soundcore i2c_piix4 i2c_core uinput floppy
 CPU: 1 PID: 908 Comm: trace-cmd Not tainted 3.10.0+ #49
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
 task: ffff880071c64650 ti: ffff88007bf24000 task.ti: ffff88007bf24000
 RIP: 0010:[<ffffffff812de48f>]  [<ffffffff812de48f>] sg_init_table+0x2f/0x50
 RSP: 0018:ffff88007bf25dd8  EFLAGS: 00010286
 RAX: 0000001fffffffe0 RBX: ffff882071c8ef28 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880071c8ef48
 RBP: ffff88007bf25de8 R08: ffff88007fd15d40 R09: ffff880071c8ef48
 R10: ffffea0001c71040 R11: ffffffff8139c555 R12: 0000000000000000
 R13: ffff88007506a3c0 R14: ffff88007c862500 R15: ffff880071c8ef00
 FS:  00007f0a3646c740(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffff882071c8ef28 CR3: 000000007acbb000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  ffff880071c8ef48 ffff88007bf25e20 ffff88007bf25e88 ffffffff8139d6fa
  ffff88007bf25e28 ffffffff8127a3f4 0000000000000000 0000000000000000
  ffff880071c8ef48 0000100000000000 0000000000000003 ffff88007bf25e08
 Call Trace:
  [<ffffffff8139d6fa>] port_fops_splice_write+0xaa/0x130
  [<ffffffff8127a3f4>] ? selinux_file_permission+0xc4/0x120
  [<ffffffff8139d650>] ? wait_port_writable+0x1b0/0x1b0
  [<ffffffff811a6fe0>] do_splice_from+0xa0/0x110
  [<ffffffff811a951f>] SyS_splice+0x5ff/0x6b0
  [<ffffffff8161f8c2>] system_call_fastpath+0x16/0x1b
 Code: c1 e2 05 48 89 e5 48 83 ec 10 4c 89 65 f8 41 89 f4 31 f6 48 89 5d f0 48 89 fb e8 8d ce ff ff 41 8d 44 24 ff 48 c1 e0 05 48 01 c3 <48> 8b 03 48 83 e0 fe 48 83 c8 02 48 89 03 48 8b 5d f0 4c 8b 65
 RIP  [<ffffffff812de48f>] sg_init_table+0x2f/0x50
  RSP <ffff88007bf25dd8>
 CR2: ffff882071c8ef28
 ---[ end trace 86323505eb42ea8f ]---

It seems to induce pagefault in sg_init_tabel() when pipe->nrbufs is equal to
zero. This may happen in a following situation:

(1) The application normally does splice(read) from a kernel buffer, then does
    splice(write) to virtio-serial.
(2) The application receives SIGINT when is doing splice(read), so splice(read)
    is failed by EINTR. However, the application does not finish the operation.
(3) The application tries to do splice(write) without pipe->nrbufs.
(4) The virtio-console driver tries to touch scatterlist structure sgl in
    sg_init_table(), but the region is out of bound.

To avoid the case, a kernel should check whether pipe->nrbufs is empty or not
when splice_write is executed in the virtio-console driver.

V3: Add Reviewed-by lines and stable@ line in sign-off area.

Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoSUNRPC: If the rpcbind channel is disconnected, fail the call to unregister
Trond Myklebust [Mon, 5 Aug 2013 20:04:47 +0000]
SUNRPC: If the rpcbind channel is disconnected, fail the call to unregister

commit 786615bc1ce84150ded80daea6bd9f6297f48e73 upstream.

If rpcbind causes our connection to the AF_LOCAL socket to close after
we've registered a service, then we want to be careful about reconnecting
since the mount namespace may have changed.

By simply refusing to reconnect the AF_LOCAL socket in the case of
unregister, we avoid the need to somehow save the mount namespace. While
this may lead to some services not unregistering properly, it should
be safe.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Nix <nix@esperi.org.uk>
Cc: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoSUNRPC: Don't auto-disconnect from the local rpcbind socket
Trond Myklebust [Mon, 5 Aug 2013 18:10:43 +0000]
SUNRPC: Don't auto-disconnect from the local rpcbind socket

commit 00326ed6442c66021cd4b5e19e80f3e2027d5d42 upstream.

There is no need for the kernel to time out the AF_LOCAL connection to
the rpcbind socket, and doing so is problematic because when it is
time to reconnect, our process may no longer be using the same mount

Reported-by: Nix <nix@esperi.org.uk>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoLOCKD: Don't call utsname()->nodename from nlmclnt_setlockargs
Trond Myklebust [Mon, 5 Aug 2013 16:06:12 +0000]
LOCKD: Don't call utsname()->nodename from nlmclnt_setlockargs

commit 9a1b6bf818e74bb7aabaecb59492b739f2f4d742 upstream.

Firstly, nlmclnt_setlockargs can be called from a reclaimer thread, in
which case we're in entirely the wrong namespace.

Secondly, commit 8aac62706adaaf0fab02c4327761561c8bda9448 (move
exit_task_namespaces() outside of exit_notify()) now means that
exit_task_work() is called after exit_task_namespaces(), which
triggers an Oops when we're freeing up the locks.

Fix this by ensuring that we initialise the nlm_host's rpc_client at mount
time, so that the cl_nodename field is initialised to the value of
utsname()->nodename that the net namespace uses. Then replace the
lockd callers of utsname()->nodename.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Toralf Förster <toralf.foerster@gmx.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Nix <nix@esperi.org.uk>
Cc: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoBtrfs: release both paths before logging dir/changed extents
Josef Bacik [Mon, 22 Jul 2013 16:54:30 +0000]
Btrfs: release both paths before logging dir/changed extents

commit f3b15ccdbb9a79781578249a63318805e55a6c34 upstream.

The ceph guys tripped over this bug where we were still holding onto the
original path that we used to copy the inode with when logging.  This is based
on Chris's fix which was reported to fix the problem.  We need to drop the paths
in two cases anyway so just move the drop up so that we don't have duplicate
code.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoALSA: 6fire: fix DMA issues with URB transfer_buffer usage
Jussi Kivilinna [Tue, 6 Aug 2013 11:53:24 +0000]
ALSA: 6fire: fix DMA issues with URB transfer_buffer usage

commit ddb6b5a964371e8e52e696b2b258bda144c8bd3f upstream.

Patch fixes 6fire not to use stack as URB transfer_buffer. URB buffers need to
be DMA-able, which stack is not. Furthermore, transfer_buffer should not be
allocated as part of larger device structure because DMA coherency issues and
patch fixes this issue too.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Tested-by: Torsten Schenk <torsten.schenk@zoho.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoALSA: usb-audio: do not trust too-big wMaxPacketSize values
Clemens Ladisch [Thu, 8 Aug 2013 09:24:55 +0000]
ALSA: usb-audio: do not trust too-big wMaxPacketSize values

commit 57e6dae1087bbaa6b33d3dd8a8e90b63888939a3 upstream.

The driver used to assume that the streaming endpoint's wMaxPacketSize
value would be an indication of how much data the endpoint expects or
sends, and compute the number of packets per URB using this value.

However, the Focusrite Scarlett 2i4 declares a value of 1024 bytes,
while only about 88 or 44 bytes are be actually used.  This discrepancy
would result in URBs with far too few packets, which would not work
correctly on the EHCI driver.

To get correct URBs, use wMaxPacketSize only as an upper limit on the
packet size.

Reported-by: James Stone <jamesmstone@gmail.com>
Tested-by: James Stone <jamesmstone@gmail.com>
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agotracing: Fix reset of time stamps during trace_clock changes
Alexander Z Lam [Sat, 3 Aug 2013 01:36:16 +0000]
tracing: Fix reset of time stamps during trace_clock changes

commit 9457158bbc0ee04ecef76862d73eecd8076e9c7b upstream.

Fixed two issues with changing the timestamp clock with trace_clock:

 - The global buffer was reset on instance clock changes. Change this to pass
   the correct per-instance buffer
 - ftrace_now() is used to set buf->time_start in tracing_reset_online_cpus().
   This was incorrect because ftrace_now() used the global buffer's clock to
   return the current time. Change this to use buffer_ftrace_now() which
   returns the current time for the correct per-instance buffer.

Also removed tracing_reset_current() because it is not used anywhere

Link: http://lkml.kernel.org/r/1375493777-17261-2-git-send-email-azl@google.com

Signed-off-by: Alexander Z Lam <azl@google.com>
Cc: Vaibhav Nagarnaik <vnagarnaik@google.com>
Cc: David Sharp <dhsharp@google.com>
Cc: Alexander Z Lam <lambchop468@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agotracing: Use flag buffer_disabled for irqsoff tracer
Steven Rostedt (Red Hat) [Mon, 1 Jul 2013 19:58:24 +0000]
tracing: Use flag buffer_disabled for irqsoff tracer

commit 10246fa35d4ffdfe472185d4cbf9c2dfd9a9f023 upstream.

If the ring buffer is disabled and the irqsoff tracer records a trace it
will clear out its buffer and lose the data it had previously recorded.

Currently there's a callback when writing to the tracing_of file, but if
tracing is disabled via the function tracer trigger, it will not inform
the irqsoff tracer to stop recording.

By using the "mirror" flag (buffer_disabled) in the trace_array, that keeps
track of the status of the trace_array's buffer, it gives the irqsoff
tracer a fast way to know if it should record a new trace or not.
The flag may be a little behind the real state of the buffer, but it
should not affect the trace too much. It's more important for the irqsoff
tracer to be fast.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agotracing: Make TRACE_ITER_STOP_ON_FREE stop the correct buffer
Alexander Z Lam [Sat, 3 Aug 2013 01:36:15 +0000]
tracing: Make TRACE_ITER_STOP_ON_FREE stop the correct buffer

commit 711e124379e0f889e40e2f01d7f5d61936d3cd23 upstream.

Releasing the free_buffer file in an instance causes the global buffer
to be stopped when TRACE_ITER_STOP_ON_FREE is enabled. Operate on the
correct buffer.

Link: http://lkml.kernel.org/r/1375493777-17261-1-git-send-email-azl@google.com

Signed-off-by: Alexander Z Lam <azl@google.com>
Cc: Vaibhav Nagarnaik <vnagarnaik@google.com>
Cc: David Sharp <dhsharp@google.com>
Cc: Alexander Z Lam <lambchop468@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agotracing: Fix fields of struct trace_iterator that are zeroed by mistake
Andrew Vagin [Fri, 2 Aug 2013 17:16:43 +0000]
tracing: Fix fields of struct trace_iterator that are zeroed by mistake

commit ed5467da0e369e65b247b99eb6403cb79172bcda upstream.

tracing_read_pipe zeros all fields bellow "seq". The declaration contains
a comment about that, but it doesn't help.

The first field is "snapshot", it's true when current open file is
snapshot. Looks obvious, that it should not be zeroed.

The second field is "started". It was converted from cpumask_t to
cpumask_var_t (v2.6.28-4983-g4462344), in other words it was
converted from cpumask to pointer on cpumask.

Currently the reference on "started" memory is lost after the first read
from tracing_read_pipe and a proper object will never be freed.

The "started" is never dereferenced for trace_pipe, because trace_pipe
can't have the TRACE_FILE_ANNOTATE options.

Link: http://lkml.kernel.org/r/1375463803-3085183-1-git-send-email-avagin@openvz.org

Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoACPI / PM: Walk physical_node_list under physical_node_lock
Rafael J. Wysocki [Tue, 6 Aug 2013 00:26:22 +0000]
ACPI / PM: Walk physical_node_list under physical_node_lock

commit 623cf33cb055b1e81fa47e4fc16789b2c129e31e upstream.

The list of physical devices corresponding to an ACPI device
object is walked by acpi_system_wakeup_device_seq_show() and
physical_device_enable_wakeup() without taking that object's
physical_node_lock mutex.  Since each of those functions may be
run at any time as a result of a user space action, the lack of
appropriate locking in them may lead to a kernel crash if that
happens during device hot-add or hot-remove involving the device
object in question.

Fix the issue by modifying acpi_system_wakeup_device_seq_show() and
physical_device_enable_wakeup() to use physical_node_lock as

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agocpufreq: rename ignore_nice as ignore_nice_load
Viresh Kumar [Mon, 5 Aug 2013 06:58:02 +0000]
cpufreq: rename ignore_nice as ignore_nice_load

commit 6c4640c3adfd97ce10efed7c07405f52d002b9a8 upstream.

This sysfs file was called ignore_nice_load earlier and commit
4d5dcc4 (cpufreq: governor: Implement per policy instances of
governors) changed its name to ignore_nice by mistake.

Lets get it renamed back to its original name.

Reported-by: Martin von Gagern <Martin.vGagern@gmx.net>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agocpufreq: loongson2: fix regression related to clock management
Aaro Koskinen [Mon, 5 Aug 2013 18:27:12 +0000]
cpufreq: loongson2: fix regression related to clock management

commit f54fe64d14dff3df6d45a48115d248a82557811f upstream.

Commit 42913c799 (MIPS: Loongson2: Use clk API instead of direct
dereferences) broke the cpufreq functionality on Loongson2 boards:
clk_set_rate() is called before the CPU frequency table is
initialized, and therefore will always fail.

Fix by moving the clk_set_rate() after the table initialization.
Tested on Lemote FuLoong mini-PC.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoi2c: i2c-mxs: Use DMA mode even for small transfers
Fabio Estevam [Mon, 1 Jul 2013 21:14:21 +0000]
i2c: i2c-mxs: Use DMA mode even for small transfers

commit d6e102f498cbcc8dd2e36721a01213f036397112 upstream.

Recently we have been seing some reports about PIO mode not working properly.

- http://www.spinics.net/lists/linux-i2c/msg11985.html
- http://marc.info/?l=linux-i2c&m=137235593101385&w=2
- https://lkml.org/lkml/2013/6/24/430

Let's use DMA mode even for small transfers.

Without this patch, i2c reads the incorrect sgtl5000 version on a mx28evk when
touchscreen is enabled:

[    5.856270] sgtl5000 0-000a: Device with ID register 0 is not a sgtl5000
[    9.877307] sgtl5000 0-000a: ASoC: failed to probe CODEC -19
[    9.883528] mxs-sgtl5000 sound.12: ASoC: failed to instantiate card -19
[    9.892955] mxs-sgtl5000 sound.12: snd_soc_register_card failed (-19)

[wsa: we have a proper solution for -next, so this non intrusive
solution is OK for now]

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Marek Vasut <marex@denx.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agomedia: em28xx: fix assignment of the eeprom data
Alban Browaeys [Tue, 16 Jul 2013 21:57:53 +0000]
media: em28xx: fix assignment of the eeprom data

commit f813b5775b471b656382ae8f087bb34dc894261f upstream.

Set the config structure pointer to the eeprom data pointer (data,
here eedata dereferenced) not the pointer to the pointer to
the eeprom data (eedata itself).

Signed-off-by: Alban Browaeys <prahal@yahoo.com>
Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agostaging: zcache: fix "zcache=" kernel parameter
Piotr Sarna [Mon, 29 Jul 2013 10:25:20 +0000]
staging: zcache: fix "zcache=" kernel parameter

commit 02073798a6b081bf74e6c10d6f7e7a693c067ecd upstream.

Commit 835f2f5 ("staging: zcache: enable zcache to be built/loaded as
a module") introduced an incorrect handling of "zcache=" parameter.

Inside zcache_comp_init() function, zcache_comp_name variable is
checked for being empty. If not empty, the above variable is tested
for being compatible with Crypto API. Unfortunately, after that
function ends unconditionally (by the "goto out" directive) and returns:
- non-zero value if verification succeeded, wrongly indicating an error
- zero value if verification failed, falsely informing that function
  zcache_comp_init() ended properly.

A solution to this problem is as following:
1. Move the "goto out" directive inside the "if (!ret)" statement
2. In case that crypto_has_comp() returned 0, change the value of ret
   to non-zero before "goto out" to indicate an error.

This patch replaces an earlier one from Michal Hocko (based on report
from Cristian Rodriguez):


It also addressed the same issue but didn't fix the zcache_comp_init()
for case when the compressor data passed to "zcache=" option was invalid
or unsupported.

Signed-off-by: Piotr Sarna <p.sarna@partner.samsung.com>
[bzolnier: updated patch description]
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Cristian Rodriguez <crrodriguez@opensuse.org>
Cc: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agohwmon: (adt7470) Fix incorrect return code check
Curt Brune [Thu, 8 Aug 2013 19:11:03 +0000]
hwmon: (adt7470) Fix incorrect return code check

commit 93d783bcca69bfacc8dc739d8a050498402587b5 upstream.

In adt7470_write_word_data(), which writes two bytes using
i2c_smbus_write_byte_data(), the return codes are incorrectly AND-ed
together when they should be OR-ed together.

The return code of i2c_smbus_write_byte_data() is zero for success.

The upshot is only the first byte was ever written to the hardware.
The 2nd byte was never written out.

I noticed that trying to set the fan speed limits was not working
correctly on my system.  Setting the fan speed limits is the only
code that uses adt7470_write_word_data().  After making the change
the limit settings work and the alarms work also.

Signed-off-by: Curt Brune <curt@cumulusnetworks.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoregmap: Add missing header for !CONFIG_REGMAP stubs
Mateusz Krawczuk [Tue, 6 Aug 2013 16:34:40 +0000]
regmap: Add missing header for !CONFIG_REGMAP stubs

commit 49ccc142f9cbc33fdda18e8fa90c1c5b4a79c0ad upstream.

regmap.h requires linux/err.h if CONFIG_REGMAP is not defined. Without it I get
CC drivers/media/platform/exynos4-is/fimc-reg.o
In file included from drivers/media/platform/exynos4-is/fimc-reg.c:14:0:
include/linux/regmap.h: In function ‘regmap_write’:
include/linux/regmap.h:525:10: error: ‘EINVAL’ undeclared (first use in this function)
include/linux/regmap.h:525:10: note: each undeclared identifier is reported only once for each function it appears in

Signed-off-by: Mateusz Krawczuk <m.krawczuk@partner.samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoregmap: cache: Make sure to sync the last register in a block
Lars-Peter Clausen [Mon, 5 Aug 2013 09:21:29 +0000]
regmap: cache: Make sure to sync the last register in a block

commit 2d49b5987561e480bdbd8692b27fc5f49a1e2f0b upstream.

regcache_sync_block_raw_flush() expects the address of the register after last
register that needs to be synced as its parameter. But the last call to
regcache_sync_block_raw_flush() in regcache_sync_block_raw() passes the address
of the last register in the block. This effectively always skips over the last
register in a block, even if it needs to be synced. In order to fix it increase
the address by one register.

The issue was introduced in commit 75a5f89 ("regmap: cache: Write consecutive
registers in a single block write").

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoext4: fix retry handling in ext4_ext_truncate()
Theodore Ts'o [Mon, 29 Jul 2013 16:12:56 +0000]
ext4: fix retry handling in ext4_ext_truncate()

commit 94eec0fc3520c759831763d866421b4d60b599b4 upstream.

We tested for ENOMEM instead of -ENOMEM.   Oops.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoext4: make sure group number is bumped after a inode allocation race
Theodore Ts'o [Fri, 26 Jul 2013 19:15:46 +0000]
ext4: make sure group number is bumped after a inode allocation race

commit a34eb503742fd25155fd6cff6163daacead9fbc3 upstream.

When we try to allocate an inode, and there is a race between two
CPU's trying to grab the same inode, _and_ this inode is the last free
inode in the block group, make sure the group number is bumped before
we continue searching the rest of the block groups.  Otherwise, we end
up searching the current block group twice, and we end up skipping
searching the last block group.  So in the unlikely situation where
almost all of the inodes are allocated, it's possible that we will
return ENOSPC even though there might be free inodes in that last
block group.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoext4: destroy ext4_es_cachep on module unload
Eric Sandeen [Fri, 26 Jul 2013 19:21:11 +0000]
ext4: destroy ext4_es_cachep on module unload

commit dd12ed144e9797094c04736f97aa27d5fe401476 upstream.

Without this, module can't be reloaded.

[  500.521980] kmem_cache_sanity_check (ext4_extent_status): Cache name already exists.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agopowerpc/tm: Fix context switching TAR, PPR and DSCR SPRs
Michael Neuling [Fri, 9 Aug 2013 07:29:31 +0000]
powerpc/tm: Fix context switching TAR, PPR and DSCR SPRs

commit 28e61cc466d8daace4b0f04ba2b83e0bd68f5832 upstream.

If a transaction is rolled back, the Target Address Register (TAR), Processor
Priority Register (PPR) and Data Stream Control Register (DSCR) should be
restored to the checkpointed values before the transaction began.  Any changes
to these SPRs inside the transaction should not be visible in the abort

Currently Linux doesn't save or restore the checkpointed TAR, PPR or DSCR.  If
we preempt a processes inside a transaction which has modified any of these, on
process restore, that same transaction may be aborted we but we won't see the
checkpointed versions of these SPRs.

This adds checkpointed versions of these SPRs to the thread_struct and adds the
save/restore of these three SPRs to the treclaim/trechkpt code.

Without this if any of these SPRs are modified during a transaction, users may
incorrectly see a speculated SPR value even if the transaction is aborted.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agopowerpc: Save the TAR register earlier
Michael Neuling [Fri, 9 Aug 2013 07:29:30 +0000]
powerpc: Save the TAR register earlier

commit c2d52644e2da8a07ecab5ca62dd0bc563089e8dc upstream.

This moves us to save the Target Address Register (TAR) a earlier in
__switch_to.  It introduces a new function save_tar() to do this.

We need to save the TAR earlier as we will overwrite it in the transactional
memory reclaim/recheckpoint path.  We are going to do this in a subsequent
patch which will fix saving the TAR register when it's modified inside a

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agopowerpc: Fix context switch DSCR on POWER8
Michael Neuling [Fri, 9 Aug 2013 07:29:29 +0000]
powerpc: Fix context switch DSCR on POWER8

commit 2517617e0de65f8f7cfe75cae745d06b1fa98586 upstream.

POWER8 allows the DSCR to be accessed directly from userspace via a new SPR
number 0x3 (Rather than 0x11.  DSCR SPR number 0x11 is still used on POWER8 but
like POWER7, is only accessible in HV and OS modes).  Currently, we allow this
by setting H/FSCR DSCR bit on boot.

Unfortunately this doesn't work, as the kernel needs to see the DSCR change so
that it knows to no longer restore the system wide version of DSCR on context
switch (ie. to set thread.dscr_inherit).

This clears the H/FSCR DSCR bit initially.  If a process then accesses the DSCR
(via SPR 0x3), it'll trap into the kernel where we set thread.dscr_inherit in

We also change _switch() so that we set or clear the H/FSCR DSCR bit based on
the thread.dscr_inherit.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agopowerpc: Rework setting up H/FSCR bit definitions
Michael Neuling [Fri, 9 Aug 2013 07:29:28 +0000]
powerpc: Rework setting up H/FSCR bit definitions

commit 74e400cee6c0266ba2d940ed78d981f1e24a8167 upstream.

This reworks the Facility Status and Control Regsiter (FSCR) config bit
definitions so that we can access the bit numbers.  This is needed for a
subsequent patch to fix the userspace DSCR handling.

HFSCR and FSCR bit definitions are the same, so reuse them.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agopowerpc: Fix hypervisor facility unavaliable vector number
Michael Neuling [Fri, 9 Aug 2013 07:29:27 +0000]
powerpc: Fix hypervisor facility unavaliable vector number

commit 88f094120bd2f012ff494ae50a8d4e0d8af8f69e upstream.

Currently if we take hypervisor facility unavaliable (from 0xf80/0x4f80) we
mark it as an OS facility unavaliable (0xf60) as the two share the same code

The becomes a problem in facility_unavailable_exception() as we aren't able to
see the hypervisor facility unavailable exceptions.

Below fixes this by duplication the required macros.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agopowerpc: On POWERNV enable PPC_DENORMALISATION by default
Anton Blanchard [Wed, 31 Jul 2013 06:31:26 +0000]
powerpc: On POWERNV enable PPC_DENORMALISATION by default

commit 4e90a2a7375e86827541bda9393414c03e7721c6 upstream.

We want PPC_DENORMALISATION enabled when POWERNV is enabled,
so update the Kconfig.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agovirtio-scsi: Fix virtqueue affinity setup
Asias He [Thu, 1 Aug 2013 01:37:18 +0000]
virtio-scsi: Fix virtqueue affinity setup

commit aa52aeea2725839bdd3dcce394486e9a043065e0 upstream.

vscsi->num_queues counts the number of request virtqueue which does not
include the control and event virtqueue. It is wrong to subtract
VIRTIO_SCSI_VQ_BASE from vscsi->num_queues.

This patch fixes the following panic.

(qemu) device_del scsi0

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
 IP: [<ffffffff8179b29f>] __virtscsi_set_affinity+0x6f/0x120
 PGD 0
 Oops: 0000 [#1] SMP
 Modules linked in:
 CPU: 0 PID: 659 Comm: kworker/0:1 Not tainted 3.11.0-rc2+ #1172
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 Workqueue: kacpi_hotplug _handle_hotplug_event_func
 task: ffff88007bee1cc0 ti: ffff88007bfe4000 task.ti: ffff88007bfe4000
 RIP: 0010:[<ffffffff8179b29f>]  [<ffffffff8179b29f>] __virtscsi_set_affinity+0x6f/0x120
 RSP: 0018:ffff88007bfe5a38  EFLAGS: 00010202
 RAX: 0000000000000010 RBX: ffff880077fd0d28 RCX: 0000000000000050
 RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000000
 RBP: ffff88007bfe5a58 R08: ffff880077f6ff00 R09: 0000000000000001
 R10: ffffffff8143e673 R11: 0000000000000001 R12: 0000000000000001
 R13: ffff880077fd0800 R14: 0000000000000000 R15: ffff88007bf489b0
 FS:  0000000000000000(0000) GS:ffff88007ea00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000020 CR3: 0000000079f8b000 CR4: 00000000000006f0
  ffff880077fd0d28 0000000000000000 ffff880077fd0800 0000000000000008
  ffff88007bfe5a78 ffffffff8179b37d ffff88007bccc800 ffff88007bccc800
  ffff88007bfe5a98 ffffffff8179b3b6 ffff88007bccc800 ffff880077fd0d28
 Call Trace:
  [<ffffffff8179b37d>] virtscsi_set_affinity+0x2d/0x40
  [<ffffffff8179b3b6>] virtscsi_remove_vqs+0x26/0x50
  [<ffffffff8179c7d2>] virtscsi_remove+0x82/0xa0
  [<ffffffff814cb6b2>] virtio_dev_remove+0x22/0x70
  [<ffffffff8167ca49>] __device_release_driver+0x69/0xd0
  [<ffffffff8167cb9d>] device_release_driver+0x2d/0x40
  [<ffffffff8167bb96>] bus_remove_device+0x116/0x150
  [<ffffffff81679936>] device_del+0x126/0x1e0
  [<ffffffff81679a06>] device_unregister+0x16/0x30
  [<ffffffff814cb889>] unregister_virtio_device+0x19/0x30
  [<ffffffff814cdad6>] virtio_pci_remove+0x36/0x80
  [<ffffffff81464ae7>] pci_device_remove+0x37/0x70
  [<ffffffff8167ca49>] __device_release_driver+0x69/0xd0
  [<ffffffff8167cb9d>] device_release_driver+0x2d/0x40
  [<ffffffff8167bb96>] bus_remove_device+0x116/0x150
  [<ffffffff81679936>] device_del+0x126/0x1e0
  [<ffffffff8145edfc>] pci_stop_bus_device+0x9c/0xb0
  [<ffffffff8145f036>] pci_stop_and_remove_bus_device+0x16/0x30
  [<ffffffff81474a9e>] acpiphp_disable_slot+0x8e/0x150
  [<ffffffff81474f6a>] hotplug_event_func+0xba/0x1a0
  [<ffffffff814906c8>] ? acpi_os_release_object+0xe/0x12
  [<ffffffff81475911>] _handle_hotplug_event_func+0x31/0x70
  [<ffffffff810b5333>] process_one_work+0x183/0x500
  [<ffffffff810b66e2>] worker_thread+0x122/0x400
  [<ffffffff810b65c0>] ? manage_workers+0x2d0/0x2d0
  [<ffffffff810bc5de>] kthread+0xce/0xe0
  [<ffffffff810bc510>] ? kthread_freezable_should_stop+0x70/0x70
  [<ffffffff81ca045c>] ret_from_fork+0x7c/0xb0
  [<ffffffff810bc510>] ? kthread_freezable_should_stop+0x70/0x70
 Code: 01 00 00 00 74 59 45 31 e4 83 bb c8 01 00 00 02 74 46 66 2e 0f 1f 84 00 00 00 00 00 49 63 c4 48 c1 e0 04 48 8b bc 0
3 10 02 00 00 <48> 8b 47 20 48 8b 80 d0 01 00 00 48 8b 40 50 48 85 c0 74 07 be
 RIP  [<ffffffff8179b29f>] __virtscsi_set_affinity+0x6f/0x120
  RSP <ffff88007bfe5a38>
 CR2: 0000000000000020
 ---[ end trace 99679331a3775f48 ]---

Signed-off-by: Asias He <asias@redhat.com>
Reviewed-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoSCSI: megaraid_sas: megaraid_sas driver init fails in kdump kernel
Sumit.Saxena@lsi.com [Mon, 15 Jul 2013 20:56:05 +0000]
SCSI: megaraid_sas: megaraid_sas driver init fails in kdump kernel

commit 6431f5d7c6025f8b007af06ea090de308f7e6881 upstream.

Problem: When Hardware IOMMU is on, megaraid_sas driver initialization fails
in kdump kernel with LSI MegaRAID controller(device id-0x73).

Actually this issue needs fix in firmware, but for firmware running in field,
this driver fix is proposed to resolve the issue.  At firmware initialization
time, if firmware does not come to ready state, driver will reset the adapter
and retry for firmware transition to ready state unconditionally(not only
executed for kdump kernel).

Signed-off-by: Sumit Saxena <sumit.saxena@lsi.com>
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoSCSI: Don't attempt to send extended INQUIRY command if skip_vpd_pages is set
Martin K. Petersen [Wed, 31 Jul 2013 02:58:34 +0000]
SCSI: Don't attempt to send extended INQUIRY command if skip_vpd_pages is set

commit 7562523e84ddc742fe1f9db8bd76b01acca89f6b upstream.

If a device has the skip_vpd_pages flag set we should simply fail the
scsi_get_vpd_page() call.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Stuart Foster <smf.linux@ntlworld.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoLinux 3.10.6
Greg Kroah-Hartman [Mon, 12 Aug 2013 01:49:02 +0000]
Linux 3.10.6

6 years agohwrng: bcm2835: fix MODULE_LICENSE tag
Arnd Bergmann [Tue, 23 Apr 2013 13:39:50 +0000]
hwrng: bcm2835: fix MODULE_LICENSE tag

commit 22e8099f4f6621b8d165e238cdef2a1cf655e159 upstream.

The MODULE_LICENSE macro invocation must use either "GPL" or "GPL v2",
but not "GPLv2" in order to be detected by the module loader.

This fixes the allmodconfig build error:

FATAL: modpost: GPL-incompatible module bcm2835-rng.ko uses GPL-only symbol 'platform_driver_unregister'

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Lubomir Rintel <lkundrak@v3.sk>
Cc: Dom Cobley <popcornmix@gmail.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: linux-rpi-kernel@lists.infradead.org
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoiwlwifi: dvm: don't send BT_CONFIG on devices w/o Bluetooth
Johannes Berg [Fri, 3 May 2013 16:58:16 +0000]
iwlwifi: dvm: don't send BT_CONFIG on devices w/o Bluetooth

commit 707aee401d2467baa785a697f40a6e2d9ee79ad5 upstream.

The BT_CONFIG command that is sent to the device during
startup will enable BT coex unless the module parameter
turns it off, but on devices without Bluetooth this may
cause problems, as reported in Redhat BZ 885407.

Fix this by sending the BT_CONFIG command only when the
device has Bluetooth.

Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoiwlwifi: mvm: set SSID bits for passive channels
David Spinadel [Tue, 23 Jul 2013 11:13:32 +0000]
iwlwifi: mvm: set SSID bits for passive channels

commit bb963c4a43eb5127eb0bbfa16c7a6a209b0af5db upstream.

Set SSID bitmap for direct scan even on passive channels,
for the passive-to-active feature. Without this patch only
the SSID from probe request template is sent on passive
channels, after passive-to-active switching, causing us to
not find all desired networks.

Remove the unused passive scan mask constant.

Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: David Spinadel <david.spinadel@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agonet/mlx4_core: VFs must ignore the enable_64b_cqe_eqe module param
Jack Morgenstein [Thu, 1 Aug 2013 16:55:01 +0000]
net/mlx4_core: VFs must ignore the enable_64b_cqe_eqe module param

[ Upstream commit b30513202c6c14120f70b2e9aa1e97d47bbc2313 ]

Slaves get the 64B CQE/EQE state from QUERY_HCA, not from the module parameter.

If the parameter is set to zero, the slave outputs an incorrect/irrelevant
warning message that 64B CQEs/EQEs are supported but not enabled (even if the
hypervisor has enabled 64B CQEs/EQEs).

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agonet/mlx4_core: Don't give VFs MAC addresses which are derived from the PF MAC
Or Gerlitz [Thu, 1 Aug 2013 16:55:00 +0000]
net/mlx4_core: Don't give VFs MAC addresses which are derived from the PF MAC

[ Upstream commit 0508ad646836007e6e6b62331eee7356844eac3d ]

If the user has not assigned a MAC address to a VM, then don't give it MAC which
is based on the PF one. The current derivation scheme is wrong and leads to VM
MAC collisions when the number of cards/hypervisors becomes big enough.

Instead, just give it zeros and let them figure out what to do with that.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years ago8139cp: Add dma_mapping_error checking
Neil Horman [Wed, 31 Jul 2013 13:03:56 +0000]
8139cp: Add dma_mapping_error checking

[ Upstream commit cf3c4c03060b688cbc389ebc5065ebcce5653e96 ]

Self explanitory dma_mapping_error addition to the 8139 driver, based on this:

It showed several backtraces arising for dma_map_* usage without checking the
return code on the mapping.  Add the check and abort the rx/tx operation if its
failed.  Untested as I have no hardware and the reporter has wandered off, but
seems pretty straightforward.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agondisc: Add missing inline to ndisc_addr_option_pad
Joe Perches [Tue, 30 Jul 2013 17:31:00 +0000]
ndisc: Add missing inline to ndisc_addr_option_pad

[ Upstream commit d9d10a30964504af834d8d250a0c76d4ae91eb1e ]

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agonet_sched: info leak in atm_tc_dump_class()
Dan Carpenter [Tue, 30 Jul 2013 10:23:39 +0000]
net_sched: info leak in atm_tc_dump_class()

[ Upstream commit 8cb3b9c3642c0263d48f31d525bcee7170eedc20 ]

The "pvc" struct has a hole after pvc.sap_family which is not cleared.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoatl1c: use custom skb allocator
Eric Dumazet [Mon, 29 Jul 2013 17:24:04 +0000]
atl1c: use custom skb allocator

[ Upstream commit 7b70176421993866e616f1cbc4d0dd4054f1bf78 ]

We had reports ( https://bugzilla.kernel.org/show_bug.cgi?id=54021 )
that using high order pages for skb allocations is problematic for atl1c

We do not know exactly what the problem is, but we suspect that crossing
4K pages is not well supported by this hardware.

Use a custom allocator, using page allocator and 2K fragments for
optimal stack behavior. We might make this allocator generic
in future kernels.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Luis Henriques <luis.henriques@canonical.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoaf_key: more info leaks in pfkey messages
Dan Carpenter [Sun, 28 Jul 2013 20:04:45 +0000]
af_key: more info leaks in pfkey messages

[ Upstream commit ff862a4668dd6dba962b1d2d8bd344afa6375683 ]

This is inspired by a5cc68f3d6 "af_key: fix info leaks in notify
messages".  There are some struct members which don't get initialized
and could disclose small amounts of private information.

Acked-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agonet_sched: Fix stack info leak in cbq_dump_wrr().
David S. Miller [Tue, 30 Jul 2013 07:16:21 +0000]
net_sched: Fix stack info leak in cbq_dump_wrr().

[ Upstream commit a0db856a95a29efb1c23db55c02d9f0ff4f0db48 ]

Make sure the reserved fields, and padding (if any), are
fully initialized.

Based upon a patch by Dan Carpenter and feedback from
Joe Perches.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agogenetlink: release cb_lock before requesting additional module
Stanislaw Gruszka [Fri, 26 Jul 2013 09:00:10 +0000]
genetlink: release cb_lock before requesting additional module

[ Upstream commit c74f2b2678f40b80265dd53556f1f778c8e1823f ]

Requesting external module with cb_lock taken can result in
the deadlock like showed below:

[ 2458.111347] Showing all locks held in the system:
[ 2458.111347] 1 lock held by NetworkManager/582:
[ 2458.111347]  #0:  (cb_lock){++++++}, at: [<ffffffff8162bc79>] genl_rcv+0x19/0x40
[ 2458.111347] 1 lock held by modprobe/603:
[ 2458.111347]  #0:  (cb_lock){++++++}, at: [<ffffffff8162baa5>] genl_lock_all+0x15/0x30

[ 2461.579457] SysRq : Show Blocked State
[ 2461.580103]   task                        PC stack   pid father
[ 2461.580103] NetworkManager  D ffff880034b84500  4040   582      1 0x00000080
[ 2461.580103]  ffff8800197ff720 0000000000000046 00000000001d5340 ffff8800197fffd8
[ 2461.580103]  ffff8800197fffd8 00000000001d5340 ffff880019631700 7fffffffffffffff
[ 2461.580103]  ffff8800197ff880 ffff8800197ff878 ffff880019631700 ffff880019631700
[ 2461.580103] Call Trace:
[ 2461.580103]  [<ffffffff817355f9>] schedule+0x29/0x70
[ 2461.580103]  [<ffffffff81731ad1>] schedule_timeout+0x1c1/0x360
[ 2461.580103]  [<ffffffff810e69eb>] ? mark_held_locks+0xbb/0x140
[ 2461.580103]  [<ffffffff817377ac>] ? _raw_spin_unlock_irq+0x2c/0x50
[ 2461.580103]  [<ffffffff810e6b6d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[ 2461.580103]  [<ffffffff81736398>] wait_for_completion_killable+0xe8/0x170
[ 2461.580103]  [<ffffffff810b7fa0>] ? wake_up_state+0x20/0x20
[ 2461.580103]  [<ffffffff81095825>] call_usermodehelper_exec+0x1a5/0x210
[ 2461.580103]  [<ffffffff817362ed>] ? wait_for_completion_killable+0x3d/0x170
[ 2461.580103]  [<ffffffff81095cc3>] __request_module+0x1b3/0x370
[ 2461.580103]  [<ffffffff810e6b6d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[ 2461.580103]  [<ffffffff8162c5c9>] ctrl_getfamily+0x159/0x190
[ 2461.580103]  [<ffffffff8162d8a4>] genl_family_rcv_msg+0x1f4/0x2e0
[ 2461.580103]  [<ffffffff8162d990>] ? genl_family_rcv_msg+0x2e0/0x2e0
[ 2461.580103]  [<ffffffff8162da1e>] genl_rcv_msg+0x8e/0xd0
[ 2461.580103]  [<ffffffff8162b729>] netlink_rcv_skb+0xa9/0xc0
[ 2461.580103]  [<ffffffff8162bc88>] genl_rcv+0x28/0x40
[ 2461.580103]  [<ffffffff8162ad6d>] netlink_unicast+0xdd/0x190
[ 2461.580103]  [<ffffffff8162b149>] netlink_sendmsg+0x329/0x750
[ 2461.580103]  [<ffffffff815db849>] sock_sendmsg+0x99/0xd0
[ 2461.580103]  [<ffffffff810bb58f>] ? local_clock+0x5f/0x70
[ 2461.580103]  [<ffffffff810e96e8>] ? lock_release_non_nested+0x308/0x350
[ 2461.580103]  [<ffffffff815dbc6e>] ___sys_sendmsg+0x39e/0x3b0
[ 2461.580103]  [<ffffffff810565af>] ? kvm_clock_read+0x2f/0x50
[ 2461.580103]  [<ffffffff810218b9>] ? sched_clock+0x9/0x10
[ 2461.580103]  [<ffffffff810bb2bd>] ? sched_clock_local+0x1d/0x80
[ 2461.580103]  [<ffffffff810bb448>] ? sched_clock_cpu+0xa8/0x100
[ 2461.580103]  [<ffffffff810e33ad>] ? trace_hardirqs_off+0xd/0x10
[ 2461.580103]  [<ffffffff810bb58f>] ? local_clock+0x5f/0x70
[ 2461.580103]  [<ffffffff810e3f7f>] ? lock_release_holdtime.part.28+0xf/0x1a0
[ 2461.580103]  [<ffffffff8120fec9>] ? fget_light+0xf9/0x510
[ 2461.580103]  [<ffffffff8120fe0c>] ? fget_light+0x3c/0x510
[ 2461.580103]  [<ffffffff815dd1d2>] __sys_sendmsg+0x42/0x80
[ 2461.580103]  [<ffffffff815dd222>] SyS_sendmsg+0x12/0x20
[ 2461.580103]  [<ffffffff81741ad9>] system_call_fastpath+0x16/0x1b
[ 2461.580103] modprobe        D ffff88000f2c8000  4632   603    602 0x00000080
[ 2461.580103]  ffff88000f04fba8 0000000000000046 00000000001d5340 ffff88000f04ffd8
[ 2461.580103]  ffff88000f04ffd8 00000000001d5340 ffff8800377d4500 ffff8800377d4500
[ 2461.580103]  ffffffff81d0b260 ffffffff81d0b268 ffffffff00000000 ffffffff81d0b2b0
[ 2461.580103] Call Trace:
[ 2461.580103]  [<ffffffff817355f9>] schedule+0x29/0x70
[ 2461.580103]  [<ffffffff81736d4d>] rwsem_down_write_failed+0xed/0x1a0
[ 2461.580103]  [<ffffffff810bb200>] ? update_cpu_load_active+0x10/0xb0
[ 2461.580103]  [<ffffffff8137b473>] call_rwsem_down_write_failed+0x13/0x20
[ 2461.580103]  [<ffffffff8173492d>] ? down_write+0x9d/0xb2
[ 2461.580103]  [<ffffffff8162baa5>] ? genl_lock_all+0x15/0x30
[ 2461.580103]  [<ffffffff8162baa5>] genl_lock_all+0x15/0x30
[ 2461.580103]  [<ffffffff8162cbb3>] genl_register_family+0x53/0x1f0
[ 2461.580103]  [<ffffffffa01dc000>] ? 0xffffffffa01dbfff
[ 2461.580103]  [<ffffffff8162d650>] genl_register_family_with_ops+0x20/0x80
[ 2461.580103]  [<ffffffffa01dc000>] ? 0xffffffffa01dbfff
[ 2461.580103]  [<ffffffffa017fe84>] nl80211_init+0x24/0xf0 [cfg80211]
[ 2461.580103]  [<ffffffffa01dc000>] ? 0xffffffffa01dbfff
[ 2461.580103]  [<ffffffffa01dc043>] cfg80211_init+0x43/0xdb [cfg80211]
[ 2461.580103]  [<ffffffff810020fa>] do_one_initcall+0xfa/0x1b0
[ 2461.580103]  [<ffffffff8105cb93>] ? set_memory_nx+0x43/0x50
[ 2461.580103]  [<ffffffff810f75af>] load_module+0x1c6f/0x27f0
[ 2461.580103]  [<ffffffff810f2c90>] ? store_uevent+0x40/0x40
[ 2461.580103]  [<ffffffff810f82c6>] SyS_finit_module+0x86/0xb0
[ 2461.580103]  [<ffffffff81741ad9>] system_call_fastpath+0x16/0x1b
[ 2461.580103] Sched Debug Version: v0.10, 3.11.0-0.rc1.git4.1.fc20.x86_64 #1

Problem start to happen after adding net-pf-16-proto-16-family-nl80211
alias name to cfg80211 module by below commit (though that commit
itself is perfectly fine):

commit fb4e156886ce6e8309e912d8b370d192330d19d3
Author: Marcel Holtmann <marcel@holtmann.org>
Date:   Sun Apr 28 16:22:06 2013 -0700

    nl80211: Add generic netlink module alias for cfg80211/nl80211

Reported-and-tested-by: Jeff Layton <jlayton@redhat.com>
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Reviewed-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agousbnet: do not pretend to support SG/TSO
Eric Dumazet [Wed, 24 Jul 2013 00:15:54 +0000]
usbnet: do not pretend to support SG/TSO

[ Upstream commit 20f0170377264e8449b6987041f0bcc4d746d3ed ]

usbnet doesn't support yet SG, so drivers should not advertise SG or TSO
capabilities, as they allow TCP stack to build large TSO packets that
need to be linearized and might use order-5 pages.

This adds an extra copy overhead and possible allocation failures.

Current code ignore skb_linearize() return code so crashes are even

Best is to not pretend SG/TSO is supported, and add this again when/if
usbnet really supports SG for devices who could get a performance gain.

Based on a prior patch from Freddy Xin <freddy@asix.com.tw>

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoipv6: take rtnl_lock and mark mrt6 table as freed on namespace cleanup
Hannes Frederic Sowa [Mon, 22 Jul 2013 21:45:53 +0000]
ipv6: take rtnl_lock and mark mrt6 table as freed on namespace cleanup

[ Upstream commit 905a6f96a1b18e490a75f810d733ced93c39b0e5 ]

Otherwise we end up dereferencing the already freed net->ipv6.mrt pointer
which leads to a panic (from Srivatsa S. Bhat):

BUG: unable to handle kernel paging request at ffff882018552020
IP: [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6]
PGD 290a067 PUD 207ffe0067 PMD 207ff1d067 PTE 8000002018552060
Modules linked in: ebtable_nat ebtables nfs fscache nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables nfsd lockd nfs_acl exportfs auth_rpcgss autofs4 sunrpc 8021q garp bridge stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter
+ip6_tables ipv6 vfat fat vhost_net macvtap macvlan vhost tun kvm_intel kvm uinput iTCO_wdt iTCO_vendor_support cdc_ether usbnet mii microcode i2c_i801 i2c_core lpc_ich mfd_core shpchp ioatdma dca mlx4_core be2net wmi acpi_cpufreq mperf ext4 jbd2 mbcache dm_mirror dm_region_hash dm_log dm_mod
CPU: 0 PID: 7 Comm: kworker/u33:0 Not tainted 3.11.0-rc1-ea45e-a #4
Hardware name: IBM  -[8737R2A]-/00Y2738, BIOS -[B2E120RUS-1.20]- 11/30/2012
Workqueue: netns cleanup_net
task: ffff8810393641c0 ti: ffff881039366000 task.ti: ffff881039366000
RIP: 0010:[<ffffffffa0366b02>]  [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6]
RSP: 0018:ffff881039367bd8  EFLAGS: 00010286
RAX: ffff881039367fd8 RBX: ffff882018552000 RCX: dead000000200200
RDX: 0000000000000000 RSI: ffff881039367b68 RDI: ffff881039367b68
RBP: ffff881039367bf8 R08: ffff881039367b68 R09: 2222222222222222
R10: 2222222222222222 R11: 2222222222222222 R12: ffff882015a7a040
R13: ffff882014eb89c0 R14: ffff8820289e2800 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88103fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff882018552020 CR3: 0000000001c0b000 CR4: 00000000000407f0
 ffff881039367c18 ffff882014eb89c0 ffff882015e28c00 0000000000000000
 ffff881039367c18 ffffffffa034d9d1 ffff8820289e2800 ffff882014eb89c0
 ffff881039367c58 ffffffff815bdecb ffffffff815bddf2 ffff882014eb89c0
Call Trace:
 [<ffffffffa034d9d1>] rawv6_close+0x21/0x40 [ipv6]
 [<ffffffff815bdecb>] inet_release+0xfb/0x220
 [<ffffffff815bddf2>] ? inet_release+0x22/0x220
 [<ffffffffa032686f>] inet6_release+0x3f/0x50 [ipv6]
 [<ffffffff8151c1d9>] sock_release+0x29/0xa0
 [<ffffffff81525520>] sk_release_kernel+0x30/0x70
 [<ffffffffa034f14b>] icmpv6_sk_exit+0x3b/0x80 [ipv6]
 [<ffffffff8152fff9>] ops_exit_list+0x39/0x60
 [<ffffffff815306fb>] cleanup_net+0xfb/0x1a0
 [<ffffffff81075e3a>] process_one_work+0x1da/0x610
 [<ffffffff81075dc9>] ? process_one_work+0x169/0x610
 [<ffffffff81076390>] worker_thread+0x120/0x3a0
 [<ffffffff81076270>] ? process_one_work+0x610/0x610
 [<ffffffff8107da2e>] kthread+0xee/0x100
 [<ffffffff8107d940>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff8162a99c>] ret_from_fork+0x7c/0xb0
 [<ffffffff8107d940>] ? __init_kthread_worker+0x70/0x70
Code: 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 4c 8b 67 30 49 89 fd e8 db 3c 1e e1 49 8b 9c 24 90 08 00 00 48 85 db 74 06 <4c> 39 6b 20 74 20 bb f3 ff ff ff e8 8e 3c 1e e1 89 d8 4c 8b 65
RIP  [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6]
 RSP <ffff881039367bd8>
CR2: ffff882018552020

Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agosfc: Enable RX scatter for flows steered by RFS
Ben Hutchings [Mon, 22 Jul 2013 23:17:25 +0000]
sfc: Enable RX scatter for flows steered by RFS

[ Upstream commit 7aa0076c497d6f0d5d957b431d0d80e1e9780274 ]

Received packets are only scattered if this is enabled in both the
matching filter and the receiving queue.  This was not being done for
filters inserted for RFS, so any packet requiring more than a single
descriptor was dropped.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agosysctl net: Keep tcp_syn_retries inside the boundary
Michal Tesar [Fri, 19 Jul 2013 12:09:01 +0000]
sysctl net: Keep tcp_syn_retries inside the boundary

[ Upstream commit 651e92716aaae60fc41b9652f54cb6803896e0da ]

Limit the min/max value passed to the

Signed-off-by: Michal Tesar <mtesar@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoarcnet: cleanup sizeof parameter
Dan Carpenter [Fri, 19 Jul 2013 05:48:05 +0000]
arcnet: cleanup sizeof parameter

[ Upstream commit 087d273caf4f7d3f2159256f255f1f432bc84a5b ]

This patch doesn't change the compiled code because ARC_HDR_SIZE is 4
and sizeof(int) is 4, but the intent was to use the header size and not
the sizeof the header size.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agospi: spi-davinci: Fix direction in dma_map_single()
Christian Eggers [Mon, 29 Jul 2013 18:54:09 +0000]
spi: spi-davinci: Fix direction in dma_map_single()

commit 89c66ee890af18500fa4598db300cc07c267f900 upstream.

Commit 048177ce3b3962852fd34a7e04938959271c7e70 (spi: spi-davinci:
convert to DMA engine API) introduced a regression: dma_map_single()
is called with direction DMA_FROM_DEVICE for rx and for tx.

Signed-off-by: Christian Eggers <ceggers@gmx.de>
Acked-by: Matt Porter <mporter@ti.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agox86/iommu/vt-d: Expand interrupt remapping quirk to cover x58 chipset
Neil Horman [Wed, 17 Jul 2013 11:13:59 +0000]
x86/iommu/vt-d: Expand interrupt remapping quirk to cover x58 chipset

commit 803075dba31c17af110e1d9a915fe7262165b213 upstream.

Recently we added an early quirk to detect 5500/5520 chipsets
with early revisions that had problems with irq draining with
interrupt remapping enabled:

  commit 03bbcb2e7e292838bb0244f5a7816d194c911d62
  Author: Neil Horman <nhorman@tuxdriver.com>
  Date:   Tue Apr 16 16:38:32 2013 -0400

      iommu/vt-d: add quirk for broken interrupt remapping on 55XX chipsets

It turns out this same problem is present in the intel X58
chipset as well. See errata 69 here:


This patch extends the pci early quirk so that the chip
devices/revisions specified in the above update are also covered
in the same way:

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Donald Dutile <ddutile@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Malcolm Crossley <malcolm.crossley@citrix.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1374059639-8631-1-git-send-email-nhorman@tuxdriver.com
[ Small edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agouserns: limit the maximum depth of user_namespace->parent chain
Oleg Nesterov [Thu, 8 Aug 2013 16:55:32 +0000]
userns: limit the maximum depth of user_namespace->parent chain

commit 8742f229b635bf1c1c84a3dfe5e47c814c20b5c8 upstream.

Ensure that user_namespace->parent chain can't grow too much.
Currently we use the hardroded 32 as limit.

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agouserns: unshare_userns(&cred) should not populate cred on failure
Oleg Nesterov [Tue, 6 Aug 2013 17:38:55 +0000]
userns: unshare_userns(&cred) should not populate cred on failure

commit 6160968cee8b90a5dd95318d716e31d7775c4ef3 upstream.

unshare_userns(new_cred) does *new_cred = prepare_creds() before
create_user_ns() which can fail. However, the caller expects that
it doesn't need to take care of new_cred if unshare_userns() fails.

We could change the single caller, sys_unshare(), but I think it
would be more clean to avoid the side effects on failure, so with
this patch unshare_userns() does put_cred() itself and initializes
*new_cred only if create_user_ns() succeeeds.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoworkqueue: copy workqueue_attrs with all fields
Shaohua Li [Thu, 1 Aug 2013 01:56:36 +0000]
workqueue: copy workqueue_attrs with all fields

commit 2865a8fb44cc32420407362cbda80c10fa09c6b2 upstream.

 $echo '0' > /sys/bus/workqueue/devices/xxx/numa
 $cat /sys/bus/workqueue/devices/xxx/numa

I got 1. It should be 0, the reason is copy_workqueue_attrs() called
in apply_workqueue_attrs() doesn't copy no_numa field.

Fix it by making copy_workqueue_attrs() copy ->no_numa too.  This
would also make get_unbound_pool() set a pool's ->no_numa attribute
according to the workqueue attributes used when the pool was created.
While harmelss, as ->no_numa isn't a pool attribute, this is a bit
confusing.  Clear it explicitly.

tj: Updated description and comments a bit.

Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agos390/bitops: fix find_next_bit_left
Martin Schwidefsky [Thu, 25 Jul 2013 08:18:17 +0000]
s390/bitops: fix find_next_bit_left

commit 3b0040a47ad63f7147e9e7d2febb61a3b564bb90 upstream.

The find_next_bit_left function is broken if used with an offset which
is not a multiple of 64. The shift to mask the bits of a 64-bit word
not to search is in the wrong direction, the result can be either a
bit found smaller than the offset or failure to find a set bit.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agos390: add support for IBM zBC12 machine
Heiko Carstens [Wed, 24 Jul 2013 08:35:33 +0000]
s390: add support for IBM zBC12 machine

commit 594712276e737961d30e11eae80d403b2b3815df upstream.

Just add the new model number where appropiate.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agodrm/i915: make SDVO TV-out work for multifunction devices
Daniel Vetter [Tue, 30 Apr 2013 12:01:45 +0000]
drm/i915: make SDVO TV-out work for multifunction devices

commit 09ede5414f0215461c933032630bf9c3a61a8ba3 upstream.

We need to track this correctly. While at it shovel the boolean
to track whether the sdvo is in tv mode or not into pipe_config.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36997
Tested-by: Pierre Assal <pierre.assal@verint.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63609
Tested-by: cancan,feng <cancan.feng@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoBtrfs: fix crash regarding to ulist_add_merge
Liu Bo [Fri, 28 Jun 2013 04:37:45 +0000]
Btrfs: fix crash regarding to ulist_add_merge

commit 35f0399db6658f465b00893bdd13b992a0acfef0 upstream.

Several users reported this crash of NULL pointer or general protection,
the story is that we add a rbtree for speedup ulist iteration, and we
use krealloc() to address ulist growth, and krealloc() use memcpy to copy
old data to new memory area, so it's OK for an array as it doesn't use
pointers while it's not OK for a rbtree as it uses pointers.

So krealloc() will mess up our rbtree and it ends up with crash.

Reviewed-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Cc: BJ Quinn <bj@placs.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agox86, fpu: correct the asm constraints for fxsave, unbreak mxcsr.daz
H.J. Lu [Fri, 26 Jul 2013 16:11:56 +0000]
x86, fpu: correct the asm constraints for fxsave, unbreak mxcsr.daz

commit eaa5a990191d204ba0f9d35dbe5505ec2cdd1460 upstream.

GCC will optimize mxcsr_feature_mask_init in arch/x86/kernel/i387.c:

memset(&fx_scratch, 0, sizeof(struct i387_fxsave_struct));
asm volatile("fxsave %0" : : "m" (fx_scratch));
mask = fx_scratch.mxcsr_mask;
if (mask == 0)
mask = 0x0000ffbf;


memset(&fx_scratch, 0, sizeof(struct i387_fxsave_struct));
asm volatile("fxsave %0" : : "m" (fx_scratch));
mask = 0x0000ffbf;

since asm statement doesn’t say it will update fx_scratch.  As the
result, the DAZ bit will be cleared.  This patch fixes it. This bug
dates back to at least kernel 2.6.12.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agodrm/radeon: never unpin UVD bo v3
Christian König [Fri, 12 Jul 2013 14:18:09 +0000]
drm/radeon: never unpin UVD bo v3

commit 9cc2e0e9f13315559c85c9f99f141e420967c955 upstream.

Changing the UVD BOs offset on suspend/resume doesn't work because the VCPU
internally keeps pointers to it. Just keep it always pinned and save the
content manually.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=66425

v2: fix compiler warning
v3: fix CIK support
v4: rebased for 3.10-stable tree

Note: a version of this patch needs to go to stable.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agocgroup: fix umount vs cgroup_cfts_commit() race
Li Zefan [Tue, 18 Jun 2013 10:40:19 +0000]
cgroup: fix umount vs cgroup_cfts_commit() race

commit 084457f284abf6789d90509ee11dae383842b23b upstream.

cgroup_cfts_commit() uses dget() to keep cgroup alive after cgroup_mutex
is dropped, but dget() won't prevent cgroupfs from being umounted. When
the race happens, vfs will see some dentries with non-zero refcnt while
umount is in process.

Keep running this:
  mount -t cgroup -o blkio xxx /cgroup
  umount /cgroup

And this:
  modprobe cfq-iosched
  rmmod cfs-iosched

After a while, the BUG() in shrink_dcache_for_umount_subtree() may
be triggered:

  BUG: Dentry xxx{i=0,n=blkio.yyy} still in use (1) [umount of cgroup cgroup]

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agofanotify: info leak in copy_event_to_user()
Dan Carpenter [Mon, 8 Jul 2013 22:59:40 +0000]
fanotify: info leak in copy_event_to_user()

commit de1e0c40aceb9d5bff09c3a3b97b2f1b178af53f upstream.

The ->reserved field isn't cleared so we leak one byte of stack
information to userspace.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agodrm/i915: Preserve the DDI_A_4_LANES bit from the bios
Stéphane Marchesin [Fri, 12 Jul 2013 20:54:41 +0000]
drm/i915: Preserve the DDI_A_4_LANES bit from the bios

commit bcf53de4e60d9000b82f541d654529e2902a4c2c upstream.

Otherwise the DDI_A_4_LANES bit gets lost and we can't use > 2 lanes
on eDP. This fixes eDP on hsw with > 2 lanes.

Also s/port_reversal/saved_port_bits/ since the current name is

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Zhouping Liu <zliu@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoxen-blkfront: use a different scatterlist for each request
Roger Pau Monne [Thu, 2 May 2013 08:58:50 +0000]
xen-blkfront: use a different scatterlist for each request

commit b7649158a0d241f8d53d13ff7441858539e16656 upstream.

In blkif_queue_request blkfront iterates over the scatterlist in order
to set the segments of the request, and in blkif_completion blkfront
iterates over the raw request, which makes it hard to know the exact
position of the source and destination memory positions.

This can be solved by allocating a scatterlist for each request, that
will be keep until the request is finished, allowing us to copy the
data back to the original memory without having to iterate over the
raw request.

CC: stable@vger.kernel.org
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-and-Tested-by: Anne Milicia <anne.milicia@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agodrm/radeon: Disable dma rings for bo moves on r6xx
Alex Deucher [Thu, 11 Jul 2013 18:20:11 +0000]
drm/radeon: Disable dma rings for bo moves on r6xx

commit aeea40cbf9388fc829e66fa049f64d97fd72e118 upstream.

They still seem to cause instability on some r6xx parts.
As a follow up, we can switch to using CP DMA for bo
moves on r6xx as a lighter weight alternative to using
the 3D engine.

A version of this patch should also go to stable kernels.

Tested-by: J.N. <golden.fleeced@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoPCI: Retry allocation of only the resource type that failed
Yinghai Lu [Thu, 25 Jul 2013 13:31:38 +0000]
PCI: Retry allocation of only the resource type that failed

commit aa914f5ec25e4371ba18b312971314be1b9b1076 upstream.

Ben Herrenschmidt reported the following problem:

  - The bus has space for all desired MMIO resources, including optional
    space for SR-IOV devices
  - We attempt to allocate I/O port space, but it fails because the bus
    has no I/O space
  - Because of the I/O allocation failure, we retry MMIO allocation,
    requesting only the required space, without the optional SR-IOV space

This means we don't allocate the optional SR-IOV space, even though we

This is related to 0c5be0cb0e ("PCI: Retry on IORESOURCE_IO type

This patch changes how we handle allocation failures.  We will now retry
allocation of only the resource type that failed.  If MMIO allocation
fails, we'll retry only MMIO allocation.  If I/O port allocation fails,
we'll retry only I/O port allocation.

[bhelgaas: changelog]
Reference: https://lkml.kernel.org/r/1367712653.11982.19.camel@pasglop
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoPCI: pciehp: Fix null pointer deref when hot-removing SR-IOV device
Yinghai Lu [Fri, 19 Jul 2013 19:14:16 +0000]
PCI: pciehp: Fix null pointer deref when hot-removing SR-IOV device

commit 29ed1f29b68a8395d5679b3c4e38352b617b3236 upstream.

Hot-removing a device with SR-IOV enabled causes a null pointer dereference
in v3.9 and v3.10.

This is a regression caused by ba518e3c17 ("PCI: pciehp: Iterate over all
devices in slot, not functions 0-7").  When we iterate over the
bus->devices list, we first remove the PF, which also removes all the VFs
from the list.  Then the list iterator blows up because more than just the
current entry was removed from the list.

ac205b7bb7 ("PCI: make sriov work with hotplug remove") works around a
similar problem in pci_stop_bus_devices() by iterating over the list in
reverse, so the VFs are stopped and removed from the list first, before the

This patch changes pciehp_unconfigure_device() to iterate over the list in
reverse, too.

[bhelgaas: bugzilla, changelog]
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=60604
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agoRevert "cpuidle: Quickly notice prediction failure for repeat mode"
Rafael J. Wysocki [Fri, 26 Jul 2013 23:41:34 +0000]
Revert "cpuidle: Quickly notice prediction failure for repeat mode"

commit 148519120c6d1f19ad53349683aeae9f228b0b8d upstream.

Revert commit 69a37bea (cpuidle: Quickly notice prediction failure for
repeat mode), because it has been identified as the source of a
significant performance regression in v3.8 and later as explained by
Jeremy Eder:

  We believe we've identified a particular commit to the cpuidle code
  that seems to be impacting performance of variety of workloads.
  The simplest way to reproduce is using netperf TCP_RR test, so
  we're using that, on a pair of Sandy Bridge based servers.  We also
  have data from a large database setup where performance is also
  measurably/positively impacted, though that test data isn't easily

  Included below are test results from 3 test kernels:

  kernel       reverts
  1) vanilla   upstream (no reverts)

  2) perfteam2 reverts e11538d1f03914eb92af5a1a378375c05ae8520c

  3) test      reverts 69a37beabf1f0a6705c08e879bdd5d82ff6486c4

  In summary, netperf TCP_RR numbers improve by approximately 4%
  after reverting 69a37beabf1f0a6705c08e879bdd5d82ff6486c4.  When
  69a37beabf1f0a6705c08e879bdd5d82ff6486c4 is included, C0 residency
  never seems to get above 40%.  Taking that patch out gets C0 near
  100% quite often, and performance increases.

  The below data are histograms representing the %c0 residency @
  1-second sample rates (using turbostat), while under netperf test.

  - If you look at the first 4 histograms, you can see %c0 residency
    almost entirely in the 30,40% bin.
  - The last pair, which reverts 69a37beabf1f0a6705c08e879bdd5d82ff6486c4,
    shows %c0 in the 80,90,100% bins.

  Below each kernel name are netperf TCP_RR trans/s numbers for the
  particular kernel that can be disclosed publicly, comparing the 3
  test kernels.  We ran a 4th test with the vanilla kernel where
  we've also set /dev/cpu_dma_latency=0 to show overall impact
  boosting single-threaded TCP_RR performance over 11% above

  3.10-rc2 vanilla RX + c0 lock (/dev/cpu_dma_latency=0):
  TCP_RR trans/s 54323.78

  3.10-rc2 vanilla RX (no reverts)
  TCP_RR trans/s 48192.47

  Receiver %c0
      0.0000 -    10.0000 [     1]: *
     10.0000 -    20.0000 [     0]:
     20.0000 -    30.0000 [     0]:
     30.0000 -    40.0000 [    59]:
     40.0000 -    50.0000 [     1]: *
     50.0000 -    60.0000 [     0]:
     60.0000 -    70.0000 [     0]:
     70.0000 -    80.0000 [     0]:
     80.0000 -    90.0000 [     0]:
     90.0000 -   100.0000 [     0]:

  Sender %c0
      0.0000 -    10.0000 [     1]: *
     10.0000 -    20.0000 [     0]:
     20.0000 -    30.0000 [     0]:
     30.0000 -    40.0000 [    11]: ***********
     40.0000 -    50.0000 [    49]:
     50.0000 -    60.0000 [     0]:
     60.0000 -    70.0000 [     0]:
     70.0000 -    80.0000 [     0]:
     80.0000 -    90.0000 [     0]:
     90.0000 -   100.0000 [     0]:

  3.10-rc2 perfteam2 RX (reverts commit
  TCP_RR trans/s 49698.69

  Receiver %c0
      0.0000 -    10.0000 [     1]: *
     10.0000 -    20.0000 [     1]: *
     20.0000 -    30.0000 [     0]:
     30.0000 -    40.0000 [    59]:
     40.0000 -    50.0000 [     0]:
     50.0000 -    60.0000 [     0]:
     60.0000 -    70.0000 [     0]:
     70.0000 -    80.0000 [     0]:
     80.0000 -    90.0000 [     0]:
     90.0000 -   100.0000 [     0]:

  Sender %c0
      0.0000 -    10.0000 [     1]: *
     10.0000 -    20.0000 [     0]:
     20.0000 -    30.0000 [     0]:
     30.0000 -    40.0000 [     2]: **
     40.0000 -    50.0000 [    58]:
     50.0000 -    60.0000 [     0]:
     60.0000 -    70.0000 [     0]:
     70.0000 -    80.0000 [     0]:
     80.0000 -    90.0000 [     0]:
     90.0000 -   100.0000 [     0]:

  3.10-rc2 test RX (reverts 69a37beabf1f0a6705c08e879bdd5d82ff6486c4
  and e11538d1f03914eb92af5a1a378375c05ae8520c)
  TCP_RR trans/s 47766.95

  Receiver %c0
      0.0000 -    10.0000 [     1]: *
     10.0000 -    20.0000 [     1]: *
     20.0000 -    30.0000 [     0]:
     30.0000 -    40.0000 [    27]: ***************************
     40.0000 -    50.0000 [     2]: **
     50.0000 -    60.0000 [     0]:
     60.0000 -    70.0000 [     2]: **
     70.0000 -    80.0000 [     0]:
     80.0000 -    90.0000 [     0]:
     90.0000 -   100.0000 [    28]: ****************************

      0.0000 -    10.0000 [     1]: *
     10.0000 -    20.0000 [     0]:
     20.0000 -    30.0000 [     0]:
     30.0000 -    40.0000 [    11]: ***********
     40.0000 -    50.0000 [     0]:
     50.0000 -    60.0000 [     1]: *
     60.0000 -    70.0000 [     0]:
     70.0000 -    80.0000 [     3]: ***
     80.0000 -    90.0000 [     7]: *******
     90.0000 -   100.0000 [    38]: **************************************

  These results demonstrate gaining back the tendency of the CPU to
  stay in more responsive, performant C-states (and thus yield
  measurably better performance), by reverting commit

Requested-by: Jeremy Eder <jeder@redhat.com>
Tested-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6 years agocpufreq: Fix cpufreq driver module refcount balance after suspend/resume
Rafael J. Wysocki [Mon, 29 Jul 2013 22:32:00 +0000]
cpufreq: Fix cpufreq driver module refcount balance after suspend/resume

commit 2a99859932281ed6c2ecdd988855f8f6838f6743 upstream.

Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
driver module refcount, __cpufreq_remove_dev() causes that refcount
to become negative for the cpufreq driver after a suspend/resume

This is not the only bad thing that happens there, however, because
kobject_put() should only be called for the policy kobject at this
point if the CPU is not the last one for that policy.

Namely, if the given CPU is the last one for that policy, the
policy kobject's refcount should be 1 at this point, as set by
cpufreq_add_dev_interface(), and only needs to be dropped once for
the kobject to go away.  This actually happens under the cpu == 1
check, so it need not be done before by cpufreq_cpu_put().

On the other hand, if the given CPU is not the last one for that
policy, this means that cpufreq_add_policy_cpu() has been called
at least once for that policy and cpufreq_cpu_get() has been
called for it too.  To balance that cpufreq_cpu_get(), we need to
call cpufreq_cpu_put() in that case.

Thus, to fix the described problem and keep the reference
counters balanced in both cases, move the cpufreq_cpu_get() call
in __cpufreq_remove_dev() to the code path executed only for
CPUs that share the policy with other CPUs.

Reported-and-tested-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>