Matthew Wilcox [Wed, 16 Mar 2011 20:28:24 +0000]
NVMe: Convert comments to kernel-doc notation

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Krzysztof Wierzbicki [Mon, 28 Feb 2011 07:27:13 +0000]
NVMe: Update admin opcodes to match the 1.0RC spec

Signed-off-by: Krzysztof Wierzbicki <krzysztof.wierzbicki@intel.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Thu, 24 Feb 2011 21:20:14 +0000]
NVMe: Version 0.4

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Thu, 24 Feb 2011 13:49:41 +0000]
NVMe: Reduce maximum queue depth by 1

The spec says we're not allowed to completely fill the submission queue.
Solve this by reducing the number of allocatable cmdids by 1.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
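
A minimal sketch of why a ring whose "empty" state is head == tail can only hold depth - 1 entries, which is the constraint this commit works around. The struct and function names are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring state; names are illustrative, not the driver's. */
struct ring {
    uint16_t head;   /* next entry the consumer will take */
    uint16_t tail;   /* first free slot for the producer */
    uint16_t depth;  /* number of slots in the ring */
};

/* If head == tail means "empty", a completely full ring would look
 * identical to an empty one, so one slot always stays unused. */
static bool ring_is_full(const struct ring *r)
{
    return (uint16_t)((r->tail + 1) % r->depth) == r->head;
}

/* Returns the slot that was used, or -1 if the ring is full. */
static int ring_push(struct ring *r)
{
    assert(r->depth > 1);
    if (ring_is_full(r))
        return -1;
    int slot = r->tail;
    r->tail = (uint16_t)((r->tail + 1) % r->depth);
    return slot;
}
```

With a depth of 4, three pushes succeed and the fourth is refused, which matches handing out one fewer command ID than there are slots.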

Matthew Wilcox [Thu, 24 Feb 2011 13:46:00 +0000]
NVMe: Fix discontiguous accesses

When we submit subsequent portions of the I/O, we need to access the
updated block, not start reading again from the original position.
This was showing up as miscompares in the XFS randholes testcase.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Wed, 23 Feb 2011 20:20:00 +0000]
NVMe: Handle bios that contain non-virtually contiguous addresses

NVMe scatterlists must be virtually contiguous, like almost all I/Os.
However, when the filesystem lays out files with a hole, it can be that
adjacent LBAs map to non-adjacent virtual addresses.  Handle this by
submitting one NVMe command at a time for each virtually discontiguous
range.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
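
One way to picture the splitting rule, as a sketch only: two adjacent segments can share a command when the first ends on a page boundary and the second starts on one. The 4096-byte page size, the `struct vec` layout and the function name are assumptions for illustration, not the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 4096u  /* assumed page size */

/* Hypothetical segment: offset within its page, and length in bytes. */
struct vec {
    uint32_t offset;
    uint32_t len;
};

/* Count how many commands are needed if a new command is started at
 * every boundary where the two neighbours are not virtually contiguous. */
static size_t count_commands(const struct vec *v, size_t n)
{
    size_t cmds = n ? 1 : 0;
    for (size_t i = 1; i < n; i++) {
        int prev_ends_on_page =
            ((v[i - 1].offset + v[i - 1].len) % PAGE_BYTES) == 0;
        int cur_starts_on_page = (v[i].offset == 0);
        if (!(prev_ends_on_page && cur_starts_on_page))
            cmds++;
    }
    return cmds;
}
```

Two full pages need one command, while a segment that starts mid-page after a full page forces a second command.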

Matthew Wilcox [Tue, 22 Feb 2011 19:18:30 +0000]
NVMe: Implement Flush

Linux implements Flush as a bit in the bio.  That means there may also be
data associated with the flush; if so the flush should be sent before the
data.  To avoid completing the bio twice, I add CMD_CTX_FLUSH to indicate
the completion routine should do nothing.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Tue, 22 Feb 2011 19:15:34 +0000]
NVMe: Mark CMD_CTX_CANCELLED as being unlikely

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Wed, 16 Feb 2011 14:59:59 +0000]
NVMe: Correct SQ doorbell semantics

The value written to the doorbell needs to be the first free index in
the queue, not the most recently used index in the queue.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
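
The corrected semantics can be sketched as follows: copy the command into the slot at the tail, advance the tail with wrap-around, then write the new tail to the doorbell. The types here are stand-ins (a plain struct field replaces the MMIO register), and the sketch deliberately omits the queue-full check.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SQ_DEPTH 8  /* illustrative depth */

struct cmd { uint8_t bytes[64]; };

struct sq {
    struct cmd slots[SQ_DEPTH];
    uint16_t tail;      /* first free index */
    uint32_t doorbell;  /* stand-in for the MMIO doorbell register */
};

static void sq_submit(struct sq *q, const struct cmd *c)
{
    memcpy(&q->slots[q->tail], c, sizeof(*c));
    if (++q->tail == SQ_DEPTH)
        q->tail = 0;
    /* Write the first free index, not the index that was just filled. */
    q->doorbell = q->tail;
}
```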

Matthew Wilcox [Tue, 15 Feb 2011 21:28:20 +0000]
NVMe: Let the kthread take care of devices earlier

If interrupts are misconfigured, the kthread will be needed to process
admin queue completions.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Tue, 15 Feb 2011 21:16:02 +0000]
NVMe: Rename nr_queues to nr_io_queues

I got confused about whether this included the admin queue or not, and
had to resort to reading the spec.  It doesn't include the admin queue,
so make that clear in the name.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Tue, 15 Feb 2011 18:44:13 +0000]
NVMe: Remove setting of 'flags' in rw command

This was the data transfer bit until spec rev 0.92

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Mon, 14 Feb 2011 22:35:00 +0000]
NVMe: Release 0.3

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Wed, 2 Mar 2011 23:37:18 +0000]
NVMe: Add a kthread to handle the congestion list

Instead of trying to resubmit I/Os in the I/O completion path (in
interrupt context), wake up a kthread which will resubmit I/O from
user context.  This allows mke2fs to run to completion.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Mon, 14 Feb 2011 20:55:33 +0000]
NVMe: Handle failures differently in nvme_submit_bio_queue()

Return -EBUSY if the queue is full or -ENOMEM if we failed to allocate
memory (or map a scatterlist).  Also use GFP_ATOMIC to allocate the
nvme_bio and move the locking to the callers of nvme_submit_bio_queue().

In nvme_make_request(), don't permit an I/O to jump the queue -- if the
congestion list already has an entry, just add to the tail, rather than
trying to submit.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
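
The "no queue jumping" rule can be sketched like this: a new request is only submitted directly when nothing is already waiting; otherwise it joins the back of the congestion list. Counters stand in for the real queue and list, and all names are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical queue state: counters replace the real data structures. */
struct queue {
    unsigned in_flight;  /* commands currently on the hardware queue */
    unsigned depth;      /* maximum commands in flight */
    unsigned congested;  /* requests parked on the congestion list */
};

/* Returns 0 on success or -EBUSY if the queue is full. */
static int submit(struct queue *q)
{
    if (q->in_flight >= q->depth)
        return -EBUSY;
    q->in_flight++;
    return 0;
}

/* Returns 1 if the request was submitted, 0 if it was parked. */
static int make_request(struct queue *q)
{
    /* If anything is already waiting, go to the tail of the list
     * instead of trying to submit ahead of it. */
    if (q->congested > 0 || submit(q) != 0) {
        q->congested++;
        return 0;
    }
    return 1;
}
```

Even after a slot frees up, a new request is parked as long as the congestion list is non-empty, so ordering is preserved.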

Matthew Wilcox [Mon, 14 Feb 2011 17:20:15 +0000]
NVMe: Update BAR structure to match the current spec

Add two reserved registers in the middle of the BAR to match the 1.0
spec plus ECN 0002.

Also rename IMC and ISC to INTMC and INTSC to conform with the spec.
We still don't need to use them :-)

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Thu, 10 Feb 2011 18:55:39 +0000]
NVMe: Handle physical merging of bvec entries

In order to not overrun the sg array, we have to merge physically
contiguous pages into a single sg entry.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
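
A sketch of the merging idea, using a plain address/length pair in place of the kernel's scatterlist entries (the `struct seg` type and function name are made up for this example): when a segment starts exactly where the previous output entry ends, extend that entry instead of consuming a new one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scatterlist entry. */
struct seg {
    uint64_t addr;
    uint32_t len;
};

/* Merge physically adjacent segments from in[] into out[].
 * out[] must have room for n entries; returns the number used. */
static size_t merge_segments(const struct seg *in, size_t n, struct seg *out)
{
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        if (used > 0 &&
            out[used - 1].addr + out[used - 1].len == in[i].addr)
            out[used - 1].len += in[i].len;   /* extend previous entry */
        else
            out[used++] = in[i];              /* start a new entry */
    }
    return used;
}
```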

Matthew Wilcox [Thu, 10 Feb 2011 17:01:09 +0000]
NVMe: Check for DMA mapping failure

If dma_map_sg returns 0 (failure), we need to fail the I/O.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Thu, 10 Feb 2011 15:47:55 +0000]
NVMe: Pass the nvme_dev to nvme_free_prps and nvme_setup_prps

We were passing the nvme_queue to access the q_dmadev for the
dma_alloc_coherent calls, but since we moved to the dma pool API,
we really only need the nvme_dev.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Thu, 10 Feb 2011 15:30:34 +0000]
NVMe: Optimise memory usage for I/Os between 4k and 128k

Add a second memory pool for smaller I/Os.  We can pack 16 of these on a
single page instead of using an entire page for each one.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
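
The numbers in this commit fit together if one assumes 4096-byte pages and 8-byte PRP entries: 16 blocks per page gives 256-byte blocks, each holding 32 entries, which is enough to describe 128k of page-aligned I/O. The sketch below encodes that arithmetic; the constants and the function name are assumptions, and the driver's real entry count may differ slightly (for example, if the first page is carried outside the list).

```c
#include <assert.h>

/* Assumed sizes; only the "16 per page" figure comes from the commit. */
enum {
    PAGE_BYTES          = 4096,
    PRP_ENTRY_BYTES     = 8,
    SMALL_PER_PAGE      = 16,
    SMALL_BLOCK_BYTES   = PAGE_BYTES / SMALL_PER_PAGE,          /* 256 */
    SMALL_BLOCK_ENTRIES = SMALL_BLOCK_BYTES / PRP_ENTRY_BYTES   /* 32 */
};

/* Returns 1 if a page-aligned I/O of io_bytes fits in a small block. */
static int use_small_pool(unsigned io_bytes)
{
    unsigned pages = (io_bytes + PAGE_BYTES - 1) / PAGE_BYTES;
    return pages <= SMALL_BLOCK_ENTRIES;
}
```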

Matthew Wilcox [Thu, 10 Feb 2011 14:56:01 +0000]
NVMe: Switch to use DMA Pool API

Calling dma_free_coherent from interrupt context causes warnings.
Using the DMA pools delays freeing until pool destruction, so avoids
the problem.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Thu, 10 Feb 2011 14:03:06 +0000]
NVMe: Rename nvme_req_info to nvme_bio

There are too many things called 'info' in this driver.  This data
structure is auxiliary information for a struct bio, so call it nvme_bio,
or nbio when used as a variable.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Shane Michael Matthews [Thu, 10 Feb 2011 13:51:24 +0000]
NVMe: Initial PRP List support

Add a pointer to the nvme_req_info to hold a new data structure
(nvme_prps) which contains a list of the pages allocated to this
particular request for holding PRP list entries.  nvme_setup_prps()
now returns this pointer.

To allocate and free the memory used for PRP lists, we need a struct
device, so we need to pass the nvme_queue pointer to many functions
which didn't use to need it.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Thu, 10 Feb 2011 13:49:59 +0000]
NVMe: Advance the sg pointer when filling in an sg list

For multipage BIOs, we were always using sg[0] instead of advancing
through the list.  Oops :-)

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Mon, 7 Feb 2011 20:55:59 +0000]
NVMe: Renumber the special context values

If POISON_POINTER_DELTA isn't defined, ensure they're in page 0 which
should never be mapped.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Mon, 7 Feb 2011 17:45:24 +0000]
NVMe: Handle the congestion list a little better

In the bio completion handler, check for bios on the congestion list
for this NVM queue.  Also, lock the congestion list in the make_request
function as the queue may end up being shared between multiple CPUs.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Sun, 6 Feb 2011 23:30:16 +0000]
NVMe: Record the timeout for each command

In addition to recording the completion data for each command, record
the anticipated completion time.  Choose a timeout of 5 seconds for
normal I/Os and 60 seconds for admin I/Os.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
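
A sketch of per-command deadline tracking using the two timeout values from this commit. Time is counted in plain seconds here for simplicity; the kernel would use its own time units, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum {
    IO_TIMEOUT_SECS    = 5,   /* normal I/Os, per the commit message */
    ADMIN_TIMEOUT_SECS = 60   /* admin I/Os, per the commit message */
};

/* Hypothetical per-command record: completion data plus a deadline. */
struct cmd_info {
    void *ctx;          /* completion data */
    uint64_t deadline;  /* anticipated completion time, in seconds */
};

static void record_cmd(struct cmd_info *info, void *ctx,
                       uint64_t now, unsigned timeout_secs)
{
    info->ctx = ctx;
    info->deadline = now + timeout_secs;
}

static int cmd_timed_out(const struct cmd_info *info, uint64_t now)
{
    return now > info->deadline;
}
```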

Matthew Wilcox [Sun, 6 Feb 2011 14:01:00 +0000]
NVMe: Need to lock queue during interrupt handling

If we're sharing a queue between multiple CPUs and we cancel a sync I/O,
we must have the queue locked to avoid corrupting the stack of the thread
that submitted the I/O.  It turns out this is the same locking that's needed
for the threaded irq handler, so share that code.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Sun, 6 Feb 2011 13:51:15 +0000]
NVMe: Detect command IDs completing that are out of range

If the adapter completes a command ID that is outside the bounds of
the array, return CMD_CTX_INVALID instead of random data, and print a
message in the sync_completion handler (which is rapidly becoming the
misc completion handler :-)

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
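
A sketch of how sentinel context values can catch both out-of-range and repeated completions. The CMD_CTX_* names appear in these commit messages, but the numeric values, table layout and `take_ctx()` helper below are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sentinel values are assumed; they only need to be addresses that
 * can never be valid pointers. */
#define CMD_CTX_CANCELLED ((void *)(uintptr_t)0x10)
#define CMD_CTX_COMPLETED ((void *)(uintptr_t)0x20)
#define CMD_CTX_INVALID   ((void *)(uintptr_t)0x30)

#define DEPTH 4  /* illustrative table size */

struct cmd_table {
    void *ctx[DEPTH];
};

/* Fetch the context for a completed command ID and mark the slot so a
 * second completion of the same ID can be detected. */
static void *take_ctx(struct cmd_table *t, unsigned cmdid)
{
    if (cmdid >= DEPTH)
        return CMD_CTX_INVALID;     /* never index past the array */
    void *ctx = t->ctx[cmdid];
    t->ctx[cmdid] = CMD_CTX_COMPLETED;
    return ctx;
}
```

A completion handler can then compare the returned pointer against the sentinels and print a message rather than dereference it.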

Matthew Wilcox [Sun, 6 Feb 2011 13:49:55 +0000]
NVMe: Detect commands that are completed twice

Set the context value to CMD_CTX_COMPLETED, and print a message in the
sync_completion handler if we see it.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Sun, 6 Feb 2011 12:53:23 +0000]
NVMe: Use a symbolic name to represent cancelled commands instead of 0

I have plans for other special values in sync_completion.  Plus, this
is more self-documenting, and lets us detect bogus usages.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Sun, 6 Feb 2011 12:28:06 +0000]
NVMe: Add a module parameter to use a threaded interrupt

We're currently calling bio_endio from hard interrupt context.  This is
not a good idea for preemptible kernels as it will cause longer latencies.
Using a threaded interrupt will run the entire queue processing mechanism
(including bio_endio) in a thread, which can be preempted.  Unfortunately,
it also adds about 7us of latency to the single-I/O case, so make it a
module parameter for the moment.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Fri, 4 Feb 2011 21:14:30 +0000]
NVMe: Call put_nvmeq() before calling nvme_submit_sync_cmd()

We can't have preemption disabled when we call schedule().  Accept the
possibility that we'll get preempted, and it'll cost us some cacheline
bounces.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Fri, 4 Feb 2011 21:03:56 +0000]
NVMe: Allow fatal signals to interrupt I/O

If the user sends a fatal signal, sleeping in the TASK_KILLABLE state
permits the task to be aborted.  The only wrinkle is making sure that
if/when the command completes later, it doesn't upset anything.
Handle this by setting the data pointer to 0, and checking the value
isn't NULL in the sync completion path.  Eventually, bios can be cancelled
through this path too.  Note that the cmdid isn't freed to prevent reuse.

We should also abort the command in the future, but this is a good start.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Thu, 3 Feb 2011 19:36:07 +0000]
NVMe: Release 0.2

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Thu, 3 Feb 2011 15:58:26 +0000]
NVMe: Add download / activate firmware ioctls

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Thu, 3 Feb 2011 14:20:57 +0000]
NVMe: Add remaining status codes

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Tue, 1 Feb 2011 17:49:38 +0000]
NVMe: Move sysfs entries to the right place

Because I wasn't setting driverfs_dev, the devices were showing up under
/sys/devices/virtual/block.  Now they appear underneath the PCI device
which they belong to.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Shane Michael Matthews [Tue, 1 Feb 2011 16:31:55 +0000]
NVMe: Disable the device before we write the admin queues

In case the card has been left in a partially-configured state,
write 0 to the Enable bit.

Signed-off-by: Shane Michael Matthews <shane.matthews@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Tue, 1 Feb 2011 21:24:35 +0000]
NVMe: Request I/O regions

Calling pci_request_selected_regions() reserves these regions for our use.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Tue, 1 Feb 2011 21:23:39 +0000]
NVMe: Allow queues to be allocated above 4GB

Need to call dma_set_coherent_mask() to allow queues to be allocated
above 4GB.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Tue, 1 Feb 2011 14:01:59 +0000]
NVMe: Enable device DMA

Need to call pci_set_master() to enable device DMA

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Shane Michael Matthews [Tue, 1 Feb 2011 13:49:30 +0000]
NVMe: Enable and disable the PCI device

Call pci_enable_device_mem() at initialisation and pci_disable_device
at exit.

Signed-off-by: Shane Michael Matthews <shane.matthews@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Tue, 1 Feb 2011 13:39:04 +0000]
NVMe: Check returns from nvme_alloc_queue()

It can return NULL, so handle that.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Mon, 31 Jan 2011 15:46:14 +0000]
NVMe: Remove 'node' from nvme_dev

We don't keep a list of nvme_dev any more

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Tue, 1 Feb 2011 21:18:08 +0000]
NVMe: Read the model, serial & firmware rev from the controller

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Tue, 1 Feb 2011 21:13:29 +0000]
NVMe: Add NVME_IOCTL_SUBMIT_IO

Allow userspace to submit synchronous I/O like the SCSI sg interface does.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Wed, 26 Jan 2011 22:05:50 +0000]
NVMe: Create nvme_map_user_pages() and nvme_unmap_user_pages()

These are generalisations of the code that was in
nvme_submit_user_admin_command().

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Wed, 26 Jan 2011 19:34:32 +0000]
NVMe: Change NVME_IOCTL_GET_RANGE_TYPE to return all the ranges

Factor out most of nvme_identify() into a new nvme_submit_user_admin_command()
function.  Change nvme_get_range_type() to call it and change nvme_ioctl to
realise that it's getting back all 64 ranges.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Wed, 26 Jan 2011 15:08:25 +0000]
NVMe: Zero the command before we send it

Make sure there's no left-over bits set from previous commands that used
this slot.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Wed, 26 Jan 2011 15:02:29 +0000]
NVMe: Add nvme_setup_prps()

Generalise the code from nvme_identify() that sets PRP1 & PRP2 so that
it's usable for commands sent by nvme_submit_bio_queue().

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
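
For a transfer that touches at most two pages, filling PRP1 and PRP2 can be sketched as below. This assumes a 4096-byte page and a single contiguous DMA address range; the function name and signature are invented for the example and are not those of nvme_setup_prps().

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES 4096u  /* assumed page size */

/* PRP1 is the (possibly unaligned) start address.  PRP2 is only
 * needed when the transfer spills into a second page, and then it
 * points at the start of that page.  Handles at most two pages. */
static void setup_prps_two_page(uint64_t dma_addr, uint32_t len,
                                uint64_t *prp1, uint64_t *prp2)
{
    uint32_t offset = (uint32_t)(dma_addr & (PAGE_BYTES - 1));
    uint32_t first_page_bytes = PAGE_BYTES - offset;

    assert(len <= first_page_bytes + PAGE_BYTES);
    *prp1 = dma_addr;
    *prp2 = 0;
    if (len > first_page_bytes)
        *prp2 = dma_addr + first_page_bytes;
}
```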

Matthew Wilcox [Wed, 26 Jan 2011 15:01:21 +0000]
NVMe: Make nvme_common_command more featureful

Add prp1, prp2 and the metadata prp to the common command, since the
fields are generally used this way.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Mon, 24 Jan 2011 12:52:07 +0000]
NVMe: Use PRP2 for the nvme_identify ioctl

DMA the result straight to userspace instead of bounce-buffering in the
kernel.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Thu, 20 Jan 2011 18:42:34 +0000]
NVMe: Fix admin IRQ claim on real hardware

The admin IRQ is supposed to use the pin-based (or single message MSI)
interrupt.  Accomplish this by filling in entry[0]'s vector with the
INTx irq number.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Thu, 20 Jan 2011 18:24:06 +0000]
NVMe: Rename 'cycle' to 'phase'

It's called the phase bit in the current draft

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
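
For readers unfamiliar with the phase bit, here is a sketch of how a consumer can use it to tell new completion entries from stale ones: the expected phase starts at 1 and flips each time the head wraps. The struct layouts are simplified and the names are hypothetical; only the idea of a phase bit in the status field is taken from the NVMe design.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified completion entry: bit 0 of status is the phase bit. */
struct cqe {
    uint16_t command_id;
    uint16_t status;
};

struct cq {
    struct cqe *entries;
    uint16_t depth;
    uint16_t head;
    uint8_t phase;  /* phase value that marks an entry as new */
};

/* Returns 1 and stores the command ID if a new entry is available,
 * or 0 if the entry at head is stale. */
static int cq_poll(struct cq *q, uint16_t *cmdid)
{
    struct cqe *e = &q->entries[q->head];

    if ((e->status & 1) != q->phase)
        return 0;
    *cmdid = e->command_id;
    if (++q->head == q->depth) {
        q->head = 0;
        q->phase = !q->phase;  /* expect the opposite phase next lap */
    }
    return 1;
}
```

After one full lap the old entries still carry phase 1, so they are correctly ignored until the device overwrites them with phase 0.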

Matthew Wilcox [Thu, 20 Jan 2011 18:01:49 +0000]
NVMe: Implement per-CPU queues

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Thu, 20 Jan 2011 14:14:34 +0000]
NVMe: Reduce set_queue_count arguments by one

sq_count and cq_count are always the same, so just call it 'count'.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Thu, 20 Jan 2011 14:10:15 +0000]
NVMe: Factor out queue_request_irq()

Two callers with an almost identical long string of arguments, and
introducing a third soon.  Time to factor out the commonalities.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Thu, 20 Jan 2011 17:50:14 +0000]
NVMe: New driver

This driver is for devices that follow the NVM Express standard

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Matthew Wilcox [Fri, 4 Nov 2011 19:41:27 +0000]
Xen: Export xen_biovec_phys_mergeable

When Xen is enabled, using BIOVEC_PHYS_MERGEABLE in a module
causes xen_biovec_phys_mergeable to be referenced, so it needs
to be exported.

Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Linus Torvalds [Mon, 24 Oct 2011 07:10:05 +0000]
Linux 3.1

Linus Torvalds [Mon, 24 Oct 2011 05:08:24 +0000]
Merge git://git.infradead.org/iommu-2.6

* git://git.infradead.org/iommu-2.6:
  intel-iommu: fix superpage support in pfn_to_dma_pte()
  intel-iommu: set iommu_superpage on VM domains to lowest common denominator
  intel-iommu: fix return value of iommu_unmap() API
  MAINTAINERS: Update VT-d entry for drivers/pci -> drivers/iommu move
  intel-iommu: Export a flag indicating that the IOMMU is used for iGFX.
  intel-iommu: Workaround IOTLB hang on Ironlake GPU
  intel-iommu: Fix AB-BA lockdep report

Linus Torvalds [Mon, 24 Oct 2011 05:05:38 +0000]
Merge branch 'for-linus' of http://people.redhat.com/agk/git/linux-dm

* 'for-linus' of http://people.redhat.com/agk/git/linux-dm:
  dm kcopyd: fix job_pool leak

Takashi Iwai [Sun, 23 Oct 2011 21:19:12 +0000]
x86: Fix S4 regression

Commit 4b239f458 ("x86-64, mm: Put early page table high") has caused an
S4 regression since 2.6.39: the machine occasionally reboots at S4
resume.  It doesn't always happen; the overall rate is about 1 in 20.
But, like other bugs, once it happens, it keeps happening.

This patch fixes the problem by essentially reverting to the older way
of assigning memory.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Cc: <stable@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Yinghai Lu <yinghai.lu@oracle.com>
[ We'll hopefully find the real fix, but that's too late for 3.1 now ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Alasdair G Kergon [Sun, 23 Oct 2011 19:55:17 +0000]
dm kcopyd: fix job_pool leak

Fix memory leak introduced by commit a6e50b409d3f9e0833e69c3c9cca822e8fa4adbb
(dm snapshot: skip reading origin when overwriting complete chunk).

When allocating a set of jobs from kc->job_pool, job->master_job must be
set (to point to itself) so that the mempool item gets freed when the
master_job completes.

master_job was introduced by commit c6ea41fbbe08f270a8edef99dc369faf809d1bd6
(dm kcopyd: preallocate sub jobs to avoid deadlock)

Reported-by: Michael Leun <ml@newton.leun.net>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Linus Torvalds [Sun, 23 Oct 2011 07:44:40 +0000]
Merge branch 'samsung-fixes-4' of git://github.com/kgene/linux-samsung

* 'samsung-fixes-4' of git://github.com/kgene/linux-samsung:
  ARM: S3C24XX: Fix s3c24xx build errors if !CONFIG_PM
  ARM: S5P: fix offset calculation on gpio-interrupt

Linus Torvalds [Sun, 23 Oct 2011 07:43:31 +0000]
Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (w83627ehf) Fix negative 8-bit temperature values

Domenico Andreoli [Fri, 21 Oct 2011 19:00:53 +0000]
ARM: S3C24XX: Fix s3c24xx build errors if !CONFIG_PM

v2:
- register_syscore_ops(&s3c24xx_irq_syscore_ops) does not need to be
  conditionally compiled out, it is already optimized out on !CONFIG_PM
- fix also s3c2412 and s3c2416 affected by the same build issue

v1:
s3c2440.c fails to build if !CONFIG_PM because in that case
s3c2410_pm_syscore_ops is not defined. The same error should also occur
in s3c2410.c and s3c2442.c.

Signed-off-by: Domenico Andreoli <cavokz@gmail.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>

Linus Torvalds [Fri, 21 Oct 2011 14:02:18 +0000]
Merge git://github.com/herbertx/crypto

* git://github.com/herbertx/crypto:
  crypto: ghash - Avoid null pointer dereference if no key is set

Linus Torvalds [Fri, 21 Oct 2011 14:01:21 +0000]
Merge branch 'fix/hda' of git://github.com/tiwai/sound

* 'fix/hda' of git://github.com/tiwai/sound:
  ALSA: HDA: conexant support for Lenovo T520/W520
  ALSA: hda - Add position_fix quirk for Dell Inspiron 1010

Nick Bowler [Thu, 20 Oct 2011 12:16:55 +0000]
crypto: ghash - Avoid null pointer dereference if no key is set

The ghash_update function passes a pointer to gf128mul_4k_lle which will
be NULL if ghash_setkey is not called or if the most recent call to
ghash_setkey failed to allocate memory.  This causes an oops.  Fix this
up by returning an error code in the null case.

This is trivially triggered from unprivileged userspace through the
AF_ALG interface by simply writing to the socket without setting a key.

The ghash_final function has a similar issue, but triggering it requires
a memory allocation failure in ghash_setkey _after_ at least one
successful call to ghash_update.

  BUG: unable to handle kernel NULL pointer dereference at 00000670
  IP: [<d88c92d4>] gf128mul_4k_lle+0x23/0x60 [gf128mul]
  *pde = 00000000
  Oops: 0000 [#1] PREEMPT SMP
  Modules linked in: ghash_generic gf128mul algif_hash af_alg nfs lockd nfs_acl sunrpc bridge ipv6 stp llc

  Pid: 1502, comm: hashatron Tainted: G        W   3.1.0-rc9-00085-ge9308cf #32 Bochs Bochs
  EIP: 0060:[<d88c92d4>] EFLAGS: 00000202 CPU: 0
  EIP is at gf128mul_4k_lle+0x23/0x60 [gf128mul]
  EAX: d69db1f0 EBX: d6b8ddac ECX: 00000004 EDX: 00000000
  ESI: 00000670 EDI: d6b8ddac EBP: d6b8ddc8 ESP: d6b8dda4
   DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
  Process hashatron (pid: 1502, ti=d6b8c000 task=d6810000 task.ti=d6b8c000)
  Stack:
   00000000 d69db1f0 00000163 00000000 d6b8ddc8 c101a520 d69db1f0 d52aa000
   00000ff0 d6b8dde8 d88d310f d6b8a3f8 d52aa000 00001000 d88d502c d6b8ddfc
   00001000 d6b8ddf4 c11676ed d69db1e8 d6b8de24 c11679ad d52aa000 00000000
  Call Trace:
   [<c101a520>] ? kmap_atomic_prot+0x37/0xa6
   [<d88d310f>] ghash_update+0x85/0xbe [ghash_generic]
   [<c11676ed>] crypto_shash_update+0x18/0x1b
   [<c11679ad>] shash_ahash_update+0x22/0x36
   [<c11679cc>] shash_async_update+0xb/0xd
   [<d88ce0ba>] hash_sendpage+0xba/0xf2 [algif_hash]
   [<c121b24c>] kernel_sendpage+0x39/0x4e
   [<d88ce000>] ? 0xd88cdfff
   [<c121b298>] sock_sendpage+0x37/0x3e
   [<c121b261>] ? kernel_sendpage+0x4e/0x4e
   [<c10b4dbc>] pipe_to_sendpage+0x56/0x61
   [<c10b4e1f>] splice_from_pipe_feed+0x58/0xcd
   [<c10b4d66>] ? splice_from_pipe_begin+0x10/0x10
   [<c10b51f5>] __splice_from_pipe+0x36/0x55
   [<c10b4d66>] ? splice_from_pipe_begin+0x10/0x10
   [<c10b6383>] splice_from_pipe+0x51/0x64
   [<c10b63c2>] ? default_file_splice_write+0x2c/0x2c
   [<c10b63d5>] generic_splice_sendpage+0x13/0x15
   [<c10b4d66>] ? splice_from_pipe_begin+0x10/0x10
   [<c10b527f>] do_splice_from+0x5d/0x67
   [<c10b6865>] sys_splice+0x2bf/0x363
   [<c129373b>] ? sysenter_exit+0xf/0x16
   [<c104dc1e>] ? trace_hardirqs_on_caller+0x10e/0x13f
   [<c129370c>] sysenter_do_call+0x12/0x32
  Code: 83 c4 0c 5b 5e 5f c9 c3 55 b9 04 00 00 00 89 e5 57 8d 7d e4 56 53 8d 5d e4 83 ec 18 89 45 e0 89 55 dc 0f b6 70 0f c1 e6 04 01 d6 <f3> a5 be 0f 00 00 00 4e 89 d8 e8 48 ff ff ff 8b 45 e0 89 da 0f
  EIP: [<d88c92d4>] gf128mul_4k_lle+0x23/0x60 [gf128mul] SS:ESP 0068:d6b8dda4
  CR2: 0000000000000670
  ---[ end trace 4eaa2a86a8e2da24 ]---
  note: hashatron[1502] exited with preempt_count 1
  BUG: scheduling while atomic: hashatron/1502/0x10000002
  INFO: lockdep is turned off.
  [...]

Signed-off-by: Nick Bowler <nbowler@elliptictech.com>
Cc: stable@kernel.org [2.6.37+]
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
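
The shape of the fix can be sketched as an early return when no key has been set. Everything below is a simplified stand-in: the struct, the function name, the placeholder mixing loop (which is not GHASH) and the choice of EINVAL as the error code are assumptions for illustration, not the kernel's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical hash context: key_table stays NULL until a key is set. */
struct hash_ctx {
    const unsigned char *key_table;  /* 16-byte table derived from the key */
    unsigned long acc;
};

static int hash_update(struct hash_ctx *ctx,
                       const unsigned char *data, size_t len)
{
    /* Refuse to run without a key instead of dereferencing NULL. */
    if (!ctx->key_table)
        return -EINVAL;

    /* Placeholder mixing step; a real GHASH multiplies in GF(2^128). */
    for (size_t i = 0; i < len; i++)
        ctx->acc += (unsigned long)(data[i] ^ ctx->key_table[i % 16]);
    return 0;
}
```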

Marek Szyprowski [Fri, 21 Oct 2011 09:04:54 +0000]
ARM: S5P: fix offset calculation on gpio-interrupt

Offsets of the irq controller registers were calculated
correctly only for the first GPIO bank. This patch fixes
the calculation of the register offsets for all GPIO banks.

Reported-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>

Linus Torvalds [Thu, 20 Oct 2011 19:16:28 +0000]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Add alignment flag to PCI expansion resources
  sparc: Avoid calling sigprocmask()
  sparc: Use set_current_blocked()
  sparc32,leon: SRMMU MMU Table probe fix

Linus Torvalds [Thu, 20 Oct 2011 19:15:20 +0000]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  fib_rules: fix unresolved_rules counting
  r8169: fix wrong eee setting for rlt8111evl
  r8169: fix driver shutdown WoL regression.
  ehea: Change maintainer to me
  pptp: pptp_rcv_core() misses pskb_may_pull() call
  tproxy: copy transparent flag when creating a time wait
  pptp: fix skb leak in pptp_xmit()
  bonding: use local function pointer of bond->recv_probe in bond_handle_frame
  smsc911x: Add support for SMSC LAN89218
  tg3: negate USE_PHYLIB flag check
  netconsole: enable netconsole can make net_device refcnt incorrent
  bluetooth: Properly clone LSM attributes to newly created child connections
  l2tp: fix a potential skb leak in l2tp_xmit_skb()
  bridge: fix hang on removal of bridge via netlink
  x25: Prevent skb overreads when checking call user data
  x25: Handle undersized/fragmented skbs
  x25: Validate incoming call user data lengths
  udplite: fast-path computation of checksum coverage
  IPVS netns shutdown/startup dead-lock
  netfilter: nf_conntrack: fix event flooding in GRE protocol tracker

Jean Delvare [Thu, 20 Oct 2011 07:06:45 +0000]
hwmon: (w83627ehf) Fix negative 8-bit temperature values

Since 8-bit temperature values are now handled in 16-bit struct
members, values have to be cast to s8 for negative temperatures to be
properly handled. This has been broken since kernel version 2.6.39
(commit bce26c58df86599c9570cee83eac58bdaae760e4).

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Guenter Roeck <guenter.roeck@ericsson.com>
Cc: stable@kernel.org # 2.6.39+
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
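
The effect of the missing cast can be shown in a few lines. The function names are hypothetical, and the conversion to millidegrees is an assumption added to make the example concrete. Note that converting an out-of-range value to int8_t is implementation-defined in C, although common compilers wrap as shown.

```c
#include <assert.h>
#include <stdint.h>

/* An 8-bit register value stored in a 16-bit member: 0xF6 means -10. */

/* Without the cast, 0xF6 is read as 246. */
static int temp_from_reg8_broken(uint16_t stored)
{
    return stored * 1000;
}

/* With the cast, the low 8 bits are reinterpreted as signed. */
static int temp_from_reg8(uint16_t stored)
{
    return (int8_t)stored * 1000;
}
```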

Hugh Dickins [Wed, 19 Oct 2011 19:50:35 +0000]
mm: fix race between mremap and removing migration entry

I don't usually pay much attention to the stale "? " addresses in
stack backtraces, but this lucky report from Pawel Sikora hints that
mremap's move_ptes() has inadequate locking against page migration.

 3.0 BUG_ON(!PageLocked(p)) in migration_entry_to_page():
 kernel BUG at include/linux/swapops.h:105!
 RIP: 0010:[<ffffffff81127b76>]  [<ffffffff81127b76>]
                       migration_entry_wait+0x156/0x160
  [<ffffffff811016a1>] handle_pte_fault+0xae1/0xaf0
  [<ffffffff810feee2>] ? __pte_alloc+0x42/0x120
  [<ffffffff8112c26b>] ? do_huge_pmd_anonymous_page+0xab/0x310
  [<ffffffff81102a31>] handle_mm_fault+0x181/0x310
  [<ffffffff81106097>] ? vma_adjust+0x537/0x570
  [<ffffffff81424bed>] do_page_fault+0x11d/0x4e0
  [<ffffffff81109a05>] ? do_mremap+0x2d5/0x570
  [<ffffffff81421d5f>] page_fault+0x1f/0x30

mremap's down_write of mmap_sem, together with i_mmap_mutex or lock,
and pagetable locks, were good enough before page migration (with its
requirement that every migration entry be found) came in, and enough
while migration always held mmap_sem; but not enough nowadays, when
there's memory hotremove and compaction.

The danger is that move_ptes() lets a migration entry dodge around
behind remove_migration_pte()'s back, so it's in the old location when
looking at the new, then in the new location when looking at the old.

Either mremap's move_ptes() must additionally take anon_vma lock(), or
migration's remove_migration_pte() must stop peeking for is_swap_entry()
before it takes pagetable lock.

Consensus chooses the latter: we prefer to add overhead to migration
than to mremapping, which gets used by JVMs and by exec stack setup.

Reported-and-tested-by: Paweł Sikora <pluto@agmk.net>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8 years agosparc: Add alignment flag to PCI expansion resources
Kjetil Oftedal [Wed, 19 Oct 2011 23:20:50 +0000]
sparc: Add alignment flag to PCI expansion resources

Currently no alignment is specified for PCI expansion ROMs while
parsing the OpenFirmware tree. This causes calls to pci_map_rom() to fail.
IORESOURCE_SIZEALIGN is the default alignment used for ROM resources in
pci/probe.c, and has been verified to work with various cards on an Ultra 10.

Signed-off-By: Kjetil Oftedal <oftedal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8 years agofib_rules: fix unresolved_rules counting
Yan, Zheng [Mon, 17 Oct 2011 15:20:28 +0000]
fib_rules: fix unresolved_rules counting

We should decrement ops->unresolved_rules when deleting an unresolved rule.

Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8 years agor8169: fix wrong eee setting for rlt8111evl
hayeswang [Thu, 13 Oct 2011 20:14:37 +0000]
r8169: fix wrong eee setting for rlt8111evl

Correct the wrong parameter for setting EEE for RTL8111E-VL.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8 years agor8169: fix driver shutdown WoL regression.
françois romieu [Fri, 14 Oct 2011 00:57:45 +0000]
r8169: fix driver shutdown WoL regression.

Due to commit 92fc43b4159b518f5baae57301f26d770b0834c9 ("r8169: modify the
flow of the hw reset."), rtl8169_hw_reset stomps during driver shutdown on
RxConfig bits which are needed for WOL on some versions of the hardware.

As these bits were formerly set from the r81{0x, 68}_pll_power_down methods,
factor them out for use in the driver shutdown (rtl_shutdown) handler.

I favored __rtl8169_get_wol() -hardware state indication- over
RTL_FEATURE_WOL as the latter has become a good candidate for removal.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes <hayeswang@realtek.com>
Tested-by: Marc Ballarin <ballarin.marc@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

8 years agoehea: Change maintainer to me
Thadeu Lima de Souza Cascardo [Thu, 13 Oct 2011 09:56:19 +0000]
ehea: Change maintainer to me

Breno Leitao has passed the maintainership to me.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Cc: Breno Leitao <leitao@linux.vnet.ibm.com>
Acked-by: Breno Leitão <leitao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8 years agoMerge branch 'v4l_for_linus' of git://linuxtv.org/mchehab/for_linus
Linus Torvalds [Wed, 19 Oct 2011 13:44:11 +0000]
Merge branch 'v4l_for_linus' of git://linuxtv.org/mchehab/for_linus

* 'v4l_for_linus' of git://linuxtv.org/mchehab/for_linus:
  [media] videodev: fix a NULL pointer dereference in v4l2_device_release()

8 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Wed, 19 Oct 2011 13:43:24 +0000]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon/kms/atom: fix handling of FB scratch indices
  drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges (v2)
  drm/radeon/kms/DCE4.1: ss is not supported on the internal pplls
  drm/radeon/kms/DCE4.1: fix dig encoder to transmitter mapping
  ttm: Fix error-path using an uninitialized value

8 years ago[media] videodev: fix a NULL pointer dereference in v4l2_device_release()
Antonio Ospite [Wed, 12 Oct 2011 20:59:26 +0000]
[media] videodev: fix a NULL pointer dereference in v4l2_device_release()

The change in 8280b66 does not cover the case when v4l2_dev is already
NULL, fix that.

With a Kinect sensor, seen as a USB camera using GSPCA in this context,
a NULL pointer dereference BUG can be triggered by just unplugging the
device after the camera driver has been loaded.

Signed-off-by: Antonio Ospite <ospite@studenti.unina.it>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

8 years agointel-iommu: fix superpage support in pfn_to_dma_pte()
Allen Kay [Fri, 14 Oct 2011 19:32:46 +0000]
intel-iommu: fix superpage support in pfn_to_dma_pte()

If target_level == 0, the current code breaks out of the while-loop if the
SUPERPAGE bit is set. We should also break out if the PTE is not present.
If we don't do this, KVM calls to iommu_iova_to_phys() will cause
pfn_to_dma_pte() to create mappings for 4KiB pages.

Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

8 years agointel-iommu: set iommu_superpage on VM domains to lowest common denominator
Allen Kay [Fri, 14 Oct 2011 19:32:17 +0000]
intel-iommu: set iommu_superpage on VM domains to lowest common denominator

Set the dmar->iommu_superpage field to the lowest common denominator
of superpage sizes supported by all active VT-d engines.  Initialize
this field in intel_iommu_domain_init() so that intel_iommu_map()
can use the iommu_superpage field to determine the appropriate
superpage size to use.

Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
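
Note on the commit above: one way to picture the "lowest common
denominator" is as the intersection of per-engine capability masks. The
sketch below is an illustration under assumptions, not the driver's
code: the function name, the 4-bit mask width, and the meaning of each
bit (bit n set means superpage level n+1 is supported) are all
hypothetical.

```c
#include <stddef.h>

/* Hypothetical sketch: returns the highest superpage level that every
 * engine supports, or 0 if no level is common to all of them. */
int common_superpage_level(const unsigned int *engine_masks, size_t n)
{
    unsigned int mask = 0xf;    /* assume every level until an engine says otherwise */
    int level = 0;
    size_t i;

    for (i = 0; i < n; i++)
        mask &= engine_masks[i];    /* keep only levels shared by all engines */

    while (mask) {                  /* position of the highest set bit, 1-based */
        level++;
        mask >>= 1;
    }
    return level;
}
```

With masks 0x3 and 0x1 the intersection is 0x1, so only the first
superpage level may be used for the domain.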

8 years agointel-iommu: fix return value of iommu_unmap() API
Allen Kay [Fri, 14 Oct 2011 19:31:54 +0000]
intel-iommu: fix return value of iommu_unmap() API

The iommu_unmap() API expects IOMMU drivers to return the actual page order
of the address being unmapped.  The previous code just returned the page
order passed in by the caller.  This patch fixes that.

Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

8 years agoMAINTAINERS: Update VT-d entry for drivers/pci -> drivers/iommu move
Roland Dreier [Tue, 11 Oct 2011 00:07:15 +0000]
MAINTAINERS: Update VT-d entry for drivers/pci -> drivers/iommu move

Commit 166e9278a3f9 ("x86/ia64: intel-iommu: move to drivers/iommu/")
moved the VT-d driver to drivers/iommu, but left the "F:" line in
MAINTAINERS pointing to drivers/pci, which breaks scripts/get_maintainer.pl.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

8 years agodrm/radeon/kms/atom: fix handling of FB scratch indices
Alex Deucher [Wed, 19 Oct 2011 00:10:05 +0000]
drm/radeon/kms/atom: fix handling of FB scratch indices

FB scratch indices are dword indices, but we were treating
them as byte indices.  As such, we were getting the wrong
FB scratch data for non-0 indices.  Fix the indices and
guard the indexing against indices larger than the scratch
allocation.

Fixes memory corruption on some boards if data was written
past the end of the FB scratch array.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dave Airlie <airlied@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
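
Note on the commit above: a minimal sketch of the dword-versus-byte
distinction and the bounds guard it describes. Every name here
(fb_scratch_read, fb_base_bytes, size_bytes) is hypothetical and chosen
for illustration; this is not the radeon atom interpreter code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper. idx counts 32-bit dwords, while fb_base_bytes and
 * size_bytes count bytes. Returns 0 on success, -1 if the access would
 * run past the end of the scratch allocation. */
int fb_scratch_read(const uint32_t *scratch, size_t size_bytes,
                    size_t fb_base_bytes, size_t idx, uint32_t *out)
{
    /* Byte offset one past the dword being read. */
    size_t end = fb_base_bytes + (idx + 1) * sizeof(uint32_t);

    if (end > size_bytes)
        return -1;              /* guard against indexing past the array */

    /* Convert the byte base to a dword base, then add the dword index. */
    *out = scratch[fb_base_bytes / sizeof(uint32_t) + idx];
    return 0;
}
```

Treating idx as a byte offset instead would only give the right dword
for index 0, which matches the symptom the commit message describes.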

8 years agopptp: pptp_rcv_core() misses pskb_may_pull() call
Eric Dumazet [Mon, 17 Oct 2011 17:59:53 +0000]
pptp: pptp_rcv_core() misses pskb_may_pull() call

e1000e uses paged frags, so any layer incorrectly pulling bytes from an skb
can trigger a BUG in skb_pull():

[951.142737]  [<ffffffff813d2f36>] skb_pull+0x15/0x17
[951.142737]  [<ffffffffa0286824>] pptp_rcv_core+0x126/0x19a [pptp]
[951.152725]  [<ffffffff813d17c4>] sk_receive_skb+0x69/0x105
[951.163558]  [<ffffffffa0286993>] pptp_rcv+0xc8/0xdc [pptp]
[951.165092]  [<ffffffffa02800a3>] gre_rcv+0x62/0x75 [gre]
[951.165092]  [<ffffffff81410784>] ip_local_deliver_finish+0x150/0x1c1
[951.177599]  [<ffffffff81410634>] ? ip_local_deliver_finish+0x0/0x1c1
[951.177599]  [<ffffffff81410846>] NF_HOOK.clone.7+0x51/0x58
[951.177599]  [<ffffffff81410996>] ip_local_deliver+0x51/0x55
[951.177599]  [<ffffffff814105b9>] ip_rcv_finish+0x31a/0x33e
[951.177599]  [<ffffffff8141029f>] ? ip_rcv_finish+0x0/0x33e
[951.204898]  [<ffffffff81410846>] NF_HOOK.clone.7+0x51/0x58
[951.214651]  [<ffffffff81410bb5>] ip_rcv+0x21b/0x246

pptp_rcv_core() is a nice example of a function assuming everything it
needs is available in skb head.

Reported-by: Bradley Peterson <despite@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8 years agotproxy: copy transparent flag when creating a time wait
KOVACS Krisztian [Tue, 18 Oct 2011 10:17:35 +0000]
tproxy: copy transparent flag when creating a time wait

The transparent socket option setting was not copied to the time wait
socket when an inet socket was being replaced by a time wait socket. This
broke the --transparent option of the socket match and may have caused
FIN packets belonging to sockets in FIN_WAIT2 or TIME_WAIT state
to be dropped by the packet filter.

Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>

8 years agopptp: fix skb leak in pptp_xmit()
Eric Dumazet [Mon, 17 Oct 2011 17:01:47 +0000]
pptp: fix skb leak in pptp_xmit()

If we can't transmit the skb, we must free it.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

8 years agobonding: use local function pointer of bond->recv_probe in bond_handle_frame
Mitsuo Hayasaka [Wed, 12 Oct 2011 16:04:29 +0000]
bonding: use local function pointer of bond->recv_probe in bond_handle_frame

The bond->recv_probe is called in bond_handle_frame() when
a packet is received, but bond_close() sets it to NULL. So,
a panic occurs when both functions work in parallel.

Why this happens:
After null pointer check of bond->recv_probe, an sk_buff is
duplicated and bond->recv_probe is called in bond_handle_frame.
So, a panic occurs when bond_close() is called between the
check and call of bond->recv_probe.

Patch:
This patch uses a local function pointer of bond->recv_probe
in bond_handle_frame(). So, it can avoid the null pointer
dereference.

Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
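
Note on the commit above: the check-then-call race and the local-pointer
fix can be sketched in user space as follows. This is an illustration
under assumptions, not the bonding driver's code: the struct and
function names are hypothetical, and C11 atomics are used here as a
stand-in for whatever single-read primitive the kernel patch relies on.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef int (*recv_probe_fn)(int frame);

/* Hypothetical stand-in for the bonding structure. */
struct bond_sketch {
    _Atomic(recv_probe_fn) recv_probe;
};

/* Read the shared pointer exactly once into a local, then check and call
 * the local copy. A concurrent close can no longer turn the pointer into
 * NULL between the check and the call. */
int bond_handle_frame_sketch(struct bond_sketch *bond, int frame)
{
    recv_probe_fn probe = atomic_load(&bond->recv_probe);

    if (probe == NULL)
        return -1;
    return probe(frame);
}

/* Counterpart of the close path: clears the shared pointer. */
void bond_close_sketch(struct bond_sketch *bond)
{
    atomic_store(&bond->recv_probe, (recv_probe_fn)0);
}

/* Example probe used only to exercise the sketch. */
int sample_probe(int frame)
{
    return frame * 2;
}
```

The buggy shape would test bond->recv_probe and then read it a second
time for the call, which leaves a window for the close path to run in
between.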

8 years agosmsc911x: Add support for SMSC LAN89218
Phil Edworthy [Wed, 12 Oct 2011 02:29:39 +0000]
smsc911x: Add support for SMSC LAN89218

LAN89218 is register compatible with LAN911x.

Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8 years agotg3: negate USE_PHYLIB flag check
Jiri Pirko [Tue, 11 Oct 2011 23:00:41 +0000]
tg3: negate USE_PHYLIB flag check

The USE_PHYLIB flag in tg3_remove_one() is being checked incorrectly. As a
result, tg3_phy_fini->phy_disconnect is never called when the tg3 module
is removed.

In my case this resulted in panics in phy_state_machine calling function
phydev->adjust_link.

So correct this check.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Acked-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8 years agonetconsole: enable netconsole can make net_device refcnt incorrent
Gao feng [Tue, 11 Oct 2011 16:08:11 +0000]
netconsole: enable netconsole can make net_device refcnt incorrent

There is no check for whether netconsole is currently enabled,
so each time "echo 1 > enabled" is executed,
the reference count of the net_device is incremented again.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
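
Note on the commit above: the missing check amounts to making the
enable operation idempotent. A minimal sketch, assuming a simplified
target with an integer reference count; the struct and function names
are hypothetical and the real netconsole code differs.

```c
/* Hypothetical stand-ins for net_device and the netconsole target. */
struct netdev_sketch {
    int refcnt;
};

struct target_sketch {
    int enabled;
    struct netdev_sketch *dev;
};

/* Returns 0 on a real state change, -1 on an invalid or redundant request. */
int target_set_enabled(struct target_sketch *t, int enabled)
{
    if (enabled != 0 && enabled != 1)
        return -1;
    if (enabled == t->enabled)
        return -1;          /* already in that state: leave the refcount alone */

    if (enabled)
        t->dev->refcnt++;   /* take a reference when enabling */
    else
        t->dev->refcnt--;   /* drop it when disabling */

    t->enabled = enabled;
    return 0;
}
```

Without the redundancy check, writing 1 twice would take two references
but a later write of 0 would drop only one.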

8 years agobluetooth: Properly clone LSM attributes to newly created child connections
Paul Moore [Fri, 7 Oct 2011 09:40:59 +0000]
bluetooth: Properly clone LSM attributes to newly created child connections

The Bluetooth stack has internal connection handlers for all of the various
Bluetooth protocols, and unfortunately, they are currently lacking the LSM
hooks found in the core network stack's connection handlers.  I say
unfortunately, because this can cause problems for users who have an
LSM enabled and are using certain Bluetooth devices.  See one problem
report below:

 * http://bugzilla.redhat.com/show_bug.cgi?id=741703

In order to keep things simple at this point in time, this patch fixes the
problem by cloning the parent socket's LSM attributes to the newly created
child socket.  If we decide we need a more elaborate LSM marking mechanism
for Bluetooth (I somewhat doubt this) we can always revisit this decision
in the future.

Reported-by: James M. Cape <jcape@ignore-your.tv>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

8 years agol2tp: fix a potential skb leak in l2tp_xmit_skb()
Eric Dumazet [Fri, 7 Oct 2011 05:35:46 +0000]
l2tp: fix a potential skb leak in l2tp_xmit_skb()

l2tp_xmit_skb() can leak one skb if skb_cow_head() returns an error.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8 years agobridge: fix hang on removal of bridge via netlink
stephen hemminger [Thu, 6 Oct 2011 11:19:41 +0000]
bridge: fix hang on removal of bridge via netlink

Need to clean up bridge device timers and ports when the bridge
device is being removed via netlink.

This fixes the problem observed when doing:
 ip link add br0 type bridge
 ip link set dev eth1 master br0
 ip link set br0 up
 ip link del br0

which would cause br0 to hang in unregister_netdev because
of leftover reference count.

Reported-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8 years agocputimer: Cure lock inversion
Peter Zijlstra [Mon, 17 Oct 2011 09:50:30 +0000]
cputimer: Cure lock inversion

There's a lock inversion between the cputimer->lock and rq->lock;
notably the two callchains involved are:

 update_rlimit_cpu()
   sighand->siglock
   set_process_cpu_timer()
     cpu_timer_sample_group()
       thread_group_cputimer()
         cputimer->lock
         thread_group_cputime()
           task_sched_runtime()
             ->pi_lock
             rq->lock

 scheduler_tick()
   rq->lock
   task_tick_fair()
     update_curr()
       account_group_exec()
         cputimer->lock

Where the first one is enabling a CLOCK_PROCESS_CPUTIME_ID timer, and
the second one is keeping it up to date.

This problem was introduced by e8abccb7193 ("posix-cpu-timers: Cure
SMP accounting oddities").

Cure the problem by removing the cputimer->lock and rq->lock nesting.
This leaves concurrent enablers doing duplicate work, but the time
wasted should be on the same order as the time otherwise wasted spinning
on the lock, and the greater-than assignment filter should ensure we
preserve monotonicity.

Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Simon Kirby <sim@hostway.ca>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/1318928713.21167.4.camel@twins
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
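
Note on the commit above: the "greater-than assignment filter" can be
pictured as a per-field maximum, so that a stale sample from a
concurrent enabler never moves a counter backwards. The sketch below is
a single-threaded illustration with hypothetical names and field types;
locking is deliberately left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for the accumulated thread-group times. */
struct cputime_sketch {
    uint64_t utime;
    uint64_t stime;
    uint64_t sum_exec_runtime;
};

/* Copy each field of the fresh sample b into a only when it is larger,
 * so every field of a is monotonically non-decreasing. */
void update_gt_cputime_sketch(struct cputime_sketch *a,
                              const struct cputime_sketch *b)
{
    if (b->utime > a->utime)
        a->utime = b->utime;
    if (b->stime > a->stime)
        a->stime = b->stime;
    if (b->sum_exec_runtime > a->sum_exec_runtime)
        a->sum_exec_runtime = b->sum_exec_runtime;
}
```

Two enablers that each compute a sample and apply it this way do
duplicate work, but whichever applies second cannot undo progress made
by the first.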