8 years agoKVM: Add resampling irqfds for level triggered interrupts
Alex Williamson [Fri, 21 Sep 2012 17:58:03 +0000]
KVM: Add resampling irqfds for level triggered interrupts

To emulate level triggered interrupts, add a resample option to
KVM_IRQFD.  When specified, a new resamplefd is provided that notifies
the user when the irqchip has been resampled by the VM.  This may, for
instance, indicate an EOI.  Also in this mode, posting of an interrupt
through an irqfd only asserts the interrupt.  On resampling, the
interrupt is automatically de-asserted prior to user notification.
This enables level triggered interrupts to be posted and re-enabled
from vfio with no userspace intervention.

All resampling irqfds can make use of a single irq source ID, so we
reserve a new one for this interface.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: optimize apic interrupt delivery
Gleb Natapov [Thu, 13 Sep 2012 14:19:24 +0000]
KVM: optimize apic interrupt delivery

Most interrupt are delivered to only one vcpu. Use pre-build tables to
find interrupt destination instead of looping through all vcpus. In case
of logical mode loop only through vcpus in a logical cluster irq is sent
to.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoMerge branch 'queue' into next
Avi Kivity [Thu, 20 Sep 2012 12:04:41 +0000]
Merge branch 'queue' into next

* queue:
  KVM: MMU: Eliminate pointless temporary 'ac'
  KVM: MMU: Avoid access/dirty update loop if all is well
  KVM: MMU: Eliminate eperm temporary
  KVM: MMU: Optimize is_last_gpte()
  KVM: MMU: Simplify walk_addr_generic() loop
  KVM: MMU: Optimize pte permission checks
  KVM: MMU: Update accessed and dirty bits after guest pagetable walk
  KVM: MMU: Move gpte_access() out of paging_tmpl.h
  KVM: MMU: Optimize gpte_access() slightly
  KVM: MMU: Push clean gpte write protection out of gpte_access()
  KVM: clarify kvmclock documentation
  KVM: make processes waiting on vcpu mutex killable
  KVM: SVM: Make use of asm.h
  KVM: VMX: Make use of asm.h
  KVM: VMX: Make lto-friendly

Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: MMU: Eliminate pointless temporary 'ac'
Avi Kivity [Wed, 19 Sep 2012 16:33:48 +0000]
KVM: MMU: Eliminate pointless temporary 'ac'

'ac' essentially reconstructs the 'access' variable we already
have, except for the PFERR_PRESENT_MASK and PFERR_RSVD_MASK.  As
these are not used by callees, just use 'access' directly.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: MMU: Avoid access/dirty update loop if all is well
Avi Kivity [Sun, 16 Sep 2012 12:03:02 +0000]
KVM: MMU: Avoid access/dirty update loop if all is well

Keep track of accessed/dirty bits; if they are all set, do not
enter the accessed/dirty update loop.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: MMU: Eliminate eperm temporary
Avi Kivity [Sun, 16 Sep 2012 11:49:15 +0000]
KVM: MMU: Eliminate eperm temporary

'eperm' is no longer used in the walker loop, so we can eliminate it.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: MMU: Optimize is_last_gpte()
Avi Kivity [Wed, 12 Sep 2012 17:46:56 +0000]
KVM: MMU: Optimize is_last_gpte()

Instead of branchy code depending on level, gpte.ps, and mmu configuration,
prepare everything in a bitmap during mode changes and look it up during
runtime.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: MMU: Simplify walk_addr_generic() loop
Avi Kivity [Wed, 12 Sep 2012 12:12:09 +0000]
KVM: MMU: Simplify walk_addr_generic() loop

The page table walk is coded as an infinite loop, with a special
case on the last pte.

Code it as an ordinary loop with a termination condition on the last
pte (large page or walk length exhausted), and put the last pte handling
code after the loop where it belongs.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: MMU: Optimize pte permission checks
Avi Kivity [Wed, 12 Sep 2012 11:52:00 +0000]
KVM: MMU: Optimize pte permission checks

walk_addr_generic() permission checks are a maze of branchy code, which is
performed four times per lookup.  It depends on the type of access, efer.nxe,
cr0.wp, cr4.smep, and in the near future, cr4.smap.

Optimize this away by precalculating all variants and storing them in a
bitmap.  The bitmap is recalculated when rarely-changing variables change
(cr0, cr4) and is indexed by the often-changing variables (page fault error
code, pte access permissions).

The permission check is moved to the end of the loop, otherwise an SMEP
fault could be reported as a false positive, when PDE.U=1 but PTE.U=0.
Noted by Xiao Guangrong.

The result is short, branch-free code.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: MMU: Update accessed and dirty bits after guest pagetable walk
Avi Kivity [Sun, 16 Sep 2012 11:18:51 +0000]
KVM: MMU: Update accessed and dirty bits after guest pagetable walk

While unspecified, the behaviour of Intel processors is to first
perform the page table walk, then, if the walk was successful, to
atomically update the accessed and dirty bits of walked paging elements.

While we are not required to follow this exactly, doing so will allow us
to perform the access permissions check after the walk is complete, rather
than after each walk step.

(the tricky case is SMEP: a zero in any pte's U bit makes the referenced
page a supervisor page, so we can't fault on a one bit during the walk
itself).

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: MMU: Move gpte_access() out of paging_tmpl.h
Avi Kivity [Wed, 12 Sep 2012 11:03:28 +0000]
KVM: MMU: Move gpte_access() out of paging_tmpl.h

We no longer rely on paging_tmpl.h defines; so we can move the function
to mmu.c.

Rely on zero extension to 64 bits to get the correct nx behaviour.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: MMU: Optimize gpte_access() slightly
Avi Kivity [Wed, 12 Sep 2012 10:53:08 +0000]
KVM: MMU: Optimize gpte_access() slightly

If nx is disabled, then is gpte[63] is set we will hit a reserved
bit set fault before checking permissions; so we can ignore the
setting of efer.nxe.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: MMU: Push clean gpte write protection out of gpte_access()
Avi Kivity [Wed, 12 Sep 2012 10:44:53 +0000]
KVM: MMU: Push clean gpte write protection out of gpte_access()

gpte_access() computes the access permissions of a guest pte and also
write-protects clean gptes.  This is wrong when we are servicing a
write fault (since we'll be setting the dirty bit momentarily) but
correct when instantiating a speculative spte, or when servicing a
read fault (since we'll want to trap a following write in order to
set the dirty bit).

It doesn't seem to hurt in practice, but in order to make the code
readable, push the write protection out of gpte_access() and into
a new protect_clean_gpte() which is called explicitly when needed.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: clarify kvmclock documentation
Stefan Fritsch [Sun, 16 Sep 2012 10:55:40 +0000]
KVM: clarify kvmclock documentation

- mention that system time needs to be added to wallclock time
- positive tsc_shift means left shift, not right
- mention additional 32bit right shift

Signed-off-by: Stefan Fritsch <sf@sfritsch.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: make processes waiting on vcpu mutex killable
Michael S. Tsirkin [Sun, 16 Sep 2012 08:50:30 +0000]
KVM: make processes waiting on vcpu mutex killable

vcpu mutex can be held for unlimited time so
taking it with mutex_lock on an ioctl is wrong:
one process could be passed a vcpu fd and
call this ioctl on the vcpu used by another process,
it will then be unkillable until the owner exits.

Call mutex_lock_killable instead and return status.
Note: mutex_lock_interruptible would be even nicer,
but I am not sure all users are prepared to handle EINTR
from these ioctls. They might misinterpret it as an error.

Cleanup paths expect a vcpu that can't be used by
any userspace so this will always succeed - catch bugs
by calling BUG_ON.

Catch callers that don't check return state by adding
__must_check.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: SVM: Make use of asm.h
Avi Kivity [Sun, 16 Sep 2012 12:10:59 +0000]
KVM: SVM: Make use of asm.h

Use macros for bitness-insensitive register names, instead of
rolling our own.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: VMX: Make use of asm.h
Avi Kivity [Sun, 16 Sep 2012 12:10:58 +0000]
KVM: VMX: Make use of asm.h

Use macros for bitness-insensitive register names, instead of
rolling our own.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: VMX: Make lto-friendly
Avi Kivity [Sun, 16 Sep 2012 12:10:57 +0000]
KVM: VMX: Make lto-friendly

LTO (link-time optimization) doesn't like local labels to be referred to
from a different function, since the two functions may be built in separate
compilation units.  Use an external variable instead.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: x86: lapic: Clean up find_highest_vector() and count_vectors()
Takuya Yoshikawa [Wed, 5 Sep 2012 10:30:01 +0000]
KVM: x86: lapic: Clean up find_highest_vector() and count_vectors()

find_highest_vector() and count_vectors():
 - Instead of using magic values, define and use proper macros.

find_highest_vector():
 - Remove likely() which is there only for historical reasons and not
   doing correct branch predictions anymore.  Using such heuristics
   to optimize this function is not worth it now.  Let CPUs predict
   things instead.

 - Stop checking word[0] separately.  This was only needed for doing
   likely() optimization.

 - Use for loop, not while, to iterate over the register array to make
   the code clearer.

Note that we actually confirmed that the likely() did wrong predictions
by inserting debug code.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: MMU: remove unnecessary check
Xiao Guangrong [Fri, 7 Sep 2012 06:15:03 +0000]
KVM: MMU: remove unnecessary check

Checking the return of kvm_mmu_get_page is unnecessary since it is
guaranteed by memory cache

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: Depend on HIGH_RES_TIMERS
Liu, Jinsong [Sun, 9 Sep 2012 22:55:39 +0000]
KVM: Depend on HIGH_RES_TIMERS

KVM lapic timer and tsc deadline timer based on hrtimer,
setting a leftmost node to rb tree and then do hrtimer reprogram.
If hrtimer not configured as high resolution, hrtimer_enqueue_reprogram
do nothing and then make kvm lapic timer and tsc deadline timer fail.

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: Improve wording of KVM_SET_USER_MEMORY_REGION documentation
Jan Kiszka [Fri, 7 Sep 2012 11:17:47 +0000]
KVM: Improve wording of KVM_SET_USER_MEMORY_REGION documentation

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: use symbolic constant for nr interrupts
Michael S. Tsirkin [Wed, 5 Sep 2012 17:00:52 +0000]
KVM: use symbolic constant for nr interrupts

interrupt_bitmap is KVM_NR_INTERRUPTS bits in size,
so just use that instead of hard-coded constants
and math.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: emulator: optimize "rep ins" handling
Gleb Natapov [Mon, 3 Sep 2012 12:24:29 +0000]
KVM: emulator: optimize "rep ins" handling

Optimize "rep ins" by allowing emulator to write back more than one
datum at a time. Introduce new operand type OP_MEM_STR which tells
writeback() that dst contains pointer to an array that should be written
back as opposite to just one data element.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: emulator: string_addr_inc() cleanup
Gleb Natapov [Mon, 3 Sep 2012 12:24:28 +0000]
KVM: emulator: string_addr_inc() cleanup

Remove unneeded segment argument. Address structure already has correct
segment which was put there during decode.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: emulator: make x86 emulation modes enum instead of defines
Gleb Natapov [Mon, 3 Sep 2012 12:24:27 +0000]
KVM: emulator: make x86 emulation modes enum instead of defines

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: Provide userspace IO exit completion callback
Gleb Natapov [Mon, 3 Sep 2012 12:24:26 +0000]
KVM: Provide userspace IO exit completion callback

Current code assumes that IO exit was due to instruction emulation
and handles execution back to emulator directly. This patch adds new
userspace IO exit completion callback that can be set by any other code
that caused IO exit to userspace.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: move postcommit flush to x86, as mmio sptes are x86 specific
Marcelo Tosatti [Tue, 28 Aug 2012 20:43:26 +0000]
KVM: move postcommit flush to x86, as mmio sptes are x86 specific

Other arches do not need this.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

v2: fix incorrect deletion of mmio sptes on gpa move (noticed by Takuya)
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: perform an invalid memslot step for gpa base change
Marcelo Tosatti [Fri, 24 Aug 2012 18:54:58 +0000]
KVM: perform an invalid memslot step for gpa base change

PPC must flush all translations before the new memory slot
is visible.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: split kvm_arch_flush_shadow
Marcelo Tosatti [Fri, 24 Aug 2012 18:54:57 +0000]
KVM: split kvm_arch_flush_shadow

Introducing kvm_arch_flush_shadow_memslot, to invalidate the
translations of a single memory slot.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: SVM: constify lookup tables
Mathias Krause [Wed, 29 Aug 2012 23:30:20 +0000]
KVM: SVM: constify lookup tables

We never modify direct_access_msrs[], msrpm_ranges[],
svm_exit_handlers[] or x86_intercept_map[] at runtime.
Mark them r/o.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: VMX: constify lookup tables
Mathias Krause [Wed, 29 Aug 2012 23:30:19 +0000]
KVM: VMX: constify lookup tables

We use vmcs_field_to_offset_table[], kvm_vmx_segment_fields[] and
kvm_vmx_exit_handlers[] as lookup tables only -- make them r/o.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: x86: more constification
Mathias Krause [Wed, 29 Aug 2012 23:30:18 +0000]
KVM: x86: more constification

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: x86: constify read_write_emulator_ops
Mathias Krause [Wed, 29 Aug 2012 23:30:17 +0000]
KVM: x86: constify read_write_emulator_ops

We never change those, make them r/o.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: x86 emulator: constify emulate_ops
Mathias Krause [Wed, 29 Aug 2012 23:30:16 +0000]
KVM: x86 emulator: constify emulate_ops

We never change emulate_ops[] at runtime so it should be r/o.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: x86 emulator: mark opcode tables const
Mathias Krause [Wed, 29 Aug 2012 23:30:15 +0000]
KVM: x86 emulator: mark opcode tables const

The opcode tables never change at runtime, therefor mark them const.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: x86 emulator: use aligned variants of SSE register ops
Mathias Krause [Wed, 29 Aug 2012 23:30:14 +0000]
KVM: x86 emulator: use aligned variants of SSE register ops

As the the compiler ensures that the memory operand is always aligned
to a 16 byte memory location, use the aligned variant of MOVDQ for
read_sse_reg() and write_sse_reg().

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: x86: minor size optimization
Mathias Krause [Wed, 29 Aug 2012 23:30:13 +0000]
KVM: x86: minor size optimization

Some fields can be constified and/or made static to reduce code and data
size.

Numbers for a 32 bit build:

        text    data     bss     dec     hex filename
before: 3351      80       0    3431     d67 cpuid.o
 after: 3391       0       0    3391     d3f cpuid.o

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: cleanup pic reset
Gleb Natapov [Mon, 3 Sep 2012 11:47:25 +0000]
KVM: cleanup pic reset

kvm_pic_reset() is not used anywhere. Move reset logic from
pic_ioport_write() there.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8 years agoKVM: x86: remove unused variable from kvm_task_switch()
Marcelo Tosatti [Thu, 30 Aug 2012 20:45:54 +0000]
KVM: x86: remove unused variable from kvm_task_switch()

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: VMX: Ignore segment G and D bits when considering whether we can virtualize
Avi Kivity [Tue, 21 Aug 2012 14:07:10 +0000]
KVM: VMX: Ignore segment G and D bits when considering whether we can virtualize

We will enter the guest with G and D cleared; as real hardware ignores D in
real mode, and G is taken care of by the limit test, we allow more code to
run in vm86 mode.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: VMX: Save all segment data in real mode
Avi Kivity [Tue, 21 Aug 2012 14:07:09 +0000]
KVM: VMX: Save all segment data in real mode

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: VMX: Preserve segment limit and access rights in real mode
Avi Kivity [Tue, 21 Aug 2012 14:07:08 +0000]
KVM: VMX: Preserve segment limit and access rights in real mode

While this is undocumented, real processors do not reload the segment
limit and access rights when loading a segment register in real mode.
Real programs rely on it so we need to comply with this behaviour.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: VMX: Return real real-mode segment data even if emulate_invalid_guest_state=1
Avi Kivity [Tue, 21 Aug 2012 14:07:07 +0000]
KVM: VMX: Return real real-mode segment data even if emulate_invalid_guest_state=1

emulate_invalid_guest_state=1 doesn't mean we don't munge the segments in the
vmcs; we do.  So we need to return the real ones (maintained by vmx_set_segment).

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: x86 emulator: Fix #GP error code during linearization
Avi Kivity [Tue, 21 Aug 2012 14:07:06 +0000]
KVM: x86 emulator: Fix #GP error code during linearization

We want the segment selector, nor segment number.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: x86 emulator: Check segment limits in real mode too
Avi Kivity [Tue, 21 Aug 2012 14:07:05 +0000]
KVM: x86 emulator: Check segment limits in real mode too

Segment limits are verified in real mode, not just protected mode.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: x86 emulator: Leave segment limit and attributs alone in real mode
Avi Kivity [Tue, 21 Aug 2012 14:07:04 +0000]
KVM: x86 emulator: Leave segment limit and attributs alone in real mode

When loading a segment in real mode, only the base and selector must
be modified.  The limit needs to be left alone, otherwise big real mode
users will hit a #GP due to limit checking (currently this is suppressed
because we don't check limits in real mode).

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: VMX: Allow vm86 virtualization of big real mode
Avi Kivity [Tue, 21 Aug 2012 14:07:03 +0000]
KVM: VMX: Allow vm86 virtualization of big real mode

Usually, big real mode uses large (4GB) segments.  Currently we don't
virtualize this; if any segment has a limit other than 0xffff, we emulate.
But if we set the vmx-visible limit to 0xffff, we can use vm86 to virtualize
real mode; if an access overruns the segment limit, the guest will #GP, which
we will trap and forward to the emulator.  This results in significantly
faster execution, and less risk of hitting an unemulated instruction.

If the limit is less than 0xffff, we retain the existing behaviour.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: VMX: Allow real mode emulation using vm86 with dpl=0
Avi Kivity [Tue, 21 Aug 2012 14:07:02 +0000]
KVM: VMX: Allow real mode emulation using vm86 with dpl=0

Real mode is always entered from protected mode with dpl=0.  Since
the dpl doesn't affect execution, and we already override it to 3
in the vmcs (as vmx requires), we can allow execution in that state.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: VMX: Retain limit and attributes when entering protected mode
Avi Kivity [Tue, 21 Aug 2012 14:07:01 +0000]
KVM: VMX: Retain limit and attributes when entering protected mode

Real processors don't change segment limits and attributes while in
real mode.  Mimic that behaviour.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: VMX: Use kvm_segment to save protected-mode segments when entering realmode
Avi Kivity [Tue, 21 Aug 2012 14:07:00 +0000]
KVM: VMX: Use kvm_segment to save protected-mode segments when entering realmode

Instead of using struct kvm_save_segment, use struct kvm_segment, which is what
the other APIs use.  This leads to some simplification.

We replace save_rmode_seg() with a call to vmx_save_segment().  Since this depends
on rmode.vm86_active, we move the call to before setting the flag.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: VMX: Fix incorrect lookup of segment S flag in fix_pmode_dataseg()
Avi Kivity [Tue, 21 Aug 2012 14:06:59 +0000]
KVM: VMX: Fix incorrect lookup of segment S flag in fix_pmode_dataseg()

fix_pmode_dataseg() looks up S in ->base instead of ->ar_bytes.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: VMX: Separate saving pre-realmode state from setting segments
Avi Kivity [Tue, 21 Aug 2012 14:06:58 +0000]
KVM: VMX: Separate saving pre-realmode state from setting segments

Commit b246dd5df139 ("KVM: VMX: Fix KVM_SET_SREGS with big real mode
segments") moved fix_rmode_seg() to vmx_set_segment(), so that it is
applied not just on transitions to real mode, but also on KVM_SET_SREGS
(migration).  However fix_rmode_seg() not only munges the vmcs segments,
it also sets up the save area for us to restore when returning to
protected mode or to return in vmx_get_segment().

Move saving the segment into a new function, save_rmode_seg(), and
call it just during the transition.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: x86 emulator: access GPRs on demand
Avi Kivity [Mon, 27 Aug 2012 20:46:17 +0000]
KVM: x86 emulator: access GPRs on demand

Instead of populating the entire register file, read in registers
as they are accessed, and write back only the modified ones.  This
saves a VMREAD and VMWRITE on Intel (for rsp, since it is not usually
used during emulation), and a two 128-byte copies for the registers.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoKVM: PPC: book3s: fix build error caused by gfn_to_hva_memslot()
Gavin Shan [Fri, 24 Aug 2012 08:50:28 +0000]
KVM: PPC: book3s: fix build error caused by gfn_to_hva_memslot()

The build error was caused by that builtin functions are calling
the functions implemented in modules. This error was introduced by
commit 4d8b81abc4 ("KVM: introduce readonly memslot").

The patch fixes the build error by moving function __gfn_to_hva_memslot()
from kvm_main.c to kvm_host.h and making that "inline" so that the
builtin function (kvmppc_h_enter) can use that.

Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoMerge remote-tracking branch 'upstream/master' into queue
Marcelo Tosatti [Sun, 26 Aug 2012 16:58:41 +0000]
Merge remote-tracking branch 'upstream/master' into queue

Merging critical fixes from upstream required for development.

* upstream/master: (809 commits)
  libata: Add a space to " 2GB ATA Flash Disk" DMA blacklist entry
  Revert "powerpc: Update g5_defconfig"
  powerpc/perf: Use pmc_overflow() to detect rolled back events
  powerpc: Fix VMX in interrupt check in POWER7 copy loops
  powerpc: POWER7 copy_to_user/copy_from_user patch applied twice
  powerpc: Fix personality handling in ppc64_personality()
  powerpc/dma-iommu: Fix IOMMU window check
  powerpc: Remove unnecessary ifdefs
  powerpc/kgdb: Restore current_thread_info properly
  powerpc/kgdb: Bail out of KGDB when we've been triggered
  powerpc/kgdb: Do not set kgdb_single_step on ppc
  powerpc/mpic_msgr: Add missing includes
  powerpc: Fix null pointer deref in perf hardware breakpoints
  powerpc: Fixup whitespace in xmon
  powerpc: Fix xmon dl command for new printk implementation
  xfs: check for possible overflow in xfs_ioc_trim
  xfs: unlock the AGI buffer when looping in xfs_dialloc
  xfs: fix uninitialised variable in xfs_rtbuf_get()
  powerpc/fsl: fix "Failed to mount /dev: No such device" errors
  powerpc/fsl: update defconfigs
  ...

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

8 years agoMerge tag 'fixes-3.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Linus Torvalds [Sun, 26 Aug 2012 00:33:33 +0000]
Merge tag 'fixes-3.6-rc3' of git://git./linux/kernel/git/arm/arm-soc

Pull arm-soc fixes from Arnd Bergmann:
 "Bug fixes for various ARM platforms.  About half of these are for OMAP
  and submitted before but did not make it into v3.6-rc2."

* tag 'fixes-3.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (39 commits)
  ARM: ux500: don't select LEDS_GPIO for snowball
  ARM: imx: build i.MX6 functions only when needed
  ARM: imx: select CPU_FREQ_TABLE when needed
  ARM: imx: fix ksz9021rn_phy_fixup
  ARM: imx: build pm-imx5 code only when PM is enabled
  ARM: omap: allow building omap44xx without SMP
  ARM: dts: imx51-babbage: fix esdhc cd/wp properties
  ARM: imx6: spin the cpu until hardware takes it down
  ARM: ux500: Ensure probing of Audio devices when Device Tree is enabled
  ARM: ux500: Fix merge error, no matching driver name for 'snd_soc_u8500'
  ARM i.MX6q: Add virtual 1/3.5 dividers in the LDB clock path
  ARM: Kirkwood: fix Makefile.boot
  ARM: Kirkwood: Fix iconnect leds
  ARM: Orion: Set eth packet size csum offload limit
  ARM: mv78xx0: fix win_cfg_base prototype
  ARM: OMAP: dmtimers: Fix locking issue in omap_dm_timer_request*()
  ARM: mmp: fix potential NULL dereference
  ARM: OMAP4: Register the OPP table only for 4430 device
  cpufreq: OMAP: Handle missing frequency table on SMP systems
  ARM: OMAP4: sleep: Save the complete used register stack frame
  ...

8 years agoMerge tag 'stable/for-linus-3.6-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 26 Aug 2012 00:31:59 +0000]
Merge tag 'stable/for-linus-3.6-rc3-tag' of git://git./linux/kernel/git/konrad/xen

Pull three xen bug-fixes from Konrad Rzeszutek Wilk:
 - Revert the kexec fix which caused on non-kexec shutdowns a race.
 - Reuse existing P2M leafs - instead of requiring to allocate a large
   area of bootup virtual address estate.
 - Fix a one-off error when adding PFNs for balloon pages.

* tag 'stable/for-linus-3.6-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/setup: Fix one-off error when adding for-balloon PFNs to the P2M.
  xen/p2m: Reuse existing P2M leafs if they are filled with 1:1 PFNs or INVALID.
  Revert "xen PVonHVM: move shared_info to MMIO before kexec"

8 years agoMerge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Linus Torvalds [Sun, 26 Aug 2012 00:30:18 +0000]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc

Pull powerpc fixes from Benjamin Herrenschmidt:
 "I meant to sent that earlier but got swamped with other things, so
  here are some powerpc fixes for 3.6.  A few regression fixes and some
  bug fixes that I deemed should still make it.

  There's a FSL update from Kumar with a bunch of defconfig updates
  along with a few embedded fixes.

  I also reverted my g5_defconfig update that I merged earlier as it was
  completely busted, not too sure what happened there, I'll do a new one
  later."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  Revert "powerpc: Update g5_defconfig"
  powerpc/perf: Use pmc_overflow() to detect rolled back events
  powerpc: Fix VMX in interrupt check in POWER7 copy loops
  powerpc: POWER7 copy_to_user/copy_from_user patch applied twice
  powerpc: Fix personality handling in ppc64_personality()
  powerpc/dma-iommu: Fix IOMMU window check
  powerpc: Remove unnecessary ifdefs
  powerpc/kgdb: Restore current_thread_info properly
  powerpc/kgdb: Bail out of KGDB when we've been triggered
  powerpc/kgdb: Do not set kgdb_single_step on ppc
  powerpc/mpic_msgr: Add missing includes
  powerpc: Fix null pointer deref in perf hardware breakpoints
  powerpc: Fixup whitespace in xmon
  powerpc: Fix xmon dl command for new printk implementation
  powerpc/fsl: fix "Failed to mount /dev: No such device" errors
  powerpc/fsl: update defconfigs
  booke/wdt: some ioctls do not return values properly
  powerpc/p4080ds: dts - add usb controller version info and port0
  powerpc/85xx: mpc85xx_defconfig - add VIA PATA support for MPC85xxCDS
  powerpc/fsl-pci: Only scan PCI bus if configured as a host

8 years agoMerge git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sun, 26 Aug 2012 00:27:17 +0000]
Merge git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Marcelo Tosatti.

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86 emulator: use stack size attribute to mask rsp in stack ops
  KVM: MMU: Fix mmu_shrink() so that it can free mmu pages as intended
  ppc: e500_tlb memset clears nothing
  KVM: PPC: Add cache flush on page map
  KVM: PPC: Book3S HV: Fix incorrect branch in H_CEDE code
  KVM: x86: update KVM_SAVE_MSRS_BEGIN to correct value

8 years agoMerge tag 'for-linus-v3.6-rc4' of git://oss.sgi.com/xfs/xfs
Linus Torvalds [Sat, 25 Aug 2012 18:47:06 +0000]
Merge tag 'for-linus-v3.6-rc4' of git://oss.sgi.com/xfs/xfs

Pull xfs bugfixes from Ben Myers:
 - fix uninitialised variable in xfs_rtbuf_get()
 - unlock the AGI buffer when looping in xfs_dialloc
 - check for possible overflow in xfs_ioc_trim

* tag 'for-linus-v3.6-rc4' of git://oss.sgi.com/xfs/xfs:
  xfs: check for possible overflow in xfs_ioc_trim
  xfs: unlock the AGI buffer when looping in xfs_dialloc
  xfs: fix uninitialised variable in xfs_rtbuf_get()

8 years agoMerge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Linus Torvalds [Sat, 25 Aug 2012 18:45:04 +0000]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus

Pull MIPS fixes from Ralf Baechle:
 "Random fixes across the MIPS tree.  The two hotspots are several bugs
  in the module loader and the ath79 SOC support; also noteworthy is the
  restructuring of the code to synchronize CPU timers across CPUs on
  startup; the old code recently ceased to work due to unrelated
  changes.

  All except one of these patches have sat for a significant time in
  linux-next for testing."

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: pci-ar724x: avoid data bus error due to a missing PCIe module
  MIPS: Malta: Delete duplicate PCI fixup.
  MIPS: ath79: don't hardcode the unavailability of the DSP ASE
  MIPS: Synchronize MIPS count one CPU at a time
  MIPS: BCM63xx: Fix SPI message control register handling for BCM6338/6348.
  MIPS: Module: Deal with malformed HI16/LO16 relocation sequences.
  MIPS: Fix race condition in module relocation code.
  MIPS: Fix memory leak in error path of HI16/LO16 relocation handling.
  MIPS: MTX-1: Add udelay to mtx1_pci_idsel
  MIPS: ath79: select HAVE_CLK
  MIPS: ath79: Use correct IRQ number for the OHCI controller on AR7240
  MIPS: ath79: Fix number of GPIO lines for AR724[12]
  MIPS: Octeon: Fix broken interrupt controller code.

8 years agoMerge branch 'for-3.6' of git://linux-nfs.org/~bfields/linux
Linus Torvalds [Sat, 25 Aug 2012 18:43:41 +0000]
Merge branch 'for-3.6' of git://linux-nfs.org/~bfields/linux

Pull nfsd bugfixes from J. Bruce Fields:
 "Particular thanks to Michael Tokarev, Malahal Naineni, and Jamie
  Heilman for their testing and debugging help."

* 'for-3.6' of git://linux-nfs.org/~bfields/linux:
  svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping
  svcrpc: sends on closed socket should stop immediately
  svcrpc: fix BUG() in svc_tcp_clear_pages
  nfsd4: fix security flavor of NFSv4.0 callback

8 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 25 Aug 2012 18:36:43 +0000]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block-related fixes from Jens Axboe:

 - Improvements to the buffered and direct write IO plugging from
   Fengguang.

 - Abstract out the mapping of a bio in a request, and use that to
   provide a blk_bio_map_sg() helper.  Useful for mapping just a bio
   instead of a full request.

 - Regression fix from Hugh, fixing up a patch that went into the
   previous release cycle (and marked stable, too) attempting to prevent
   a loop in __getblk_slow().

 - Updates to discard requests, fixing up the sizing and how we align
   them.  Also a change to disallow merging of discard requests, since
   that doesn't really work properly yet.

 - A few drbd fixes.

 - Documentation updates.

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: replace __getblk_slow misfix by grow_dev_page fix
  drbd: Write all pages of the bitmap after an online resize
  drbd: Finish requests that completed while IO was frozen
  drbd: fix drbd wire compatibility for empty flushes
  Documentation: update tunable options in block/cfq-iosched.txt
  Documentation: update tunable options in block/cfq-iosched.txt
  Documentation: update missing index files in block/00-INDEX
  block: move down direct IO plugging
  block: remove plugging at buffered write time
  block: disable discard request merge temporarily
  bio: Fix potential memory leak in bio_find_or_create_slab()
  block: Don't use static to define "void *p" in show_partition_start()
  block: Add blk_bio_map_sg() helper
  block: Introduce __blk_segment_map_sg() helper
  fs/block-dev.c:fix performance regression in O_DIRECT writes to md block devices
  block: split discard into aligned requests
  block: reorganize rounding of max_discard_sectors

8 years agoMerge tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik...
Linus Torvalds [Sat, 25 Aug 2012 17:28:19 +0000]
Merge tag 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev

Pull libata fixes from Jeff Garzik:
 - libata-acpi regression fix
 - additional or corrected drive quirks for ata_blacklist
 - Kconfig text tweaking
 - new PCI IDs
 - pata_atiixp: quirk for MSI motherboard
 - export ahci_dev_classify for an ahci_platform driver

* tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: Add a space to " 2GB ATA Flash Disk" DMA blacklist entry
  [libata] new quirk, lift bridge limits for Buffalo DriveStation Quattro
  [libata] Kconfig: Elaborate that SFF is meant for legacy and PATA stuff
  [libata] acpi: call ata_acpi_gtm during ata port init time
  ata_piix: Add Device IDs for Intel Lynx Point-LP PCH
  ahci: Add Device IDs for Intel Lynx Point-LP PCH
  pata_atiixp: override cable detection on MSI E350DM-E33
  ahci: un-staticize ahci_dev_classify

8 years agolibata: Add a space to " 2GB ATA Flash Disk" DMA blacklist entry
Prarit Bhargava [Thu, 23 Aug 2012 19:11:52 +0000]
libata: Add a space to " 2GB ATA Flash Disk" DMA blacklist entry

commit d70e551c8e1ecb6f20422f8db6bfe6a0049edcb8, Add " 2GB ATA Flash
Disk"/"ADMA428M" to DMA blacklist, should have added a space before 2GB.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

8 years agoRevert "powerpc: Update g5_defconfig"
Benjamin Herrenschmidt [Fri, 24 Aug 2012 10:55:55 +0000]
Revert "powerpc: Update g5_defconfig"

This reverts commit b1acf1bb544cf28c1f4be0a45620fa899c74b7e9.

Something went horribly wrong when I did savedefconfig, not sure what,
but what's in there is busted so let's revert it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8 years agopowerpc/perf: Use pmc_overflow() to detect rolled back events
Sukadev Bhattiprolu [Tue, 7 Aug 2012 15:07:19 +0000]
powerpc/perf: Use pmc_overflow() to detect rolled back events

For certain speculative events on Power7, 'perf stat' reports far higher
event count than 'perf record' for the same event.

As described in following commit, a performance monitor exception is raised
even when the the performance events are rolled back.

        commit 0837e3242c73566fc1c0196b4ec61779c25ffc93
        Author: Anton Blanchard <anton@samba.org>
        Date:   Wed Mar 9 14:38:42 2011 +1100

perf_event_interrupt() records an event only when an overflow occurs. But
this check for overflow is a simple 'if (val < 0)'.

Because the events are rolled back, this check for overflow fails and the
event is not recorded. perf_event_interrupt() later uses pmc_overflow() to
detect the overflow and resets the counters and the events are lost completely.

To properly detect the overflow of rolled back events, use pmc_overflow()
even when recording events.

To reproduce:
        $ cat strcpy.c
        #include <stdio.h>
        #include <string.h>
        main()
        {
                char buf[256];

                alarm(5);
                while(1)
                        strcpy(buf, "string1");
        }

        $ perf record -e r20014 ./strcpy
        $ perf report -n > report.1
        $ perf stat -e r20014 > report.2
        # Compare report.1 and report.2

Reported-by: Maynard Johnson <mpjohn@us.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8 years agopowerpc: Fix VMX in interrupt check in POWER7 copy loops
Anton Blanchard [Tue, 7 Aug 2012 17:51:41 +0000]
powerpc: Fix VMX in interrupt check in POWER7 copy loops

The enhanced prefetch hint patches corrupt the condition register
that was used to check if we are in interrupt. Fix this by using cr1.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8 years agopowerpc: POWER7 copy_to_user/copy_from_user patch applied twice
Anton Blanchard [Tue, 7 Aug 2012 17:50:46 +0000]
powerpc: POWER7 copy_to_user/copy_from_user patch applied twice

"powerpc: Use enhanced touch instructions in POWER7
copy_to_user/copy_from_user" was applied twice. Remove one.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8 years agopowerpc: Fix personality handling in ppc64_personality()
Jiri Kosina [Mon, 13 Aug 2012 03:18:28 +0000]
powerpc: Fix personality handling in ppc64_personality()

Directly comparing current->personality against PER_LINUX32 doesn't work
in cases when any of the personality flags stored in the top three bytes
are used.

Directly forcefully setting personality to PER_LINUX32 or PER_LINUX
discards any flags stored in the top three bytes

Use personality() macro to compare only PER_MASK bytes and make sure that
we are setting only the bits that should be set, instead of overwriting
the whole value.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8 years agopowerpc/dma-iommu: Fix IOMMU window check
Aaro Koskinen [Sat, 18 Aug 2012 07:34:15 +0000]
powerpc/dma-iommu: Fix IOMMU window check

Checking for device mask to cover the whole IOMMU table is too strict.
IOMMU allocators should handle mask constraint properly for each
allocation.

The patch enables to use old AirPort Extreme cards on PowerMacs with
more than 1GB of memory; without the patch the driver init fails with:

  b43-pci-bridge 0001:01:01.0: Warning: IOMMU window too big for device mask
  b43-pci-bridge 0001:01:01.0: mask: 0x3fffffff, table end: 0x80000000
  b43-phy0 ERROR: The machine/kernel does not support the required 30-bit DMA mask

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8 years agopowerpc: Remove unnecessary ifdefs
Michael Neuling [Tue, 21 Aug 2012 21:22:22 +0000]
powerpc: Remove unnecessary ifdefs

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8 years agopowerpc/kgdb: Restore current_thread_info properly
Tiejun Chen [Wed, 22 Aug 2012 16:10:20 +0000]
powerpc/kgdb: Restore current_thread_info properly

For powerpc BooKE and e200, singlestep is handled on the critical/dbg
exception stack. This causes current_thread_info() to fail for kgdb
internal, so previously We work around this issue by copying
the thread_info from the kernel stack before calling kgdb_handle_exception,
and copying it back afterwards.

But actually we don't do this properly. We should backup current_thread_info
then restore that when exit.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8 years agopowerpc/kgdb: Bail out of KGDB when we've been triggered
Tiejun Chen [Wed, 22 Aug 2012 16:10:19 +0000]
powerpc/kgdb: Bail out of KGDB when we've been triggered

We need to skip a breakpoint exception when it occurs after
a breakpoint has already been removed.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8 years agopowerpc/kgdb: Do not set kgdb_single_step on ppc
Tiejun Chen [Wed, 22 Aug 2012 16:10:18 +0000]
powerpc/kgdb: Do not set kgdb_single_step on ppc

The kgdb_single_step flag has the possibility to indefinitely
hang the system on an SMP system.

The x86 arch have the same problem, and that problem was fixed by
commit 8097551d9ab9b9e3630(kgdb,x86: do not set kgdb_single_step
on x86). This patch does the same behaviors as x86's patch.

Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8 years agopowerpc/mpic_msgr: Add missing includes
Scott Wood [Wed, 22 Aug 2012 15:35:47 +0000]
powerpc/mpic_msgr: Add missing includes

Add several #includes that mpic_msgr relies on being pulled implicitly,
which only happens on certain configs.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: Meador Inge <meador_inge@mentor.com>
Cc: Jia Hongtao <B38951@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8 years agopowerpc: Fix null pointer deref in perf hardware breakpoints
Michael Neuling [Wed, 22 Aug 2012 20:30:43 +0000]
powerpc: Fix null pointer deref in perf hardware breakpoints

Currently if you are doing a global perf recording with hardware
breakpoints (ie perf record -e mem:0xdeadbeef -a), you can oops with:

  Faulting instruction address: 0xc000000000738890
  cpu 0xc: Vector: 300 (Data Access) at [c0000003f76af8d0]
      pc: c000000000738890: .hw_breakpoint_handler+0xa0/0x1e0
      lr: c000000000738830: .hw_breakpoint_handler+0x40/0x1e0
      sp: c0000003f76afb50
     msr: 8000000000001032
     dar: 6f0
   dsisr: 42000000
    current = 0xc0000003f765ac00
    paca    = 0xc00000000f262a00   softe: 0        irq_happened: 0x01
    pid   = 6810, comm = loop-read
  enter ? for help
  [c0000003f76afbe0] c00000000073cd04 .notifier_call_chain.isra.0+0x84/0xe0
  [c0000003f76afc80] c00000000073cdbc .notify_die+0x3c/0x60
  [c0000003f76afd20] c0000000000139f0 .do_dabr+0x40/0xf0
  [c0000003f76afe30] c000000000005a9c handle_dabr_fault+0x14/0x48
  --- Exception: 300 (Data Access) at 0000000010000480
  SP (ff8679e0) is in userspace

This is because we don't check to see if the break point is associated
with task before we deference the task_struct pointer.

This changes the update to use current.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8 years agopowerpc: Fixup whitespace in xmon
Michael Ellerman [Thu, 23 Aug 2012 22:09:13 +0000]
powerpc: Fixup whitespace in xmon

There are a few whitespace goolies in xmon.c, some of them appear to
be my fault. Fix them all in one go.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8 years agopowerpc: Fix xmon dl command for new printk implementation
Michael Ellerman [Thu, 23 Aug 2012 22:09:12 +0000]
powerpc: Fix xmon dl command for new printk implementation

Since the printk internals were reworked the xmon 'dl' command which
dumps the content of __log_buf has stopped working.

It is now a structured buffer, so just dumping it doesn't really work.

Use the helpers added for kgdb to print out the content.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Fri, 24 Aug 2012 04:58:04 +0000]
Merge git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "This push fixes a build error on 32-bit archs in the hifn driver as
  well as a potential deadlock in the caam driver."

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: caam - fix possible deadlock condition
  crypto: hifn_795x - fix 64bit division and undefined __divdi3 on 32bit archs

8 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Linus Torvalds [Fri, 24 Aug 2012 04:56:22 +0000]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs

Pull UDF, ext3 & reiserfs fixes from Jan Kara:
 "A couple of fixes (udf, reiserfs, ext3) that accumulated over my
  vacation."

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: fix retun value on error path in udf_load_logicalvol
  jbd: don't write superblock when unmounting an ro filesystem
  reiserfs: fix deadlocks with quotas
  quota: Move down dqptr_sem read after initializing default warn[] type at __dquot_alloc_space().
  UDF: During mount free lvid_bh before rescanning with different blocksize
  udf: fix udf_setsize() for file data in ICB

8 years agoMerge tag 'upstream-3.6-rc3' of git://git.infradead.org/linux-ubifs
Linus Torvalds [Fri, 24 Aug 2012 04:50:40 +0000]
Merge tag 'upstream-3.6-rc3' of git://git.infradead.org/linux-ubifs

Pull UBIFS fixes from Artem Bityutskiy:
 - Fix crash on error which prevents emulated power-cut testing.
 - Fix log reply regression introduced in 3.6-rc1.
 - Fix UBIFS complaints about too small debug buffer size which.
 - Fix error message spelling, and remove incorrect commentary.

* tag 'upstream-3.6-rc3' of git://git.infradead.org/linux-ubifs:
  UBIFS: fix error messages spelling
  UBIFS: fix complaints about too small debug buffer size
  UBIFS: fix replay regression
  UBIFS: fix crash on error path
  UBIFS: remove stale commentary

8 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide
Linus Torvalds [Fri, 24 Aug 2012 04:49:56 +0000]
Merge git://git./linux/kernel/git/davem/ide

Pull IDE power management bugfix from David S. Miller.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
  ide: fix generic_ide_suspend/resume Oops

8 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 24 Aug 2012 04:48:41 +0000]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "This tree contains misc fixlets: a perf script python binding fix, a
  uprobes fix and a syscall tracing fix."

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools: Add missing files to build the python binding
  uprobes: Fix mmap_region()'s mm->mm_rb corruption if uprobe_mmap() fails
  tracing/syscalls: Fix perf syscall tracing when syscall_nr == -1

8 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 24 Aug 2012 04:47:54 +0000]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "This tree contains assorted fixlets: an alternatives patching crash
  fix, an irq migration/hotplug interaction fix, a fix for large AMD
  microcode images and a comment fixlet."

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, microcode, AMD: Fix broken ucode patch size check
  x86/alternatives: Fix p6 nops on non-modular kernels
  x86/fixup_irq: Use cpu_online_mask instead of cpu_all_mask
  x86/spinlocks: Fix comment in spinlock.h

8 years agoMerge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 24 Aug 2012 04:46:57 +0000]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:
 "Mostly small fixes for the fallout of the timekeeping overhaul in 3.6
  along with stable fixes to address an accumulation problem and missing
  sanity checks for RTC readouts and user space provided values."

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time: Avoid making adjustments if we haven't accumulated anything
  time: Avoid potential shift overflow with large shift values
  time: Fix casting issue in timekeeping_forward_now
  time: Ensure we normalize the timekeeper in tk_xtime_add
  time: Improve sanity checking of timekeeping inputs

8 years agoMerge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Linus Torvalds [Fri, 24 Aug 2012 04:45:54 +0000]
Merge branch 'upstream-fixes' of git://git./linux/kernel/git/jikos/hid

Pull HID fix from Jiri Kosina:
 "Fix for one particular device not being properly claimed by
  hid-multitouch driver"

* 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: Remove QUANTA from special drivers list

8 years agoxfs: check for possible overflow in xfs_ioc_trim
Tomas Racek [Tue, 14 Aug 2012 08:35:04 +0000]
xfs: check for possible overflow in xfs_ioc_trim

If range.start or range.minlen is bigger than filesystem size, return
invalid value error. This fixes possible overflow in BTOBB macro when
passed value was nearly ULLONG_MAX.

Signed-off-by: Tomas Racek <tracek@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

8 years agoxfs: unlock the AGI buffer when looping in xfs_dialloc
Christoph Hellwig [Tue, 7 Aug 2012 06:02:02 +0000]
xfs: unlock the AGI buffer when looping in xfs_dialloc

Also update some commens in the area to make the code easier to read.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

8 years agoxfs: fix uninitialised variable in xfs_rtbuf_get()
Dave Chinner [Tue, 31 Jul 2012 04:55:51 +0000]
xfs: fix uninitialised variable in xfs_rtbuf_get()

Results in this assert failure in generic/090:

XFS: Assertion failed: *nmap >= 1, file: fs/xfs/xfs_bmap.c, line: 4363
.....
Call Trace:
 [<ffffffff814680db>] xfs_bmapi_read+0x6b/0x370
 [<ffffffff814b64b2>] xfs_rtbuf_get+0x42/0x130
 [<ffffffff814b6f09>] xfs_rtget_summary+0x89/0x120
 [<ffffffff814b7bfe>] xfs_rtallocate_extent_size+0xce/0x340
 [<ffffffff814b89f0>] xfs_rtallocate_extent+0x240/0x290
 [<ffffffff81462c1a>] xfs_bmap_rtalloc+0x1ba/0x340
 [<ffffffff81463a65>] xfs_bmap_alloc+0x35/0x40
 [<ffffffff8146f111>] xfs_bmapi_allocate+0xf1/0x350
 [<ffffffff8146f9de>] xfs_bmapi_write+0x66e/0xa60
 [<ffffffff8144538a>] xfs_iomap_write_direct+0x22a/0x3f0
 [<ffffffff8143707b>] __xfs_get_blocks+0x38b/0x5d0
 [<ffffffff814372d4>] xfs_get_blocks_direct+0x14/0x20
 [<ffffffff811b0081>] do_blockdev_direct_IO+0xf71/0x1eb0
 [<ffffffff811b1015>] __blockdev_direct_IO+0x55/0x60
 [<ffffffff814355ca>] xfs_vm_direct_IO+0x11a/0x1e0
 [<ffffffff8112d617>] generic_file_direct_write+0xd7/0x1b0
 [<ffffffff8143e16c>] xfs_file_dio_aio_write+0x13c/0x320
 [<ffffffff8143e6f2>] xfs_file_aio_write+0x1c2/0x1d0
 [<ffffffff81174a07>] do_sync_write+0xa7/0xe0
 [<ffffffff81175288>] vfs_write+0xa8/0x160
 [<ffffffff81175702>] sys_pwrite64+0x92/0xb0
 [<ffffffff81b68f69>] system_call_fastpath+0x16/0x1b

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

8 years agopowerpc/fsl: fix "Failed to mount /dev: No such device" errors
Kim Phillips [Wed, 22 Aug 2012 18:43:30 +0000]
powerpc/fsl: fix "Failed to mount /dev: No such device" errors

Yocto (Built by Poky 7.0) 1.2 root filesystems fail to boot,
at least over nfs, with:

Failed to mount /dev: No such device

Configuring DEVTMPFS fixes it.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>

8 years agopowerpc/fsl: update defconfigs
Kim Phillips [Wed, 22 Aug 2012 18:43:24 +0000]
powerpc/fsl: update defconfigs

run make savedefconfig on fsl defconfigs.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>

8 years agoMerge branch 'randconfig/mach' into fixes
Arnd Bergmann [Thu, 23 Aug 2012 15:30:54 +0000]
Merge branch 'randconfig/mach' into fixes

Small platform specific bug fixes for problems found in randconfig builds.

* randconfig/mach:
  ARM: ux500: don't select LEDS_GPIO for snowball
  ARM: imx: build i.MX6 functions only when needed
  ARM: imx: select CPU_FREQ_TABLE when needed
  ARM: imx: fix ksz9021rn_phy_fixup
  ARM: imx: build pm-imx5 code only when PM is enabled
  ARM: omap: allow building omap44xx without SMP

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

8 years agoARM: ux500: don't select LEDS_GPIO for snowball
Arnd Bergmann [Wed, 15 Aug 2012 20:34:48 +0000]
ARM: ux500: don't select LEDS_GPIO for snowball

Using 'select' in Kconfig is hard, a platform cannot just
enable a driver without also making sure that its subsystem
is there. Also, there is no actual code dependency between
the platform and the gpio leds driver.

Without this patch, building without LEDS_CLASS esults in:

drivers/built-in.o: In function `create_gpio_led.part.2':
governor_userspace.c:(.devinit.text+0x5a58): undefined reference to `led_classdev_register'
drivers/built-in.o: In function `gpio_led_remove':
governor_userspace.c:(.devexit.text+0x6b8): undefined reference to `led_classdev_unregister'

This reverts 8733f53c6 "ARM: ux500: Kconfig: Compile in leds-gpio
support for Snowball" that introduced the regression and did not
provide a helpful explanation.

In order to leave the GPIO LED code still present in normal
builds, this also enables the symbol in u8500_defconfig, in addition
to the other LED drivers that are already selected there.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Lee Jones <lee.jones@linaro.org>

8 years agoARM: imx: build i.MX6 functions only when needed
Arnd Bergmann [Fri, 17 Aug 2012 00:16:08 +0000]
ARM: imx: build i.MX6 functions only when needed

The head-v7.S contains a call to the generic cpu_suspend function,
which is only available when selected by the i.MX6 code. As
pointed out by Shawn Guo, i.MX5 does not actually use any
functions defined in head-v7.S. It is also needed only for
the i.MX6 power management code and for the SMP code, so
we can restrict building this file to situations in which
at least one of those two is present.

Finally, other platforms with a similar file call it headsmp.S,
so we can rename it to the same for consistency.

Without this patch, building imx5 standalone results in:

arch/arm/mach-imx/built-in.o: In function `v7_cpu_resume':
arch/arm/mach-imx/head-v7.S:104: undefined reference to `cpu_resume'

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: Eric Miao <eric.miao@linaro.org>
Cc: stable@vger.kernel.org

8 years agoARM: imx: select CPU_FREQ_TABLE when needed
Arnd Bergmann [Thu, 16 Aug 2012 10:40:40 +0000]
ARM: imx: select CPU_FREQ_TABLE when needed

The i.MX cpufreq implementation uses the CPU_FREQ_TABLE helpers,
so it needs to select that code to be built. This problem has
apparently existed since the i.MX cpufreq code was first merged
in v2.6.37.

Building IMX without CPU_FREQ_TABLE results in:

arch/arm/plat-mxc/built-in.o: In function `mxc_cpufreq_exit':
arch/arm/plat-mxc/cpufreq.c:173: undefined reference to `cpufreq_frequency_table_put_attr'
arch/arm/plat-mxc/built-in.o: In function `mxc_set_target':
arch/arm/plat-mxc/cpufreq.c:84: undefined reference to `cpufreq_frequency_table_target'
arch/arm/plat-mxc/built-in.o: In function `mxc_verify_speed':
arch/arm/plat-mxc/cpufreq.c:65: undefined reference to `cpufreq_frequency_table_verify'
arch/arm/plat-mxc/built-in.o: In function `mxc_cpufreq_init':
arch/arm/plat-mxc/cpufreq.c:154: undefined reference to `cpufreq_frequency_table_cpuinfo'
arch/arm/plat-mxc/cpufreq.c:162: undefined reference to `cpufreq_frequency_table_get_attr'

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Yong Shen <yong.shen@linaro.org>
Cc: stable@vger.kernel.org

8 years agoARM: imx: fix ksz9021rn_phy_fixup
Arnd Bergmann [Thu, 16 Aug 2012 07:42:50 +0000]
ARM: imx: fix ksz9021rn_phy_fixup

The ksz9021rn_phy_fixup and mx6q_sabrelite functions try to
set up an ethernet phy if they can. They do check whether
phylib is enabled, but unfortunately the functions can only
be called from platform code if phylib is builtin, not
if it is a module

Without this patch, building with a modular phylib results in:

arch/arm/mach-imx/mach-imx6q.c: In function 'imx6q_sabrelite_init':
arch/arm/mach-imx/mach-imx6q.c:120:5: error: 'ksz9021rn_phy_fixup' undeclared (first use in this function)
arch/arm/mach-imx/mach-imx6q.c:120:5: note: each undeclared identifier is reported only once for each function it appears in

The bug was originally reported by Artem Bityutskiy but only
partially fixed in ef441806 "ARM: imx6q: register phy fixup only when
CONFIG_PHYLIB is enabled".

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Sascha Hauer <s.hauer@pengutronix.de>

8 years agoARM: imx: build pm-imx5 code only when PM is enabled
Arnd Bergmann [Wed, 15 Aug 2012 21:56:39 +0000]
ARM: imx: build pm-imx5 code only when PM is enabled

This moves the imx5 pm code out of the list of unconditionally
compiled files for imx5, mirroring what we already do for imx6
and how it was done before the code was move from mach-mx5 to
mach-imx in v3.3.

Without this patch, building with CONFIG_PM disabled results in:

arch/arm/mach-imx/pm-imx5.c:202:116: error: redefinition of 'imx51_pm_init'
arch/arm/mach-imx/include/mach-imx/common.h:154:91: note: previous definition of 'imx51_pm_init' was here
arch/arm/mach-imx/pm-imx5.c:209:116: error: redefinition of 'imx53_pm_init'
arch/arm/mach-imx/include/mach-imx/common.h:155:91: note: previous definition of 'imx53_pm_init' was here

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: stable@vger.kernel.org

8 years agoARM: omap: allow building omap44xx without SMP
Arnd Bergmann [Wed, 15 Aug 2012 20:51:54 +0000]
ARM: omap: allow building omap44xx without SMP

The new omap4 cpuidle implementation currently requires
ARCH_NEEDS_CPU_IDLE_COUPLED, which only works on SMP.

This patch makes it possible to build a non-SMP kernel
for that platform. This is not normally desired for
end-users but can be useful for testing.

Without this patch, building rand-0y2jSKT results in:

drivers/cpuidle/coupled.c: In function 'cpuidle_coupled_poke':
drivers/cpuidle/coupled.c:317:3: error: implicit declaration of function '__smp_call_function_single' [-Werror=implicit-function-declaration]

It's not clear if this patch is the best solution for
the problem at hand. I have made sure that we can now
build the kernel in all configurations, but that does
not mean it will actually work on an OMAP44xx.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Tony Lindgren <tony@atomide.com>